A Bayesian Hierarchical CACE Model Accounting for Incomplete Noncompliance With Application to a Meta-analysis of Epidural Analgesia on Cesarean Section

Jincheng Zhou; James S Hodges; Haitao Chu

doi:10.1080/01621459.2021.1900859

Author manuscript; available in PMC: 2022 Mar 7.

Published in final edited form as: J Am Stat Assoc. 2021 Apr 27;116(536):1700–1712. doi: 10.1080/01621459.2021.1900859

PMCID: PMC8901124 NIHMSID: NIHMS1687558 PMID: 35261417

Abstract

Noncompliance with assigned treatments is a common challenge in analyzing and interpreting randomized clinical trials (RCTs). One way to handle noncompliance is to estimate the complier-average causal effect (CACE), the intervention’s efficacy in the subpopulation that complies with assigned treatment. In a two-step meta-analysis, one could first estimate CACE for each study, then combine them to estimate the population-averaged CACE. However, when some trials do not report noncompliance data, the two-step meta-analysis can be less efficient and potentially biased by excluding these trials. This paper proposes a flexible Bayesian hierarchical CACE framework to simultaneously account for heterogeneous and incomplete noncompliance data in a meta-analysis of RCTs. The models are motivated by and used for a meta-analysis estimating the CACE of epidural analgesia on cesarean section, in which only 10 of 27 trials reported complete noncompliance data. The new analysis includes all 27 studies and the results present new insights on the causal effect after accounting for noncompliance. Compared to the estimated risk difference of 0.8% (95% CI: −0.3%, 1.9%) given by the two-step intention-to-treat meta-analysis, the estimated CACE is 4.1% (95% CrI: −0.3%, 10.5%). We also report simulation studies to evaluate the performance of the proposed method.

Keywords: Bayesian methods, causal effect, missing data, randomized trial, meta-analysis

1. Introduction

Well-conducted randomized controlled trials (RCTs) are considered the hallmark of evidence-based medicine and the gold standard for evaluating efficacy and safety in clinical research. However, noncompliance with treatment assignment and missing data occur frequently in clinical trials and can affect their validity. Noncompliance occurs when some participants do not take or receive their assigned treatments. Missing outcome or compliance status happens when study investigators do not collect those items on some subjects because of loss to follow-up or other reasons. Ignoring noncompliance or missing data may lead to biased estimates of causal effects in the standard intention-to-treat (ITT) analysis.

Many methods have been developed for analyzing a single study with noncompliance, or noncompliance together with missing outcome data. Baker and Lindeman (1994) and Angrist et al. (1996) independently estimated the effect of treatment using the latent class instrumental variable (IV) method. Later on, Frangakis and Rubin (2002) proposed a principal stratification framework to estimate the complier average causal effect (CACE) with binary compliance status. An extensive literature uses this framework to estimate CACE for different types of outcome in a single study with noncompliance (Yau and Little, 2001; Ye et al., 2014; Cheng, 2009). When a study has both noncompliance and missing outcome data, the CACE approach can still be used but further assumptions about the missing data mechanism are required. One commonly used assumption is “latent ignorability” (LI), which means that the missing data are missing at random conditional on compliance status, i.e., missingness has no residual dependence on the outcomes, given the observed data and the latent unobserved compliance classes. Under this assumption, several models that accommodate missing outcomes have been developed for inference about CACE (O’Malley and Normand, 2005; Peng et al., 2004). Chen et al. (2009) discussed identifiability and estimation of CACE with missing outcome data under a nonignorability assumption, i.e., when the missing data mechanism depends on the unobserved outcome. Analytical strategies for handling noncompliance are also increasingly used (Jo et al., 2010; Stuart et al., 2008), although not as widely as missing data methods.

Although inference for a clinical trial with noncompliance or missing data has been well studied, little attention has been paid to handling both missing data and noncompliance in a meta-analysis. Meta-analysis, the statistical approach for synthesizing evidence from multiple studies, is gaining popularity in many fields due to the rapid growth of interest in comparative effectiveness research and evidence-based medicine (Egger et al., 2008). While multivariate and network meta-analysis (NMA) methods have been developed recently for meta-analyses of data consisting of multiple outcomes, multiple treatments, or multiple diagnostic tests (Lumley, 2002; Jackson et al., 2011; Zhang et al., 2014; Riley et al., 2017; Ma et al., 2018; Lian et al., 2019), important research gaps remain in meta-analysis in the area of causal inference. In particular, researchers have only recently started investigating causal effects in meta-analysis accounting for noncompliance (Baker and Kramer, 2005; Baker et al., 2016).

When noncompliance data are reported in each trial, intuitively one can first estimate CACE for each study, then combine these estimates using a meta-analytic method such as a common effect, fixed effects, or random effects model to estimate the population-averaged CACE. We call this naive method a “two-step” approach. The two-step approach — which can be viewed as a special case of a model using only trials with complete noncompliance data — can be less efficient and potentially biased because it excludes trials without noncompliance data. In a meta-analysis of randomized clinical trials, Zhou et al. (2019) proposed a Bayesian hierarchical model to estimate the CACE accounting for heterogeneous noncompliance. However, trials that do not report noncompliance data must be excluded, potentially leading to less efficient and biased estimates (Baker, 2020; Zhou et al., 2020).

In real meta-analyses, it is common that some trials do not report noncompliance data because they may not have been reported in the primary analysis. The present paper’s motivating study, a meta-analysis by Bannister-Tyrrell et al. (2015), has full compliance data reported for only 10 of 27 studies. Their goal was to estimate the causal effect of epidural analgesia in labor on the occurrence of cesarean section, but their analysis included only 9 studies with full compliance data and non-zero cesarean section events. Our proposed Bayesian hierarchical model framework aims to include studies that do not report noncompliance data and studies with zero events. This is the first paper dealing with this important issue. The main purposes are 1) to develop a flexible statistical framework that uses noncompliance data that is both heterogeneous across studies and incomplete in some studies, in a meta-analysis of RCTs with ordinal or binary outcomes; 2) to apply the method to a meta-analysis estimating the CACE of epidural analgesia on cesarean section, and compare it with the traditional two-step ITT meta-analysis.

The rest of this article is organized as follows. Section 2 describes the motivating case study of epidural analgesia, in which noncompliance varies between studies, and compliance status was missing for 17 of 27 studies. Section 3 first presents the assumptions for estimating the causal effect and for missingness, then describes the Bayesian hierarchical model and how to compute the posterior distributions for the overall and study-specific CACEs. Section 4 applies the model to the epidural analgesia case study using a particular approach to model selection and presents an analysis of the results’ sensitivity to the missing data assumptions. Section 5 reports simulation studies evaluating the proposed approach under a variety of conditions. Finally, Section 6 discusses our findings and potential extensions in future work.

2. A Motivating Meta-analysis of the Effect of Epidural Analgesia on Cesarean Section

2.1. Data Sources

Epidural analgesia in labor is a highly effective method of labor pain relief but it remains controversial whether epidural analgesia in labor increases the risk of cesarean section delivery. Solid evidence to support or refute this association is still limited, mainly because RCTs in obstetrics often have high rates of noncompliance.

In this setting, the consequences of receiving epidural analgesia are more important to clinicians and patients than the impact of being assigned to epidural analgesia. The ITT analysis, which estimates the difference in cesarean section risk between women randomized to epidural analgesia versus control, can therefore give a biased estimate of the effect of receiving epidural analgesia because of differential noncompliance. Bannister-Tyrrell et al. (2015) conducted an exploratory meta-analysis of the association between epidural analgesia in labor and cesarean section using the 9 trials, out of the 27 RCTs included in their systematic review, that have full compliance data with non-zero events.

Data were recorded on treatment assignment r (r = 1 for epidural analgesia, r = 0 for no/other analgesia in labor), actual received intervention t (t = 1 for epidural analgesia, t = 0 for no/other analgesia in labor), and frequency of cesarean section o (o = 1 for yes, o = 0 for no) by compliance with the assigned intervention, where noncompliance describes participants who were randomly assigned to receive epidural analgesia in labor but who in fact received either another or no analgesia, or who were assigned to the control group but ultimately received epidural analgesia in labor. Then for study i (i = 1,2, …,I), the count N_irto denotes the number of patients in randomization group r who received intervention t and had outcome o.

The cesarean section event rates and noncompliance rates vary substantially between trials as the inclusion and exclusion criteria, labor management strategies, etc. differ between trials. In the 27 RCTs, 4,459 women were assigned to receive epidural analgesia and 4,426 were assigned to receive non-epidural or no analgesia. Complete data were available on the cesarean outcome, with 470 cesarean deliveries in women assigned to the epidural and 419 cesarean deliveries in women assigned to non-epidural or no analgesia.

However, complete data on the number of cesarean sections in the compliant and noncompliant groups were available for only 10 studies, and data on noncompliance status per randomization group were only partly available for 13 of the 27 RCTs. We use t = ∗ to denote when the actually-received intervention is missing, then reorganize the available complete data and marginal data in Table 1. If N_irto is available for each t ∈ {0, 1}, the corresponding marginal count N_ir∗o was assigned as 0; otherwise if the actual received intervention data for arm r of study i is missing, only the marginal data N_ir∗o are shown in the table.

Table 1:

Data from randomized controlled trials of epidural analgesia in labor

Study	Author, Year	Complete data								Missing data
		Allocated control				Allocated epidural				Allocated control		Allocated epidural
		Received control		Received epidural		Received control		Received epidural		Received treatment not reported		Received treatment not reported
		Cesarean − (N_i000)	Cesarean + (N_i001)	Cesarean − (N_i010)	Cesarean + (N_i011)	Cesarean − (N_i100)	Cesarean + (N_i101)	Cesarean − (N_i110)	Cesarean + (N_i111)	Cesarean − (N_i0*0)	Cesarean + (N_i0*1)	Cesarean − (N_i1*0)	Cesarean + (N_i1*1)

1	Bofill, 1997 ^†	37	2	11	1	2	0	42	5	0	0	0	0
2	Clark, 1998 ^†	72	6	68	16	7	2	134	13	0	0	0	0
3	Dickinson, 2002	0	0	0	0	0	0	0	0	428	71	408	85
4	Evron, 2008	40	4	0	0	0	0	0	0	0	0	129	19
5	El Kerdawy, 2010	0	0	0	0	0	0	0	0	12	3	11	4
6	Gambling, 1998	0	0	0	0	206	10	371	29	573	34	0	0
7	Grandjean, 1979	0	0	0	0	0	0	0	0	59	1	30	0
8	Halpern, 2004 ^†	62	5	44	7	0	0	112	12	0	0	0	0
9	Head, 2002 ^†	51	7	2	0	3	0	43	10	0	0	0	0
10	Hogg, 2000	0	0	0	0	0	0	0	0	46	6	46	7
11	Howell, 2001	0	0	0	0	0	0	0	0	169	16	171	13
12	Jain, 2003 ^†	72	11	0	0	0	2	36	7	0	0	0	0
13	Long, 2003	0	0	0	0	0	0	0	0	44	6	29	1
14	Loughnan, 2000	0	0	0	0	0	0	0	0	270	40	268	36
15	Lucas, 2001	0	0	0	0	0	0	0	0	304	62	309	63
16	Muir, 1996	0	0	0	0	0	0	0	0	20	2	25	3
17	Muir, 2000	0	0	0	0	0	0	0	0	79	9	86	11
18	Nafisi, 2006 ^†	179	19	0	0	0	0	173	24	0	0	0	0
19	Nikkola, 1997 ^†	6	0	4	0	0	0	10	0	0	0	0	0
20	Philipsen, 1989	0	0	0	0	0	0	0	0	48	6	47	10
21	Ramin, 1995 ^†	546	17	95	8	230	2	393	39	0	0	0	0
22	Sharma, 1997 ^†	336	16	5	0	114	1	231	12	0	0	0	0
23	Sharma, 2002	0	0	0	0	11	1	199	15	213	20	0	0
24	Shifman, 2007	0	0	0	0	0	0	0	0	32	18	45	15
25	Thalme, 1974	0	0	0	0	0	0	0	0	10	4	8	6
26	Thorp, 1993	0	0	0	0	0	0	0	0	44	1	36	12
27	Volmanen, 2008 ^†	23	1	3	0	1	0	23	1	0	0	0	0

The † indicates that the corresponding study has complete data on compliance status.
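To make the notation concrete, the following minimal Python sketch stores the counts N_irto of one complete-data study (study 2, Clark, 1998, copied from Table 1) and derives the arm-level cesarean rates, noncompliance rates, and ITT risk difference. The dictionary layout and the helper name `arm_summary` are illustrative assumptions, not part of the original analysis.

```python
# Counts N_irto for study 2 (Clark, 1998) from Table 1, keyed by (r, t, o):
# r = allocated arm, t = received treatment, o = cesarean section (1 = yes).
clark_1998 = {
    (0, 0, 0): 72, (0, 0, 1): 6, (0, 1, 0): 68, (0, 1, 1): 16,
    (1, 0, 0): 7, (1, 0, 1): 2, (1, 1, 0): 134, (1, 1, 1): 13,
}


def arm_summary(counts, r):
    """Return (arm size, cesarean rate, noncompliance rate) for arm r."""
    n = sum(v for (rr, _, _), v in counts.items() if rr == r)
    events = sum(v for (rr, _, o), v in counts.items() if rr == r and o == 1)
    # Noncompliance: the received treatment differs from the allocated one.
    crossed = sum(v for (rr, t, _), v in counts.items() if rr == r and t != r)
    return n, events / n, crossed / n


n0, event_rate0, noncompliance0 = arm_summary(clark_1998, 0)
n1, event_rate1, noncompliance1 = arm_summary(clark_1998, 1)

# ITT risk difference: compares arms as randomized, ignoring treatment received.
itt_risk_difference = event_rate1 - event_rate0
```

In this single study more than half of the women allocated to control received epidural analgesia, which illustrates why the ITT contrast may differ from the effect of receiving the treatment.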

2.2. Analysis of Event Rates and Noncompliance Rates

Bannister-Tyrrell et al. (2015) estimated the effect of epidural analgesia in labor on cesarean section using the basic ITT analysis on all of the 27 RCTs, and also using the IV analysis but including only the 9 studies with complete data on the number of cesarean sections in compliant and noncompliant participants. Zhou et al. (2019) estimated the CACE using the 10 studies that reported full compliance data. In this paper, we further investigate whether the studies with incomplete data provide extra information about the causal effect of epidural analgesia.

The ITT meta-analysis of the 27 RCTs gave a pooled risk ratio 1.10 (95% confidence interval: 0.97, 1.25; P=0.071) for cesarean section following epidural analgesia in labor, which implies that epidural analgesia in labor does not increase the risk of cesarean section. However, due to high rates of noncompliance, an ITT meta-analysis may not be a good way to estimate the effect of receiving epidural analgesia. The ITT meta-analysis pooled effect is potentially biased, especially when noncompliance reporting cannot be assumed to be random with respect to the outcome in the meta-analysis. To investigate the association between the ITT event rates and the noncompliance rates, we used a bivariate generalized linear mixed effects model (BGLMM) (Chu et al., 2012) to do the analysis because several studies had 0 events or 0 noncompliance. The BGLMM assumes a bivariate normal distribution of probabilities in the two groups (p_1i, p_0i) in a transformed scale, where the probabilities can be either event rates (p_1i = P(o_i = 1|r_i = 1), p_0i = P(o_i = 1|r_i = 0)) or noncompliance rates (p_1i = P(t_i = 0|r_i = 1), p_0i = P(t_i = 1|r_i = 0)). Specifically, we use a probit random effects model specified as:

$$
Φ^{-1}(p_{1i}) = u + η_{1i}, \quad Φ^{-1}(p_{0i}) = v + η_{0i}, \quad (η_{1i}, η_{0i})^{T} \sim MVN(0, Σ_{η}). \tag{1}
$$

In this model, Φ(.) is the standard Gaussian cumulative distribution function, (η_1i, η_0i) are random effects, and the covariance matrix is $Σ_{η} = (\begin{matrix} σ_{u}^{2} & ρ σ_{u} σ_{v} \\ ρ σ_{u} σ_{v} & σ_{v}^{2} \end{matrix})$ . We chose the probit link because it has a closed-form formula for the marginal probabilities $E (p_{1 i}) = Φ (u / \sqrt{1 + σ_{u}^{2}})$ and $E (p_{0 i}) = Φ (v / \sqrt{1 + σ_{v}^{2}})$ , based on Equation (1).
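The closed-form marginal probability under the probit link can be checked numerically. The sketch below, which uses only the Python standard library, compares $Φ (u / \sqrt{1 + σ_{u}^{2}})$ with a Monte Carlo average of Φ(u + η) over draws η ~ N(0, σ_u²). The function names and the illustrative values u = −1.2 and σ_u = 0.4 are assumptions chosen for the example, not estimates from the data.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, provides the CDF Phi


def marginal_probability(mean, sd):
    """Closed form of E[Phi(mean + eta)] for eta ~ N(0, sd^2)."""
    return STD_NORMAL.cdf(mean / sqrt(1.0 + sd ** 2))


def monte_carlo_probability(mean, sd, draws=100_000, seed=1):
    """Simulation check: average Phi(mean + eta) over random effects eta."""
    rng = random.Random(seed)
    total = sum(STD_NORMAL.cdf(mean + rng.gauss(0.0, sd)) for _ in range(draws))
    return total / draws


closed_form = marginal_probability(-1.2, 0.4)
simulated = monte_carlo_probability(-1.2, 0.4)
```

The two values should agree up to Monte Carlo error, which is the property that makes the probit link convenient for reporting overall rates.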

We did a Bayesian analysis using JAGS (Plummer, 2003) to draw Markov chain Monte Carlo (MCMC) samples from the joint posterior distribution. We assigned vague priors N(0, 1000) to the fixed effects u, v, and the commonly-used inverse Wishart distribution InvW(I, ν = 3) to the covariance matrix $Σ_{η}$ , where I is the identity matrix. The cesarean section event rates (p_1i = P(o_i = 1|r_i = 1), p_0i = P(o_i = 1|r_i = 0)), and the noncompliance rates (p_1i = P(t_i = 0|r_i = 1), p_0i = P(t_i = 1|r_i = 0)) were analyzed separately using the model in Equation (1). After 10,000 burn-in samples, 40,000 posterior samples were drawn. The overall estimates E(p_1i) and E(p_0i) were calculated using the closed-form formula shown above. We present MCMC results as posterior medians followed by 95% equal-tail credible interval (CrI) in brackets for the rest of this article. The marginal probability of having a cesarean section in patients assigned to epidural analgesia was estimated as 12.9% (9.9%, 17.0%), while in those assigned to no/other analgesia it was 11.3% (8.5%, 15.0%). Also, the noncompliance rate in the epidural analgesia arm E{P(t_i = 0|r_i = 1)} was 15.6% (5.4%, 29.0%), while in the no/other analgesia arm E{P(t_i = 1|r_i = 0)} was 13.8% (3.4%, 31.3%).

Figure 1 shows the study-specific posterior medians and 95% CrIs for the cesarean section event rates (horizontal lines) and noncompliance rates (vertical lines) in both the epidural analgesia arm (dashed line) and the control arm (solid line). Noncompliance rates show somewhat different patterns in the two randomization groups: as the event rate increases, the noncompliance rate tends to be higher, but this trend is more obvious in the control groups. Arguably, the association is in the opposite direction for the treated groups.

The relationship between these two rates motivates us to develop a causal inference meta-analysis framework for the treatment effects, rather than use the ITT meta-analysis ignoring noncompliance. However, the existing complier average causal effect (CACE) framework needs complete information on compliance for each study. With completely or partially missing data on compliance in many studies, we aim to develop a new method that can use all studies and still have a valid causal interpretation. We introduce this method in Section 3 by first defining essential notation and assumptions.

3. Statistical Methods

3.1. Definition of the Complier Average Causal Effect (CACE)

3.1.1. Notation

In a meta-analysis with I two-armed randomized trials, N_i is the number of subjects in the i-th trial, where N_i0 is the number randomly assigned to the control/placebo group and N_i1 to the active treatment group. Let R_ij = r index the randomization assignment for subject j in study i with r = 0 for assignment to control and r = 1 for assignment to treatment. Let $T_{i j}^{r} = t \in \{0, 1\}$ be the potential treatment received under the randomization assignment r, where t = 1 indicates receiving the active treatment and t = 0 placebo. Let $Y_{i j}^{r, t} = o \in \{1, 2, \dots, O\}$ be the potential outcomes under randomization assignment r and treatment received t for the j-th subject in the i-th trial. Note that the sets of $\{Y_{i j}^{r, t}\}$ and $\{T_{i j}^{r}\}$ are the potential outcome and treatment-received status under possible r and t, but for each subject in a trial, only one of the possible values of each set can be observed. Therefore, we denote the observed response and received treatment variables as Y_ij and T_ij for the j-th subject in the i-th trial. We allow T_ij = ∗ if the actual received treatment is not recorded, and Y_ij = ∗ if the outcome is not recorded for the j-th patient in the i-th study. Then we let M_i be the N_i-dimensional vector of missingness indicators for all subjects in trial i, with individual element M_ij = m corresponding to whether subject j has actual treatment received status on record (m = 0) or missing (m = 1).

Following Imbens and Rubin (1997), we let C_ij be the latent compliance class of the j-th patient in the i-th trial, defined as follows:

C_ij= 0, never-taker, if $(T_{i j}^{0}, T_{i j}^{1}) = (0, 0)$ , i.e., subjects who would receive control if randomized to either group;
C_ij= 1, complier, if $(T_{i j}^{0}, T_{i j}^{1}) = (0, 1)$ , i.e., subjects who would receive the intervention to which they were randomized;
C_ij= 2, always-taker, if $(T_{i j}^{0}, T_{i j}^{1}) = (1, 1)$ , i.e., subjects who would receive active treatment if randomized to either group;
C_ij= 3, defier, if $(T_{i j}^{0}, T_{i j}^{1}) = (1, 0)$ , i.e., subjects who would receive the intervention opposite to their randomized assignment.

A subject’s compliance status C_ij is not observable because, in a two-arm trial, only one of $T_{i j}^{0}$ and $T_{i j}^{1}$ can be observed. Based on the observed randomization group and actual treatment received, the compliance classes can only be partially identified (see Table 2, columns R_ij, T_ij, and C_ij).

Table 2:

Observed groups, latent compliance classes and outcome probabilities of trial i

R_ij	T_ij	C_ij	Y_ij = o ∈ {1, …, O}	Count

0	0	0 (never-taker) or 1 (complier)	$M (N_{i 00}, q_{i o} = \frac{π_{i c} v_{i o} + π_{i n} s_{i o}}{1 - π_{i a}})$	N_i00o
0	1	2 (always-taker) or 3 (defier)	M(N_i01, b_io)	N_i01o
1	0	0 (never-taker) or 3 (defier)	M(N_i10, s_io)	N_i10o
1	1	1 (complier) or 2 (always-taker)	$M (N_{i 11}, p_{i o} = \frac{π_{i c} u_{i o} + π_{i a} b_{i o}}{1 - π_{i n}})$	N_i11o

Defiers are ruled out by the monotonicity assumption.
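The partial identification summarized in Table 2 can be expressed as a small lookup. The Python sketch below is illustrative only (the constant and function names are hypothetical): it lists, for each observed pair (R_ij, T_ij), the latent classes whose potential treatments are consistent with the observation, optionally excluding defiers.

```python
NEVER_TAKER, COMPLIER, ALWAYS_TAKER, DEFIER = 0, 1, 2, 3

# Potential treatments (T^0, T^1) that define each latent compliance class.
POTENTIAL_TREATMENTS = {
    NEVER_TAKER: (0, 0),
    COMPLIER: (0, 1),
    ALWAYS_TAKER: (1, 1),
    DEFIER: (1, 0),
}


def possible_classes(r, t, monotonicity=True):
    """Latent classes consistent with receiving treatment t under assignment r."""
    classes = {
        c for c, potential in POTENTIAL_TREATMENTS.items() if potential[r] == t
    }
    if monotonicity:
        # Excluding defiers leaves at most two candidate classes per cell.
        classes.discard(DEFIER)
    return classes
```

For example, `possible_classes(1, 0)` contains only the never-taker class once defiers are excluded, whereas `possible_classes(0, 0)` still mixes never-takers and compliers.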

3.1.2. Assumptions and Outcome Distributions

For each study, we make assumptions identical to those listed in Angrist et al. (1996):

Assumption 1: Stable unit treatment value assumption (SUTVA) (Rubin, 1980).

The outcome for a subject is unaffected by the particular assignments of treatments to the other subjects. That is, if r = r′ then $T_{i j}^{r} = T_{i j}^{r^{'}}$ ; and if r = r′ and t = t′ then $Y_{i j}^{r, t} = Y_{i j}^{r^{'}, t^{'}}$ .

Assumption 2: Random assignment to randomization groups.

For all N_i subjects in the i-th trial, the treatment assignment is random. This assumption implies that the proportion of compliers should be the same in the intervention and control groups.

Assumption 3: Exclusion restriction.

For subject j in the i-th trial, $Y_{i j}^{r, t} = Y_{i j}^{r^{'}, t}$ for all r, r′ and t, i.e., the randomization assignment affects responses only through its effect on treatment received. This assumption allows us to define $Y_{i j}^{t} \equiv Y_{i j}^{r, t} \equiv Y_{i j}^{r^{'}, t}$ for all r, r′ and t. Therefore, for always-takers and never-takers, the distribution of outcomes does not depend on the randomization group.

Assumption 4: $E [T_{i j}^{1} - T_{i j}^{0}] \neq 0$ for each i.

For each trial, we assume the fraction of subjects who receive each intervention varies by randomization group.

Assumption 5: Monotonicity.

$P [T_{i j}^{1} \geq T_{i j}^{0}] = 1$ for each trial. This implies that no subject would receive the treatment opposite to the assignment under both assignment to active treatment and assignment to control. This assumption rules out the existence of defiers and reduces the number of compliance types for which we must derive estimates, permitting a properly identified model.

Assuming randomized assignment and the exclusion restriction implies two restrictions: 1) the proportions of always-takers, never-takers, and compliers are the same in the control and treatment groups; 2) for never-takers and always-takers, the outcome distribution is the same under assignment to control and to active treatment. With these two restrictions, for discrete outcomes o ∈ {1, …,O} we can extend the notation in Cheng (2009) and Baker (2011) and define the following parameters for latent compliance classes and response rates in the i-th study: 1) π_ia and π_in are the probabilities of being an always-taker and a never-taker, respectively, so the probability of being a complier in the i-th study π_ic is 1−π_ia−π_in; 2) u_io is the probability of having outcome o for a complier randomized to the treatment group, and v_io is the probability for a complier randomized to the control group in the i-th study; s_io is the probability a never-taker has outcome o in the i-th study; and b_io is the probability an always-taker has outcome o in the i-th study; where $\sum_{o = 1}^{O} u_{i o} = \sum_{o = 1}^{O} v_{i o} = \sum_{o = 1}^{O} s_{i o} = \sum_{o = 1}^{O} b_{i o} = 1$ . Although latent compliance classes cannot be fully identified based on randomization group (R_ij) and observed treatment received (T_ij), the above two restrictions allow us to write the distributions of observed N_irt in terms of the parameters for compliance classes and response rates, where $N_{i r t} = \sum_{j} I (R_{i j} = r, T_{i j} = t)$ denotes the number of individuals in each observed group. Let M(N_irt, x_io) denote a multinomial distribution with N_irt subjects and multinomial probabilities {x_io}. The observed count for each outcome o in group {j : R_ij = r, T_ij = t} is N_irto, o = 1, …,O. 
Table 2 shows the distribution of each observed count in trial i, where $q_{i o} = \frac{π_{i c} v_{i o} + π_{i n} s_{i o}}{1 - π_{i a}}$ and $p_{i o} = \frac{π_{i c} u_{i o} + π_{i a} b_{i o}}{1 - π_{i n}}$ are probabilities corresponding to N_i00o and N_i11o, o ∈ {1,…,O}.

Furthermore, according to the relations between observed groups and latent compliance classes, we have $\sum_{o} N_{i 00 o} = N_{i 00} = N_{i 0} (1 - π_{i a})$ and $\sum_{o} N_{i 01 o} = N_{i 01} = N_{i 0} π_{i a}$ , so the vector of observed counts in the control group $(N_{i 001}, \dots, N_{i 00 O}, N_{i 011}, \dots, N_{i 01 O})$ follows a multinomial distribution $M (N_{i 0}, x_{i 0} = (x_{i 001}, \dots, x_{i 00 O}, x_{i 011}, \dots, x_{i 01 O}))$ , where $x_{i 00 o} = q_{i o} (1 - π_{i a}) = π_{i c} v_{i o} + π_{i n} s_{i o}, x_{i 01 o} = b_{i o} π_{i a}$ , and o ∈ {1, …,O}. Similarly, in the active treatment group, the vector of observed counts $(N_{i 101}, \dots, N_{i 10 O}, N_{i 111}, \dots, N_{i 11 O})$ follows a multinomial distribution $M (N_{i 1}, x_{i 1} = (x_{i 101}, \dots, x_{i 10 O}, x_{i 111}, \dots, x_{i 11 O}))$ , where $x_{i 10 o} = s_{i o} π_{i n}, x_{i 11 o} = p_{i o} (1 - π_{i n}) = π_{i c} u_{i o} + π_{i a} b_{i o}$ , and o ∈ {1, …,O}.

Let λ_i be the probability P(R_ij = 1), which is usually known in a trial and treated as fixed. Therefore, for study i (i = 1,2, …,I), all observed counts N_irto follow a single multinomial distribution, with corresponding probability P_irto, for r ∈ {0, 1},t ∈ {0, 1}, o ∈ {1, …,O}. In mathematical notation, the distribution is M(N_i, x_i = {P_irto}), where P_i0to = (1−λ_i)x_i0to and P_i1to = λ_ix_i1to.
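As a check on the mixture structure described above, the following Python sketch computes the eight cell probabilities P_irto for a binary outcome from the compliance-class and response parameters. The function name and argument order are illustrative assumptions; the formulas follow the definitions of x_irto in this section, with arm 0 cells weighted by 1 − λ_i and arm 1 cells by λ_i.

```python
def cell_probabilities(pi_a, pi_n, u1, v1, s1, b1, lam):
    """Return {(r, t, o): P_irto} for a binary outcome o in {0, 1}."""
    pi_c = 1.0 - pi_a - pi_n  # complier probability
    probs = {}
    for o in (0, 1):
        # Probability of outcome o for each class (and arm, for compliers).
        u = u1 if o == 1 else 1.0 - u1  # compliers assigned to treatment
        v = v1 if o == 1 else 1.0 - v1  # compliers assigned to control
        s = s1 if o == 1 else 1.0 - s1  # never-takers
        b = b1 if o == 1 else 1.0 - b1  # always-takers
        probs[(0, 0, o)] = (1.0 - lam) * (pi_c * v + pi_n * s)
        probs[(0, 1, o)] = (1.0 - lam) * pi_a * b
        probs[(1, 0, o)] = lam * pi_n * s
        probs[(1, 1, o)] = lam * (pi_c * u + pi_a * b)
    return probs


# Illustrative parameter values, not estimates from the data.
example = cell_probabilities(0.15, 0.10, 0.12, 0.08, 0.05, 0.20, 0.5)
```

The eight probabilities sum to 1, and within each arm the probability of receiving the active treatment equals the share of the classes that would take it (always-takers in arm 0, compliers plus always-takers in arm 1).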

In addition to Assumptions 1–5, we make the latent ignorability (LI) assumption about missingness described in Section 1. That is, given the observed data and the latent unobserved compliance classes, missingness has no residual dependence on the outcomes. Under the LI assumption, Table 3 summarizes a typical data structure and notation for a study i with missing treatment-received status for randomized treatment group r ∈ {0, 1}. In each cell of Table 3, the first row shows the count and the second row shows the corresponding probability of the outcome; for a study in which subjects randomized to r had missing data on actual treatment received, only the rows labeled “Missing” would be observed.

Table 3:

Typical data for study i with missing actual treatment received status in randomization group r ∈ {0, 1}

Treatment received	Outcome
	1	…	O

0	N_ir01	…	N_ir0O
	P_ir01	…	P_ir0O

1	N_ir11	…	N_ir1O
	P_ir11	…	P_ir1O

Missing	N_ir*1	…	N_ir*O
	P_ir01 + P_ir11	…	P_ir0O + P_ir1O

In each cell, the first row: the observed count; the second row: the corresponding probability.

3.1.3. CACE in Meta-analysis

One causal effect of interest in many studies is the CACE discussed in Section 1. CACE for the i-th two-arm trial is defined as $θ_{i}^{CACE} = E (Y_{i j}^{1} - Y_{i j}^{0} ∣ C_{i j} = 1)$ . The overall causal effect θ^CACE from the meta-analysis can be estimated by taking the expectation of $θ_{i}^{CACE}$ over all I trials, $θ^{CACE} = E (θ_{i}^{CACE})$ . For an ordinal outcome Y_ij = o ∈ {1, …,O}, suppose we use equally spaced scores {1,2, …,O} to reflect the real distances between categories, then $θ_{i}^{CACE}$ is $\sum_{o} (o \times u_{i o}) - \sum_{o} (o \times v_{i o})$ . When the outcome is binary, we let o ∈ {0, 1}, so the CACE for the i-th trial is $θ_{i}^{CACE} = u_{i 1} - v_{i 1}$ .

A positive (negative) value of $θ_{i}^{CACE}$ indicates a beneficial treatment effect in the i-th trial if a higher value of o means a better (worse) outcome, and $θ_{i}^{CACE} = 0$ indicates no causal effect of treatment for compliers. Besides the aforementioned equally spaced scores {1,2, …,O}, their linear transforms may also be sensible in many cases and provide a reasonable compromise (Agresti, 2003). Alternative scoring systems such as midranks are also possible. When uncertain about which scoring choice to use, a sensitivity analysis can be conducted on different reasonable choices to see how they affect the estimates.
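The score-based definition can be written as a short function. This Python sketch is a hedged illustration (the function name and the example probabilities are hypothetical): it computes the study-level CACE as the difference in expected scores between compliers assigned to treatment (probabilities u_io) and compliers assigned to control (probabilities v_io), with the binary case recovered by using scores 0 and 1.

```python
def cace_from_probabilities(u, v, scores=None):
    """CACE_i = sum_o score_o * u_io - sum_o score_o * v_io."""
    if len(u) != len(v):
        raise ValueError("u and v must cover the same outcome categories")
    if scores is None:
        # Default: equally spaced scores 1, 2, ..., O.
        scores = list(range(1, len(u) + 1))
    mean_treated = sum(score * p for score, p in zip(scores, u))
    mean_control = sum(score * p for score, p in zip(scores, v))
    return mean_treated - mean_control


# Hypothetical three-category ordinal outcome.
ordinal_cace = cace_from_probabilities([0.2, 0.3, 0.5], [0.4, 0.4, 0.2])

# Binary outcome with scores (0, 1): reduces to u_i1 - v_i1.
binary_cace = cace_from_probabilities([0.90, 0.10], [0.95, 0.05], scores=[0, 1])
```

Shifting all scores by a constant leaves the CACE unchanged, and rescaling them by a positive factor rescales the CACE by the same factor, so linear transforms of the scores change only the units of the estimate.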

3.2. Estimation and Inference

3.2.1. The Likelihood

Let N_i = {N_ir} be the vector of observed data in study i, where r refers to the randomization group (r = 1 for treatment and r = 0 for the control/placebo arm). In each arm r, $N_{i r} = \{N_{i r}^{c}, N_{i r}^{m}\}$ , where the superscripts c and m denote complete and marginal counts, respectively. $N_{i r}^{c} = \{N_{i r t o}\}$ for each t ∈ {0, 1} and o ∈ {1, …,O}. If the full compliance data were observed in arm r of study i, the corresponding marginal counts $N_{i r}^{m} = \{N_{i r * o}\}$ are assigned as 0. Otherwise, if the actual received-treatment status in randomization arm r of study i was missing, only the marginal data $N_{i r}^{m} = \{N_{i r * o}\}$ are available.

From Section 3.1.2, if full compliance data were observed in both randomization groups, all observed counts N_irto follow a single multinomial distribution, with probability P_irto, where P_i0to = (1 − λ_i)x_i0to and P_i1to = λ_ix_i1to. Furthermore, as indicated by Table 3, all N_ir∗o also follow a multinomial distribution with probability P_ir0o + P_ir1o if only marginal data were observed, for o ∈ {1, …,O} in the i-th trial. Therefore, defining β_i = (π_ia, π_in, s_i, b_i, u_i, v_i), where s_i = (s_i1, …,s_i(O−1)), b_i = (b_i1, …, b_i(O−1)), u_i = (u_i1, …,u_i(O−1)), v_i = (v_i1, …,v_i(O−1)), study i’s likelihood contribution is

$$
\begin{aligned}
L_{i}(β_{i}) = \prod_{j} \prod_{o} {} & P_{i00o}^{(1 - R_{ij})(1 - T_{ij})(1 - M_{ij}) I(Y_{ij} = o)} \, P_{i01o}^{(1 - R_{ij}) T_{ij} (1 - M_{ij}) I(Y_{ij} = o)} \\
& \times P_{i10o}^{R_{ij} (1 - T_{ij})(1 - M_{ij}) I(Y_{ij} = o)} \, P_{i11o}^{R_{ij} T_{ij} (1 - M_{ij}) I(Y_{ij} = o)} \\
& \times (P_{i00o} + P_{i01o})^{(1 - R_{ij}) M_{ij} I(Y_{ij} = o)} \, (P_{i10o} + P_{i11o})^{R_{ij} M_{ij} I(Y_{ij} = o)},
\end{aligned} \tag{2}
$$

where the relations among the components of β_i and P_irto are summarized in Section 3.1.2, j = 1, …,N_i, o = 1, …,O, and the indicator function I(Y_ij = o) = 1 if Y_ij = o and 0 otherwise. The parameters are subject to $\sum_{o} u_{i o} = \sum_{o} v_{i o} = \sum_{o} s_{i o} = \sum_{o} b_{i o} = 1$ and 0 ≤ π_ia, π_in, u_io, v_io, s_io, b_io ≤ 1. The likelihood function for all trials in a meta-analysis is $L (β) = \prod_{i} L_{i} (β_{i})$ .

We use trials with binary outcomes to further illustrate the modeling; this also represents the situation in the motivating example. In this case, o ∈ {0, 1}, i.e., s_i0 + s_i1 = b_i0 + b_i1 = u_i0 + u_i1 = v_i0 + v_i1 = 1 for study i, so the vector parameters of s_i, b_i, u_i, v_i are reduced to s_i1, b_i1, u_i1, v_i1. Data can be arranged as shown in Table 1, where in each randomization arm, data are shown either in the column “Complete data” or in the column “Missing data”, with values in the other columns all 0. Thus the observed data are $N_{i r} = \{N_{i r}^{c}, N_{i r}^{m}\} = \{N_{i r 00}, N_{i r 01}, N_{i r 10}, N_{i r 11}, N_{i r * 0}, N_{i r * 1}\}$ for r ∈ {0, 1}. Then the likelihood contribution for the i-th trial can be written as

$$
\begin{aligned}
L_{i}(β_{i}) = {} & \left[(1 - λ_{i}) \{π_{ic} (1 - v_{i1}) + π_{in} (1 - s_{i1})\}\right]^{N_{i000}} \left\{(1 - λ_{i}) (π_{ic} v_{i1} + π_{in} s_{i1})\right\}^{N_{i001}} \\
& \times \left\{(1 - λ_{i}) π_{ia} (1 - b_{i1})\right\}^{N_{i010}} \left\{(1 - λ_{i}) π_{ia} b_{i1}\right\}^{N_{i011}} \\
& \times \left\{λ_{i} π_{in} (1 - s_{i1})\right\}^{N_{i100}} \left\{λ_{i} π_{in} s_{i1}\right\}^{N_{i101}} \\
& \times \left[λ_{i} \{π_{ic} (1 - u_{i1}) + π_{ia} (1 - b_{i1})\}\right]^{N_{i110}} \left\{λ_{i} (π_{ic} u_{i1} + π_{ia} b_{i1})\right\}^{N_{i111}} \\
& \times \left[(1 - λ_{i}) \{π_{ic} (1 - v_{i1}) + π_{in} (1 - s_{i1}) + π_{ia} (1 - b_{i1})\}\right]^{N_{i0*0}} \left\{(1 - λ_{i}) (π_{ic} v_{i1} + π_{in} s_{i1} + π_{ia} b_{i1})\right\}^{N_{i0*1}} \\
& \times \left[λ_{i} \{π_{ic} (1 - u_{i1}) + π_{ia} (1 - b_{i1}) + π_{in} (1 - s_{i1})\}\right]^{N_{i1*0}} \left\{λ_{i} (π_{ic} u_{i1} + π_{ia} b_{i1} + π_{in} s_{i1})\right\}^{N_{i1*1}},
\end{aligned} \tag{3}
$$

where β_i = (π_ia, π_in, s_i1, b_i1, u_i1, v_i1), and the parameters vary between studies following some distributions with hyper-parameters, which we now describe.
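The structure of Equation (3) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' JAGS code: the function names, the `'*'` key for unreported compliance, and the example counts and parameter values are all hypothetical. Each cell probability is the arm probability times the mixture of latent classes that can produce that (received treatment, outcome) pair, and cells with unreported compliance sum over the received treatment.

```python
import math

def cell_probabilities(lam, pi_a, pi_n, s1, b1, u1, v1):
    """Cell probabilities of Equation (3), keyed by (r, t, o).

    r: randomized arm, t: treatment received ('*' if not reported), o: outcome.
    """
    pi_c = 1.0 - pi_a - pi_n
    probs = {}
    for o in (0, 1):
        # Probability of outcome o within each latent class.
        s = s1 if o == 1 else 1.0 - s1   # never-takers
        b = b1 if o == 1 else 1.0 - b1   # always-takers
        u = u1 if o == 1 else 1.0 - u1   # compliers randomized to treatment
        v = v1 if o == 1 else 1.0 - v1   # compliers randomized to control
        probs[(0, 0, o)] = (1 - lam) * (pi_c * v + pi_n * s)
        probs[(0, 1, o)] = (1 - lam) * pi_a * b
        probs[(1, 0, o)] = lam * pi_n * s
        probs[(1, 1, o)] = lam * (pi_c * u + pi_a * b)
        # Compliance not reported: sum over the received treatment t.
        probs[(0, '*', o)] = probs[(0, 0, o)] + probs[(0, 1, o)]
        probs[(1, '*', o)] = probs[(1, 0, o)] + probs[(1, 1, o)]
    return probs

def log_likelihood(counts, **params):
    """Log of Equation (3) for one trial; counts maps (r, t, o) to N_{i r t o}."""
    probs = cell_probabilities(**params)
    return sum(n * math.log(probs[cell]) for cell, n in counts.items() if n > 0)

# Hypothetical trial: compliance reported in the control arm only.
example_params = dict(lam=0.5, pi_a=0.25, pi_n=0.30, s1=0.6, b1=0.4, u1=0.3, v1=0.7)
example_counts = {(0, 0, 0): 50, (0, 0, 1): 80, (0, 1, 0): 25, (0, 1, 1): 20,
                  (1, '*', 0): 95, (1, '*', 1): 80}
example_loglik = log_likelihood(example_counts, **example_params)
```

Multinomial constants are omitted because they do not depend on β_i.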

To account for potential between-study heterogeneity of the compliance classes and outcome probabilities, we consider a random effects model. Specifically, to guarantee the desired properties of latent compliance classes in study i, i.e., π_in + π_ia + π_ic = 1 and 0 ≤ π_in, π_ia, π_ic ≤ 1, and to allow these probabilities to vary between studies, the parameters are specified as: $π_{i n} = \frac{\exp (n_{i})}{1 + \exp (n_{i}) + \exp (a_{i})}$ , $π_{i a} = \frac{\exp (a_{i})}{1 + \exp (n_{i}) + \exp (a_{i})}$ , and π_ic = 1 − π_in − π_ia, where n_i = α_n + δ_in and a_i = α_a + δ_ia. The random effect (δ_in, δ_ia) has a bivariate normal distribution with mean 0 and variance-covariance matrix $Σ_{l c} = \begin{pmatrix} σ_{n}^{2} & ρ σ_{n} σ_{a} \\ ρ σ_{n} σ_{a} & σ_{a}^{2} \end{pmatrix}$ , to allow correlation between n_i and a_i across studies.
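As a small illustration of this transformation, the sketch below maps (n_i, a_i) to the three class probabilities, with π_in defined analogously to π_ia (numerator exp(n_i)), as restated in Section 6. The function name is hypothetical; the inputs are the α_n and α_a values used later in the simulation study, with the random effects set to zero.

```python
import math

def compliance_probs(n_i, a_i):
    """Map (n_i, a_i) on the transformed scale to (pi_n, pi_a, pi_c)."""
    denom = 1.0 + math.exp(n_i) + math.exp(a_i)
    pi_n = math.exp(n_i) / denom
    pi_a = math.exp(a_i) / denom
    pi_c = 1.0 / denom  # compliers are the reference class
    return pi_n, pi_a, pi_c

# alpha_n = -0.4, alpha_a = -0.6 with delta_in = delta_ia = 0.
pi_n, pi_a, pi_c = compliance_probs(-0.4, -0.6)
```

By construction the three values lie in (0, 1) and sum to one for any real n_i and a_i; rounded to two decimals they agree with the true values quoted in Section 5.1.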

We also define random effects models on the transformed scale of each response probability s_i1, b_i1, u_i1, v_i1: g(s_i1) = α_s + δ_is, g(b_i1) = α_b + δ_ib, g(u_i1) = α_u + δ_iu, g(v_i1) = α_v + δ_iv, where g(·) is a link function such as the logit or probit. These response rates are assumed to be independent across principal strata, so $δ_{i s} \sim N (0, σ_{s}^{2})$ , $δ_{i b} \sim N (0, σ_{b}^{2})$ , $δ_{i u} \sim N (0, σ_{u}^{2})$ , $δ_{i v} \sim N (0, σ_{v}^{2})$ . The model can easily be extended to outcomes with more than two categories.

3.2.2. Prior Specifications and the Posterior Distribution

We assign proper but diffuse prior distributions for the hyper-parameters. Specifically, α_n and α_a both follow N(0,2.5²), such that under the simplest situation (a fixed effects model), a 95% prior probability interval for any of the probabilities π_in, π_ia, π_ic ranges from about 0.001 to 0.91; and α_s, α_b, α_u, α_v all follow N(0,2²), which implies a 95% interval for the probabilities s_i1, b_i1, u_i1, v_i1 ranging from about 0.01 to 0.98. The hyper-priors for the precision parameters $σ_{s}^{- 2}$ , $σ_{b}^{- 2}$ , $σ_{u}^{- 2}$ and $σ_{v}^{- 2}$ are assumed to be Gamma(2,2), which corresponds to a 95% interval of (0.6, 2.9) for the corresponding standard deviations, allowing moderate heterogeneity in the response probabilities. The prior for the precision matrix $Σ_{l c}^{- 1}$ is Wishart, i.e., W(I, 3), where I is the identity matrix. In a reduced model with one of $σ_{n}^{2}$ , $σ_{a}^{2}$ set to 0, the prior for the other precision parameter is also assumed to be Gamma(2,2), which allows moderate heterogeneity in the latent compliance class probabilities.
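The interval implied by the Gamma(2,2) prior can be checked with a few lines of Python. This sketch assumes the shape-rate parameterization (the convention of JAGS's `dgamma`), for which the CDF has the closed form 1 − exp(−2x)(1 + 2x); the helper names are hypothetical. Quantiles of the precision are found by bisection and then mapped to the standard deviation scale.

```python
import math

def gamma22_cdf(x):
    """CDF of a Gamma(shape=2, rate=2) variable: 1 - exp(-2x) * (1 + 2x)."""
    return 1.0 - math.exp(-2.0 * x) * (1.0 + 2.0 * x)

def quantile(cdf, p, lo=0.0, hi=50.0, tol=1e-10):
    """Invert a continuous, increasing CDF on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

prec_lo = quantile(gamma22_cdf, 0.025)   # 2.5% quantile of the precision
prec_hi = quantile(gamma22_cdf, 0.975)   # 97.5% quantile of the precision

# A large precision corresponds to a small standard deviation, so the ends swap.
sd_interval = (1.0 / math.sqrt(prec_hi), 1.0 / math.sqrt(prec_lo))
```

Under this parameterization the resulting interval for the standard deviation rounds to the (0.6, 2.9) stated above.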

Let $f (β_{i} ∣ β_{0}, Σ_{0})$ denote the distributions described in Section 3.2.1 for the parameters β_i = (π_ia, π_in, s_i1, b_i1, u_i1, v_i1), where β₀ is the vector of mean hyper-parameters (α_n, α_a, α_s, α_b, α_u, α_v), and $Σ_{0}$ collects the variance hyper-parameters, parameterized through the precisions $Σ_{l c}^{- 1}$ , $σ_{s}^{- 2}$ , $σ_{b}^{- 2}$ , $σ_{u}^{- 2}$ and $σ_{v}^{- 2}$ . Denoting the prior distributions specified above as f(β₀) and $f (Σ_{0})$ , the joint posterior distribution is then proportional to $\prod_{i} L_{i} (β_{i}) f (β_{i} ∣ β_{0}, Σ_{0}) f (β_{0}) f (Σ_{0})$ . We sample from the joint posterior using Markov chain Monte Carlo (MCMC) methods, specifically Gibbs and Metropolis-Hastings sampling algorithms (Gelfand and Smith, 1990).

As mentioned in Section 3.1.3, for binary outcomes, θ^CACE can be estimated as $E (θ_{i}^{CACE}) = E (u_{i 1}) - E (v_{i 1})$ . Integrating out the random effects, $E (u_{i 1}) = \int_{- \infty}^{+ \infty} g^{- 1} (α_{u} + t) σ_{u}^{- 1} ϕ (\frac{t}{σ_{u}}) d t$ and $E (v_{i 1}) = \int_{- \infty}^{+ \infty} g^{- 1} (α_{v} + t) σ_{v}^{- 1} ϕ (\frac{t}{σ_{v}}) d t$ , where ϕ(·) is the standard Gaussian density. Using probit link functions for u_i1 and v_i1, we have closed-form formulas $E (u_{i 1}) = Φ (\frac{α_{u}}{\sqrt{1 + σ_{u}^{2}}})$ and $E (v_{i 1}) = Φ (\frac{α_{v}}{\sqrt{1 + σ_{v}^{2}}})$ so that

θ^{CACE} = Φ (\frac{α_{u}}{\sqrt{1 + σ_{u}^{2}}}) - Φ (\frac{α_{v}}{\sqrt{1 + σ_{v}^{2}}}) .

(4)
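The closed form behind Equation (4) can be checked numerically. The sketch below, with hypothetical function names, compares Φ(α/√(1 + σ²)) with a midpoint-rule approximation of the integral, written in terms of a standard normal variable z = t/σ, and then evaluates Equation (4).

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def marginal_probit(alpha, sigma):
    """Closed form of E[Phi(alpha + delta)] for delta ~ N(0, sigma^2)."""
    return norm_cdf(alpha / math.sqrt(1.0 + sigma ** 2))

def marginal_numeric(inv_link, alpha, sigma, n=4000, width=10.0):
    """Midpoint-rule approximation of E[inv_link(alpha + sigma * Z)], Z ~ N(0, 1)."""
    h = 2.0 * width / n
    total = 0.0
    for k in range(n):
        z = -width + (k + 0.5) * h
        total += inv_link(alpha + sigma * z) * norm_pdf(z) * h
    return total

def cace(alpha_u, sigma_u, alpha_v, sigma_v):
    """Equation (4): overall CACE under probit links for u_i1 and v_i1."""
    return marginal_probit(alpha_u, sigma_u) - marginal_probit(alpha_v, sigma_v)

closed = marginal_probit(-0.5, 0.5)
numeric = marginal_numeric(norm_cdf, -0.5, 0.5)
theta_no_re = cace(-0.5, 0.0, 0.5, 0.0)  # simulation values, no random effects
```

With α_u = −0.5, α_v = 0.5 and no random effects, `theta_no_re` rounds to the true value −0.38 quoted in Section 5.1.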

For s_i1 and b_i1, we used the logit link random effects model. Though the integral in E(s_i1) does not have a closed-form formula, it has a well-established approximation, $E (s_{i 1}) \approx {logit}^{- 1} (\frac{α_{s}}{\sqrt{1 + C^{2} σ_{s}^{2}}})$ , where $C = \frac{16 \sqrt{3}}{15 π}$ (Zeger et al., 1988). This approximation also applies to estimating the overall always-taker response rate E(b_i1). One can use either the same or different links for parameters u_i1, v_i1, s_i1 and b_i1. In particular, the logit and probit links approximate each other very well. For convenience, we chose the probit link for u_i1 and v_i1 because it gives us a closed form for the posterior θ^CACE, while we chose the logit link for s_i1 and b_i1 because it is more commonly used.
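To give a feel for the quality of the logit approximation, the following sketch compares it with direct numerical integration. Function names and the test values of (α, σ) are illustrative assumptions; the agreement is approximate and the gap can be expected to grow as σ increases.

```python
import math

C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)  # constant from Zeger et al. (1988)

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def marginal_logit_approx(alpha, sigma):
    """Approximation to E[expit(alpha + delta)], delta ~ N(0, sigma^2)."""
    return expit(alpha / math.sqrt(1.0 + C ** 2 * sigma ** 2))

def marginal_logit_numeric(alpha, sigma, n=4000, width=10.0):
    """Midpoint-rule value of E[expit(alpha + sigma * Z)], Z ~ N(0, 1)."""
    h = 2.0 * width / n
    total = 0.0
    for k in range(n):
        z = -width + (k + 0.5) * h
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += expit(alpha + sigma * z) * pdf * h
    return total

# Absolute gap between approximation and quadrature for two illustrative settings.
gaps = {
    (alpha, sigma): abs(marginal_logit_approx(alpha, sigma)
                        - marginal_logit_numeric(alpha, sigma))
    for alpha, sigma in [(0.5, 0.5), (-1.0, 1.0)]
}
```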

In each MCMC iteration, a draw of θ^CACE is calculated from that iteration's draws of (α_u, σ_u, α_v, σ_v) using Equation (4). We use medians and equal-tail credible intervals (CrIs) of these posterior samples to make inferences for the random effects models.

3.2.3. Model Selection and Implementation

The model specified in Section 3.2.1 included all possible random effects to account for possible between-study heterogeneity of the fractions in the compliance classes and heterogeneity of the response rate probabilities. However, over-fitting the data with too many random effects should be avoided because it may inflate posterior variances. Therefore, we have used a forward selection procedure to choose the final model, beginning with a model having no random effects and at each forward step adding the random-effect component that gave the largest improvement in the deviance information criterion (DIC) (Spiegelhalter et al., 2002). Other model-selection approaches can be substituted easily, e.g., using a different model-selection criterion or a different search strategy.
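The forward selection loop can be sketched generically as follows. This is a hedged illustration, not the authors' procedure in detail: `dic_of` is a hypothetical callable that would fit the model with a given set of random effects (for example in JAGS) and return its DIC, `min_gain` is an assumed threshold for a "notable" improvement, and the toy scoring function uses made-up gains purely to exercise the loop.

```python
def forward_select(candidates, dic_of, min_gain=0.0):
    """Greedy forward selection: at each step add the component with the largest DIC drop."""
    selected = []
    best = dic_of(frozenset(selected))
    remaining = list(candidates)
    while remaining:
        # Score every one-component extension of the current model.
        scored = [(dic_of(frozenset(selected + [c])), c) for c in remaining]
        new_dic, choice = min(scored)
        if best - new_dic <= min_gain:
            break  # no extension improves DIC by more than min_gain
        selected.append(choice)
        remaining.remove(choice)
        best = new_dic
    return selected, best

# Toy stand-in for model fitting; the gains are invented, not results from the paper.
TOY_GAIN = {"n": 10.0, "a": 5.0, "s": 3.0, "u": 1.0, "b": -2.0, "v": -1.0}

def toy_dic(components):
    return 100.0 - sum(TOY_GAIN[c] for c in components)

chosen, final_dic = forward_select(TOY_GAIN, toy_dic, min_gain=2.0)
```

In a real analysis each call to `dic_of` is an MCMC fit, so caching fitted models by their component set would avoid refitting.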

We used JAGS software version 4.3 via the rjags package in R to sample from the joint posterior distribution. We ran three independent MCMC chains with starting points drawn randomly from their prior distributions. After 10,000 burn-in samples, the subsequent 100,000 posterior samples were obtained for each chain. Convergence to the stationary distribution was assessed using trace plots, sample autocorrelation, and the Gelman and Rubin statistic (Gelman and Rubin, 1992).
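For readers unfamiliar with the Gelman and Rubin statistic, the sketch below computes its basic form for one scalar parameter from several chains of equal length. It is a simplified illustration with hypothetical names and simulated chains; packaged implementations typically add refinements such as a degrees-of-freedom correction, so their values may differ slightly.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])      # W: mean within-chain variance
    between = n * variance(chain_means)               # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n     # pooled variance estimate
    return (var_plus / within) ** 0.5

rng = random.Random(1)
# Three simulated chains targeting the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three simulated chains, one of which is centered elsewhere.
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 0.0, 3.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values near 1 are consistent with convergence, while values well above 1 indicate that the chains have not mixed.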

3.2.4. Model for Complete Data Only

Here we discuss how the naive “two-step” approach introduced in Section 1 can be viewed as a special case of our model using only trials with complete noncompliance data. In this situation, only trials with complete data $N_{i r}^{c} = \{N_{irto}\}$ are used to make inference on CACE. Then the likelihood contribution of the i-th study is

L_{i} (β_{i}) = \prod_{j} \prod_{o} P_{i 00 o}^{(1 - R_{i j}) (1 - T_{i j}) (1 - M_{i j}) I (Y_{i j} = o)} P_{i 01 o}^{(1 - R_{i j}) T_{i j} (1 - M_{i j}) I (Y_{i j} = o)} P_{i 10 o}^{R_{i j} (1 - T_{i j}) (1 - M_{i j}) I (Y_{i j} = o)} P_{i 11 o}^{R_{i j} T_{i j} (1 - M_{i j}) I (Y_{i j} = o)} .

(5)

Note that when M_ij = 1 (i.e., for trials with incomplete noncompliance data), L_i(β_i) = 1. Thus for trial i with complete noncompliance data $N_{i r}^{c} = \{N_{irto}\}$ , one can separately estimate $θ_{i}^{CACE}$ and obtain a standard error. One can then combine these study-specific estimates using a standard meta-analysis method, such as a fixed-effect or random effects model, to estimate the population-averaged CACE. Alternatively, one can obtain the posterior estimate of θ^CACE through the joint posterior distribution, which is proportional to the likelihood for trials with complete noncompliance data $L (β) = \prod_{i} L_{i} (β_{i})$ multiplied by the prior distributions. By Lin and Zeng (2010), the two-step approach can be viewed as asymptotically equivalent to the model maximizing the joint likelihood. Therefore, in the simulation section below, we compare the performance of our proposed model including all trials with a model using only trials with complete noncompliance data instead of a two-step frequentist approach.

4. Case Study Results

4.1. Model Selection Results

We estimated the CACE of epidural analgesia in labor on cesarean section using all 27 RCTs introduced in Section 2. Rather than fitting the full model with all 6 potential random effects, δ_in, δ_ia, δ_is, δ_ib, δ_iu and δ_iv, we adopted the forward selection procedure described in Section 3.2.3. DIC, DIC improvement, and the effective number of parameters (p_D) for each model considered in the forward selection procedure are presented in the Supplementary Materials (Table W1). Starting with the model with no random effects (called Model I), at each forward step we added the random-effect component that gave the largest improvement in DIC, stopping when no addition gave a notable improvement. The final model was Model Vc, including random effects δ_ia, δ_in, δ_is, and δ_iu.

Figure 2 shows the kernel-smoothed posterior density of θ^CACE from the model selected in each forward step. The plot suggests θ^CACE has a fairly symmetric posterior density for all models. After adding the random effect δ_iu to probit(u_i1) in Model Vc, the posterior of θ^CACE is shifted right and its variance increased considerably, which further indicates the importance of appropriately accounting for random effects.

Table 4 lists estimated parameters from the fixed effects model (Model I) and the final model (Model Vc), where the triple of percentiles, $_{2.5}50_{97.5}$, is used to display each parameter’s posterior median with its 95% equal-tail credible interval, as suggested by Louis and Zeger (2009). Monte Carlo integration (Ueberhuber, 1997) was used to estimate the probability of being in each principal stratum, π_a, π_c, and π_n, when δ_in and δ_ia were both present (Model Vc). The marginal never-taker response rate s₁ = E(s_i1) of Model Vc was estimated using the approximation $E (s_{i 1}) \approx {logit}^{- 1} (\frac{α_{s}}{\sqrt{1 + C^{2} σ_{s}^{2}}})$ , $C = \frac{16 \sqrt{3}}{15 π}$ , and the marginal treated complier response rate u₁ = E(u_i1) was estimated using the closed-form formula $E (u_{i 1}) = Φ (\frac{α_{u}}{\sqrt{1 + σ_{u}^{2}}})$ . For marginal response rates without a random effect (all rates under Model I; b₁ and v₁ under Model Vc), the values were obtained directly by back-transforming the fixed-effect parameters. For example, the marginal always-taker response rate was b₁ = E(b_i1) = logit⁻¹(α_b) because b_i1 had no random effect in either model. Based on the final model (Model Vc), the posterior median and interval for θ^CACE were $_{−0.003}0.041_{0.105}$, which covers zero and indicates a nonsignificant complier average causal effect, though the estimated effect was more than twice the estimate from the fixed effects model (Model I). The random effects δ_in and δ_ia, which govern π_n, π_a, and π_c on the transformed scale, had standard deviations of 1.65 and 2.24, respectively, while the random effect for s₁ had a standard deviation of 2.11 on the logit scale. After adding the random effects δ_in, δ_ia, δ_is and δ_iu, the posteriors of π_n, π_a, s₁ and u₁ changed markedly from those estimated by Model I.

Table 4:

Summary of parameter estimates for the epidural analgesia meta-analysis

Parameter	Model I (None)	Model Vc (δ_in, δ_ia, δ_is, δ_iu)

θ^CACE	$_{−0.003}0.017_{0.038}$	$_{−0.003}0.041_{0.105}$
Overall never-taker probability π_n	$_{0.214}0.230_{0.246}$	$_{0.033}0.101_{0.259}$
Overall always-taker probability π_a	$_{0.136}0.152_{0.170}$	$_{0.065}0.190_{0.400}$
Overall complier probability π_c	$_{0.594}0.618_{0.641}$	$_{0.544}0.687_{0.787}$
Overall never-taker response s₁	$_{0.029}0.046_{0.068}$	$_{0.116}0.254_{0.488}$
Always-taker response b₁	$_{0.124}0.168_{0.216}$	$_{0.100}0.140_{0.174}$
Treated complier response u₁	$_{0.093}0.112_{0.131}$	$_{0.065}0.108_{0.173}$
Control complier response v₁	$_{0.078}0.095_{0.112}$	$_{0.054}0.068_{0.083}$
Mean parameter of n_i	$_{−1.089}−0.988_{−0.887}$	$_{−3.196}−2.173_{−1.224}$
Mean parameter of a_i	$_{−1.542}−1.399_{−1.260}$	$_{−3.521}−2.038_{−0.758}$
Standard deviation of n_i	–	$_{1.055}1.645_{2.846}$
Standard deviation of a_i	–	$_{1.402}2.240_{3.901}$
Standard deviation of logit(s_i1)	–	$_{1.231}2.110_{4.131}$
Standard deviation of probit(u_i1)	–	$_{0.431}0.600_{0.912}$

The notation $_{L}P_{U}$ denotes the posterior median P with 95% equal-tailed credible limits (L, U).
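The Monte Carlo integration used for the marginal class probabilities can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the function name is hypothetical, the hyper-parameter values are the Model Vc posterior medians from Table 4 plugged in as fixed numbers, and the correlation ρ, which is not reported in this section, is set to 0. Because the paper summarizes such quantities over posterior draws, the output of this plug-in calculation need not match the table.

```python
import math
import random

def marginal_class_probs(alpha_n, alpha_a, sigma_n, sigma_a, rho, draws=20000, seed=0):
    """Average (pi_n, pi_a, pi_c) over bivariate normal random effects."""
    rng = random.Random(seed)
    tot_n = tot_a = 0.0
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Cholesky factor of the 2x2 covariance gives correlated effects.
        d_n = sigma_n * z1
        d_a = sigma_a * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        e_n = math.exp(alpha_n + d_n)
        e_a = math.exp(alpha_a + d_a)
        denom = 1.0 + e_n + e_a
        tot_n += e_n / denom
        tot_a += e_a / denom
    pi_n, pi_a = tot_n / draws, tot_a / draws
    return pi_n, pi_a, 1.0 - pi_n - pi_a

# Plug-in illustration with Table 4 medians; rho = 0 is an assumption.
mc_pi_n, mc_pi_a, mc_pi_c = marginal_class_probs(-2.173, -2.038, 1.645, 2.240, rho=0.0)
```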

Figure 3 is a forest plot of the posterior medians and 95% equal-tail CrIs of $θ_{i}^{CACE}$ for each trial based on the final model, Model Vc. Studies with a “†” in the “Study (Author, Year)” column had complete data on compliance status and we used solid lines to represent their CrIs. For a study with incomplete data, as its $θ_{i}^{CACE}$ was not directly estimable from that trial alone, we used a dashed line to show the posterior 95% CrI. The figure shows that studies with complete data tend to have narrower credible intervals, while the study-specific estimates $θ_{i}^{CACE}$ were quite heterogeneous, indicating differences in the study populations. Compared to the overall risk difference estimated by the two-step ITT meta-analysis, the overall θ^CACE from the final model had a wider 95% CrI that still covered zero, suggesting the effect of epidural analgesia in labor on cesarean section is not statistically significant from the perspective of causal inference. However, compared to the estimated risk difference of 0.8% (95% CI: −0.3%, 1.9%) given by the two-step ITT meta-analysis, the estimated CACE was 4.1% (95% CrI: −0.3%, 10.5%), suggesting a potentially much larger effect size. This shows the potential dilution of the estimated average treatment effect in a two-step ITT meta-analysis.

4.2. Sensitivity to the LI Assumption

The above models were built upon the assumption of latent ignorable (LI) missingness. However, this assumption may not be satisfied in some applications. For example, studies showing a treatment effect may have a higher chance of reporting compliance status. This is a form of missing not at random (MNAR): the probability of missing compliance data depends on the outcome. However, in practice, one can never tell from the data at hand whether missingness is LI or MNAR (Little and Rubin, 2014). Thus, we present a sensitivity analysis that uses a known MNAR mechanism to show its impact on treatment-effect estimates.

Let the I × 2 matrix Ξ denote the study-level compliance missingness of a meta-analysis dataset containing I studies and 2 treatment arms. The entries of Ξ are ξ_ir, i = 1, …,I and r = 0,1, with ξ_ir = 1 if compliance information is missing in randomized group r of study i, and ξ_ir = 0 if the data are complete. We assume $ξ_{i r} \sim Bern (p_{i r}^{mis})$ , where $p_{i r}^{mis}$ is the probability of missing compliance status (i.e., no data on the actual treatment taken) in study i’s randomized group r. We specify a model of missingness for $p_{i r}^{mis}$ as

logit (p_{i 0}^{mis}) = γ_{00} + γ_{10} \times logit (v_{i 1}), logit (p_{i 1}^{mis}) = γ_{01} + γ_{11} \times logit (u_{i 1}) .

(6)

In this model, γ₀₀ (γ₀₁) is an intercept, and γ₁₀ (γ₁₁) describes the strength of association between the missingness probability and the study-specific response rate of a complier in the randomized control (treatment) group, i.e., the components of $θ_{i}^{CACE}$ . When γ_1r = 0 for r = 0,1, the missingness probabilities are not related to any model parameters, hence the missingness is completely at random (MCAR). For the purpose of assessing the effect of MNAR, for a given γ₁₀ and γ₁₁, this model of missingness can be incorporated in the likelihood in Section 3.2.1 and treated as if it were known to be true. Note that this model of missingness is not intended for general MNAR scenarios but is specific to the CACE problem; we only consider scenarios in which missingness is related to components of θ^CACE.
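Equation (6) is a logistic regression of the missingness indicator on the logit of a complier response rate. A minimal sketch, with hypothetical function names and illustrative inputs, shows how the slope controls the dependence:

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def p_missing(gamma0, gamma1, response_rate):
    """Equation (6): missingness probability given a complier response rate."""
    return expit(gamma0 + gamma1 * logit(response_rate))

# With slope 0 the probability ignores the response rate (MCAR).
mcar = [p_missing(0.0, 0.0, u) for u in (0.05, 0.10, 0.20)]
# With a positive slope, higher treated-complier response means more missingness.
mnar = [p_missing(0.0, 1.5, u) for u in (0.05, 0.10, 0.20)]
```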

In this case study, as the random effect δ_iv was not selected into the final Model Vc, the response rates for compliers randomized to the control group (v_i1) were the same across trials. Thus the missing probabilities in the control arm $p_{i 0}^{mis}$ according to Equation (6) were also the same for all studies i. For illustration, we set γ₁₀ = 0 in conducting sensitivity analyses to the specific MNAR scenario, to explore the impact on CACE estimates as γ₁₁ changes from negative to positive. Since a flat prior for γ_0r with a large variance would lead to a marginal prior distribution for $p_{i r}^{mis}$ heavily weighted towards 0 and 1, we follow Zhang et al. (2017) by specifying a logistic(0,1) prior for γ_0r, which gives an approximate uniform prior for $p_{i r}^{mis}$ on (0,1).

Figure 4 summarizes the posterior of θ^CACE from the meta-analysis of epidural analgesia in labor when we set γ₁₀ = 0 in Equation (6) and allow γ₁₁ to range from −2.5 to 2.5 under the final model (Model Vc).

As γ₁₁ increased from −2.5 to 2.5, the posterior median of θ^CACE increased from about 0.02 to 0.07, and the 95% equal-tail credible interval of θ^CACE no longer covered zero once γ₁₁, the coefficient of logit(u_i1), exceeded about 0.3. If we used a 95% highest posterior density credible interval instead of the equal-tail interval, the interval excluded zero once γ₁₁ exceeded about 0.7. Thus, when the missingness probabilities were positively and strongly enough associated with u_i1, the CACE became statistically significant, which differs from the conclusion drawn in Section 4.1 under the LI assumption. Therefore, the missingness mechanism for compliance influences the causal effect estimates in this meta-analysis of epidural analgesia in labor.
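The two kinds of credible interval compared here can differ noticeably for skewed posteriors. The following sketch computes both from a vector of draws; the function names are hypothetical and the draws are simulated from a right-skewed distribution purely for illustration, not taken from the analysis.

```python
import math
import random

def equal_tail_interval(draws, level=0.95):
    """Interval between the empirical (1-level)/2 and 1-(1-level)/2 quantiles."""
    xs = sorted(draws)
    n = len(xs)
    tail = (1.0 - level) / 2.0
    return xs[int(tail * (n - 1))], xs[int(math.ceil((1.0 - tail) * (n - 1)))]

def hpd_interval(draws, level=0.95):
    """Shortest interval containing a fraction `level` of the draws."""
    xs = sorted(draws)
    n = len(xs)
    k = int(math.ceil(level * n))  # number of draws the interval must contain
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

rng = random.Random(7)
skewed = [rng.gammavariate(2.0, 1.0) for _ in range(10000)]  # right-skewed toy draws

et = equal_tail_interval(skewed)
hpd = hpd_interval(skewed)
```

For a right-skewed sample the HPD interval is shorter and sits further to the left than the equal-tail interval, which is why the two can cross zero at different values of γ₁₁.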

5. Simulation

5.1. Simulation Setups

We conducted simulation studies to evaluate how the proposed method performs under different assumptions. As in the case study, we assumed o ∈ {0, 1}, i.e., the outcome is binary. We set (α_n, α_a, α_s, α_b, α_u, α_v) = (−0.4, −0.6, 0.5, −0.5, −0.5, 0.5), so that the true values in the absence of random effects were π_ic = 0.45, π_in = 0.30, π_ia = 0.25 and $θ_{i}^{CACE} = - 0.38$ . When random effects were present, we assumed the random effects had standard deviation 0.5, i.e., each of σ_n, σ_a, σ_s, σ_u = 0.5. To evaluate the model’s performance and the impact of random effects, we generated compliance status and outcomes data with three sets of random effects, corresponding to Section 4.1’s Model IIIe (δ_in, δ_ia), Model IVa (δ_in, δ_ia, δ_is), and Model Vc (δ_in, δ_ia, δ_is, δ_iu). The logit link was used for s_i1 and the probit link was used for u_i1 when δ_is or δ_iu, respectively, was present in the model. Under each scenario, we simulated 2000 datasets. Each dataset comprised 20 studies in which 350 subjects per study were randomized to either the treatment or control group with a 1 : 1 ratio (λ = 0.5). The setup values for the MCMC algorithm, including the number of MCMC chains, method for generating starting points, numbers of iterations for burn-in and after burn-in, were the same as in Section 3.2.3.
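The data-generating step for a single simulated study can be sketched as below, here for the scenario without random effects. This is a minimal illustration, not the authors' simulation code: the function name is hypothetical, each subject is randomized independently with probability λ (an assumption, since the text only states a 1 : 1 ratio), and the logit link is used for s and b and the probit link for u and v, as described in Section 3.2.2.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_study(n_subjects, lam, pi_n, pi_a, s1, b1, u1, v1, rng):
    """Simulate complete-data counts N_{r t o} for one study."""
    counts = {(r, t, o): 0 for r in (0, 1) for t in (0, 1) for o in (0, 1)}
    for _ in range(n_subjects):
        r = 1 if rng.random() < lam else 0           # randomized arm
        x = rng.random()                             # latent compliance class
        cls = "n" if x < pi_n else ("a" if x < pi_n + pi_a else "c")
        t = {"n": 0, "a": 1, "c": r}[cls]            # treatment actually received
        p = {"n": s1, "a": b1, "c": u1 if r == 1 else v1}[cls]
        o = 1 if rng.random() < p else 0             # binary outcome
        counts[(r, t, o)] += 1
    return counts

# True values implied by the stated alphas when no random effects are present.
denom = 1.0 + math.exp(-0.4) + math.exp(-0.6)
sim_counts = simulate_study(
    350, 0.5,
    pi_n=math.exp(-0.4) / denom, pi_a=math.exp(-0.6) / denom,
    s1=expit(0.5), b1=expit(-0.5), u1=norm_cdf(-0.5), v1=norm_cdf(0.5),
    rng=random.Random(2021),
)
```

Random effects would be added by drawing δ_in, δ_ia, δ_is and δ_iu for each study before computing these probabilities.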

We created partially missing compliance data under the MCAR, LI, and MNAR assumptions, as follows. Under the MCAR assumption, the missingness indicators for all studies were prespecified such that the first ten studies in the control arm (R = 0) and the 6-th to 15-th studies in the treatment arm (R = 1) did not have compliance information, so that only 5 studies had full data in both arms. To generate partially incomplete data under the LI and MNAR assumptions, we applied a logit model to calculate the missingness probabilities in the control arm (R = 0) and treatment arm (R = 1) separately; these probabilities were used to generate random missingness indicators, and when an indicator equaled 1 only the marginal data were kept in that arm of the study.

The models for the missingness indicators are:

ξ_{i r} \sim Bern (p_{i r}^{mis}),

LI: logit (p_{i r}^{mis}) = β_{0 r} + β_{1 r} \times logit (π_{i c})

MNAR: p_{i 0}^{mis} = 0.5, logit (p_{i 1}^{mis}) = γ_{01} + γ_{11} \times logit (u_{i 1}),

where r = 0,1 indicate the control and treatment groups respectively. If the missing indicator ξ_ir = 1, then data on compliance status in the i-th study arm r were set to missing, i.e., only marginal values N_ir∗1, N_ir∗0 were available. Our data-generation settings imply that the parameter π_ic is independent of $θ_{i}^{CACE}$ , so we considered the missing assumption to be LI. For ease of presentation, we let $p_{i 0}^{mis} = p_{i 1}^{mis}$ in the LI scenario so β_0r and β_1r can be reduced to β₀ and β₁. For MNAR, we assumed the probability of missing compliance data in the treatment arm is related to u_i1 on the logit scale.

The intercept terms were chosen to control the expected missingness probability at about 0.5 in each scenario. Under the LI assumption, we set β₁ = 2, referring to the scenario in which the missingness probabilities depend on the probability of being a complier. In a study with a higher proportion of compliers, the noncompliance rates tend to be smaller such that the ITT analysis would perform well, which may imply a higher probability of not reporting compliance information. Thus the coefficient β₁ was set to a positive value, matching the above situation.

Under the MNAR assumption, we set γ₁₁ = −2 to produce a scenario in which missingness in the treatment arm is related to the response rate in the compliers, while the missingness probability in the control arm was set to the fixed value of 0.5. As the true value of $θ_{i}^{CACE}$ in our setting is negative (a beneficial complier average causal effect if the outcome o = 1 is an adverse event), we created a scenario with 1) a fixed response rate for a control complier (v_i1); and 2) a missingness probability in the treatment arm that decreases as the treated-complier response rate (u_i1) increases. Thus, the larger the beneficial CACE in a study (i.e., the smaller u_i1), the more likely it was that the study’s investigators would not report compliance information. To achieve this, we set the coefficient γ₁₁ to be negative. The value γ₁₁ = −2 not only implies a reasonable strength of association between $p_{i 1}^{mis}$ and u_i1, but also ensures that the distribution of $p_{i 1}^{mis}$ is well spread out between 0 and 1.

Under each missingness assumption and each true random-effect model, we compare the performance of our proposed method with the naive approach (described in Section 3.2.4) that includes only studies with complete data. Note that to create the missing compliance data, we just added N_ir11 and N_ir01 to give the marginal N_ir∗1, and added N_ir10 and N_ir00 to give N_ir∗0. For the analyses using the data from all studies, because no studies were discarded, the true underlying parameters still describe the data, and the proposed approach can be robust to different missing data generating mechanisms. However, for the naive approach that only includes studies with complete compliance information, patterns of the missing mechanism are expected to have an impact on the results.
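Collapsing an arm to its marginal counts is a simple sum over the received treatment. A small sketch, using a hypothetical function name, the `'*'` key as a stand-in for the ∗ index, and made-up counts:

```python
def collapse_arm(counts, r):
    """Replace arm r's complete cells N_{r t o} with marginal cells N_{r * o}."""
    out = {cell: n for cell, n in counts.items() if cell[0] != r}
    for o in (0, 1):
        # Sum over treatment received (t = 0, 1) within outcome o.
        out[(r, '*', o)] = counts.get((r, 0, o), 0) + counts.get((r, 1, o), 0)
    return out

# Hypothetical complete-data counts keyed by (r, t, o).
full = {(0, 0, 0): 60, (0, 0, 1): 70, (0, 1, 0): 25, (0, 1, 1): 20,
        (1, 0, 0): 20, (1, 0, 1): 30, (1, 1, 0): 85, (1, 1, 1): 40}
partial = collapse_arm(full, r=1)  # treatment arm loses its compliance detail
```

The total number of subjects is unchanged; only the information on treatment received is discarded in the collapsed arm.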

We used a model selection procedure in the simulation, fitting each dataset with all of the following candidate models: 1) no random effect; 2) random effects only on (δ_in, δ_ia); 3) random effects on (δ_in, δ_ia, δ_is), and 4) random effects on (δ_in, δ_ia, δ_is, δ_iu), which correspond to the models selected in each forward step from the case study model selection procedure (Model I, IIf, IIIe, IVa, and Vc). We counted the frequency of selecting each model using DIC. Note that either π_ic or u_i1 must be generated with a random effect to ensure the missingness probabilities vary across studies. Thus under LI, we generated data with random effects (δ_in, δ_ia), (δ_in, δ_ia, δ_is), and (δ_in, δ_ia, δ_is, δ_iu), and under MNAR we generated data with (δ_in, δ_ia, δ_is, δ_iu).

5.2. Simulation Results

Table 5 summarizes results from the simulation studies regarding θ^CACE, comparing the two approaches, the proposed model (“Model including all studies”) and the naive method (“On studies with complete data”; described in Section 3.2.4), in terms of relative bias (ReBias), mean square error (MSE), 95% credible interval coverage probability (CP), 95% credible interval length (CIL), and relative efficiency (ReEff), defined as MSE from the naive analysis divided by MSE using the proposed model. Under each missingness mechanism considered, we fit the model including the same random effects as in generating the data.

Table 5:

Simulation results: relative bias (ReBias), mean square error (MSE), 95% credible interval coverage probabilities (CP), 95% credible interval length (CIL), and relative efficiency (ReEff) for θ^CACE

Missing Mechanism	Random Effects	Model including all studies				On studies with complete data				ReEff
Missing Mechanism	Random Effects	ReBias	MSE	CP	CIL	ReBias	MSE	CP	CIL	ReEff

MCAR	None	0.003	0.001	0.951	0.106	0.006	0.003	0.957	0.202	3.599
	δ_in, δ_ia	0.017	0.001	0.969	0.133	0.018	0.002	0.960	0.198	2.294
	δ_in, δ_ia, δ_is	−0.013	0.001	0.953	0.134	−0.001	0.003	0.952	0.200	2.432
	δ_in, δ_ia, δ_is, δ_iu	0.010	0.002	0.988	0.240	−0.087	0.007	0.993	0.455	3.022

LI, β₁ = 2	δ_in, δ_ia	0.046	0.001	0.950	0.142	0.008	0.003	0.957	0.219	2.204
	δ_in, δ_ia, δ_is	0.030	0.001	0.963	0.142	0.003	0.003	0.951	0.217	2.627
	δ_in, δ_ia, δ_is, δ_iu	0.065	0.004	0.961	0.247	−0.066	0.008	0.989	0.445	2.321

MNAR, γ₁₁ = −2	δ_in, δ_ia, δ_is, δ_iu	−0.022	0.003	0.978	0.241	−0.319	0.021	0.945	0.507	7.493

ReBias = Bias/True Value

Under the different missingness assumptions, the proposed model provided nearly unbiased estimates for θ^CACE with smaller MSE. Generally, the estimates were slightly more biased when the data were generated under LI or MNAR than under MCAR, or as the number of random effects increased. The coverage probabilities remained close to or above the nominal level 0.95 in all scenarios. The naive method that discards studies with incomplete data also performed reasonably well when data were generated under the MCAR or LI missingness mechanism, with little or no bias, though the proposed method was more efficient with consistently smaller MSE and shorter 95% credible interval length, as it gained efficiency by including information from more studies. However, when data were generated with missingness probabilities that were strongly associated with one component of $θ_{i}^{CACE}$ (the MNAR assumption), the naive approach using only studies with complete compliance data had substantially larger relative bias and MSE. Moreover, the relative efficiency values were greater than two in all scenarios, providing evidence that our proposed model is much more efficient than simply discarding studies without complete compliance data.

Under each missing mechanism and true data-generating model, we fit four candidate models with different numbers of random effects as described in Section 5.1. The Supplementary Materials present additional simulation results with interpretations, including a table summarizing the frequency of selecting each candidate model as the “best” model in each set of simulations (Table W2), and a table of the relative bias, MSE, 95% credible interval coverage probability, and 95% CIL for θ^CACE fitting the four candidate models under each data-generating scenario (Table W3). The results indicate that random effects should be selected carefully to account for potential between-study heterogeneity when estimating CACE in a meta-analysis.

6. Discussion

We proposed an innovative Bayesian hierarchical model to estimate CACE in the meta-analysis of RCTs, accounting for both heterogeneous and incompletely reported noncompliance among studies, and we applied it to a case study of epidural analgesia trials to estimate the average causal effect on the cesarean section after accounting for noncompliance. We also conducted simulation studies to evaluate the performance of our approach under different missingness mechanisms and the impact of misspecification of random effects. To the best of our knowledge, this is the first meta-analysis of RCTs estimating CACE while adjusting for incomplete noncompliance data.

Using the proposed method, all 27 epidural analgesia trials were included in the CACE meta-analysis. Including information from studies with incomplete noncompliance data may introduce additional heterogeneity into the meta-analysis and may affect the overall CACE estimate. Compared to the estimated risk difference of 0.8% (95% CI: −0.3%, 1.9%) given by the two-step ITT meta-analysis, the estimated CACE was 4.1% (95% CrI: −0.3%, 10.5%). Thus we conclude that the potential dilution of the estimated treatment effect by the ITT meta-analysis notwithstanding, epidural analgesia in labor does not affect the risk of cesarean section in a strict causal interpretation. This method allows us simultaneously to account both for the inherent heterogeneity in noncompliance rates between treatment arms and across studies (as shown in Figure 1), and for incomplete noncompliance data in some studies. It provides a feasible way to estimate a clinically meaningful causal effect in meta-analysis by including all studies, which can be applied to different therapeutic areas in RCTs with binary or ordinal outcomes.

The simulations indicated that 1) our approach had a good chance of identifying the correct model, and 2) our proposed model had better efficiency for estimating CACE, with smaller MSE and shorter credible intervals, compared to the model only using trials with complete compliance data. Our simulations under the MNAR assumption did not discard any studies when fitting the proposed models, so we expected they would still give unbiased CACE estimates. The naive approach including only studies with complete compliance data gave biased estimates, which was not surprising because the studies with complete data are no longer representative of all studies under the MNAR missing mechanism.

Besides handling the situation in which some studies in a meta-analysis do not report compliance information, the proposed method can be extended to handle missing outcomes. Missing outcome data in RCTs commonly occur when researchers do not collect follow-up outcomes for some subjects. For example, consider a vaccine trial in which patients randomized to the vaccination group were encouraged to receive a flu shot, but patients themselves decided whether to receive flu shots, and their actual vaccination received was recorded. For the outcome of flu-related hospitalization, missing outcomes could occur if some patients had flu but were treated at hospitals not participating in the study, or if some patients simply had unknown hospitalization status. In this case, we can extend the likelihood in Equation (2) by adding a column “missing” to the right of Table 3, and the corresponding probabilities would be the sum of the probabilities of all cells in that row. This extension could improve the estimation of the probabilities of latent classes. Although outcome data are frequently missing for some patients within a single study, a typical meta-analysis of randomized clinical trials focuses on the treatment’s effect on the outcome variable, so studies reporting only compliance data but not outcome data, if any exist, are generally not included. Nevertheless, the model can be extended to incorporate partial outcome missingness in some studies.

The proposed model can also be extended to incorporate study-level predictors. Depending on whether the study-level covariates are assumed to be associated with the latent compliance class probabilities or the outcome response rates, they can be included in the models for one or more of the transformed study-level parameters π_in, π_ia, s_io, b_io, u_io or v_io. For example, if study-specific mean age in a meta-analysis is believed to be associated with the proportion of never-takers, always-takers, and compliers, one can add the mean age variable x_i to the equations for the transformed π_in, π_ia. Specifically, in Section 3.2.1 we proposed a generalized inverse logistic transformation to guarantee in study i that π_in + π_ia + π_ic = 1 and $0 \leq π_{i n}, π_{i a}, π_{i c} \leq 1$ : $π_{i n} = \frac{\exp (n_{i})}{1 + \exp (n_{i}) + \exp (a_{i})}$ , $π_{i a} = \frac{\exp (a_{i})}{1 + \exp (n_{i}) + \exp (a_{i})}$ . In the case with study-level predictor x_i, we may want to let n_i = α_n + β_n x_i + δ_in and a_i = α_a + β_a x_i + δ_ia. Then the posterior medians and credible intervals of β_n and β_a provide information about the magnitude and significance of the association between study-level mean age and the never-taker, always-taker and complier probabilities. Study-level predictors can also be added to the models for outcome response rates depending on the specific nature of the trials and outcome measure. As suggested by several reviewers, study-level covariates may make the LI+MAR assumption more plausible in practice.

Recently, extensions of models estimating CACE with missing data in a single study have been developed. Specifically, Chen et al. (2009) discussed the identifiability and estimation of CACE under a nonignorable missing mechanism; Peng et al. (2004) proposed an extended general location model to estimate the CACE with missing data in the outcome and in baseline covariates. Estimating CACE with missing data in longitudinal and survival outcomes has also been discussed (Yau and Little, 2001). These methods have been proposed only for the single-study setting; potential extensions for estimating CACE in meta-analysis await further development. Furthermore, because network meta-analysis expands the scope of a conventional pairwise meta-analysis to compare multiple treatments simultaneously by synthesizing both direct and indirect information (Lumley, 2002; Zhang et al., 2014), extending the CACE meta-analysis methods to network meta-analysis is also a promising topic for future research.

Acknowledgments

We are grateful to the Editor, the Associate Editor, and the anonymous reviewers, whose comments greatly improved this article.

This research was supported in part by NIH NLM R21012744 and NLM R01LM012982.

Footnotes

Supplementary Materials

The supplementary materials contain additional results from the case study and simulations. The data and R JAGS code used to produce the results of this paper are available at the GitHub repository https://github.com/JinchengZ/CACEmetaBayes.git.

References

Agresti A (2003). Categorical Data Analysis, volume 482. John Wiley & Sons.
Angrist JD, Imbens GW, and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91, 444–455.
Baker SG (2011). Estimation and inference for the causal effect of receiving treatment on a multinomial outcome: an alternative approach. Biometrics 67, 319–323.
Baker SG (2020). CACE and meta-analysis (letter to the editor). Biometrics 76, 1383–1384.
Baker SG and Kramer BS (2005). Simple maximum likelihood estimates of efficacy in randomized trials and before-and-after studies, with implications for meta-analysis. Statistical Methods in Medical Research 14, 349–367.
Baker SG, Kramer BS, and Lindeman KS (2016). Latent class instrumental variables: a clinical and biostatistical perspective. Statistics in Medicine 35, 147–160.
Baker SG and Lindeman KS (1994). The paired availability design: a proposal for evaluating epidural analgesia during labor. Statistics in Medicine 13, 2269–2278.
Bannister-Tyrrell M, Miladinovic B, Roberts CL, and Ford JB (2015). Adjustment for compliance behavior in trials of epidural analgesia in labor using instrumental variable meta-analysis. Journal of Clinical Epidemiology 68, 525–533.
Chen H, Geng Z, and Zhou X-H (2009). Identifiability and estimation of causal effects in randomized trials with noncompliance and completely nonignorable missing data. Biometrics 65, 675–682.
Cheng J (2009). Estimation and inference for the causal effect of receiving treatment on a multinomial outcome. Biometrics 65, 96–103.
Chu H, Nie L, Chen Y, Huang Y, and Sun W (2012). Bivariate random effects models for meta-analysis of comparative studies with binary outcomes: methods for the absolute risk difference and relative risk. Statistical Methods in Medical Research 21, 621–633.
Egger M, Davey-Smith G, and Altman D (2008). Systematic Reviews in Health Care: Meta-analysis in Context. John Wiley & Sons.
Frangakis CE and Rubin DB (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
Gelfand AE and Smith AFM (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85, 398–409.
Gelman A and Rubin DB (1992). Inference from iterative simulation using multiple sequences. Statistical Science 7, 457–472.
Imbens GW and Rubin DB (1997). Bayesian inference for causal effects in randomized experiments with noncompliance. The Annals of Statistics pages 305–327.
Jackson D, Riley R, and White IR (2011). Multivariate meta-analysis: Potential and promise. Statistics in Medicine 30, 2481–2498.
Jo B, Ginexi EM, and Ialongo NS (2010). Handling missing data in randomized experiments with noncompliance. Prevention Science 11, 384–396.
Lian Q, Hodges JS, and Chu H (2019). A Bayesian hierarchical summary receiver operating characteristic model for network meta-analysis of diagnostic tests. Journal of the American Statistical Association 114, 949–961.
Lin D and Zeng D (2010). On the relative efficiency of using summary statistics versus individual-level data in meta-analysis. Biometrika 97, 321–332.
Little RJ and Rubin DB (2014). Statistical Analysis with Missing Data. John Wiley & Sons.
Louis TA and Zeger SL (2009). Effective communication of standard errors and confidence intervals. Biostatistics 10, 1–2.
Lumley T (2002). Network meta-analysis for indirect treatment comparisons. Statistics in Medicine 21, 2313–2324.
Ma X, Lian Q, Chu H, Ibrahim JG, and Chen Y (2018). A Bayesian hierarchical model for network meta-analysis of multiple diagnostic tests. Biostatistics 19, 87–102.
O’Malley AJ and Normand S-LT (2005). Likelihood methods for treatment noncompliance and subsequent nonresponse in randomized trials. Biometrics 61, 325–334.
Peng Y, Little RJ, and Raghunathan TE (2004). An extended general location model for causal inferences from data subject to noncompliance and missing values. Biometrics 60, 598–607.
Plummer M (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd International Workshop on Distributed Statistical Computing, volume 124, page 125. Vienna, Austria.
Riley RD, Jackson D, Salanti G, Burke DL, Price M, Kirkham J, and White IR (2017). Multivariate and network meta-analysis of multiple outcomes and multiple treatments: rationale, concepts, and examples. BMJ 358, j3932.
Rubin DB (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association 75, 591–593.
Spiegelhalter DJ, Best NG, Carlin BP, and Van Der Linde A (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64, 583–639.
Stuart EA, Perry DF, Le H-N, and Ialongo NS (2008). Estimating intervention effects of prevention programs: Accounting for noncompliance. Prevention Science 9, 288–298.
Ueberhuber CW (1997). Numerical Computation 1: Methods, Software, and Analysis, volume 16. Springer Science & Business Media.
Yau LHY and Little RJ (2001). Inference for the complier-average causal effect from longitudinal data subject to noncompliance and missing data, with application to a job training assessment for the unemployed. Journal of the American Statistical Association 96, 1232–1244.
Ye C, Beyene J, Browne G, and Thabane L (2014). Estimating treatment effects in randomised controlled trials with non-compliance: a simulation study. BMJ Open 4, e005362.
Zeger SL, Liang K-Y, and Albert PS (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics pages 1049–1060.
Zhang J, Carlin BP, Neaton JD, Soon GG, Nie L, Kane R, Virnig BA, and Chu H (2014). Network meta-analysis of randomized clinical trials: reporting the proper summaries. Clinical Trials 11, 246–262.
Zhang J, Chu H, Hong H, Virnig BA, and Carlin BP (2017). Bayesian hierarchical models for network meta-analysis incorporating nonignorable missingness. Statistical Methods in Medical Research 26, 2227–2243.
Zhou J, Hodges JS, and Chu H (2020). Rejoinder to “CACE and meta-analysis (letter to the editor)” by Stuart Baker. Biometrics 76, 1385–1389.
Zhou J, Hodges JS, Suri MFK, and Chu H (2019). A Bayesian hierarchical model estimating CACE in meta-analysis of randomized clinical trials with noncompliance. Biometrics 75, 978–987.

