Comparing competing risk outcomes within principal strata, with application to studies of mother-to-child transmission of HIV

Dustin M Long; Michael G Hudgens

doi:10.1002/sim.5583

Author manuscript; available in PMC: 2013 Nov 30.

Published in final edited form as: Stat Med. 2012 Aug 28;31(27):3406–3418. doi: 10.1002/sim.5583

PMCID: PMC3494821 NIHMSID: NIHMS403096 PMID: 22927321

Abstract

In randomized trials to prevent breast milk transmission of human immunodeficiency virus (HIV) from mother to infant, investigators are often interested in assessing the effect of a treatment or intervention on the cumulative risk of HIV infection by time (age) t in infants who are alive and uninfected at a certain time point τ₀ < t. Such comparisons are challenging for two reasons. First, infants are typically randomized at birth (time 0 < τ₀) such that comparisons between trial arms among the subset of infants alive and uninfected at τ₀ are subject to selection bias. Second, in most mother-to-child transmission (MTCT) trials competing risks are often present, such as death or cessation of breastfeeding prior to HIV infection. In this paper we present methods for assessing the causal effect of a treatment on competing risk outcomes within principal strata. In MTCT trials, the causal effect of interest is that of treatment on the risk of HIV infection by time t > τ₀ within the principal stratum of infants who would be alive and uninfected by τ₀ regardless of randomization assignment. Large sample non-parametric bounds and a semi-parametric sensitivity analysis model are developed for drawing inference about this causal effect. A simulation study is presented demonstrating that the proposed methods perform well in finite samples. The proposed methods are applied to a large, recent MTCT trial.

Keywords: Causal inference, Infectious diseases, Principal stratification, Sensitivity analysis

1. Introduction

Every year approximately 200,000 infants become infected with HIV through breastfeeding; in the absence of treatment, half of these infants will die within two years of birth [1, 2]. In clinical trials to prevent MTCT of HIV through breast milk, investigators are often interested in comparing interventions conditional on the infant being alive and uninfected up to a certain time point during the trial [3–6]. Specifically, when randomization occurs at birth (time 0), a time point τ₀ > 0 is often chosen prior to the beginning of the trial and only randomized infants alive and uninfected at τ₀ are considered for analysis. For example, in the Breastfeeding, Antiretroviral, and Nutrition (BAN) study [3, 4] infants were randomized at birth but the primary analysis included only infants HIV uninfected and alive at τ₀ = 2 weeks. Infants infected prior to 2 weeks were excluded because these transmissions likely occurred in utero or during labor and delivery, whereas the primary objective of the trial was to assess the effects of interventions to prevent infection due to breast milk. Similar exclusions were made in the primary analysis of the SWEN and PEPI trials [6, 7].

There are two aspects of the analysis described above that are the focus of this paper. First, an analysis comparing risk of HIV infection between trial arms among infants who are alive and uninfected at time τ₀ after randomization is subject to selection bias. One method to protect against selection bias in this scenario entails principal stratification [8]. Principal stratification uses the potential outcomes of a variable collected post-randomization to define strata of individuals. In the MTCT trial setting, the principal stratum of interest is infants who would be alive and uninfected by time τ₀ under either treatment assignment. Because principal stratum membership is not affected by treatment assignment, comparisons between trial arms within a particular principal stratum are not subject to selection bias. For a recent discussion of the strengths and weaknesses of principal stratification, see Pearl [9] and subsequent responses such as VanderWeele [10].

The second aspect in the analysis of the effect of treatment on the risk of HIV infection in MTCT trials is the presence of competing risks [11]. In particular, death or weaning prior to HIV transmission are competing risks for HIV infection since these events (death, weaning) can preclude HIV infection from occurring. Likewise, HIV infection precludes the possibility of an HIV-free death or weaning prior to HIV infection. One analytical approach that avoids the complication of competing risks is to use a composite endpoint, such as time until HIV infection or death. Using a composite endpoint simplifies analysis and has the advantage of providing a single measure of the overall effect of treatment. However, such an analysis does not provide inference about whether the treatment is having an effect on the risk of HIV infection, death, or both endpoints. Another common approach in the analysis of MTCT trials is to treat infants experiencing HIV-free death as right censored, e.g., when computing the Kaplan-Meier estimator of the cumulative probability of HIV infection (for instance, see Figure 2a of Kumwenda et al. [6]). It is well known that computing the Kaplan-Meier estimator by right censoring competing events does not in general yield a consistent estimator of the cumulative risk of the event of interest [12, 13]; in the MTCT setting such Kaplan-Meier estimators will tend to overestimate the risk of HIV infection when there is a non-zero probability of death prior to HIV infection. A third approach, adopted in this paper, is to estimate the cumulative incidence functions of each competing event, namely HIV infection, death, and weaning. The resulting estimates have a straightforward interpretation as the cumulative risk of each event in settings such as the trial where the other events may occur. Contrasts in the estimated risks between trial arms can then be used to assess treatment effects on each of the competing events.

Previous work on estimating treatment effects within principal strata has considered binary outcomes (e.g., Hudgens and Halloran [14]), continuous outcomes (e.g., Gilbert et al. [15]) and survival outcomes (e.g., Hayden et al. [16] and Shepherd et al. [17]). In this paper we develop methods for estimating treatment effects within principal strata for a survival outcome in the presence of competing risks. In the absence of competing risks the developed methods essentially reduce to those of Shepherd et al. [17]. The outline of the remainder of the paper is as follows. In Section 2 notation and assumptions are discussed. In Section 3 inferential methods for the causal effect of interest are presented. The finite sample performance of the methods is assessed in a simulation study in Section 4. These simulations also illustrate how misleading inferences can arise if selection bias is ignored. In Section 5 the methods are applied to investigate the effect of infant antiretroviral therapy (ART) on the cumulative risk of HIV infection in the BAN trial. A brief discussion is given in Section 6.

2. Notation and Assumptions

Suppose n individuals are randomly assigned one of two treatments, 0 or 1, at baseline (birth or time 0). For i = 1, …, n, let Z_i = 0 if subject i is assigned treatment 0 and Z_i = 1 otherwise. Let n₀ = Σ (1 − Z_i) and n₁ = Σ Z_i, where here and throughout $\sum = \sum_{i = 1}^{n}$ . Without loss of generality, assume Z_i = 0 corresponds to placebo or control, and Z_i = 1 corresponds to active treatment. In the BAN study analysis, Z_i = 1 will refer to the infant ART arm and Z_i = 0 will refer to the control arm. Suppose the primary objective is to assess the effect of treatment on the time T_i (from baseline) until some particular event occurs. Assume there are k possible causes or types of events and let J_i denote the event type for individual i with J_i ∈ {1, …, k}. In the BAN study there are k = 3 competing risks: HIV infection (J_i = 1), death prior to HIV infection or weaning (J_i = 2), or cessation of breastfeeding prior to HIV infection (J_i = 3).

Suppose in the analysis of the effect of treatment Z_i on (T_i, J_i) we would like to condition on some binary post-randomization variable S_i (taking on values 0 or 1) measured at some pre-specified post-randomization time τ₀ > 0. For instance, in the analysis of BAN it is desired to assess the effect of treatment in infants alive and uninfected at time τ₀; in this case we let S_i = 1 if an infant becomes infected or dies by τ₀ and S_i = 0 otherwise. Note for the BAN example that S_i = I(T_i ≤ τ₀, J_i ≤ 2), where I(·) is the usual indicator function; however, in the methods developed below S_i need not be defined in terms of T_i or J_i.

Define C_i to be a potential right censoring time and assume τ₀ ≤ C_i, i.e., no individuals drop out of the study prior to τ₀ such that S_i is always observed. Let τ₁ denote the maximum length of follow-up for the study such that any individual who has not had an event or dropped out of the study by time τ₁ is administratively censored at that time, i.e., C_i ≤ τ₁. Let Y_i = min{T_i, C_i} and Δ_i = I(Y_i = T_i). Due to censoring, instead of (T_i, C_i, J_i) we only observe (Y_i, J_iΔ_i); i.e., T_i and J_i are observed if and only if individual i is not right censored.

Let T_i(z) be the potential survival time when assigned treatment z for z = 0,1 such that T_i = (1 − Z_i)T_i(0) + Z_iT_i(1). Define C_i(z), S_i(z), and J_i(z) similarly. Assume the treatment assignment of individual i does not affect the potential outcomes of other individuals (i.e., there is no interference) and there are not multiple forms of treatment, i.e., the stable unit treatment value assumption (SUTVA) holds [18]. Let W_i = (S_i(0), S_i(1), T_i(0), T_i(1), J_i(0), J_i(1), C_i(0), C_i(1)) denote the vector of potential outcomes and O_i = (Z_i, S_i, Y_i, J_iΔ_i) denote the vector of observable random variables. Assume individuals in the study are a random sample from a larger population such that W₁, …, W_n and O₁, …, O_n are iid copies of W and O respectively.

Principal strata can be defined by sets of individuals with the same potential outcome pair (S_i(0) = s₀, S_i(1) = s₁). Define the never infected (NI) principal stratum to be individuals with S_i(0) = S_i(1) = 0, i.e., individuals who would be alive and uninfected at τ₀ regardless of treatment assignment. Similarly define the harmed stratum as those individuals with S_i(0) = 0, S_i(1) = 1; the protected stratum as those individuals with S_i(0) = 1, S_i(1) = 0; and the doomed stratum as those individuals with S_i(0) = S_i(1) = 1. Motivated by MTCT studies of HIV, we focus on drawing inference about causal effects in the NI principal stratum. For example, in the BAN study we are interested in the principal stratum of infants who would be alive and not infected with HIV by τ₀ = 2 weeks under either randomization assignment.

In the presence of competing risks, a quantity of interest is the cumulative incidence function (CIF) or subdistribution function of (T, J). Let F(t, j) = P(T ≤ t, J = j) denote the CIF, i.e., the probability of having event j at or before time t. Define the causal estimand of interest to be $C E (t, j) = F_{1}^{N I} (t, j) - F_{0}^{N I} (t, j)$ for t ∈ [τ₀, τ₁] where $F_{z}^{N I} (t, j) = Pr [T_{i} (z) \leq t, J_{i} (z) = j ∣ S_{i} (0) = S_{i} (1) = 0]$ for z = 0, 1. In words, CE(t, j) is the difference in the probability of having an event of type j by time t for treatment 1 compared to treatment 0 within the NI principal stratum. For example, in the BAN study (where j = 1 corresponds to HIV infection), CE(28, 1) is the difference in the probability of HIV infection by 28 weeks between the two study arms among infants who would be alive and HIV negative at τ₀ = 2 weeks regardless of treatment assignment. In the analysis of BAN, CE(28, 1) was of particular interest because per protocol a primary endpoint of the trial was HIV infection by 28 weeks [3].

To draw inference about CE(t, j) we make the following assumptions:

Assumption 2.1
Independent treatment assignment: Z_i ⊥ W_i
Assumption 2.2
Monotonicity: S_i(1) ≤ S_i(0) for all i
Assumption 2.3
Independent censoring: C_i(z) ⊥ {T_i(z), J_i(z), S_i(z)} for z = 0, 1

Assumption 2.1 is plausible in randomized clinical trials. Assumption 2.2 is a strong assumption that must be considered carefully and is discussed further in Section 5 in the context of the BAN study. Methods not requiring the monotonicity assumption are discussed in Section 6. Assumption 2.3 is a common assumption when analyzing competing risks data. In the infant ART and control arms of BAN, 15% of participants were administratively censored at τ₁ = 28 weeks and 12% were censored at earlier time points due to drop-out from the study prior to week 28.

Under Assumptions 2.1 and 2.2, Z_i = 0 and S_i = 0 imply S_i(0) = S_i(1) = 0; i.e., individuals who are alive and uninfected by τ₀ when assigned control must be members of the NI principal stratum. Letting F₀(t, j) = Pr[T_i(0) ≤ t, J_i(0) = j|S_i(0) = 0], it follows under Assumptions 2.1 – 2.2 that $F_{0}^{N I} (t, j) = F_{0} (t, j)$ , which is identifiable from the observable data under Assumption 2.3. However $F_{1}^{N I} (t, j)$ remains unidentifiable under Assumptions 2.1 – 2.3 because individuals who are alive and uninfected by τ₀ when assigned treatment (Z_i = 1) are a mixture of individuals from the NI and protected principal strata. In particular, following Gilbert et al. [15], one can show

$F_{1}(t, j) = γ F_{1}^{NI}(t, j) + (1 - γ) F_{1}^{prot}(t, j),$ (1)

where γ = Pr[S_i(0) = 0|S_i(1) = 0] is the probability an individual is uninfected under control given they would be uninfected under treatment, F₁(t, j) = Pr[T_i(1) ≤ t, J_i(1) = j|S_i(1) = 0] and $F_{1}^{prot} (t, j) = Pr [T_{i} (1) \leq t, J_{i} (1) = j ∣ S_{i} (0) = 1, S_{i} (1) = 0]$ .

To proceed, one can introduce an additional assumption about the selective effect of conditioning on S_i which renders $F_{1}^{N I} (t, j)$ identifiable. For example, following Hudgens and Halloran [14], large-sample upper and lower bounds can be obtained by considering extreme selection bias models. The upper bound selection model is given by assuming either $F_{1}^{prot} (t, j) = 0$ or $F_{1}^{N I} (t, j) = 1$ , while the lower bound selection model is given by assuming either $F_{1}^{prot} (t, j) = 1$ or $F_{1}^{N I} (t, j) = 0$ . By (1), these models are equivalent to assuming either

$F_{1}^{NI}(t, j) = \min\{γ^{-1} F_{1}(t, j), 1\}$ (2)

or

$F_{1}^{NI}(t, j) = \max\left\{\frac{F_{1}(t, j) - (1 - γ)}{γ}, 0\right\}.$ (3)

Estimating CE(t, j) under (2) or (3) is useful in bounding the estimate of the causal effect above and beyond any possible selective effects induced by conditioning on S_i = 0.
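As a small numerical illustration of (2) and (3), the following Python sketch evaluates both bounds for given values of F₁(t, j) and γ. The function name `ni_bounds` is a hypothetical helper introduced here, and the inputs (0.70 and 0.75) are borrowed from the second simulation scenario purely as example values.

```python
def ni_bounds(f1, gamma):
    """Bounds on F_1^{NI}(t, j) implied by equations (2) and (3).

    f1    : F_1(t, j), the CIF among treated individuals with S_i = 0
    gamma : Pr[S_i(0) = 0 | S_i(1) = 0], assumed to lie in (0, 1]
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    upper = min(f1 / gamma, 1.0)                    # equation (2)
    lower = max((f1 - (1.0 - gamma)) / gamma, 0.0)  # equation (3)
    return lower, upper


lower, upper = ni_bounds(f1=0.70, gamma=0.75)
print(round(lower, 4), round(upper, 4))
```

When γ = 1 the two bounds coincide with F₁(t, j), reflecting that there is then no protected stratum and hence no selection bias.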

The true degree of selection bias may be considerably less than that assumed by (2) or (3). Therefore, we consider a class of selection models that includes the extreme models above as special cases. Through sensitivity analysis over the entire class (as in Robins et al. [19] and Gilbert et al. [15]), the relationship between the assumed degree of selection bias and inference about CE(t, j) can be explored. These selection models are semiparametric in the sense that no additional restrictions are placed on the distribution of the observable random variables O₁, …, O_n but an unidentifiable parameter (β_j in the model below) is used to quantify the selection bias. One possible selection model is:

Assumption 2.4

$\exp(β_{j}) = \frac{F_{1}^{NI}(t, j) / \{1 - F_{1}^{NI}(t, j)\}}{F_{1}^{prot}(t, j) / \{1 - F_{1}^{prot}(t, j)\}}.$ (4)

The parameter β_j equals the log odds ratio of having an event of type j by time t under treatment assignment z = 1 in the NI principal stratum versus the protected principal stratum. Note Assumption 2.4 allows for the log odds ratio to differ across event types as indicated by the subscript on β. Also note (4) is unverifiable since β_j is not identifiable from the observable data. For fixed β_j, under Assumptions 2.1 – 2.4 $F_{1}^{N I} (t, j; β_{j})$ is identifiable from the observable data and CE(t, j) can be estimated as described in Section 3 below. The extreme models (2) and (3) can be viewed as special cases of Assumption 2.4 as β_j → ∞ and β_j → −∞, respectively. We refer to β_j = 0 as the no selection bias model because in this case the odds of having an event of type j by time t are the same in the NI and protected principal strata. Sensitivity analysis of inference about CE(t, j) can be conducted by letting β_j range from −∞ to ∞. Gains in power or precision may be achieved by restricting the range of β_j based on prior information about β_j elicited from subject matter experts [20, 21].

3. Inference

In this section we first consider nonparametric estimation of CE(t, j) under the extreme selection models (2) and (3). Then inference for CE(t, j) under the semiparametric selection model (4) given some value of β_j is discussed in Section 3.2. The construction of uncertainty intervals about CE(t, j) is considered in Section 3.3.

3.1. Nonparametric Estimation: Bounds

Under Assumptions 2.1 – 2.3 consistent estimators of $F_{1}^{N I} (t, j)$ assuming (2) or (3) are given, respectively, by

$\hat{F}_{1}^{NI,up}(t, j) = \min\{\hat{γ}^{-1} \hat{F}_{1}(t, j), 1\}$ and $\hat{F}_{1}^{NI,low}(t, j) = \max\left\{\frac{\hat{F}_{1}(t, j) - (1 - \hat{γ})}{\hat{γ}}, 0\right\},$ (5)

where

$\hat{γ} = \min\left\{\frac{\sum (1 - S_{i})(1 - Z_{i}) / n_{0}}{\sum (1 - S_{i}) Z_{i} / n_{1}}, 1\right\},$

and F̂₁(t, j) is the Aalen-Johansen estimator [22] of F₁(t, j) calculated using (Y_i, J_iΔ_i) for individuals with Z_i = 1 and S_i = 0. It can be shown that γ̂ and F̂₁(t, j) are nonparametric maximum likelihood estimators (NPMLEs) of γ and F₁(t, j). Thus the estimators in (5) can be viewed as NPMLEs of $F_{1}^{N I} (t, j)$ . Because Assumptions 2.1 and 2.2 imply $F_{0}^{N I} (t, j) = F_{0} (t, j)$ , consistent estimators of CE(t, j) assuming either (2) or (3) are ${\hat{C E}}^{u p} (t, j) = {\hat{F}}_{1}^{N I, u p} (t, j) - {\hat{F}}_{0} (t, j)$ or ${\hat{C E}}^{low} (t, j) = {\hat{F}}_{1}^{N I, low} (t, j) - {\hat{F}}_{0} (t, j)$ , where F̂₀(t, j) is the Aalen-Johansen estimator of F₀(t, j) calculated using (Y_i, J_i Δ_i) for individuals with Z_i = S_i = 0. In the nomenclature of Vansteelandt et al. [23], the interval [ ${\hat{C E}}^{low} (t, j), {\hat{C E}}^{u p} (t, j)$ ] is an estimated ignorance region of CE(t, j).
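To make the estimator F̂_z(t, j) concrete, here is a minimal pure-Python sketch of an Aalen-Johansen estimate of a cumulative incidence function. It is an illustration under stated assumptions and not the authors' software: the function name `aalen_johansen` is hypothetical, right censoring is coded as cause 0 (mirroring J_iΔ_i = 0), and the input is assumed to be already restricted to individuals with the relevant Z_i and S_i = 0.

```python
def aalen_johansen(y, cause, t, j):
    """Aalen-Johansen estimate of F(t, j) = Pr[T <= t, J = j].

    y     : observed times Y_i = min(T_i, C_i)
    cause : J_i * Delta_i, i.e. the event type, or 0 if right censored
    """
    # Distinct times at or before t at which an event of any type occurs
    event_times = sorted({yi for yi, ci in zip(y, cause) if ci > 0 and yi <= t})
    surv = 1.0  # all-cause event-free survival just before the current time
    cif = 0.0
    for s in event_times:
        at_risk = sum(1 for yi in y if yi >= s)
        d_j = sum(1 for yi, ci in zip(y, cause) if yi == s and ci == j)
        d_all = sum(1 for yi, ci in zip(y, cause) if yi == s and ci > 0)
        cif += surv * d_j / at_risk      # mass assigned to cause j at time s
        surv *= 1.0 - d_all / at_risk    # update overall survival
    return cif


# Toy data (made up): events of type 1 at times 3 and 8, censoring at 5,
# and an event of type 2 at time 12.
y = [3.0, 5.0, 8.0, 12.0]
cause = [1, 0, 1, 2]
print(aalen_johansen(y, cause, t=10.0, j=1))
```

With no censoring the estimate reduces to the empirical proportion of individuals with an event of type j by time t, which gives a quick sanity check.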

If 0 < γ < 1, then γ̂ is asymptotically normal. The Aalen-Johansen estimators F̂_z(t, j), for z = 0, 1, are asymptotically normal assuming 0 < F_z(t, j) < 1 and certain regularity conditions [24]. Therefore, ${\hat{F}}_{1}^{N I, u p} (t, j)$ is asymptotically normal if, in addition to these conditions,

$F_{1}(t, j) < γ.$ (6)

If (6) does not hold, then ${\hat{F}}_{1}^{N I, u p} (t, j) \overset{p}{\to} 1$ and hence is not asymptotically normal. Under conditions where ${\hat{F}}_{1}^{N I, u p} (t, j)$ is asymptotically normal, a consistent estimator of the variance of ${\hat{F}}_{1}^{N I, u p} (t, j)$ is

$\hat{var}\{\hat{F}_{1}^{NI,up}(t, j)\} = \frac{\hat{var}\{\hat{F}_{1}(t, j)\}}{\hat{γ}^{2}} + \left\{\frac{\hat{F}_{1}(t, j)}{\hat{γ}}\right\}^{2} \left(\frac{1}{N_{0}} - \frac{1}{n_{0}} + \frac{1}{N_{1}} - \frac{1}{n_{1}}\right),$ (7)

where $\hat{var} {{\hat{F}}_{1} (t, j)}$ is a consistent estimator of the variance of F̂₁(t, j) (e.g., see Aalen et al. [24], Section 3.4.5) and N_z = Σ I(S_i = 0, Z_i = z). Similarly ${\hat{F}}_{1}^{N I, low} (t, j)$ is asymptotically normal if, in addition to the conditions above,

$1 - γ < F_{1}(t, j).$ (8)

If (8) does not hold, ${\hat{F}}_{1}^{N I, low} (t, j) \overset{p}{\to} 0$ and hence is not asymptotically normal. If ${\hat{F}}_{1}^{N I, low} (t, j)$ is asymptotically normal, the variance can be consistently estimated by

$\hat{var}\{\hat{F}_{1}^{NI,low}(t, j)\} = \frac{\hat{var}\{\hat{F}_{1}(t, j)\}}{\hat{γ}^{2}} + \left\{\frac{1 - \hat{F}_{1}(t, j)}{\hat{γ}}\right\}^{2} \left(\frac{1}{N_{0}} - \frac{1}{n_{0}} + \frac{1}{N_{1}} - \frac{1}{n_{1}}\right).$ (9)

Derivations of (7) and (9) are given in the appendix. When (6) and (8) hold, pointwise Wald-type confidence intervals for CE(t, j) can be constructed in the usual manner. Alternatively, the bootstrap percentile method can be used for computing confidence intervals of CE(t, j). If (6) and (8) do not hold, then ${\hat{F}}_{1}^{N I, u p} (t, j) \overset{p}{\to} 1$ and ${\hat{F}}_{1}^{N I, low} (t, j) \overset{p}{\to} 0$ , i.e., the bounds are non-informative. Note that conditions (6) and (8) can be assessed based on observed data by comparing γ̂ and F̂₁(t, j).
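The variance formulas (7) and (9) and the resulting Wald-type intervals can be sketched in a few lines of Python. This is an illustrative sketch only: `bound_variances` and `wald_interval` are hypothetical helper names, the value supplied for the variance of F̂₁(t, j) is made up, and the check at the top simply enforces conditions (6) and (8) using the plug-in values.

```python
import math


def bound_variances(f1, var_f1, gamma, N0, n0, N1, n1):
    """Plug-in variance estimates for the lower and upper bound estimators.

    f1, var_f1 : Aalen-Johansen estimate of F_1(t, j) and its variance estimate
    gamma      : estimate of gamma
    N0, N1     : numbers with S_i = 0 in arms 0 and 1; n0, n1 : arm sizes
    """
    if not (f1 < gamma and 1.0 - gamma < f1):
        raise ValueError("conditions (6) and (8) must both hold")
    # Term shared by (7) and (9), arising from the variability of gamma-hat
    sel = 1.0 / N0 - 1.0 / n0 + 1.0 / N1 - 1.0 / n1
    var_up = var_f1 / gamma**2 + (f1 / gamma) ** 2 * sel            # equation (7)
    var_low = var_f1 / gamma**2 + ((1.0 - f1) / gamma) ** 2 * sel   # equation (9)
    return var_low, var_up


def wald_interval(estimate, variance, z=1.96):
    """Two-sided Wald interval; z = 1.96 gives nominal 95% coverage."""
    half = z * math.sqrt(variance)
    return estimate - half, estimate + half


# Illustrative inputs only; var_f1 is a made-up value.
var_low, var_up = bound_variances(f1=0.02, var_f1=2.5e-5, gamma=0.9884,
                                  N0=630, n0=668, N1=813, n1=852)
print(wald_interval(0.02 / 0.9884, var_up))
```

For an interval on CE(t, j) itself, the variance of F̂₀(t, j) would be added to the relevant bound variance before taking the square root, since the two arms are independent.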

3.2. Semiparametric Estimation

Under Assumptions 2.1 – 2.4, for fixed β_j a semiparametric estimator of $F_{1}^{N I} (t, j)$ can be constructed by plugging F̂₁(t, j) and γ̂ into equation (1) and then simultaneously solving (1) and (4) for $F_{1}^{N I} (t, j)$ . This can be accomplished by expressing $F_{1}^{prot} (t, j)$ as a function of β_j and $F_{1}^{N I} (t, j)$ using (4), replacing $F_{1}^{prot} (t, j)$ by this expression in (1), and finding the solution to (1) using a one-dimensional line search. Define the solution as ${\hat{F}}_{1}^{N I} (t, j; β_{j})$ and let the corresponding estimator of the causal effect be $\hat{C E} (t, j; β_{j}) = {\hat{F}}_{1}^{N I} (t, j; β_{j}) - {\hat{F}}_{0} (t, j)$ . Without a closed form for ${\hat{F}}_{1}^{N I} (t, j; β_{j})$ , confidence intervals of $F_{1}^{N I} (t, j)$ and CE(t, j) for an assumed value of β_j can be constructed using the bootstrap percentile method; alternatively, Wald-type confidence intervals can be constructed based on bootstrap estimates of $var {{\hat{F}}_{1}^{N I} (t, j; β_{j})}$ and $var {\hat{C E} (t, j; β_{j})}$ .
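The one-dimensional search described above can be sketched as follows, assuming a finite, moderate value of β_j (very large |β_j| would need the limiting formulas (2) and (3) instead). The helper names `f_prot_from_ni` and `solve_f_ni` are hypothetical, and bisection is used here as one simple choice of line search; the paper does not specify which search the authors used.

```python
import math


def f_prot_from_ni(x, beta):
    """Solve (4) for F_1^{prot} given F_1^{NI} = x and log odds ratio beta."""
    num = x * math.exp(-beta)
    return num / ((1.0 - x) + num)


def solve_f_ni(f1, gamma, beta, tol=1e-12):
    """Find x = F_1^{NI}(t, j; beta) satisfying the mixture equation (1).

    g(x) = gamma * x + (1 - gamma) * F^{prot}(x) - f1 is increasing in x,
    with g(0) = -f1 <= 0 and g(1) = 1 - f1 >= 0, so bisection on [0, 1] works.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g = gamma * mid + (1.0 - gamma) * f_prot_from_ni(mid, beta) - f1
        if g < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Example values only: f1 = 0.70 and gamma = 0.75
for beta in (-1.0, 0.0, 1.0):
    print(beta, round(solve_f_ni(0.70, 0.75, beta), 4))
```

At β_j = 0 the solution equals F̂₁(t, j), and for large positive or negative β_j it approaches the upper and lower bounds in (5), which offers a convenient check on any implementation.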

Note ${lim}_{β_{j} \to \infty} \hat{C E} (t, j; β_{j}) = {\hat{C E}}^{u p} (t, j)$ and ${lim}_{β_{j} \to - \infty} \hat{C E} (t, j; β_{j}) = {\hat{C E}}^{low} (t, j)$ , i.e., the estimators that arise from the extreme selection models (2) and (3) are special cases of the estimators from the semiparametric bias model (4). Under the no selection model β_j = 0, $\hat{C E} (t, j; β_{j}) = {\hat{F}}_{1} (t, j) - {\hat{F}}_{0} (t, j)$ , i.e., the causal effect is estimated by the difference in Aalen-Johansen estimators from the two treatment groups as in a standard competing risks analysis. In other words, assuming the no selection model gives rise to a naive or “net” estimator [8] which simply compares subsets of the two randomization groups conditional on being observed HIV free and alive at τ₀.

3.3. Uncertainty Regions

The pointwise confidence intervals described in Sections 3.1 and 3.2 will contain CE(t, j) with the stated coverage probability provided the correct value of β_j is assumed. However, the true value of β_j is not identifiable from the observed data. Therefore, following Vansteelandt et al. [23], it is useful to also construct a (1 − α)100% uncertainty interval which contains CE(t, j) with probability 1 − α without conditioning on any assumption about the value of β_j. Under the assumptions given in Section 3.1 where ${\hat{C E}}^{u p} (t, j)$ and ${\hat{C E}}^{low} (t, j)$ are consistent and asymptotically normal, a large sample (1 − α)100% pointwise uncertainty interval for CE(t, j) is given by

$\left[\hat{CE}^{low}(t, j) - c_{α/2}^{*}\, \hat{var}\{\hat{CE}^{low}(t, j)\}^{1/2},\ \hat{CE}^{up}(t, j) + c_{α/2}^{*}\, \hat{var}\{\hat{CE}^{up}(t, j)\}^{1/2}\right]$

where $c_{α / 2}^{*}$ can be computed using equation (4.3) of Vansteelandt et al. [23], $\hat{var}\{\hat{CE}^{low}(t, j)\} = \hat{var}\{\hat{F}_{1}^{NI,low}(t, j)\} + \hat{var}\{\hat{F}_{0}(t, j)\}$ and $\hat{var}\{\hat{CE}^{up}(t, j)\} = \hat{var}\{\hat{F}_{1}^{NI,up}(t, j)\} + \hat{var}\{\hat{F}_{0}(t, j)\}$.

4. Simulation Study

Simulations were conducted to evaluate the performance of the methods described in Section 3 for drawing inference about CE(t, j). Data were simulated based on the BAN study under five models: β_j = −∞, −1, 0, 1, ∞ for fixed j. These five choices of β_j correspond to the two extreme selection models (β_j = −∞, ∞), two intermediate selection models (β_j = −1, 1), and the no selection bias model (β_j = 0). The Gompertz distribution was used to simulate competing risks data [25]. Under the Gompertz distribution the CIF can be expressed as F(t, j) = 1 − exp [λ_j{1 − exp (α_jt)}/α_j] where {α₁, …, α_k, λ₁, …, λ_k} are chosen such that $\sum_{j = 1}^{k} Pr [J = j] = \sum_{j = 1}^{k} F (\infty, j) = 1$ . For the simulation study k = 3 and the parameters {α₁, α₂, α₃, λ₁, λ₂, λ₃} were selected such that F₁(28, 1) = 0.02, F₁(28, 2) = 0.02, F₁(28, 3) = 0.70, and $\sum_{j = 1}^{3} F_{1} (\infty, j) = 1$ . These probabilities correspond roughly to the estimated risk of HIV infection (j = 1), death (j = 2) prior to HIV infection or weaning, and cessation of breastfeeding prior to HIV infection (j = 3) at 28 weeks in the BAN study among infants randomized to the infant ART arm who were HIV negative and alive at 2 weeks.
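To illustrate how Gompertz parameters can be matched to target risks, the sketch below calibrates (α_j, λ_j) so that the CIF hits a given value at 28 weeks and a given limit as t → ∞. The values of F₁(28, j) are those stated above, but the limiting probabilities Pr[J = j] = (0.03, 0.03, 0.94) are hypothetical, since the text does not report them; the helper `calibrate` is likewise an assumption of this sketch and not the authors' procedure.

```python
import math


def gompertz_cif(t, alpha, lam):
    """F(t, j) = 1 - exp[lam * {1 - exp(alpha * t)} / alpha], with alpha < 0."""
    return 1.0 - math.exp(lam * (1.0 - math.exp(alpha * t)) / alpha)


def calibrate(f_t, p_inf, t=28.0):
    """Return (alpha, lam) such that F(t, j) = f_t and F(inf, j) = p_inf.

    Requires 0 < f_t < p_inf < 1. Uses F(inf, j) = 1 - exp(lam / alpha).
    """
    ratio = math.log(1.0 - f_t) / math.log(1.0 - p_inf)
    alpha = math.log(1.0 - ratio) / t        # negative
    lam = alpha * math.log(1.0 - p_inf)      # positive
    return alpha, lam


# (F_1(28, j), HYPOTHETICAL limit Pr[J = j]) for j = 1, 2, 3
targets = {1: (0.02, 0.03), 2: (0.02, 0.03), 3: (0.70, 0.94)}
params = {j: calibrate(f, p) for j, (f, p) in targets.items()}
for j, (alpha, lam) in params.items():
    print(j, round(gompertz_cif(28.0, alpha, lam), 4))
```

Because α_j < 0, each CIF levels off below 1, and the constraint that the limits sum to 1 is what makes the three subdistributions a proper joint distribution for (T, J).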

Simulations were conducted under two scenarios (for each of the five models). For the first scenario we let γ = 0.9884, corresponding to the estimated value of γ from the BAN study. In this scenario we considered estimating CE(28, 1), i.e., the effect of treatment on the risk of HIV infection at 28 weeks. Note (6) and (8) hold in this scenario for t = 28 and j = 1 such that the estimators of the bounds are asymptotically normal. Because γ = 0.9884 is near the boundary value of 1, for the second scenario we let γ = 0.75. In order for (6) and (8) to hold in the second scenario, we considered estimating CE(28, 3), i.e., the effect of treatment on weaning at 28 weeks. For the first scenario simulations were conducted under the alternative hypothesis CE(28, 1) = −0.05, i.e., the risk of HIV infection is lowered by 5 percentage points due to treatment. For the second scenario simulations were conducted where CE(28, 3) = 0.05, i.e., weaning by 28 weeks is 5 percentage points more likely when the infant receives ART. For each model and each scenario, data sets of n = 1520 iid copies of W were simulated according to the following steps. The description below is for the first scenario where j = 1 is the event of interest; simulations were conducted analogously for the second scenario where j = 3 is the event of interest.

Step 1
S_i (1) was drawn from a Bernoulli(0.0458), where 0.0458 was the estimated risk of infection or death at two weeks in the infant ART arm of BAN.
Step 2
If S_i (1) = 1, then by monotonicity S_i(0) = 1. In this case we let T_i(0) = J_i(0) = T_i(1) = J_i(1) = * because the survival time and failure type for individuals with S_i = 1 are not used by any of the estimators of CE(t, j).
Step 3
If S_i(1) = 0, then (T_i(1), J_i(1)) were generated according to the Gompertz models described above. In particular, first J_i(1) was generated from a multinomial distribution with cell probabilities 1 − exp(λ_j/α_j) for j = 1, 2, 3. Then T_i(1) was set equal to τ₀ + U_i where U_i was randomly generated from the conditional distribution Pr[T_i(1) ≤ t|J_i(1) = j] = F (t, j)/Pr[J_i(1) = j] using the inverse probability transformation. Generating T_i(1) in this fashion guarantees that T_i(1) > τ₀ = 2 whenever S_i(1) = 0.
Step 4
If S_i(1) = 0, S_i(0) was generated as follows. For β₁ = −∞, $S_{i}(0) = I(T_{i}(1) < q_{1}^{(1 - γ)}, J_{i}(1) = 1)$ where $q_{j}^{(1 - γ)}$ is defined in general such that $Pr[T_{i}(1) \leq q_{j}^{(1 - γ)}, J_{i}(1) = j ∣ S_{i}(1) = 0] = 1 - γ$. Note for the first scenario (8) holds for t = 28 and j = 1, guaranteeing the existence of $q_{1}^{(1 - γ)}$. For β₁ = −1, 0, 1, the value of $F_{1}^{prot}(28, 1; β_{1})$ was found by solving (1) and (4) simultaneously, and then S_i(0) ~ Bernoulli($p_{β_{1}}$) where $p_{β_{1}} = (1 - γ) I(T_{i}(1) < 28, J_{i}(1) = 1) F_{1}^{prot}(28, 1; β_{1}) / F_{1}(28, 1) + (1 - γ) \{1 - I(T_{i}(1) < 28, J_{i}(1) = 1)\} \{1 - F_{1}^{prot}(28, 1; β_{1})\} / \{1 - F_{1}(28, 1)\}$. For β₁ = ∞, S_i(0) ~ Bernoulli(p_∞) where p_∞ = (1 − γ){1 − I(T_i(1) < 28, J_i(1) = 1)}/{1 − F₁(28, 1)}. Note for the first scenario (6) holds for t = 28 and j = 1, implying 1 − γ < 1 − F₁(28, 1), thus ensuring p_∞ < 1.
Step 5
If S_i(0) = 0, then we let J_i(0) = J_i(1). If S_i(0) = 0 and J_i(0) = 1, then T_i(0) = T_i(1)/ε, where ε was chosen such that CE(28, 1) = −0.05. If S_i(0) = 0 and J_i(0) ≠ 1, then T_i(0) = T_i(1). If S_i(0) = 1, then we set T_i(0) = J_i(0) = *.
Step 6
C_i(0) and C_i(1) were generated from exponential distributions with means 29 weeks and 18 weeks respectively.
Step 7
Z_i was randomly assigned such that n₁ = 852 and n₀ = 668.
Step 8
Given Z_i, we set Y_i = min{T_i(Z_i), C_i(Z_i)}, Δ_i = I(Y_i = T_i(Z_i)), J_i = J_i(Z_i), and S_i = S_i(Z_i).
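Step 3 is the least routine part of the data generation, so a minimal Python sketch of it is given here: draw the event type from the multinomial distribution with cell probabilities 1 − exp(λ_j/α_j), then invert the conditional Gompertz distribution to obtain the elapsed time U_i. The parameter values `ALPHAS` and `P_INF` are hypothetical placeholders (the values used by the authors are not reported in the text), and `draw_t1_j1` is a name introduced only for this illustration.

```python
import math
import random

# HYPOTHETICAL parameters: alpha_j < 0 chosen arbitrarily, and lam_j set so
# that Pr[J = j] = 1 - exp(lam_j / alpha_j) equals 0.03, 0.03, 0.94.
ALPHAS = (-0.04, -0.04, -0.02)
P_INF = (0.03, 0.03, 0.94)
LAMS = tuple(a * math.log(1.0 - p) for a, p in zip(ALPHAS, P_INF))


def draw_t1_j1(rng, tau0=2.0):
    """Draw (T_i(1), J_i(1)) for an individual with S_i(1) = 0 (Step 3)."""
    probs = [1.0 - math.exp(l / a) for a, l in zip(ALPHAS, LAMS)]
    j = rng.choices([1, 2, 3], weights=probs)[0]
    a, l, p = ALPHAS[j - 1], LAMS[j - 1], probs[j - 1]
    # Inverse probability transform: solve F(u_time, j) / Pr[J = j] = u
    u = rng.random()
    elapsed = math.log(1.0 - a * math.log(1.0 - u * p) / l) / a
    return tau0 + elapsed, j


rng = random.Random(2012)
draws = [draw_t1_j1(rng) for _ in range(5)]
print(draws)
```

Adding τ₀ to the drawn elapsed time is what ensures the simulated event time falls after the conditioning time point, as noted in Step 3.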

These steps resulted in simulated data sets satisfying Assumptions 2.1 – 2.4 with CE(28, 1) = −0.05 for the first scenario. For each data set simulated, $\hat{C E} (28, 1; β_{1})$ was computed for β₁ = −∞, −1, 0, 1, ∞. Bootstrap percentile and Wald 95% confidence intervals as well as the uncertainty intervals described in Section 3.3 were also computed for each simulated data set, assumed value of β₁, and estimator of CE(28, 1).

Table 1 reports the mean relative bias of $\hat{C E} (28, j; β_{j})$ based on 10,000 simulated data sets for both scenarios (γ = 0.9884, j = 1, and γ = 0.75, j = 3) and each model (β_j = −∞, −1, 0, 1, ∞). The proposed estimator $\hat{C E} (28, j; β_{j})$ is approximately unbiased when β_j is correctly specified; for incorrectly specified β_j the relative bias can be quite large. For example, if β₁ is (incorrectly) assumed to be zero, corresponding to the naive analysis that simply compares infants HIV free and alive at two weeks from each study arm, when in fact β₁ = −∞, then the relative bias of $\hat{C E} (28, 1; β_{1})$ is 23%. This demonstrates how a naive analysis that ignores the potential for selection bias can yield incorrect inference. This is demonstrated further in the scenario where γ = 0.75, in which case misspecifying β₃ leads to even greater relative bias.

Table 1.

Empirical relative bias of estimates of CE(28, j) from simulation study described in Section 4 for both scenarios. Bold entries correspond to estimates where the assumed β_j was correct. Relative bias of $\hat{C E} (28, j; β_{j})$ defined as ${\hat{C E} (28, j; β_{j}) - C E (28, j)} / C E (28, j)$ .

True parameters			Assumed β_j
γ	CE	β_j	−∞	−1	0	1	∞
0.9884	−0.05	−∞	0.02	0.22	0.23	0.24	0.24
		−1	−0.19	0.01	0.01	0.02	0.02
		0	−0.21	−0.01	0.00	0.00	0.01
		1	−0.20	−0.01	0.00	0.00	0.00
		∞	−0.21	−0.01	−0.00	0.00	0.00
0.75	0.05	−∞	−0.01	−1.10	−2.01	−3.13	−6.69
		−1	1.02	−0.01	−0.88	−1.96	−5.32
		0	2.00	0.91	−0.01	−1.12	−4.67
		1	2.66	1.77	0.98	0.00	−2.93
		∞	6.50	5.44	4.55	3.45	0.00


Table 2 shows the empirical coverage probabilities of 95% pointwise bootstrap confidence intervals based on 500 bootstrap replications per simulated data set. When the correct β_j is specified, the confidence intervals associated with $\hat{C E} (28, j; β_{j})$ have approximately 95% coverage. Similar results were found using Wald confidence intervals (results not shown). Because β_j is not identifiable from the observable data, coverage of the uncertainty regions is perhaps of more practical interest. For the 50,000 simulated data sets from the first scenario (i.e., combining across the 10,000 data sets for each of the five values of β_j), the empirical coverage of the 95% pointwise uncertainty regions was 97%. Similarly for the second scenario, the empirical coverage of the uncertainty intervals was 97%.

Table 2.

Empirical coverage of pointwise 95% bootstrap percentile confidence intervals of CE(28, j) from simulation study described in Section 4 for both scenarios. Bold entries correspond to estimates where the assumed β_j was correct.

True parameters			Assumed β_j
γ	CE	β_j	−∞	−1	0	1	∞
0.9884	−0.05	−∞	0.95	0.81	0.80	0.80	0.79
		−1	0.90	0.95	0.94	0.94	0.94
		0	0.88	0.94	0.94	0.94	0.94
		1	0.89	0.95	0.95	0.95	0.95
		∞	0.89	0.95	0.95	0.95	0.95
0.75	0.05	−∞	0.95	0.55	0.06	0.00	0.00
		−1	0.66	0.94	0.65	0.06	0.00
		0	0.14	0.66	0.94	0.47	0.00
		1	0.01	0.12	0.55	0.94	0.01
		∞	0.00	0.00	0.00	0.00	0.94


5. Application to BAN Study

The BAN study was a randomized clinical trial to assess interventions for the prevention of breast milk transmission of HIV in 2369 HIV infected mothers and their infants in Lilongwe, Malawi [3,4]. There were three arms in the BAN study: daily ART for the infant, daily ART for the mother, or control. While the primary analysis of the study considered comparisons of both ART arms to control, we will focus on comparing the infant ART and control arms only. In March 2008 the data and safety monitoring board stopped the control arm due to efficacy but recommended continued enrollment of mother/infant pairs into the two active treatment arms. This led to an imbalance in the final number of infants randomized to the three arms, with 852 infants in the infant ART arm and 668 infants in the control arm. In the infant ART arm there were 37 HIV infections and 2 deaths before τ₀ = 2 weeks, while the control arm had 36 HIV infections and 2 infant deaths prior to 2 weeks. Thus γ̂ = (630/668)/(813/852) = 0.9884, as in the first scenario of the simulations in Section 4. Among infants HIV free and alive at 2 weeks, in the infant ART (control) arm 12 (32) became HIV infected, 588 (384) weaned prior to HIV infection, and 5 (6) died prior to HIV infection or weaning by 28 weeks. Figure 1 shows the Aalen-Johansen estimates of the cumulative risk of HIV, death prior to HIV infection or weaning, and cessation of breastfeeding prior to HIV infection for infants who were alive and uninfected at 2 weeks as in a standard analysis, i.e., assuming the no selection model β_j = 0 holds for all j. Figure 1(a) suggests a difference in the risk of HIV infection between the infant ART arm and the control arm; however, direct comparison between the arms is subject to selection bias.
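
The arithmetic behind γ̂ can be reproduced from the counts quoted above. This is a minimal sketch that assumes simple proportions suffice (i.e., it ignores any censoring before two weeks); the variable names are illustrative and not taken from the paper.

```python
# Infants randomized to each arm (from the text above).
n_control, n_art = 668, 852

# HIV infections plus deaths before tau_0 = 2 weeks.
early_events_control = 36 + 2
early_events_art = 37 + 2

# Infants alive and uninfected at 2 weeks (S = 0) in each arm.
N_control = n_control - early_events_control  # 630
N_art = n_art - early_events_art              # 813

# gamma-hat: ratio of the two-week "alive and uninfected" proportions.
gamma_hat = (N_control / n_control) / (N_art / n_art)

# Crude two-week risks of HIV infection or death in each arm.
risk_control = early_events_control / n_control
risk_art = early_events_art / n_art

print(f"gamma_hat = {gamma_hat:.4f}")
print(f"two-week risk: control = {risk_control:.4f}, infant ART = {risk_art:.4f}")
```

The printed value of γ̂ rounds to the 0.9884 reported in the text, and the two crude risks show the lower early event rate in the infant ART arm.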

Estimated cumulative incidence functions, *F̂_z* (23, j), for the three events from the BAN study: (a) HIV infection, (b) HIV-free death prior to weaning, and (c) cessation of breastfeeding prior to HIV infection. For each panel, *Z_i* = 0 (control) is represented by the solid line (—) and *Z_i* = 1 (infant ART) is represented by the dashed line (– – –).

Figure 2 shows the semiparametric sensitivity analysis described in Section 3.2. The plot depicts $\hat{C E} (28, 1; β_{1})$ and pointwise 95% Wald confidence intervals for each value of β₁ (using bootstrap variance estimates). Note for the infant ART arm F̂₁(28, 1) = 0.0141, suggesting (6) and (8) hold for t = 28 and j = 1. The estimated ignorance region for CE(28, 1) equals [−0.056, −0.044] and the estimated 95% uncertainty interval equals [−0.078, −0.025]. This estimated uncertainty interval was computed using bootstrap variance estimates; using the analytical variance estimates (7) and (9) yielded a slightly wider uncertainty interval of [−0.084, −0.025]. In either case, because the uncertainty interval excludes 0, we conclude there is evidence of a causal effect of infant ART on the cumulative incidence of HIV at 28 weeks in the NI stratum. Moreover, without any assumptions about the selection bias mechanism, we are 95% confident that daily infant ART lowers the risk of HIV infection by 28 weeks by between 3 and 8 percentage points.
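
As a rough check on the width of the ignorance region, the sketch below evaluates the bounds on the infant ART arm cumulative incidence within the NI stratum, using the forms F̂₁/γ̂ and {F̂₁ − (1 − γ̂)}/γ̂ that appear in the Appendix, truncated to [0, 1]. It assumes, as holds under monotonicity, that the control arm term of CE(28, 1) is the same at both ends of the region, so the width of the region equals the distance between these two bounds. The function name is illustrative.

```python
def f1_ni_bounds(f1: float, gamma: float) -> tuple[float, float]:
    """Bounds on the treated-arm cumulative incidence in the NI stratum."""
    lower = max((f1 - (1.0 - gamma)) / gamma, 0.0)
    upper = min(f1 / gamma, 1.0)
    return lower, upper


# Values reported in the text: F1-hat(28, 1) = 0.0141 and gamma-hat = 0.9884.
lower, upper = f1_ni_bounds(0.0141, 0.9884)
width = upper - lower

print(f"lower = {lower:.4f}, upper = {upper:.4f}, width = {width:.4f}")
```

The resulting width is about 0.012, in line with the reported ignorance region [−0.056, −0.044].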

Sensitivity analysis of the effect of infant ART on the cumulative incidence of HIV at 28 weeks for the BAN study. The solid line — denotes $\hat{C E} (28, 1; β_{1})$ and the dotted lines · · · denote pointwise 95% confidence intervals. The estimated non-parametric bounds corresponding to β₁ = −∞ and β₁ = ∞ are given by ○.

The veracity of these results relies on several key assumptions. While interference between infants was not likely, SUTVA could have been violated by changes in the infant ART regimen. Per protocol, if an infant on ART had an adverse event due to the study drug (nevirapine), the ART was changed (to lamivudine) and the infant remained in the study. Thus not all infants were on the same treatment for the duration of the study. Therefore, the effect of ART being estimated can be viewed as an average causal effect over all administered ARTs [26]. While this interpretation answers the hypothesis proposed for the BAN study, it does not indicate which particular ART causes the greatest reduction in risk of HIV infection. Assumption 2.1 seems reasonable because treatment was randomized. While mothers were not blinded, they were counseled to breastfeed their infants regardless of randomization assignment and self-reported frequency of exclusive breastfeeding was comparable between study arms [4]. The BAN study principal investigator, Dr. Charles van der Horst, indicated that monotonicity (Assumption 2.2) is reasonable (personal communication). Dr. van der Horst conjectured that an infant could have an adverse reaction to ART leading to increased susceptibility to HIV infection but he felt this was “highly unlikely.” Monotonicity is also supported by the estimated risk of HIV infection or death at two weeks being lower in the infant ART arm than in the control arm.

Finally, note that two of the three endpoints in BAN were interval censored. In particular, the HIV infection times of the infants were interval censored, known only to be between the last negative and first positive HIV tests. Similarly, the actual timing of weaning is known only to lie between the visits at which the mother reported still breastfeeding and having weaned. On the other hand, the time of death was known exactly for all infants. Other analyses of the BAN data have found that formally accounting for interval censoring almost always gives nearly the same result as using the midpoint or right endpoint of the interval. This is not surprising given the visits in the BAN study were fairly close together, typically two to four weeks apart. In settings where the intervals are wider, midpoint or right endpoint imputation may yield misleading results. Instead, a non-parametric estimator of F₁(t, j) that allows for interval censored event times [27] can be employed in place of the Aalen-Johansen estimator. Inference that formally accounts for interval censoring is challenging however, owing to slow rates of convergence and non-standard limiting distributions of non-parametric estimators (for continuous time models) [28, 29].

6. Discussion

The objective of many MTCT trials is to determine differences in the cumulative risk of breastfeeding transmission of HIV between study arms conditional on infants being HIV free and alive by some time point τ₀ > 0. Here we have presented methods for evaluating the effect of treatment on the cumulative risk of HIV within a principal stratum when death and weaning are competing risks. Large sample non-parametric bounds and a semi-parametric sensitivity analysis model were developed, and the methods were applied to the BAN study, a large, recent MTCT trial. A simulation study was presented demonstrating that the proposed methods perform well in finite samples similar to the BAN study. The simulations also illustrated how analyses that ignore the potential for selection bias by simply conditioning on being HIV free and alive at τ₀ can give misleading results in settings similar to the BAN study.

The analysis of the BAN study indicates infant ART reduces the risk of HIV infection by 28 weeks in infants who would be HIV free and alive at two weeks regardless of treatment assignment. The proposed methods could be applied in other settings as well. For example, BAN investigators (personal communication) were interested in comparing the risk of HIV infection or death by 48 weeks conditional on infants being HIV free and alive at 28 weeks; here τ₀ = 28 weeks is further from time 0 and the potential for selection bias is even greater than in the analysis presented in Section 5. Another example is given by the Zambia Exclusive Breastfeeding (ZEB) study, a randomized MTCT study conducted to evaluate whether abrupt weaning at four months compared with continued breastfeeding increases survival of children of HIV-infected mothers [30]. Randomization occurred at one month postpartum in the ZEB study; however, Kuhn et al. [30] presented a comparison of the randomized groups conditional on infants being HIV free and breastfeeding at four months.

A key assumption of the methods described in this paper is monotonicity, which implies that the treatment is no worse than control for any individual in terms of the intermediate variable S. This assumption seems reasonable in the analysis of the BAN study presented in Section 5, but in other settings it may be unrealistic. For example, monotonicity might be considered dubious in an analysis comparing the two active arms of the BAN trial, i.e., maternal ART versus infant ART. In such settings methods that relax or do not require this assumption would be needed. Following Zhang and Rubin [31], nonparametric bounds analogous to those in Section 2 can be derived without assuming monotonicity. Specifically, note that $F_{0} (t, j) = φ F_{0}^{N I} (t, j) + (1 - φ) F_{0}^{harm} (t, j)$ , where φ = Pr[S_i(1) = 0|S_i(0) = 0] and $F_{0}^{harm} (t, j) = Pr [T_{i} (0) \leq t, J_{i} (0) = j ∣ S_{i} (0) = 0, S_{i} (1) = 1]$ . If γ and φ were identifiable, then bounds for $F_{0}^{N I} (t, j)$ can be constructed analogous to (2) and (3) and combined with bounds for $F_{1}^{N I} (t, j)$ to obtain the following bounds on CE(t, j):

$$C E^{low} (t, j) = \max \left\{ \frac{F_{1} (t, j) - (1 - γ)}{γ}, 0 \right\} - \min \left\{ \frac{F_{0} (t, j)}{φ}, 1 \right\} \tag{10}$$

and

$$C E^{u p} (t, j) = \min \left\{ \frac{F_{1} (t, j)}{γ}, 1 \right\} - \max \left\{ \frac{F_{0} (t, j) - (1 - φ)}{φ}, 0 \right\} . \tag{11}$$

However, without the monotonicity assumption γ and φ are not identifiable. Let π = Pr[S_i(0) = 0, S_i(1) = 1] and note that

$$γ = \Pr [S_{i} (0) = 0 \mid S_{i} (1) = 0] = \frac{\Pr [S_{i} (0) = 0, S_{i} (1) = 0]}{\Pr [S_{i} (1) = 0]} = \frac{\Pr [S_{i} (0) = 0] - π}{\Pr [S_{i} (1) = 0]}$$

and

$$φ = \Pr [S_{i} (1) = 0 \mid S_{i} (0) = 0] = \frac{\Pr [S_{i} (0) = 0, S_{i} (1) = 0]}{\Pr [S_{i} (0) = 0]} = \frac{\Pr [S_{i} (0) = 0] - π}{\Pr [S_{i} (0) = 0]}$$

are identifiable from the observed data for a fixed value of π. Thus, the lower bound of CE(t, j) is found by minimizing (10) over π where max{0, Pr[S_i(0) = 0] − Pr[S_i(1) = 0]} ≤ π ≤ min{Pr[S_i(0) = 0], Pr[S_i(1) = 1]}. Likewise, the upper bound of CE(t, j) is found by maximizing (11) over the same range of π. Sensitivity analysis could be performed by adapting the methods of Shepherd et al. [32]. For instance, similar to Assumption 2.4, a selection model for $F_{0}^{N I} (t, j)$ could be assumed, such as:

Assumption 2.5:

$$\exp (η_{j}) = \frac{F_{0}^{N I} (t, j) / \{1 - F_{0}^{N I} (t, j)\}}{F_{0}^{harm} (t, j) / \{1 - F_{0}^{harm} (t, j)\}} . \tag{12}$$
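
For fixed φ and η_j, the odds ratio in Assumption 2.5 combined with the mixture identity for F₀(t, j) stated above determines the NI-stratum and harm-stratum cumulative incidences. The following is a minimal sketch of that calculation, assuming 0 < F₀(t, j) < 1 and 0 < φ ≤ 1. The bisection solver is our choice for illustration (the paper does not specify a numerical method), and the function name and input values are hypothetical.

```python
import math


def solve_f0_ni(f0: float, phi: float, eta: float, tol: float = 1e-12):
    """Solve phi*x + (1 - phi)*y = f0 subject to odds(x) / odds(y) = exp(eta).

    Returns (x, y), where x plays the role of F0^NI and y of F0^harm.
    """

    def harm_given_ni(x: float) -> float:
        # Odds of y are the odds of x divided by exp(eta).
        odds_y = (x / (1.0 - x)) * math.exp(-eta)
        return odds_y / (1.0 + odds_y)

    # The mixture is strictly increasing in x, so bisection on (0, 1) works.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        mixture = phi * mid + (1.0 - phi) * harm_given_ni(mid)
        if mixture < f0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, harm_given_ni(x)


# Hypothetical inputs chosen only for illustration.
f0_ni, f0_harm = solve_f0_ni(f0=0.06, phi=0.95, eta=1.0)
print(f"F0_NI = {f0_ni:.5f}, F0_harm = {f0_harm:.5f}")
```

Setting η_j = 0 returns the same value for both strata, which corresponds to no selection effect in the control arm.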

Sensitivity analysis under Assumptions 2.1, 2.3, 2.4, and 2.5 would be performed by varying π over max{0, Pr[S_i(0) = 0] − Pr[S_i(1) = 0]} ≤ π ≤ min{Pr[S_i(0) = 0], Pr[S_i(1) = 1]} and η_j, β_j each over (− ∞ , ∞). The resulting inference will be more precise if the ranges of π, η_j, and β_j can be further restricted based on prior information elicited from subject matter experts.
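
The optimization of bounds (10) and (11) over π can be sketched numerically. The code below is an illustration under stated assumptions rather than the authors' implementation: it uses a simple grid search over the admissible range of π (so the optimum is only approximated to grid resolution), the function names are hypothetical, and the input values are invented for illustration.

```python
def ce_bounds_at_pi(f1, f0, p0, p1, pi):
    """Evaluate (10) and (11) at a fixed pi = Pr[S(0) = 0, S(1) = 1].

    p0 = Pr[S(0) = 0] and p1 = Pr[S(1) = 0] are identifiable from the data.
    """
    joint = p0 - pi          # Pr[S(0) = 0, S(1) = 0]
    gamma = joint / p1
    phi = joint / p0
    low = max((f1 - (1.0 - gamma)) / gamma, 0.0) - min(f0 / phi, 1.0)
    up = min(f1 / gamma, 1.0) - max((f0 - (1.0 - phi)) / phi, 0.0)
    return low, up


def ce_bounds_without_monotonicity(f1, f0, p0, p1, grid_size=1001):
    """Minimize (10) and maximize (11) over a grid of admissible pi values."""
    pi_lo = max(0.0, p0 - p1)
    pi_hi = min(p0, 1.0 - p1)
    lows, ups = [], []
    for k in range(grid_size):
        pi = pi_lo + (pi_hi - pi_lo) * k / (grid_size - 1)
        if p0 - pi <= 0.0:
            continue  # NI stratum would be empty; gamma and phi are not usable
        low, up = ce_bounds_at_pi(f1, f0, p0, p1, pi)
        lows.append(low)
        ups.append(up)
    return min(lows), max(ups)


# Hypothetical inputs chosen only for illustration.
f1, f0, p0, p1 = 0.02, 0.06, 0.94, 0.95
low, up = ce_bounds_without_monotonicity(f1, f0, p0, p1)
mono_low, mono_up = ce_bounds_at_pi(f1, f0, p0, p1, pi=0.0)
print(f"without monotonicity: [{low:.4f}, {up:.4f}]")
print(f"at pi = 0:            [{mono_low:.4f}, {mono_up:.4f}]")
```

With these inputs π = 0 gives φ = 1, so the second line corresponds to the bounds obtained when no infant is harmed by treatment; allowing π > 0 can only widen the interval.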

For the MTCT research motivating this work, interest focused on the principal stratum of infants HIV free and alive at τ₀ under either treatment assignment. The methods developed could also be applied to infants HIV infected and alive at τ₀ under either treatment where T might denote the time until death from various causes. Beyond MTCT trials, the methods developed could be applied in other settings where inference about treatment effects within principal strata is of interest (e.g., truncation-by-death or non-compliance) and the endpoint is a time-to-event outcome subject to competing risks. Further research might entail allowing the cumulative incidence functions to depend on baseline covariates (e.g., as in Jeong and Fine [33]).

Acknowledgments

Contract/grant sponsor: NIH grants R01-AI085073 and U48-DP001944.

This work was supported in part by grants U48-DP001944 from the US Centers for Disease Control and Prevention (CDC) and R01-AI085073 from the US National Institutes of Health (NIH). The content is solely the responsibility of the authors and does not necessarily represent the official views of CDC or NIH. The authors would like to thank Dr. Jason Fine for his helpful comments and the BAN investigators for access to the data from their study.

7. Appendix

Asymptotic Variances of ${\hat{F}}_{1}^{N I, u p} (t, j)$ and ${\hat{F}}_{1}^{N I, low} (t, j)$

To derive the asymptotic variances of ${\hat{F}}_{1}^{N I, u p} (t, j)$ and ${\hat{F}}_{1}^{N I, low} (t, j)$, we first derive the large sample variance of γ̂. Under monotonicity, it is straightforward to show $\hat{γ} - (N_{0} / n_{0}) / (N_{1} / n_{1}) \overset{p}{\to} 0$, implying γ̂ and (N₀/n₀)/(N₁/n₁) have the same limiting distribution; therefore for the derivation below we can assume γ̂ = (N₀/n₀)/(N₁/n₁). For z, s = 0, 1, define p_zs = Σ I [Z_i = z, S_i = s]/n and π_zs = Pr[Z_i = z, S_i = s], and let p = (p₀₀, p₀₁, p₁₀, p₁₁)′ and π = (π₀₀, π₀₁, π₁₀, π₁₁)′. Define the function g as g(π) = π₀₀(π₁₀ + π₁₁)/{π₁₀(π₀₀ + π₀₁)} and note that g(p) = γ̂ and g(π) = γ. Then by the multivariate central limit theorem and the delta method (e.g., see Agresti 2002 [34], page 580), $\sqrt{n} (\hat{γ} - γ) = \sqrt{n} \{g (p) - g (π)\} \overset{D}{\to} N (0, σ_{γ}^{2})$ where $σ_{γ}^{2} = \sum_{z, s = 0}^{1} π_{z s} {(\nabla g_{z s})}^{2} - {(\sum_{z, s = 0}^{1} π_{z s} \nabla g_{z s})}^{2}$ and ∇g_zs = ∂g(π)/∂π_zs. It follows from straightforward algebra that $σ_{γ}^{2} = γ^{2} [π_{01} / \{π_{00} (π_{00} + π_{01})\} + π_{11} / \{π_{10} (π_{10} + π_{11})\}]$ for which a consistent estimator is ${\hat{σ}}_{γ}^{2} = {\hat{γ}}^{2} n (1 / N_{0} - 1 / n_{0} + 1 / N_{1} - 1 / n_{1})$.
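
The closed-form estimator of σ_γ² can be checked numerically against the plug-in version of the expression in terms of the cell proportions. The sketch below uses the two-week counts from the BAN application in Section 5 and assumes n = n₀ + n₁ (only the two arms being compared); the function name is illustrative.

```python
import math


def gamma_hat_variance(N0: int, n0: int, N1: int, n1: int):
    """Return gamma-hat and two equivalent estimates of sigma_gamma^2."""
    n = n0 + n1
    gamma = (N0 / n0) / (N1 / n1)

    # Closed-form estimator from the text.
    sigma2_closed = gamma**2 * n * (1 / N0 - 1 / n0 + 1 / N1 - 1 / n1)

    # Plug-in version using the cell proportions p_zs (z = arm, s = 0 if
    # alive and uninfected at tau_0, s = 1 otherwise).
    p00, p01 = N0 / n, (n0 - N0) / n
    p10, p11 = N1 / n, (n1 - N1) / n
    sigma2_plugin = gamma**2 * (
        p01 / (p00 * (p00 + p01)) + p11 / (p10 * (p10 + p11))
    )
    return gamma, sigma2_closed, sigma2_plugin


# Control arm: 630 of 668; infant ART arm: 813 of 852 (Section 5).
gamma, sigma2_closed, sigma2_plugin = gamma_hat_variance(630, 668, 813, 852)
se_gamma = math.sqrt(sigma2_closed / (668 + 852))
print(f"gamma_hat = {gamma:.4f}, se = {se_gamma:.4f}")
```

The two variance expressions agree up to floating-point error, and with these counts the standard error of γ̂ is roughly 0.012.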

For fixed t and j, let θ_tj = (F₁(t, j); γ)′ and θ̂_tj = (F̂₁(t, j); γ̂)′. Under the conditions stated in Section 3.1 of the main text, in particular assuming equation (6), it is straightforward to show ${\hat{F}}_{1}^{N I, u p} (t, j) - {\hat{F}}_{1} (t, j) / \hat{γ} \overset{p}{\to} 0$, implying ${\hat{F}}_{1}^{N I, u p} (t, j)$ and F̂₁(t, j)/γ̂ have the same limiting distribution. Therefore we can assume ${\hat{F}}_{1}^{N I, u p} (t, j) = {\hat{F}}_{1} (t, j) / \hat{γ}$ and, analogously, by equation (8) of the main text we can assume ${\hat{F}}_{1}^{N I, low} (t, j) = \{{\hat{F}}_{1} (t, j) - (1 - \hat{γ})\} / \hat{γ}$. Define the vector of functions h(x, y) = (x/y, {x − (1 − y)}/y)′ such that $h ({\hat{θ}}_{t j}) = ({\hat{F}}_{1}^{N I, u p} (t, j), {\hat{F}}_{1}^{N I, low} (t, j))^{'}$. Because F̂₁(t, j) and γ̂ are consistent and asymptotically normal, by the delta method $\sqrt{n} \{h ({\hat{θ}}_{t j}) - h (θ_{t j})\} \overset{D}{\to} N (0, \nabla h (θ_{t j}) \Sigma_{t j} \nabla h {(θ_{t j})}^{'})$ where

$$\nabla h (x, y) = \begin{bmatrix} 1 / y & - x / y^{2} \\ 1 / y & (1 - x) / y^{2} \end{bmatrix}, \quad \Sigma_{t j} = \begin{bmatrix} σ_{t j}^{2} & 0 \\ 0 & σ_{γ}^{2} \end{bmatrix},$$

and $σ_{t j}^{2}$ is the asymptotic variance of $\sqrt{n} \{{\hat{F}}_{1} (t, j) - F_{1} (t, j)\}$ such that in large samples $\mathrm{var} \{{\hat{F}}_{1} (t, j)\} = σ_{t j}^{2} / n$. It follows that ${\hat{F}}_{1}^{N I, u p} (t, j)$ and ${\hat{F}}_{1}^{N I, low} (t, j)$ are asymptotically normal with variances

$$\mathrm{var} \{{\hat{F}}_{1}^{N I, u p} (t, j)\} = \frac{\mathrm{var} \{{\hat{F}}_{1} (t, j)\}}{γ^{2}} + \frac{F_{1} {(t, j)}^{2} σ_{γ}^{2}}{n γ^{4}}, \tag{13}$$

and

$$\mathrm{var} \{{\hat{F}}_{1}^{N I, low} (t, j)\} = \frac{\mathrm{var} \{{\hat{F}}_{1} (t, j)\}}{γ^{2}} + \frac{\{1 - F_{1} (t, j)\}^{2} σ_{γ}^{2}}{n γ^{4}} . \tag{14}$$

Replacing var {F̂₁(t, j)}, γ, F₁(t, j), and $σ_{γ}^{2}$ in (13) and (14) with $\widehat{\mathrm{var}} \{{\hat{F}}_{1} (t, j)\}$, γ̂, F̂₁(t, j), and ${\hat{σ}}_{γ}^{2}$ yields equations (7) and (9) from the main text.
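
The plug-in step can be illustrated with a short sketch of the two variance expressions above. The counts, γ̂, and F̂₁(28, 1) are taken from Section 5, but the value used for the estimated variance of F̂₁(28, 1) is hypothetical, since it is not reported in the text; the function name is likewise illustrative.

```python
def ni_bound_variances(var_f1, f1, gamma, sigma2_gamma, n):
    """Plug-in variances of the upper and lower bound estimators."""
    common = var_f1 / gamma**2
    scale = sigma2_gamma / (n * gamma**4)
    var_up = common + f1**2 * scale
    var_low = common + (1.0 - f1) ** 2 * scale
    return var_up, var_low


# Quantities from Section 5.
N0, n0, N1, n1 = 630, 668, 813, 852
n = n0 + n1
gamma_hat = (N0 / n0) / (N1 / n1)
sigma2_gamma_hat = gamma_hat**2 * n * (1 / N0 - 1 / n0 + 1 / N1 - 1 / n1)
f1_hat = 0.0141

# HYPOTHETICAL variance of F1-hat(28, 1); not reported in the paper.
var_f1_hat = 1.7e-5

var_up, var_low = ni_bound_variances(
    var_f1_hat, f1_hat, gamma_hat, sigma2_gamma_hat, n
)
print(f"var_up = {var_up:.3e}, var_low = {var_low:.3e}")
```

Because the second term is weighted by F₁(t, j)² for the upper bound and by {1 − F₁(t, j)}² for the lower bound, the lower bound estimator has the larger variance whenever F₁(t, j) < 1/2, as is the case here.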

References

1. World Health Organization. HIV transmission through breastfeeding: a review of available evidence. Geneva: 2007.
2. UNAIDS. AIDS epidemic update. Joint United Nations Programme on HIV/AIDS; Geneva: Dec, 2007.
3. van der Horst C, Chasela C, Ahmed Y, Hoffman I, Hosseinipour M, Knight R, Fiscus S, Hudgens M, Kazembe P, Bentley M, et al. Modifications of a large HIV prevention clinical trial to fit changing realities: A case study of the Breastfeeding, Antiretroviral, and Nutrition (BAN) protocol in Lilongwe, Malawi. Contemporary Clinical Trials. 2009;30:24–33. doi: 10.1016/j.cct.2008.09.001.
4. Chasela C, Hudgens M, Jamieson D, Kayira D, Hosseinipour M, Kourtis A, Martinson F, Tegha G, Knight R, Ahmed Y, et al. Maternal or infant antiretroviral drugs to reduce HIV-1 transmission. New England Journal of Medicine. 2010;362(24):2271–2281. doi: 10.1056/NEJMoa0911486.
5. Kilewo C, Karlsson K, Ngarina M, Massawe A, Lyamuya E, Swai A, Lipyoga R, Mhalu F, Biberfeld G, Mitra Plus Study Team. Prevention of mother-to-child transmission of HIV-1 through breastfeeding by treating mothers with triple antiretroviral therapy in Dar es Salaam, Tanzania: The Mitra Plus Study. Journal of Acquired Immune Deficiency Syndromes. 2009;52(3):406–416. doi: 10.1097/QAI.0b013e3181b323ff.
6. Kumwenda N, Hoover D, Mofenson L, Thigpen M, Kafulafula G, Li Q, Mipando L, Nkanaunena K, Mebrahtu T, Bulterys M, et al. Extended antiretroviral prophylaxis to reduce breast-milk HIV-1 transmission. New England Journal of Medicine. 2008;359(2):119–129. doi: 10.1056/NEJMoa0801941.
7. Bedri A, Gudetta B, Isehak A, Kumbi S, Lulseged S, Mengistu Y, Bhore AV, Bhosale R, Varadhrajan V, Gupte N, et al. Extended-dose nevirapine to 6 weeks of age for infants to prevent HIV transmission via breastfeeding in Ethiopia, India, and Uganda: an analysis of three randomised controlled trials. Lancet. 2008 Jul;372(9635):300–313. doi: 10.1016/S0140-6736(08)61114-9.
8. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341X.2002.00021.x.
9. Pearl J. Principal stratification – a goal or a tool? The International Journal of Biostatistics. 2011;7(1):20. doi: 10.2202/1557-4679.1322.
10. VanderWeele T. Principal stratification–uses and limitations. The International Journal of Biostatistics. 2011;7(1):1–14. doi: 10.2202/1557-4679.1329.
11. Alioum A, Dabis F, Dequae-Merchadou L, Haverkamp G, Hudgens M, Hughes J, Karon J, Leroy V, Newell M, Richardson B, et al. Estimating the efficacy of interventions to prevent mother-to-child transmission of HIV in breast-feeding populations: development of a consensus methodology. Statistics in Medicine. 2001;20:3539–3556. doi: 10.1002/sim.1076.
12. Tsiatis AA. Competing risks. In: Armitage P, Colton T, editors. Encyclopedia of Biostatistics. Wiley; New York: 1998. pp. 824–834.
13. Andersen P, Abildstrom S, Rosthøj S. Competing risks as a multi-state model. Statistical Methods in Medical Research. 2002;11(2):203–215. doi: 10.1191/0962280202sm281ra.
14. Hudgens MG, Halloran ME. Causal vaccine effects on binary post-infection outcomes. Journal of the American Statistical Association. 2006;101:51–64. doi: 10.1198/016214505000000970.
15. Gilbert PB, Bosch R, Hudgens MG. Sensitivity analysis for the assessment of causal vaccine effects on viral load in HIV vaccine trials. Biometrics. 2003;59:531–541. doi: 10.1111/1541-0420.00063.
16. Hayden D, Pauler DK, Schoenfeld D. An estimator for treatment comparisons among survivors in randomized trials. Biometrics. 2005;61:305–310. doi: 10.1111/j.0006-341X.2005.030227.x.
17. Shepherd BE, Gilbert PB, Lumley T. Sensitivity analyses comparing time-to-event outcomes existing only in a subset selected postrandomization. Journal of the American Statistical Association. 2007;102(478):573–582. doi: 10.1198/016214507000000130.
18. Rubin DB. Discussion of “Randomization analysis of experimental data in the Fisher randomization test” by D. Basu. Journal of the American Statistical Association. 1980;75:591–593. doi: 10.2307/2287653.
19. Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran M, Berry D, editors. Statistics in Epidemiology, Environment and Clinical Trials. Springer-Verlag; 2000.
20. Scharfstein DO, Halloran ME, Chu H, Daniels MJ. On estimation of vaccine efficacy using validation samples with selection bias. Biostatistics. 2006;7:615–629. doi: 10.1093/biostatistics/kxj031.
21. Shepherd BE, Gilbert PB, Mehrotra DV. Eliciting a counterfactual sensitivity parameter. The American Statistician. 2007;61(1):56–63. doi: 10.1198/000313007X163213.
22. Aalen O, Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scandinavian Journal of Statistics. 1978;5(3):141–150.
23. Vansteelandt S, Goetghebeur E, Kenward M, Molenberghs G. Ignorance and uncertainty regions as inferential tools in a sensitivity analysis. Statistica Sinica. 2006;16(3):953–979.
24. Aalen O, Borgan O, Gjessing H. Survival and Event History Analysis: A Process Point of View. Springer; New York: 2008.
25. Jeong J, Fine J. Direct parametric inference for the cumulative incidence function. Journal of the Royal Statistical Society, Series C-Applied Statistics. 2006;55(2):187–200. doi: 10.1111/j.1467-9876.2006.00532.x.
26. VanderWeele TJ, Hernan MA. Causal inference under multiple versions of treatment. COBRA Preprint Series. 2011:77. doi: 10.1515/jci-2012-0002. http://biostats.bepress.com/cobra/ps/art77/
27. Hudgens MG, Satten GA, Longini IM. Nonparametric maximum likelihood estimation for competing risks survival data subject to interval censoring and truncation. Biometrics. 2001;57:74–80. doi: 10.1111/j.0006-341X.2001.00074.x.
28. Groeneboom P, Maathuis MH, Wellner JA. Current status data with competing risks: Consistency and rates of convergence of the MLE. Annals of Statistics. 2008;36:1031–1063. doi: 10.1214/009053607000000974.
29. Groeneboom P, Maathuis MH, Wellner JA. Current status data with competing risks: Limiting distribution of the MLE. Annals of Statistics. 2008;36:1064–1089. doi: 10.1214/009053607000000983.
30. Kuhn L, Aldrovandi G, Sinkala M, Kankasa C, Semrau K, Mwiya M, Kasonde P, Scott N, Vwalika C, Walter J, et al. Effects of early, abrupt weaning on HIV-free survival of children in Zambia. N Engl J Med. 2008;359:130–141. doi: 10.1056/NEJMoa073788.
31. Zhang JL, Rubin DB. Estimation of causal effects via principal stratification when some outcomes are truncated by “death”. Journal of Educational and Behavioral Statistics. 2003;28:353–368. doi: 10.3102/10769986028004353.
32. Shepherd B, Gilbert P, Dupont C. Sensitivity analyses comparing time-to-event outcomes only existing in a subset selected postrandomization and relaxing monotonicity. Biometrics. 2011;67(3):1100–1110. doi: 10.1111/j.1541-0420.2010.01508.x.
33. Jeong J, Fine J. Parametric regression on cumulative incidence function. Biostatistics. 2007;8(2):184–196. doi: 10.1093/biostatistics/kxj040.
34. Agresti A. Categorical Data Analysis. 2nd ed. John Wiley and Sons; New York: 2002.

[R1] 1.World Health Organization. HIV transmission through breastfeeding: a review of available evidence. Geneva: 2007. [Google Scholar]

[R2] 2.UNAIDS. AIDS epidemic update. Joint United Nations Programme on HIV/AIDS; Geneva: Dec, 2007. [Google Scholar]

[R3] 3.van der Horst C, Chasela C, Ahmed Y, Hoffman I, Hosseinipour M, Knight R, Fiscus S, Hudgens M, Kazembe P, Bentley M, et al. Modifications of a large HIV prevention clinical trial to fit changing realities: A case study of the Breastfeeding, Antiretroviral, and Nutrition (BAN) protocol in Lilongwe, Malawi. Contemporary Clinical Trials. 2009;30:24–33. doi: 10.1016/j.cct.2008.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] 4.Chasela C, Hudgens M, Jamieson D, Kayira D, Hosseinipour M, Kourtis A, Martinson F, Tegha G, Knight R, Ahmed Y, et al. Maternal or infant antiretroviral drugs to reduce HIV-1 transmission. New England Journal of Medicine. 2010;362(24):2271–2281. doi: 10.1056/NEJMoa0911486. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] 5.Kilewo C, Karlsson K, Ngarina M, Massawe A, Lyamuya E, Swai A, Lipyoga R, Mhalu F, Biberfeld G Mitra Plus Study Team. Prevention of mother-to-child transmission of HIV-1 through breastfeeding by treating mothers with triple antiretroviral therapy in Dar es Salaam, Tanzania: The Mitra Plus Study. Journal of Acquired Immune Deficiency Syndromes. 2009;52(3):406–416. doi: 10.1097/QAI.0b013e3181b323ff. [DOI] [PubMed] [Google Scholar]

[R6] 6.Kumwenda N, Hoover D, Mofenson L, Thigpen M, Kafulafula G, Li Q, Mipando L, Nkanaunena K, Mebrahtu T, Bulterys M, et al. Extended antiretroviral prophylaxis to reduce breast-milk HIV-1 transmission. New England Journal of Medicine. 2008;359(2):119–129. doi: 10.1056/NEJMoa0801941. [DOI] [PubMed] [Google Scholar]

[R7] 7.Bedri A, Gudetta B, Isehak A, Kumbi S, Lulseged S, Mengistu Y, Bhore AV, Bhosale R, Varadhrajan V, Gupte N, et al. Extended-dose nevirapine to 6 weeks of age for infants to prevent HIV transmission via breastfeeding in Ethiopia, India, and Uganda: an analysis of three randomised controlled trials. Lancet. 2008 Jul;372(9635):300–313. doi: 10.1016/S0140-6736(08)61114–9. [DOI] [PubMed] [Google Scholar]

[R8] 8.Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341X.2002.00021.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Pearl J. Principal stratification – a goal or a tool? The International Journal of Biostatistics. 2011;7(1):20. doi: 10.2202/1557–4679.1322. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] 10.VanderWeele T. Principal stratification–uses and limitations. The International Journal of Biostatistics. 2011;7(1):1–14. doi: 10.2202/1557–4679.1329. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] 11.Alioum A, Dabis F, Dequae-Merchadou L, Haverkamp G, Hudgens M, Hughes J, Karon J, Leroy V, Newell M, Richardson B, et al. Estimating the efficacy of interventions to prevent mother-to-child transmission of HIV in breast-feeding populations: development of a consensus methodology. Statistics in Medicine. 2001;20:3539–3556. doi: 10.1002/sim.1076. [DOI] [PubMed] [Google Scholar]

[R12] 12.Tsiatis AA. Competing risks. In: Armitage P, Colton T, editors. Encyclopedia of Biostatistics. Wiley: New York; 1998. pp. 824–834. [Google Scholar]

[R13] 13.Andersen P, Abildstrom S, Rosthøj S. Competing risks as a multi-state model. Statistical Methods in Medical Research. 2002;11(2):203–215. doi: 10.1191/0962280202sm281ra. [DOI] [PubMed] [Google Scholar]

[R14] 14.Hudgens MG, Halloran ME. Causal vaccine effects on binary post-infection outcomes. Journal of the American Statistical Association. 2006;101:51–64. doi: 10.1198/016214505000000970. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] 15.Gilbert PB, Bosch R, Hudgens MG. Sensitivity analysis for the assessment of causal vaccine effects on viral load in HIV vaccine trials. Biometrics. 2003;59:531–541. doi: 10.1111/1541-0420.00063. [DOI] [PubMed] [Google Scholar]

[R16] 16.Hayden D, Pauler DK, Schoenfeld D. An estimator for treatment comparisons among survivors in randomized trials. Biometrics. 2005;61:305–310. doi: 10.1111/j.0006-341X.2005.030227.x. [DOI] [PubMed] [Google Scholar]

[R17] 17.Shepherd BE, Gilbert PB, Lumley T. Sensitivity analyses comparing time-to-event outcomes existing only in a subset selected postrandomization. Journal of the American Statistical Association. 2007;102(478):573–582. doi: 10.1198/016214507000000130. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Rubin DB. Discussion of “Randomization analysis of experimental data in the Fisher randomization test” by D. Basu. Journal of the American Statistical Association. 1980;75:591–593. doi: 10.2307/2287653. [DOI] [Google Scholar]

[R19] 19.Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran M, Berry D, editors. Statistics in Epidemiology, Environment and Clinical Trials. Springer-Verlag; 2000. [Google Scholar]

[R20] 20.Scharfstein DO, Halloran ME, Chu H, Daniels MJ. On estimation of vaccine efficacy using validation samples with selection bias. Biostatistics. 2006;7:615–629. doi: 10.1093/biostatistics/kxj031.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R21] 21.Shepherd BE, Gilbert PB, Mehrotra DV. Eliciting a counterfactual sensitivity parameter. The American Statistician. 2007;61(1):56–63. doi: 10.1198/000313007X163213. [DOI] [Google Scholar]

[R22] 22.Aalen O, Johansen S. An empirical transition matrix for non-homogeneous markov chains based on censored observations. Scandinavian Journal of Statistics. 1978;5(3):141–150. [Google Scholar]

[R23] 23.Vansteelandt S, Goetghebeur E, Kenward M, Molenberghs G. Ignorance and uncertainty regions as inferential tools in a sensitivity analysis. Statistica Sinica. 2006;16(3):953–979. [Google Scholar]

[R24] 24.Aalen O, Borgan O, Gjessing H. Suvival and Event History Analysis: A Process Point of View. Springer; New York: 2008. [Google Scholar]

[R25] 25.Jeong J, Fine J. Direct parametric inference for the cumulative incidence function. Journal of the Royal Statistical Society, Series C-Applied Statistics. 2006;55(2):187–200. doi: 10.1111/j.1467-9876.2006.00532.x. [DOI] [Google Scholar]

[R26] 26.VanderWeele TJ, Hernan MA. Causal inference under multiple versions of treatment. COBRA Preprint Series. 2011:77. doi: 10.1515/jci-2012-0002. http://biostats.bepress.com/cobra/ps/art77/ [DOI] [PMC free article] [PubMed]

[R27] 27.Hudgens MG, Satten GA, Longini IM. Nonparametric maximum likelihood estimation for competing risks survival data subject to interval censoring and truncation. Biometrics. 2001;57:74–80. doi: 10.1111/j.0006-341X.2001.00074.x. [DOI] [PubMed] [Google Scholar]

[R28] 28.Groeneboom P, Maathuis MH, Wellner JA. Current status data with competing risks: Consistency and rates of convergence of the MLE. Annals of Statistics. 2008;36:1031–1063. doi: 10.1214/009053607000000974. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R29] 29.Groeneboom P, Maathuis MH, Wellner JA. Current status data with competing risks: Limiting distribution of the MLE. Annals of Statistics. 2008;36:1064–1089. doi: 10.1214/009053607000000983. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R30] 30.Kuhn L, Aldrovandi G, Sinkala M, Kankasa C, Semrau K, Mwiya M, Kasonde P, Scott N, Vwalika C, Walter J, et al. Effects of early, abrupt weaning on HIV-free survival of children in Zambia. N Engl J Med. 2008;359:130–141. doi: 10.1056/NEJMoa073788. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R31] 31. Zhang JL, Rubin DB. Estimation of causal effects via principal stratification when some outcomes are truncated by “death”. Journal of Educational and Behavioral Statistics. 2003;28:353–368. doi: 10.3102/10769986028004353.

[R32] 32. Shepherd B, Gilbert P, Dupont C. Sensitivity analyses comparing time-to-event outcomes only existing in a subset selected postrandomization and relaxing monotonicity. Biometrics. 2011;67(3):1100–1110. doi: 10.1111/j.1541-0420.2010.01508.x.

[R33] 33. Jeong J, Fine J. Parametric regression on cumulative incidence function. Biostatistics. 2007;8(2):184–196. doi: 10.1093/biostatistics/kxj040.

[R34] 34. Agresti A. Categorical Data Analysis. 2nd ed. John Wiley and Sons; New York: 2002.