The International Journal of Biostatistics. 2010 Apr 13;6(2):18. doi: 10.2202/1557-4679.1212

When to Start Treatment? A Systematic Approach to the Comparison of Dynamic Regimes Using Observational Data

Lauren E Cain, James M Robins, Emilie Lanoy, Roger Logan, Dominique Costagliola, Miguel A Hernán
PMCID: PMC3406513  PMID: 21972433

Abstract

Dynamic treatment regimes are the type of regime most commonly used in clinical practice. For example, physicians may initiate combined antiretroviral therapy the first time an individual’s recorded CD4 cell count drops below either 500 cells/mm3 or 350 cells/mm3. This paper describes an approach for using observational data to emulate randomized clinical trials that compare dynamic regimes of the form “initiate treatment within a certain time period of some time-varying covariate first crossing a particular threshold.” We applied this method to data from the French Hospital database on HIV (FHDH-ANRS CO4), an observational study of HIV-infected patients, in order to compare dynamic regimes of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” where x takes values from 200 to 500 in increments of 10 and m takes values 0 or 3. We describe the method in the context of this example and discuss some complications that arise in emulating a randomized experiment using observational data.

Keywords: dynamic treatment regimes, marginal structural models, HIV infection, antiretroviral therapy

1. Introduction

The goal of many observational studies is to compare the effects of two or more treatment regimes on a clinical outcome. Treatment regimes are dynamic when they depend on time-dependent covariates and static otherwise. For example, some guidelines for the use of combined antiretroviral therapy (cART) as a treatment for HIV recommend initiating treatment the first time the CD4 cell count drops below 350 cells/mm3 (Panel on Antiretroviral Guidelines, 2008; Recommandations du groupe d’experts, 2008). This recommendation is an example of a dynamic regime (Robins and Hernán, 2008) because the initiation of treatment depends on the evolution of a time-varying covariate, CD4 cell count. In contrast, most randomized trials have compared static regimes like “initiate treatment at the beginning of the study” and “do not initiate treatment during the study”.

Static regimes are the type of regime most commonly compared in randomized clinical trials, but they are rarely used in clinical practice. Dynamic regimes, on the other hand, are rarely compared in clinical trials, but they are the type of regime most commonly used in clinical practice. For example, randomized trials have demonstrated that cART is an effective therapy to prevent AIDS and death in HIV-infected individuals (Cameron et al., 1998; Hammer et al., 1997), but the comparison of static regimes does not provide information on the optimal CD4 cell count at which to initiate treatment. Until randomized trials comparing dynamic regimes are conducted, we need to rely on observational studies of HIV-infected individuals to identify the optimal CD4 cell count at which to initiate cART.

Hernán et al. (2006) described an approach to emulate randomized clinical trials with two dynamic regimes using observational data. They applied the method to data from the French Hospital database on HIV (FHDH-ANRS CO4), an observational study of HIV-infected individuals, to compare dynamic regimes of the form “initiate treatment when the recorded CD4 cell count first drops below x cells/mm3” where x takes two values, e.g., 200 and 500. Later van der Laan and Petersen (2007), Orellana et al. (2010a, b), and Robins et al. (2008) proposed generalizations of the method to simultaneously compare many dynamic regimes. In this paper, we extend the analysis of the FHDH data from 2 to 31 dynamic regimes of the form “initiate treatment when the recorded CD4 cell count first drops below x cells/mm3” where x takes values from 200 to 500 in increments of 10.

The analysis by Hernán et al. compared regimes in which individuals initiate treatment immediately (during the same month) after their CD4 cell count crosses a particular threshold. Since the expectation of immediate action is often unrealistic due to administrative delays and other factors, regimes that allow delayed action may be more clinically relevant than regimes that require immediate action. In this paper, we extend the method to allow for delayed action by considering regimes of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” where x takes values from 200 to 500 in increments of 10 and m > 0. The analysis by Hernán et al. was restricted to regimes with m = 0.

Below we describe how to emulate a randomized experiment involving multiple dynamic regimes using observational data, and how to address some complications that arise. For pedagogic reasons, we initially restrict our attention to dynamic regimes with m = 0, and then extend the method to dynamic regimes with m > 0. First, we provide a brief description of the FHDH data used in our analyses and introduce the notation used throughout the paper.

2. Data and notation

The French Hospital database on HIV (FHDH-ANRS CO4) (Piketty et al., 2008) includes HIV-infected individuals seen at 62 French teaching hospitals belonging to 29 HIV Treatment and Information Centres (COREVIH) in mainland France and French overseas territories. Data have been collected since 1992 through medical records review by trained research assistants. Quality control is performed via monitored on-site source documentation (AUDIT on randomized individuals). Individuals are followed up at the time of their clinic appointments, usually every three to four months, and the study attempts to collect information at least every six months. Death is ascertained through medical records review.

Our analysis was restricted to the 4,237 HIV-infected individuals who met the following eligibility criteria: age 18 years or older, antiretroviral therapy-naïve, no history of AIDS-defining illness (Ancelle-Park et al., 1993; CDC, 1992), no pregnancy, HIV-RNA >500 copies/mL, CD4 cell count and HIV-RNA measurements within six months of each other, and CD4 cell count between 200 and 500 cells/mm3 with no history of CD4 cell count less than 500 cells/mm3. All analyses were conducted with SAS 9.2 (Cary, North Carolina).

An individual’s time zero was defined as the first time all of the above criteria were met. Time t is measured in months and ranges from 0 to 142. For each individual, follow-up ended at the time the outcome occurred, 12 months after the most recent laboratory measurement, pregnancy, or the administrative end of follow-up (December 2008), whichever occurred earlier. Initiation of cART was defined as the date at which an individual initiated use of either three or more antiretroviral drugs, or two ritonavir-boosted protease inhibitors, or one non-nucleoside reverse transcriptase inhibitor plus one boosted protease inhibitor. Ait = 1 indicates that individual i has initiated treatment by time t, 0 otherwise. The outcome of interest was clinical AIDS or death. The date of death was identified as described elsewhere (HIV-CAUSAL Collaboration, 2010) and AIDS was ascertained by the treating physicians. Dit = 1 indicates that individual i developed the outcome during time t, 0 otherwise. Lit is a vector of individual i’s covariates measured at time t. Vi is a vector of the time-fixed covariates measured at time zero (a subset of Li0) that includes sex, age (<35, 35–49, ≥50 years), geographic origin (Europe/North America, Sub-Saharan Africa, Latin America/Caribbean, other), mode of transmission (heterosexual, homosexual/bisexual, injection drug use, other or unknown), CD4 cell count (restricted cubic spline with three knots), HIV-1 RNA (<10,000, 10,000–100,000, >100,000 copies/mL), calendar year (1997–1998, 1999–2000, 2001–2003, 2004–2008), and years since HIV diagnosis (<1, 1–4, ≥5 years, unknown). For simplicity, the analyses below assume that censoring due to infrequent laboratory measurement was ignorable given the measured baseline covariates V. This assumption can be relaxed easily by using time-varying covariates to estimate inverse probability weights as described by Hernán, Brumback and Robins (2001).

We use overbars to denote the history of a time-dependent variable. For example, Āit represents individual i’s treatment history through time t, or Āit = [Ai0, Ai1, Ai2, …, Ait]. Likewise, L̄it is individual i’s covariate history through time t. We often suppress the i subscript because we assume that the random vector for each individual is drawn independently from a distribution common to all individuals.

A static regime is defined as ā = [a0, a1, a2, …, at]. For example, ā = [1,1,1,…,1] and ā = [0,0,0,…,0] are the static regimes “initiate treatment at the beginning of the study and continue throughout” and “do not initiate treatment during the study”, respectively. In contrast, we cannot define a dynamic regime as a sequence of 0s and 1s because the actual treatment at each time is not known until the time-varying covariates are measured. In this paper, we first consider the 31 dynamic regimes “initiate treatment in the same month that the recorded CD4 cell count first drops below x cells/mm3” where x takes values from 200 to 500 in increments of 10 and treatment initiation occurs during the same month that CD4 cell count crosses the threshold x (i.e., m = 0). In a slight abuse of notation, we index the dynamic regimes by x. Therefore, x = 350 corresponds to the regime “initiate treatment in the same month that the recorded CD4 cell count first drops below 350 cells/mm3”.
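The contrast between static and dynamic regimes can be made concrete with a minimal sketch. The function name `treatment_under_regime` is hypothetical, and the sketch assumes m = 0 and that treatment, once initiated, is never stopped. It shows that a dynamic regime x is a rule mapping a CD4 history to a treatment sequence, rather than a fixed sequence of 0s and 1s.

```python
from typing import List, Sequence


def treatment_under_regime(cd4_history: Sequence[float], x: float) -> List[int]:
    """Treatment indicators a_0..a_t implied by regime x (m = 0) for a CD4 history.

    Assumption: treatment starts in the first month the recorded CD4 count
    is below x and is never stopped afterwards.
    """
    treated = 0
    plan = []
    for cd4 in cd4_history:
        if cd4 < x:
            treated = 1  # threshold crossed: initiate (or remain on) treatment
        plan.append(treated)
    return plan


cd4 = [352, 380, 273, 273, 198]  # illustrative monthly CD4 counts
print(treatment_under_regime(cd4, 350))  # [0, 0, 1, 1, 1]
print(treatment_under_regime(cd4, 200))  # [0, 0, 0, 0, 1]
```

The same regime applied to a different CD4 history yields a different treatment sequence, which is why the regime cannot be written down as ā in advance.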

3. Emulation of a randomized experiment

The preferred method for comparing the 31 dynamic regimes indexed by x is to conduct a randomized clinical trial with 31 arms. The first step of this hypothetical trial would be to identify individuals who meet the eligibility criteria given above. Second, we would randomly assign eligible individuals to one of the 31 regimes and follow them until AIDS, death, or the administrative end of follow-up. Third, we would compare the regime-specific AIDS-free survival. One simple approach would be to compare the AIDS-free survival at a predefined time point (e.g., five years from randomization). Then, among these 31 regimes, the optimal regime would be the one that results in the greatest proportion surviving without AIDS after five years.

This experiment would require extremely large sample sizes and is unlikely to be conducted. In the absence of a randomized clinical trial, we can try to emulate one using observational data (Hernán et al., 2006). The first step in emulating the trial is to identify eligible individuals and observations using the same criteria as the randomized clinical trial. Second, we review each individual’s CD4 cell count and treatment initiation histories to determine with which of the 31 regimes their data are consistent. If an individual’s data are consistent with regime x through time t, we consider him to be “following” regime x through time t. If and when the individual’s data are no longer consistent with following a particular regime, we artificially censor him at that time. For example, if an individual enters the study with a CD4 cell count of 205 cells/mm3 and does not initiate treatment at that time, his data are consistent with the regime x = 200 (i.e., initiate treatment in the same month that the recorded CD4 cell count first drops below 200 cells/mm3), so we say he is following the regime x = 200. He does not follow the other 30 regimes given he did not initiate treatment the first time his CD4 cell count dropped below 500, 490, 480, …, nor 210 cells/mm3. Because his CD4 cell count is above 200 cells/mm3, he could still initiate treatment the first time his CD4 cell count drops below 200 cells/mm3. However, if he does not initiate treatment when his CD4 cell count first drops below 200 cells/mm3, he is artificially censored at that time. As in the randomized clinical trial, we follow each individual until AIDS, death, artificial censoring, or the administrative end of follow-up.

Third, we compare the (appropriately adjusted) AIDS-free survival across regimes at a predefined time point (e.g., five years from time zero). Among these 31 regimes, the optimal regime is the one that results in the greatest proportion surviving without AIDS after five years.

In order to emulate randomized experiments using observational data, we consider “having data consistent with a regime in the observational study” analogous to “following a regime in a randomized experiment with perfect adherence”. There are, however, some key complications in the implementation of this approach. We now review them and propose some solutions.

Complication #1: Multiple regimes

The individual described above (who enters the study with a CD4 cell count of 205 cells/mm3 and does not initiate treatment at that time) is unusual in that he follows only one regime x. Most individuals have data that are consistent with more than one regime x. For example, Table 1 shows data for six hypothetical individuals and the regimes they follow when m = 0. All six individuals enter the study with CD4 cell counts of 352 cells/mm3 and are followed for four months. In the first month, their CD4 cell count increases to 380 cells/mm3. In the second month, their CD4 cell count drops to 273 cells/mm3. In the third month, there is no CD4 cell count measurement and thus the most recent value is carried forward. In the fourth month, their CD4 cell count drops to 198 cells/mm3. The only difference between the six individuals is the time at which they initiate treatment. Individual 1 initiates treatment the first time his CD4 cell count drops below 500 cells/mm3 when his CD4 cell count was 352 cells/mm3. Therefore, individual 1’s data are consistent with 15 regimes where x takes values from 360 to 500 in increments of 10 cells/mm3 given he initiated treatment the first time his CD4 cell count dropped below 500, 490, 480, …, and 360 cells/mm3. The other five individuals do not initiate treatment the first time their CD4 cell counts drop below 500 cells/mm3. Therefore, at time zero, their data are consistent with 16 regimes where x takes values from 200 to 350 in increments of 10 cells/mm3 given they could still initiate treatment the first time their CD4 cell counts drop below 350, 340, 330, …, and 200 cells/mm3.

Table 1:

Six hypothetical individuals who follow multiple regimes in the class “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” where x takes the values 200 to 500 in increments of 10 and m takes values 0 and 3.

| Individual | Time (months) | CD4 cell count (cells/mm3) | Treatment (1: yes, 0: no) | No. of regimes followed (range of x), m = 0 | No. of regimes followed (range of x), m = 3 |
|---|---|---|---|---|---|
| 1 | 0 | 352 | 1 | 15 (360–500) | 15 (360–500) |
| 1 | 1 | 380 | 1 | 15 (360–500) | 15 (360–500) |
| 1 | 2 | 273 | 1 | 15 (360–500) | 15 (360–500) |
| 1 | 3 | 273 | 1 | 15 (360–500) | 15 (360–500) |
| 1 | 4 | 198 | 1 | 15 (360–500) | 15 (360–500) |
| 2 | 0 | 352 | 0 | 16 (200–350) | 31 (200–500) |
| 2 | 1 | 380 | 1 | 0 | 15 (360–500) |
| 2 | 2 | 273 | 1 | 0 | 15 (360–500) |
| 2 | 3 | 273 | 1 | 0 | 15 (360–500) |
| 2 | 4 | 198 | 1 | 0 | 15 (360–500) |
| 3 | 0 | 352 | 0 | 16 (200–350) | 31 (200–500) |
| 3 | 1 | 380 | 0 | 16 (200–350) | 31 (200–500) |
| 3 | 2 | 273 | 1 | 8 (280–350) | 23 (280–500) |
| 3 | 3 | 273 | 1 | 8 (280–350) | 23 (280–500) |
| 3 | 4 | 198 | 1 | 8 (280–350) | 23 (280–500) |
| 4 | 0 | 352 | 0 | 16 (200–350) | 31 (200–500) |
| 4 | 1 | 380 | 0 | 16 (200–350) | 31 (200–500) |
| 4 | 2 | 273 | 0 | 8 (200–270) | 31 (200–500) |
| 4 | 3 | 273 | 1 | 0 | 23 (280–500) |
| 4 | 4 | 198 | 1 | 0 | 23 (280–500) |
| 5 | 0 | 352 | 0 | 16 (200–350) | 31 (200–500) |
| 5 | 1 | 380 | 0 | 16 (200–350) | 31 (200–500) |
| 5 | 2 | 273 | 0 | 8 (200–270) | 31 (200–500) |
| 5 | 3 | 273 | 0 | 8 (200–270) | 16 (200–350) |
| 5 | 4 | 198 | 1 | 8 (200–270) | 16 (200–350) |
| 6 | 0 | 352 | 0 | 16 (200–350) | 31 (200–500) |
| 6 | 1 | 380 | 0 | 16 (200–350) | 31 (200–500) |
| 6 | 2 | 273 | 0 | 8 (200–270) | 31 (200–500) |
| 6 | 3 | 273 | 0 | 8 (200–270) | 16 (200–350) |
| 6 | 4 | 198 | 0 | 0 | 16 (200–350) |

Thus it is necessary to either randomly allocate an individual to one of the multiple regimes he is following or use a more statistically efficient approach where individuals are allowed to follow more than one regime simultaneously. To allow individuals to follow more than one regime, we make a replicate of each individual for each regime he follows at some time during the follow-up. We add a replicate-specific variable X to the dataset. The variable X for each individual’s replicate is assigned a different value x. If and when a replicate with X = x deviates from the regime x, we artificially censor that replicate. Table S1 in the Supplemental Materials shows the expanded data for the same six hypothetical individuals from Table 1 for m = 0. Recall that at time zero individual 1 is following 15 regimes and the other five individuals are following 16 regimes. Therefore, 15 replicates are made of individual 1’s follow-up and 16 replicates are made of each of the other five individuals’ follow-ups. Individual 1 continues to follow the same 15 regimes for his entire follow-up and is never artificially censored. Individual 2 initiates treatment in the first month when his CD4 cell count is above his minimum CD4 cell count to date and, as a result, is censored from all 16 regimes. Individual 3 initiates treatment in the second month when his CD4 cell count is at a new minimum, 273 cells/mm3. The eight regimes where x takes values from 200 to 270 cells/mm3 are censored in the second month, but individual 3 continues to follow the other eight regimes for the remainder of his follow-up.

Individual 4 is censored from the eight regimes where x takes values from 280 to 350 cells/mm3 when his CD4 cell count drops in the second month. When he initiates treatment in the third month, he is censored from the remaining eight regimes because he initiates at either a repeated or carried forward CD4 cell count. Like individual 4, individual 5 is censored from eight regimes when his CD4 cell count drops in the second month. In the fourth month, he initiates treatment with a CD4 cell count that is a new minimum, 198 cells/mm3. He continues to follow the remaining eight regimes for the rest of his follow-up. Finally, individual 6 is identical to individual 5 except that he does not initiate treatment. Therefore, he is censored from eight regimes in the second month and from the remaining eight regimes in the fourth month.

The artificial censoring procedure to emulate this randomized experiment can be summarized as follows. Replicates can be censored for two different reasons. First, replicates with X = x are censored if and when the individual’s CD4 cell count first drops below x and the individual does not initiate treatment. Second, replicates with X = x are censored if and when the individual initiates treatment before his CD4 cell count drops below x. Once an individual initiates treatment at an appropriate CD4 cell count for a particular replicate, that replicate becomes ineligible to be censored at a later time point.
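The two censoring rules and the replicate expansion for m = 0 can be sketched as follows. This is an illustration under stated assumptions, not the code used in the analysis (which was run in SAS): the names `censor_month` and `expand_replicates` are hypothetical, CD4 counts are assumed to be already carried forward, and treatment indicators are assumed monotone.

```python
from typing import Dict, Optional, Sequence

REGIMES = range(200, 501, 10)  # the 31 thresholds x = 200, 210, ..., 500


def censor_month(cd4: Sequence[float], treated: Sequence[int], x: int) -> Optional[int]:
    """Month in which the replicate with X = x is artificially censored (m = 0).

    Returns None if the replicate is never censored during the observed follow-up.
    """
    cross = next((t for t, c in enumerate(cd4) if c < x), None)       # first month CD4 < x
    start = next((t for t, a in enumerate(treated) if a == 1), None)  # first month on treatment
    if start is not None and (cross is None or start < cross):
        return start  # second reason: initiated before CD4 dropped below x
    if cross is not None and start != cross:
        return cross  # first reason: CD4 dropped below x without initiation that month
    return None


def expand_replicates(cd4: Sequence[float], treated: Sequence[int]) -> Dict[int, int]:
    """Map each regime followed at time zero to its number of uncensored months."""
    replicates = {}
    for x in REGIMES:
        c = censor_month(cd4, treated, x)
        if c == 0:
            continue  # never followed: no replicate is created
        replicates[x] = len(cd4) if c is None else c
    return replicates


cd4 = [352, 380, 273, 273, 198]      # CD4 history shared by the individuals in Table 1
individual_3 = [0, 0, 1, 1, 1]       # initiates in the second month
print(expand_replicates(cd4, individual_3))
```

For individual 3 this yields 16 replicates: those with x from 200 to 270 contribute two months (months 0 and 1) before being censored, and those with x from 280 to 350 contribute all five months.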

The left portion of Table 2 shows the number of individuals who were eligible for 7 of the 31 dynamic regimes with m = 0 (x from 200 to 500 in increments of 50 cells/mm3), as well as the observed number of outcomes under each regime. When m = 0, individuals in the FHDH followed, on average, 22 regimes. Therefore, our expanded dataset had 92,460 replicates. Each replicate was “assigned” to a different regime and had a potentially different length of follow-up because of the differential artificial censoring by regime. Thus, before estimating the five-year AIDS-free survival that would have been observed if all individuals had followed each regime, we need to adjust for the bias this artificial censoring may create.

Table 2:

Numbers of individuals, numbers of outcomes, and estimated five-year AIDS-free survivals under the regimes “initiate treatment in the same month that the recorded CD4 cell count first drops below x cells/mm3” where x ranges from 200 to 500 in increments of 50 cells/mm3.

| Regime (x) | No. of individuals* | No. of outcomes* | 5-year AIDS-free survival | 95% CI |
|---|---|---|---|---|
| 500 | 288 | 10 | 0.95 | 0.91, 0.98 |
| 450 | 1,835 | 34 | 0.94 | 0.91, 0.97 |
| 400 | 2,891 | 54 | 0.93 | 0.90, 0.97 |
| 350 | 3,507 | 72 | 0.94 | 0.92, 0.96 |
| 300 | 3,764 | 89 | 0.94 | 0.92, 0.96 |
| 250 | 3,885 | 100 | 0.93 | 0.91, 0.95 |
| 200 | 3,949 | 114 | 0.91 | 0.89, 0.94 |

* Each individual’s data may be consistent with his following several regimes.

Complication #2: Selection bias due to censoring

In the absence of confounding by unmeasured factors, as formalized in the strengthened identifiability conditions of exchangeability, positivity, and consistency described by Robins and Hernán (2008), we can eliminate the bias introduced by the artificial censoring if we weight each replicate of an individual by the individual’s time-varying, unstabilized inverse probability weight

$$W_t = \prod_{k=0}^{t} \frac{1}{f(A_k \mid \bar{A}_{k-1}, D_k = 0, \bar{L}_k)}$$

where f(Ak|Āk–1, Dk = 0, L̄k) is by definition the conditional probability mass function fAk|Āk–1, Dk=0, L̄k (ak|āk–1, dk = 0, l̄k) with (ak|āk–1, dk = 0, l̄k) evaluated at the random argument (Ak|Āk–1, Dk = 0, L̄k), and A–1 = 0. Informally, the denominator of an individual’s inverse probability weight at time t is the probability of having his own observed treatment history conditional on not developing the outcome by time t and his observed values of covariate history L̄t and treatment history Āt–1. In our models for f(Ak|Āk–1, Dk = 0, L̄k), we summarized the dependence on L̄t by V and Lt, the most recently available values of CD4 cell count (restricted cubic spline with five knots) and HIV-1 RNA (<10,000, 10,000–100,000, ≥100,000 copies/mL) at time t, and months between time t and the most recent laboratory measurement (0, 1–2, 3–4, 5–6, ≥7).

As in previous analyses of observational HIV data (Cole et al., 2003; Hernán et al., 2000; Hernán et al., 2002; Sterne et al., 2005), we assumed that treatment was never stopped once initiated. Therefore, for each individual, the factors in the denominator of the weights Wt were set to 1 for times t subsequent to treatment initiation, and estimated from the data for all other times, i.e., times when Āt–1 = 0. The conditional probability of treatment initiation was estimated by fitting the pooled logistic regression model logit Pr(At = 1|Āt–1 = 0, Dt = 0, V, Lt) = β0t + β′1V + β′2Lt, where β0t is a month-specific intercept (restricted cubic splines with four knots), and β′1 and β′2 are the transposes of the column vectors of log hazard ratios for the components of the baseline covariates V and the time-varying covariates Lt, respectively. The logistic model was fit to, and the weights Wt were estimated in, the original, unexpanded study population. The time-varying weights estimated for an individual were then applied to all of his replicates when employing the approach described in the previous section. In the Appendix (Section 2) we show that, even when β′1 = β′2 = 0 (i.e., the probability of initiating treatment did not depend on past covariate history), it is still necessary to use these inverse probability weights to prevent bias when estimating the effect of dynamic regimes on survival.
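As a minimal sketch of how the unstabilized weights could be assembled for one individual, suppose the month-specific initiation probabilities have already been predicted from a fitted pooled logistic model. The function name `unstabilized_weights` and the input values are hypothetical; the only logic shown is the cumulative product and the rule that factors after initiation equal 1.

```python
import numpy as np


def unstabilized_weights(p_init, treated):
    """Time-varying weights W_t for one individual.

    p_init[t]  : predicted Pr(A_t = 1 | A_{t-1} = 0, D_t = 0, V, L_t) (assumed given)
    treated[t] : observed A_t, 0/1 and non-decreasing (treatment is never stopped)
    """
    p_init = np.asarray(p_init, dtype=float)
    treated = np.asarray(treated, dtype=int)
    prev = np.concatenate(([0], treated[:-1]))           # A_{t-1}, with A_{-1} = 0
    # Probability of the observed A_t: p if treatment was initiated, 1 - p otherwise
    f = np.where(treated == 1, p_init, 1.0 - p_init)
    f = np.where(prev == 1, 1.0, f)                      # factor is 1 after initiation
    return np.cumprod(1.0 / f)                           # W_t = prod_{k<=t} 1 / f_k


# Hypothetical predictions for an individual who initiates in month 2
w = unstabilized_weights([0.10, 0.20, 0.25, 0.30], [0, 0, 1, 1])
print(np.round(w, 3))
```

Because the weights are computed in the original, unexpanded data, the same vector would then be attached to every replicate of that individual.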

Complication #3: Unstable estimates

After inverse probability weighting and artificial censoring of the replicates, one still needs to compare the AIDS-free survival across the regimes. A simple solution would be to create one inverse-probability-weighted Kaplan-Meier curve per regime of interest, and identify the regime with the greatest five-year AIDS-free survival. However, this method would result in unstable survival estimates when, as in most real applications, few individuals in the study population follow any given regime for a long time. In practice, to estimate the five-year AIDS-free survival for any given regime, we need to use a model that uses a smooth function h(X) to combine information from many different regimes.

We fit (to the expanded data including all replicates) the inverse probability weighted pooled logistic model logit Pr(Dt+1 = 0|Dt = 0, Ct = 0, X, V) = θ0t + θ′1h(X) + θ′2V + θ3h(X)tm, where h(X) is a restricted cubic spline for (500 − X)/100 with four knots at 0.50, 1.17, 1.83, and 2.50, and h(X)tm is the product (“interaction”) of h(X) with follow-up time tm (≤6, 7–12, 13–24, >24 months). The inclusion of the product terms allows the log hazard ratio for x to vary over time, which is crucial since the assumption of a constant hazard ratio is substantively untenable for dynamic regimes.
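To illustrate what a smooth function h(X) of this kind looks like, the sketch below builds a restricted cubic spline basis for (500 − X)/100 with the four knots given above. It assumes the common truncated-power parameterization, which is linear beyond the outer knots; the parameterization used in the original SAS analysis is not stated, so the basis columns here may differ from it by a linear transformation. The names `rcs_basis` and `h` are hypothetical.

```python
import numpy as np

KNOTS = (0.50, 1.17, 1.83, 2.50)  # knots on the (500 - X)/100 scale, from the text


def rcs_basis(z, knots=KNOTS):
    """Restricted cubic spline basis: one linear column plus len(knots) - 2 cubic columns."""
    z = np.asarray(z, dtype=float)
    k = np.asarray(knots, dtype=float)

    def pos3(u):
        return np.clip(u, 0.0, None) ** 3  # truncated cubic (u)_+^3

    d = k[-1] - k[-2]
    cols = [z]
    for tj in k[:-2]:
        cols.append(
            pos3(z - tj)
            - pos3(z - k[-2]) * (k[-1] - tj) / d
            + pos3(z - k[-1]) * (k[-2] - tj) / d
        )
    return np.column_stack(cols)


def h(x):
    """Spline terms for regime threshold(s) x, using the rescaling (500 - x)/100."""
    return rcs_basis((500.0 - np.asarray(x, dtype=float)) / 100.0)


basis = h(np.arange(200, 501, 10))  # one row per regime
print(basis.shape)                  # (31, 3)
```

With four knots the basis has three columns, so the 31 regimes share three main-effect parameters (plus their products with follow-up time) instead of requiring one survival curve per regime.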

Under the assumptions of strengthened exchangeability, positivity, and consistency (Robins and Hernán, 2008), the parameters of this inverse probability weighted model γ consistently estimate the parameters ψ of the dynamic marginal structural pooled logistic model

$$\operatorname{logit} \Pr(D^{x}_{t+1} = 0 \mid D^{x}_{t} = 0, V) = \theta_{0t} + \theta_1' h(x) + \theta_2' V + \theta_3 h(x)\, t_m .$$

Here $D^{x}_{t}$ is the counterfactual indicator that an individual would have developed the outcome during time t under regime X = x.

We then used the predicted values of the inverse probability weighted model to estimate the survival probability at each time t under each of the 31 regimes x.
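One way to turn predicted values into survival probabilities is sketched below. It assumes that the fitted model has already produced, for each individual's baseline covariates V and a given regime x, the predicted conditional probabilities of remaining outcome-free in each month; the survival curve is the cumulative product of these probabilities, averaged over individuals. The averaging step (standardization over V) and the name `standardized_survival` are assumptions of this sketch, not details given in the text.

```python
import numpy as np


def standardized_survival(p_survive):
    """Survival curve under one regime x.

    p_survive[i, t] : predicted Pr(D_{t+1} = 0 | D_t = 0, X = x, V = v_i)
                      for individual i and month t (assumed given).
    Returns S[t], the estimated probability of remaining outcome-free
    through month t + 1, averaged over the individuals' baseline covariates.
    """
    p = np.asarray(p_survive, dtype=float)
    individual_curves = np.cumprod(p, axis=1)  # product of monthly conditional probabilities
    return individual_curves.mean(axis=0)      # average over individuals


# Hypothetical predictions for two individuals over two months
print(standardized_survival([[0.99, 0.98], [0.97, 0.96]]))
```

Evaluating the last element of such a curve at month 60 for each x would give the five-year AIDS-free survival by regime.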

Complication #4: Stabilization of the weights

The use of unstabilized inverse probability weights may result in highly unstable estimates, which makes this approach problematic. In practical implementations, stabilized weights are preferred. Unfortunately, as described below and in the Appendix (Section 3), the stabilization procedures commonly used for static regimes (Cole and Hernán, 2008) are not valid for dynamic regimes. We now describe one approach to stabilize the inverse probability weights Wt.

First note that, for a replicate with X = x who is uncensored through time t, the contribution f(At|Āt–1, Dt = 0, L̄t) to the denominator of Wt is equal to the probability Pr(Ct = 0|Ct–1 = 0, Dt = 0, X, L̄t, Āt–1) that the replicate remains uncensored through time t conditional on not developing the outcome by time t, covariate history, and treatment history. Table S1 in the Supplemental Materials shows the relation between the probability of treatment in the original dataset and the probability of remaining uncensored in the expanded dataset.

We argue in the Appendix (Section 3) that the numerator of any stabilized weight can depend on (X, V, Dt = 0) but cannot depend on Āt–1 or L̄t, which makes Pr(Ct = 0|Ct–1 = 0, Dt = 0, X, V) a natural choice for the numerator of the stabilized weights. Thus, we define a replicate’s time-varying stabilized inverse probability weight to be

$$SW_{t,x} = \prod_{k=0}^{t} \frac{\Pr(C_k = 0 \mid C_{k-1} = 0, D_k = 0, X = x, V)}{\Pr(C_k = 0 \mid C_{k-1} = 0, D_k = 0, X = x, \bar{L}_k, \bar{A}_{k-1})} .$$

Note the denominator of an individual’s stabilized weight SWt,x is equal to the denominator of his unstabilized weight Wt. Unfortunately, the stabilized weights SWt,x are not guaranteed to produce estimates that are less variable than those obtained using the unstabilized weights Wt. See the Appendix for more detail on the stabilized weights SWt,x. Optimal, locally semiparametric efficient weights have been derived (Orellana et al., 2010a, b) for certain dynamic marginal structural models, but their implementation is less straightforward.
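The stabilized weight is a running product of month-specific ratios. A minimal sketch for a single replicate follows, assuming the numerator and denominator probabilities of remaining uncensored have already been predicted from their respective models; the function name `stabilized_weights` and the input values are hypothetical.

```python
import numpy as np


def stabilized_weights(p_num, p_den):
    """Stabilized weights SW_{t,x} for one replicate.

    p_num[k] : Pr(C_k = 0 | C_{k-1} = 0, D_k = 0, X = x, V)          (numerator model)
    p_den[k] : Pr(C_k = 0 | C_{k-1} = 0, D_k = 0, X = x, L-bar_k, A-bar_{k-1})
               (denominator model; equals 1 in months when censoring is impossible)
    """
    p_num = np.asarray(p_num, dtype=float)
    p_den = np.asarray(p_den, dtype=float)
    if np.any(p_den <= 0):
        raise ValueError("denominator probabilities must be positive (positivity)")
    return np.cumprod(p_num / p_den)  # SW_{t,x} = prod_{k<=t} num_k / den_k


# Hypothetical probabilities for three months
print(stabilized_weights([0.9, 0.9, 0.8], [0.9, 0.6, 1.0]))
```

In this example the monthly ratios are 1, 1.5, and 0.8, so the weight rises and then falls; unlike the unstabilized weight, it is not forced to be non-decreasing over time.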

We now consider estimation of the numerator of the weights, the stabilizing factor. The probabilities in the numerator of SWt,x were estimated by fitting the pooled logistic regression model logit Pr(Ct = 0|Ct–1 = 0, Dt = 0, X, V) = α0t + α′1h(X) + α′2V. Note that we do not restrict this model to times when Āt–1 = 0 since the numerator cannot be a function of At for any time t as mentioned above. Under the assumption that the denominator model is correct, our estimates of our marginal structural model will be consistent even if the model for the numerator of the weights is misspecified.

We truncated the stabilized weights to protect against misspecification of the model for the denominator of the weights, and against the near violations of positivity expected in small samples. When we truncated at 10.00, the mean estimated stabilized weight was 0.99 (range 0.01 to 10.00). The right portion of Table 2 shows the five-year survival for 7 of the 31 dynamic regimes with m = 0 where x takes values from 200 to 500 by 50 cells/mm3. For example, the five-year survival was 0.95, 0.94, and 0.91 for the regimes x = 500, 350, and 200, respectively. We used a nonparametric bootstrap to calculate 95% confidence intervals for these survival estimates.
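Truncation itself is a one-line operation. The sketch below caps weights at an upper bound of 10, as in the text; the function name `truncate_weights` and the example values are hypothetical, and only upper truncation is shown.

```python
import numpy as np


def truncate_weights(sw, upper=10.0):
    """Replace any weight above `upper` by `upper`; smaller weights are unchanged."""
    return np.minimum(np.asarray(sw, dtype=float), upper)


sw = np.array([0.01, 0.8, 1.2, 37.5])  # hypothetical estimated stabilized weights
print(truncate_weights(sw))            # [ 0.01  0.8   1.2  10.  ]
```

Truncation trades a small amount of bias for reduced variance, so it is reasonable to report how the estimates change across several truncation points.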

4. Dynamic regimes with a grace period

Thus far, we have considered regimes of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” where m = 0 and x takes values from 200 to 500 in increments of 10. Under regimes with m = 0 individuals are forced to initiate treatment immediately after the threshold x is crossed. We now describe how to extend our approach to the comparison of dynamic regimes where m > 0. Under regimes with m > 0 individuals are given a grace period of m months after the threshold x is crossed before they are forced to initiate treatment. To illustrate the idea of dynamic regimes with a grace period m, we discuss regimes of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” where m = 3.

Recall that for m = 0, it was necessary to make a replicate of each individual for each regime he follows at some time during the follow-up and then artificially censor the replicate if and when the replicate deviates from a particular regime. The same applies when m > 0. However, the number of regimes followed and the timing of artificial censoring are affected by m. For example, consider the individuals in Table 1.

Individuals who initiate treatment the first time their CD4 cell count drops below 500 cells/mm3 will follow the same regimes for their entire follow-up regardless of the value of m. For instance, individual 1 initiates treatment the first time his CD4 cell count drops below 500 cells/mm3 when his CD4 cell count was 352 cells/mm3. Therefore, individual 1’s data are consistent with 15 regimes where x takes values from 360 to 500 in increments of 10 cells/mm3 given he initiated the first time his CD4 cell count dropped below 500, 490, 480,…, and 360 cells/mm3. Individual 1 will follow these same 15 regimes for his entire follow-up when m = 0 and when m > 0.

Individuals who do not initiate treatment the first time their CD4 cell count drops below 500 cells/mm3 follow different regimes over the course of their follow-up depending on the value of m. Individuals 2–6 do not initiate treatment the first time their CD4 cell counts drop below 500 cells/mm3. When m = 0, their data were consistent with the 16 regimes where x takes values from 200 to 350 in increments of 10 cells/mm3 at time zero. However, when m > 0, their data are also consistent with the 15 regimes where x takes values from 360 to 500 in increments of 10 cells/mm3 at time zero given that they could still initiate treatment within m months of their CD4 cell count first dropping below 500, 490, 480, …, and 360 cells/mm3. The last column of Table 1 shows which regimes are followed by the six hypothetical individuals when m = 3.

As before, we make a replicate of each individual for each regime he follows at some time during the follow-up. When m = 3, 15 replicates are made of individual 1’s follow-up and 31 replicates are made of each of the other five individuals’ follow-ups. Individual 1 continues to follow the same 15 regimes for his entire follow-up and is never artificially censored when m = 0 nor when m = 3. Individual 2 initiates treatment in the first month when his minimum CD4 cell count to date was 352 cells/mm3 and, as a result, is censored from the 16 regimes where x takes values from 200 to 350 cells/mm3. When m = 3, he continues to follow the other 15 regimes for the remainder of his follow-up. Individual 3 initiates treatment in the second month when his CD4 cell count is at a new minimum, 273 cells/mm3. The 8 regimes where x takes values from 200 to 270 cells/mm3 are censored in the second month, but individual 3 continues to follow the other 23 regimes for the remainder of his follow-up when m = 3. Individual 4 initiates treatment in the third month when his CD4 cell count remains 273 cells/mm3 having been carried forward from the second month. At that time, he is censored from the 8 regimes where x takes values from 200 to 270 cells/mm3, but continues to follow the other 23 regimes for the remainder of his follow-up when m = 3. Individual 5 does not initiate treatment until the fourth month. Therefore, when m = 3, he is censored from the 15 regimes where x takes values from 360 to 500 cells/mm3 in the third month, but continues to follow the other 16 regimes for the remainder of his follow-up. Like individual 5, when m = 3, individual 6 is censored from the 15 regimes where x takes values from 360 to 500 cells/mm3 in the third month, but continues to follow the other 16 regimes for the remainder of his follow-up. Table S2 in the Supplemental Materials shows the expanded data for these six hypothetical individuals from Table 1 for m = 3.
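The effect of the grace period on artificial censoring can be sketched with a censoring rule that takes m as a parameter. This is an illustration, not the authors' implementation: the names `censor_month` and `regimes_followed` are hypothetical, CD4 counts are assumed to be carried forward, and a replicate whose deadline falls beyond the observed months is treated as uncensored.

```python
from typing import List, Optional, Sequence

REGIMES = range(200, 501, 10)  # thresholds x = 200, 210, ..., 500


def censor_month(cd4: Sequence[float], treated: Sequence[int],
                 x: int, m: int) -> Optional[int]:
    """Month in which the replicate with X = x is censored under grace period m."""
    cross = next((t for t, c in enumerate(cd4) if c < x), None)       # first month CD4 < x
    start = next((t for t, a in enumerate(treated) if a == 1), None)  # first month on treatment
    if start is not None and (cross is None or start < cross):
        return start                     # initiated before CD4 fell below x
    if cross is not None:
        deadline = cross + m             # last month in which initiation is allowed
        missed = start is None or start > deadline
        if missed and deadline < len(cd4):
            return deadline              # grace period ended without initiation
    return None


def regimes_followed(cd4, treated, t: int, m: int) -> List[int]:
    """Regimes x whose replicate is still uncensored in month t."""
    followed = []
    for x in REGIMES:
        c = censor_month(cd4, treated, x, m)
        if c is None or t < c:
            followed.append(x)
    return followed


cd4 = [352, 380, 273, 273, 198]
individual_5 = [0, 0, 0, 0, 1]  # initiates in the fourth month
for t in range(5):
    f = regimes_followed(cd4, individual_5, t, m=3)
    print(t, len(f), (f[0], f[-1]))
```

For individual 5 this reports 31 regimes (200–500) in months 0 to 2 and 16 regimes (200–350) in months 3 and 4, in agreement with the m = 3 column of Table 1; setting m = 0 recovers the immediate-initiation rule.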

For m = 0, the factors in the denominator of the weights were estimated from the data for all times when Āt–1 = 0, and set to 1 for all times when Āt–1 = 1. For m > 0, there are additional times when the factor in the denominator of the weights SWt,x must be set to 1. Specifically, the month-specific factor in the denominator of the weight must be set to 1 for any month in which a replicate was not eligible to be censored. Note that when the value of x is greater than the value of the CD4 cell count at time zero, a replicate is ineligible for censoring prior to month m. For example, in the m = 3 column of Table 1, individuals 5 and 6 are censored from the 15 regimes where x takes values from 360 to 500 cells/mm3 in the third month. Additionally, when the CD4 cell count first drops below x, regimes where X = x cannot be censored for m months. For example, individual 6 cannot be censored from regimes where x takes values from 280 to 350 cells/mm3 in the second, third, or fourth month because his CD4 cell count first drops below 350, 340, 330, …, 280 cells/mm3 in the second month. These regimes become eligible for censoring again in the fifth month.
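The rule for when the denominator factor is fixed at 1 can be written as a small predicate. The sketch below is an illustration only: the function name and arguments are hypothetical, the threshold x = 300 is one arbitrary choice among the regimes from 280 to 350, and individual 6 is assumed to remain untreated through month 5.

```python
def denominator_factor_is_one(t, q_x, m, start):
    """True if the month-t factor in the denominator of the weight is fixed at 1
    for a replicate following regime x with grace period m.

    q_x   : month the recorded CD4 count first drops below x (None if never)
    start : month of treatment initiation (None if never)
    """
    # Treatment already started before month t: nothing left to model
    if start is not None and start < t:
        return True
    # During the grace period Q_x, ..., Q_x + m - 1 the replicate cannot be censored
    if q_x is not None and q_x <= t < q_x + m:
        return True
    return False


# Individual 6, regime x = 300, m = 3: CD4 first drops below x in month 2
flags = [denominator_factor_is_one(t, q_x=2, m=3, start=None) for t in range(6)]
print(flags)  # [False, False, True, True, True, False]
```

With m = 0 the grace-period condition is never met, so the predicate reduces to the rule that the factor is 1 only once treatment has already started.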

For m = 0, the factors in the numerator of the weights were estimated from the data for all times including when Āt–1 = 1 since the numerator cannot be a function of At for any time t as mentioned above. For m > 0, there are additional times when the factor in the numerator of the weights SWt,x must be set to 1. When the value of x is greater than the value of the CD4 cell count at time zero, a replicate is ineligible for censoring prior to month m as is the case in the denominator. However, when the CD4 cell count first drops below x, regimes where X = x are immediately eligible for censoring in the numerator since the numerator cannot be a function of time-varying covariates. Individuals follow more regimes when m > 0 than when m = 0. As m increases individuals follow more regimes for longer periods of time, which results in more precise survival estimates because more events are included in the analysis.

When m = 3, individuals in the FHDH followed, on average, 30 regimes. Therefore, our expanded dataset had 125,010 individuals. Table 3 shows the number of individuals that were eligible for the seven regimes where x takes values from 200 to 500 by 50 cells/mm3, as well as the observed number of outcomes and the estimated five-year AIDS-free survival under each regime when the stabilized weights were truncated at 10.00 (mean: 1.03, range 0.00 to 10.00). For example, the five-year AIDS-free survival was 0.92, 0.93, and 0.91 for the regimes x = 500, 350, and 200, respectively. In the Appendix (Section A4) we provide additional discussion of the clinical meaning of the regime “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3” with grace period m > 0.

Table 3: Numbers of individuals, numbers of outcomes, and estimated five-year AIDS-free survivals under the regimes “initiate treatment within 3 months after the recorded CD4 cell count first drops below x cells/mm3” where x ranges from 200 to 500 in increments of 50 cells/mm3.

| Regime (x, cells/mm3) | No. of individuals* | No. of outcomes* | 5-year AIDS-free survival | 95% CI |
|---|---|---|---|---|
| 500 | 4,237 | 58 | 0.92 | 0.86, 0.98 |
| 450 | 4,146 | 82 | 0.92 | 0.88, 0.96 |
| 400 | 4,072 | 94 | 0.93 | 0.89, 0.96 |
| 350 | 4,027 | 105 | 0.93 | 0.91, 0.95 |
| 300 | 3,986 | 119 | 0.93 | 0.91, 0.95 |
| 250 | 3,970 | 119 | 0.92 | 0.90, 0.94 |
| 200 | 3,949 | 125 | 0.91 | 0.88, 0.93 |

* Each individual’s data may be consistent with his following several regimes.

5. Discussion

We have described how to use dynamic marginal structural models to identify the optimal treatment regime in a set of regimes. Our results, although imprecise, suggest that the optimal CD4 threshold to initiate treatment was above 350 cells/mm3 whether we considered immediate or delayed (up to 3 months) treatment initiation after the threshold is crossed. We defined “optimal regime” in terms of time to AIDS or death, whichever happened first. Had we considered an outcome other than AIDS or death (e.g., death alone or serious non-AIDS defining illnesses), we might have found that a different regime is the optimal one.

A strength of our approach is that it eliminates the possibility of “lead time” bias (Cole et al., 2004) by design. However, the validity of our effect estimates relies on a number of assumptions.

First, we made the untestable assumption of no unmeasured confounding (or conditional exchangeability) given the measured covariates, i.e., the assumption that L̄t includes all joint predictors of treatment initiation and the outcome. This assumption may approximately hold here because L̄t included time-varying CD4 cell count and HIV-RNA, the most important clinical measures used by physicians as indications for treatment initiation. To further protect our estimates from unmeasured confounding, we analyzed our data under the intent-to-treat principle used in the analysis of randomized clinical trials. We defined the dynamic treatment regimes in terms of treatment initiation under whatever degree of subsequent adherence to treatment existed in our study population. This strategy makes it unnecessary to adjust for joint determinants of treatment discontinuation and the outcome, which are less well-measured in most observational studies, at the expense of potential bias due to misclassification of treatment status. Note, however, that adjusting for treatment discontinuation would not be appropriate, nor clinically interesting, if most individuals who discontinue treatment do so for toxicity-related reasons.

Second, we assumed a correct specification of the model for treatment initiation as a function of the measured confounders. If this model is misspecified, the weights could be extreme and lead to bias. To prevent this bias and to mitigate the effect of any near violations of positivity, we truncated the estimated weights at the value 10.00, which was at or above the 99th percentile of the distribution of the estimated weight. Other levels of truncation (e.g., 50.00) yielded virtually the same estimates. Finally, the validity of our method also depends on modeling assumptions involving h(x), the function of the time-fixed covariate regime. Our choice to consider x in increments of 10 cells/mm3 was not completely arbitrary. Increments of less than 10 cells/mm3 are not clinically relevant and may also be too dependent on the accuracy of measurement. Increments of greater than 10 cells/mm3 will limit the flexibility of h(x). Alternative parameterizations of h(x) (quadratic, cubic, and quartic polynomials) resulted in similar estimates.

The methods presented in this paper can be extended to more complex dynamic regimes. For instance, if we were interested in death alone as the outcome, we might consider regimes of the form “initiate cART within m months after the recorded CD4 cell count first drops below x cells/mm3 or an AIDS diagnosis, whichever occurs earlier”. In these dynamic regimes, the initiation of treatment depends on the evolution of two time-varying covariates, CD4 cell count and AIDS. We might also consider another randomized experiment where individuals enter the study with CD4 cell counts above 500 cells/mm3 and receive a fixed intervention until their CD4 cell count drops below 500 cells/mm3. The regimes in this experiment would be of the form “do not initiate treatment when CD4 cell count is above 500 cells/mm3 and initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm3”.

In summary, this paper described a method to emulate randomized experiments with dynamic regimes using observational data. To obtain more stable estimates, these analyses will need to be conducted using a larger dataset. Also, as more longitudinal observational data become available, more questions involving dynamic regimes can be answered. Though estimates obtained from observational data will always be suspect because of the possibility of residual confounding, the application of this method may help identify promising regimes to be compared in randomized clinical trials.

Appendix

A1. General theory

The main text considers regimes x with grace period m of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x”. Formally this means that, under regime x with grace period m, individuals are prevented from starting treatment before their CD4 cell count drops below x, and are then forced to initiate treatment exactly m months after the time that their CD4 cell count first drops below x, if they are still alive and have not initiated treatment during the m-month grace period on their own.

Let $T^x$ denote an individual’s counterfactual failure time under regime $x$. Let the counterfactual death indicator $D_t^x = 1$ if $T^x \le t$ and $D_t^x = 0$ otherwise for $t = 0, 1, 2, \ldots, K$, where $K$ is the known maximum possible follow-up time of any individual. Consider the dynamic marginal structural discrete hazard model $\Pr[D_{t+1}^x = 1 \mid D_t^x = 0, V = \upsilon] = \lambda(t, x, \upsilon; \beta^*)$, where $V$ is a subset of the baseline covariates $L(0)$, $\lambda(t, x, \upsilon; \beta)$ is a known function taking values between 0 and 1, and $\beta^*$ is the true value of the parameter $\beta$.
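Under a discrete hazard model of this kind, the counterfactual survival curve follows by multiplying the conditional probabilities of surviving each interval. The identity below is the standard discrete-time relation, written in the notation of this appendix and assuming that everyone is event-free at baseline (D under regime x equal to 0 at time 0); it is offered as a worked illustration rather than as a formula quoted from the paper.

```latex
\Pr[D_{t+1}^{x} = 0 \mid V = \upsilon]
  = \prod_{k=0}^{t} \Pr[D_{k+1}^{x} = 0 \mid D_{k}^{x} = 0, V = \upsilon]
  = \prod_{k=0}^{t} \{1 - \lambda(k, x, \upsilon; \beta^{*})\}.
```

If t indexes months, five-year survival under regime x corresponds to t + 1 = 60.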

Let $\bar{L}_t$, $\bar{A}_t$ be an individual’s covariate history and treatment history through $t$, respectively. As in the text, we assume $A_t = 1$ implies $A_{t+1} = 1$ since treatment once begun is never stopped. Let $Q_x$ denote the time at which an individual’s recorded CD4 cell count first drops below $x$. Define $C_{t,x} = 0$ if an individual’s observed data are consistent with having followed regime $x$ through time $t$ and $C_{t,x} = 1$ otherwise. Then, by definition,

$$C_{t,x} = 0 \text{ if and only if both } A_j = 0 \text{ for } j < \min(Q_x, t+1, T) \text{ and } A_{Q_x+m} = 1 \text{ whenever } Q_x + m \le \min(t, T).$$

Thus $C_{t,x}$ is a deterministic function of $\bar{L}_t$, $\bar{A}_t$, $t$, and $x$. Note, in particular, if $C_{t,x} = 0$ and either $D_{t+1} = 1$ or $A_t = 1$ then $C_{t',x} = 0$ for $t' > t$ since, by definition, the individual continues to follow regime $x$ after failure or if he started treatment while following the regime. We make the consistency assumption that

$$D_{t+1}^x = D_{t+1} \text{ if } C_{t,x} = 0,$$

the exchangeability assumption

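$$\{\bar{D}_K^x\} \perp\!\!\!\perp A_t \mid \bar{L}_t, \bar{A}_{t-1} = 0, D_t = 0,$$

where $\bar{D}_K^x = \{D_0^x, \ldots, D_K^x\}$, and the positivity assumption that

$$1 > \Pr[A_t = 1 \mid \bar{L}_t, \bar{A}_{t-1} = 0, D_t = 0] > 0 \text{ with probability 1.}$$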

The above exchangeability and positivity assumptions for treatment imply the following exchangeability and positivity assumptions for censoring:

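$$\{\bar{D}_K^x\} \perp\!\!\!\perp C_{t,x} \mid C_{t-1,x} = 0, \bar{L}_t, \bar{A}_{t-1}, D_t = 0, \text{ and}$$

$$\Pr[C_{t,x} = 0 \mid C_{t-1,x} = 0, \bar{L}_t, \bar{A}_{t-1}, D_t = 0] > 0 \text{ with probability 1}$$

because $C_{t-1,x}$ is a deterministic function of $\bar{L}_t$, $\bar{A}_{t-1}$, $D_t = 0$ and, conditional on $D_t = 0$ and a given realization of $\bar{L}_t$, $\bar{A}_{t-1}$, either (i) $C_{t,x} = A_t$, (ii) $C_{t,x} = 1 - A_t$, or (iii) $C_{t,x}$ has a degenerate distribution at 0.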

Define

$$W_{t,x} = 1 \Big/ \prod_{j=0}^{t} \Pr[C_{j,x} = 0 \mid C_{j-1,x} = 0, \bar{L}_j, \bar{A}_{j-1}, D_j = 0].$$

Given the consistency, exchangeability, and positivity assumptions for censoring, we have $E[I\{C_{t,x} = 0\} W_{t,x} \mid V, \bar{D}_K^x] = 1$. Arguing as in Orellana et al. (2010a, b), the set of all unbiased estimating functions (which is also the linear space spanned by all influence functions) for $\beta^*$ when $f(A_t \mid \bar{A}_{t-1}, \bar{L}_t, D_t = 0)$ is known is given by $P_n\{U(\beta, q) + M(h)\}$ as $q$ and $h$ are varied arbitrarily, where $P_n$ denotes a sample average over the $n$ study population members,

$$U(\beta, q) = \sum_x \sum_{t=0}^{K} \{D_{t+1} - \lambda(t, x, V; \beta)\}(1 - D_t)\, q_t(x, V)\, I\{C_{t,x} = 0\}\, W_{t,x},$$

$$M(h) = \sum_{t=0}^{K} \{A_t - \Pr(A_t = 1 \mid \bar{A}_{t-1}, \bar{L}_t, D_t = 0)\}\{1 - D_t\}\, I\{A_{t-1} = 0\}\, h_t(\bar{L}_t),$$

and $q = \{q_1, \ldots, q_K\}$, $h = \{h_1, \ldots, h_K\}$ are sets of arbitrary functions of $(x, V)$ and $\bar{L}_t$ respectively (Robins and Rotnitzky, 1992).

The estimators of $\beta^*$ in the main text solve $P_n\{U(\beta, q) + M(h)\} = 0$ with each $h_t(\bar{L}_t) \equiv 0$ and with the estimated weight function $\hat{W}_{t,x}$ replacing $W_{t,x}$, for two different choices of $q$ depending on whether unstabilized or stabilized weights (see below) are used. As noted in the text, when $D_t = 0$, $I\{C_{t,x} = 0\} W_{t,x}$ is equal to

$$I\{C_{t,x} = 0\} \left\{ f[A_{Q_x+m} \mid \bar{L}_{Q_x+m}, \bar{A}_{Q_x+m-1}, D_{Q_x+m} = 0]^{I(Q_x+m \le t,\, A_{Q_x+m-1} = 0)} \times \prod_{j=0}^{\min(t, Q_x-1)} f[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0] \right\}^{-1}$$

which can be written as $I\{C_{t,x} = 0\} \big/ \prod_{j=0}^{t} f[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0]$ when $m = 0$. When $m > 0$, the time-specific contribution to the weight $W_{t,x}$ is 1 at times $Q_x, Q_x + 1, \ldots, Q_x + m - 1$.

A2. Need for inverse probability weighting even when treatment probabilities are constant

Suppose that the grace period $m$ equals 0 and that $\Pr[A_t = 1 \mid \bar{L}_t, \bar{A}_{t-1} = 0, D_t = 0]$ is a constant $p_t$ that does not depend on $\bar{L}_t$, as might be the case in an experiment that randomly assigned a month of treatment initiation to each individual. Then one might hope that one could ignore the weight $W_{t,x}$ and use $P_n\{\sum_x \sum_{t=0}^{K} \{D_{t+1} - \lambda(t, x, V; \beta)\}(1 - D_t)\, q_t(x, V)\, I\{C_{t,x} = 0\}\}$ as an unbiased estimating equation. Since this latter estimating equation can be written as

$$P_n\left\{ \sum_x \sum_{t=0}^{K} \{D_{t+1} - \lambda(t, x, V; \beta)\}(1 - D_t)\, q_t(x, V)\, I\{C_{t,x} = 0\}\, (W_{t,x} / W_{t,x}) \right\},$$

it is clear that this approach gives an unbiased estimating equation only if $1/W_{t,x}$ is a function just of $(x, V)$. However, we now show that $W_{t,x}$ is in fact a function of $\bar{L}_K$ through the CD4 cell count history. To see this, note that the weight $W_{t,x}$ for an individual with $C_{t,x} = 0$ is given by $W_{t,x}^{-1} = \{p_{Q_x}\}^{I\{Q_x \le t\}} \prod_{j=0}^{\min(t, Q_x-1)} (1 - p_j)$, which is a function of an individual’s time-dependent CD4 cell count through $Q_x$. Note that $W_{t,x}$ is a function of $Q_x$ even if $p_j = 1/2$ for every $j$, because then $W_{t,x}^{-1} = (1/2)^{\min(t, Q_x) + 1}$.
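The dependence of the weight on the time the CD4 count first crosses the threshold can be checked numerically. The sketch below is an illustration under the assumptions of this subsection (m = 0, treatment probabilities that depend on the month only); the function name `weight_m0` and its arguments are hypothetical. It computes the weight as the inverse of the probability of the observed treatment history: remaining untreated in every month before the threshold is crossed, then starting in the month it is crossed.

```python
def weight_m0(t, q_x, p):
    """Unstabilized weight at month t for an individual still following
    regime x (C_{t,x} = 0) when the grace period m is 0.

    p   : function j -> probability of starting treatment in month j
    q_x : month the recorded CD4 count first drops below x
    """
    prob = 1.0
    # Months before Q_x (up to month t): the individual must remain untreated
    for j in range(min(t, q_x - 1) + 1):
        prob *= 1.0 - p(j)
    # Month Q_x, once reached: the individual must start treatment
    if q_x <= t:
        prob *= p(q_x)
    return 1.0 / prob


half = lambda j: 0.5  # constant 50% chance of starting in every month
print(weight_m0(5, 1, half), weight_m0(5, 3, half))  # 4.0 16.0
```

Two individuals with identical, constant treatment probabilities therefore receive different weights solely because their CD4 counts crossed the threshold in different months.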

A3. Restrictions on the stabilized weights

If we wish to substitute a stabilized weight $SW_{t,x} = \mathrm{Num}_{t,x} \times W_{t,x}$ for the unstabilized weight $W_{t,x}$, $\mathrm{Num}_{t,x}$ must be a function of $(x, V)$ only; otherwise $\sum_x \sum_{t=0}^{K} \{D_{t+1} - \lambda(t, x, V; \beta)\}(1 - D_t)\, q_t(x, V)\, I\{C_{t,x} = 0\}\, SW_{t,x}$ would not be in the aforementioned set of unbiased estimating functions. In particular, we cannot take $\mathrm{Num}_{t,x} = \prod_{j=0}^{t} f(A_j \mid \bar{A}_{j-1}, D_j = 0)$, as we typically do for a nondynamic (static) marginal structural model.

A4. Alternative, possibly more clinically relevant regimes

Following Robins and Rotnitzky (unpublished technical report) we now argue that one may wish to consider regimes consistent with “initiating treatment within m months after the recorded CD4 cell count first drops below x” other than the regimes x with grace period m in the main text. To see why, suppose that m = 6 and the observed probability of starting treatment in each of the months 0, 1, …, m after the CD4 cell count first drops below x is 1%. Then, under regime x with grace period m, roughly 94% of individuals would start treatment in the 6th month after their CD4 cell count fell below x. However, if one recommended to physicians that patients should “initiate treatment within m months after the recorded CD4 cell count first drops below x”, the distribution of start times might actually be closer to uniform over the months 0, 1, …, m. If that were so, we would like to find the x that optimizes survival under the set of regimes “initiate treatment within m months after the recorded CD4 cell count first drops below x, such that there is a uniform probability of starting in each of months 0, 1, …, m.” Below we describe how to estimate the optimal x for a wide variety of regime sets, each consistent with the requirement that an individual “initiate treatment within m months after the recorded CD4 cell count first drops below x.”
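The 94% figure can be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, deaths during the grace period are ignored, and the 1% monthly probability is applied to months 0 through m − 1 with initiation forced in month m, which is one reading of the example above.

```python
def start_time_distribution(hazards):
    """Distribution of the month of treatment initiation, counted from the month
    the CD4 count first drops below x, ignoring deaths during the grace period.

    hazards[j] is the probability of starting in month j given not yet started;
    the last entry should be 1 (forced initiation at the end of the grace period).
    """
    dist, still_untreated = [], 1.0
    for h in hazards:
        dist.append(still_untreated * h)
        still_untreated *= 1.0 - h
    return dist


m = 6
observed = [0.01] * m + [1.0]  # 1% in months 0..5, forced start in month 6
dist = start_time_distribution(observed)
print(round(dist[-1], 3))  # 0.941
```

Almost all of the probability mass sits in the final month of the grace period, which is what motivates considering regimes with a more even spread of start times.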

The larger point of this subsection is that counterfactual survival corresponding to the regime “initiate treatment within m months after the recorded CD4 cell count first falls below x” is vague and ill-defined because there are many different regimes (i.e., versions of treatment) consistent with this regime. The problem of vague counterfactuals due to many versions of treatment has been frequently discussed in the literature (Robins and Greenland, 2000; Hernán and Taubman, 2008; VanderWeele, 2009); what is interesting in our setting is that, as we next show, one can actually identify the effect on survival of many different versions of this regime, once the versions are formally defined. See Taubman et al. (2008) for related results and discussion on different versions of treatment regimes.

Consider the set of regimes defined as follows. For fixed $m$ and $x$, and a set of conditional probabilities $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$ of starting treatment conditional on $\bar{L}_{t+j}$, indexed by $x$, $t$, $j$, with $j = 0, \ldots, m$, satisfying $\Pr^*_{m,t,x}[A_{t+m} = 1 \mid \bar{L}_{t+m}, \bar{A}_{t+m-1} = 0, D_{t+m} = 0] = 1$, consider the (random) dynamic regime $x$ in which (i) individuals are prevented from starting treatment before their CD4 cell count drops below $x$, and (ii) individuals initiate treatment at time $t+j$ with probability $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$, provided they are alive and at risk to initiate treatment at $t + j$, if the time $Q_x$ their recorded CD4 cell count first drops below $x$ is equal to $t$. Under this regime, all individuals alive at $t + m$ who are yet to initiate treatment will initiate at $t + m$.

Note an individual’s counterfactual failure time $T^x$ under this regime $x$ is well defined (as a stochastic counterfactual) but different for each choice of $m$ and the initiation probabilities $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$. To estimate the optimal $x$ for a given $m$ and a given choice of the probabilities $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$, we can estimate the parameters of a dynamic marginal structural discrete hazard model

$$\Pr[D_{t+1}^x = 1 \mid D_t^x = 0, V = \upsilon] = \lambda(t, x, \upsilon; \beta^*)$$

for $T^x$ with the weights $W_{t,x}$ redefined to be

$$W_{t,x} = \left\{ \prod_{j=0}^{\min(t, Q_x - 1)} \Pr[A_j = 0 \mid \bar{L}_j, \bar{A}_{j-1} = 0, D_j = 0] \right\}^{-1} \times \prod_{j=0}^{m} \left\{ \frac{f^*_{j,Q_x,x}[A_{Q_x+j} \mid \bar{L}_{Q_x+j}, \bar{A}_{Q_x+j-1} = 0, D_{Q_x+j} = 0]}{f[A_{Q_x+j} \mid \bar{L}_{Q_x+j}, \bar{A}_{Q_x+j-1} = 0, D_{Q_x+j} = 0]} \right\}^{I(Q_x + j \le t,\, D_{Q_x+j} = 0,\, \bar{A}_{Q_x+j-1} = 0)}$$

where

$$f^*_{j,t,x}[a \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0] = \Pr{}^*_{j,t,x}[A_{t+j} = a \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$$

for $a = 0, 1$. If we choose $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$ to be the observed probability $\Pr[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$, then the distribution of $T^x$ is the same as it is under the regime $x$ with grace period $m$ considered in the main text and the redefined $W_{t,x}$ equals the earlier $W_{t,x}$. On the other hand, if we choose $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0] = 1/(m + 1 - j)$, then our regime has a roughly uniform distribution of starting times with the probability of starting in each of months $0, 1, \ldots, m$ after the CD4 cell count first drops below $x$ approximately $1/(m + 1)$. The distribution would be precisely uniform in the limit as the number of deaths in any $m + 1$ month interval goes to zero. Note that although the choice $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0] = 1/(m + 1 - j)$ produces a regime that satisfies the condition “initiate treatment within m months after the recorded CD4 cell count first drops below x, such that there is a uniform probability of starting in each of months 0, 1, …, m”, it is not the only one. Specifically, there will exist choices of $\Pr^*_{j,t,x}[A_{t+j} = 1 \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$ that allow dependence on $\bar{L}_{t+j}$ but still satisfy the requirement that the marginal probabilities of starting in each of months $0, 1, \ldots, m$ are uniform. That is, the regime “initiate treatment within m months after the recorded CD4 cell count first drops below x, such that there is a uniform probability of starting in each of months 0, 1, …, m” still contains many versions of the treatment regime.
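To see why the conditional probabilities 1/(m + 1 − j) give a uniform distribution of start times, one can multiply out the probability of remaining untreated through month j − 1 and then starting in month j. The derivation below is a worked check that ignores deaths during the grace period (the limiting case mentioned above); the shorthand S for the month of initiation, counted from the month the CD4 count first drops below x, is introduced here for illustration only.

```latex
\Pr[S = j]
  = \frac{1}{m + 1 - j} \prod_{i=0}^{j-1} \left( 1 - \frac{1}{m + 1 - i} \right)
  = \frac{1}{m + 1 - j} \prod_{i=0}^{j-1} \frac{m - i}{m + 1 - i}
  = \frac{1}{m + 1 - j} \cdot \frac{m + 1 - j}{m + 1}
  = \frac{1}{m + 1},
  \qquad j = 0, 1, \ldots, m.
```

The product telescopes because each numerator m − i cancels the next denominator, leaving only (m + 1 − j)/(m + 1).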

We now sketch the proof for the consistent estimation of the parameters of the dynamic marginal structural model for Tx using the redefined Wt,x.

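Proof sketch: Under the above exchangeability, positivity, and consistency assumptions, it follows from Robins (1987) that the joint distribution of $D_t^x$, $t = 0, \ldots, k$ given $V$ under the regime defined by $x$, $m$, $f^*_{j,t,x}[a \mid \bar{L}_{t+j}, \bar{A}_{t+j-1} = 0, D_{t+j} = 0]$, $j = 0, \ldots, m$, is given by the marginal distribution of $D_t$, $t = 0, \ldots, k$ and $V$ under the (so-called g-formula) joint distribution

$$\prod_{j=0}^{\min(k, \mathrm{int}(T))} f(D_{j+1}, L_{j+1} \mid \bar{D}_j = 0, \bar{L}_j, \bar{A}_j)\, \tilde{f}[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0],$$

where $f(D_{j+1}, L_{j+1} \mid \bar{D}_j = 0, \bar{L}_j, \bar{A}_j)$ is the conditional density based on the distribution of the observed data, and $\tilde{f}[A_t \mid \bar{L}_t, \bar{A}_{t-1}, D_t = 0]$ is (i) 1 if $\bar{A}_{t-1} = 1$, (ii) 0 if $t < Q_x$, (iii) $f^*_{j,t-j,x}[A_t \mid \bar{L}_t, \bar{A}_{t-1} = 0, D_t = 0]$ if $t = Q_x + j$, $j = 0, \ldots, m - 1$, $A_{t-1} = 0$, $D_t = 0$, (iv) 1 if $t = Q_x + m$, $A_{t-1} = 0$. The likelihood under the observed data generating mechanism for data up to $k$ is

$$\prod_{j=0}^{\min(k, \mathrm{int}(T))} f(D_{j+1}, L_{j+1} \mid \bar{D}_j = 0, \bar{L}_j, \bar{A}_j)\, f[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0].$$

The likelihood ratio is

$$\prod_{j=0}^{\min(k, \mathrm{int}(T))} \frac{\tilde{f}[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0]}{f[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0]}$$

which is precisely the redefined $W_{k,x}$ for an individual with $D_k = 0$. Because

$$\sum_x \sum_{k=0}^{K} \{D_{k+1} - \lambda(k, x, V; \beta)\}(1 - D_k)\, q_k(x, V)$$

is an unbiased estimating function for $\beta^*$ under the law

$$\prod_{j=0}^{\min(k, \mathrm{int}(T))} f(D_{j+1}, L_{j+1} \mid \bar{D}_j = 0, \bar{L}_j, \bar{A}_j)\, \tilde{f}[A_j \mid \bar{L}_j, \bar{A}_{j-1}, D_j = 0],$$

the result now follows from the Radon-Nikodym theorem.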

Footnotes

* This research was supported by NIH grant R01-AI073127. We thank Dr. Andrea Rotnitzky for her helpful comments.

References

  1. Ancelle-Park R, Klein JP, Stroobant A, et al. Expanded European AIDS case definition. The Lancet. 1993;341:441. doi: 10.1016/0140-6736(93)93040-8.
  2. Cameron DW, Heath-Chiozzi M, Danner S, et al. Randomised placebo-controlled trial of ritonavir in advanced HIV-1 disease. The Advanced HIV Disease Ritonavir Study Group. The Lancet. 1998;351:543–549. doi: 10.1016/S0140-6736(97)04161-5.
  3. CDC. 1993 Revised classification system for HIV infection and expanded surveillance case definition for AIDS among adolescents and adults. Morbidity and Mortality Weekly Report. 1992;41:1–19.
  4. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology. 2008;168:656–664. doi: 10.1093/aje/kwn164.
  5. Cole SR, Hernán MA, Robins JM, et al. Effect of highly active antiretroviral therapy on time to acquired immunodeficiency syndrome or death using marginal structural models. American Journal of Epidemiology. 2003;158:687–694. doi: 10.1093/aje/kwg206.
  6. Cole SR, Li R, Anastos K, et al. Accounting for leadtime in cohort studies: evaluating when to initiate HIV therapies. Statistics in Medicine. 2004;23:3351–3363. doi: 10.1002/sim.1579.
  7. Hammer SM, Squires KE, Hughes MD, et al. A controlled trial of two nucleoside analogues plus indinavir in persons with human immunodeficiency virus infection and CD4 cell counts of 200 per cubic millimeter or less. AIDS Clinical Trials Group 320 Study Team. The New England Journal of Medicine. 1997;337:725–733. doi: 10.1056/NEJM199709113371101.
  8. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology. 2000;11:561–570. doi: 10.1097/00001648-200009000-00012.
  9. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the joint causal effect of non-randomized treatments. Journal of the American Statistical Association. 2001;96:440–448. doi: 10.1198/016214501753168154.
  10. Hernán MA, Brumback BA, Robins JM. Estimating the causal effect of zidovudine on CD4 count with a marginal structural model for repeated measures. Statistics in Medicine. 2002;21:1689–1709. doi: 10.1002/sim.1144.
  11. Hernán MA, Lanoy E, Costagliola D, Robins JM. Comparison of dynamic treatment regimes via inverse probability weighting. Basic & Clinical Pharmacology & Toxicology. 2006;98:237–242. doi: 10.1111/j.1742-7843.2006.pto_329.x.
  12. Hernán MA, Taubman SL. Does obesity shorten life? The importance of well defined interventions to answer causal questions. International Journal of Obesity. 2008;32:S8–S14. doi: 10.1038/ijo.2008.82.
  13. HIV-CAUSAL Collaboration. The effect of combined antiretroviral therapy on the overall mortality of HIV-infected individuals. AIDS. 2010;24:123–137. doi: 10.1097/QAD.0b013e3283324283.
  14. Orellana L, Rotnitzky A, Robins JM. Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, Part I: Main Content. The International Journal of Biostatistics. 2010a;6: Article 7.
  15. Orellana L, Rotnitzky A, Robins JM. Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, Part II: Proofs and Additional Results. The International Journal of Biostatistics. 2010b;6: Article 8.
  16. Panel on Antiretroviral Guidelines for Adults and Adolescents. Guidelines for the use of antiretroviral agents in HIV-1 infected adults and adolescents. Department of Health and Human Services; 2008.
  17. Piketty C, Selinger-Leneman H, Grabar S, et al. Marked increase in the incidence of invasive anal cancer among HIV-infected patients despite treatment with combination antiretroviral therapy. AIDS. 2008;22:1203–1211. doi: 10.1097/QAD.0b013e3283023f78.
  18. Recommandations du groupe d’experts sous la direction du Pr P Yeni. Prise en charge des personnes infectées par le VIH. Flammarion Médecines-Sciences; 2008.
  19. Robins JM. Addendum to “A new approach to causal inference in mortality studies with sustained exposure periods - Application to control of the healthy worker survivor effect.” Computers & Mathematics with Applications. 1987;14:923–945. (errata in Computers & Mathematics with Applications 18, 477). doi: 10.1016/0898-1221(87)90238-0.
  20. Robins JM, Rotnitzky A. Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell NP, Dietz K, Farewell VT, editors. AIDS Epidemiology: Methodological Issues. Boston, MA: Birkhäuser; 1992. pp. 297–331. (includes errata sheet).
  21. Robins JM, Greenland S. Comment on “Causal Inference Without Counterfactuals” by A.P. Dawid. Journal of the American Statistical Association -- Theory and Methods. 2000;95:477–482. doi: 10.2307/2669391.
  22. Robins J, Hernán M. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Longitudinal Data Analysis. New York: Chapman and Hall/CRC Press; 2008. pp. 553–599.
  23. Robins J, Orellana L, Rotnitzky A. Estimation and extrapolation of optimal treatment and testing strategies. Statistics in Medicine. 2008;27:4678–4721. doi: 10.1002/sim.3301.
  24. Sterne JA, Hernán MA, Ledergerber B, et al. Long-term effectiveness of potent antiretroviral therapy in preventing AIDS and death: a prospective cohort study. The Lancet. 2005;366:378–384. doi: 10.1016/S0140-6736(05)67022-5.
  25. Taubman SL, Robins JM, Mittleman MA, Hernán MA. Alternative approaches to estimating the effects of hypothetical interventions. JSM Proceedings, Health Policy Statistics Section. Alexandria, VA: American Statistical Association; 2008.
  26. van der Laan MJ, Petersen ML. Causal effect models for realistic individualized treatment and intention to treat rules. The International Journal of Biostatistics. 2007;3: Article 3. doi: 10.2202/1557-4679.1022.
  27. VanderWeele TJ. Concerning the consistency assumption in causal inference. Epidemiology. 2009;20:880–883. doi: 10.1097/EDE.0b013e3181bd5638.

Supplementary Materials

cain_methods_supplement.pdf


Articles from The International Journal of Biostatistics are provided here courtesy of Berkeley Electronic Press
