Author manuscript; available in PMC: 2016 Apr 29.
Published in final edited form as: Ann Appl Stat. 2011 Mar 21;5(1):400–426. doi: 10.1214/10-AOAS391

HIV DYNAMICS AND NATURAL HISTORY STUDIES: JOINT MODELING WITH DOUBLY INTERVAL-CENSORED EVENT TIME AND INFREQUENT LONGITUDINAL DATA

Li Su 1, Joseph W Hogan 2
PMCID: PMC4851349  NIHMSID: NIHMS756914  PMID: 27134691

Abstract

Hepatitis C virus (HCV) coinfection has become one of the most challenging clinical situations to manage in HIV-infected patients. Recently the effect of HCV coinfection on HIV dynamics following initiation of highly active antiretroviral therapy (HAART) has drawn considerable attention. Post-HAART HIV dynamics are commonly studied in short-term clinical trials with frequent data collection design. For example, the elimination process of plasma virus during treatment is closely monitored with daily assessments in viral dynamics studies of AIDS clinical trials. In this article instead we use infrequent cohort data from long-term natural history studies and develop a model for characterizing post-HAART HIV dynamics and their associations with HCV coinfection. Specifically, we propose a joint model for doubly interval-censored data for the time between HAART initiation and viral suppression, and the longitudinal CD4 count measurements relative to the viral suppression. Inference is accomplished using a fully Bayesian approach. Doubly interval-censored data are modeled semiparametrically by Dirichlet process priors and Bayesian penalized splines are used for modeling population-level and individual-level mean CD4 count profiles. We use the proposed methods and data from the HIV Epidemiology Research Study (HERS) to investigate the effect of HCV coinfection on the response to HAART.

Key words and phrases: AIDS, antiviral treatment, interval censoring, semiparametric regression

1. Introduction

1.1. HIV dynamics following initiation of antiviral therapy

The widespread use of highly active antiretroviral therapies (HAART) against HIV in the United States has reduced the burden of HIV-related morbidity and mortality [Jacobson, Phair and Yamashita (2004)]. HIV dynamics following HAART are usually studied in short-term clinical trials with frequent data collection. For example, in viral dynamics studies of AIDS clinical trials the elimination process of plasma virus after treatment is closely monitored with daily measurements, which has led to a new understanding of the pathogenesis of HIV infection and provides guidance for treating AIDS patients and evaluating antiviral therapies [Wu (2005)]. In this article, HIV dynamics refers to a two-part response to HAART: viral suppression and concurrent or subsequent immune reconstitution. In clinical practice, the virus is considered suppressed when plasma HIV RNA (viral load) is below a lower limit of detection; the degree of immune reconstitution is commonly measured by the change in CD4+ lymphocyte cell count (CD4 count) after HAART initiation.

It is well known that CD4+ lymphocyte cells are targets of HIV and their abundance declines after HIV infection. Investigators have studied the association between viral load and CD4 count during HAART treatment and, in general, they are negatively correlated [Lederman et al. (1998); Liang, Wu and Carroll (2003)]. Longitudinal data on these markers have been analyzed separately, particularly by using random-effects models. Recently, bivariate linear mixed models were proposed to jointly model viral load and CD4 count by incorporating correlated random effects. These models were specified in terms of concurrent association between viral load and CD4 count [Thiébaut et al. (2005); Pantazis et al. (2005)]. However, a natural time ordering for virologic and immunologic response to HAART (or any antiviral therapy) is often observed: when a patient begins a successful HAART regimen, viral replication is usually inhibited first, leading to a decrease in viral load; then, CD4 count often increases as the immune system begins to recover. Consequently, increase in CD4 count is thought to depend on the degree of viral suppression; it may be slower to respond than viral load or it may not increase at all if the virus is not suppressed [Jacobson, Phair and Yamashita (2004)]. Therefore, it would be advantageous to acknowledge these common sequential changes of viral load and CD4 count when modeling post-HAART HIV dynamics.

1.2. Coinfection with Hepatitis C virus and HIV dynamics

Hepatitis C virus (HCV) coinfection is estimated to occur in 30% of HIV-infected patients in the United States and has become one of the most challenging clinical situations to manage in HIV-infected patients [Sherman et al. (2002)]. Several studies have suggested that HCV serostatus is not associated with the virologic response to HAART [Greub et al. (2000); Rockstroh et al. (2005)]. However, the evidence for immunologic response is conflicting. Some studies have shown that HIV–HCV coinfected patients have a blunted immunologic response to HAART, compared to those with HIV infection alone, although others have found comparable degrees of immune reconstitution in persons with HIV–HCV coinfection [Miller et al. (2005); Stebbing et al. (2005); Rockstroh (2006); Sullivan et al. (2006)]. A primary motivation of our model is to investigate the effect of HCV coinfection on post-HAART HIV dynamics using cohort data from natural history studies. We focus on two important questions: (1) Do HCV-negative patients have shorter time from HAART initiation to viral suppression? (2) Do HCV-negative patients have better immune reconstitution at the time of viral suppression? Note that in the second question the sequential nature of the virologic and immunologic response to HAART is emphasized.

1.3. HIV natural history studies and the HERS

Because the incidence of clinical progression to AIDS fell rapidly following the widespread introduction of HAART in 1997, long-term clinical trials in patients with HIV become time-consuming and expensive [Mocroft et al. (2006)]. Currently, natural history studies are the major source of knowledge about the HIV epidemic and the full treatment effect of HAART over the long term. For example, studies such as Multicenter AIDS Cohort Study (MACS), Women’s Interagency HIV Study (WIHS) and Swiss HIV Cohort Study (SHCS) have played important roles in understanding the science of HIV, the AIDS epidemic and the effects of therapy [Kaslow et al. (1987); Ledergerber et al. (1994); Barkan et al. (1998)]. In HIV natural history studies, HIV viral load and CD4 count are usually measured with wide intervals (e.g., every 6 months approximately). Therefore, for some event time of scientific interest, for example, the time between HAART initiation and viral suppression, both the time origin (HAART initiation) and the failure event (viral suppression) could be interval-censored. This situation is referred to as ‘doubly interval-censored data’ in the literature. In fact, the statistical research on doubly interval-censored data was primarily motivated by scientific questions in HIV research, for example, modeling ‘AIDS incubation time’ between HIV infection and the onset of AIDS [De Gruttola and Lagakos (1989); Sun (2006)]. Both nonparametric and semiparametric methods have been proposed for the estimation of the distribution function of the AIDS incubation time and its regression analysis. A comprehensive review on the analysis of doubly interval-censored data can be found in Sun (2006).

The HIV Epidemiology Research Study (HERS) is a multi-site longitudinal cohort study of HIV natural history in women between 1993 and 2001 [Smith et al. (1997)]. At baseline between 1993 and 1995 the study enrolled 871 HIV-seropositive women and 439 HIV-seronegative women at high risk for HIV infection. Participants were scheduled for approximately a 6-year follow-up, where a variety of clinical, behavioral and sociologic outcomes were recorded approximately every 6 months and measurements correspond to dates. The top part of Table 1 gives selected baseline characteristics of the 1310 study participants; more details can be found in Smith et al. (1997). Quantification of HIV RNA viral load was performed using a branched-DNA (B-DNA) signal amplification assay with the detection limit at 50 copies/ml and flow cytometry from whole blood was used to determine CD4 counts at each visit. All participants were HAART-naive at baseline. During the study 382 participants reported HAART use based on information gathered during in-person interviews. Because assessments were scheduled to be carried out every 6 months and participants were only asked about whether they were on HAART during the last 6 months, exact dates for HAART initiation are not available. The analysis in Section 4 includes 374 women with HAART use who had HIV sero-conversion before baseline and baseline HCV coinfection information. Some characteristics of these 374 women are presented at the bottom of Table 1.

TABLE 1.

Selected characteristics of the 1310 HERS women (top) and the 374 HERS women included in the analysis in Section 4 (bottom)

| Characteristic | HIV-positive (N = 871) | HIV-negative (N = 439) |
|---|---|---|
| Median age at enrollment | 35.0 | 34.5 |
| Age range at enrollment | 16.4–55.2 | 16.6–56.0 |
| Injection drug user at enrollment (%) | 25.1 | 26.4 |
| CD4 count at enrollment (%) | | |
|   <200 | 17.1 | 0.0 |
|   200–499 | 50.7 | 1.7 |
|   ≥500 | 32.2 | 98.3 |
| HCV antibody test at enrollment (%) | | |
|   Positive | 60.3 | 47.8 |
|   Negative | 38.8 | 50.8 |
|   Missing | 0.9 | 1.4 |

| Characteristic | HCV-positive (N = 208) | HCV-negative (N = 166) |
|---|---|---|
| Median follow-up time (months) | 67.3 | 71.0 |
| Median age at enrollment | 36.7 | 33.1 |
| Age range at enrollment | 21.2–55.0 | 19.0–55.2 |
| Injection drug user at enrollment (%) | 29.8 | 2.4 |
| Ever on antiviral treatment before 1996 (%) | 57.2 | 62.1 |
| CD4 count before first reported HAART use (%) | | |
|   <200 | 34.6 | 36.8 |
|   200–499 | 52.9 | 45.8 |
|   ≥500 | 12.5 | 17.5 |

Figure 1 shows smoothing spline fits and the corresponding derivative (change rate) curves for average CD4 count and the prevalence of detectable viral load for the 374 HERS women, where the measurement times are centered such that time 0 represents the earliest visit with HAART information reported. The left panels indicate that the increasing trend for average CD4 count started later than the decreasing trend for viral load prevalence, but this phenomenon is probably not related to HAART because the starting times for these trends are 1–2 years before the reported HAART initiation time. It might be more useful to examine the change rates for average CD4 count and viral load prevalence to assess the effectiveness of HAART. In fact, the right panels of Figure 1 show that the maximum decreasing rate for viral load prevalence occurred earlier (around 4 months before reported HAART initiation) than the maximum increasing rate for average CD4 count (around the reported HAART initiation), which suggests the possible sequential relationship in post-HAART HIV dynamics discussed in Section 1.1.

Fig. 1.

Top panels: smoothing spline fit and the corresponding derivative (change rate) curve for average CD4 count since reported HAART initiation in the HERS cohort; bottom panels: smoothing spline fit and the corresponding derivative (change rate) curve for the prevalence of detectable viral load (≥50 copies/ml) since reported HAART initiation in the HERS cohort; solid lines: smoothing spline fits; dashed line: derivative curves of the smoothing spline fits; black dots: maximum of the increasing rate for average CD4 count and maximum of the decreasing rate of viral load prevalence.

1.4. Modeling post-HAART HIV dynamics in the HERS

Our objective is to develop a model for the joint distribution of the time from HAART initiation to viral suppression, and the longitudinal CD4 counts relative to the viral suppression time following HAART. As discussed in Section 1.3, the time from HAART initiation to viral suppression is doubly interval-censored. Specifically, considering the reporting bias for HAART initiation, we define the right endpoint of its corresponding censoring interval to be the first visit of reported HAART use and the definition for the left endpoint is based on assumptions about the earliest possible time of HAART initiation in the HERS cohort. Further, viral suppression following HAART can be either interval-censored or right-censored. Details can be found in Section 4.

Figure 2 shows CD4 counts and corresponding censoring intervals of HAART initiation and viral suppression following HAART for selected HERS women. As seen in the top left panel of Figure 2, viral suppression after HAART can be right-censored due to participant dropout, death and/or study end. Similarly, participants could have incomplete CD4 count measurements for 12 scheduled follow-up visits. However, because we focus on the subpopulation of HAART users in the HERS cohort, the missingness rate is relatively low compared to the whole HERS population; 90.64% of the 374 women in our analysis had at least 10 visits. Therefore, for the HERS analysis in Section 4, we assume that the data are missing at random [Little and Rubin (2002)]. Given that the parameters for modeling the missing data mechanism and the outcomes are distinct and they have independent priors, the missing data are then ignorable when making posterior inference about the outcomes.

Fig. 2.

CD4 counts (on square root scale) and censoring intervals for 9 selected HERS women; dotted line: censoring intervals for HAART initiation; solid line: censoring intervals for viral suppression following HAART; circles represent the data from HCV-positive group and triangles represent the data from HCV-negative group.

The remainder of the article is organized as follows. In Section 2 we specify the joint model for doubly interval-censored event time and longitudinal CD4 count data. Section 3 describes the posterior inference and gives full conditional distributions for Gibbs steps. We use the model to analyze the HERS data for investigating the HCV coinfection problem introduced in Section 1.2, and present the results in Section 4. The conclusion and some discussion are given in Section 5.

2. A model for post-HAART HIV dynamics

2.1. Model under an idealized situation

Our goal is to model the joint distribution of the time from HAART initiation to viral suppression and the longitudinal CD4 counts. Figure 3 is a schematic illustration of the variables of interest under an idealized situation. Let t (t ≥ 0) denote the time since enrollment and let H and V represent the time from enrollment to HAART initiation and the time from enrollment to viral suppression after HAART, respectively. By definition, V > H and W = V − H is the time from HAART initiation to viral suppression. Further, Y (t1), Y (t2), …, Y (tn) are CD4 count measurements taken at time points t1 < ⋯ < tn. Throughout this article, the time points t1 < ⋯ < tn are assumed to be noninformative and fixed by study design. Let X denote covariates, for example, the baseline HCV serostatus. The joint density of W and Y (t1), Y (t2), …, Y (tn) given X, H and t1, …, tn can be written as

$$
p(w, y_1, y_2, \ldots, y_n \mid X, h, t_1, \ldots, t_n) = p(w \mid X, h)\, p\{y_1, y_2, \ldots, y_n \mid X, t_1 - (h + w), \ldots, t_n - (h + w)\}. \tag{2.1}
$$

We condition on H because we are not interested in its marginal distribution; the observed H = h serves only as the time origin for W.

Fig. 3.

A scheme of the variables of interest under an idealized situation for post-HAART HIV dynamics: 0 represents enrollment, t indexes the time since enrollment, H is HAART initiation time, V is viral suppression time following HAART, W is the time from HAART initiation to viral suppression, and Y (t1), Y (t2), …, Y (tn) are CD4 count measurements with their expectations represented by the curve.

The factorization in (2.1) is based on the sequential relationship in post-HAART dynamics. When HAART regimen is successful in suppressing the virus, we are able to obtain W, the time from HAART initiation to viral suppression. As mentioned in Section 1.1, there is a time ordering of virologic response and immunologic response to HAART. Because of this sequential relationship of virologic and immunologic response as well as the large between-individual heterogeneity in terms of the ability to suppress viral replication, the time to suppression and the durability of suppression, we believe that the mean CD4 count profiles from different individuals are more comparable after realigning measurement times by their individual viral suppression times following HAART. Therefore, we assume that the distribution of Y (t1), Y (t2), …, Y (tn) given X depends on H and W only through a change in the time origin for the measurement times t1, …, tn. This is similar to curve registration, a method originated in the functional data analysis literature [Ramsay and Li (1998)] for dealing with the situations where the rigid metric of physical time for real life systems is not directly relevant to internal dynamics. For example, the timing variation of salient features of individual puberty growth curves (e.g., time of puberty growth onset, time of peak velocity of puberty growth) can result in the distortion of population growth curves [Ramsay and Silverman (2005)]. Likewise, in our case, simply averaging individual CD4 count profiles along the time since enrollment (t) or the time since HAART initiation (H) can attenuate the true population immunologic response profile following HAART. Because viral suppression is the main driving force of immune reconstitution [Jacobson, Phair and Yamashita (2004)], it is sensible to center the time scale at individual viral suppression times (V = H + W) in order to describe the trends in immune reconstitution at the population level.
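The attenuation argument can be illustrated with a minimal sketch. It uses two hypothetical individuals whose mean CD4 level (on the square-root scale) jumps by the same amount at their own viral suppression time. The step-shaped profile, the levels 15 and 20, and the suppression times are assumptions made purely for illustration; they are not HERS quantities.

```python
def cd4_profile(t, v, before=15.0, after=20.0):
    """Hypothetical mean sqrt-CD4: flat, then a jump at suppression time v."""
    return after if t >= v else before

# Hypothetical suppression times V = H + W (months since enrollment).
suppression_times = [30.0, 42.0]

def mean_on_enrollment_scale(t):
    """Average the individual profiles at a common time t since enrollment."""
    return sum(cd4_profile(t, v) for v in suppression_times) / len(suppression_times)

def mean_on_realigned_scale(s):
    """Average the individual profiles at a common time s = t - v since suppression."""
    return sum(cd4_profile(v + s, v) for v in suppression_times) / len(suppression_times)

# On the enrollment scale the jump is smeared over the months between 30 and 42.
smeared = [mean_on_enrollment_scale(t) for t in (29.0, 36.0, 43.0)]
# On the realigned scale the full jump appears at s = 0.
aligned = [mean_on_realigned_scale(s) for s in (-1.0, 0.0, 1.0)]
```

In this toy setting the enrollment-scale average passes through an intermediate level that no individual ever has, while the realigned average reproduces the common individual shape.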

However, as mentioned in Section 1.3, W can be doubly interval-censored in HIV natural history studies, which presents a challenge in making inferences about the density in (2.1). In fact, for p{y1, y2, …, yn |X, t1 − (h + w), …, tn − (h + w)}, we are faced with a situation similar to the missing or interval-censored covariate problem in the generalized linear model literature [Chen et al. (2005); Calle and Gómez (2005)]. To accommodate this situation, we will extend the semiparametric Bayesian approach in Calle and Gómez (2005) by modeling H and W simultaneously. Note that here we model the observed H only to take into account the uncertainty in the time origin of W; we do not intend to make inference about the marginal distribution of HAART initiation time, which would require the right-censored data from those participants who did not initiate HAART during the study. This is different from the AIDS incubation time problem which motivated the research in doubly interval-censored data, where both HIV infection time and AIDS incubation time are of interest and HIV infection time can be right-censored [De Gruttola and Lagakos (1989)]. Moreover, for the HERS cohort, HAART was not available before 1996; therefore, when HAART initiation time is of scientific interest, it is not valid to use enrollment as the time origin because no HERS woman was at risk for HAART initiation between enrollment and 1996. However, for the purpose of accommodating uncertainty in the time origin of W, we can still use the observed censoring intervals for H with enrollment as their time origin.

In the following sections, we present the details of the proposed joint model for the HERS data.

2.2. Model with doubly interval-censored data

2.2.1. Observed data

Recall that all HERS women were HAART-naive at baseline. For those who initiated HAART during follow-up, let H be a positive random variable representing the time from enrollment to HAART initiation. Participants were monitored only periodically, and at each follow-up visit they only reported whether they were on HAART treatment since the last visit. Hence, the true value for H is only known to lie within an interval (LH, RH], where LH is the time of the visit preceding HAART initiation and RH is the time of the first visit at which HAART use is reported.

Let V be the time from enrollment to viral suppression following HAART initiation. By definition, V > H. For those whose viral load has been suppressed, V is observed to be in an interval (LV, RV], where LV and RV are defined similarly as LH and RH. For those whose viral load was not suppressed during follow-up, V ∈ (LV, +∞), which corresponds to right censoring of V. Because right censoring can be treated as a special case of interval censoring with RV = +∞, we simply write V ∈ (LV, RV]. The time between HAART initiation and viral suppression is W = V − H. At a given value for H, (LH, RH] and (LV, RV] can overlap because viral suppression can occur quickly after HAART but before the next visit; therefore, W ∈ (max(0, LV − H), RV − H].
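As a small sketch of the interval for W just described, the following function maps a candidate value h of H and the censoring interval for V to the implied interval for W. The function name and argument names are hypothetical; right censoring is represented by an infinite right endpoint, as in the text.

```python
import math

def w_bounds(h, l_v, r_v=math.inf):
    """Return (lower, upper) such that W = V - H lies in (lower, upper].

    h   : candidate HAART initiation time (time since enrollment)
    l_v : left endpoint of the censoring interval for V
    r_v : right endpoint; math.inf encodes right censoring
    """
    if not l_v < r_v:
        raise ValueError("need l_v < r_v")
    if h >= r_v:
        raise ValueError("h must be smaller than r_v because V > H")
    # The lower bound is floored at 0 because the intervals for H and V may overlap.
    return max(0.0, l_v - h), r_v - h
```

For example, with times in months, `w_bounds(10, 12, 18)` gives the interval (2, 8], while `w_bounds(14, 12, 18)` gives (0, 4] because h falls inside the interval for V.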

Further, we observe CD4 counts Y = (Y1, …, Yn)T at time points t1, …, tn, which can be different across individuals and X is the covariate that includes baseline HCV coinfection status Z, where Z ∈ {0, 1} indicates positivity of HCV antibody.

In summary, the observed data for a HAART user in the HERS cohort consist of the observed CD4 counts Y, the covariate X, the observation times t1, …, tn and the intervals (LH, RH], (LV, RV] that respectively include HAART initiation time H and viral suppression time V.

2.2.2. Noninformative assumption for interval-censoring

The joint density for the above observed data and the unobserved H and W can be written as

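$$
\begin{aligned}
p(l^H, r^H, l^V, r^V, h, w, y \mid X, t_1, \ldots, t_n) ={}& p_0(l^H, r^H, l^V, r^V \mid X)\, p_1(h \mid X, l^H, r^H, l^V, r^V) \\
&\times p_2(w \mid X, h, l^H, r^H, l^V, r^V) \\
&\times p_3\{y \mid X, t_1 - (h + w), \ldots, t_n - (h + w), l^H, r^H, l^V, r^V\}.
\end{aligned} \tag{2.2}
$$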

Denote the cumulative distribution function (CDF) of H given X by GH (h|X; λH), and the CDF of W given X by GW (w|X; λW). The corresponding probability density functions (PDF) are gH (h|X; λH) and gW (w|X; λW), respectively. We assume that the censoring of H and W occurs noninformatively [Oller, Calle and Gómez (2004); Calle and Gómez (2005)], in the following sense:

  1. (LH, RH, LV, RV) provide no additional information about Y when H and W are exactly observed. That is, the conditional density of Y given (X, H, W, t1, …, tn) and (LH, RH, LV, RV) does not depend on (LH, RH, LV, RV):
    $$p_3\{y \mid X, t_1 - (h + w), \ldots, t_n - (h + w), l^H, r^H, l^V, r^V\} = p_3\{y \mid X, t_1 - (h + w), \ldots, t_n - (h + w); \theta\}.$$
  2. The only information about H and W provided by the observed censoring intervals is that (LH, RH], (LV, RV] contain H and V = H + W, respectively. That is, the conditional density of H given X and (LH, RH] satisfies
    $$p_1(h \mid X, l^H, r^H, l^V, r^V) = \frac{g^H(h \mid X; \lambda^H)}{G^H(r^H \mid X; \lambda^H) - G^H(l^H \mid X; \lambda^H)}, \tag{2.3}$$
    which corresponds to the density of H given X truncated in (LH, RH]. Similarly, the conditional density of W given X, H and (LV, RV] is
    $$p_2(w \mid X, h, l^H, r^H, l^V, r^V) = \frac{g^W(w \mid X; \lambda^W)}{G^W(r^V - h \mid X; \lambda^W) - G^W(\max(0, l^V - h) \mid X; \lambda^W)}, \tag{2.4}$$
    the truncated density gW (w|X; λW) in the interval (max(0, LV − H), RV − H]. We denote (2.3) by $g_T^H(h \mid X, l^H, r^H; \lambda^H)$ and (2.4) by $g_T^W(w \mid X, h, l^V, r^V; \lambda^W)$, where the subscript T stands for ‘truncated’ density.
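The truncated densities in (2.3) and (2.4) can be evaluated and sampled by inverting the CDF. The sketch below assumes a normal distribution for illustration only (the model leaves the event time distributions unspecified); the function names and the numerical values are hypothetical.

```python
import random
from statistics import NormalDist

def truncated_pdf(dist, x, lower, upper):
    """Density of `dist` restricted to (lower, upper], as in (2.3) and (2.4)."""
    if not lower < x <= upper:
        return 0.0
    return dist.pdf(x) / (dist.cdf(upper) - dist.cdf(lower))

def sample_truncated(dist, lower, upper, rng):
    """Draw from `dist` restricted to (lower, upper] by inverting the CDF."""
    p_lo, p_hi = dist.cdf(lower), dist.cdf(upper)
    if not p_lo < p_hi:
        raise ValueError("interval has no probability mass under dist")
    u = rng.uniform(p_lo, p_hi)
    # Keep u strictly inside (0, 1), where inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

# Hypothetical example: H (months since enrollment) censored in (40, 46].
base = NormalDist(mu=45.0, sigma=10.0)
rng = random.Random(0)
draws = [sample_truncated(base, 40.0, 46.0, rng) for _ in range(1000)]
```

Inverse-CDF sampling is chosen here for transparency; a rejection sampler would be an equally valid alternative when the interval carries most of the probability mass.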

Given these noninformative conditions, the joint density in (2.2) can be simplified as

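$$
\begin{aligned}
p(l^H, r^H, l^V, r^V, h, w, y \mid X, t_1, \ldots, t_n) ={}& p_0(l^H, r^H, l^V, r^V \mid X)\, g_T^H(h \mid X, l^H, r^H; \lambda^H) \\
&\times g_T^W(w \mid X, h, l^V, r^V; \lambda^W) \\
&\times p_3\{y \mid X, t_1 - (h + w), \ldots, t_n - (h + w); \theta\}.
\end{aligned} \tag{2.5}
$$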

2.2.3. Hierarchical structure of the model

To construct the observed data likelihood, we index each individual’s data by i = 1, …, N and let ni be the number of observations for the ith individual, so that (Yi, Xi, LiH, RiH, LiV, RiV, ti1, …, tini) are observed. If we denote by [A|B; Ω] the conditional distribution of random variable A, given random variable B and parameter Ω, we can summarize our model by a hierarchical structure from a Bayesian point of view:

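$$
\begin{aligned}
[Y_i \mid X_i, H_i, W_i, t_{i1}, \ldots, t_{in_i}; \theta] &\sim P_3(y \mid X_i, t_{i1} - \upsilon_i, \ldots, t_{in_i} - \upsilon_i; \theta), \\
[W_i \mid X_i, H_i, L_i^V, R_i^V; \lambda^W] &\sim G_T^W(w \mid X_i, h_i, l_i^V, r_i^V; \lambda^W), \\
[H_i \mid X_i, L_i^H, R_i^H; \lambda^H] &\sim G_T^H(h \mid X_i, l_i^H, r_i^H; \lambda^H), \\
[L_i^H, R_i^H, L_i^V, R_i^V \mid X_i] &\sim P_0(l^H, r^H, l^V, r^V \mid X_i; \delta), \\
[\delta, \lambda^H, \lambda^W, \theta] &\sim F(\delta, \lambda^H, \lambda^W, \theta), \\
\upsilon_i &= h_i + w_i, \quad i = 1, \ldots, N,
\end{aligned} \tag{2.6}
$$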

where P3(·), GTW(·),GTH(·), P0(·) and F(·) are the corresponding distribution functions. Assuming the independence of the priors for δ and (λH, λW, θ), the marginal distribution of the censoring intervals P0(lH, rH, lV, rV |Xi; δ) is not part of the posterior inference about (λH, λW, θ) because of the noninformative censoring conditions. Therefore, we do not need to model P0(lH, rH, lV, rV |Xi; δ) explicitly.
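The argument for dropping P0 can be sketched in one display. Suppose, as assumed above, that the prior density factorizes as f1(δ) f2(λH, λW, θ), where f1 and f2 are labels introduced here only for illustration. Under the noninformative censoring conditions, the joint posterior of the parameters and the latent (hi, wi) then separates into a factor involving only δ and a factor involving only (λH, λW, θ) and the latent event times:

```latex
\begin{aligned}
p(\delta, \lambda^H, \lambda^W, \theta, h, w \mid \text{data})
\propto{}& \Big[ f_1(\delta) \prod_{i=1}^{N} p_0(l_i^H, r_i^H, l_i^V, r_i^V \mid X_i; \delta) \Big] \\
&\times \Big[ f_2(\lambda^H, \lambda^W, \theta) \prod_{i=1}^{N}
  g_T^H(h_i \mid X_i, l_i^H, r_i^H; \lambda^H)\,
  g_T^W(w_i \mid X_i, h_i, l_i^V, r_i^V; \lambda^W) \\
&\qquad\quad \times p_3\{y_i \mid X_i, t_{i1} - \upsilon_i, \ldots, t_{in_i} - \upsilon_i; \theta\} \Big],
\qquad \upsilon_i = h_i + w_i .
\end{aligned}
```

Because the first bracket does not involve (λH, λW, θ, h, w), it acts as a constant when sampling those quantities.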

2.2.4. Semiparametric Bayesian approach for event time distributions

We use a semiparametric Bayesian approach for modeling H and W. The CDFs GH and GW are left unspecified and not constrained to a parametric family. Therefore, GH and GW are themselves unknown parameters, and Dirichlet process priors [Ferguson (1973)] are assigned.

A Dirichlet process prior (DPP) on a nonparametric distribution G is a distribution on the space of all possible distributions for G [Ferguson (1973)]. The parameters of DPP are a parametric distribution G0(·; λ), and a positive scalar α. The parametric distribution G0 corresponds to the prior expectation of the distribution function G. The precision parameter α indicates how similar we believe the base measure G0 and the nonparametric distribution G are. A DPP with parameters α and G0 is denoted by 𝒟(αG0).
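The role of α can be made concrete with a standard conjugacy result for the Dirichlet process [Ferguson (1973)]. The display below applies to exactly observed draws x1, …, xn from G, which is a simplification: in the present model the event times are interval-censored and are handled through imputation, so this is only a guide to interpreting α.

```latex
E\{G(t) \mid x_1, \ldots, x_n\}
  = \frac{\alpha}{\alpha + n}\, G_0(t; \lambda)
  + \frac{n}{\alpha + n}\, \hat{G}_n(t),
\qquad
\hat{G}_n(t) = \frac{1}{n} \sum_{i=1}^{n} 1(x_i \le t).
```

The posterior mean is a weighted average of the base measure and the empirical distribution function, so α plays the role of a prior sample size: small values of α let the data dominate, and large values pull the estimate toward G0.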

In the HERS analysis reported in Section 4, we include baseline HCV status as the covariate for event time distributions. Therefore, adding nonparametric DPP for GH and GW with base measures G0H,G0W, and precision parameters αH, αW, the initial hierarchical model structure in (2.6) can be elaborated as

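$$
\begin{aligned}
[Y_i \mid X_i, H_i, W_i, t_{i1}, \ldots, t_{in_i}; \theta] &\sim P_3(y \mid X_i, t_{i1} - \upsilon_i, \ldots, t_{in_i} - \upsilon_i; \theta), \\
[W_i \mid X_i, H_i, L_i^V, R_i^V] &\sim G_T^W(w \mid Z_i, h_i, l_i^V, r_i^V), \\
[G^W(\cdot \mid Z_i); \lambda^W, \alpha^W] &\sim \mathcal{D}\big(\alpha^W G_0^W(\cdot \mid Z_i; \lambda^W)\big), \\
[H_i \mid X_i, L_i^H, R_i^H] &\sim G_T^H(h \mid Z_i, l_i^H, r_i^H), \\
[G^H(\cdot \mid Z_i); \lambda^H, \alpha^H] &\sim \mathcal{D}\big(\alpha^H G_0^H(\cdot \mid Z_i; \lambda^H)\big), \\
[L_i^H, R_i^H, L_i^V, R_i^V \mid X_i] &\sim P_0(l^H, r^H, l^V, r^V \mid X_i; \delta), \\
[\delta, \lambda^H, \lambda^W, \theta] &\sim F(\delta, \lambda^H, \lambda^W, \theta), \\
\upsilon_i &= h_i + w_i, \quad i = 1, \ldots, N.
\end{aligned} \tag{2.7}
$$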

2.2.5. Model for CD4 counts

In this section we describe the model for CD4 counts. Recall that our objective is to characterize mean CD4 count profiles relative to individual viral suppression times for HCV groups, after adjusting for other covariates. In other words, our focus is on the parameter θ in P3(y|Xi, ti1 − (hi + wi), …, tini − (hi + wi); θ). Since viral suppression time V can be right-censored, those individuals with V less than or equal to the maximum follow-up time T are treated as HAART responders, while those with V > T are considered as nonresponders in the study period for comparison purpose. It is also assumed that the mean CD4 count profiles differ by both HAART responder groups and HCV groups; thus, different smooth functions are used for these subpopulations. We only realign the data for the HAART responder group by viral suppression times; for the nonresponder group the measurement time origin is still participant enrollment.

In addition, there are other important covariates that are possibly associated with immunologic response to HAART besides the HCV serostatus, for example, the overall CD4 level before HAART initiation and baseline injection drug use information. Specifically, let Xi* be a vector of other covariates excluding baseline HCV status Zi, and T be the maximum follow-up time for the study. For j = 1, …, ni, we assume that the CD4 count at tij for the ith individual follows

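$$
Y_{ij} \mid X_{ij}^{*}, Z_i, \upsilon_i, t_{ij} =
\begin{cases}
m_i(t_{ij} - \upsilon_i) + X_i^{*}\beta^{*} + e_{ij}, & \text{if } \upsilon_i \le T, \\
c_i(t_{ij}) + X_i^{*}\beta^{*} + e_{ij}, & \text{if } \upsilon_i > T,
\end{cases} \tag{2.8}
$$

where

$$
\begin{aligned}
m_i(t) &= Z_i \cdot m_1(t) + (1 - Z_i) \cdot m_0(t) + \gamma_{im}(t), \\
c_i(t) &= Z_i \cdot c_1(t) + (1 - Z_i) \cdot c_0(t) + \gamma_{ic}(t).
\end{aligned}
$$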

Here m1(t), m0(t), c1(t), c0(t) are smooth functions describing the population CD4 count profiles that are specific to HCV serostatus, γim(t) and γic(t) are individual-level smooth functions that represent random deviations from population profiles, β* is the regression coefficient for Xij*, and the within-individual error terms eij are i.i.d. N(0, σ^2). We assume that eij, γim(t) and γic(t) are mutually independent. Detailed specification for all smooth functions can be found in the Appendix. Overall, m1(t), m0(t), c1(t), c0(t) can be considered as fixed effects, γim(t), γic(t) can be considered as random effects, and eij is the measurement error in the linear mixed model framework. Because within-subject covariance is not of direct interest in our analysis, no stochastic process is further introduced into the CD4 count model except random effects and measurement error. However, when within-subject covariance is the target of inference, stochastic processes, for example, the integrated Ornstein–Uhlenbeck process in Taylor, Cumberland and Sy (1994), can be added.
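The branching in (2.8) can be sketched as follows for the population-level mean, with the individual-level deviations and the error term set to zero. The function name, the dictionary layout and the linear curves are hypothetical stand-ins; the actual smooth functions are penalized splines specified in the Appendix.

```python
import math

def population_mean(t, v, z, t_max, m, c, x_beta=0.0):
    """Population-level mean of sqrt-CD4 at time t since enrollment.

    v      : viral suppression time (math.inf if never suppressed)
    z      : HCV status, 0 or 1
    t_max  : maximum follow-up time (T in the text)
    m, c   : dicts mapping z to the responder / nonresponder mean curves
    x_beta : contribution of the other covariates
    """
    if v <= t_max:
        # Responder: the time scale is realigned at the suppression time.
        return m[z](t - v) + x_beta
    # Nonresponder: the time origin remains enrollment.
    return c[z](t) + x_beta

# Hypothetical linear curves, for illustration only.
m = {1: lambda s: 18.0 + 0.05 * s, 0: lambda s: 19.0 + 0.08 * s}
c = {1: lambda t: 16.0 - 0.01 * t, 0: lambda t: 16.5 - 0.01 * t}

responder = population_mean(t=48.0, v=40.0, z=1, t_max=96.0, m=m, c=c)
nonresponder = population_mean(t=48.0, v=math.inf, z=1, t_max=96.0, m=m, c=c)
```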

3. Prior specification and posterior inference

Gibbs sampling can be used to obtain posterior samples from the full conditional posterior distributions of λH, λW and θ. Compared to the model with known H and W in (2.1), the model in (2.7) involves an extra layer in the Gibbs steps. That is, at each iteration, the doubly interval-censored W together with H are sampled from their conditional posterior distributions, which results in a complete data set that is used to update the posterior distributions of the model parameters.
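The extra imputation layer can be illustrated on a deliberately simplified problem. The toy sampler below is not the sampler used in this article: it assumes x_i ~ N(mu, 1) with a vague normal prior on mu, observes each x_i only through an interval, and alternates between imputing the x_i and updating mu. All names and numerical settings are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def draw_truncated_normal(mu, lower, upper, rng):
    """Inverse-CDF draw from N(mu, 1) restricted to (lower, upper]."""
    dist = NormalDist(mu, 1.0)
    u = rng.uniform(dist.cdf(lower), dist.cdf(upper))
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well defined
    return dist.inv_cdf(u)

def gibbs_interval_censored(intervals, n_iter, burn_in, seed=0, prior_sd=100.0):
    """Toy model: x_i ~ N(mu, 1) seen only as intervals, mu ~ N(0, prior_sd^2)."""
    rng = random.Random(seed)
    n = len(intervals)
    post_var = 1.0 / (n + 1.0 / prior_sd ** 2)  # conjugate update with unit variance
    mu, kept = 0.0, []
    for it in range(n_iter):
        # Step 1: impute the censored values given the current mu.
        x = [draw_truncated_normal(mu, lo, hi, rng) for lo, hi in intervals]
        # Step 2: update mu from its full conditional given the completed data.
        mu = rng.gauss(post_var * sum(x), post_var ** 0.5)
        if it >= burn_in:
            kept.append(mu)
    return kept

# Fifty hypothetical observations, each known only to lie in (1.5, 2.5].
draws = gibbs_interval_censored([(1.5, 2.5)] * 50, n_iter=1500, burn_in=500)
posterior_mean = fmean(draws)
```

Because every interval is symmetric about 2, the posterior for mu in this toy problem concentrates near 2; the point of the sketch is only the alternation between imputation and parameter update.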

For the HERS analysis in Section 4, we assume that the prior for θ and the prior for λH, λW are independent. Normal distributions are used as base measures of DPP for GH and GW. Different values of the precision parameters (αH, αW) are used to evaluate the sensitivity in estimating GH and GW. For the CD4 count model, standard vague priors, such as normal-gamma conjugate family, are used.

Let H = (H1, …, HN)T, W = (W1, …, WN)T, Ti = (ti1, …, tini)T and T = (T1, …, TN)T; LH, RH, LV and RV are the vectors of left and right endpoints for censoring intervals. To derive the full conditional distribution for model (2.7), we use the Polya urn characterization of DPP [Blackwell and MacQueen (1973)] and extend the ideas of Escobar (1994) and Calle and Gómez (2005). Specifically, we sample from [H, W, λH, λW, θ|Y1, …, YN, X1, …, XN, LH, RH, LV, RV, T] by iterating the following steps: first, H and W are imputed from their corresponding conditional distributions; second, the parameter θ is updated using the complete data set obtained from the first step and the current values of the remaining parameters; last, the parameters λH, λW are updated using the distinct values of the imputed H and W. Details on priors and full conditional posterior distributions are given in the Appendix.
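The Polya urn characterization can be sketched for the unconstrained case, without censoring intervals or covariates: each new value is a fresh draw from the base measure with probability α/(α + i), where i is the number of earlier values, and otherwise a copy of an earlier value. The function name, the normal base measure and its parameters are assumptions for illustration.

```python
import random

def polya_urn_sample(n, alpha, base_draw, rng):
    """Draw x_1, ..., x_n from G ~ DP(alpha * G0) with G integrated out."""
    values = []
    for i in range(n):
        # i earlier values exist; a fresh draw has probability alpha / (alpha + i).
        if rng.random() < alpha / (alpha + i):
            values.append(base_draw(rng))
        else:
            # Otherwise copy one of the earlier values, chosen uniformly.
            values.append(rng.choice(values))
    return values

def normal_base(rng):
    """Hypothetical base measure G0: normal with mean 40 and sd 10."""
    return rng.gauss(40.0, 10.0)

rng = random.Random(1)
sample = polya_urn_sample(200, 1.0, normal_base, rng)
n_distinct = len(set(sample))
# With a tiny alpha, essentially every draw copies the first value.
tied = polya_urn_sample(200, 1e-12, normal_base, random.Random(2))
```

The ties produced by the urn explain why the last Gibbs step works with the distinct values of the imputed event times; the expected number of distinct values among n draws is the sum of α/(α + i) over i = 0, …, n − 1.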

4. Data analysis

In this section we apply the joint model to the HERS data introduced in Section 1.3. Two different definitions are used for censoring intervals of HAART initiation and the results are compared. The first is based explicitly on reported HAART use information, and we refer to the resulting intervals as ‘narrow’ intervals for H. Here RH is the first visit with reported HAART use; LH is the immediately preceding visit without HAART use. There are 159 (89 HCV seropositive, 70 HCV seronegative) patients with right-censored viral suppression time in this case. However, we find that some patients had viral suppression immediately before LH, which could be due to reporting bias regarding HAART initiation. As a result, we might miss the true viral suppression time following HAART and artificially create some cases with right-censored viral suppression time (or viral suppression that occurred long after HAART initiation). To reduce the impact of this bias in a conservative manner, we redefine all 374 left endpoints of HAART initiation intervals to be March 11th, 1996, which is the left endpoint of the censoring interval for the first patient to report HAART use in the HERS cohort. Because censoring intervals for HAART initiation are wider under this new definition, we refer to them as ‘wide’ intervals for H; here the number of patients with right-censored viral suppression time is reduced to 141 (78 HCV seropositive, 63 HCV seronegative). Figure 4 shows the CD4 count data and censoring intervals under the two definitions of HAART initiation time intervals for two selected women in the HERS cohort. In the left panel, the ‘wide’ definition for H also changes the interval for viral suppression time V, while in the right panel the intervals for V remain the same.

Fig. 4.

CD4 counts (on square root scale, circles: HCV positive, triangles: HCV negative) and censoring intervals of H and V = H + W under two definitions of HAART initiation time intervals for two selected women in the HERS cohort; censoring intervals under ‘narrow’ definition are represented by dashed lines, censoring intervals under ‘wide’ definition are represented by solid lines; censoring intervals of H and V = H + W are on the top and bottom of panels, respectively.

For CD4 counts, square-root transformation is used because it is more appropriate for the assumptions of Normality and homogeneous variance as shown by exploratory analysis. In addition to baseline HCV serostatus, two other covariates are included in the CD4 model: the observed CD4 count (scaled by 100) immediately before reported HAART initiation (pretreatment CD4 level) and the indicator of baseline injection drug use (IDU). For penalized splines approximating population-level smooth functions, we use truncated quadratic bases with 20 knots, allowing sufficient flexibility for capturing CD4 count changes at viral suppression times. These knots are placed at viral suppression times as well as at the sample quantiles of the realigned measurement times using midpoints of the observed censoring intervals for viral suppression. Because data for individual women are sparse over time and the maximum number of data points for individual women is 15, we use truncated quadratic bases with one knot at the viral suppression times for estimating individual-level smooth functions. Since the first derivatives (velocities) of the population-level smooth functions can be computed in analytic form when truncated quadratic bases are used, we also examine the posterior inference for these derivatives.
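As a hedged illustration of why the velocities are available in analytic form, the following Python sketch builds a truncated power basis and its first derivative, using the convention (d)_+^p = d^p·I(d ≥ 0). The function names and the example knots are assumptions, not part of the original analysis, which was implemented in MATLAB.

```python
import numpy as np


def truncated_power_basis(t, knots, p=2):
    """Rows are (1, t, ..., t^p, (t - k_1)_+^p, ..., (t - k_K)_+^p)."""
    t = np.asarray(t, dtype=float)[:, None]
    k = np.asarray(knots, dtype=float)[None, :]
    poly = t ** np.arange(p + 1)                  # polynomial part
    trunc = np.where(t >= k, (t - k) ** p, 0.0)   # truncated part
    return np.hstack([poly, trunc])


def truncated_power_basis_deriv(t, knots, p=2):
    """First derivative of each basis function with respect to t."""
    t = np.asarray(t, dtype=float)[:, None]
    k = np.asarray(knots, dtype=float)[None, :]
    s = np.arange(1, p + 1)
    poly_d = np.hstack([np.zeros_like(t), s * t ** (s - 1)])   # d/dt t^s = s t^(s-1)
    trunc_d = np.where(t >= k, p * (t - k) ** (p - 1), 0.0)    # d/dt (t-k)_+^p
    return np.hstack([poly_d, trunc_d])
```

Given spline coefficients `beta`, the fitted curve is `truncated_power_basis(t, knots) @ beta` and its velocity is `truncated_power_basis_deriv(t, knots) @ beta`, so posterior draws of the coefficients translate directly into posterior draws of the derivative.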

The prior specifications are as described in Section 3 and the Appendix. For assessing sensitivity in estimating GH and GW, precision parameters (αH, αW) of the Dirichlet process are taken to be equal to (1, 1) and (10, 10), which indicate different levels of faith in the prior normal base measures for H and W. We run two MCMC chains with 7000 iterations each, the first 2000 of which are discarded. Convergence is assessed graphically using history plots; the pooled 10,000 posterior samples are then used for inference. The results at both values of (αH, αW) are similar; here we present those with (αH, αW) = (10, 10). MCMC is implemented in MATLAB programs [The MathWorks Inc. (1997)].
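The bookkeeping behind the pooled sample (two chains of 7000 draws, 2000 discarded from each, 10,000 retained) can be sketched in Python as follows. The helper names are hypothetical, the draws are simulated stand-ins rather than HERS output, and the equal-tailed interval is an assumption, since the text does not state which type of credible interval is reported.

```python
import numpy as np


def pool_chains(chains, burn_in):
    """Drop the first `burn_in` draws of each chain and stack the rest."""
    return np.concatenate([np.asarray(c)[burn_in:] for c in chains], axis=0)


def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval (assumed type)."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [tail, 1.0 - tail])
    return float(np.mean(draws)), (float(lo), float(hi))


rng = np.random.default_rng(0)
chains = [rng.normal(size=7000), rng.normal(size=7000)]  # stand-in draws only
pooled = pool_chains(chains, burn_in=2000)               # 2 x 5000 draws
post_mean, (ci_lo, ci_hi) = summarize(pooled)
```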

For the purpose of modeling doubly interval-censored event time W only, marginal models can be used by excluding the part for CD4 counts from (2.7). We will compare the results from our joint model with those from marginal models, and investigate the possible impact of joint modeling.

4.1. Results for virologic response to HAART

Table 2 presents the posterior mean estimates of the percentiles of the time between HAART initiation and viral suppression for the HAART responder group. The results based on ‘wide’ intervals for H suggest that the HCV-negative group might take less time to achieve viral suppression than the HCV-positive group, but this is not the case with ‘narrow’ intervals for H, where the HCV-negative group has a more right-skewed distribution. Further, the joint model tends to give smaller estimates than the marginal model. For example, in Table 2 both location estimates and variability estimates from the joint model based on ‘wide’ intervals for H are smaller than those from the marginal model, which suggests that modeling CD4 counts affects the estimation for doubly interval-censored W when the information from censoring intervals is limited.

TABLE 2.

Percentiles (posterior mean estimates) of the time between HAART initiation and viral suppression (in units of days) for HAART responder group by HCV serostatus in marginal and joint models; ‘narrow’ stands for ‘narrow’ intervals for H, ‘wide’ stands for ‘wide’ intervals for H

Quantity summarized: W | V ≤ T (days)

Intervals for H   Model      HCV     5%    25%    50%    75%    95%
‘narrow’          Marginal   HCV+    15     37    126    654   1339
‘narrow’          Marginal   HCV−    13     39    118    625   1384
‘narrow’          Joint      HCV+    13     28     88    291    906
‘narrow’          Joint      HCV−    13     31     82    356    959
‘wide’            Marginal   HCV+     3    145    582   1129   1497
‘wide’            Marginal   HCV−     1    120    436   1021   1521
‘wide’            Joint      HCV+     1    122    350    793   1232
‘wide’            Joint      HCV−     1     91    322    768   1315

Table 3 gives the estimated proportions of HAART responders with time between HAART initiation and viral suppression less than or equal to 90/180 days. With both ‘wide’ and ‘narrow’ intervals for H, the 95% credible intervals for differences between proportions by HCV groups cover zero. Thus, in the HERS cohort, there is not sufficient evidence that baseline HCV serostatus is associated with virologic response to HAART. This is also demonstrated in Figure 5, where the hazard functions of viral suppression are plotted over grid points of 30 days. Here the hazard is defined as p(W < t2 | W ≥ t1, V ≤ T), where t1, t2 are grid points. With both ‘narrow’ and ‘wide’ intervals for H, the hazard functions of viral suppression are generally similar across the HCV groups. Note that estimated proportions of HAART responders p(V ≤ T) are also similar for the HCV groups in all cases.
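A grid-based hazard of this kind can be computed from draws of W among responders. The sketch below assumes the hazard is p(W < t2 | W ≥ t1, V ≤ T) for consecutive grid points t1 < t2; the function name and the toy values are illustrative and do not reproduce the authors' implementation.

```python
import numpy as np


def discrete_hazard(w, grid):
    """Empirical p(W < t2 | W >= t1) for consecutive grid points (t1, t2),
    computed from draws `w` of W among responders (V <= T)."""
    w = np.asarray(w, dtype=float)
    grid = np.asarray(grid, dtype=float)
    hazard = np.full(len(grid) - 1, np.nan)  # NaN where nobody is at risk
    for k, (t1, t2) in enumerate(zip(grid[:-1], grid[1:])):
        at_risk = w >= t1
        if at_risk.any():
            hazard[k] = np.mean(w[at_risk] < t2)
    return hazard


# Toy draws in days (not HERS output), on a 30-day grid.
example = discrete_hazard([10, 40, 40, 70], [0, 30, 60, 90])
```

In this toy case one of four draws falls below 30 days, two of the remaining three fall below 60 days, and the last falls below 90 days.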

TABLE 3.

Proportions (posterior mean estimates) of HAART responders and proportions of HAART responders with time between HAART initiation and viral suppression less than 90 (180) days by HCV serostatus from marginal and joint models in the HERS cohort; 95% credible intervals are in square brackets; ‘narrow’ stands for ‘narrow’ intervals for H, ‘wide’ stands for ‘wide’ intervals for H

Intervals for H   Model      Row          p(V ≤ T)         p(W ≤ 90 | V ≤ T)   p(W ≤ 180 | V ≤ T)
‘narrow’          Marginal   HCV+         0.75             0.42                0.56
‘narrow’          Marginal   HCV−         0.72             0.43                0.56
‘narrow’          Marginal   Difference   −0.03            0.02                −0.01
‘narrow’          Marginal   95% CI       [−0.14, 0.08]    [−0.24, 0.25]       [−0.12, 0.11]
‘narrow’          Joint      HCV+         0.63             0.48                0.66
‘narrow’          Joint      HCV−         0.64             0.52                0.62
‘narrow’          Joint      Difference   0.01             0.05                −0.04
‘narrow’          Joint      95% CI       [−0.05, 0.06]    [−0.24, 0.31]       [−0.14, 0.07]
‘wide’            Marginal   HCV+         0.85             0.13                0.29
‘wide’            Marginal   HCV−         0.78             0.22                0.33
‘wide’            Marginal   Difference   −0.07            0.08                0.04
‘wide’            Marginal   95% CI       [−0.22, 0.07]    [−0.03, 0.19]       [−0.06, 0.14]
‘wide’            Joint      HCV+         0.68             0.17                0.36
‘wide’            Joint      HCV−         0.68             0.24                0.38
‘wide’            Joint      Difference   0.01             0.07                0.01
‘wide’            Joint      95% CI       [−0.05, 0.07]    [−0.06, 0.20]       [−0.10, 0.12]

Fig. 5.

Hazard function of viral suppression after HAART initiation by HCV serostatus in the HERS cohort over grid points of 30 days from the joint model; left panel: ‘narrow’ intervals for H; right panel: ‘wide’ intervals for H.

From Table 2, median estimates for the time between HAART initiation and viral suppression are approximately one year with ‘wide’ intervals for H and about three months with ‘narrow’ intervals for H in the joint model. Compared to the clinically expected value, the estimates with ‘wide’ intervals for H might be too large for the following reasons. First, data were collected approximately every six months in the HERS, thus the immediate virologic response to HAART was not available. Second, HAART information was self-reported and we set the left endpoints of HAART initiation time to March 11th, 1996 to reduce reporting bias. Consequently, censoring intervals for observed HAART initiation times are wide. Third, 38% of the participants (141 of 374) had right-censored viral suppression times, which might be related to adherence to HAART treatment and individual heterogeneity in virologic response. However, these situations do not differ by HCV serostatus, thus the corresponding comparison can still be useful.

4.2. Results for immunologic response to HAART

The results for CD4 counts are similar under both definitions of censoring intervals for HAART initiation and we present those based on ‘wide’ intervals for H.

4.2.1. Population estimates

We compute posterior mean estimates for all targets of inference. The coefficient estimate for pretreatment CD4 level is 2.35 (95% credible interval [2.22, 2.49]), which clearly indicates the positive association between pretreatment CD4 level and the current CD4 count, given baseline HCV and IDU statuses. The coefficient estimate for baseline IDU is −0.06 (95% credible interval [−0.80, 0.64]), suggesting that baseline IDU status was not associated with current CD4 counts, given baseline HCV and pretreatment CD4 level.

For HAART responders, mean CD4 count profiles (after accounting for pretreatment CD4 level and baseline IDU) are plotted in panel (a) of Figure 6. We transform the estimates back to the original CD4 count scale for illustration purposes. The estimated CD4 count profiles of both HCV groups were decreasing at 3–6 years before viral suppression. CD4 counts started to increase before HIV was completely suppressed (time point 0). This is consistent with findings from other studies, that is, CD4 cells may increase after HAART for patients who do not fully suppress the virus, because the level of viral load is decreasing [Jacobson, Phair and Yamashita (2004)]. However, Figure 6(a) also suggests that the decreasing trend ends earlier for HCV-negative patients than for HCV-positive patients once HAART started to be initiated. In addition, the average CD4 level after viral suppression achieved by HCV-negative patients is higher than that achieved by HCV-positive patients. For example, at viral suppression time the difference in average CD4 count between HCV groups is approximately 16 (95% credible interval [−3, 35]), controlling for pretreatment CD4 level and baseline IDU. We also plot the difference curve between mean CD4 count profiles of HCV groups [Figure 6(b), in original CD4 count scale]. The pointwise 95% credible bands are approximately above zero after CD4 counts started to increase. Note that the difference between point estimates of the mean CD4 counts at the left boundary of the time since viral suppression axis might be due to the small sample size and large estimation variability, which is suggested by the width of the 95% pointwise credible bands.

Fig. 6.

(a) Estimated CD4 count profiles by HCV groups for HAART responders (transformed to original CD4 count scale) in the joint model, after accounting for pretreatment CD4 level and baseline injection drug use: solid line, HCV-positive group; dotted line, HCV-negative group. (b) Difference between CD4 count profiles (in original CD4 count scale) in the joint model: solid line, posterior mean estimates; dotted lines, 95% pointwise credible bands. (c) Derivatives for CD4 count profiles by HCV groups for HAART responders (in square root CD4 count scale) in the joint model, after accounting for pretreatment CD4 level and baseline injection drug use. (d) Difference between derivatives for CD4 count profiles (in square root CD4 count scale) in the joint model. The ticks at the top and the bottom of the panels are the HAART initiation times corresponding to the 5%, 50% and 95% quantiles of the time between HAART initiation and viral suppression in Table 2: solid line, HCV-positive group; dotted line, HCV-negative group.

To evaluate immune reconstitution after HAART, the rate of CD4 count change is a useful measure. Panel (c) of Figure 6 presents the derivative (velocity) curves for mean CD4 count profiles of HAART responders. For both HCV groups, the velocities of the average CD4 count change reach the maximum approximately at viral suppression times, which is sensible because the major driving force of immune reconstitution is viral suppression [Jacobson, Phair and Yamashita (2004)]. Overall, the HCV-negative group has slightly larger point estimates of mean CD4 count change rate leading up to and following viral suppression. Panel (d) of Figure 6 gives the difference and the corresponding 95% credible bands between derivative curves of HCV groups. After controlling for pretreatment CD4 level and baseline IDU, the rates of mean CD4 count change do not appear to be different by HCV serostatus in the HERS cohort.

The left panel of Figure 7 presents the mean CD4 count profiles for HAART nonresponders (in original CD4 count scale) along the time since enrollment. Both HCV groups had the same decreasing patterns, and the difference curve and its 95% credible band (right panel of Figure 7) indicate that there is no difference in mean CD4 count levels between HCV groups in this nonresponder population, after adjusting for pretreatment CD4 level and baseline IDU.

Fig. 7.

(a) Estimated CD4 count profiles by HCV groups for HAART nonresponders (transformed to original CD4 count scale) in the joint model, after accounting for pretreatment CD4 level and baseline injection drug use: solid line, HCV positive group; dotted line, HCV negative group. (b) Difference between CD4 count profiles (in original CD4 count scale) in the joint model: solid line, posterior mean estimates; dotted lines, 95% pointwise credible bands.

4.2.2. Individual estimates

The parameter estimates for individuals may not exactly follow the patterns of the population if the between-subject variation is large. Figure 8 plots the data, 50 sample curves from the posterior predictive distributions and the averages of 50 sampled mean curves for the nine selected HERS women introduced in Section 1. Compared with Figures 6 and 7, we can see that not only the magnitude but also the patterns differ between the population and individual estimated profiles. However, the model fits this representative sample of individuals well.

Fig. 8.

CD4 count data (on square root scale) and 50 posterior predictive sample curves in the joint model from 9 selected women in the HERS cohort: vertical dotted lines are censoring intervals for HAART initiation (under ‘wide’ definition), vertical solid lines are censoring intervals for viral suppression; except for panels (a) and (e) with υi >T, ticks at the bottom of each panel are imputed viral suppression times (υi ≤ T); circles represent data from the HCV-positive group and triangles represent data from the HCV-negative group; solid lines are averages of 50 sampled mean curves.

5. Conclusion and discussion

We proposed a joint model for doubly interval-censored event time and longitudinal data in HIV natural history studies in order to investigate the post-HAART HIV dynamics and the associated factors. Using data from the HERS cohort, we found that HCV-negative and HCV-positive patients had similar virologic response, which is measured by the time from HAART initiation to viral suppression. Further, our results show that for patients with virologic response to HAART, being HCV seronegative is associated with higher average CD4 count level after viral suppression, given the same pretreatment CD4 level and baseline IDU status. The HCV-negative group showed slightly higher immune reconstitution level (measured by the rate of mean CD4 count change) leading up to and following viral suppression; however, the evidence from the HERS cohort is not sufficient to support this conclusion.

Data from natural history studies have been used to evaluate the effect of HCV coinfection on post-HAART HIV dynamics [Greub et al. (2000); Sulkowski et al. (2002); Miller et al. (2005)]. However, virologic response and immunologic response were investigated separately and simple summary statistics were used for inference, for example, average CD4 count increases after HAART initiation by visits, hazard ratio of increasing CD4 count by at least 50 cells/µl in a year, etc. In contrast, our method considers the characteristics of longitudinal cohort data as well as the biological background of the post-HAART HIV dynamics (such as the sequential relationship between virologic and immunologic response); our joint modeling approach utilizes all available information from natural history studies and the results can be informative in generating hypotheses for AIDS clinical trials.

In the HERS analysis, we considered the women with V > T as HAART nonresponders and examined their population mean CD4 count profiles. However, because the data are from a natural history study and the observed HAART initiation times vary across individuals, the observed data for viral suppression time actually depend on the timing of HAART initiation. Therefore, the HERS women with V > T might not be a homogeneous group in terms of response to HAART. The definition of ‘responder,’ however, does not differ by HCV status. Thus, for comparison purposes, it would still be useful to examine the population mean CD4 count profiles for both women with V > T and women with V ≤ T.

Due to the sparse data, information on event times for evaluating virologic response is limited in the HERS cohort. In order to reduce possible reporting bias regarding HAART initiation, we use two definitions of censoring intervals for HAART initiation and investigate the impact on the analysis. The conclusions for HCV serostatus and post-HAART HIV dynamics do not differ by the definitions. However, the actual estimates for time between HAART initiation and viral suppression might be larger than the clinically expected values due to the study design, conservative definition of censoring intervals, participant noncompliance, drug resistance and other individual heterogeneity in virologic response to HAART. As we are being conservative by moving left endpoints of HAART initiation time to the earliest possible date, another option could be a hybrid approach that changes censoring intervals only for those with suspicious viral suppression immediately before the self-reported HAART initiation date. Alternatively, we could specify a uniform prior for the left boundary of HAART initiation time between the left boundaries defined in ‘narrow’ and ‘wide’ intervals to reflect uncertainty about true HAART initiation time.

Besides HCV coinfection, other potential determinants or modifiers of post-HAART HIV dynamics include characteristics of the HAART regimen, prior antiviral treatment history, stage of disease at the time of HAART initiation (viral load level), an intact immune system and other host characteristics, such as age, race, gender and genotype [Jacobson, Phair and Yamashita (2004)]. To adjust for these possible factors, covariates can be added to the CD4 count model (2.8) in the same way as pretreatment CD4 level and baseline IDU status. For doubly interval-censored data, one limitation of our Bayesian semiparametric approach is that sample sizes could be too small for reliable estimation when the number of unique covariate values is large. For example, there were only 4 HERS women who were IDU and HCV negative at baseline. Therefore, we could not assign a different DPP to each combination of the covariate values when baseline IDU is included as a covariate. In this scenario, a parametric approach can be developed to adjust for additional covariates.

We believe that the proposed joint modeling approach is methodologically valuable. The proposed regression spline method is simple to implement, and naturally incorporates the typical features of longitudinal data such as between-individual and within-individual variations. The proposed model can be extended to characterize multiple processes in disease progression after treatment intervention, for example, the neurocognitive response to HAART treatment after immune reconstitution is another process of interest apart from the virologic and immunologic response [Bell (2004)].

Acknowledgments

We are grateful to Jeffrey Blume, Mike Daniels, Constantine Gatsonis, Patrick Heagerty, Tony Lancaster and the referees for helpful comments. Data for HERS were collected under CDC cooperative agreements U64-CCU106795, U64-CCU206798, U64-CCU306802 and U64-CCU506831.

APPENDIX: FULL CONDITIONAL DISTRIBUTIONS FOR GIBBS STEPS IN SECTION 3

A.1. Data augmentation for event times

A value for each censored observation, Hi, is sampled from the conditional distribution of Hi given all other parameters. Under a DPP this conditional distribution maintains the same Polya urn structure assumed a priori for H1, …, HN. It can be shown that the full conditional distribution of Hi has the following form:

$$[H_i \mid Y_i, T_i, X_i, W, \{H_j, j \neq i\}, L^H, R^H, L^V, R^V, \lambda_H, \lambda_W, \theta] \sim r_0 \cdot g_{0T}^H(h_i \mid Z_i, \upsilon_i, l_i^H, r_i^H, \lambda_H) + \sum_{j \neq i} r_j \cdot I(h_j = h_i), \tag{A.1}$$

where $g_{0T}^H$ is the truncated posterior distribution in the censoring interval $(L_i^H, \min(R_i^H, V_i)]$. Note that $Y_i$ does not get involved in (A.1) because, conditioning on $V_i$, $Y_i$ and $H_i$ are independent. Since $V_i$ only provides information on the range of $H_i$, $g_{0T}^H$ is simply the truncated $g_0^H$, the base measure of $H_i$ given $Z_i$. Furthermore,

$$r_0 \propto \alpha_H \int_{l_i^H}^{\min(r_i^H, \upsilon_i)} g_0^H(h_i \mid Z_i; \lambda_H)\, dh_i,$$
$$r_j \propto I\big(l_i^H < h_j \le \min(r_i^H, \upsilon_i),\ Z_j = Z_i\big),$$

and $r_0 + \sum_{j \neq i} r_j = 1$. Thus, a new value of $H_i$ is equal either to $h_j$ with probability $r_j$, or to a sampled value from the distribution $g_{0T}^H$ with probability $r_0$. Also, we assume that, depending on the value of $Z_i$, the base measures $g_0^H$ are normal distributions with distinct parameters $(\mu_1^H, \tau_1^H)$ or $(\mu_0^H, \tau_0^H)$.
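A single Polya urn update of this kind can be sketched in Python. This is a simplified illustration rather than the authors' MATLAB code: the function name `draw_h_i`, the dictionaries `mu` and `sd` keyed by HCV group, and the parameterization of the normal base measure by a standard deviation are all assumptions (the text does not say whether τ denotes a variance, a precision or a standard deviation).

```python
import random
from statistics import NormalDist


def draw_h_i(i, h, z, lower, upper, alpha, mu, sd, rng):
    """One Polya-urn draw for H_i given the other H_j.

    lower = l_i^H and upper = min(r_i^H, v_i) bound the admissible values;
    mu[z] and sd[z] parameterize the (assumed) normal base measure for group z.
    """
    base = NormalDist(mu[z[i]], sd[z[i]])
    p_lo, p_hi = base.cdf(lower), base.cdf(upper)
    w_new = alpha * (p_hi - p_lo)          # unnormalized r_0
    # Each admissible h_j from the same group carries unnormalized weight 1.
    ties = [h[j] for j in range(len(h))
            if j != i and z[j] == z[i] and lower < h[j] <= upper]
    total = w_new + len(ties)
    if total <= 0.0:
        raise ValueError("no admissible value for H_i in this sketch")
    if rng.random() * total < w_new:
        # New value from the truncated base measure via inverse-CDF sampling.
        u = p_lo + rng.random() * (p_hi - p_lo)
        u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep inv_cdf in its domain
        return min(max(base.inv_cdf(u), lower), upper)
    return rng.choice(ties)                   # reuse an existing value h_j
```

A larger precision parameter `alpha` makes fresh draws from the base measure more likely, while a small `alpha` favors ties with values already imputed for other subjects in the same HCV group.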

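For $W_i = V_i - H_i$, the full conditional distribution follows:

$$[W_i \mid Y_i, T_i, X_i, H, \{W_j, j \neq i\}, L^H, R^H, L^V, R^V, \lambda_H, \lambda_W, \theta] \sim q_0 \cdot g_{0T}^W(w_i \mid Y_i, T_i, X_i, h_i, l_i^V, r_i^V, \lambda_W) + \sum_{j \neq i} q_j \cdot I(w_j = w_i),$$

where

$$g_{0T}^W(w_i \mid y_i, T_i, X_i, h_i, l_i^V, r_i^V, \lambda_W) \propto p_3\big(y_i \mid X_i, T_i - (h_i + w_i); \theta\big)\, g_0^W(w_i \mid Z_i; \lambda_W) \times I\big(\max(0, l_i^V - h_i) < w_i \le r_i^V - h_i\big)$$

is the truncated posterior distribution of $W_i$ in $(\max(0, L_i^V - H_i), R_i^V - H_i]$. Furthermore,

$$q_0 \propto \alpha_W \int_{\max(0,\, l_i^V - h_i)}^{r_i^V - h_i} p_3\big(y_i \mid X_i, T_i - (h_i + w_i); \theta\big)\, g_0^W(w_i \mid Z_i; \lambda_W)\, dw_i,$$
$$q_j \propto p_3\big(y_i \mid X_i, T_i - (h_i + w_j); \theta\big)\, I\big(\max(0, l_i^V - h_i) < w_j \le r_i^V - h_i,\ Z_j = Z_i\big),$$

and $q_0 + \sum_{j \neq i} q_j = 1$. Thus, a new value of $W_i$ is equal either to $w_j$ with probability $q_j$, or to a sampled value from the distribution $g_{0T}^W$ with probability $q_0$, where $g_{0T}^W$ is the full conditional distribution of $W$ that would be obtained if the completely parametric hierarchical model (2.6) were used and $g_0^W$ is the prior distribution (base measure) for $W$ given $Z$. We again assume that the $g_0^W$ are normal distributions with distinct parameters $(\mu_1^W, \tau_1^W)$ and $(\mu_0^W, \tau_0^W)$. Because $p_3(y_i \mid X_i, T_i - (h_i + w_i); \theta)$ is based on the model in (2.8), there is no closed form for $g_{0T}^W$ and a Metropolis step [Gelman et al. (2003)] is used for sampling. The integral in $q_0$ is approximated by Gauss–Legendre quadrature with 20 nodes.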
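For readers unfamiliar with the quadrature step, the following Python sketch shows a 20-node Gauss–Legendre approximation of an integral over a finite interval, using NumPy's `leggauss` routine. The function name is an assumption and the integrand is a stand-in; in the sampler, the integrand would be the product of the CD4 likelihood and the base-measure density.

```python
import numpy as np


def gauss_legendre_integral(f, a, b, n_nodes=20):
    """Approximate the integral of f over [a, b] with Gauss-Legendre nodes."""
    x, w = np.polynomial.legendre.leggauss(n_nodes)  # nodes/weights on [-1, 1]
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return half * np.sum(w * f(mid + half * x))      # affine map to [a, b]


def standard_normal_pdf(x):
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)


# Stand-in integrand: probability mass of N(0, 1) on [-1.96, 1.96].
mass = gauss_legendre_integral(standard_normal_pdf, -1.96, 1.96)
```

Because the integration range in $q_0$ is a bounded censoring interval, a fixed-node rule of this type needs only 20 evaluations of the integrand per subject and iteration.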

A.2. Update parameters in the CD4 count model

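We use Bayesian penalized splines [Ruppert, Wand and Carroll (2003)] with a truncated polynomial basis for approximating CD4 count profiles at both population level and individual level. Following Ruppert, Wand and Carroll (2003), $m_1(t)$, $m_0(t)$, $c_1(t)$, $c_0(t)$, $\gamma_i^m(t)$ and $\gamma_i^c(t)$ $(i = 1, \ldots, N)$ in (2.8) can be approximated by

$$m_1(t) = B(t)^T \beta_1, \qquad m_0(t) = B(t)^T \beta_2,$$
$$c_1(t) = A(t)^T \alpha_1, \qquad c_0(t) = A(t)^T \alpha_2,$$
$$\gamma_i^m(t) = \phi(t)^T b_i, \qquad \gamma_i^c(t) = \psi(t)^T a_i,$$

where $B(t) = (1, t, \ldots, t^p, (t - \nu_1)_+^p, \ldots, (t - \nu_{K_B})_+^p)^T$, $A(t) = (1, t, \ldots, t^p, (t - \xi_1)_+^p, \ldots, (t - \xi_{K_A})_+^p)^T$, $\phi(t) = (1, t, \ldots, t^p, (t - \eta_1)_+^p, \ldots, (t - \eta_{K_\phi})_+^p)^T$ and $\psi(t) = (1, t, \ldots, t^p, (t - \zeta_1)_+^p, \ldots, (t - \zeta_{K_\psi})_+^p)^T$ are truncated polynomial bases; $p \ge 1$ is an integer and $(d)_+^p = d^p \cdot I(d \ge 0)$. $(\nu_1, \ldots, \nu_{K_B})$, $(\xi_1, \ldots, \xi_{K_A})$, $(\eta_1, \ldots, \eta_{K_\phi})$ and $(\zeta_1, \ldots, \zeta_{K_\psi})$ are the corresponding knots; $(K_B, K_A, K_\phi, K_\psi)$ are the numbers of knots.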

Let

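$$\beta_1 = (\beta_{1,0}, \ldots, \beta_{1,p+K_B})^T, \qquad \beta_2 = (\beta_{2,0}, \ldots, \beta_{2,p+K_B})^T,$$
$$\alpha_1 = (\alpha_{1,0}, \ldots, \alpha_{1,p+K_A})^T, \qquad \alpha_2 = (\alpha_{2,0}, \ldots, \alpha_{2,p+K_A})^T,$$
$$b_i = (b_{i,0}, \ldots, b_{i,p+K_\phi})^T, \qquad a_i = (a_{i,0}, \ldots, a_{i,p+K_\psi})^T,$$

and $x_{ij} = t_{ij} - \upsilon_i$, then the proposed model in (2.8) can be rewritten as

$$y_{ij} \mid X_i^*, Z_i, \upsilon_i, t_{ij} = \begin{cases} B(x_{ij})^T \beta_1 + \phi(x_{ij})^T b_i + X_i^* \beta^* + e_{ij}, & \text{if } \upsilon_i \le T,\ Z_i = 1, \\ B(x_{ij})^T \beta_2 + \phi(x_{ij})^T b_i + X_i^* \beta^* + e_{ij}, & \text{if } \upsilon_i \le T,\ Z_i = 0, \\ A(t_{ij})^T \alpha_1 + \psi(t_{ij})^T a_i + X_i^* \beta^* + e_{ij}, & \text{if } \upsilon_i > T,\ Z_i = 1, \\ A(t_{ij})^T \alpha_2 + \psi(t_{ij})^T a_i + X_i^* \beta^* + e_{ij}, & \text{if } \upsilon_i > T,\ Z_i = 0. \end{cases}$$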

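We use the standard prior distributions for all parameters in the CD4 count model as follows: $p(\beta^*) \propto 1$; for $s = 0, \ldots, p$, $p(\beta_{1,s}) \propto 1$, $p(\beta_{2,s}) \propto 1$, $p(\alpha_{1,s}) \propto 1$, $p(\alpha_{2,s}) \propto 1$, $b_{i,s} \sim N(0, \sigma_{b_s}^2)$, $a_{i,s} \sim N(0, \sigma_{a_s}^2)$, $\sigma_{b_s}^2 \sim \mathrm{Gamma}(10^{-3}, 10^{-3})$ and $\sigma_{a_s}^2 \sim \mathrm{Gamma}(10^{-3}, 10^{-3})$; for $k = 1, \ldots, K_B$, $\beta_{1,p+k} \sim N(0, \sigma_{\beta_1}^2)$ and $\beta_{2,p+k} \sim N(0, \sigma_{\beta_2}^2)$; for $k = 1, \ldots, K_A$, $\alpha_{1,p+k} \sim N(0, \sigma_{\alpha_1}^2)$ and $\alpha_{2,p+k} \sim N(0, \sigma_{\alpha_2}^2)$; for $k = 1, \ldots, K_\phi$, $b_{i,p+k} \sim N(0, \sigma_b^2)$; for $k = 1, \ldots, K_\psi$, $a_{i,p+k} \sim N(0, \sigma_a^2)$; $\sigma_{\beta_1}^2, \sigma_{\beta_2}^2, \sigma_{\alpha_1}^2, \sigma_{\alpha_2}^2, \sigma_b^2, \sigma_a^2$ all follow the $\mathrm{Gamma}(10^{-3}, 10^{-3})$ distribution. Note that $\sigma_{\beta_1}^2, \sigma_{\beta_2}^2, \sigma_{\alpha_1}^2, \sigma_{\alpha_2}^2$ are smoothing parameters for the population penalized splines; $\sigma_b^2$ and $\sigma_a^2$ are smoothing parameters for individual penalized splines; $\sigma_{b_s}^2, \sigma_{a_s}^2$ $(s = 0, \ldots, p)$ are variance component parameters for random effects. Further, we assume $e_{ij} \sim N(0, \sigma^2)$ for all observations and $\sigma^2 \sim \mathrm{Gamma}(10^{-3}, 10^{-3})$.

Thus, the parameter vector $\theta$ includes $(\beta^*, \beta_1, \beta_2, \alpha_1, \alpha_2, b_i, a_i)$ and $(\sigma_{\beta_1}^2, \sigma_{\beta_2}^2, \sigma_{\alpha_1}^2, \sigma_{\alpha_2}^2, \sigma_b^2, \sigma_a^2, \sigma_{b_s}^2, \sigma_{a_s}^2, \sigma^2)$. Since all conditional posterior distributions for $\theta$ are in closed form, the Gibbs steps are straightforward.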
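To indicate what a closed-form Gibbs step for a coefficient block looks like, the Python sketch below draws from the standard Gaussian full conditional of a coefficient vector in a linear model with normal errors and independent zero-mean normal priors. It is a generic textbook update under stated assumptions, not the authors' exact step: the design matrix `C`, the vector `prior_precision` (zero for coefficients with flat priors, 1/variance for penalized spline coefficients) and the function name are illustrative.

```python
import numpy as np


def draw_coefficients(C, y, sigma2, prior_precision, rng):
    """Draw beta from N(m, V) with V = (C'C / sigma2 + D)^{-1} and
    m = V C'y / sigma2, where D = diag(prior_precision)."""
    precision = C.T @ C / sigma2 + np.diag(prior_precision)
    mean = np.linalg.solve(precision, C.T @ y / sigma2)
    chol = np.linalg.cholesky(precision)        # precision = L L'
    z = rng.standard_normal(len(mean))
    return mean + np.linalg.solve(chol.T, z)    # L'^{-1} z has covariance V
```

In a full sampler, `C` would stack the spline basis rows and covariates for all observations, and this draw would alternate with updates of the variance parameters and the imputed event times.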

A.3. Update parameters for DPP base measures G0H and G0W

The parameters λH and λW are updated from their full conditional distributions:

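$$[\lambda_H \mid Y_1, \ldots, Y_N, X_1, \ldots, X_N, T, H, W, L^H, R^H, L^V, R^V, \theta, \lambda_W] \sim \prod_{i \in I_H} g_0^H(h_i \mid Z_i, \upsilon_i, l_i^H, r_i^H; \lambda_H)\, f(\lambda_H),$$
$$[\lambda_W \mid Y_1, \ldots, Y_N, X_1, \ldots, X_N, T, H, W, L^H, R^H, L^V, R^V, \theta, \lambda_H] \sim \prod_{i \in I_W} g_0^W(w_i \mid Z_i, h_i, l_i^V, r_i^V; \lambda_W)\, f(\lambda_W),$$

where $I_H$ and $I_W$ are the subsets of indexes corresponding to the distinct $H_i$ and $W_i$, because the distinct $H_i$ and $W_i$ are random samples from $G_0^H$ and $G_0^W$, respectively [Blackwell and MacQueen (1973)]. In our case, $\lambda_H = (\mu_1^H, \mu_0^H, \tau_1^H, \tau_0^H)$ and $\lambda_W = (\mu_1^W, \mu_0^W, \tau_1^W, \tau_0^W)$ for the normal base measures; we assume $f(\mu_1^H, \mu_0^H, \tau_1^H, \tau_0^H) \propto (\tau_1^H \tau_0^H)^{-1}$ and $f(\mu_1^W, \mu_0^W, \tau_1^W, \tau_0^W) \propto (\tau_1^W \tau_0^W)^{-1}$. The conditional posterior distributions of $\lambda_H$ and $\lambda_W$ are both in closed form.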

Footnotes

1

Supported by NIH Grants R01-AI-50505, R01-HL-79457 and Grant U.1052.00.009 from the Medical Research Council (UK).

Contributor Information

Li Su, MRC Biostatistics Unit, Robinson Way, Cambridge CB2 0SR, UK, li.su@mrc-bsu.cam.ac.uk.

Joseph W. Hogan, Center for Statistical Sciences, Department of Community Health, Brown University, Box G-S121-7, Providence, Rhode Island 02912, USA, jhogan@stat.brown.edu

REFERENCES

  1. Barkan SE, Melnick SL, Preston-Martin S, Weber K, Kalish LA, Miotti P, Young M, Greenblatt R, Sacks H, Feldman J. The women’s interagency HIV study. WIHS collaborative study group. Epidemiology. 1998;9:117–125.
  2. Bell JE. An update on the neuropathology of HIV in the HAART era. Histopathology. 2004;45:549–559. doi: 10.1111/j.1365-2559.2004.02004.x.
  3. Blackwell D, MacQueen JB. Ferguson distributions via Pólya urn schemes. Ann. Statist. 1973;1:353–355. MR0362614.
  4. Calle ML, Gómez G. A semiparametric hierarchical method for a regression model with an interval-censored covariate. Austr. New Zeal. J. Statist. 2005;47:351–364. MR2169533.
  5. Chen MH, Herring AH, Ibrahim JG, Lipsitz SR. Missing-data methods for generalized linear models: A comparative review. J. Amer. Statist. Assoc. 2005;100:332–346. MR2166072.
  6. De Gruttola V, Lagakos SW. Analysis of doubly-censored survival data, with application to AIDS. Biometrics. 1989;45:1–11. MR0999438.
  7. Escobar MD. Estimating normal means with a Dirichlet process prior. J. Amer. Statist. Assoc. 1994;89:268–277. MR1266299.
  8. Ferguson TS. A Bayesian analysis of some nonparametric problems. Ann. Statist. 1973;1:209–230. MR0350949.
  9. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. 2nd ed. Boca Raton, FL: Chapman & Hall/CRC Press; 2003. MR1385925.
  10. Greub G, Ledergerber B, Battegay M, Grob P, Perrin L, Furrer H, Burgisser P, Erb P, Boggian K, Piffaretti JC, Hirschel B, Janin P, Francioli P, Flepp M, Telenti A. Clinical progression, survival, and immune recovery during antiretroviral therapy in patients with HIV-1 and hepatitis C virus coinfection: The Swiss HIV Cohort Study. Lancet. 2000;356:1800–1805. doi: 10.1016/s0140-6736(00)03232-3.
  11. Jacobson LP, Phair JP, Yamashita TE. Update on the virologic and immunologic response to highly active antiretroviral therapy. Current Infectious Disease Reports. 2004;6:325–332. doi: 10.1007/s11908-004-0055-9.
  12. Kaslow RA, Ostrow DG, Detels R, Phair JP, Polk BF, Rinaldo CR Jr. The Multicenter AIDS Cohort Study: Rationale, organization, and selected characteristics of the participants. Am. J. Epidemiol. 1987;126:310–318. doi: 10.1093/aje/126.2.310.
  13. Ledergerber B, von Overbeck J, Egger M, Luthy R. The Swiss HIV Cohort Study: Rationale, organization and selected baseline characteristics. Soz Praventivmed. 1994;39:387–394. doi: 10.1007/BF01299670.
  14. Lederman MM, Connick E, Landay A, Kuritzkes DR, Spritzler J, StClair M, Kotzin BL, Fox L, Chiozzi MH, Leonard JM, Rousseau F, Wade M, Roe JD, Martinez A, Kessler H. Immunologic responses associated with 12 weeks of combination antiretroviral therapy consisting of zidovudine, lamivudine and ritonavir: Results of AIDS clinical trials group protocol 315. J. Infect. Dis. 1998;178:70–79. doi: 10.1086/515591.
  15. Liang H, Wu H, Carroll RJ. The relationship between virologic and immunologic responses in AIDS clinical research using mixed-effects varying-coefficient models with measurement error. Biostatistics. 2003;4:297–312. doi: 10.1093/biostatistics/4.2.297.
  16. Little RJA, Rubin DB. Statistical Analysis with Missing Data. 2nd ed. New York: Wiley; 2002.
  17. Miller MF, Haley C, Koziel MJ, Rowley CF. Impact of hepatitis C virus on immune restoration in HIV-infected patients who start highly active antiretroviral therapy: A meta-analysis. Clin. Infect. Dis. 2005;41:713–720. doi: 10.1086/432618.
  18. Mocroft A, Neaton J, Bebchuk J, Staszewski S, Antunes F, Knysz B, Law M, Phillips AN, Lundgren JD. The feasibility of clinical endpoint trials in HIV infection in the highly active antiretroviral treatment (HAART) era. Clin. Trials. 2006;3:119–132. doi: 10.1191/1740774506cn138oa.
  19. Oller R, Calle ML, Gómez G. Interval censoring: Model characterizations for the validity of the simplified likelihood. Canad. J. Statist. 2004;32:315–326. MR2101759.
  20. Pantazis N, Touloumi G, Walker AS, Babiker AG. Bivariate modelling of longitudinal measurements of two human immunodeficiency type 1 disease progression markers in the presence of informative drop-outs. Appl. Statist. 2005;54:405–423. MR2135882.
  21. Ramsay JO, Li X. Curve registration. J. Roy. Statist. Soc. Ser. B. 1998;60:351–363. MR1616045.
  22. Ramsay J, Silverman BW. Functional Data Analysis. New York: Springer; 2005. MR2168993.
  23. Rockstroh JK. Influence of viral hepatitis on HIV infection. J. Hepatol. 2006;44:S25–S27. doi: 10.1016/j.jhep.2005.11.007.
  24. Rockstroh JK, Mocroft A, Soriano V, Tural C, Losso MH, Horban A, Kirk O, Phillips A, Ledergerber B, Lundgren J, Group ES. Influence of hepatitis C virus infection on HIV-1 disease progression and response to highly active antiretroviral therapy. J. Infect. Dis. 2005;192:992–1002. doi: 10.1086/432762.
  25. Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression. Cambridge: Cambridge Univ. Press; 2003. MR1998720.
  26. Sherman KE, Rouster SD, Chung RT, Rajicic N. Hepatitis C virus prevalence among patients infected with human immunodeficiency virus: A cross-sectional analysis of the US adult AIDS Clinical Trials Group. Clin. Infect. Dis. 2002;34:831–837. doi: 10.1086/339042.
  27. Smith DK, Warren DL, Vlahov D, Schuman P, Stein MD, Greenberg BL, Holmberg SD. Design and baseline participant characteristics of the Human Immunodeficiency Virus Epidemiology Research (HER) study: A prospective cohort study of human immunodeficiency virus infection in US women. Amer. J. Epidemiol. 1997;146:459–469. doi: 10.1093/oxfordjournals.aje.a009299.
  28. Stebbing J, Waters L, Mandalia S, Bower M, Nelson M, Gazzard B. Hepatitis C virus infection in HIV type 1-infected individuals does not accelerate a decrease in the CD4+ cell count but does increase the likelihood of AIDS-defining events. Clin. Infect. Dis. 2005;41:906–911. doi: 10.1086/432885.
  29. Sulkowski MS, Moore RD, Mehta SH, Chaisson RE, Thomas DL. Hepatitis C and progression of HIV disease. J. Amer. Med. Assoc. 2002;288:199–206. doi: 10.1001/jama.288.2.199.
  30. Sullivan PS, Hanson DL, Teshale EH, Wotring LL, Brooks JT. Effect of hepatitis C infection on progression of HIV disease and early response to initial antiretroviral therapy. AIDS. 2006;20:1171–1179. doi: 10.1097/01.aids.0000226958.87471.48.
  31. Sun J. The Statistical Analysis of Interval-Censored Failure Time Data. New York: Springer; 2006. MR2287318.
  32. Taylor JMG, Cumberland WG, Sy JP. A stochastic model for analysis of longitudinal AIDS data. J. Amer. Statist. Assoc. 1994;89:727–736.
  33. The MathWorks Inc. Using MATLAB. Natick, MA: 1997.
  34. Thiébaut R, Jacqmin-Gadda H, Babiker A, Commenges D, The CASCADE Collaboration. Joint modelling of bivariate longitudinal data with informative dropout and left-censoring, with application to the evolution of CD4+ cell count and HIV RNA viral load in response to treatment of HIV infection. Statist. Med. 2005;24:65–82. doi: 10.1002/sim.1923. MR2134496.
  35. Wu H. Statistical methods for HIV dynamic studies in AIDS clinical trials. Statist. Methods Med. Res. 2005;14:1–22. doi: 10.1191/0962280205sm390oa. MR2135921.
