Author manuscript; available in PMC: 2017 Jun 15.
Published in final edited form as: Biometrics. 2016 Dec 5;73(2):441–451. doi: 10.1111/biom.12626

Generalized Semiparametric Varying-Coefficient Model for Longitudinal Data with Applications to Adaptive Treatment Randomizations

Li Qi 1, Yanqing Sun 2, Peter B Gilbert 3,4
PMCID: PMC5459686  NIHMSID: NIHMS839304  PMID: 27918612

SUMMARY

This paper investigates a generalized semiparametric varying-coefficient model for longitudinal data that can flexibly model three types of covariate effects: time-constant effects, time-varying effects, and covariate-varying effects. Different link functions can be selected to provide a rich family of models for longitudinal data. The model assumes that the time-varying effects are unspecified functions of time and the covariate-varying effects are parametric functions of an exposure variable specified up to a finite number of unknown parameters. The estimation procedure is developed using local linear smoothing and profile weighted least squares estimation techniques. Hypothesis testing procedures are developed to test the parametric functions of the covariate-varying effects. The asymptotic distributions of the proposed estimators are established. A working formula for bandwidth selection is discussed and examined through simulations. Our simulation study shows that the proposed methods have satisfactory finite sample performance. The proposed methods are applied to the ACTG 244 clinical trial of HIV infected patients being treated with Zidovudine to examine the effects of antiretroviral treatment switching before and after HIV develops the T215Y/F drug resistance mutation. Our analysis shows benefits of treatment switching to the combination therapies as compared to continuing with ZDV monotherapy before and after developing the 215-mutation.

Keywords: ACTG 244 AIDS clinical trial, Covariate-varying effects, Link function, Longitudinal data analysis, Hypothesis tests, Adaptive treatment randomization, Profile weighted least squares

1. Introduction

Longitudinal data are common in medical and public health research. In AIDS clinical trials of HIV infected patients, for example, viral loads and CD4 counts are measured repeatedly during the course of studies. These biomarkers have long been known to be prognostic for both secondary HIV transmission and progression to clinical disease, cf. Mellors et al. (1997). Semiparametric regression models for longitudinal data have been intensively studied, in which the covariate effects are constant over time for some covariates and time-varying for others; see recent works by Fan and Li (2004), Qu and Li (2006), Fan et al. (2007) and Sun et al. (2013) among others. Semiparametric time-varying coefficient models allow effective simultaneous modeling of both types of covariate effects. However, in many applications there may be a third type of covariate effect – the effect that varies with an exposure variable. The statistical methods developed here apply for examining the possible exposure-varying effects of the adaptive treatment randomization strategy on longitudinal biomarkers such as CD4 cell counts and HIV viral load, with the exposure modifying variable the time since randomization.

A motivating example is a historical case study of antiretroviral treatment regimens, the ACTG 244 clinical trial. Zidovudine (ZDV) was the first drug approved for treatment of HIV infection. Initial approval was based on evidence of a short-term survival advantage over placebo when zidovudine was given to patients with advanced HIV disease. Shortly after that, zidovudine resistance was associated with disease progression measured by a rise in plasma virus and decline in CD4 cell counts in both children and adults receiving zidovudine monotherapy, cf. Japour et al. (1995). Subsequent studies suggested benefits of switching patients to treatments that combined ZDV with didanosine (ddI) or with ddI plus nevirapine (NVP). ACTG 244 enrolled subjects receiving ZDV monotherapy and monitored their HIV in plasma bi-monthly for the T215Y/F mutation. When a subject’s viral population developed the 215 mutation, the subject was randomized to continue ZDV, add ddI or add ddI plus NVP. An important question is whether the treatment switching has any beneficial effects in treating the HIV infected patients.

We investigate the effect of the adaptive treatment randomization under the generalized semiparametric model for longitudinal data with a general link function. Our model considers three types of covariate effects: time-varying effects, covariate-varying effects and constant effects. The covariate-varying effects are specified as parametric functions of effect modifiers while the time-varying effects are modeled as nonparametric functions. While the nonparametric approach for the covariate-varying effects is more flexible, parametric modeling also has its advantages, in particular when the dimension of the covariate that it depends on is moderately high. The parametric approach avoids the curse of dimensionality. Parametric forms are also more interpretable when models are built on knowledge of the underlying biological processes. Although in some cases one can include linear interaction terms to account for covariate-varying effects, this is limited to situations where the covariate-varying effects are linear functions of the exposure variables. It is also difficult to make inference for individual model parameters separately when higher order interactions are present. We develop hypothesis testing procedures to test the parametric functions of the covariate-varying effects. To the best of our knowledge, such procedures do not exist for longitudinal analysis. The generalized semiparametric model allows selection of different link functions. Thus our methods can be applied to continuous as well as categorical longitudinal responses.

The rest of the paper is organized as follows. In Section 2, we introduce the generalized semiparametric varying-coefficient regression model. The profile local linear estimation method for the proposed model is developed in Section 3. In Section 4, we establish the asymptotic results for the nonparametric and parametric estimators. The hypothesis testing procedures are developed in Section 5. The finite sample performances of the proposed estimators and test statistics are examined in simulations in Section 6. The methods are applied to the ACTG 244 data in Section 7. Concluding remarks are given in Section 8.

2. Generalized semiparametric varying-coefficient model

Suppose there is a random sample of $n$ subjects and $\tau$ is the end of follow-up. The longitudinal responses $Y_i(t)$ for subject $i$ are observed at the sampling time points $0 \le T_{i1} < T_{i2} < \cdots < T_{in_i} \le \tau$, where $n_i$ is the total number of observations from subject $i$. The sampling times can be irregular and dependent on covariates. In addition, some subjects may drop out of the study early. Let $N_i(t)=\sum_{j=1}^{n_i} I(T_{ij} \le t)$ be the number of observations taken from the $i$th subject by time $t$, where $I(\cdot)$ is the indicator function. Let $C_i$ be the end of follow-up time or censoring time, whichever comes first. The responses for subject $i$ can only be observed at time points before $C_i$. Thus $N_i(t)$ can be written as $N_i^*(t \wedge C_i)$, where $N_i^*(t)$ is the counting process of potential sampling times. Let $X_i(t)$ and $U_i(t)$ be $p$- and $r$-dimensional vectors of possibly time-dependent covariates, respectively. Assume that $\{Y_i(\cdot), X_i(\cdot), U_i(\cdot), N_i(\cdot),\ i = 1, \cdots, n\}$ are independent identically distributed (iid) random processes. The censoring time $C_i$ is noninformative in the sense that $E\{dN_i^*(t) \mid X_i(t), U_i(t), C_i \ge t\} = E\{dN_i^*(t) \mid X_i(t), U_i(t)\}$ and $E\{Y_i(t) \mid X_i(t), U_i(t), C_i \ge t\} = E\{Y_i(t) \mid X_i(t), U_i(t)\}$. Assume that $dN_i^*(t)$ is independent of $Y_i(t)$ conditional on $X_i(t)$, $U_i(t)$ and $C_i \ge t$. The censoring time $C_i$ is allowed to depend on $X_i(\cdot)$ and $U_i(\cdot)$.

Suppose that the covariates $X_i(t)=(X_{1i}^T(t), X_{2i}^T(t), X_{3i}^T(t))^T$ consist of three parts, $X_{1i}(t)$, $X_{2i}(t)$ and $X_{3i}(t)$, of dimensions $p_1$, $p_2$ and $p_3$, respectively, each of which has a different role in the model. We study the following generalized semiparametric varying-coefficient model:

$$\mu_i(t) = E\{Y_i(t) \mid X_i(t), U_i(t)\} = g^{-1}\{\alpha^T(t)X_{1i}(t) + \beta^T X_{2i}(t) + \gamma^T(U_i(t), \theta)X_{3i}(t)\} \tag{1}$$

for $0 \le t \le \tau$, where $g(\cdot)$ is a known link function, $\alpha(\cdot)$ is a $p_1$-dimensional vector of completely unspecified functions representing the time-varying effects of $X_{1i}(t)$, $\beta$ is a $p_2$-dimensional vector of parameters, $\theta$ is a $q$-dimensional vector of parameters, and $\gamma(u, \theta)$ is a $p_3$-dimensional vector of possibly nonlinear parametric functions defined on the range $\mathcal{U}$ of $U_i(\cdot)$. Setting the first component of $X_{1i}(t)$ to 1 gives a nonparametric baseline function. The function $\gamma(u) = \gamma(u, \theta)$ is the effect of $X_{3i}(t)$ at covariate level $U_i(t) = u$. Both categorical and continuous longitudinal responses can be modelled with appropriately chosen link functions. For example, the identity and logarithm link functions can be used for continuous response variables while the logit link function can be used for binary responses.
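As a rough illustration of how model (1) maps covariates to a conditional mean, the following Python sketch evaluates the right-hand side at a single time point under the three links mentioned above. The names `INVERSE_LINKS`, `conditional_mean` and `gamma_exp` are illustrative assumptions and not part of the authors' software; the exponential form of γ is the one used later in the simulation study.

```python
import numpy as np

# Inverse link functions g^{-1} for the identity, logarithm and logit links.
INVERSE_LINKS = {
    "identity": lambda eta: eta,
    "log": np.exp,
    "logit": lambda eta: 1.0 / (1.0 + np.exp(-eta)),
}

def gamma_exp(u, theta):
    """Example covariate-varying effect gamma(u, theta) = theta1 * exp(-theta2 * u)."""
    return np.array([theta[0] * np.exp(-theta[1] * u)])

def conditional_mean(alpha_t, beta, gamma_u, x1, x2, x3, link="identity"):
    """Evaluate g^{-1}{alpha(t)'X1 + beta'X2 + gamma(U, theta)'X3} at one time point.

    alpha_t is alpha(t) already evaluated at t; gamma_u is gamma(U(t), theta).
    """
    eta = np.dot(alpha_t, x1) + np.dot(beta, x2) + np.dot(gamma_u, x3)
    return INVERSE_LINKS[link](eta)

# Usage: intercept-only X1, one constant-effect covariate, one exposure-varying effect.
mu = conditional_mean([0.5], [0.1], gamma_exp(0.0, (1.0, 0.5)), [1.0], [1.0], [1.0], "logit")
```

Swapping the `link` argument changes only the final transformation; the linear predictor is shared across links.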

For the motivating example of the ACTG 244 study, it is of interest to know how biomarkers such as viral load and CD4 cell count respond to the new treatments. The effects of the new treatments are likely to depend on the time elapsed since the switching, $U_i(t) = t - S_i$, where $t$ is the time since initiation of antiretroviral therapy (ART) and $S_i$ is the time of treatment randomization. Letting $X_{3i}(t) = I(t > S_i)$ in model (1), $\gamma(u)$ represents the change in the conditional mean response at time $u$ after treatment randomization, adjusting for the other covariates $X_{1i}(t)$ and $X_{2i}(t)$. On the other hand, if we let $X_{3i}(t) = X_{3i}^o(t) I(t > S_i)$, where $X_{3i}^o(t)$ are the indicators for the new treatments after randomization, then $\gamma(u)$ are the effects of the new treatments starting from treatment switching.

3. Statistical estimation

3.1 Estimation procedure

This section develops the estimation procedure for model (1). The approach utilizes the local linear estimation technique which has been shown to be design-adaptive and more efficient in correcting boundary bias than the kernel smoothing approach (Fan and Gijbels, 1996).

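At each $t_0$, let $\alpha(t) = \alpha(t_0) + \dot\alpha(t_0)(t - t_0) + O((t - t_0)^2)$ be the first order Taylor expansion of $\alpha(t)$ for $t \in \mathcal{N}_{t_0}$, a neighborhood of $t_0$, where $\dot\alpha(t_0)$ is the derivative of $\alpha(t)$ at $t = t_0$. Denote $\alpha^*(t_0) = (\alpha^T(t_0), \dot\alpha^T(t_0))^T$ and $X_{1i}^*(t, t - t_0) = X_{1i}(t) \otimes (1, t - t_0)^T$, where $\otimes$ is the Kronecker product. Let $\zeta = (\beta^T, \theta^T)^T$. For $t \in \mathcal{N}_{t_0}$, model (1) can be approximated by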

$$\mu(t, t_0, \alpha^*(t_0), \zeta \mid X_i, U_i) = g^{-1}\{\alpha^{*T}(t_0)X_{1i}^*(t, t - t_0) + \beta^T X_{2i}(t) + \gamma^T(U_i(t), \theta)X_{3i}(t)\}. \tag{2}$$

The local linear estimating function for α* (t0) at each t0 and for fixed ζ is given by

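$$U_\alpha(\alpha^*; \zeta, t_0) = \sum_{i=1}^n \int_0^\tau W_i(t)\left[Y_i(t) - \mu(t, t_0, \alpha^*(t_0), \zeta \mid X_i, U_i)\right] X_{1i}^*(t, t - t_0) K_h(t - t_0)\, dN_i(t), \tag{3}$$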

where Wi(t) = W (t, Xi(t), Ui(t)) is a nonnegative weight process, K(·) is a kernel function, h = hn > 0 is a bandwidth parameter and Kh(·) = K(·/h)/h. The solution to the equation Uα(α*; ζ, t0) = 0 is denoted by α̃*(t0, ζ).
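Under the identity link and for fixed ζ, setting the estimating function (3) to zero reduces to a kernel-weighted least squares problem. The sketch below is a minimal illustration of that special case, not the authors' implementation: the function names, the pooled-array layout and the choice of the Epanechnikov kernel are assumptions, and `offset` stands for the already-evaluated parametric part β'X2 + γ'X3.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def local_linear_alpha(t0, h, times, y, x1, offset, w=None):
    """Solve (3) = 0 at t0 for the identity link with zeta held fixed.

    times, y, offset: (m,) arrays pooled over all subjects and visits.
    x1: (m, p1) array of X1 covariates; w: optional (m,) nonnegative weights.
    Returns (alpha(t0), alpha_dot(t0)).
    """
    times = np.asarray(times, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    d = times - t0
    k = epanechnikov(d / h) / h                      # K_h(t - t0)
    if w is None:
        w = np.ones_like(times)
    # Columns ordered as (X1, X1 * (t - t0)), so the solution is (alpha, alpha_dot);
    # this is a column reordering of X1 (x) (1, t - t0)'.
    design = np.hstack([x1, x1 * d[:, None]])
    wk = w * k
    lhs = design.T @ (design * wk[:, None])
    rhs = design.T @ (wk * (np.asarray(y, dtype=float) - offset))
    sol = np.linalg.solve(lhs, rhs)
    p1 = x1.shape[1]
    return sol[:p1], sol[p1:]

# Usage: an exactly linear alpha(t) = 2 + 3t is recovered without smoothing bias.
t = np.linspace(0.0, 1.0, 50)
a, a_dot = local_linear_alpha(0.5, 0.3, t, 2.0 + 3.0 * t, np.ones((50, 1)), np.zeros(50))
```

For a non-identity link the same equation has no closed form and would need a Newton-type iteration at each `t0`.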

Let α̃ (t, ζ) be the first p1 components of α̃*(t, ζ). The profile weighted least squares estimator ζ̂ is obtained by solving the following estimating function

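$$U_\zeta(\zeta) = \sum_{i=1}^n \int_{t_1}^{t_2} W_i(t)\left[Y_i(t) - g^{-1}\{\tilde\alpha^T(t, \zeta)X_{1i}(t) + \eta^T(U_i(t), \zeta)X_{2i}^*(t)\}\right] \left\{\left(\frac{\partial\tilde\alpha(t, \zeta)}{\partial\zeta}\right)^T X_{1i}(t) + \left(\frac{\partial\eta(U_i(t), \zeta)}{\partial\zeta}\right)^T X_{2i}^*(t)\right\} dN_i(t), \tag{4}$$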

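where $\eta(U_i(t), \zeta) = (\beta^T, \gamma^T(U_i(t), \theta))^T$, $\partial\eta(U_i(t), \zeta)/\partial\zeta = \mathrm{diag}\{I_{p_2}, \partial\gamma(U_i(t), \theta)/\partial\theta\}$, $X_{2i}^*(t) = (X_{2i}^T(t), X_{3i}^T(t))^T$, and $\partial\tilde\alpha(t, \zeta)/\partial\zeta$ is the first $p_1$ rows of $\partial\tilde\alpha^*(t, \zeta)/\partial\zeta$. We take $[t_1, t_2] \subset (0, \tau)$ in order to avoid possible instability near the boundary. The profile estimator of $\alpha(t_0)$ is obtained through substitution as $\hat\alpha(t_0) = \tilde\alpha(t_0, \hat\zeta)$.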

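The partial derivatives $\partial\tilde\alpha^*(t, \zeta)/\partial\zeta$ needed to evaluate (4) can be expressed in terms of the partial derivatives of $U_\alpha(\alpha^*; \zeta, t)$ at $\alpha^* = \tilde\alpha^*(t, \zeta)$. Specifically, since $U_\alpha(\tilde\alpha^*(t, \zeta); \zeta, t) \equiv 0_{2p_1}$, it follows that $\tilde\alpha^*(t, \zeta)$ satisfies $\left\{\frac{\partial U_\alpha(\alpha^*; \zeta, t)}{\partial\alpha^*}\frac{\partial\tilde\alpha^*(t, \zeta)}{\partial\zeta} + \frac{\partial U_\alpha(\alpha^*; \zeta, t)}{\partial\zeta}\right\}\Big|_{\alpha^* = \tilde\alpha^*(t, \zeta)} = 0_{2p_1}$. Let $\varphi(\cdot) = g^{-1}(\cdot)$ be the inverse function of the link function $g(\cdot)$ and $\dot\varphi(\cdot)$ be the derivative of $\varphi(\cdot)$. Then,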

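$$\frac{\partial\tilde\alpha^*(t, \zeta)}{\partial\zeta} = -\left\{\frac{\partial U_\alpha(\alpha^*; \zeta, t)}{\partial\alpha^*}\right\}^{-1} \frac{\partial U_\alpha(\alpha^*; \zeta, t)}{\partial\zeta}\Bigg|_{\alpha^* = \tilde\alpha^*(t, \zeta)}, \tag{5}$$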

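where, writing $v^{\otimes 2} = vv^T$ for a column vector $v$, $\partial U_\alpha(\alpha^*; \zeta, t_0)/\partial\alpha^* = -\sum_{i=1}^n \int_0^\tau W_i(t)\,\dot\mu_i(t_0, t, \zeta)\,\{X_{1i}^*(t, t - t_0)\}^{\otimes 2} K_h(t - t_0)\, dN_i(t)$ and $\partial U_\alpha(\alpha^*; \zeta, t_0)/\partial\zeta = -\sum_{i=1}^n \int_0^\tau W_i(t)\,\dot\mu_i(t_0, t, \zeta)\, X_{1i}^*(t, t - t_0)\{X_{2i}^*(t)\}^T \{\partial\eta(U_i(t), \zeta)/\partial\zeta\} K_h(t - t_0)\, dN_i(t)$. Here, $\dot\mu_i(t_0, t, \zeta) = \dot\varphi\{\alpha^{*T}(t_0)X_{1i}^*(t, t - t_0) + \eta^T(U_i(t), \zeta)X_{2i}^*(t)\}$.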

When the link function is the identity function, α̃*(t0, ζ) can be solved explicitly as the root of the estimating function (3). Under a general link function, α̃*(t0, ζ) can be solved using an iterative algorithm. The estimation procedure iteratively updates estimates of the nonparametric component α̃*(t0, ζ) and the parametric component ζ. Specifically, the estimators α̂(t0) and ζ̂ are obtained through the following iterated algorithm:

Computational algorithm

  1. Let α̂(t){0} and ζ̂{0} be initial values.

  2. For each jump point of {Ni(·), i = 1, ⋯ , n}, say t, the mth step estimator α̂*{m} (t) = α̂*(t, ζ̂{m−1}) is the root of the estimating function (3) satisfying Uα(α̂*{m} (t), ζ̂{m−1}, t) = 0, where ζ̂{m−1} is the estimate of ζ at the (m−1)th step.

  3. The mth step estimator ζ̂{m} is the root of Uζ(ζ) = 0 defined in (4) obtained after replacing α̃ (t, ζ) with α̂{m} (t), where α̂{m} (t) is the first p1 components of α̂*{m} (t).

  4. Repeating steps 2 and 3, the estimators α̂*{m} (t) and ζ̂{m} are updated at each iteration until convergence. The estimates ζ̂ and α̂(t) are ζ̂{m} and the first p1 components of α̂*{m} (t), respectively, at convergence.
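The alternation in steps 1 to 4 can be sketched in Python for a deliberately simplified case: identity link, a scalar nonparametric intercept α(t), and a parametric part that is linear in ζ. This is a backfitting-style stand-in, not the profile estimator of (4), because it drops the ∂α̃/∂ζ term from the ζ update, and it is not the authors' Matlab/C++ program. All function names and the kernel choice are assumptions made for illustration.

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def smooth_intercept(times, resid, h):
    """Local linear estimate of a scalar alpha(t) at every observation time."""
    out = np.empty(len(times))
    for k, t0 in enumerate(times):
        d = times - t0
        w = epanechnikov(d / h) / h
        design = np.column_stack([np.ones_like(d), d])
        lhs = design.T @ (design * w[:, None])
        rhs = design.T @ (w * resid)
        out[k] = np.linalg.solve(lhs, rhs)[0]        # intercept = alpha(t0)
    return out

def alternate_fit(times, y, z, h, max_iter=20, tol=1e-6):
    """Alternate between alpha(t) given zeta (step 2) and zeta given alpha (step 3)."""
    zeta = np.zeros(z.shape[1])                      # step 1: initial value
    for _ in range(max_iter):
        alpha = smooth_intercept(times, y - z @ zeta, h)          # step 2
        new_zeta = np.linalg.lstsq(z, y - alpha, rcond=None)[0]   # step 3 (simplified)
        converged = np.max(np.abs(new_zeta - zeta)) < tol         # step 4
        zeta = new_zeta
        if converged:
            break
    return alpha, zeta

# Usage on synthetic data: alpha(t) = sin(t), zeta = 0.5.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 3.0, 300))
z = rng.standard_normal((300, 1))
y = np.sin(times) + 0.5 * z[:, 0] + 0.1 * rng.standard_normal(300)
alpha_hat, zeta_hat = alternate_fit(times, y, z, h=0.4)
```

With a nonlinear γ(u, θ) or a non-identity link, step 3 would require a root-finder for (4) rather than a single least squares solve.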

In our numerical study, the main program of the estimation procedure is implemented using Matlab while part of the program that solves the estimating equation (3) for α(t) at all jump points is implemented using C++ to save computing time. The estimators usually converge in 5 iterations, and take less than one minute for a sample of size 400.

3.2 Bandwidth selection

The optimal theoretical bandwidth is difficult to achieve since it would involve estimating the second derivative α̈(t); see Theorem 2 in the next section and also Fan and Gijbels (1996) and Sun et al. (2013). In practice, the appropriate bandwidth can be based on a cross-validation method. This approach is widely used in the nonparametric function estimation literature; see Rice and Silverman (1991) for a leave-one-subject-out cross-validation approach, and Tian et al. (2005) for a K-fold cross-validation approach for survival data. The K-fold cross-validation approach in the current setting divides the data into K approximately equal-sized groups. With Dk denoting the kth subgroup of data, the kth prediction error is given by

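$$PE_k(h) = \sum_{i \in D_k}\left\{\int_{t_1}^{t_2} W_i(t)\left[Y_i(t) - g^{-1}\{\hat\alpha_{(-k)}^T(t)X_{1i}(t) + \eta^T(U_i(t), \hat\zeta_{(-k)})X_{2i}^*(t)\}\right] dN_i(t)\right\}^2,$$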

for $k = 1, \cdots, K$, where $\hat\alpha_{(-k)}(t)$ and $\hat\zeta_{(-k)}$ are estimated using the data from all subgroups other than $D_k$. The K-fold cross-validation bandwidth is obtained by minimizing the total prediction error $PE(h) = \sum_{k=1}^K PE_k(h)$ with respect to $h$.
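A minimal Python sketch of this subject-level K-fold criterion follows. It assumes unit weights and takes the model-fitting and prediction routines as user-supplied callables; `fit`, `predict`, the dictionary layout of `subjects` and the function names are hypothetical placeholders rather than an existing API.

```python
import numpy as np

def kfold_prediction_error(h, subjects, fit, predict, k_folds=3,
                           t1=0.0, t2=np.inf, seed=0):
    """Total prediction error PE(h) = sum_k PE_k(h) with unit weights.

    subjects: list of dicts holding at least arrays 'times' and 'y'.
    fit(train_subjects, h) -> model; predict(model, subject) -> fitted means.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(subjects)), k_folds)
    total = 0.0
    for fold in folds:
        held_out = set(fold.tolist())
        train = [s for i, s in enumerate(subjects) if i not in held_out]
        model = fit(train, h)                        # estimates without fold D_k
        for i in fold:
            s = subjects[i]
            keep = (s["times"] >= t1) & (s["times"] <= t2)
            resid = s["y"][keep] - predict(model, s)[keep]
            total += resid.sum() ** 2                # squared integrated residual
    return total

def select_bandwidth(grid, subjects, fit, predict, **kwargs):
    """Return the bandwidth in `grid` that minimizes PE(h)."""
    errors = [kfold_prediction_error(h, subjects, fit, predict, **kwargs) for h in grid]
    return grid[int(np.argmin(errors))]
```

Folds are formed over subjects rather than over individual visits, so that all observations from one subject are held out together.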

We also investigated an alternative bandwidth selection method. The bandwidth selection formula $h = C\hat\sigma_T n^{-1/3}$ has been examined for nonparametric density estimation and for semiparametric failure time regression in Jones et al. (1991) and Zhou and Wang (2000) among others, where $C$ is a constant and $\hat\sigma_T$ is the estimated standard deviation of the sampling times in the domain of the nonparametric functions to be estimated. To adopt the formula for longitudinal data, we note that the observation times $\{T_{ij}, j = 1, \ldots, n_i\}$ for a subject $i$ are likely dependent. Suppose that $\phi_i$ is the random effect that induces such dependence. Then the variance of the observation times can be expressed as $\sigma_T^2 = \mathrm{Var}(T_{ij}) = E\{\mathrm{Var}(T_{ij} \mid \phi_i)\} + \mathrm{Var}\{E(T_{ij} \mid \phi_i)\}$, which can be estimated by $\hat\sigma_T^2 = n^{-1}\sum_{i=1}^n S_i^a + S^b$, where $S_i^a$ is the within-subject sample variance of $\{T_{ij}, j = 1, \ldots, n_i\}$ and $S^b$ is the between-subject sample variance of $\bar T_{i\cdot} = n_i^{-1}\sum_{j=1}^{n_i} T_{ij}$ for $i = 1, \ldots, n$.

Our numerical study of both bandwidth selection methods shows that the bandwidth $h = C\hat\sigma_T n^{-1/3}$ with $C$ in the range from 3 to 5 is close to the bandwidth selected using K-fold cross-validation for $K$ in the range of 3 to 10. Our study used the K-fold cross-validation bandwidth selection as the benchmark to calibrate the constant $C$. The simulation results presented in Section 6 using $C = 4$ suggest that the formula $h = 4\hat\sigma_T n^{-1/3}$ works well. A larger $C$ can be used if the distribution of the sampling times is skewed or sparse in some areas.
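The working formula is simple to compute from the observation times alone. The Python sketch below follows the within-plus-between variance decomposition described above; the function name is illustrative, and giving zero within-subject variance to subjects with a single visit is an assumption the text does not address.

```python
import numpy as np

def bandwidth_rule(times_by_subject, c=4.0):
    """Working bandwidth h = C * sigma_T_hat * n^(-1/3).

    times_by_subject: list with one sequence of observation times per subject.
    """
    n = len(times_by_subject)
    # Within-subject sample variances S_i^a (assumed 0 for single-visit subjects).
    within = [np.var(t, ddof=1) if len(t) > 1 else 0.0 for t in times_by_subject]
    # Between-subject sample variance S^b of the subject-level mean times.
    between = np.var([np.mean(t) for t in times_by_subject], ddof=1)
    sigma_t = np.sqrt(np.mean(within) + between)
    return c * sigma_t * n ** (-1.0 / 3.0)

# Usage with two subjects observed at times {0, 2} and {1, 3}.
h = bandwidth_rule([[0.0, 2.0], [1.0, 3.0]], c=4.0)
```

As a back-of-the-envelope check on scale, a bandwidth of h = 0.54 at n = 400 with C = 4, as reported in the simulation section, corresponds to a σ̂_T of roughly 0.99.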

4. Asymptotic properties

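Let $\zeta_0$ and $\alpha_0(t)$ be the true values of $\zeta$ and $\alpha(t)$ under model (1), respectively. Let $\mu_i(t) = \varphi\{\alpha_0^T(t)X_{1i}(t) + \eta^T(U_i(t), \zeta_0)X_{2i}^*(t)\}$, $\dot\mu_i(t) = \dot\varphi\{\alpha_0^T(t)X_{1i}(t) + \eta^T(U_i(t), \zeta_0)X_{2i}^*(t)\}$ and $\varepsilon_i(t) = Y_i(t) - \mu_i(t)$. Let $w_i(t) = w(t, X_i(t), U_i(t))$, where $w(t, x, u)$ is the deterministic limit of $W(t, x, u)$ in probability as $n \to \infty$. Define $e_{11}(t) = E[w_i(t)\dot\mu_i(t)X_{1i}(t)^{\otimes 2}\lambda_i(t)\xi_i(t)]$ and $e_{12}(t) = E[w_i(t)\dot\mu_i(t)X_{1i}(t)\{X_{2i}^*(t)\}^T\{\partial\eta(U_i(t), \zeta_0)/\partial\zeta\}\lambda_i(t)\xi_i(t)]$, where $\xi_i(t) = I(C_i \ge t)$ and $\lambda_i(t)$ is the conditional mean rate of $N_i^*(t)$ defined by $\lambda_i(t)\,dt = E(dN_i^*(t) \mid X_i(t), U_i(t))$. Let $Q_i(t) = -(e_{12}(t))^T(e_{11}(t))^{-1}X_{1i}(t) + \{\partial\eta(U_i(t), \zeta_0)/\partial\zeta\}^T X_{2i}^*(t)$. Denote by $\dot\alpha_0(t)$ and $\ddot\alpha_0(t)$ the first and second derivatives of the true $\alpha_0(t)$ with respect to $t$, respectively.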

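Let $\hat\mu_i(t) = \varphi\{\hat\alpha^T(t)X_{1i}(t) + \eta^T(U_i(t), \hat\zeta)X_{2i}^*(t)\}$, $\hat{\dot\mu}_i(t) = \dot\varphi\{\hat\alpha^T(t)X_{1i}(t) + \eta^T(U_i(t), \hat\zeta)X_{2i}^*(t)\}$ and $\hat\varepsilon_i(t) = Y_i(t) - \hat\mu_i(t)$. Let $\hat E_{11}(t) = n^{-1}\sum_{i=1}^n \int_0^\tau W_i(s)\hat{\dot\mu}_i(s)X_{1i}(s)^{\otimes 2}K_h(s - t)\,dN_i(s)$ and $\hat E_{12}(t) = n^{-1}\sum_{i=1}^n \int_0^\tau W_i(s)\hat{\dot\mu}_i(s)X_{1i}(s)\{X_{2i}^*(s)\}^T\{\partial\eta(U_i(s), \hat\zeta)/\partial\zeta\}K_h(s - t)\,dN_i(s)$. Let $\hat Q_i(t) = -(\hat E_{12}(t))^T(\hat E_{11}(t))^{-1}X_{1i}(t) + \{\partial\eta(U_i(t), \hat\zeta)/\partial\zeta\}^T X_{2i}^*(t)$.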

The following theorems characterize the asymptotic properties of the estimators ζ̂ and α̂(t) under Condition A given in the Web-based Supplementary Material.

Theorem 1

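Under Condition A, $\hat\zeta \xrightarrow{p} \zeta_0$, and $\sqrt{n}(\hat\zeta - \zeta_0)$ converges in distribution to a mean zero Gaussian random vector with covariance matrix $A^{-1}\Sigma A^{-1}$, where $A = E\left[\int_{t_1}^{t_2} w_i(t)\dot\mu_i(t)\{Q_i(t)\}^{\otimes 2}\,dN_i(t)\right]$ and $\Sigma = E\left[\int_{t_1}^{t_2} w_i(t)Q_i(t)\varepsilon_i(t)\,dN_i(t)\right]^{\otimes 2}$.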

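The matrices $A$ and $\Sigma$ can be consistently estimated respectively by $\hat A = n^{-1}\sum_{i=1}^n \int_{t_1}^{t_2} W_i(t)\hat{\dot\mu}_i(t)\{\hat Q_i(t)\}^{\otimes 2}\,dN_i(t)$ and $\hat\Sigma = n^{-1}\sum_{i=1}^n \left(\int_{t_1}^{t_2} W_i(t)\hat\varepsilon_i(t)\hat Q_i(t)\,dN_i(t)\right)^{\otimes 2}$.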

Theorem 2

Under Condition A, $\hat\alpha(t) \xrightarrow{p} \alpha_0(t)$ uniformly in $t \in [t_1, t_2]$, and

$$(nh)^{1/2}\left(\hat\alpha(t) - \alpha_0(t) - \tfrac{1}{2}\mu_2 h^2 \ddot\alpha_0(t)\right) \xrightarrow{D} N(0, \Sigma_\alpha(t)), \tag{6}$$

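where $\mu_2 = \int_{-1}^{1} t^2 K(t)\,dt$, $\Sigma_\alpha(t) = (e_{11}(t))^{-1}\Sigma_e(t)(e_{11}(t))^{-1}$, and $\Sigma_e(t) = \lim_{n\to\infty} hE\left\{\int_0^\tau w_i(s)\varepsilon_i(s)X_{1i}(s)K_h(s - t)\,dN_i(s)\right\}^{\otimes 2}$.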

The variance-covariance matrix $\Sigma_\alpha(t)$ can be estimated consistently by replacing $e_{11}(t)$ with $\hat E_{11}(t)$ and $\Sigma_e(t)$ with $\hat\Sigma_e(t) = n^{-1}h\sum_{i=1}^n \{\hat g_i(t)\}^{\otimes 2}$, where

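$$\hat g_i(t) = \int_0^\tau W_i(s)\hat\varepsilon_i(s)X_{1i}(s)K_h(s - t)\,dN_i(s) - \hat E_{12}(t)\hat A^{-1}\int_{t_1}^{t_2} W_i(s)\hat Q_i(s)\hat\varepsilon_i(s)\,dN_i(s).$$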

5. Testing the covariate-varying effects

The generalized semiparametric varying-coefficient model (1) assumes that the covariate-varying effects are parametric functions of an effect modifier Ui(t). The parametric functions γ(Ui(t), θ) can be specified based on knowledge of the underlying biological processes in some cases, and in others, they can be chosen as polynomial functions or linear combinations of basis functions such as the B-spline basis. This section develops hypothesis testing procedures to test the parametric forms γ(Ui(t), θ). We construct the test process based on the weighted residual process that is closely related to the score function (4).

To test H0: γ(u) = γ(u, θ), we consider the test process

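$$R(u, \hat\zeta) = n^{-1/2}(I_r \otimes \hat A^{-1})\sum_{i=1}^n \int_{t_1}^{t_2} W_i(t)\, I\{U_i(t) \le u\} \otimes \hat Q_i(t)\,\hat\varepsilon_i(t)\,dN_i(t), \tag{7}$$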

for $u \in \mathbb{R}^r$, where $I_r$ is the $r \times r$ identity matrix, $I\{U_i(t) \le u\}$ is the column vector of the indicator functions, and $\otimes$ is the Kronecker product of matrices. By stratifying the score function on the values of $U_i(t)$, the process $R(u, \hat\zeta)$ is sensitive to misspecification of $\gamma(u, \theta)$. The factor $I_r \otimes \hat A^{-1}$ balances the contributions from covariates that might have larger or smaller variations.

We consider the supremum test statistic $T_1 = \sup_{u \in \Delta}\|R(u, \hat\zeta)\|$ and the sum of absolute deviations test statistic $T_2 = \sum_{u \in \Delta}\|R(u, \hat\zeta)\|$, where $\Delta$ is $\mathbb{R}^r$ or a set of grid points in $\mathbb{R}^r$, and $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^{r(p_2+q)}$.

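By the first order approximation, $R(u, \hat\zeta) = R(u, \zeta_0) + (\partial R(u, \zeta_0)/\partial\zeta)(\hat\zeta - \zeta_0) + o_p(1)$. Let $A_u = E\left\{\int_{t_1}^{t_2} w_i(t)\dot\mu_i(t)\left(I\{U_i(t) \le u\} \otimes Q_i(t)\right)Q_i^T(t)\,dN_i(t)\right\}$. Following the proof of Theorem 1, we have $n^{-1/2}\partial R(u, \zeta_0)/\partial\zeta \xrightarrow{P} -(I_r \otimes A^{-1})A_u$, $R(u, \zeta_0) = n^{-1/2}(I_r \otimes A^{-1})\sum_{i=1}^n \int_{t_1}^{t_2} w_i(t)\, I\{U_i(t) \le u\} \otimes Q_i(t)\,\varepsilon_i(t)\,dN_i(t) + o_p(1)$, and $n^{1/2}(\hat\zeta - \zeta_0) = A^{-1}n^{-1/2}\sum_{i=1}^n \int_{t_1}^{t_2} w_i(t)\varepsilon_i(t)Q_i(t)\,dN_i(t) + o_p(1)$. Hence, $R(u, \hat\zeta) = n^{-1/2}\sum_{i=1}^n D_i(u) + o_p(1)$, where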

$$D_i(u) = (I_r \otimes A^{-1})\int_{t_1}^{t_2} w_i(t)\left[I\{U_i(t) \le u\} \otimes Q_i(t) - A_u A^{-1}Q_i(t)\right]\varepsilon_i(t)\,dN_i(t). \tag{8}$$

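By Lemma 1 in the Appendix of Sun et al. (2016), $R(u, \hat\zeta)$ converges weakly to a mean-zero Gaussian process $G(u)$, for $u \in \mathbb{R}^r$. It follows from the continuous mapping theorem that $T_1 \xrightarrow{D} \sup_{u \in \Delta}\|G(u)\|$ and $T_2 \xrightarrow{D} \sum_{u \in \Delta}\|G(u)\|$.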

The critical values of the test statistics $T_1$ and $T_2$, or their asymptotic distributions, can be approximated by using the Gaussian multiplier resampling method, cf. Lin et al. (1993) and Sun et al. (2013). Let $\hat A_u = n^{-1}\sum_{i=1}^n \int_{t_1}^{t_2} W_i(t)\hat{\dot\mu}_i(t)\left(I\{U_i(t) \le u\} \otimes \hat Q_i(t)\right)\hat Q_i^T(t)\,dN_i(t)$, and

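$$\hat D_i(u) = (I_r \otimes \hat A^{-1})\int_{t_1}^{t_2} W_i(t)\left[I\{U_i(t) \le u\} \otimes \hat Q_i(t) - \hat A_u \hat A^{-1}\hat Q_i(t)\right]\hat\varepsilon_i(t)\,dN_i(t). \tag{9}$$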

Define $G^*(u) = n^{-1/2}\sum_{i=1}^n \hat D_i(u)\phi_i$, where $\phi_1, \cdots, \phi_n$ are independent standard normal random variables. The distribution of $G(\cdot)$ is asymptotically equivalent to the distribution of $G^*(\cdot)$ given the observed data sequence. Hence, the distributions of $T_1$ and $T_2$ under the null hypothesis can be approximated respectively by the conditional distributions of $T_1^* = \sup_{u \in \Delta}\|G^*(u)\|$ and $T_2^* = \sum_{u \in \Delta}\|G^*(u)\|$, which can be approximated by repeatedly generating, say, 1000 sets of independent normal random variables $\phi_1, \cdots, \phi_n$ while holding the observed data sequence fixed.
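The resampling step is straightforward once the influence terms D̂_i(u) have been evaluated on a grid Δ. The Python sketch below assumes they are already available as an array; the function name, the array layout and the default of 1000 resamples are illustrative choices, and computing D̂_i(u) itself from (9) is outside this sketch.

```python
import numpy as np

def multiplier_pvalues(d_hat, r_obs, n_resamples=1000, seed=0):
    """Approximate p-values of T1 (sup) and T2 (sum) by Gaussian multipliers.

    d_hat: (n, G, d) array holding D_hat_i(u_g) for subject i and grid point u_g.
    r_obs: (G, d) array holding the observed process R(u_g, zeta_hat).
    """
    rng = np.random.default_rng(seed)
    n = d_hat.shape[0]
    norms_obs = np.linalg.norm(r_obs, axis=1)
    t1_obs, t2_obs = norms_obs.max(), norms_obs.sum()
    t1_star = np.empty(n_resamples)
    t2_star = np.empty(n_resamples)
    for b in range(n_resamples):
        phi = rng.standard_normal(n)                 # multipliers; data held fixed
        g_star = np.tensordot(phi, d_hat, axes=(0, 0)) / np.sqrt(n)   # G*(u_g)
        norms = np.linalg.norm(g_star, axis=1)
        t1_star[b] = norms.max()
        t2_star[b] = norms.sum()
    return float(np.mean(t1_star >= t1_obs)), float(np.mean(t2_star >= t2_obs))
```

Only the multipliers are redrawn in each resample, so the cost per resample is a single weighted sum over subjects.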

6. Simulation studies

We conducted a simulation study to assess the finite-sample performance of the proposed methods. Performance is illustrated under the following model with three popular link functions:

$$E\{Y_i(t) \mid X_i, S_i\} = g^{-1}\{\alpha_0(t) + \alpha_1(t)X_{1i}(t) + \beta X_{2i} + \gamma(t - S_i, \theta)X_{3i}I(t > S_i)\}, \tag{10}$$

for $0 \le t \le \tau$ with $\tau = 3.5$, where $\alpha_0(t) = 0.2t$, $\alpha_1(t) = 0.1\sin(t)$, $\gamma(u, \theta) = \theta_1\exp(-\theta_2 u)$, and $\zeta = (\beta, \theta_1, \theta_2) = (0.1, 1.0, 0.5)$. The covariate $X_{2i}$ is a Bernoulli random variable with success probability 0.5, $X_{1i}(t) = (t/3 - X_{2i} + a_i)/6$ is time-dependent with $a_i$ from the normal distribution $N(0, 0.3)$, $X_{3i}$ is a uniform random variable on $[-1, 1]$, and $S_i$ is uniform on $[0, 1]$. The proposed methods are examined under the identity link function, logarithm link function and logit link function. For the identity link and the logarithm link, the error $\varepsilon_i = Y_i(t) - E\{Y_i(t) \mid X_i, S_i\}$ has a normal distribution with mean $\phi_i$ and variance $0.5^2$, and $\phi_i$ is $N(0, 1)$. For the logit link, $Y_i(t)$ is generated from a Bernoulli distribution with success probability $E\{Y_i(t) \mid X_i, S_i\}$. The observation times follow a Poisson process with the proportional mean rate model $h(t \mid X_i, S_i) = 1.5\exp(0.7X_{2i})$. The censoring times $C_i$ are generated from a uniform distribution on $[1.5, 8]$. There are approximately six observations per subject in $[0, \tau]$ and about 30% of subjects are censored before $\tau = 3.5$. The Epanechnikov kernel $K(t) = 0.75(1 - t^2)I(|t| \le 1)$, the bandwidth formula $h = 4\hat\sigma_T n^{-1/3}$, and the unit weight function $W_i(t) = 1$ are used. We take $t_1 = h/2$ and $t_2 = \tau - h/2$ in the estimating function (4) to avoid larger variations on the boundaries.
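The data-generating design can be sketched in Python for the identity link as follows. This is an illustrative reading of the description above rather than the authors' simulation code: the function name and returned dictionary are hypothetical, the second argument of N(0, 0.3) is assumed to be a variance, and α0(t) is taken as printed in the text.

```python
import numpy as np

def simulate_subject(rng, tau=3.5, beta=0.1, theta=(1.0, 0.5)):
    """Generate one subject's longitudinal record under model (10), identity link."""
    x2 = rng.binomial(1, 0.5)                        # Bernoulli(0.5) covariate
    a = rng.normal(0.0, np.sqrt(0.3))                # assumption: 0.3 is a variance
    x3 = rng.uniform(-1.0, 1.0)
    s = rng.uniform(0.0, 1.0)                        # treatment switching time S_i
    c = rng.uniform(1.5, 8.0)                        # censoring time C_i
    end = min(c, tau)
    # Homogeneous Poisson process with rate 1.5 * exp(0.7 * X2) on [0, end].
    rate = 1.5 * np.exp(0.7 * x2)
    n_obs = rng.poisson(rate * end)
    t = np.sort(rng.uniform(0.0, end, size=n_obs))
    x1 = (t / 3.0 - x2 + a) / 6.0
    gamma = theta[0] * np.exp(-theta[1] * (t - s))
    mean = 0.2 * t + 0.1 * np.sin(t) * x1 + beta * x2 + gamma * x3 * (t > s)
    phi = rng.normal(0.0, 1.0)                       # subject-level random effect
    y = mean + phi + rng.normal(0.0, 0.5, size=n_obs)
    return {"t": t, "y": y, "x1": x1, "x2": x2, "x3": x3, "s": s, "c": c}

# Usage: a sample of n = 200 subjects.
rng = np.random.default_rng(2016)
sample = [simulate_subject(rng) for _ in range(200)]
```

The shared term `phi` induces within-subject correlation among the repeated measurements.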

The performance of the estimators for $\zeta$ and $\alpha(t) = (\alpha_0(t), \alpha_1(t))$ at a fixed time $t$ is measured through the Bias, the sample standard error of the estimators (SEE), the sample mean of the estimated standard errors (ESE), and the 95% empirical coverage probability (CP). Table 1 summarizes the Bias, SEE, ESE and CP for $\zeta$ under the three different link functions and three different sample sizes ($n$ = 200, 400 and 600). The bandwidth formula $h = 4\hat\sigma_T n^{-1/3}$ yields $h = 0.68$ for $n = 200$, $h = 0.54$ for $n = 400$ and $h = 0.47$ for $n = 600$. Each entry of the table is calculated based on 1000 repetitions. Table 1 shows that the biases are small and that there is good agreement between the estimated and empirical standard errors. The bias and variance of the estimators generally decrease as the sample size increases. The coverage probabilities are close to the 95% nominal level. Additional simulations not presented here show that the proposed methods are not overly sensitive to the bandwidth; they work well for $C$ in the range of [3, 5].
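For reference, the four summary measures can be computed from the replicated estimates as in the short Python sketch below. The function name is illustrative, and the use of the normal quantile 1.96 for the 95% Wald interval is an assumption about how coverage was evaluated.

```python
import numpy as np

def summarize(estimates, std_errors, truth, z=1.96):
    """Bias, SEE, ESE and CP of one scalar parameter over simulation replicates."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    covered = np.abs(estimates - truth) <= z * std_errors   # Wald interval covers truth
    return {
        "Bias": estimates.mean() - truth,
        "SEE": estimates.std(ddof=1),      # empirical SD of the estimates
        "ESE": std_errors.mean(),          # average estimated standard error
        "CP": covered.mean(),
    }

# Usage with four toy replicates of a parameter whose true value is 1.
out = summarize([0.9, 1.1, 1.0, 1.3], [0.1, 0.1, 0.1, 0.1], truth=1.0)
```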

Table 1.

Summary of Bias, SEE, ESE and CP for β, θ1 and θ2 (true values β = .1, θ1 = 1, θ2 = .5) for three different link functions and sample sizes using unit weight and bandwidth $h = 4\hat\sigma_T n^{-1/3}$, which yields h = 0.68, 0.54 and 0.47 for n = 200, n = 400 and n = 600, respectively, based on 1000 simulations.

| Link | n | β Bias | β SEE | β ESE | β CP | θ1 Bias | θ1 SEE | θ1 ESE | θ1 CP | θ2 Bias | θ2 SEE | θ2 ESE | θ2 CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Identity | 200 | .0073 | .2924 | .2638 | .912 | .0173 | .1755 | .1675 | .925 | .0259 | .1949 | .1958 | .930 |
| Identity | 400 | .0011 | .1911 | .1863 | .941 | .0086 | .1174 | .1172 | .947 | .0230 | .1446 | .1329 | .943 |
| Identity | 600 | .0023 | .1607 | .1524 | .932 | .0040 | .0966 | .0941 | .945 | .0098 | .1009 | .1025 | .941 |
| Logarithm | 200 | −.0018 | .2034 | .1856 | .934 | .0067 | .1406 | .1426 | .956 | .0167 | .1306 | .1299 | .955 |
| Logarithm | 400 | −.0019 | .2062 | .1861 | .931 | .0060 | .1421 | .1429 | .955 | .0144 | .1281 | .1293 | .956 |
| Logarithm | 600 | −.0018 | .1118 | .1069 | .945 | .0066 | .0793 | .0798 | .947 | .0029 | .0698 | .0685 | .947 |
| Logit | 200 | .0030 | .1314 | .1266 | .941 | .0159 | .1857 | .1829 | .950 | .0095 | .1784 | .1785 | .944 |
| Logit | 400 | −.0069 | .1106 | .1104 | .960 | .0127 | .1689 | .1645 | .946 | .0185 | .1614 | .1542 | .940 |
| Logit | 600 | −.0022 | .0876 | .0892 | .947 | .0107 | .1332 | .1308 | .945 | .0078 | .1260 | .1226 | .951 |

Figure 1 plots the bias, SEE, ESE and CP for the estimators of α0(t) and α1(t) over the time interval [0, 3.5] with n = 400 and bandwidth h = 0.54 for the three different link functions. The plots show that the estimates are close to the true values and the ESE provides a good approximation for the SEE of the pointwise estimators. The empirical coverage probabilities are close to the 95% nominal level.

Figure 1.

Plots of bias, SEE, ESE and CP of α̂0(t) and α̂1(t) under three link functions with n = 400, h = 0.54 and unit weight Wi(t) = 1 based on 1000 simulations. The figures in the left panel are for α0(t) = 0.2t, and the figures in the right panel are for α1(t) = 0.1 sin(t).

Next, we examine the finite-sample performance of the proposed test statistics under model (10). The simulation examines the parametric forms of the covariate-varying effects $\gamma(u, \theta)$ while leaving other model specifications unchanged. The sizes of the test statistics $T_1$ and $T_2$ are examined under the null model $M_0: \gamma(u, \theta) = \theta_1\exp(-\theta_2 u)$ with $\theta = (\theta_1, \theta_2) = (1.0, 0.5)$. The powers of the tests are examined under three alternative models $M_1: \gamma(u) = 1 - 0.4u$, $M_2: \gamma(u) = 1 - 0.8u + 0.25u^2$ and $M_3: \gamma(u) = 1 - \sin(2u)$, respectively. Table 2 shows the observed sizes and powers of the test statistics $T_1$ and $T_2$ at significance level 0.05. Each entry is based on 1000 Gaussian multiplier samples and 1000 simulations. The observed sizes are close to their nominal level for all three link functions and for sample sizes n = 200, 400 and 600. The powers of the tests increase as the sample size increases. The powers of the tests also increase as the alternative models move from $M_1$ to $M_3$, which represent increasing departure from the null model $M_0$. The power of $T_2$ is slightly higher than the power of $T_1$ in most cases.

Table 2.

Observed sizes (under M0) and powers (under M1, M2 and M3) of the test statistics T1 and T2 for three different link functions using unit weight based on 1000 Gaussian multiplier samples and 1000 simulations. The bandwidth is calculated based on $h = 4\hat\sigma_T n^{-1/3}$, which yields h = 0.68, 0.54 and 0.47 for n = 200, n = 400 and n = 600, respectively.

| Link | n | M0 T1 | M0 T2 | M1 T1 | M1 T2 | M2 T1 | M2 T2 | M3 T1 | M3 T2 |
|---|---|---|---|---|---|---|---|---|---|
| Identity | 200 | .059 | .042 | .341 | .368 | .682 | .714 | .999 | .987 |
| Identity | 400 | .063 | .065 | .580 | .632 | .945 | .963 | 1.00 | .999 |
| Identity | 600 | .060 | .044 | .817 | .854 | .996 | 1.00 | 1.00 | 1.00 |
| Logarithm | 200 | .053 | .056 | .532 | .662 | .912 | .970 | .992 | .996 |
| Logarithm | 400 | .054 | .048 | .856 | .917 | .987 | .993 | .998 | 1.00 |
| Logarithm | 600 | .046 | .043 | .911 | .930 | .988 | .995 | 1.00 | 1.00 |
| Logit | 200 | .045 | .051 | .126 | .175 | .267 | .285 | .965 | .960 |
| Logit | 400 | .056 | .064 | .250 | .354 | .562 | .623 | .992 | .994 |
| Logit | 600 | .061 | .054 | .655 | .719 | .788 | .854 | .998 | 1.00 |

7. Real data application

We apply the newly developed methods to investigate treatment strategies dependent on the development of drug resistance in the ACTG 244 trial. ACTG 244 was a randomized, double-blind trial that evaluated the clinical utility of monitoring HIV infected patients taking Zidovudine (ZDV) monotherapy for occurrence of the T215Y/F ZDV resistance mutation. When a subject developed the 215 mutation, the subject was randomized to continue ZDV, add ddI, or add both ddI and NVP. ACTG 244 began enrollment in February 1994, and among the 289 enrollees, 284 were dispensed ZDV, of whom 57 developed T215Y/F. Forty-nine of these subjects were randomized to ZDV (n=17), ZDV+ddI (n=15), or ZDV+ddI+NVP (n=17), and the other 8 subjects went off treatment prior to randomization. T215Y/F mutation status was determined by RT-PCR (Larder et al., 1991) and was measured at study entry and every 8 weeks thereafter, with variability in visit dates across individuals. The primary study outcome, CD4 cell count, a well known independent predictor of AIDS/death (Kaufmann et al., 1998), was measured on the same visit schedule. We investigate the effect of treatment switching on longitudinal square root CD4 cell count, and also investigate the association of the timing of treatment switching with square root CD4 cell count.

In addition to the above investigations, we also applied the methods to a distinct objective that arose from the Data Safety Monitoring Board’s independent review of the study data in September 1996. Following this review, all subjects were offered randomization to the ZDV+ddI or ZDV+ddI+NVP arms with six months of additional follow-up. Of the 227 subjects who remained on ZDV treatment without the T215Y/F mutation, 137 were still taking ZDV at the time of the interim review and were randomized to ZDV+ddI (n=69) or ZDV+ddI+NVP (n=68); the remaining 90 subjects went off treatment prior to the interim review. As such, ACTG 244 investigated two treatment randomization strategies, the first comparing ZDV vs. ZDV+ddI vs. ZDV+ddI+NVP in subjects who acquired the T215Y/F mutation, and the second comparing ZDV+ddI vs. ZDV+ddI+NVP in subjects taking ZDV who did not have the T215Y/F mutation. We analyze the effects of treatment switching separately for these two investigations in Section 7.1 and Section 7.2, respectively. The main reason for the separate analyses rather than a single combined analysis is that the study populations for inference are fundamentally different, given the large impact of the T215F/Y drug resistance mutation on CD4 cell count. An additional reason is that the time to develop the T215Y/F mutation introduces informative censoring, and thus our analysis of the second randomization strategy that only includes subjects who did not develop the mutation avoids making a false assumption of non-informative censoring. A final reason for conducting two separate analyses is that the assigned treatments are different in the first and second randomizations.

7.1 Analysis of the effects of switching treatments after drug-resistant virus was detected (First randomization)

First, we examine the effects of switching treatments following detection of the T215Y/F mutation. Let Y (t) be the square root of CD4 count at t years since study entry, Z1 be Gender (1 if Female; 0 if Male), Z2 be Age in years at study entry, Z3 and Z4 be dummy variables coding race (Z3 = 1 if white and 0 otherwise, Z4 = 1 if black and 0 otherwise). Let S1 be the time from study entry until the first randomization based on occurrence of the T215Y/F mutation (the treatment switching time), where we set S1 = 3 years, a number longer than the study duration, for subjects who did not experience the mutation. Then U1(t) = t − S1 is the time elapsed from the T215Y/F mutation-based treatment randomization. Let TA1(t) = 1 if t > S1 and randomized to ZDV and 0 otherwise, TA2(t) = 1 if t > S1 and randomized to ZDV+ddI and 0 otherwise, and TA3(t) = 1 if t > S1 and randomized to ZDV+ddI+NVP and 0 otherwise. All three indicators are zero prior to detection of the mutation. This analysis includes all n = 284 enrolled subjects dispensed ZDV monotherapy. The eight subjects who were off treatment prior to the first randomization as well as the 90 subjects who were off treatment prior to the interim review are censored at the time of drop-off. In addition, Section 7.1 focuses on the treatment comparison in subjects who acquired the T215Y/F mutation; the time of the second randomization is treated as the censoring time for the 137 subjects who did not develop the mutation and were randomized at the interim review.

The analysis is conducted using the following model:

Yi(t)=α0(t)+β1Z1i+β2Z2i+β3Z3i+β4Z4i+γ1(U1i(t),θ1)TA1i(t)+γ2(U1i(t),θ2)TA2i(t)+γ3(U1i(t),θ3)TA3i(t)+εi(t), (11)

for t ∈ [0, τ], where τ = 2.5 years. We assume that γk(u, θk), k = 1, 2, 3, are second-order polynomial functions: γ1(u, θ1) = θ10 + θ11u + θ12u^2, γ2(u, θ2) = θ20 + θ21u + θ22u^2, and γ3(u, θ3) = θ30 + θ31u + θ32u^2, where θ1 = (θ10, θ11, θ12), θ2 = (θ20, θ21, θ22) and θ3 = (θ30, θ31, θ32). The 3-fold cross-validation bandwidth selection yields h = 0.41, while the bandwidth formula h = C σ̂_T n^(−1/3) yields h = 0.41 for C = 4 and h = 0.51 for C = 5. Our analysis uses h = 0.41; the results using h = 0.51 are almost the same.
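The working bandwidth formula can be checked numerically. The sketch below is illustrative only: the function name `working_bandwidth` is hypothetical, and because the value of σ̂_T is not reported in this section, it is backed out from the reported h = 0.41 at C = 4 with n = 284 rather than taken from the data.

```python
def working_bandwidth(C, sigma_T, n):
    """Working bandwidth h = C * sigma_T * n^(-1/3)."""
    return C * sigma_T * n ** (-1.0 / 3.0)


n = 284  # enrolled subjects in the Section 7.1 analysis

# Assumption: sigma_T is not reported here, so recover the value implied by
# the reported h = 0.41 at C = 4 (for illustration only).
sigma_T_implied = 0.41 * n ** (1.0 / 3.0) / 4.0

h_c5 = working_bandwidth(5, sigma_T_implied, n)
print(round(h_c5, 2))  # agrees with the reported h = 0.51 for C = 5
```

Since h is linear in C, moving from C = 4 to C = 5 simply scales the bandwidth by 5/4, which is why the two reported values differ by that factor up to rounding.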

The width of the range of the observed values for U1i(t), t ∈ [0, 2.5], is 2.25. The test statistics T1 and T2 calculated with Δ = [0, 2] yield p-values of 0.138 and 0.228, respectively, suggesting that the quadratic functions γk(u, θk), k = 1, 2, 3, fit the data well. The tests also indicate that linear functions are inadequate for γk(u, θk), k = 1, 2, 3, with p-values of 0.016 for T1 and 0.036 for T2.

Using the quadratic functions, the parameter estimates are presented in the first block of Table 3. The estimates of α0(t), γ1(u, θ1), γ2(u, θ2) and γ3(u, θ3) are presented in Figure 2, along with their 95% pointwise confidence intervals. The estimates of γk(u, θk), k = 1, 2, 3, are plotted on [0, 2].

Table 3.

Estimated effects of adaptive treatment randomizations based on the ACTG 244 data using h = 0.41 and unit weight.

Treatment effects after the T215Y/F mutation under model (11) (first randomization)

| Effect | Parameter | Estimate | Standard deviation | 95% lower limit | 95% upper limit | p-value |
|---|---|---|---|---|---|---|
| Gender | β1 | −1.3155 | 0.7216 | −2.7298 | 0.0988 | 0.0683 |
| Age | β2 | 0.0619 | 0.0284 | 0.0063 | 0.1174 | 0.0292 |
| Race | β3 | −0.7575 | 0.7746 | −2.2758 | 0.7607 | 0.3281 |
| | β4 | −0.8163 | 0.8565 | −2.4950 | 0.8623 | 0.3405 |
| TA1(t) | θ10 | −1.1593 | 1.0956 | −3.3066 | 0.9881 | 0.2900 |
| | θ11 | 4.3785 | 2.9895 | −1.4810 | 10.2380 | 0.1430 |
| | θ12 | −4.0586 | 1.4271 | −6.8558 | −1.2614 | 0.0045 |
| TA2(t) | θ20 | 0.1222 | 0.8782 | −1.5991 | 1.8436 | 0.8893 |
| | θ21 | −2.7256 | 2.0925 | −6.8270 | 1.3757 | 0.1927 |
| | θ22 | 1.0751 | 1.1880 | −1.2534 | 3.4035 | 0.3655 |
| TA3(t) | θ30 | −2.2771 | 0.8704 | −3.9831 | −0.5710 | 0.0089 |
| | θ31 | 4.7834 | 1.7902 | 1.2745 | 8.2922 | 0.0075 |
| | θ32 | −2.5625 | 0.8431 | −4.2149 | −0.9101 | 0.0024 |

Treatment effects before the T215Y/F mutation under model (12) (second randomization)

| Effect | Parameter | Estimate | Standard deviation | 95% lower limit | 95% upper limit | p-value |
|---|---|---|---|---|---|---|
| Gender | β1 | −0.6674 | 0.6396 | −1.9211 | 0.5862 | 0.2967 |
| Age | β2 | 0.0192 | 0.0250 | −0.0299 | 0.0683 | 0.4438 |
| Race | β3 | −1.0689 | 0.7223 | −2.4846 | 0.3469 | 0.1389 |
| | β4 | −1.7410 | 0.7750 | −3.2600 | −0.2221 | 0.0247 |
| TB2(t) | θ20 | 0.2354 | 0.5963 | −0.9334 | 1.4042 | 0.6930 |
| | θ21 | 3.4616 | 3.6360 | −3.6651 | 10.5882 | 0.3411 |
| | θ22 | −0.5598 | 5.5558 | −11.4491 | 10.3295 | 0.9197 |
| TB3(t) | θ30 | 0.7469 | 0.5363 | −0.3043 | 1.7981 | 0.1637 |
| | θ31 | 5.5801 | 2.9732 | −0.2474 | 11.4076 | 0.0605 |
| | θ32 | −4.0646 | 4.4691 | −12.8241 | 4.6949 | 0.3631 |
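The confidence limits and p-values in Table 3 appear consistent with normal-approximation (Wald) inference, which can be checked with a short sketch. This is an assumption about how the columns relate, not a statement of the authors' code: the function name `wald_summary` is hypothetical, the "Standard deviation" column is treated as the standard error of the estimate, and 1.96 is used as the critical value.

```python
import math


def wald_summary(estimate, se, z_crit=1.96):
    """Normal-approximation 95% limits and two-sided p-value for one estimate."""
    lower = estimate - z_crit * se
    upper = estimate + z_crit * se
    z = estimate / se
    # Two-sided standard normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return lower, upper, p_value


# Gender effect (β1) from the first block of Table 3.
lower, upper, p = wald_summary(-1.3155, 0.7216)
print(round(lower, 4), round(upper, 4), round(p, 4))
```

Small discrepancies in the last decimal place for other rows are to be expected, since the tabulated estimates and standard deviations are themselves rounded.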

Figure 2.

Estimated effects of adaptive treatment randomizations after the T215Y/F mutation based on the ACTG 244 data using unit weight and h = 0.41. Figure (a) shows the estimated baseline function α̂0(t) with 95% pointwise confidence intervals; (b), (c) and (d) show the point and 95% confidence interval estimates of γk(u), k = 1, 2, 3, respectively, under model (11).

The results show that CD4 cell counts are significantly higher for older individuals (p-value 0.029), that females tend to have lower CD4 cell counts (p-value 0.068), and that race is not a significant factor. The downward trend in CD4 cell counts appears more pronounced when continuing with ZDV monotherapy (Figure 2(b)) than when switching to the combination therapies (Figure 2(c) and (d)). This analysis points to benefits of switching to the combination therapies as compared to continuing with ZDV monotherapy, even after drug-resistant virus was detected.
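To make the comparison of the fitted curves concrete, the quadratic point estimates from the first block of Table 3 can be evaluated directly. The sketch below uses point estimates only and ignores the pointwise confidence intervals shown in Figure 2, so it should not be read as a formal comparison; the names `THETA_HAT` and `gamma` are introduced here for illustration.

```python
# Point estimates (θk0, θk1, θk2) from the first block of Table 3.
THETA_HAT = {
    "ZDV": (-1.1593, 4.3785, -4.0586),
    "ZDV+ddI": (0.1222, -2.7256, 1.0751),
    "ZDV+ddI+NVP": (-2.2771, 4.7834, -2.5625),
}


def gamma(u, theta):
    """Quadratic covariate-varying effect θ0 + θ1*u + θ2*u^2."""
    t0, t1, t2 = theta
    return t0 + t1 * u + t2 * u * u


# Evaluate each fitted curve on a coarse grid over the plotted range [0, 2].
for u in (0.0, 0.5, 1.0, 1.5, 2.0):
    row = {arm: round(gamma(u, th), 3) for arm, th in THETA_HAT.items()}
    print(u, row)
```

The values are on the square-root CD4 scale and represent the estimated change u years after the switch. At u = 2 the fitted curve for continued ZDV lies well below those of the two combination arms, which matches the steeper decline seen in Figure 2(b).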

7.2 Analysis of the effects of switching treatments before drug-resistant virus was detected (Second randomization)

After independent review of the study data, all subjects were offered randomization to ZDV+ddI or ZDV+ddI+NVP with six months of additional follow-up. In this section, we examine the effects of this treatment switch before the T215Y/F mutation was detected. This analysis excludes subjects who developed the T215Y/F mutation, both so that the study question can be answered for the sub-population without the mutation and because the time to develop T215Y/F likely introduces dependent censoring. The 90 subjects who were off treatment prior to the interim review without having the T215Y/F mutation are censored at the time of drop-off. Let S2 be the time of the second randomization after the interim review and U2(t) = t − S2. We define TB2(t) = 1 if t > S2 and the subject was randomized to ZDV+ddI, and 0 otherwise; and TB3(t) = 1 if t > S2 and the subject was randomized to ZDV+ddI+NVP, and 0 otherwise. TB2(t) = 0 and TB3(t) = 0 indicate that a subject is on ZDV at time t before the interim review.

The data are analyzed using the following model:

Yi(t)=α0(t)+β1Z1i+β2Z2i+β3Z3i+β4Z4i+γ2(U2i(t),θ2)TB2i(t)+γ3(U2i(t),θ3)TB3i(t)+εi(t), (12)

for t ∈ [0, 2.5]. As in the previous analysis, we use second-order polynomial functions for γ2(u, θ2) and γ3(u, θ3). The range of the observed values for U2i(t), t ∈ [0, 2.5], is [0, 0.70]. The p-values of the test statistics T1 and T2 with Δ = [0, 0.7] are 0.254 and 0.144, respectively, indicating no significant departure from the hypothesized parametric quadratic functions for γ2(u, θ2) and γ3(u, θ3).
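Because the quadratic effects are linear in θ2 and θ3, the parametric part of model (12) can be written as a single linear combination of covariates. The sketch below illustrates this with the point estimates from the second block of Table 3; it covers only the parametric terms and omits the nonparametric baseline α0(t), which the paper estimates by local linear smoothing. The function names and the example subject are hypothetical.

```python
# Point estimates from the second block of Table 3, ordered as
# (β1, β2, β3, β4, θ20, θ21, θ22, θ30, θ31, θ32).
COEF_HAT = (-0.6674, 0.0192, -1.0689, -1.7410,
            0.2354, 3.4616, -0.5598,
            0.7469, 5.5801, -4.0646)


def parametric_design_row(z, u2, tb2, tb3):
    """Covariates multiplying (β, θ2, θ3) in model (12) at a single time point.

    z        : (Z1, Z2, Z3, Z4) = (female, age, white, black)
    u2       : time since the second randomization, U2(t)
    tb2, tb3 : treatment indicators TB2(t), TB3(t)
    """
    return list(z) + [tb2, tb2 * u2, tb2 * u2 ** 2,
                      tb3, tb3 * u2, tb3 * u2 ** 2]


def parametric_part(row, coef=COEF_HAT):
    """Linear predictor excluding the baseline α0(t)."""
    return sum(x * c for x, c in zip(row, coef))


# Hypothetical subject: male, age 40, white, six months after
# switching to ZDV+ddI+NVP.
row = parametric_design_row((0, 40, 1, 0), u2=0.5, tb2=0, tb3=1)
print(row)
print(round(parametric_part(row), 4))
```

This linearity is what allows a least squares step for (β, θ) once the baseline has been profiled out; the profiling itself is not reproduced here.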

The parameter estimates are given in the second block of Table 3. The estimates of α0(t), γ2(u, θ2) and γ3(u, θ3) are presented in Figure 3, along with 95% pointwise confidence intervals. Figure 3 shows that CD4 cell counts rise significantly for subjects who switch to ZDV+ddI or ZDV+ddI+NVP compared to ZDV. The estimated switching-treatment effects and their 95% confidence intervals are above zero, suggesting that switching to ZDV+ddI or ZDV+ddI+NVP improves CD4 counts for patients who had not yet developed the T215Y/F drug resistance mutation.

Figure 3.

Estimated effects of adaptive treatment randomizations before the T215Y/F mutation based on the ACTG 244 data using unit weight and h = 0.41. Figure (a) shows the estimated baseline function α̂0(t) with 95% pointwise confidence intervals; (b) and (c) show the point and 95% confidence interval estimates of γk(u), k = 2, 3, respectively, under model (12).

The ACTG 244 study was previously analyzed using standard linear regression methods in an unpublished manuscript. The previous analysis was inefficient because the longitudinal nature of the data was not considered. A brief description of these methods is given in Web Appendix D. To compare our data analysis with benchmark methods for longitudinal data analysis, we also analyzed the data using the SAS procedure Proc Glimmix for generalized linear mixed models (GLMM) under comparable models, which are presented in Web Appendix D. These results are in line with those obtained using our methods. The analysis using Proc Glimmix yields somewhat narrower confidence intervals, possibly because of the parametric specification of the baseline function α0(t) and more efficient weight function selection, since those methods use completely parametric models. There are no existing methods for testing the functional forms of covariate effects, even under the GLMM.

8. Concluding remarks

This article is motivated by investigating the exposure-varying effects of adaptive treatment randomizations on longitudinal biomarkers, illustrated by the ACTG 244 AIDS clinical trial. We developed estimation and hypothesis testing procedures for a generalized semiparametric varying-coefficient model for longitudinal data with a general link function. The choice of weight process is an important and complicated issue. We conducted some investigation into the two-stage estimation procedure for choosing the weight function within the framework of the marginal approach. The simulation results presented in Appendix C of the Web-based Supplementary Material show that the two-stage estimation procedure can be adopted to improve efficiency, where the first stage estimator is based on the unit weight and the second stage estimator is obtained with the weight estimated based on the first stage estimator. This article considers parametric forms γ(Ui(t), θ) of the covariate-varying effects for γ(Ui(t)). Nonparametric modeling of covariate-varying effects would provide greater flexibility when sufficient data are available. The theoretical development for estimating the nonparametric component γ(u) is, however, significantly different from that considered in this paper because the nonparametric functions α(t) and γ(u) have different domains, yet the smoothing for α(t) and γ(u) cannot be totally separated because γ(u) is a function of the covariate process Ui(t) that may vary with t. This merits future research.

Acknowledgments

The authors thank the Editor, the Associate Editor and two referees for their constructive comments and suggestions that greatly improved this article. This research was partially supported by NIAID NIH award number R37AI054165, and the research of Yanqing Sun was partially supported by National Science Foundation grants DMS-1208978 and DMS-1513072 and the Reassignment of Duties fund provided by the University of North Carolina at Charlotte. The authors thank the AIDS Clinical Trials Group for providing the ACTG 244 data, in particular Ronald Bosch and Justin Ritz for preparing the data set, reviewing the manuscript, and helpful discussions. We also wish to thank the ACTG 244 study participants and study team, including the study chairs Douglas L. Mayers and Thomas C. Merigan. The project described was supported by Award Numbers U01 A038855, AI038858, AI068634 and AI068636 from the National Institute of Allergy and Infectious Diseases and supported by the National Institute of Mental Health (NIMH) and the National Institute of Dental and Craniofacial Research (NIDCR). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institutes of Health.

Footnotes

Supplementary Materials

Web Appendices for the proofs of the theorems referenced in Section 4, additional simulations and data analysis referenced in Section 7 and Section 8, along with the data and computer code, are available with this paper at the Biometrics website on the Wiley Online Library.

References

  1. Fan J, Gijbels I. Local Polynomial Modelling and Its Applications. Chapman and Hall; London: 1996. [Google Scholar]
  2. Fan J, Huang T, Li R. Analysis of longitudinal data with semiparametric estimation of covariance function. Journal of the American Statistical Association. 2007;102(478):632–641. doi: 10.1198/016214507000000095. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Fan J, Li R. New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. Journal of the American Statistical Association. 2004;99(467):710–723. [Google Scholar]
  4. Japour AJ, Welles S, D’Aquila RT, Johnson VA, Richman DD, Coombs RW, Reichelderfer PS, Kahn JO, Crumpacker CS, Kuritzkes DR. Prevalence and clinical significance of zidovudine resistance mutations in human immunodeficiency virus isolated from patients after long-term zidovudine treatment. Journal of Infectious Diseases. 1995;171(5):1172–1179. doi: 10.1093/infdis/171.5.1172. [DOI] [PubMed] [Google Scholar]
  5. Jones MC, Marron JS, Park BU. A simple root n bandwidth selector. The Annals of Statistics. 1991;19(4):1919–1932. [Google Scholar]
  6. Kaufmann D, Pantaleo G, Sudre P, Telenti A. CD4-cell count in HIV-1-infected individuals remaining viraemic with highly active antiretroviral therapy (HAART) Lancet. 1998;351(9104):723–724. doi: 10.1016/s0140-6736(98)24010-4. [DOI] [PubMed] [Google Scholar]
  7. Larder BA, Kellam P, Kemp SD. Zidovudine resistance predicted by direct detection of mutations in DNA from HIV-infected lymphocytes. AIDS. 1991;5(2):137–144. doi: 10.1097/00002030-199102000-00002. [DOI] [PubMed] [Google Scholar]
  8. Lin DY, Wei L-J, Ying Z. Checking the cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80(3):557–572. [Google Scholar]
  9. Mellors JW, Munoz A, Giorgi JV, Margolick JB, Tassoni CJ, Gupta P, Kingsley LA, Todd JA, Saah AJ, Detels R, et al. Plasma viral load and CD4+ lymphocytes as prognostic markers of HIV-1 infection. Annals of Internal Medicine. 1997;126(12):946–954. doi: 10.7326/0003-4819-126-12-199706150-00003. [DOI] [PubMed] [Google Scholar]
  10. Qu A, Li R. Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics. 2006;62(2):379–391. doi: 10.1111/j.1541-0420.2005.00490.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Rice JA, Silverman BW. Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society Series B (Methodological) 1991;10(6):233–243. [Google Scholar]
  12. Sun Y, Qian X, Shou Q, Gilbert PB. Analysis of two-phase sampling data with semiparametric additive hazards models. Lifetime Data Analysis. 2016 doi: 10.1007/s10985-016-9363-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Sun Y, Sun L, Zhou J. Profile local linear estimation of generalized semiparametric regression model for longitudinal data. Lifetime Data Analysis. 2013;19:317–349. doi: 10.1007/s10985-013-9251-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Tian L, Zucker D, Wei LJ. On the Cox model with time-varying regression coefficients. Journal of the American Statistical Association. 2005;100:172–183. [Google Scholar]
  15. Zhou H, Wang C-Y. Failure time regression with continuous covariates measured with error. Journal of the Royal Statistical Society: Series B. 2000;62(4):657–665. [Google Scholar]
