A Varying-Coefficient Model for the Evaluation of Time-Varying Concomitant Intervention Effects in Longitudinal Studies

Colin O Wu; Xin Tian; Heejung Bang

doi:10.1002/sim.3262

Author manuscript; available in PMC: 2015 Oct 23.

Published in final edited form as: Stat Med. 2008 Jul 20;27(16):3042–3056. doi: 10.1002/sim.3262

PMCID: PMC4615703 NIHMSID: NIHMS66649 PMID: 18351714

Summary

Concomitant interventions are often introduced during a longitudinal clinical trial to patients who respond undesirably to the pre-specified treatments. In addition to the main objective of evaluating the pre-specified treatment effects, an important secondary objective in such a trial is to evaluate whether a concomitant intervention could change a patient’s response over time. Because the initiation of a concomitant intervention may depend on the patient’s general trend of pre-intervention outcomes, regression approaches that treat the presence of the intervention as a time-dependent covariate may lead to biased estimates for the intervention effects. Borrowing the techniques of Follmann and Wu (1995) for modeling informative missing data, we propose a varying-coefficient mixed-effects model to evaluate the patient’s longitudinal outcome trends before and after the patient’s starting time of the intervention. By allowing the random coefficients to be correlated with the patient’s starting time of the intervention, our model leads to less biased estimates of the intervention effects. Nonparametric estimation and inferences of the coefficient curves and intervention effects are developed using B-splines. Our methods are demonstrated through a longitudinal clinical trial in depression and heart disease and a simulation study.

Keywords: Change-Point Models, Concomitant Intervention, Longitudinal Study, Polynomial Splines, Shared Parameter Model, Varying-Coefficient Model

1 Introduction

A main objective of longitudinal analysis in clinical trials is to evaluate the effects of covariates of interest on the time-trends of the outcome variables. The treatment effects are usually modeled using a time-invariant categorical covariate, while the other covariates can be either time-invariant or time-dependent. Recent advances in longitudinal analysis have led to a wide range of regression methods. For parametric models, estimation and inference procedures, such as maximum likelihood estimation, the restricted maximum likelihood (REML) estimation and the generalized estimating equations (GEEs), can be found [1–4]. For models involving nonparametric components, smoothing methods, such as local polynomials and splines, are often used [5–9].

The above regression approaches generally lead to satisfactory results when the subjects are properly randomized so that the treatments and the covariates are not subject to “selection bias”. In many longitudinal studies, however, concomitant interventions are initiated, usually due to ethical reasons, to patients who exhibit less satisfactory trends in their medical outcomes. This phenomenon bears some resemblance to longitudinal studies with informative missing data, where patients with undesirable outcome trends tend to drop out early from the study, except that in our situation the outcomes of these patients are continuously observed under the additional concomitant treatment. In a randomized longitudinal clinical trial with pre-specified treatments, patients who have taken a concomitant intervention in addition to their assigned treatments may generally have different disease pathology from those who do not need the intervention. Thus, in addition to the primary goal of evaluating the effect of the main treatment, an important objective is to evaluate the effects of the additional concomitant intervention(s) among eligible patients.

Our motivating example is the Enhancing Recovery in Coronary Heart Disease (ENRICHD) study, a randomized clinical trial that evaluated the efficacy of a psychosocial treatment versus usual cardiological care on survival and depression severity in 2,481 patients with depression and/or low perceived social support after acute myocardial infarction. Depression severity was measured by the Hamilton Rating Scale for Depression (HRSD) and the Beck Depression Inventory (BDI), where higher HRSD and BDI scores indicate more severe depression. In addition to the randomized treatments, patients with high baseline depression scores and/or nondecreasing BDI trends were eligible for pharmacotherapy with antidepressants. Antidepressants were also prescribed at the request of the patients or their primary care physicians. Details of the study design, objectives and major findings of the trial have been described [10,11]. Taylor et al. [12] compared the survival rates for death and cardiovascular morbidity and mortality among 1,834 depressed patients in this trial and found that the use of selective serotonin reuptake inhibitors seemed to reduce subsequent cardiovascular morbidity and mortality. Bang and Robins [13] also analyzed the same data, but only a cross-sectional component. However, the question of whether these antidepressant medications had added benefits for lowering the BDI scores of patients in the psychosocial treatment arm who had undergone this concomitant intervention during the trial was not addressed. To answer this question, we treat the patient’s starting time of pharmacotherapy as a subject-specific “change-point” and evaluate the effects of pharmacotherapy on the patient’s BDI scores over time using different models before and after the “change-point”. We attempt to model the relationships between the subject’s parameters for BDI trends and his/her change-point time using a B-spline nonparametric approach. It is important to note that in the ENRICHD trial pharmacotherapy was not simply initiated based on certain common HRSD or BDI threshold values; a threshold-based initiation rule may require a different methodology.

Methodologically, Murphy, van der Laan and Robins [14] studied estimation and causal inferences for the mean responses to dynamic treatment regimens that were tailored to subjects’ individual needs. Their designs involve a set of treatment intervals specified by selected time points and a pre-specified sequential randomization rule that assigns subjects to different treatment levels. The main difference between our data structure and their dynamic regimens is that the concomitant interventions considered in this paper are often not assigned based on specific treatment rules, so that their estimating equations may not be directly applicable to the current context. In addition, Wall, Dai and Eberly [15] examined the impact of a misspecified time-varying covariate when they analyzed the effect of (nonrandomized) alcoholism treatment on medical utilization in the GEE framework. They implicitly suggested a change-point model as a possible remedy but did not provide a solution.

We propose in this paper a nonparametric approach for estimating the effects of a concomitant intervention in longitudinal clinical trials. The proposed methodology is based on a varying-coefficient mixed-effects model and a B-spline least-squares estimation method. A key question that has not been previously well-understood in the literature is why a naive linear mixed-effects model (e.g., Verbeke and Molenberghs [3], Sec. 3.3) could lead to biased estimates when a concomitant intervention is present. Using the general framework of the change-point shared-parameter model (Section 2.3), we show that the mixed-effects models without properly incorporating the joint distributions of the random parameters and the starting time of the concomitant intervention are misspecified models. On the other hand, our varying-coefficient mixed-effects model is a flexible nonparametric version of the change-point shared-parameter model which can adequately incorporate the concomitant intervention starting time when the joint distribution of the random parameters and the concomitant intervention starting time is completely unknown. Because our model may include both parametric and nonparametric components, our B-spline estimation method is more flexible than the usual local smoothing methods, such as kernel or local polynomial methods, in the sense that it can be naturally adapted to both parametric and nonparametric situations [8, 9]. Although certain components of our approach, such as B-splines, have been used for other longitudinal settings in the literature [5–9], the systematic modeling and estimation procedure proposed in this paper fills a gap for obtaining unbiased estimates in longitudinal clinical trials with the presence of a concomitant intervention.

Our modeling approach, however, shares some similarities with the shared parameter model that addresses the informative missingness [16,17]. Follmann and Wu [17] used this approach to link a mixed-effects model for the response variable and a marginal model for the characteristics of missing data, such as time to drop-out. Their conditional model resembles our varying-coefficient model with only the observations before the change-points. In contrast, our model also includes the response curves after the change-points. Unlike the classical change-point problems where all the subjects have the same change-point and its location is the unknown parameter to be estimated, individual starting times of the concomitant intervention are observed in our data [18,19].

We describe our regression models and their biological interpretations in Section 2. Subsequently, we propose a class of nonparametric estimation and inference procedures based on B-splines in Sections 3 and 4, apply these methods to the ENRICHD data in Section 5, and present a simulation study in Section 6. Finally, we discuss some potential extensions of our methods in Section 7.

2 Change-Point Mixed-Effects Models

2.1 Data Structure

We consider a study with n randomly selected subjects. For the ith subject, n_i is the number of visits, T_ij ∈ [𝒯₀, 𝒯₁] is the trial time or study time at the jth visit, Y_ij is the real-valued outcome measured at T_ij and X_i = (1, X_1i, …, X_Pi)^T is the ℝ^{P+1}-valued time-invariant covariate vector. Here [𝒯₀, 𝒯₁] is the known time interval for the study. We assume that the study has one concomitant intervention, and each subject has one change-point from non-intervention to intervention within the study period [𝒯₀, 𝒯₁]. Let S_i be the ith subject’s intervention starting time or change-point time, and δ_ij = 1_{[T_ij≥S_i]} the intervention indicator at the jth visit. The time from the concomitant intervention to the jth visit is R_ij = T_ij − S_i. The observed data are {(T_ij, Y_ij, X_i, S_i); 1 ≤ j ≤ n_i, 1 ≤ i ≤ n}.
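The derived quantities δ_ij and R_ij follow directly from the visit times and the change-point. As a minimal Python sketch (the function name `intervention_terms` and the example visit times are hypothetical, chosen only for illustration):

```python
import numpy as np

def intervention_terms(t, s):
    """Return (delta, r) for one subject's visit times t and change-point s."""
    t = np.asarray(t, dtype=float)
    delta = (t >= s).astype(int)  # delta_ij = 1[T_ij >= S_i]
    r = t - s                     # R_ij = T_ij - S_i (negative before the change-point)
    return delta, r

# Hypothetical subject: five visits (in months), intervention started at month 2.
t_i = [0.0, 1.0, 2.0, 3.5, 5.0]
delta_i, r_i = intervention_terms(t_i, s=2.0)
```

Note that R_ij is only used in the model when δ_ij = 1, so its negative pre-intervention values never enter the fit.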

Because our objective is to investigate the relationships between Y_ij and (T_ij,X_i) before and after the change-point S_i, our model does not include subjects who have not received the concomitant intervention within the study period [𝒯₀, 𝒯₁]. In some situations, these subjects may provide useful information for the pre-intervention covariate effects on Y_ij. However, models incorporating such subjects require additional assumptions and/or models on the treatment receipt pattern and disease pathology. Hence, discussion on this problem is out of the scope of this paper. When the context is clear, we may interchange intervention with concomitant intervention and denote the random variables by (T, Y,X, S, δ,R).

2.2 A Naive Linear Mixed-Effects Model: Review

Intuitively we may evaluate the intervention effects by comparing the response trajectories before and after the change-points. For the ease of notation, we discuss the case without X. Suppose that Y_ij are given by Y_ij = a_0i + a_1iT_ij + ε_ij when T_ij < S_i and (a_0i + b_0i) + (a_1i + b_1i)T_ij + ε_ij when T_ij ≥ S_i for some subject-specific parameters (a_0i, a_1i, b_0i, b_1i) and measurement errors ε_ij. The individual intervention effects for the ith subject are characterized by (b_0i, b_1i), and the marginal intervention effects for the population are then characterized by E(b_0i, b_1i)^T = (β₀, β₁)^T. Using the framework of linear mixed-effects models (see Ch.3 in [3]), an intuitive change-point model is

\left\{ \begin{matrix} Y_{i j} = T_{i j}^{T} a_{i} + {(T_{i j}^{*})}^{T} b_{i} + ε_{i j}, \\ {(a_{i}^{T}, b_{i}^{T})}^{T} \sim MVN ({(α^{T}, β^{T})}^{T}, Γ) \quad \text{for some unknown } (α, β, Γ), \end{matrix} \right.

(2.1)

where, for known constants D₁ and D₂, a_i = (a_0i, …, a_D₁i)^T, b_i = (b_0i, …, b_D₂i)^T, $T_{i j} = {(1, T_{i j}, \dots, T_{i j}^{D_{1}})}^{T}$ , $T_{i j}^{*} = {(δ_{i j}, δ_{i j} T_{i j}, \dots, δ_{i j} T_{i j}^{D_{2}})}^{T}$ , ε_ij are mean zero error processes, and ${(a_{i}^{T}, b_{i}^{T})}^{T}$ and ε_i = (ε_i1, …, ε_{in_i})^T are independent. The intervention effects are characterized by b_i for the ith subject and E(b_i) = β for the population. A crucial assumption of (2.1) is that {a_i, b_i} and S_i are independent. Although it appears that the time-varying intervention is incorporated as a covariate by the term involving δ_ij, we will demonstrate in Sections 5 and 6 that, by ignoring the correlations between {a_i, b_i} and S_i in the distributional assumption for {a_i, b_i}, (2.1) is a misspecified model for our data and may lead to biased estimates for the intervention effects.

2.3 The Shared-Parameter Model

To model the initiation of the concomitant intervention, a natural extension of (2.1) is to allow the intervention starting time S_i to be correlated with the pre-intervention random coefficients a_i or more generally {a_i, b_i}. When the context is clear, we will denote by μ₁(·; a_i) and [μ₁(·; a_i) + μ₂(·; b_i)] the subject-specific response curves before and after the start of the intervention, respectively. We interpret μ₂(·; b_i) as the intervention effect. Given {T_ij ,X_i, S_i}, our shared-parameter model is

\left\{ \begin{matrix} Y_{i j} = μ_{1} (T_{i j}, X_{i}; a_{i}) + δ_{i j} μ_{2} (T_{i j}, X_{i}, R_{i j}; b_{i}) + ε_{i j}, \\ {(a_{i}^{T}, b_{i}^{T}, S_{i})}^{T} \sim \text{Joint Distribution}, \end{matrix} \right.

(2.2)

where ε_ij are mean zero errors with cov(ε_ij₁, ε_ij₂) = σ_ij₁j₂, ε_i₁j₁ and ε_i₂j₂ are independent if i₁ ≠ i₂, and, conditioning on {a_i, b_i}, S_i and {T_ij, X_i} are independent. In addition, we assume that {a_i, b_i} and {T_ij, X_i} are independent.

Let Y_i = (Y_i1, …, Y_{in_i})^T, T_i = (T_i1, …, T_{in_i})^T and H(·, ·) be the joint distribution function of {a_i, b_i}. The joint likelihood of ${(Y_{i}^{T}, S_{i})}^{T}$ given {T_i,X_i} is

f (Y_{i}, S_{i} | T_{i}, X_{i}) = \int f (Y_{i} | T_{i}, X_{i}, S_{i}, a_{i}, b_{i}) f (S_{i} | a_{i}, b_{i}) d H (a_{i}, b_{i}),

(2.3)

where f(·|·) denotes the conditional density. Because of the extra f(S_i|a_i, b_i) in the integrand, (2.3) differs from the usual likelihood functions for the mixed-effects models (see p24 in [3]).

Unlike (2.1), (2.2) is a change-point model with shared parameters {a_i, b_i} which determine both the response curves of Y_ij and the distribution of S_i. The shared parameters approach was proposed for modeling the behaviors of informative missing data [7]. In (2.2), the correlation between S_i and a_i suggests that the ith subject’s intervention starting time is determined by the pre-intervention response curve μ₁, while the correlation between S_i and b_i suggests that S_i may also influence the response curve μ₂ that characterizes the intervention effect.

2.4 The Varying-Coefficient Mixed-Effects Model

The approach based on the joint likelihood (2.3) can be computationally complicated and requires some assumptions about the distribution of S_i. In this paper, we consider a simpler method based on the conditional model, which is robust to the distributional assumption of S_i. The conditional distribution can be written as

f (Y_{i} | S_{i}, T_{i}, X_{i}) = \int f (Y_{i} | T_{i}, X_{i}, S_{i}, a_{i}, b_{i}) d G (a_{i}, b_{i} | S_{i}) .

(2.4)

Then we can rewrite (2.2) as a varying-coefficient model using the conditional distribution of {a_i, b_i} given S_i. When μ₁ and μ₂ are linear functions, let $μ_{1} (T_{i j}, X_{i}; a_{i}) = Z_{i j}^{T} a_{i}$ for Z_ij = (Z_ij0, …, Z_ijD₁)^T generated by {(T_ij,X_i); 1 ≤ j ≤ n_i, δ_ij = 0}, and $μ_{2} (T_{i j}, X_{i}, S_{i}; b_{i}) = W_{i j}^{T} b_{i}$ for W_ij = (W_ij0, …, W_ijD₂)^T generated by {(T_ij,X_i, S_i); 1 ≤ j ≤ n_i, δ_ij = 1}. Writing α(S_i) = E(a_i|S_i), β(S_i) = E(b_i|S_i), $a_{i}^{*} = a_{i} - α (S_{i})$ and $b_{i}^{*} = b_{i} - β (S_{i})$ , our varying-coefficient mixed-effects model has the expression

\left\{ \begin{matrix} Y_{i j} = Z_{i j}^{T} [α (S_{i}) + a_{i}^{*}] + δ_{i j} W_{i j}^{T} [β (S_{i}) + b_{i}^{*}] + ε_{i j}, \\ {(a_{i}^{* T}, b_{i}^{* T})}^{T} | S_{i} \sim G (\cdot | S_{i}), \end{matrix} \right.

(2.5)

where, for S_i = s, G(·|s) is a distribution function with mean zero and covariance matrix $cov [{(a_{i}^{* T}, b_{i}^{* T})}^{T} | s] = C (s)$ . Marginal parameters of interest are α(s) and β(s). When S_i = s, the mean intervention effect is β(s), and β(s) = 0 for all s ∈ (𝒯₀, 𝒯₁) implies that the concomitant intervention had no marginal effect on the response curve.
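To illustrate why the dependence of the random coefficients on S_i matters, the following Python sketch simulates a deliberately simplified setting and compares two pooled least-squares fits. Everything in it is an assumption made for illustration: there is a random intercept only, E(a_0i | S_i) is linear and decreasing in S_i, the intervention effect is a constant shift β₀ = −3, and ordinary least squares stands in for the mixed-effects fitting used in this paper. The first fit ignores S_i; the second includes S_i in the intercept, in the spirit of α₀(S_i) in (2.5).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400                               # hypothetical number of subjects
visits = np.arange(0.0, 6.5, 0.5)     # hypothetical common visit schedule (months)
beta0_true = -3.0                     # assumed constant intervention effect

rows = []
for _ in range(n):
    s = rng.uniform(0.5, 5.5)                   # change-point S_i
    a0 = 25.0 - 2.0 * s + rng.normal(0.0, 2.0)  # E(a_0i | S_i) depends on S_i
    delta = (visits >= s).astype(float)
    y = a0 + beta0_true * delta + rng.normal(0.0, 1.0, size=visits.size)
    rows.append(np.column_stack([np.full(visits.size, s), delta, y]))
s_col, d_col, y_col = np.vstack(rows).T

def ols(X, y):
    """Ordinary least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones_like(y_col)
# Fit ignoring S_i: intercept and intervention indicator only.
beta0_naive = ols(np.column_stack([ones, d_col]), y_col)[1]
# Fit with intercept alpha_0(S_i) = gamma_00 + gamma_01 * S_i.
beta0_adjusted = ols(np.column_stack([ones, s_col, d_col]), y_col)[2]
```

In this setup subjects with high intercepts start early and therefore contribute more post-intervention visits, so the fit that ignores S_i confounds the intervention effect with the between-subject difference in intercepts, while the fit that conditions on S_i does not.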

An obvious choice for G(·|S_i) is the multivariate normal distribution with mean zero and covariance matrix $C = cov [{(a_{i}^{* T}, b_{i}^{* T})}^{T} | s]$ , which, for simplicity, is assumed to be time-invariant. Extensions to time-dependent covariances can be made by modeling C(s). Since explicit forms of G(·|S_i) are often unknown, modeling α(s) and β(s) is often more important than modeling C(s). In linear models, we have α(s;γ) = (α₀(s; γ₀), …, α_D₁(s; γ_D₁))^T, β(s; τ) = (β₀(s; τ₀), …, β_D₂(s; τ_D₂))^T,

α_{d} (s; γ) = \sum_{l = 0}^{L_{d}} γ_{d l} 𝒯_{d l} (s) and β_{d} (s; τ) = \sum_{m = 0}^{M_{d}} τ_{d m} 𝒯_{d m}^{*} (s)

(2.6)

where {L_d,M_d} are fixed, and ${𝒯_{d l} (s), 𝒯_{d m}^{*} (s)}$ are known transformations of s. The choice of 𝒯_dl(s) = s^l and $𝒯_{d m}^{*} (s) = s^{m}$ leads to polynomials for (2.6).

Extended linear models can be used to approximate {α(s), β(s)} when their parametric forms are unknown. Let {ℬ_d₁(s) = (ℬ_d₁0(s), …, ℬ_{d₁ℒ_d₁} (s))^T; 0 ≤ d₁ ≤ D₁} and { $ℬ_{d_{2}}^{*} (s) = {(ℬ_{d_{2} 0}^{*} (s), \dots, ℬ_{d_{2} ℳ_{d_{2}}}^{*} (s))}^{T}$ ; 0 ≤ d₂ ≤ D₂} be some pre-specified basis functions. Then α(s) and β(s) can be approximated by

α_{d} (s; γ) \approx \sum_{l = 0}^{ℒ_{d}} γ_{d l} ℬ_{d l} (s) and β_{d} (s; τ) \approx \sum_{m = 0}^{ℳ_{d}} τ_{d m} ℬ_{d m}^{*} (s),

(2.7)

where ℒ_d and ℳ_d may tend to infinity as n → ∞. Popular basis choices include truncated polynomial bases, Fourier bases or B-splines. In this paper, we restrict our attention to B-splines with fixed knot sequences because of their superior numerical stability. The smoothing parameters {ℒ_d,ℳ_d} may be chosen subjectively or by a variable selection procedure, such as cross-validation and information criteria [8,9,20]. An alternative smoothing approach is to approximate {α(s), β(s)} by smoothing splines [21,22]. Because the explicit expressions and statistical properties of smoothing spline estimators are generally different from B-splines, we do not discuss this class of estimators in this paper.
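For readers who wish to experiment with the basis approximation (2.7), the following Python sketch evaluates a B-spline basis with a fixed knot sequence using the standard Cox-de Boor recursion. The function name `bspline_basis`, the cubic default degree, the clamped boundary knots and the example knot locations are all assumptions made for illustration; they are not specified in this paper.

```python
import numpy as np

def bspline_basis(s, interior_knots, lower, upper, degree=3):
    """Evaluate all B-spline basis functions at the points s (Cox-de Boor recursion)."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    # Clamped knot vector: each boundary knot is repeated degree + 1 times.
    knots = np.concatenate([np.full(degree + 1, lower),
                            np.asarray(interior_knots, dtype=float),
                            np.full(degree + 1, upper)])
    n_basis = len(knots) - degree - 1
    # Degree-0 functions: indicators of the half-open knot intervals.
    B = np.zeros((s.size, len(knots) - 1))
    for k in range(len(knots) - 1):
        B[:, k] = (knots[k] <= s) & (s < knots[k + 1])
    # Close the last non-empty interval on the right so that s == upper is covered.
    B[s == upper, n_basis - 1] = 1.0
    # Raise the degree one step at a time.
    for p in range(1, degree + 1):
        B_new = np.zeros((s.size, len(knots) - 1 - p))
        for k in range(len(knots) - 1 - p):
            left_den = knots[k + p] - knots[k]
            right_den = knots[k + p + 1] - knots[k + 1]
            if left_den > 0:
                B_new[:, k] += (s - knots[k]) / left_den * B[:, k]
            if right_den > 0:
                B_new[:, k] += (knots[k + p + 1] - s) / right_den * B[:, k + 1]
        B = B_new
    return B  # shape (len(s), n_basis)

# Hypothetical example: cubic basis on [0, 6] with two equally spaced interior knots.
grid = np.linspace(0.0, 6.0, 25)
basis = bspline_basis(grid, interior_knots=[2.0, 4.0], lower=0.0, upper=6.0)
```

With two interior knots and cubic pieces this gives six basis functions, and each row of `basis` sums to one, which is the partition-of-unity property that contributes to the numerical stability mentioned above.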

3 Estimation Methods

3.1 Likelihood-Based Estimation

If (2.3) has an explicit parametric expression, the parameters can in principle be estimated by maximizing the log-likelihood $\sum_{i = 1}^{n} log f (Y_{i}, S_{i} | T_{i}, X_{i})$ . Suppose that (2.5) and (2.6) are satisfied, G(·|s) is Gaussian, and (ε_i1, …, ε_{in_i})^T ~ N(0, Γ_i). We can estimate {γ, τ} by maximizing the partial likelihood

L ({γ, τ} | T_{i}, X_{i}, S_{i}) = \sum_{i = 1}^{n} log [\int f (Y_{i} | T_{i}, X_{i}, S_{i}, a_{i}, b_{i}) d G (a_{i}, b_{i} | S_{i})] .

(3.1)

Let 𝒲_i be the matrix whose jth row is $(Z_{i j}^{T}, δ_{i j} W_{i j}^{T})$ , $𝒯_{d} (s) = {(𝒯_{d 0} (s), \dots, 𝒯_{d L_{d}} (s))}^{T}$ , $𝒯_{d}^{*} (s) = {(𝒯_{d 0}^{*} (s), \dots, 𝒯_{d M_{d}}^{*} (s))}^{T}$ , $𝒯 (s) = diag \{𝒯_{0}^{T} (s), \dots, 𝒯_{D_{1}}^{T} (s), 𝒯_{0}^{* T} (s), \dots, 𝒯_{D_{2}}^{* T} (s)\}$ , $𝒯_{i} = 𝒯 (S_{i})$ , and V_i be the covariance matrix of

e_{i j} = Z_{i j}^{T} a_{i}^{*} + δ_{i j} W_{i j}^{T} b_{i}^{*} + ε_{i j}, 1 \leq j \leq n_{i} .

(3.2)

The matrix representation for (2.6) is (α^T (s; γ), β^T(s; τ))^T = 𝒯 (s)(γ^T, τ^T)^T, where γ_d = (γ_d0, …, γ_{dL_d})^T, $γ = {(γ_{0}^{T}, \dots, γ_{D_{1}}^{T})}^{T}$ , τ_d = (τ_d0, …, τ_{dM_d})^T and $τ = {(τ_{0}^{T}, \dots, τ_{D_{2}}^{T})}^{T}$ . When V_i are known, maximizing (3.1) leads to

\begin{pmatrix} {\hat{γ}}_{M L} (𝒯) \\ {\hat{τ}}_{M L} (𝒯) \end{pmatrix} = \left\{ \sum_{i = 1}^{n} {[𝒲_{i} 𝒯_{i}]}^{T} V_{i}^{- 1} [𝒲_{i} 𝒯_{i}] \right\}^{- 1} \left\{ \sum_{i = 1}^{n} {[𝒲_{i} 𝒯_{i}]}^{T} V_{i}^{- 1} Y_{i} \right\}

(3.3)

provided that $\sum_{i = 1}^{n} [{(𝒲_{i} 𝒯_{i})}^{T} V_{i}^{- 1} (𝒲_{i} 𝒯_{i})]$ is nonsingular. When V_i are unknown but can be consistently estimated by a non-singular V̂_i, we can estimate {γ, τ} by {γ̃_ML(𝒯), τ̃_ML(𝒯)} which are given by (3.3) with V_i substituted by V̂_i.

Substituting ${𝒯_{d l} (s), 𝒯_{d m}^{*} (s)}$ in (3.3) with the basis functions ${ℬ_{d l} (s), ℬ_{d m}^{*} (s)}$ , we can compute {γ̂_ML(ℬ), τ̂_ML(ℬ)}. Likelihood-based nonparametric estimators of {α(s), β(s)} under (2.7) and known V_i are

{({\hat{α}}_{M L}^{T} (s; ℬ), {\hat{β}}_{M L}^{T} (s; ℬ))}^{T} = ℬ (s) {({\hat{γ}}_{M L}^{T} (ℬ), {\hat{τ}}_{M L}^{T} (ℬ))}^{T},

where ℬ_i = ℬ (S_i), ℬ (s) is defined similarly to 𝒯 (s) with ${𝒯_{d l} (s), 𝒯_{d m}^{*} (s)}$ replaced by ${ℬ_{d l} (s), ℬ_{d m}^{*} (s)}$ . Nonparametric estimators computed with V̂_i used in (3.3) are

{({\tilde{α}}_{M L}^{T} (s; ℬ), {\tilde{β}}_{M L}^{T} (s; ℬ))}^{T} = ℬ (s) {({\tilde{γ}}_{M L}^{T} (ℬ), {\tilde{τ}}_{M L}^{T} (ℬ))}^{T} .

3.2 Least-Squares Based Estimation

Likelihood-based estimates of {α(s), β(s)} cannot be computed when the explicit forms of G(·|S_i) and the distribution of ε_ij are unknown. In such situations, a practical approach is to first parameterize {α(s), β(s)} by certain parametric models {α(s; γ), β(s; τ)} and then derive the weighted least-squares estimators {γ̂_LS, τ̂_LS} which minimize

ℓ (γ, τ) = \sum_{i = 1}^{n} {{[Y_{i} - (Z_{i}^{T} α (S_{i}; γ) + {(δ W)}_{i}^{T} β (S_{i}; τ))]}^{T} \times Λ_{i} [Y_{i} - (Z_{i}^{T} α (S_{i}; γ) + {(δ W)}_{i}^{T} β (S_{i}; τ))]},

(3.4)

where Z_i = (Z_i1, …, Z_{in_i})^T, (δW)_i = (δ_i1W_i1, …, δ_{in_i}W_{in_i})^T, and Λ_i are some pre-specified symmetric nonsingular n_i × n_i weight matrices. The weighted least-squares estimators for (2.6) are

\begin{pmatrix} {\hat{γ}}_{L S} (𝒯) \\ {\hat{τ}}_{L S} (𝒯) \end{pmatrix} = \left\{ \sum_{i = 1}^{n} {[𝒲_{i} 𝒯_{i}]}^{T} Λ_{i} [𝒲_{i} 𝒯_{i}] \right\}^{- 1} \left\{ \sum_{i = 1}^{n} {[𝒲_{i} 𝒯_{i}]}^{T} Λ_{i} Y_{i} \right\},

(3.5)

where $\sum_{i = 1}^{n} {[𝒲_{i} 𝒯_{i}]}^{T} Λ_{i} [𝒲_{i} 𝒯_{i}]$ is nonsingular, and the jth row of 𝒲_i is $(Z_{i j}^{T}, δ_{i j} W_{i j}^{T})$ . Substituting the basis approximations (2.7) in (3.4), the least-squares based nonparametric estimators of {α(s), β(s)} are

{({\tilde{α}}_{L S}^{T} (s; ℬ), {\tilde{β}}_{L S}^{T} (s; ℬ))}^{T} = ℬ (s) {({\tilde{γ}}_{L S}^{T} (ℬ), {\tilde{τ}}_{L S}^{T} (ℬ))}^{T},

(3.6)

where {γ̂_LS(ℬ), τ̂_LS(ℬ)} are given in (3.5) with 𝒯 (s) replaced by ℬ(s). Consistency and the rates of convergence for (3.6) can be derived [9].

Clearly, (3.5) and (3.6) are the same as the likelihood-based estimators when $Λ_{i} = V_{i}^{- 1}$ and normality assumptions hold. In practice, V_i are usually unknown and often difficult to estimate, so that subjective choices for Λ_i are used. Guidance on this choice is also available [8,9,23].
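The closed form (3.5) amounts to accumulating and solving weighted normal equations over subjects. The following Python sketch shows one way this could be coded, under the simplifying assumption that every coefficient curve uses the same basis evaluated at S_i, so that 𝒲_i𝒯_i reduces to a Kronecker product; the paper itself allows a different basis for each coefficient. The function names `design_block` and `wls_fit`, the polynomial basis (1, s), and the noise-free test data are hypothetical.

```python
import numpy as np

def design_block(ZW_i, basis_at_s):
    """Rows of W_i T_i when all coefficients share one basis evaluated at S_i."""
    # ZW_i: (n_i, D) matrix with rows (Z_ij^T, delta_ij W_ij^T); basis_at_s: (L,) vector.
    return np.kron(ZW_i, basis_at_s[None, :])  # shape (n_i, D * L)

def wls_fit(blocks, ys, weights=None):
    """Solve the weighted normal equations of (3.5); weights default to identity."""
    p = blocks[0].shape[1]
    lhs, rhs = np.zeros((p, p)), np.zeros(p)
    for i, (U, y) in enumerate(zip(blocks, ys)):
        Lam = np.eye(len(y)) if weights is None else weights[i]
        lhs += U.T @ Lam @ U
        rhs += U.T @ Lam @ y
    return np.linalg.solve(lhs, rhs)

# Hypothetical check: noise-free responses, so the fit should reproduce xi_true.
rng = np.random.default_rng(1)
xi_true = np.array([20.0, -2.0, -0.5, 0.1, -3.0, 0.0])
t = np.arange(0.0, 6.5, 0.5)
blocks, ys = [], []
for _ in range(30):
    s = rng.uniform(0.5, 5.5)
    delta = (t >= s).astype(float)
    ZW = np.column_stack([np.ones_like(t), t, delta])  # Z_ij = (1, T_ij), W_ij = 1
    U = design_block(ZW, np.array([1.0, s]))           # basis (1, s) for each coefficient
    blocks.append(U)
    ys.append(U @ xi_true)
xi_hat = wls_fit(blocks, ys)
```

Passing `weights=[np.linalg.inv(V) for V in covs]` for a list `covs` of known or estimated V_i would give the likelihood-based form (3.3) instead of the identity-weighted fit.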

3.3 Estimation of the Covariances

The covariance structure V_i defined in Section 3.1 can be modeled in a number of ways. By the definition of e_ij in (3.2), the (j₁, j₂)th component of V_i is

V_{i, j_{1}, j_{2}} = E (e_{i j_{1}} e_{i j_{2}}) = ρ_{i, j_{1}, j_{2}} (A, B, C) + σ_{i, j_{1}, j_{2}},

(3.7)

where $A = E (a_{i}^{*} a_{i}^{* T}), B = E (b_{i}^{*} b_{i}^{* T}), C = E (a_{i}^{*} b_{i}^{* T})$ , σ_i,j₁,j₂ = E(ε_ij₁ε_ij₂) and

ρ_{i, j_{1}, j_{2}} (A, B, C) = Z_{i j_{1}}^{T} A Z_{i j_{2}} + Z_{i j_{1}}^{T} C (δ_{i j_{2}} W_{i j_{2}}) + (δ_{i j_{1}} W_{i j_{1}}^{T}) C^{T} Z_{i j_{2}} + (δ_{i j_{1}} W_{i j_{1}}^{T}) B (δ_{i j_{2}} W_{i j_{2}}) .

For the special case that ε_ij are independent measurement errors such that σ_i,j₁,j₂ = 0 if j₁ ≠ j₂ and σ² if j₁ = j₂, V_i adopts the parametric model V_i(A, B, C, σ²) with V_i,j₁,j₂ = ρ_i,j₁,j₂(A, B, C) if j₁ ≠ j₂ and ρ_i,j,j(A, B, C)+ σ² if j₁ = j₂ = j. Other structures for V_i can be formulated by modeling σ_i,j₁,j₂ [4].
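For the independent-error special case, V_i can be assembled in one matrix product by stacking the design rows (Z_ij^T, δ_ij W_ij^T) and the blocks A, B and C. The Python sketch below does this; the cross-covariance C = E(a_i^* b_i^{*T}) enters once as C and once as its transpose, which is what expanding E(e_ij₁ e_ij₂) from (3.2) gives. The function name `random_effects_cov` and all numerical values are hypothetical.

```python
import numpy as np

def random_effects_cov(Z_i, dW_i, A, B, C, sigma2):
    """V_i = U_i Sigma U_i^T + sigma2 * I for independent measurement errors."""
    U = np.hstack([Z_i, dW_i])            # rows (Z_ij^T, delta_ij W_ij^T)
    Sigma = np.block([[A, C], [C.T, B]])  # joint covariance of (a_i*, b_i*)
    return U @ Sigma @ U.T + sigma2 * np.eye(U.shape[0])

# Hypothetical subject with four visits and change-point s = 1.5.
t = np.array([0.0, 1.0, 2.0, 3.0])
s = 1.5
delta = (t >= s).astype(float)
Z_i = np.column_stack([np.ones_like(t), t])        # Z_ij = (1, T_ij)
dW_i = np.column_stack([delta, delta * (t - s)])   # delta_ij * (1, R_ij)
A = np.array([[4.0, 0.5], [0.5, 1.0]])
B = np.array([[2.0, 0.0], [0.0, 0.5]])
C = np.array([[-1.0, 0.0], [0.0, 0.0]])
V_i = random_effects_cov(Z_i, dW_i, A, B, C, sigma2=1.0)
```

The off-diagonal entries of `V_i` equal ρ_i,j₁,j₂(A, B, C) and the diagonal entries add σ², matching the parametric model V_i(A, B, C, σ²) described above.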

For the general case of ε_ij having unknown correlation structures, σ_i,j₁,j₂ is a nonparametric component in (3.7) and hence can be either directly estimated or approximated by a parametric model. Under a different regression model, a local smoothing technique was suggested but can be computationally intensive [24]. To ease the computational burden, a consistent covariance estimator can be constructed by B-spline approximations [9]. Specifically, σ_i,j₁,j₂ can be approximated with B-splines by $σ_{i, j_{1}, j_{2}} (u, v) = \sum_{k = 1}^{K_{1}} \sum_{l = 1}^{K_{1}} u_{k l} ℬ_{k} (T_{i j_{1}}) ℬ_{l} (T_{i j_{2}})$ if j₁ ≠ j₂, and $\sum_{k = 1}^{K_{2}} v_{k} ℬ_{k} (T_{i j})$ if j₁ = j₂ = j, where {ℬ_k} is a spline basis with a fixed knot sequence, u = {u_kl = u_lk; k, l = 1, …, K₁} and v = {v_k; k = 1, …, K₂}. Substituting σ_i,j₁,j₂ (u, v) into (3.7), V_i is approximated by V_i(A, B, C, u, v) such that

V_{i, j_{1}, j_{2}} = \left\{ \begin{matrix} ρ_{i, j_{1}, j_{2}} (A, B, C) + \sum_{k = 1}^{K_{1}} \sum_{l = 1}^{K_{1}} u_{k l} ℬ_{k} (T_{i j_{1}}) ℬ_{l} (T_{i j_{2}}), & \text{if } j_{1} \neq j_{2}; \\ ρ_{i, j, j} (A, B, C) + \sum_{k = 1}^{K_{2}} v_{k} ℬ_{k} (T_{i j}), & \text{if } j_{1} = j_{2} = j . \end{matrix} \right.

Once an approximate parametric model for V_i is established, estimation of V_i can be achieved by least squares. Let $ê_{i j} = Y_{i j} - [Z_{i j}^{T} \hat{α} (S_{i}) + δ_{i j} W_{i j}^{T} \hat{β} (S_{i})]$ be the residual of Y_ij computed based on some consistent estimators α̂(s) and β̂(s). If V_i,j₁,j₂ = ρ_i,j₁,j₂ (A, B, C)+ σ_i,j₁,j₂ (u, v), we can estimate V_i by V_i(Â,B̂, Ĉ, û, v̂) where (Â, B̂, Ĉ, û, v̂) minimizes

\sum_{i = 1}^{n} \sum_{j_{1}, j_{2} = 1, j_{1} < j_{2}}^{n_{i}} \left\{ ê_{i j_{1}} ê_{i j_{2}} - [ρ_{i, j_{1}, j_{2}} (A, B, C) + \sum_{k} \sum_{l} u_{k l} ℬ_{k} (T_{i j_{1}}) ℬ_{l} (T_{i j_{2}})] \right\}^{2}

subject to u_kl = u_lk when j₁ ≠ j₂, and

\sum_{i = 1}^{n} \sum_{j = 1}^{n_{i}} \left\{ ê_{i j}^{2} - [ρ_{i, j, j} (A, B, C) + \sum_{k} v_{k} ℬ_{k} (T_{i j})] \right\}^{2},

when j₁ = j₂ = j.

Here, V_i(Â, B̂, Ĉ, û, v̂) need not be positive definite for a finite sample, although, by consistency, it is asymptotically positive definite [9]. The problem of imposing finite-sample positive definiteness on the spline estimators of V_i deserves substantial further investigation. The adequacy of V_i(Â, B̂, Ĉ, û, v̂) depends on the choices of knots and the degrees of the splines. Although it is possible to develop data-driven knots using cross-validation or generalized cross-validation, statistical properties of such procedures are currently unknown. Subjective knot choices, such as using a few equally spaced knots, often give satisfactory results in biomedical applications.

4 Inferences

4.1 Inferences for Linear Models

Following the classical inferential framework with linear mixed-effects models, we consider the inferences for the fixed effects of (2.5) and (2.6) given {(Z_i, δ_ij,W_ij, S_i); 1 ≤ i ≤ n, 1 ≤ j ≤ n_i}. From (3.5), E[γ̂_LS(𝒯)] = γ, E[τ̂_LS(𝒯)] = τ and the covariance of ${({\hat{γ}}_{L S}^{T} (𝒯), {\hat{τ}}_{L S}^{T} (𝒯))}^{T}$ is

{[\sum_{i = 1}^{n} {(𝒲_{i} 𝒯_{i})}^{T} Λ_{i} (𝒲_{i} 𝒯_{i})]}^{- 1} [\sum_{i = 1}^{n} {(𝒲_{i} 𝒯_{i})}^{T} Λ_{i} V_{i} Λ_{i} (𝒲_{i} 𝒯_{i})] {[\sum_{i = 1}^{n} {(𝒲_{i} 𝒯_{i})}^{T} Λ_{i} (𝒲_{i} 𝒯_{i})]}^{- 1} .

Let ${\hat{ξ}}_{L S} = {({\hat{γ}}_{L S}^{T} (𝒯), {\hat{τ}}_{L S}^{T} (𝒯))}^{T}$ , ξ = (γ^T, τ^T)^T, and ℒξ̂_LS and ℒξ be their corresponding linear combinations. Following the central limit theorem (see Sec. 1.9.3 in [25]), it can be shown that, when n is sufficiently large, ℒξ̂_LS is approximately distributed as N(ℒξ, Var(ℒξ̂_LS)), where Var(ℒξ̂_LS) can be derived from the covariance matrix of ξ̂_LS. Substituting Var(ℒξ̂_LS) with a consistent estimate $\hat{Var} (ℒ {\hat{ξ}}_{L S})$ , an approximate (1 − α) confidence interval for ℒξ is

ℒ {\hat{ξ}}_{L S} \pm Z_{α / 2} {[\hat{Var} (ℒ {\hat{ξ}}_{L S})]}^{1 / 2},

(4.1)

where Z_α/2 is the [100 × (1 − α/2)]th percentile of the standard normal distribution.
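The sandwich covariance of the weighted least-squares estimator and the interval (4.1) can be computed directly from the per-subject matrices. The following Python sketch assumes that the products 𝒲_i𝒯_i, the weights Λ_i and the covariances V_i (or their estimates) are already available as lists of arrays; the function names `sandwich_cov` and `wald_ci` and the small numerical example are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def sandwich_cov(blocks, weights, covs):
    """Covariance of the weighted least-squares estimator of xi."""
    bread = sum(U.T @ L @ U for U, L in zip(blocks, weights))
    meat = sum(U.T @ L @ V @ L @ U for U, L, V in zip(blocks, weights, covs))
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv

def wald_ci(l_vec, xi_hat, cov, level=0.95):
    """Approximate confidence interval (4.1) for the linear combination l_vec @ xi."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    est = float(l_vec @ xi_hat)
    se = float(np.sqrt(l_vec @ cov @ l_vec))
    return est - z * se, est + z * se

# Hypothetical example with two subjects, V_i = 2I and weights equal to the inverse of V_i.
U1 = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
U2 = np.array([[1.0, 0.5], [1.0, 1.5]])
covs = [2.0 * np.eye(3), 2.0 * np.eye(2)]
weights = [0.5 * np.eye(3), 0.5 * np.eye(2)]
cov_xi = sandwich_cov([U1, U2], weights, covs)
ci_low, ci_high = wald_ci(np.array([0.0, 1.0]), np.array([1.0, -2.0]), cov_xi)
```

When the weights equal the inverse covariances, as in this example, the sandwich form collapses to the inverse of the weighted cross-product matrix.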

Let L be any given matrix with rank(L) = d. The above asymptotic approximations can be used to test the null hypothesis, H₀: Lξ = C for a known constant vector C, versus the alternative, H_A: Lξ ≠ C. An α-level approximate χ²-test rejects the null hypothesis if

{(L {\hat{ξ}}_{L S} - C)}^{T} {L [cov ({\hat{ξ}}_{L S})] L^{T}}^{- 1} (L {\hat{ξ}}_{L S} - C) \geq χ_{d, α}^{2},

(4.2)

where $χ_{d, α}^{2}$ is the [100 × (1 − α)]th upper percentile of the $χ_{d}^{2}$ -distribution.
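As a small numerical illustration of the test (4.2), the Python sketch below computes the quadratic form for a hypothetical four-parameter fit and tests whether the last two parameters are both zero. The estimates and the diagonal covariance matrix are invented for the example. For d = 2 the χ² critical value has the closed form −2 log α, which avoids the need for a statistical library; for other d a χ² quantile routine would be required.

```python
import math
import numpy as np

def wald_statistic(L, xi_hat, cov, c):
    """Quadratic form (L xi_hat - c)^T [L cov L^T]^{-1} (L xi_hat - c)."""
    diff = L @ xi_hat - c
    return float(diff @ np.linalg.solve(L @ cov @ L.T, diff))

# Hypothetical estimates and covariance (not results from this paper).
xi_hat = np.array([1.0, 0.5, -2.0, -1.0])
cov = np.diag([0.5, 0.5, 0.5, 0.25])
L = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])  # selects the last two parameters, rank d = 2
stat = wald_statistic(L, xi_hat, cov, np.zeros(2))
critical = -2.0 * math.log(0.05)      # upper 5% point of chi-square with 2 df
reject = stat >= critical
```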

4.2 Bootstrap Confidence Intervals

Nonparametric inferences for the smoothing estimators of α(s) and β(s) can be constructed using the “resampling-subject” bootstrap [8,26]. For the construction of bootstrap pointwise confidence intervals (CIs), this procedure generates bootstrap samples { $(Y_{i j}^{b}, Z_{i j}^{b}, {(δ_{i j} W_{i j})}^{b}, S_{i}^{b})$ ; 1 ≤ i ≤ n, 1 ≤ j ≤ n_i} by sampling the subjects with replacement from the original data and obtains the spline estimators ${\tilde{α}}_{L S}^{(b)} (s, ℬ)$ and ${\tilde{β}}_{L S}^{(b)} (s, ℬ)$ . Repeat the procedure multiple times, and let L_α/2(α̃_LS,d(s, ℬ)) and U_α/2(α̃_LS,d(s, ℬ)) be the lower and upper [100 × (α/2)]th percentiles of the bootstrap estimators of α_d(s). An approximate (1 − α) pointwise CI for α_d(s) is [L_α/2(α̃_LS,d(s, ℬ)), U_α/2(α̃_LS,d(s, ℬ))]. Confidence intervals for β_d(s) and other parameters can be constructed similarly. Moreover, variances of the parameters can be estimated by the sample variances of bootstrap estimates.
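The resampling-subject bootstrap described above can be sketched in a few lines of Python. The key point is that whole subjects, with all of their visits, are drawn with replacement, so that the within-subject correlation is preserved in each bootstrap sample. The function name `subject_bootstrap`, the number of replications, and the toy pooled-slope estimator used in the example are assumptions made for illustration; in practice the estimator would be the spline fit of α_d(s) or β_d(s) at a given s.

```python
import numpy as np

def subject_bootstrap(subjects, estimator, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval and variance from resampling whole subjects."""
    rng = np.random.default_rng(seed)
    n = len(subjects)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample subjects with replacement
        draws.append(estimator([subjects[k] for k in idx]))
    draws = np.asarray(draws)
    lower = np.percentile(draws, 100.0 * alpha / 2.0, axis=0)
    upper = np.percentile(draws, 100.0 * (1.0 - alpha / 2.0), axis=0)
    return lower, upper, draws.var(axis=0, ddof=1)

# Hypothetical data: 60 subjects, random intercepts, true slope 2.
rng = np.random.default_rng(2)
t = np.arange(0.0, 6.5, 0.5)
subjects = []
for _ in range(60):
    u = rng.normal(0.0, 2.0)
    subjects.append((t, 1.0 + 2.0 * t + u + rng.normal(0.0, 1.0, t.size)))

def pooled_slope(sample):
    """Toy estimator: least-squares slope pooled over the sampled subjects."""
    tt = np.concatenate([x for x, _ in sample])
    yy = np.concatenate([y for _, y in sample])
    X = np.column_stack([np.ones_like(tt), tt])
    return np.linalg.lstsq(X, yy, rcond=None)[0][1]

lower, upper, boot_var = subject_bootstrap(subjects, pooled_slope, n_boot=200)
```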

5 Application to Pharmacotherapy in the ENRICHD Study

As described in Section 1, our objective is to evaluate the additional effects of pharmacotherapy (antidepressants) on the trends of depression (measured by BDI scores) for patients who received pharmacotherapy during the six-month psychosocial treatment period. Because pharmacotherapy was only designed as a concomitant intervention in this trial, the starting time of pharmacotherapy was decided by the patients or their physicians. Unfortunately, since patients in the usual care arm did not have accurate pharmacotherapy starting times and repeated BDI scores recorded within the first six-month period, the effects of pharmacotherapy could not be properly analyzed for these patients (refer to [11] for more details). Ninety-one patients (1,446 observations in total) in the psychosocial treatment arm received pharmacotherapy as a concomitant intervention during this period and had clear records of their pharmacotherapy starting time. Among them, 43 started pharmacotherapy at baseline and 48 started pharmacotherapy between 7 and 172 days. The number of visits for these patients ranges from 5 to 36, with a median of 16. Patients who did not have proper records of antidepressant use were excluded.

For the ith patient, Y_ij, T_ij, S_i, R_ij = T_ij − S_i and δ_ij = 1_{[T_ij≥S_i]} are the BDI score, trial time (in months), starting time of pharmacotherapy, time from initiation of pharmacotherapy, and pharmacotherapy indicator, respectively, at the jth visit. Our preliminary examination of the data revealed that the BDI scores over T_ij could be approximated by a linear model (results not shown). An intuitive model is the following special case of (2.1),

Y_{i j} = a_{0 i} + a_{1 i} T_{i j} + b_{0 i} δ_{i j} + b_{1 i} δ_{i j} R_{i j} + ε_{i j},

(5.1)

where E(a_0i, a_1i, b_0i, b_1i)^T = (α₀, α₁, β₀, β₁)^T and, when δ_ij = 1 and R_ij = r, (β₀ + β₁r) describes the mean pharmacotherapy effect at r months since the start of pharmacotherapy. Clearly, (5.1) ignores the correlation between S_i and the pre-pharmacotherapy depression trends. To evaluate whether (5.1) leads to potential bias, we next considered the following special case of (2.5),

Y_{i j} = α_{0} (S_{i}) + α_{1} (S_{i}) T_{i j} + β_{0} δ_{i j} + β_{1} δ_{i j} R_{i j} + e_{i j},

(5.2)

where $e_{i j} = a_{i 0}^{*} + a_{i 1}^{*} T_{i j} + b_{i 0}^{*} δ_{i j} + b_{i 1}^{*} δ_{i j} R_{i j} + ε_{i j}$ , α₀(S_i) = γ₀₀ + γ₀₁S_i and α₁(S_i) = γ₁₀ + γ₁₁S_i. In (5.2), the mean pre-pharmacotherapy BDI trend is associated with S_i through intercept α₀(S_i) and slope α₁(S_i). At r months after the start of pharmacotherapy, the mean pharmacotherapy effect is β₀+β₁r, where a negative value for β₀+β₁r corresponds to a beneficial effect for reducing depression. To reduce model complexity, we assume in (5.2) that β₀(S_i) ≡ β₀ and β₁(S_i) ≡ β₁ in the sense that the effects of pharmacotherapy only depend on how long the antidepressant has been used, but not on when it was started.
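To make the structure of (5.2) concrete, the fixed-effect covariates for a single visit can be assembled as in the sketch below. The function names `design_row` and `mean_response` are hypothetical and only illustrate how δ_ij, R_ij and the linear forms of α₀(S_i) and α₁(S_i) enter the mean; fitting the model still requires a mixed-effects routine, which is not shown.

```python
def design_row(t, s):
    """Fixed-effect covariates of model (5.2) for one visit.

    t: trial time T_ij (months); s: pharmacotherapy starting time S_i.
    The order matches (gamma00, gamma01, gamma10, gamma11, beta0, beta1).
    """
    delta = 1.0 if t >= s else 0.0   # pharmacotherapy indicator delta_ij
    r = t - s                        # time since initiation R_ij
    # alpha0(S) = gamma00 + gamma01*S contributes (1, s);
    # alpha1(S)*T = (gamma10 + gamma11*S)*T contributes (t, s*t).
    return [1.0, s, t, s * t, delta, delta * r]

def mean_response(t, s, coef):
    """Mean of Y_ij under (5.2) for a given coefficient vector."""
    return sum(x * c for x, c in zip(design_row(t, s), coef))

# A visit one month before initiation and one a month after it (s = 2).
before = design_row(1.0, 2.0)
after = design_row(3.0, 2.0)
```

Before initiation the last two entries vanish, so only the S_i-dependent pre-pharmacotherapy trend contributes to the mean; after initiation the β₀ and β₁ terms are switched on.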

Table 1 summarizes the parameter estimates and their corresponding standard errors, 95% CIs and p-values obtained by the REML procedure with unstructured correlations. The negative estimates for (β₀, β₁) under both (5.1) and (5.2) suggest that the beneficial effect of pharmacotherapy for this patient population is detected by both models. However, (5.2) exhibits a slightly stronger depression-lowering effect. The 95% CI for γ₀₁ suggests a negative correlation of S_i with baseline BDI scores, so that patients with higher baseline BDI scores tend to start pharmacotherapy sooner.

Table 1.

The ENRICHD Data Analysis

Model	Parameter	Estimate	SE	95% CI	p-value
(5.1)	α₀	23.380	1.107	(21.167, 25.594)	<0.0001
	α₁	−0.619	0.479	(−1.577, 0.339)	0.199
	β₀	−3.410	0.994	(−5.399, −1.422)	0.0013
	β₁	−1.584	0.521	(−2.626, −0.542)	0.0039

(5.2)	γ₀₀	25.670	1.431	(22.808, 28.533)	<0.0001
	γ₀₁	−1.389	0.586	(−2.562, −0.216)	0.0180
	γ₁₀	−0.278	0.822	(−1.922, 1.366)	0.736
	γ₁₁	0.078	0.174	(−0.272, 0.426)	0.654
	β₀	−4.302	1.041	(−6.385, −2.220)	0.0001
	β₁	−2.062	0.773	(−3.608, −0.516)	0.0105

Parameter estimates and their standard errors (SE), 95% confidence intervals (CIs) and p-values were obtained by restricted maximum likelihood with unstructured correlations for models (5.1) and (5.2).
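As a worked illustration of how the Table 1 estimates translate into an estimated mean pharmacotherapy effect β₀ + β₁r, the point estimates can be evaluated at chosen durations. The values r = 1 and r = 3 months are picked here only for illustration and do not appear in the original analysis; no interval is attached because the covariance between the two estimates is not reported.

```latex
\begin{aligned}
\text{(5.1):}\quad \hat{\beta}_0 + \hat{\beta}_1 r &= -3.410 - 1.584\,r,
  &\quad r = 1:\ -4.994, &\quad r = 3:\ -8.162,\\
\text{(5.2):}\quad \hat{\beta}_0 + \hat{\beta}_1 r &= -4.302 - 2.062\,r,
  &\quad r = 1:\ -6.364, &\quad r = 3:\ -10.488.
\end{aligned}
```

Under these point estimates, the gap between the two models widens with r, since both the intercept and the slope estimates are more negative under (5.2).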

6 Simulation

Following the general framework (2.3), we consider a simulation design that resembles the data structure of the ENRICHD trial. Each simulated sample contains n = 200 subjects. Each subject has 30 “scheduled visits” at time points (T_i1, …, T_i,30) = (0, 0.2+e₁, …, 5.8+e₂₉), where {e_l} are independently generated from the uniform U(−0.2, 0.2) distribution, and each scheduled visit has a 40% probability of being skipped. This leads to unequal numbers of repeated measurements among the subjects, with n_i being the number of repeated measurements for the ith subject. The random parameters ${(a_{i}^{T}, b_{i}^{T})}^{T} = {(a_{0 i}, a_{1 i}, b_{0 i}, b_{1 i})}^{T}$ are generated from the multivariate normal distribution with mean (25, 0, −4, −2)^T and covariance matrix cov(a_0i, a_1i, b_0i, b_1i) = diag(6.25, 1, 1, 1). For each {a_i, b_i}, we generate two different change-point times: (a) S_i ~ N(10 − 0.3 a_0i, 0.16); and (b) S_i ~ N(1 + 4 sin[(a_0i − 4)/9], 0.09). For each given {T_ij, S_i, a_i, b_i}, Y_ij is generated from N(a_0i + a_1iT_ij + b_0iδ_ij + b_1iδ_ijR_ij, 4). When {S_i; i = 1, …, 200} are generated from (a), direct calculation based on conditional normal distributions shows that the marginal model of Y_ij is (5.2) with γ₀₀ = 31.49, γ₀₁ = −2.60, γ₁₀ = γ₁₁ = 0, β₀ = −4 and β₁ = −2. When {S_i; i = 1, …, 200} are generated from (b), we assume that the parametric form of α₀(S_i) is unknown and the marginal model of Y_ij is
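The data-generating design and the conditional-normal calculation for case (a) can be sketched as below. This is an illustrative reading of the description, not the authors' code: the second argument of each N(·, ·) is treated as a variance (an assumption, although it does reproduce the stated γ values), the 40% skip rule is applied to every scheduled visit including baseline, and the names `simulate_subject` and `marginal_gammas` are hypothetical.

```python
import math
import random

MEAN = (25.0, 0.0, -4.0, -2.0)   # means of (a0, a1, b0, b1)
SD = (2.5, 1.0, 1.0, 1.0)        # square roots of diag(6.25, 1, 1, 1)

def simulate_subject(rng, design="a", skip_prob=0.4):
    """Generate one subject's change-point time and (time, delta, Y) rows."""
    a0, a1, b0, b1 = (rng.gauss(m, s) for m, s in zip(MEAN, SD))
    if design == "a":
        s_i = rng.gauss(10.0 - 0.3 * a0, math.sqrt(0.16))
    else:
        s_i = rng.gauss(1.0 + 4.0 * math.sin((a0 - 4.0) / 9.0), math.sqrt(0.09))
    # Scheduled visits: 0, then 0.2*l + e_l for l = 1..29, e_l ~ U(-0.2, 0.2).
    times = [0.0] + [0.2 * l + rng.uniform(-0.2, 0.2) for l in range(1, 30)]
    rows = []
    for t in times:
        if rng.random() < skip_prob:      # visit skipped
            continue
        delta = 1.0 if t >= s_i else 0.0
        mean = a0 + a1 * t + b0 * delta + b1 * delta * (t - s_i)
        rows.append((t, delta, rng.gauss(mean, 2.0)))  # error variance 4
    return s_i, rows

def marginal_gammas(mu0=25.0, var0=6.25, c=10.0, slope=-0.3, var_s=0.16):
    """E[a0 | S] = gamma00 + gamma01 * S when S | a0 ~ N(c + slope*a0, var_s)."""
    cov_a0_s = slope * var0                  # cov(a0, S)
    var_total = slope ** 2 * var0 + var_s    # marginal var(S)
    gamma01 = cov_a0_s / var_total
    gamma00 = mu0 - gamma01 * (c + slope * mu0)   # uses E[S] = c + slope*mu0
    return gamma00, gamma01

gamma00, gamma01 = marginal_gammas()
s_i, rows = simulate_subject(random.Random(1), design="a")
```

Because a_1i, b_0i and b_1i are generated independently of a_0i and S_i, only the intercept of the pre-intervention trend depends on S_i in this design, which is consistent with γ₁₀ = γ₁₁ = 0.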

Y_{i j} = α_{0} (S_{i}) + α_{1} T_{i j} + β_{0} δ_{i j} + β_{1} δ_{i j} R_{i j} + e_{i j},

(6.1)

where, as in (5.2), $e_{i j} = a_{i 0}^{*} + a_{i 1}^{*} T_{i j} + b_{i 0}^{*} δ_{i j} + b_{i 1}^{*} δ_{i j} R_{i j} + ε_{i j}$ , β₀ = −4, β₁ = −2 and α₀(S_i) is not a linear function.

The simulation was repeated 2,000 times. For samples with S_i generated from (a), we first ignored the correlation between a_0i and S_i and estimated (α₀, α₁, β₀, β₁) using (5.1) with unstructured correlations, and then fitted (5.2) with α₁(S_i) ≡ α₁ to the data and estimated (γ₀₀, γ₀₁, α₁, β₀, β₁), all by REML. Table 2 summarizes the averages of the estimates, their standard errors, and root mean-squared errors as well as the empirical coverage probabilities of the 95% asymptotic CIs. The bias in the estimation of β₀ under (5.1) is evident from its larger root mean-squared error and lower coverage probability (0.745) compared with the estimates obtained using (5.2).

Table 2.

Simulation results for (a) S_i ~ N(10 − 0.3 a_0i, 0.16)

Model	Parameter	Estimate	SE	$\sqrt{M S E}$	CP
(5.1)	α₀ =25	25.018	0.202	0.211	0.940
	α₁ =0	−0.058	0.100	0.123	0.897
	β₀ = −4	−3.805	0.153	0.253	0.745
	β₁ = −2	−1.971	0.115	0.123	0.936

(5.2)	γ₀₀ =31.488	31.480	0.350	0.359	0.942
	γ₀₁ = −2.595	−2.592	0.131	0.134	0.943
	α₁ =0	0.001	0.099	0.099	0.951
	β₀ = −4	−4.004	0.152	0.154	0.946
	β₁ = −2	−1.998	0.114	0.114	0.955

Estimate, SE and $\sqrt{M S E}$ denote the averages of estimates, standard errors and square root of the mean squared errors and CP represents the estimated coverage probability of the 95% confidence intervals, computed from 2,000 simulated samples. Parameter estimates and SEs were obtained by restricted maximum likelihood with unstructured correlations.
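The summary statistics reported in Table 2 can be computed from the replicated fits along the following lines. This is a sketch under assumptions: `summarize` is a hypothetical helper, the 95% asymptotic CI is taken as estimate ± 1.96 × SE, and the root mean-squared error is computed as the square root of the average squared error across replications.

```python
import math

def summarize(estimates, std_errors, truth, z=1.96):
    """Average estimate, average SE, root-MSE and CI coverage over replications."""
    n = len(estimates)
    avg_est = sum(estimates) / n
    avg_se = sum(std_errors) / n
    # Root of the mean squared deviation from the true parameter value.
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    # Fraction of replications whose normal-approximation CI covers the truth.
    covered = sum(1 for e, se in zip(estimates, std_errors)
                  if e - z * se <= truth <= e + z * se)
    return avg_est, avg_se, rmse, covered / n

# Made-up replications for a parameter whose true value is -4.
avg_est, avg_se, rmse, cp = summarize([-4.1, -3.9, -3.0], [0.1, 0.1, 0.1], -4.0)
```

Since the mean squared error decomposes into squared bias plus variance, a root-MSE well above the average SE, as for β₀ under (5.1), points to bias rather than extra variability.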

For samples with S_i generated from (b), we approximated α₀(s), s ∈ [0, 6], using a quadratic B-spline with 2 equally spaced interior knots (see Ch. 5.2 in [27]), and estimated (α₀(s), α₁, β₀, β₁) under (6.1) using (3.5) and (3.6) with the weight Λ_i = I_{n_i×n_i}. We computed the 95% bootstrap CIs for (α₁, β₀, β₁) and the 95% pointwise bootstrap CIs for α₀(s) at 60 equally spaced values of s ∈ [0, 6] using the percentile procedures with B = 300 bootstrap replications. For comparison, we also fitted (5.1), which assumes that α₀(s) ≡ α₀, to the data and estimated (α₀, α₁, β₀, β₁) using the same procedure of (3.5). Figure 1(a) shows the spline-estimated coefficient curve α₀(s) and the 95% pointwise bootstrap CIs obtained from a randomly selected simulated sample, and Figure 1(b) displays the empirical coverage probabilities of the 95% pointwise bootstrap CIs, where the true α₀(s) is numerically calculated. Table 3 presents the same set of summary statistics used in Table 2 for (α₁, β₀, β₁) under both (5.1) and (6.1). Standard errors and CI coverage probabilities based on either the bootstrap procedure (Section 4.2) or the least squares and normal approximation procedure (Sections 3.3 and 4.1) are compared, and the performances of the two procedures are similar under each model. The large root mean-squared errors and poor coverage probabilities for the estimates obtained under (5.1) suggest that ignoring the association between a_0i and S_i may lead to erroneous conclusions in the present situation.
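The quadratic B-spline basis with two equally spaced interior knots on [0, 6] can be evaluated with the Cox-de Boor recursion, as in the sketch below. The function name `bspline_basis` and the clamped knot vector `KNOTS` (boundary knots repeated three times, interior knots at 2 and 4) are assumptions made for illustration; the paper does not spell out its knot vector. With this choice there are five basis functions, and α₀(s) is approximated by a linear combination of them.

```python
def bspline_basis(x, knots, degree):
    """Values of all B-spline basis functions at x (Cox-de Boor recursion).

    knots: full non-decreasing knot vector, boundary knots repeated
           degree + 1 times; returns len(knots) - degree - 1 values.
    """
    n_intervals = len(knots) - 1
    # Degree 0: indicator of the half-open knot interval containing x.
    basis = [0.0] * n_intervals
    for i in range(n_intervals):
        if knots[i] <= x < knots[i + 1]:
            basis[i] = 1.0
    if x == knots[-1]:
        # Close the last non-empty interval on the right so that the
        # upper boundary point is covered.
        last = max(i for i in range(n_intervals) if knots[i] < knots[i + 1])
        basis[last] = 1.0
    for d in range(1, degree + 1):
        new = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero denominator are defined to be zero.
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            new.append(left + right)
        basis = new
    return basis

# Quadratic spline on [0, 6] with interior knots at 2 and 4 (assumed layout).
KNOTS = [0.0, 0.0, 0.0, 2.0, 4.0, 6.0, 6.0, 6.0]
values_at_3 = bspline_basis(3.0, KNOTS, 2)
```

At any s in [0, 6] the five basis values are non-negative and sum to one, so the spline coefficients act as local weights for α₀(s).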

Figure 1.

(a) True curve α₀(s) (solid line), spline-estimated curve α̃₀(s) (dashed line) and pointwise 95% bootstrap percentile confidence intervals (dotted lines) obtained from a randomly selected simulated sample. (b) Empirical coverage probability of pointwise 95% bootstrap confidence intervals for α₀(s) (solid line) and their sample mean (dashed line).

Table 3.

Simulation results for (b) S_i ~ N(1 + 4 sin[(a_0i − 4)/9], 0.09)

Model	Parameter	Estimate	SE (SE*)	$\sqrt{M S E}$	CP (CP*)
(5.1)	α₁ =0	−0.862	0.150 (0.145)	0.874	0(0)
	β₀ = −4	−2.404	0.340 (0.329)	1.632	0.004 (0.007)
	β₁ = −2	−0.237	0.372 (0.390)	1.804	0.005 (0.012)

(6.1)	α₁ =0	−0.001	0.097 (0.101)	0.095	0.951 (0.956)
	β₀ = −4	−3.999	0.264 (0.255)	0.267	0.942 (0.940)
	β₁ = −2	−1.991	0.258 (0.247)	0.259	0.944 (0.921)

Estimate, SE and $\sqrt{M S E}$ denote the averages of estimates, bootstrap standard errors and square root of the mean squared errors, and CP represents the estimated coverage probability of the 95% bootstrap confidence intervals, computed from 2,000 simulated samples. SE* and CP* denote the standard errors and coverage probabilities of CIs obtained by the procedures in Sections 3.3 and 4.1. Parameter estimates were obtained using (3.5) with Λ_i = I_{n_i×n_i}.

7 Discussion

Our proposed methodology focuses on concomitant interventions in longitudinal clinical trials, but such interventions also commonly appear in other settings. For example, subjects in an epidemiological study may take antihypertensive medication during the study when their blood pressure levels either exhibit some undesirable trends or stay in an intolerable range. In dealing with this type of data, it is crucial to model the intervention selection mechanism as realistically as possible. In our pharmacotherapy example of the ENRICHD trial, there was only a vague guideline for the initiation of pharmacotherapy, so that it appeared reasonable to model the intervention selection mechanism through some shared parameters. We focus on the varying-coefficient model mainly because it has a simple and clear biological interpretation for this example, its assumptions seem to be realistic for this type of trial, and the nonparametric B-spline method can be easily implemented.

Several possible extensions are worthy of further investigation. First, our data structure allows for only a single intervention with one change-point per subject. Generally, subjects in longitudinal studies may have single or multiple concomitant interventions that can be turned on or off at different time points. In such situations, more general shared-parameter models may be needed to accommodate the possibility of multiple interventions and/or multiple change-points. Second, our model relies on linear functions to describe the trends before and after the intervention. It can be generalized to models with nonlinear response curves. Finally, we use the classical frequentist framework for the B-spline methods. In a different context, Fahrmeir and Lang [28] demonstrated a promising Bayesian inference procedure for generalized additive mixed models based on Markov random field priors. Analogous approaches for our model and estimators may lead to useful confidence regions and model diagnostic procedures.

Acknowledgement

Financial support for the ENRICHD study was provided by the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland. Pfizer Inc., provided sertraline (Zoloft) for the ENRICHD study. We want to thank the participants as well as the investigators of the ENRICHD study. We also thank two referees and the associate editor for their thoughtful suggestions and comments which greatly improved our presentation.

Contributor Information

Colin O. Wu, Office of Biostatistics Research, National Heart, Lung and Blood Institute, Bethesda, MD 20892.

Xin Tian, Office of Biostatistics Research, National Heart, Lung and Blood Institute, Bethesda, MD 20892.

Heejung Bang, Division of Biostatistics and Epidemiology, Department of Public Health, Weill Medical College of Cornell University, NY 10021.

References

1. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38:963–974.
2. Davidian M, Giltinan DM. Nonlinear Models for Repeated Measurement Data. London; New York: Chapman Hall; 1995.
3. Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York: Springer; 2000.
4. Diggle PJ, Heagerty P, Liang K-Y, Zeger SL. Analysis of Longitudinal Data. 2nd ed. Oxford: Oxford University Press; 2002.
5. Fan J, Zhang JT. Functional linear models for longitudinal data. Journal of the Royal Statistical Society. Ser. B. 2000;62:303–322.
6. Lin X, Carroll RJ. Nonparametric function estimation for clustered data when the predictor is measured without/with error. Journal of the American Statistical Association. 2000;95:520–534.
7. Lin X, Carroll RJ. Semiparametric regression for clustered data using generalized estimating equations. Journal of the American Statistical Association. 2001;96:1045–1056.
8. Huang JZ, Wu CO, Zhou L. Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika. 2002;89:111–128.
9. Huang JZ, Wu CO, Zhou L. Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Statistica Sinica. 2004;14:763–788.
10. The ENRICHD Investigators. Enhancing recovery in coronary heart disease patients (ENRICHD): Study intervention rationale and design. Psychosomatic Medicine. 2001;63:747–755.
11. The ENRICHD Investigators. Enhancing recovery in coronary heart disease patients (ENRICHD): The effects of treating depression and low perceived social support on clinical events after myocardial infarction. Journal of the American Medical Association. 2003;289:3106–3116. doi: 10.1001/jama.289.23.3106.
12. Taylor CB, Youngblood ME, Catellier D, Veith RC, Carney RM, Burg MM, Kaufmann P, Shuster J, Mellman T, Blumenthal JA, Krishnan R, Jaffe AS. Effects of antidepressant medication on morbidity and mortality in depressed patients after myocardial infarction. Archives of General Psychiatry. 2005;62:792–798. doi: 10.1001/archpsyc.62.7.792.
13. Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics. 2005;61:962–972. doi: 10.1111/j.1541-0420.2005.00377.x.
14. Murphy SA, van der Laan MJ, Robins JM. Marginal mean models for dynamic regimes. Journal of the American Statistical Association. 2001;96:1410–1423. doi: 10.1198/016214501753382327.
15. Wall MM, Dai Y, Eberly LE. GEE estimation of a misspecified time-varying covariate: an example with the effect of alcoholism treatment on medical utilization. Statistics in Medicine. 2005;24:925–939. doi: 10.1002/sim.1966.
16. Wu MC, Carroll R. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics. 1988;44:175–188.
17. Follmann D, Wu M. An approximate generalized linear model with random effects for informative missing data. Biometrics. 1995;51:151–168.
18. Naumova EN, Must A, Laird NM. Tutorial in Biostatistics: Evaluating the impact of critical periods in longitudinal studies of growth using piecewise mixed effects models. International Journal of Epidemiology. 2001;30:1332–1341. doi: 10.1093/ije/30.6.1332.
19. Bang H, Mazumdar M, Spence JD. Tutorial in Biostatistics: Analyzing associations between total plasma Homocysteine and B vitamins using optimal categorization and segmented regression. Neuroepidemiology. 2006;27:188–200. doi: 10.1159/000096149.
20. Rice JA, Wu CO. Nonparametric mixed effects models for unequally sampled noisy curves. Biometrics. 2001;57:253–259. doi: 10.1111/j.0006-341x.2001.00253.x.
21. Lin X, Zhang D. Inference in generalized additive mixed models by using smoothing splines. Journal of the Royal Statistical Society. Ser. B. 1999;61:381–400.
22. Chiang CT, Rice JA, Wu CO. Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables. Journal of the American Statistical Association. 2001;96:605–619.
23. Welsh AH, Lin X, Carroll RJ. Marginal longitudinal nonparametric regression: Locality and efficiency of spline and kernel methods. Journal of the American Statistical Association. 2002;97:482–493.
24. Diggle PJ, Verbyla AP. Nonparametric estimation of covariance structure in longitudinal data. Biometrics. 1998;54:401–415.
25. Serfling RJ. Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons; 1980.
26. Hoover DR, Rice JA, Wu CO, Yang LP. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika. 1998;85:809–822.
27. Hastie TJ, Tibshirani RJ, Friedman J. The Elements of Statistical Learning; Data Mining, Inference, and Prediction. New York: Springer; 2001.
28. Fahrmeir L, Lang S. Bayesian inference for generalized additive mixed models based on Markov random field priors. Applied Statistics. 2001;50:201–220.

