A smoothed corrected score approach for proportional hazards model with misclassified discretized covariates induced by error-contaminated continuous time-dependent exposure

Xiao Song; Edward C Chao; Ching-Yun Wang

doi:10.1111/biom.13595

. Author manuscript; available in PMC: 2024 Mar 1.

Published in final edited form as: Biometrics. 2021 Nov 15;79(1):437–448. doi: 10.1111/biom.13595

A smoothed corrected score approach for proportional hazards model with misclassified discretized covariates induced by error-contaminated continuous time-dependent exposure

Xiao Song ¹, Edward C Chao ², Ching-Yun Wang ³

PMCID: PMC9399755 NIHMSID: NIHMS1826395 PMID: 34694632

Abstract

We consider the proportional hazards model in which the covariates include the discretized categories of a continuous time-dependent exposure variable measured with error. Naively ignoring the measurement error in the analysis may cause biased estimation and erroneous inference. Although various approaches have been proposed to deal with measurement error when the hazard depends linearly on the time-dependent variable, it has not yet been investigated how to correct when the hazard depends on the discretized categories of the time-dependent variable. To fill this gap in the literature, we propose a smoothed corrected score approach based on approximation of the discretized categories after smoothing the indicator function. The consistency and asymptotic normality of the proposed estimator are established. The observation times of the time-dependent variable are allowed to be informative. For comparison, we also extend to this setting two approximate approaches, the regression calibration and the risk-set regression calibration. The methods are assessed by simulation studies and by application to data from an HIV clinical trial.

Keywords: corrected score, regression calibration, smoothing, survival

1 |. INTRODUCTION

In biomedical studies, it is often of interest to characterize the relationship between a time-to-event outcome and some covariates. Time-dependent exposure variables are usually collected intermittently over time. The observations may fluctuate due to biological variation or inaccurate measurements. This induces the issue of measurement error. An example is the AIDS Clinical Trials Group (ACTG) 175, which aimed to compare four antiretroviral therapies in HIV-infected subjects (Hammer et al., 1996). During the study, 2467 subjects were recruited between December 1991 and October 1992 and followed until November 1994. CD4 count, as a reflection of immune status, was scheduled to be measured for each participant about every 12 weeks after randomization. A threshold of CD4 count below 350 cells/mm³ was used in the past to guide whether to start antiretroviral therapy for HIV-infected subjects (World Health Organization, 2009). This threshold was also used as a cut-off point on assessing survival time (May et al., 2014). It is of interest to investigate if I(CD4 count ≤ 350 cells/mm³) (I(·) is the indicator function) could be used as a surrogate marker for time to AIDS. One complication is that observations of CD4 count are subject to substantial biological variation and measurement error.

It is well known that measurement error in continuous covariates may cause bias (Carroll et al., 2006). Various approaches have been proposed to deal with measurement error for continuous time-dependent covariates, usually based on joint modeling of the observed longitudinal data and survival data. These include the regression calibration (RC) (Prentice, 1982; Tsiatis et al., 1995; Liao et al., 2011), likelihood-based approaches (Wulfsohn and Tsiatis, 1997; Song, 2002a; Xu et al., 2014), conditional score (Tsiatis and Davidian, 2001; Song, 2002b), and corrected score (Wang, 2006) approaches. The latter two approaches are more flexible without distributional assumptions on the underlying true covariates, and much simpler in implementation. Song (2017) proposed an improved corrected score approach that is more efficient and also allows the observation times to be informative; that is, the observation times may depend on the underlying longitudinal process or the observed survival time. For example, patients with more severe disease status may appear more often for hospital check-ups in observational studies or have more missing observations during follow-ups in clinical trials. In the ACTG 175 study, some patients had missing CD4 measurements before the event or censoring time. The missing rate seems significantly associated with features of the CD4 count trajectory (Song, 2017), which indicates that the observation times of the CD4 count may be informative.

Misclassification of discrete covariates may cause bias as well (Gustafson, 2004). When the discretized categories of a continuous exposure variable are used as covariates, such as the categories induced by dichotomizing CD4 counts, the measurement error in the exposure variable may lead to misclassification of the discretized covariates and consequently biased estimation of the covariate effects (Flegal et al. 1991). When the error-prone exposure variable is not time-dependent, approaches have been proposed to deal with misclassification of the corresponding discretized categories under the framework of linear regression (Gustafson and Le, 2002; Natarajan, 2009), logistic regression (Dalen et al., 2009), generalized linear regression (Wang et al., 2016), and Cox regression (Seguin et al., 2014). However, to the best of our knowledge, this issue has not yet been investigated when the exposure variable is time-dependent.

To fill this gap, we first adopt a linearization-based RC approach through joint modeling of the longitudinal exposure process and the survival time. We further improve the approach by calibration within each risk set. In addition, we propose a smoothed corrected score (SCS) approach. This approach inherits the advantage of the existing corrected score approach in its simplicity in implementation and robustness to deviation from normal error, and it works reasonably well even if the observation times of the time-dependent exposure variable are informative.

The idea of approximating indicator functions by smooth functions has been considered in the context of the binary response model (Horowitz, 1992) and rank-based approaches (Ma and Huang, 2005; Heller, 2007; Song, 2007), when the covariates are not measured with error. A common purpose of smoothing in these studies is to overcome the difficulty in computation caused by the non-continuity of the indicator function. This is different from our idea of smoothing in this study: to obtain approximately normal independent variables conditional on the true covariates so that the corrected score method can be applied. Our approach is novel in this aspect.

The paper is organized as follows. In Section 2, we give the definition of the model. We propose the RC approaches in Section 3 and the SCS approach in Section 4 for dichotomized covariates. We extend these approaches to multiple discretized categories in Section 5. The finite sample performance of the estimators is assessed by simulation studies and illustrated by an application in Section 6. We conclude with discussion in Section 7. The technical details of the asymptotic results are given in the Web Appendix.

2 |. MODEL DEFINITION

For subject i = 1, …, n, let T_i denote the survival time, and C_i denote the censoring time. The observed survival data are V_i = min(T_i, C_i), and Δ_i = I(T_i ≤ C_i); these and all other variables are independent across i. For simplicity, we consider a single continuous time-dependent exposure variable X_i(t); it is straightforward to extend to multiple time-dependent exposure variables. Longitudinal measurements of X_i(t) are taken at times $u_{i} = {(u_{i 1}, \dots, u_{i m_{i}})}^{T}$ with the observed values $W_{i} = {(W_{i 1}, \dots, W_{i m_{i}})}^{T}$ . The discretized categories of X_i(t) are the covariates of interest. Let Z_i denote p time-independent covariates.

Assume that the longitudinal exposure process follows the linear mixed effects model

X_{i} (t) = f^{T} (t) α_{i}, W_{i j} = X_{i} (u_{i j}) + e_{i j},

(1)

where f(t) is a known continuous q-dimensional function, α_i is a q-dimensional random effect, and j = 1, …, m_i. The inherent longitudinal trajectory is denoted by f^T(t)α_i, which may represent a polynomial or a spline. For example, when f(t) = (1, t)^T and α_i = (α_i0, α_i1)^T, f^T(t)α_i denotes a linear trajectory. No distributional assumption is placed on the random effect α_i. The errors $e_{i 1}, \dots, e_{i m_{i}}$ have mean zero and variance σ² and are independent across time, which denote within-subject biological fluctuation and measurement errors. Let $e_{i} = {(e_{i_{1}}, \dots, e_{i m_{i}})}^{T}$ . We assume e_i is independent of α_i, Z_i, T_i and C_i. This implies the surrogacy assumption that (T_i, C_i) is independent of W_i given α_i, Z_i.

Suppose the survival time depends on discretized categories of X_i(t) and Z_i. Since discretized categories may be represented by binary variables, for simplicity of presentation, we first consider a dichotomous variable I(X_i(t) > x₀) with a cut-off point x₀; an extension to multiple discretized categories is described in Section 5. Specifically, assume the proportional hazards model

λ (t ∣ I (X_{i} (t) > x_{0}), Z_{i}) = λ_{0} (t) \exp {β_{0} I (X_{i} (t) > x_{0}) + γ_{0}^{T} Z_{i}},

(2)

where λ(t|·) denotes the hazard of failure at time t conditional on ·, λ₀(t) is an unspecified baseline hazard function, and β₀ and γ₀ are the regression coefficients. We assume that the survival time T_i is independent of the censoring time C_i given I(X_i(t) > x₀) and Z_i. We focus on estimating the regression coefficients $θ_{0} = {(β_{0}, γ_{0}^{T})}^{T}$ .

The joint models (1) and (2) can be used to assess surrogate markers in clinical trials. According to Prentice (1989), a surrogate marker should satisfy two conditions: (i) the marker should be prognostic for clinical outcome; (ii) the risk of progression given the marker should be independent of treatment. Suppose X_i(t) is a surrogate marker and Z_i is the indicator of an effective treatment; effectiveness of the treatment can be evaluated by including Z_i only in model (2). Prentice’s first condition would be indicated by β₀ ≠ 0 by including X_i(t) only in model (2). Prentice’s second condition can be assessed by including both X_i(t) and Z_i in model (2) and γ₀ = 0 would suggest the treatment effect is mediated through X_i(t).

3 |. LINEARIZATION-BASED RC METHODS

Seguin et al. (2014) used a linearization-based RC for misclassified dichotomized covariates induced by error contaminated time-independent covariates. Here we extend this approach to misclassified dichotomized covariates induced by time-dependent covariates. A sketch of derivation is outlined here with the details given in Web Appendix S.1.5.

Let ${\hat{X}}_{i} (t) = f^{T} (t) {\hat{α}}_{i}$ , where ${\hat{α}}_{i}$ is the least square estimate of α_i based on the longitudinal observations of subject i. Under model (2), it can be shown that

λ (t ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}) \approx λ_{0} (t) \exp [β_{0} E {I (X_{i} (t) > x_{0}) ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}, T_{i} \geq t} + γ_{0}^{T} Z_{i}] .

Adopting the idea of RC (Prentice, 1982), when the event is rare, we may approximate $E {I (X_{i} (t) > x_{0}) ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}, T_{i} \geq t}$ by $E {I (X_{i} (t) > x_{0}) ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}}$ and substitute an estimator for $E {I (X_{i} (t) > x_{0}) ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}}$ . Following Wang et al. (2016), at time t, we approximate I(x(t) > x₀) by a straight line that passes ({x₀ + μ_X(t) − 2σ_X(t)}/2, 0) and ({x₀ + μ_X(t) + 2σ_X(t)}/2, 1), where μ_X(t) = E{X(t)}, and $σ_{X}^{2} (t) = var {X (t)} = f^{T} (t) var (α_{i}) f (t)$ (Figure 1). Under this approximation, with similar arguments as in Wang et al. (2016), we have

E {I (X_{i} (t) > x_{0}) ∣ {\hat{X}}_{i} (t), Z_{i}} \approx ξ_{0} (t) + {2 σ_{X} (t)}^{- 1} E (X_{i} (t) ∣ {\hat{X}}_{i} (t), Z_{i}),

and subsequently,

E {I (X_{i} (t) > x_{0}) ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}} = E [E {I (X_{i} (t) > x_{0}) ∣ {\hat{X}}_{i} (t), Z_{i}} ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}] \approx ξ_{0}^{+} (t) + 0.8 σ_{X} (t) {(σ_{X}^{2} (t) + σ_{e i}^{2} (t))}^{- 1 / 2} I ({\hat{X}}_{i} (t) > x_{0}),

where $σ_{e i}^{2} (t) = var ({\hat{X}}_{i} (t) ∣ α_{i}) = σ^{2} f^{T} (t) {(F_{i}^{T} F_{i})}^{- 1} f (t)$ with $F_{i}^{T} = (f (u_{i 1}), \dots, f (u_{i m_{i}}))$ , and ξ₀(t) and $ξ_{0}^{+} (t)$ are functions of t defined in Web Appendix S.1.5. Therefore, conditional on $I ({\hat{X}}_{i} (t) > x_{0})$ and Z_i, the hazard function can be approximated as follows:

λ (t ∣ I ({\hat{X}}_{i} (t) > x_{0}), Z_{i}) \approx λ_{0} (t) \exp [β_{0} 0.8 σ_{X} (t) {σ_{X}^{2} (t) {+ σ_{e i}^{2} (t)}}^{- 1 / 2} I ({\hat{X}}_{i} (t) > x_{0}) + γ_{0}^{T} Z_{i}] .

Based on this approximation, the RC estimator can be obtained by replacing I(X_i(t) > x₀) by $0.8 {\hat{σ}}_{X} (t) ({\hat{σ}}_{X}^{2} (t) + {{\hat{σ}}_{e i}^{2} (t))}^{- 1 / 2} I ({\hat{X}}_{i} (t) > x_{0})$ in the partial likelihood estimating equations. Here ${\hat{σ}}_{e i}^{2} (t)$ is an estimator of $σ_{e i}^{2} (t)$ obtained by substituting an estimator ${\hat{σ}}^{2}$ for σ² (e.g., the method of moment estimator), and ${\hat{σ}}_{X}^{2} (t)$ is an estimator of $σ_{X}^{2} (t)$ , which can be obtained by replacing var(α_i) by its estimate

\hat{var} (α_{i}) = \frac{\sum_{i = 1}^{n} I (m_{i} \geq q) [({\hat{α}}_{i} - \bar{\hat{α}}) {({\hat{α}}_{i} - \bar{\hat{α}})}^{T} - σ^{2} {(F_{i}^{T} F_{i})}^{- 1}]}{\sum_{i = 1}^{n} I (m_{i} \geq q)},

where $\bar{\hat{α}} = \sum_{i = 1}^{n} I (m_{i} \geq q) {\hat{α}}_{i} / \sum I (m_{i} \geq q)$ . Thus the RC is simple to implement.

Approximation of an indicator function I(x > x₀) by a straight line. This figure appears in color in the electronic version of this article, and any mention of color refers to that version

It is known in the literature that the RC can be improved by calibration within each risk set (Xie et al., 2001). Adopting the same idea, a risk-set regression calibration (RRC) estimator can be obtained by estimating $σ_{X}^{2} (t)$ within each risk set. Specifically, at time t, we may obtain an estimator of $σ_{X}^{2} (t)$ by replacing var(α_i) by its estimate within the risk set

\hat{var} (α_{i}, t) = \frac{\sum_{i = 1}^{n} I (V_{i} \geq t, m_{i} \geq q) [({\hat{α}}_{i} - \bar{\hat{α}} (t)) {({\hat{α}}_{i} - \bar{\hat{α}} (t))}^{T} - σ^{2} {(F_{i}^{T} F_{i})}^{- 1}]}{\sum_{i = 1}^{n} I (V_{i} \geq t, m_{i} \geq q)},

where $\bar{\hat{α}} (t) = \sum_{i = 1}^{n} I (V_{i} \geq t, m_{i} \geq q) {\hat{α}}_{i} / \sum I (V_{i} \geq t, m_{i} \geq q)$ . The computation can be intensive since $\hat{var} (α_{i}, t)$ needs to be calculated at each failure time. To simplify the calculation, a usual strategy (Tsiatis and Davidian, 2001) is to estimate var(α_i) at a set of given times 0 = t₀ < t₁ < … < t_M, and estimate $σ_{X}^{2} (t)$ by substituting $\hat{var} (α_{i}, t_{j})$ for var(α_i), where t_j ≤ t < t_j+1 and t_M+1 = ∞.

We calculate the approximate standard errors based on the partial likelihood information matrix without adjustment for the estimation of the $σ_{X}^{2} (t)$ and $σ_{e i}^{2} (t)$ . As remarked by several authors (Tsiatis et al., 1995; Dafni and Tsiatis, 1998), adjustment for standard errors of the RC estimators seems unnecessary in simulation studies; moreover, such adjustment may be hard to implement.

4 |. SMOOTHED CORRECTED SCORE ESTIMATOR

For now, we assume that the error variance σ² is known. We would like to apply the corrected score approach (Wang, 2006) to estimate the regression coefficients. However, I(X_i(t) > x₀) is a noncontinuous indicator function and the correction term cannot be readily obtained. To tackle this difficulty, we propose to approximate the indicator function I(u > 0) by a continuous differentiable function K_n(u) = K(u/h_n), where K(u) is a distribution function, and h_n is a tuning parameter; when h_n goes to zero, K_n(u) converges to I(u > 0) (note that K(·) denotes a distribution function rather than a density function, differently from notation commonly used in the kernel smoothing literature). Then the hazard function can be approximated as follows:

λ (t ∣ X_{i}, Z_{i}) \approx λ_{0} (t) \exp (β_{0} K_{n} (X_{i} (t) - x_{0}) + γ_{0}^{T} Z_{i}) .

(3)

Suppose $\tilde{θ}$ is the smoothed “ideal” partial likelihood estimator based on the smoothed hazard function (3) if X_i(t) is observed, which is the solution to

{\hat{U}}^{S} (θ) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{τ} {{(K_{n} (X_{i} (t) - x_{0}), Z_{i}^{T})}^{T} - \frac{{\hat{G}}_{1}^{S} (θ, t, X, Z, Y^{I}, h_{n})}{{\hat{G}}_{0}^{S} (θ, t, X, Z, Y^{I}, h_{n})}} d N_{i}^{I} (t)

evaluated at time τ, where $Y_{i}^{I} (t) = I (V_{i} \geq t)$ is the at risk process, $N_{i}^{I} (t) = I (V_{i} \leq t, Δ_{i} = 1)$ is the event counting process, and for r = 0, 1, 2, ${\hat{G}}_{r}^{S} (θ, t, X, Z, Y^{I}, h_{n}) = n^{- 1} \sum_{i = 1}^{n} g_{r i}^{S} (θ, t, X, Z, Y^{I}, h_{n})$ with

g_{r i}^{S} (θ, t, X, Z, Y^{I}, h_{n}) = Y_{i}^{I} (t) {{(K_{n} (X_{i} (t) - x_{0}), Z_{i}^{T})}^{T}}^{\otimes r} \exp {β K_{n} (X_{i} (t) - x_{0}) + γ^{T} Z_{i}} .

Here for a vector a, a^⊗r = 1, a, aa^T for r = 0, 1, 2, respectively.

Lemma 1.

Under the assumptions C1–C6 given in Web Appendix S.1.1, as n → ∞, if h_n → 0, then (i) $\tilde{θ} \to θ_{0}$ almost surely; (ii) $n^{1 / 2} Σ^{- 1 / 2} (\tilde{θ} - θ_{0} - D) \to N (0, I)$ , where I is the identity matrix and D and Σ are given in Web Appendix S.1.2; D = O(h_n).

Since X_i(t) is not observed, a naïve approach is to replace X_i(t) by ${\hat{X}}_{i} (t)$ . Note that, given X_i(t), ${\hat{X}}_{i} (t)$ is approximately normal with mean X_i(t) and variance $σ_{e i}^{2} (t)$ . By the delta method, $K (h_{n}^{- 1} ({\hat{X}}_{i} (t) - x_{0}))$ has approximately normal distribution with mean $K (h_{n}^{- 1} (X_{i} (t) - x_{0}))$ and variance

σ_{i}^{2} (t) = h_{n}^{- 2} {K^{'} (h_{n}^{- 1} ({\hat{X}}_{i} (t) - x_{0}))}^{2} σ^{2} f^{T} (t) {(F_{i}^{T} F_{i})}^{- 1} f (t),

where K′(u) is the derivative of K(u). Then we may apply the improved corrected score approach (Song, 2017). Compared to the corrected score approach (Wang, 2006), this approach obtains the estimate ${\hat{α}}_{i}$ based on all longitudinal observations for the ith subject rather than the observations by time t only, and consequently improves the efficiency. In addition, Wang (2006) derived and corrected the bias of the naive estimating function assuming the observation times independent of (α_i, V_i, Z_i). Observing that the naive estimating function contains four empirical processes, Song (2017) applied the correction to each empirical process separately conditional on (α_i, V_i, Z_i) without the independence requirement on the observation times. Thus the method allows the observation times to depend on (α_i, V_i, Z_i). Applying this approach as if $K (h_{n}^{- 1} ({\hat{X}}_{i} (t) - x_{0}))$ were normally distributed and substituting it for ${\hat{X}}^{*} (t)$ in equation (4.3) of Song (2017), we obtain the following SCS estimating equation:

\hat{U} (θ) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{τ} {{(K_{n} ({\hat{X}}_{i} (t) - x_{0}), Z_{i}^{T})}^{T} - \frac{{\hat{G}}_{1} (θ, t, \hat{X}, Z, Y, h_{n})}{{\hat{G}}_{0} (θ, t, \hat{X}, Z, Y, h_{n})}} d N_{i} (t) = 0,

(4)

where Y_i(t) = I(V_i ≥ t, m_i ≥ q) and N_i(t) = I(V_i ≤ t, Δ_i = 1, m_i ≥ q), which differ from $Y_{i}^{I} (t)$ and $N_{i}^{I} (t)$ with a multiplier I(m_i ≥ q)), reflecting that at least q observations are required to obtain the estimate ${\hat{α}}_{i}$ ; for ${\hat{G}}_{r} (θ, t, \hat{X}, Z, Y, h_{n}) = n^{- 1} \sum_{i = 1}^{n} g_{r i} (θ, t, \hat{X}, Z, Y, h_{n})$ , and

g_{r i} (θ, t, \hat{X}, Z, Y, h_{n}) = Y_{i} (t) {{(K_{n} ({\hat{X}}_{i} (t) - x_{0}), Z_{i}^{T})}^{T} - {(β σ_{i}^{2} (t), 0^{T})}^{T}}^{\otimes r} \times \exp {β K_{n} ({\hat{X}}_{i} (t) - x_{0}) + γ^{T} Z_{i} - β^{2} σ_{i}^{2} (t) / 2} .

Let $\hat{θ}$ denote the SCS estimator. To derive the asymptotic properties, we assume that there exists M_n > 0 such that $M_{n}^{1 / 2} {(F_{i}^{T} F_{i})}^{- 1} F_{i}^{T} e_{i}$ converges to a normal distribution. Although this basically assumes that m_i is large for all i, the SCS works well even if m_i is small in our simulation studies.

Proposition 1.

Under the assumptions C3–C12 given in Web Appendix S.1.1, as n → ∞, if h_n → 0 and $h_{n}^{- 1} M_{n}^{- 1 / 2} = O (1)$ , then (i) $\hat{θ} \to θ_{0}$ almost surely; (ii) $n^{1 / 2} Σ_{*}^{- 1 / 2} (\hat{θ} - θ_{0} - D_{*}) \to N (0, I)$ , where D_* and Σ_* are given in Web Appendix S.1.3; $D_{*} = O (h_{n} + h_{n}^{- 1 - δ} M_{n}^{- 1})$ for any arbitrarily small δ > 0 if the error is normal.

The variance of $\hat{θ}$ can be estimated by n⁻¹A⁻¹B{A⁻¹}^T, where $A = \partial \hat{U} (\hat{θ}) / \partial θ^{T}$ , $B = n^{- 1} \sum_{i = 1}^{n} {\hat{φ}}_{i}^{\otimes 2}$ ,

{\hat{φ}}_{i} = \int_{0}^{τ} {{(K_{n} ({\hat{X}}_{i} (t) - x_{0}), Z_{i}^{T})}^{T} - \frac{{\hat{G}}_{1} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})}{{\hat{G}}_{0} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})}} d N_{i} (t) - {\frac{g_{1 i} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})}{{\hat{G}}_{0} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})} - \frac{g_{0 i} (\hat{θ}, t, \hat{X}, Z, Y, h_{n}) {\hat{G}}_{1} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})}{{\hat{G}}_{0}^{2} (\hat{θ}, t, \hat{X}, Z, Y, h_{n})}} d {n^{- 1} \sum_{i = 1}^{n} N_{i} (t)} .

(5)

Remark.

The asymptotic coverage probability of the Wald confidence interval achieves the nominal level when the asymptotic bias equals o(n^−1/2). As shown in Web Appendix S.1.4, when the error is normal, the naive estimator is consistent and asymptotically normal if M_n → ∞. The asymptotic bias of the naive estimator is $O (M_{n}^{- 1 / 2})$ . When $h_{n} = M_{n}^{- 1 / (2 + δ)}$ , asymptotic bias of the SCS estimator is $O (M_{n}^{- 1 / (2 + δ)})$ . This is comparable to the naive estimator since δ can be arbitrarily small. Our numerical studies indicate that the SCS works better than the naive estimator when m_i is much smaller than n as usually encountered in practice.

In practice, the error variance σ² is usually unknown. It can be estimated by the method of moment estimate ${\hat{σ}}^{2}$ (Tsiatis and Davidian, 2001). This requires m_i > q for a subset of subjects. The SCS estimate can be obtained by replacing σ² by ${\hat{σ}}^{2}$ in (4). Stack the estimating equation $n^{- 1} \sum_{i = 1}^{n} ϕ_{i} = 0$ of ${\hat{σ}}^{2}$ and $\hat{U} (θ)$ together and denote the set of estimating equation by ${\hat{U}}_{*} (\hat{θ})$ . The asymptotic variance of $(\hat{θ}, {\hat{σ}}^{2})$ can be estimated by $A_{*}^{- 1} B_{*} A_{*}^{- 1 T}$ , where $A_{*} = \partial ({\hat{U}}_{*} (\hat{θ})) / \partial {(θ^{T}, σ^{2})}^{T}$ , $B_{*} = n^{- 1} \sum_{i = 1}^{n} {\hat{φ}}_{* i}^{\otimes 2}$ , and ${\hat{φ}}_{* i} = {({\hat{φ}}_{i}^{T}, {\hat{ϕ}}_{i})}^{T}$ , where $\hat{ϕ}$ is the same as ϕ with σ² replaced by ${\hat{σ}}^{2}$ .

It is important to select an appropriate tuning parameter h_n for smoothing-based approaches. Cross-validation may be used to select h_n. We may use as the objective function the smoothed corrected log partial likelihood function

L^{S} (θ) = \sum_{i = 1}^{n} \int_{0}^{τ} β K_{n} ({\hat{X}}_{i} (t) - x_{0}) + γ^{T} Z - \log ({\hat{G}}_{0} (θ, t, \hat{X}, Z, Y, h_{n})) d N_{i} (t),

following Song and Wang (2017), where the corrected log partial likelihood function was used to select smoothing parameters for the time-varying coefficient proportional hazards model.

5 |. EXTENSION TO MULTIPLE DISCRETIZED CATEGORIES

The approaches can be extended to multiple discretized categories. Specifically, with a sequence of cutoff points x₁ < x₂ < … < x_K, using I(X ≤ x₁) as the reference group, we may include K dummy variables I(x_k < X ≤ x_k+1) (k = 1, …, K, and x_K+1 = ∞) in the proportional hazards model. In parallel to the derivation in the dichotomized case, we consider an alternatively parameterization such that

λ (t ∣ X_{i}^{*}, Z_{i}) = λ_{0} (t) \exp {β_{0}^{T} X_{i}^{*} (t) + γ_{0}^{T} Z_{i}},

where $X_{i}^{*} (t) = {(X_{i 1}^{*} (t), \dots, X_{i K}^{*} (t))}^{T}$ with $X_{i k}^{*} (t) = I (X_{i} (t) > x_{k})$ for k = 1, …, K. Note that $I (x_{k} < X \leq x_{k + 1}) = X_{i k}^{*} (t) - X_{i (k + 1)}^{*} (t)$ , and K = 1 corresponds to the dichotomized case. The RC and RRC estimators can be obtained in the same way by replacing $X_{i k}^{*} (t)$ by $0.8 {\hat{σ}}_{X} (t) {({\hat{σ}}_{X}^{2} (t) + {\hat{σ}}_{e i}^{2} (t))}^{- 1 / 2} X_{i k}^{*} (t)$ in the partial likelihood estimating equation as described in Section 3. Letting ${\hat{X}}_{i}^{S} (t) = {(K_{n} ({\hat{X}}_{i 1} (t) - x_{1}), \dots, K_{n} ({\hat{X}}_{i K} (t) - x_{K}))}^{T}$ , the SCS estimating equation can be written as

\hat{U} (θ) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{τ} {({{\hat{X}}_{i}^{S} (t)}^{T}, Z_{i}^{T})^{T} - \frac{{\hat{G}}_{1} (θ, t, \hat{X}, Z, Y, h_{n})}{{\hat{G}}_{0} (θ, t, \hat{X}, Z, Y, h_{n})}} d N_{i} (t) = 0,

where for r = 0, 1, ${\hat{G}}_{r} (θ, t, \hat{X}, Z, Y, h_{n}) = n^{- 1} \sum_{i = 1}^{n} g_{r i} (θ, t, \hat{X}, Z, Y, h_{n})$ with

g_{r i} (θ, t, \hat{X}, Z, Y, h_{n}) = Y_{i} (t) {({{\hat{X}}_{i}^{S} (t)}^{T}, Z_{i}^{T})^{T} - {(σ_{i} (t) β^{T}, 0^{T})}^{T}}^{\otimes r} \times \exp {β^{T} {\hat{X}}_{i}^{S} (t) + γ^{T} Z_{i} - σ_{i} (t) β^{T} β / 2}, σ_{i} (t) = h_{n}^{- 2} σ^{2} f^{T} (t) {(F_{i}^{T} F_{i})}^{- 1} f (t) {K^{'} (h_{n}^{- 1} ({\hat{X}}_{i 1} (t) - x_{1})), \dots, {K^{'} (h_{n}^{- 1} ({\hat{X}}_{i K} (t) - x_{K}))}}^{\otimes 2} .

The asymptotic variance of the SCS estimator can be calculated the same way as described in Section 4 with $K_{n} ({\hat{X}}_{i} (t) - x_{0})$ replaced by ${{\hat{X}}_{i}^{S} (t)}^{T}$ in (5).

6 |. NUMERICAL STUDIES

6.1 |. Simulation studies

We conducted simulation studies to evaluate the performance of the proposed approaches. Note that the naive, RC, RRC, and SCS include only subjects with at least q observations (m_i ≥ q). Let ${\bar{m}}^{*} = \sum m_{i} I (m_{i} \geq q) / \sum I (m_{i} \geq q)$ be the average numbers of observations per subject for those with m_i ≥ q. We first considered the case that there was only one time-dependent covariate X_i(t) = α_i0 + α_i1t with β₀ = −1, where (α_i0, α_i1) were jointly normal with mean (2.627, −0.0019), variances (0.02408,0.000014), and the covariance −0.00008. A similar covariate process was used by Song (2002b), which mimicked the log10 transformed CD4 profile in a real HIV data set. The cutoff point is x₀ = log₁₀(350). The longitudinal observations of X(t) were scheduled to be measured at t = 0, 6, 12, …, 84. The censoring time was generated from an exponential distribution with mean equal to 185 and truncated at 84. We considered two baseline hazard models: (1) λ₀(t) = 0.02I(t ≥ 12) + 0.005I(t ≥ 60); (2) λ₀(t) = 0.0055I(t ≥ 12) + 0.0025I(t ≥ 60). These are referred to as the larger λ₀(t) and smaller λ₀(t) henceforth. The corresponding censoring rates were 50% and 80%, and ${\bar{m}}^{*} = 9.1$ (SD = 4.5) and 11.2 (SD = 4.5), respectively, where SD denotes the average standard deviation across the simulated data sets.

For each scenario, 1000 simulated data sets were generated. We estimated the regression coefficient using five methods: (1) the “ideal” approach assuming the true value of X_i(t) known; (2) the naive approach; (3) the RC approach; (4) the RRC approach; and (5) the SCS method. For each method, we calculated the empirical bias, standard deviation of the estimates across the simulated data sets, the average of the estimated standard errors, and the empirical coverage probability of the 95% Wald confidence interval. For the RRC approach, var(α_i) is estimated at 0, 20, 40, 60, and 80 percentiles of V; our preliminary studies indicate that it works reasonably well.

We used the logistic distribution function to approximate the indicator function. To speed the calculation, we set the tuning parameter

h_{n} = c {\hat{σ}}_{e} (0) {\hat{σ}}_{X} (0) {{\hat{σ}}_{e}^{2} (0) + {\hat{σ}}_{X}^{2} (0)}^{- 1 / 2},

(6)

where ${\hat{σ}}_{e}^{2} (0) = \sum_{i = 1}^{n} Y_{i} (0) {\hat{σ}}_{e i}^{2} (0) / \sum_{i = 1}^{n} Y_{i} (0)$ and c is a constant. We set h_n proportional to ${\hat{σ}}_{X} (0)$ so that the estimator is invariant to scale transformation of X. Considering the “ideal” estimator is the optimal estimator when there is no measurement error, we set h_n proportional to ${\hat{σ}}_{e} (0)$ . This also makes h_n proportional to $M_{n}^{- 1 / 2}$ . The factor ${{\hat{σ}}_{e}^{2} (0) + {\hat{σ}}_{X}^{2} (0)}^{- 1 / 2}$ is included to ensure that h_n is the same unit as X. In our simulation studies, c = 0.18 works reasonably well.

In case 1, the error was generated from a normal distribution with mean 0 and variance σ² = 0.01 or 0.02. The results for n = 500 and 1000 are shown in Table 1. The naive estimator is biased and the coverage probabilities are well below the nominal level. The RC estimator has smaller bias and better coverage than the naive estimator, and the bias is smaller with higher censoring rate. The RRC estimator outperforms the naive and RC, but still shows some bias and under-coverage. The SCS estimator has bias close to the “ideal” estimator, coverage probabilities close to the nominal level, and empirical standard deviations smaller than the RC and RRC.

TABLE 1.

Simulation results for normal error

			LH (50% censoring)				SH (80% censoring)
Censoring	σ ²	Method	Bias	SD	SE	CP	Bias	SD	SE	CP
n = 500		Ideal	−0.000	0.132	0.133	0.955	0.008	0.228	0.221	0.949
	0.01	Naive	0.206	0.131	0.130	0.640	0.218	0.219	0.212	0.794
		RC	−0.176	0.163	0.160	0.809	−0.129	0.274	0.263	0.923
		RRC	−0.100	0.162	0.161	0.912	−0.089	0.273	0.264	0.934
		SCS	−0.017	0.149	0.147	0.945	0.035	0.242	0.237	0.953
	0.02	Naive	0.264	0.133	0.130	0.462	0.281	0.214	0.211	0.699
		RC	−0.255	0.183	0.160	0.646	−0.177	0.279	0.261	0.892
		RRC	−0.087	0.164	0.161	0.909	−0.078	0.270	0.262	0.937
		SCS	−0.039	0.154	0.150	0.935	0.028	0.248	0.244	0.954
n = 1000		Ideal	0.000	0.096	0.094	0.948	0.002	0.155	0.156	0.947
	0.01	Naive	0.210	0.091	0.092	0.381	0.212	0.150	0.150	0.694
		RC	−0.165	0.114	0.113	0.688	−0.133	0.187	0.186	0.902
		RRC	−0.093	0.113	0.114	0.875	−0.095	0.188	0.187	0.921
		SCS	−0.010	0.104	0.104	0.945	0.025	0.166	0.167	0.953
	0.02	Naive	0.270	0.091	0.092	0.162	0.277	0.146	0.149	0.541
		RC	−0.232	0.127	0.113	0.466	−0.172	0.191	0.185	0.846
		RRC	−0.079	0.113	0.113	0.894	−0.080	0.184	0.186	0.932
		SCS	−0.034	0.107	0.106	0.931	0.020	0.171	0.172	0.950

Open in a new tab

Note: λ(t|I(X_i(t) > x₀)) = λ₀(t) exp{β₀I(X_i(t) > x₀)} with β₀ = −1, X_i(t) = α_i0 + α_i1t measured at t = 0, 6, 12, …, 84, x₀ = log₁₀(350). LH, larger λ₀(t) = 0.02I(t ≥ 12) + 0.005I(t ≥ 60); SH, smaller λ₀(t) = 0.0055I(t ≥ 12) + 0.0025I(t ≥ 60). SD, empirical standard deviation of the estimates across simulated data sets; SE, average of estimated standard errors; CP, coverage probability of the 95% Wald confidence interval.

Case 2 is the same as case 1 except that the error was generated from nonnormal distributions with mean zero and variance σ². We considered four nonnormal distributions: (i) a scaled t-distribution with degrees of freedom 4, mimicking the error distribution in the ACTG 175 study; (ii) a uniform distribution; (iii) a skewed bimodal normal mixture; (iv) a symmetric bimodal normal mixture. Here the mixtures in (iii) and (iv) are from two normal distributions $N (μ_{1}, σ_{N}^{2})$ and $N (μ_{2}, σ_{N}^{2})$ with μ₁ = (1 − p_m)aσ_N, μ₂ = −p_maσ_N and $σ_{N}^{2} = σ^{2} / (1 + p_{m} (1 - p_{m}) a^{2})$ , where p_m is the mixing proportion and aσ_N is the distance between the means of the two normal distributions, with p_m = 0.3 and a = 3 for (iii) and p_m = 0.5 and a = 10 for (iv). The results for n = 1000 are given in Table 2, which mostly follow a pattern similar to that observed in case 1. Although the RC estimator works better than the naive estimator in most cases, it may have larger bias when the censoring rate is 50% and the error variance is 0.2. The results for n = 500 are given in Web Table S1.

TABLE 2.

Simulation results for nonnormal error

Error distribution	σ ²	Method	LH (50% censoring)				SH (80% censoring)
Error distribution	σ ²	Method	Bias	SD	SE	CP	Bias	SD	SE	CP
		Ideal	0.000	0.096	0.094	0.948	0.002	0.155	0.156	0.947
t	0.01	Naive	0.197	0.091	0.092	0.416	0.202	0.154	0.151	0.722
		RC	−0.187	0.117	0.114	0.621	−0.149	0.192	0.187	0.890
		RRC	−0.109	0.113	0.114	0.847	−0.107	0.192	0.187	0.908
		SCS	−0.015	0.098	0.098	0.945	0.013	0.170	0.167	0.947
	0.02	N	0.253	0.092	0.092	0.204	0.263	0.149	0.149	0.555
		RC	−0.274	0.146	0.114	0.368	−0.207	0.203	0.186	0.782
		RRC	−0.099	0.115	0.114	0.863	−0.098	0.188	0.186	0.916
		SCS	−0.032	0.097	0.098	0.933	0.003	0.173	0.171	0.946
uniform	0.01	Naive	0.217	0.091	0.092	0.332	0.220	0.148	0.150	0.676
		RC	−0.156	0.113	0.113	0.728	−0.123	0.184	0.186	0.916
		RRC	−0.085	0.113	0.114	0.886	−0.085	0.185	0.186	0.934
		SCS	0.007	0.099	0.098	0.945	0.033	0.165	0.167	0.947
	0.02	Naive	0.276	0.092	0.092	0.145	0.286	0.147	0.149	0.510
		RC	−0.222	0.125	0.113	0.487	−0.158	0.190	0.184	0.868
		RRC	−0.073	0.114	0.113	0.897	−0.070	0.185	0.185	0.934
		SCS	−0.007	0.100	0.098	0.949	0.030	0.172	0.171	0.946
skewed normal mixture	0.01	Naive	0.204	0.092	0.092	0.408	0.213	0.151	0.150	0.685
skewed normal mixture		RC	−0.171	0.116	0.113	0.660	−0.131	0.187	0.186	0.898
		RRC	−0.100	0.115	0.114	0.862	−0.093	0.188	0.187	0.930
		SCS	−0.008	0.099	0.098	0.941	0.026	0.168	0.167	0.945
	0.02	N	0.255	0.093	0.092	0.224	0.273	0.150	0.149	0.541
		RC	−0.247	0.131	0.113	0.434	−0.175	0.191	0.185	0.846
		RRC	−0.095	0.116	0.114	0.858	−0.084	0.187	0.186	0.930
		SCS	−0.030	0.099	0.098	0.937	0.015	0.171	0.171	0.952
symmetric normal mixture	0.01	Naive	0.221	0.091	0.092	0.319	0.227	0.149	0.150	0.647
		RC	−0.152	0.114	0.113	0.739	−0.114	0.185	0.186	0.922
		RRC	−0.081	0.113	0.114	0.893	−0.077	0.185	0.186	0.933
		SCS	0.011	0.098	0.098	0.955	0.042	0.165	0.167	0.947
	0.02	Naive	0.280	0.091	0.092	0.133	0.294	0.147	0.148	0.489
		RC	−0.215	0.123	0.113	0.512	−0.149	0.186	0.184	0.877
		RRC	−0.067	0.113	0.113	0.912	−0.061	0.183	0.185	0.938
		SCS	−0.003	0.098	0.099	0.952	0.040	0.170	0.171	0.952

Open in a new tab

Note: λ(t|I(X_i(t) > x₀)) = λ₀(t) exp{β₀I(X_i(t) > x₀)} with β₀ = −1, x₀ = log₁₀350, X_i(t) = α_i0 + α_i1t measured at t = 0, 6, 12, …, 84, n = 1000. LH, larger λ₀(t) = 0.02I(t ≥ 12) + 0.005I(t ≥ 60); SH, smaller λ₀(t) = 0.0055I(t ≥ 12) + 0.0025I(t ≥ 60). SD, empirical standard deviation of the estimates across simulated data sets; SE, average of estimated standard errors; CP, coverage probability of the 95% Wald confidence interval.

In case 3, we added in case 1 a treatment indicator Z, which was generated from a Bernoulli distribution with probability 0.5. The covariate X(t) has mean 2.627 but with different slopes for the two treatment arms, which equal −0.035 when Z = 0 and 0.015 when Z = 1. The true coefficient γ₀ = −0.5. The censoring rates were 59% and 84%, and ${\bar{m}}^{*} = 9.7 (SD = 4.6)$ and 11.5 (SD = 4.4), respectively, corresponding to the larger and smaller λ₀(t). The results for n = 1000 are shown in Table 3. The naive estimator shows obvious bias for estimating both β₀ and γ₀. The RC and RRC reduces bias but the bias is still large for estimating β₀. The SCS works well for estimating both β₀ and γ₀.

TABLE 3.

Simulation results with an extra treatment indicator

			LH (59% censoring)				SH (84% censoring)
	σ ²	Method	Bias	SD	SE	CP	Bias	SD	SE	CP
β ₀	0.01	Ideal	−0.005	0.111	0.108	0.944	0.001	0.187	0.183	0.936
		Naive	0.212	0.109	0.106	0.492	0.228	0.183	0.178	0.740
		RC	−0.179	0.134	0.130	0.718	−0.130	0.224	0.219	0.911
		RRC	−0.107	0.134	0.131	0.868	−0.091	0.225	0.220	0.927
		SCS	−0.009	0.123	0.117	0.932	0.039	0.203	0.195	0.946
	0.02	Naive	0.270	0.108	0.106	0.278	0.292	0.184	0.177	0.613
		RC	−0.257	0.146	0.130	0.499	−0.184	0.231	0.218	0.861
		RRC	−0.104	0.132	0.131	0.882	−0.089	0.225	0.219	0.926
		SCS	−0.031	0.126	0.119	0.928	0.032	0.209	0.200	0.941
γ ₀		Ideal	−0.011	0.111	0.109	0.948	−0.008	0.178	0.181	0.955
	0.01	Naive	−0.087	0.111	0.109	0.881	−0.092	0.181	0.183	0.922
		RC	−0.041	0.111	0.109	0.933	−0.050	0.180	0.182	0.952
		RRC	−0.056	0.111	0.109	0.928	−0.058	0.180	0.182	0.947
		SCS	−0.018	0.113	0.109	0.941	−0.025	0.183	0.182	0.957
	0.02	Naive	−0.109	0.111	0.109	0.818	−0.118	0.182	0.183	0.903
		RC	−0.031	0.112	0.109	0.940	−0.044	0.180	0.182	0.957
		RRC	−0.063	0.111	0.109	0.917	−0.064	0.181	0.182	0.951
		SCS	−0.017	0.114	0.109	0.935	−0.027	0.184	0.183	0.952

Open in a new tab

Note: λ(t|I(X_i(t) > x₀)) = λ₀(t) exp{β₀I(X_i(t) > x₀) + γ₀Z_i} with β₀ = −1, γ₀ = −0.5, X_i(t) = α_i0 + α_i1t measured at t = 0, 6, 12, …, 84, normal error, x₀ = log₁₀350, normal error, n = 1000. LH, larger λ₀(t) = 0.02I(t ≥ 12) + 0.005I(t ≥ 60); SH, smaller λ₀(t). SD, empirical standard deviation of the estimates across simulated datasets; SE, average of estimated standard errors; CP, coverage probability of the 95% Wald confidence interval.

We also conducted simulations in settings with different β₀, with less observations on $X (t) ({\bar{m}}^{*} < 3)$ , with informative observation times, or with trichotomized categories of X(t) (See Web Appendix S.2). Overall, the simulation evidence suggests that the SCS works better than the naive, RC, and RRC methods. We also assessed the performance of the SCS with h_n selected by fivefold cross-validation for a small number (100) of simulated data sets under cases 1 and 2 with n = 1000 (Web Table S6). It works reasonably well compared to the results calculated based on (6).

6.2 |. Application

We applied the proposed approaches to the ACTG 175 data. We are interested in evaluating I(CD4 count ≤ 350 cells/mm³) as a potential surrogate marker. There were a total of 308 events observed during the study. The log10 transformation was applied to CD4 count to achieve approximately constant error variance. Figure 2(A) presents the log10-transformed CD4 trajectories for 10 randomly selected subjects, which shows an initial increase, with a peak around week 12, followed by an approximate linear decline. Because only nine events occurred by week 12, for simplicity, we considered the data after week 12 assumed X(u) = α₀ + α₁u represents the inherent log10 CD4 count at time u. The CD4 observations before week 12 were excluded from the analysis. The analysis included 2186 subjects with at least two CD4 observations after week 12 with an average 9.4 (SD = 3.2) observations. The residual plot from the least square estimates shows that error variance is approximately constant after the log10 transformation (Figure 2B), and the corresponding normal and Student’s t Q–Q plots indicate that the error distribution may be short tailed compared to the normal but close to a scaled t-distribution with degrees of freedom 4 (Figure 2C and D). The estimated error variance is 0.011, which was about 40% of the estimated baseline CD4 variance. The primary analysis found zidovudine alone to be inferior to the other three therapies; thus, further investigations focused on two treatment groups, zidovudine alone and the combination of the other three. Let Z = I(treatment ≠ zidovudine alone) be the treatment indicator. The estimate $\bar{\hat{α}} = {(2.525, - 0.0023)}^{T}$ for Z = 0 and (2.581, −0.0021) for Z = 1.

(A) Trajectories of log₁₀(CD4) for 10 randomly selected subjects; (B) residual plot; (C) normal Q–Q plot, plot of empirical quantiles of residuals versus the theoretical quantiles of standard normal distribution; (D) Student’s t Q–Q plot, plot of empirical quantiles of residuals versus theoretical quantiles of t distribution with degrees of freedom 4. The reference lines in (C) and (D) are obtained from robust linear regression of the empirical quantiles on the theoretical quantiles. This figure appears in color in the electronic version of this article, and any mention of color refers to that version

To assess I(CD4 count ≤ 350 cells/mm³) or equivalently I(X_i(t) > log₁₀ 350) as a surrogate marker, we consider three proportional hazards models: (1) a hazard model with the covariate treatment Z only; (2) a hazard model with I(X_i(t) > log₁₀ 350) only; (3) a hazard model with both Z and I(X_i(t) > log₁₀ 350). Model (1) includes only an error-free covariate Z and was fitted via the standard partial likelihood approach. Models (2) and (3) include a dichotomized covariate of X_i(t) and were fitted using the naive, RC, RRC, and SCS approaches.

The results are shown in Table 4. Treatment alone [Model (1)] shows a significant effect. The results from the naive, SCS, RC and RRC approaches also show that I(X_i(t) > log₁₀ 350) alone [Model (2)] has a significant effect on survival time, which indicates that CD4 count ≤ 350 cells/mm³ is associated with larger hazard and is prognostic for time to AIDS or death. Model (3) shows that the treatment effect is no longer significant after adjusted for I(X_i(t) > log₁₀ 350), which implies that the treatment effect is mediated through whether CD4 count ≤ 350 cells/mm³, and hence confirms the Prentice’s conditions of surrogate markers. The SCS estimates are larger in magnitude for estimation of β₀ and smaller for estimation of γ₀ compared to the naive estimates, which might be due to the correction of the bias caused by the measurement error. The RC and RRC estimates are larger in magnitude than the corresponding SCS estimates, and the standard errors are either larger or comparable.

TABLE 4.

Analysis of the ACTG 175 data

		β ₀		exp(β₀)		γ ₀		exp(γ₀)
		Est	SE	Est	CI	Est	SE	Est	CI
Model 1		–	–	–	–	−0.363	0.131	0.696	(0.538,0.900)
Model 2	Naive	−1.978	0.215	0.138	(0.091,0.211)	–	–	–	–
	RC	−2.864	0.293	0.057	(0.032,0.101)	–	–	–	–
	RRC	−2.828	0.290	0.059	(0.033,0.104)	–	–	–	–
	SCS	−2.673	0.265	0.069	(0.041,0.116)	–	–	–	–
Model 3	Naive	−1.958	0.216	0.141	(0.092,0.215)	−0.196	0.133	0.822	(0.633,1.067)
	RC	−2.705	0.282	0.067	(0.038,0.116)	−0.189	0.133	0.827	(0.637,1.074)
	RRC	−2.696	0.282	0.067	(0.039,0.117)	−0.190	0.133	0.827	(0.637,1.074)
	SCS	−2.670	0.268	0.069	(0.041,0.117)	−0.157	0.133	0.855	(0.659,1.109)

Open in a new tab

Abbreviations: CI, 95% confidence interval; Est, estimate; SE, estimated standard errors.

7 |. DISCUSSION

We propose an SCS approach for the proportional hazards model with time-dependent discretized covariates. We also extend the RC and RRC to time-dependent discretized covariates. Our numerical studies indicate that the SCS works better than the RC and RRC. We focus on the case when the cut-off points are known. In some situations, the cut-off points need to be estimated (e.g., quantiles). The case is of interest but beyond the scope of this paper.

The proposed approaches can be extended to more flexible models such as the time-varying coefficient model and the partially time-varying coefficient models using techniques as those in Song and Wang (2008) and Song and Wang (2017).

Supplementary Material

Supplementary material: code

NIHMS1826395-supplement-Supplementary_material__code.zip^{(525.3KB, zip)}

Supplementary material

NIHMS1826395-supplement-Supplementary_material.pdf^{(300.3KB, pdf)}

ACKNOWLEDGMENTS

This research was partially supported by NIH grants R43GM134768 and R44GM100573 (Chao, Wang and Song), CA239168 (Wang and Song), CA235122 and S10OD028685 (Wang), and NSF grant DMS-1916411 (Song).

Funding information

National Science Foundation, Grant/Award Number: 1916411; National Institutes of Health, Grant/Award Numbers: CA235122, CA239168, R43GM134768, R44GM100573, S10OD028685

Footnotes

SUPPORTING INFORMATION

Web Appendices and Tables referenced in Sections 3, 4, and 6.1 along with the implemented C++ code and the demonstrating simulation data are available with this paper at the Biometrics website on Wiley Online Library.

DATA AVAILABILITY STATEMENT

The ACTG 175 data used in Section 6.2 are available on request from AIDS Clinical Trials Group (https://actgnetwork.org/).

REFERENCES

Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006) Measurement error in nonlinear models. New York, NY: Chapman and Hall/CRC. [Google Scholar]
Dafni UG & Tsiatis AA (1998) Evaluating surrogate markers of clinical outcome measured with error. Biometrics, 54, 1445–1462. [PubMed] [Google Scholar]
Dalen I, Buonaccorsi JP, Sexton JA, Laake P & Thoresen M (2009) Correction for misclassification of a categorized exposure in binary regression using replication data. Statistics in Medicine, 28, 3386–3410. [DOI] [PubMed] [Google Scholar]
Flegal KM, Keyl PM & Nieto FJ (1991) Differential misclassification arising from nondifferential errors in exposure measurement. American Journal of Epidemiology, 134, 1233–1244. [DOI] [PubMed] [Google Scholar]
Gustafson P (2004) Measurement error and misclassification in statistics and epidemiology: impacts and Bayesian adjustments. New York, NY: Chapman and Hall. [Google Scholar]
Gustafson P & Le DN (2002) Comparing the effects of continuous and discrete covariate mismeasurement, with emphasis on the dichotomization of mismeasured predictors. Biometrics, 58, 878–887. [DOI] [PubMed] [Google Scholar]
Hammer SM, Katezstein DA, Hughes MD, Gundaker H, Schooley RT, Haubrich RH et al. (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335, 1081–1089. [DOI] [PubMed] [Google Scholar]
Heller G (2007) Smoothed rank regression with censored data. Journal of the American Statistical Association, 102, 552–559. [Google Scholar]
Horowitz JL (1992) A smoothed maximum score estimator for the binary response model. Econometrica, 60, 505–531. [Google Scholar]
Liao X, Zucker DM, Li Y & Speigelman D (2011) Survival analysis with error-prone time-varying covariates: a risk set calibration approach. Biometrics, 67, 50–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
Ma S & Huang J (2005) Regularized ROC method for disease classification and biomarker selection with microarray data. Bioinformatics, 21, 4356–4362. [DOI] [PubMed] [Google Scholar]
May MT, Gompels M, Delpech V, Porter K, Orkin C, Kegg S et al. (2014) Impact on life expectancy of HIV-1 positive individuals of CD4+ cell count and viral load response to antiretroviral therapy. AIDS, 28, 1193–1202. [DOI] [PMC free article] [PubMed] [Google Scholar]
Natarajan L (2009) Regression calibration for dichotomized mismeasured predictors. International Journal of Biostatistics 5(1), 1143. [DOI] [PMC free article] [PubMed] [Google Scholar]
Prentice R (1982) Covariate measurement errors and parameter estimates in a failure time regression model. Biometrika, 69, 331–342. [Google Scholar]
Prentice R (1989) Surrogate endpoints in clinical trials: definition and operation criteria. Statistics in Medicine, 8, 431–440. [DOI] [PubMed] [Google Scholar]
Seguin RA, Buchner D, Lui J, Messina C, Manson J, Moreland L et al. (2014) Sedentary behavior and mortality in older women: the womens health initiative observational and extension studies. American Journal of Preventive Medicine, 46, 122–135. [DOI] [PMC free article] [PubMed] [Google Scholar]
Song X (2017) An improved corrected score estimator for the proportional hazards model with time-dependent covariates measured with error at informative observation times. Statistica Sinica, 27, 1037–1057. [Google Scholar]
Song X, Davidian M & Tsiatis AA (2002a) A semiparametric likelihood approach to joint modeling of longitudinal and time-to-event data. Biometrics, 58, 742–753. [DOI] [PubMed] [Google Scholar]
Song X, Davidian M & Tsiatis AA (2002b) An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics, 3, 511–528. [DOI] [PubMed] [Google Scholar]
Song X, Ma S, Huang J & Zhou XH (2007) A semiparametric approach for the nonparametric transformation survival model with multiple covariates. Biostatistics 8, 197–211. [DOI] [PubMed] [Google Scholar]
Song X & Wang CY (2008) Semiparametric approaches for joint modeling of longitudinal and survival data with time varying coefficients. Statistica Sinica, 27, 3178–3190. [DOI] [PubMed] [Google Scholar]
Song X & Wang L (2017) Partially time-varying coefficient proportional hazards models with error-prone time-dependent covariates - an application to the AIDS clinical trial group 175 data. Annals of Applied Statistics, 11, 274–296. [Google Scholar]
Tsiatis AA & Davidian M (2001) A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88, 447–458. [DOI] [PubMed] [Google Scholar]
Tsiatis AA, DeGruttola V & Wulfsohn MS (1995) Modeling the relationship of survival to longitudinal data measured with error: applications to survival and CD4 counts in patients with aids. Journal of the American Statistical Association, 90, 27–37. [Google Scholar]
Wang CY (2006) Corrected Score Estimator for Joint Modeling of Longitudinal And Failure Time Data. Statistica Sinica, 16, 235–253. [Google Scholar]
Wang CY, Tapsoba JD, Duggan C, Campbell K & McTiernan A (2016) Methods to Adjust for Misclassification in The Quantiles for the Generalized Linear Model With Measurement Error in Continuous Exposures. Statistics in Medicine, 35, 1676–1688. [DOI] [PMC free article] [PubMed] [Google Scholar]
Wulfsohn MS & Tsiatis AA (1997) A joint model for survival and longitudinal data measured with error. Biometrics, 53, 330–339. [PubMed] [Google Scholar]
World Health Organization (2009) Rapid advice: antiretroviral therapy for HIV infection in adults and adolescents. Available at: https://apps.who.int/iris/handle/10665/107280 (last accessed on 11/08/2021). [PubMed]
Xie SX, Wang CY & Prentice RL (2001) A risk set calibration method for failure time regression by using a covariate reliability sample. Journal of Royal Statistical Society, Series B, 63, 855–870. [Google Scholar]
Xu C, Baines PD & Wang JL (2014) Standard error estimation using the EM algorithm for the joint modeling of survival and longitudinal data. Biostatistics, 15, 731–744. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary material: code

NIHMS1826395-supplement-Supplementary_material__code.zip^{(525.3KB, zip)}

Supplementary material

NIHMS1826395-supplement-Supplementary_material.pdf^{(300.3KB, pdf)}

Data Availability Statement

The ACTG 175 data used in Section 6.2 are available on request from AIDS Clinical Trials Group (https://actgnetwork.org/).

[R1] Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006) Measurement error in nonlinear models. New York, NY: Chapman and Hall/CRC. [Google Scholar]

[R2] Dafni UG & Tsiatis AA (1998) Evaluating surrogate markers of clinical outcome measured with error. Biometrics, 54, 1445–1462. [PubMed] [Google Scholar]

[R3] Dalen I, Buonaccorsi JP, Sexton JA, Laake P & Thoresen M (2009) Correction for misclassification of a categorized exposure in binary regression using replication data. Statistics in Medicine, 28, 3386–3410. [DOI] [PubMed] [Google Scholar]

[R4] Flegal KM, Keyl PM & Nieto FJ (1991) Differential misclassification arising from nondifferential errors in exposure measurement. American Journal of Epidemiology, 134, 1233–1244. [DOI] [PubMed] [Google Scholar]

[R5] Gustafson P (2004) Measurement error and misclassification in statistics and epidemiology: impacts and Bayesian adjustments. New York, NY: Chapman and Hall. [Google Scholar]

[R6] Gustafson P & Le DN (2002) Comparing the effects of continuous and discrete covariate mismeasurement, with emphasis on the dichotomization of mismeasured predictors. Biometrics, 58, 878–887. [DOI] [PubMed] [Google Scholar]

[R7] Hammer SM, Katezstein DA, Hughes MD, Gundaker H, Schooley RT, Haubrich RH et al. (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335, 1081–1089. [DOI] [PubMed] [Google Scholar]

[R8] Heller G (2007) Smoothed rank regression with censored data. Journal of the American Statistical Association, 102, 552–559. [Google Scholar]

[R9] Horowitz JL (1992) A smoothed maximum score estimator for the binary response model. Econometrica, 60, 505–531. [Google Scholar]

[R10] Liao X, Zucker DM, Li Y & Speigelman D (2011) Survival analysis with error-prone time-varying covariates: a risk set calibration approach. Biometrics, 67, 50–58. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] Ma S & Huang J (2005) Regularized ROC method for disease classification and biomarker selection with microarray data. Bioinformatics, 21, 4356–4362. [DOI] [PubMed] [Google Scholar]

[R12] May MT, Gompels M, Delpech V, Porter K, Orkin C, Kegg S et al. (2014) Impact on life expectancy of HIV-1 positive individuals of CD4+ cell count and viral load response to antiretroviral therapy. AIDS, 28, 1193–1202. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] Natarajan L (2009) Regression calibration for dichotomized mismeasured predictors. International Journal of Biostatistics 5(1), 1143. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] Prentice R (1982) Covariate measurement errors and parameter estimates in a failure time regression model. Biometrika, 69, 331–342. [Google Scholar]

[R15] Prentice R (1989) Surrogate endpoints in clinical trials: definition and operation criteria. Statistics in Medicine, 8, 431–440. [DOI] [PubMed] [Google Scholar]

[R16] Seguin RA, Buchner D, Lui J, Messina C, Manson J, Moreland L et al. (2014) Sedentary behavior and mortality in older women: the womens health initiative observational and extension studies. American Journal of Preventive Medicine, 46, 122–135. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R17] Song X (2017) An improved corrected score estimator for the proportional hazards model with time-dependent covariates measured with error at informative observation times. Statistica Sinica, 27, 1037–1057. [Google Scholar]

[R18] Song X, Davidian M & Tsiatis AA (2002a) A semiparametric likelihood approach to joint modeling of longitudinal and time-to-event data. Biometrics, 58, 742–753. [DOI] [PubMed] [Google Scholar]

[R19] Song X, Davidian M & Tsiatis AA (2002b) An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics, 3, 511–528. [DOI] [PubMed] [Google Scholar]

[R20] Song X, Ma S, Huang J & Zhou XH (2007) A semiparametric approach for the nonparametric transformation survival model with multiple covariates. Biostatistics 8, 197–211. [DOI] [PubMed] [Google Scholar]

[R21] Song X & Wang CY (2008) Semiparametric approaches for joint modeling of longitudinal and survival data with time varying coefficients. Statistica Sinica, 27, 3178–3190. [DOI] [PubMed] [Google Scholar]

[R22] Song X & Wang L (2017) Partially time-varying coefficient proportional hazards models with error-prone time-dependent covariates - an application to the AIDS clinical trial group 175 data. Annals of Applied Statistics, 11, 274–296. [Google Scholar]

[R23] Tsiatis AA & Davidian M (2001) A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88, 447–458. [DOI] [PubMed] [Google Scholar]

[R24] Tsiatis AA, DeGruttola V & Wulfsohn MS (1995) Modeling the relationship of survival to longitudinal data measured with error: applications to survival and CD4 counts in patients with aids. Journal of the American Statistical Association, 90, 27–37. [Google Scholar]

[R25] Wang CY (2006) Corrected Score Estimator for Joint Modeling of Longitudinal And Failure Time Data. Statistica Sinica, 16, 235–253. [Google Scholar]

[R26] Wang CY, Tapsoba JD, Duggan C, Campbell K & McTiernan A (2016) Methods to Adjust for Misclassification in The Quantiles for the Generalized Linear Model With Measurement Error in Continuous Exposures. Statistics in Medicine, 35, 1676–1688. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] Wulfsohn MS & Tsiatis AA (1997) A joint model for survival and longitudinal data measured with error. Biometrics, 53, 330–339. [PubMed] [Google Scholar]

[R28] World Health Organization (2009) Rapid advice: antiretroviral therapy for HIV infection in adults and adolescents. Available at: https://apps.who.int/iris/handle/10665/107280 (last accessed on 11/08/2021). [PubMed]

[R29] Xie SX, Wang CY & Prentice RL (2001) A risk set calibration method for failure time regression by using a covariate reliability sample. Journal of Royal Statistical Society, Series B, 63, 855–870. [Google Scholar]

[R30] Xu C, Baines PD & Wang JL (2014) Standard error estimation using the EM algorithm for the joint modeling of survival and longitudinal data. Biostatistics, 15, 731–744. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

A smoothed corrected score approach for proportional hazards model with misclassified discretized covariates induced by error-contaminated continuous time-dependent exposure

Xiao Song

Edward C Chao

Ching-Yun Wang

Abstract

1 |. INTRODUCTION

2 |. MODEL DEFINITION

3 |. LINEARIZATION-BASED RC METHODS

FIGURE 1.

4 |. SMOOTHED CORRECTED SCORE ESTIMATOR

Lemma 1.

Proposition 1.

Remark.

5 |. EXTENSION TO MULTIPLE DISCRETIZED CATEGORIES

6 |. NUMERICAL STUDIES

6.1 |. Simulation studies

TABLE 1.

TABLE 2.

TABLE 3.

6.2 |. Application

FIGURE 2.

TABLE 4.

7 |. DISCUSSION

Supplementary Material

ACKNOWLEDGMENTS

Funding information

Footnotes

DATA AVAILABILITY STATEMENT

REFERENCES

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

A smoothed corrected score approach for proportional hazards model with misclassified discretized covariates induced by error-contaminated continuous time-dependent exposure

Xiao Song

Edward C Chao

Ching-Yun Wang

Abstract

1 |. INTRODUCTION

2 |. MODEL DEFINITION

3 |. LINEARIZATION-BASED RC METHODS

FIGURE 1.

4 |. SMOOTHED CORRECTED SCORE ESTIMATOR

Lemma 1.

Proposition 1.

Remark.

5 |. EXTENSION TO MULTIPLE DISCRETIZED CATEGORIES

6 |. NUMERICAL STUDIES

6.1 |. Simulation studies

TABLE 1.

TABLE 2.

TABLE 3.

6.2 |. Application

FIGURE 2.

TABLE 4.

7 |. DISCUSSION

Supplementary Material

ACKNOWLEDGMENTS

Funding information

Footnotes

DATA AVAILABILITY STATEMENT

REFERENCES

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases