Joint modeling of longitudinal and survival data with the Cox model and two-phase sampling

Rong Fu; Peter B Gilbert

doi:10.1007/s10985-016-9364-1

Author manuscript; available in PMC: 2018 Jan 1.

Published in final edited form as: Lifetime Data Anal. 2016 Mar 23;23(1):136–159. doi: 10.1007/s10985-016-9364-1

PMCID: PMC5035179 NIHMSID: NIHMS772253 PMID: 27007859

Abstract

A common objective of cohort studies and clinical trials is to assess time-varying longitudinal continuous biomarkers as correlates of the instantaneous hazard of a study endpoint. We consider the setting where the biomarkers are measured in a designed sub-sample (i.e., case-cohort or two-phase sampling design), as is normative for prevention trials. We address this problem via joint models, with underlying biomarker trajectories characterized by a random effects model and their relationship with instantaneous risk characterized by a Cox model. For estimation and inference we extend the conditional score method of Tsiatis and Davidian (2001) to accommodate the two-phase biomarker sampling design using augmented inverse probability weighting with nonparametric kernel regression. We present theoretical properties of the proposed estimators and finite-sample properties derived through simulations, and illustrate the methods with application to the AIDS Clinical Trials Group (ACTG) 175 antiretroviral therapy trial. We discuss how the methods are useful for evaluating a Prentice surrogate endpoint, mediation, and for generating hypotheses about biological mechanisms of treatment efficacy.

Keywords: Case-cohort, Measurement error, Proportional hazards model, Prentice surrogate endpoint evaluation, Random effects model

1 Introduction

Consider a prospective cohort study or randomized clinical trial that measures one or more continuous biomarkers longitudinally in a designed sub-sample of study participants and follows subjects for a study endpoint of interest. Our goal is to study the association of the underlying trajectory of the true biomarker over time with the instantaneous hazard rate of the study endpoint via a Cox model, with related goals of assessing the biomarker as a surrogate endpoint (Prentice, 1989) and as a mediator. To allow unbiased estimation accounting for measurement error in the biomarker, we use the “joint modeling” approach that models the underlying true trajectory of the time-dependent covariate and the event time simultaneously (Tsiatis et al, 1995).

A joint model consists of two sub-models: one for the inherent trajectory of the time-dependent covariate and one for the time-to-event process. For the covariate trajectory, commonly used models include linear mixed effects models (Henderson et al, 2000; Guo and Carlin, 2004) and linear random effects models (Dafni and Tsiatis, 1998; Wulfsohn and Tsiatis, 1997; Tsiatis and Davidian, 2001). For the time-to-event process, we focus on the Cox model because it can be a good fit to vaccine efficacy trial applications where there is major interest in understanding how the current level of an immune response biomarker affects the instantaneous risk of pathogen infection, and the proportional hazards assumption may be reasonable. Unlike the standard Cox model, the joint modeling method aims to quantify the association of the underlying true and unobserved evolution of the biomarker process with the time-to-event process.

One approach to fitting the joint model is the biased two-stage method, which first fits the time-dependent covariate models and secondly uses ordinary Cox regression with fitted covariate values from the stage one models (Self and Pawitan, 1992). Another approach is recalibration methods building on Prentice (1982), which reduce the bias by estimating the hazard function given the observed covariate values (Tsiatis et al, 1995; Wang et al, 2001). However, since it is complex to derive the analytical form of the observed-covariate hazard, such methods are generally based on strong modeling assumptions and approximations, thus not completely eliminating bias. Likelihood approaches have also been developed, which are asymptotically consistent and efficient under correct specification of the joint distribution of the random effects and the event time data (DeGruttola and Tu, 1994; Wulfsohn and Tsiatis, 1997; Rizopoulos et al, 2009). However, misspecification induces bias and the method is computationally burdensome, especially for multiple biomarkers, because the likelihood function typically does not have a closed form solution for the integral over unknown random effects.

Tsiatis and Davidian (2001) developed a conditional score method with the notable innovation that it does not require any distributional assumptions for the random effects. This method is much less computationally intensive and is easy to generalize to handle multiple time-dependent covariates (Song et al, 2002). Moreover, it yields consistent and asymptotically normal estimators under regularity conditions. Wang (2006) proposed a corrected score method and compared it with the two-stage, regression calibration and conditional score methods. For broader reviews of joint modeling research, see Tsiatis and Davidian (2004) and Wu et al (2012). Here we favor the conditional score method, because many studies assess multiple biomarkers (e.g., vaccine efficacy trials) where the likelihood approach may fail computationally. However, to the best of our knowledge, no joint modeling approaches have focused on the common situation where the longitudinal biomarkers are measured on a designed sub-sample of the full study cohort, for example with a case-cohort sample (Prentice, 1986) or two-phase sample (Haneuse et al, 2011). We extend the full data conditional score method to handle general types of missing at random sub-sampling designs with Bernoulli sampling of subjects for measuring the biomarker trajectories, using inverse probability weighting (IPW) or augmented IPW (AIPW) to correct for biased sampling. Our method is developed under a general two-phase sampling scheme, which includes the outcome-dependent sampling designs as special cases (e.g., case-control and case-cohort designs). For simplicity of exposition, we concentrate on the model with a single longitudinal biomarker. The theories and derivations also apply for multiple biomarkers.

Section 2 introduces the models and summarizes their application for assessing Prentice surrogate endpoints and mediators. Section 3 develops IPW and AIPW estimation and testing procedures, including theoretical results on consistency, asymptotic normality, and relative efficiency. Section 4 studies finite-sample properties of the methods in simulations where the case-control sub-sampling design is considered. Section 5 illustrates the methods with a case-control sample from the ACTG 175 trial, and Section 6 concludes with discussion. Web Appendix A in the Online Resource provides results on using the method for assessing mediation. Web Appendix B provides background IPW and AIPW theory that undergirds Section 3. Web Appendix C contains proofs of all results in Section 3 (drawing on results in Web Appendix B), Web Appendix D contains additional simulation results, and Web Appendix E contains additional ACTG 175 results. R code implementing the methods is posted at the second author’s website.

2 Notation, Joint Model, Sampling Design, Surrogates and Mediators

Let T and C be the event time and censoring time; we observe V = min(T, C) and Δ = I(T ≤ C). Let Z̃ = (Z, L^T)^T where Z is the randomized treatment group (1 for active and 0 for placebo) and L is a p − 1 dimensional vector of baseline time-independent covariates of interest to control for in assessing the association of the biomarker with the hazard of T. The same setup applies for prospective cohort studies without an assigned treatment Z, with Z̃ = L. We denote the time-dependent biomarker process, which is not observed directly, by X̃(τ) = {X(u), 0 ≤ u ≤ τ}, with τ the end of follow-up and X(u) the biomarker value at time u. We assume the following random effects model representing the inherent trajectory of X(u):

X (u) = α^{T} f (u),

(1)

where f(u) is a q-dimensional vector of known functions of time u and α is a vector of subject-specific random effects. We assume the observed longitudinal biomarker values W (u) are from an additive measurement error model

W (u) = α^{T} f (u) + e (u),

(2)

where the measurement errors e(u) are normally distributed with mean zero and covariance ℂov{e(u), e(s)} = I(u = s)σ². Suppose the set of measurement time points is $T^{m} = (T_{1}^{m}, \dots, T_{J}^{m})^{T}$ with $0 \leq T_{1}^{m} < T_{2}^{m} < \dots < T_{J}^{m} \leq V$, and J the total number of time points. These time points and J can vary across subjects, which occurs for many studies. Let W = (W₁,⋯, W_J)^T where $W_{j} = W(T_{j}^{m})$ and e = (e₁,⋯, e_J)^T with $e_{j} = e(T_{j}^{m})$.

For any fixed time u, let J(u) be the number of measurement time points up to and including time u (i.e., $0 \leq T_{1}^{m} < \dots < T_{J(u)}^{m} \leq u < T_{J(u)+1}^{m}$), let $T^{m}(u) = (T_{1}^{m}, \dots, T_{J(u)}^{m})^{T}$ be the corresponding vector of ordered measurement time points, and let $\tilde{e}(u) = (e_{1}, \dots, e_{J(u)})^{T}$. We consider the following proportional hazards model of the event time

\begin{array}{l} λ(u) = \lim_{du \to 0} \frac{1}{du} ℙr\{u \leq T < u + du \mid T \geq u, α, \tilde{Z}, T^{m}(u), \tilde{e}(u), C\} \\ = \lim_{du \to 0} \frac{1}{du} ℙr(u \leq T < u + du \mid T \geq u, α, \tilde{Z}) \\ = λ_{0}(u) \exp\{X(u) β + \tilde{Z}^{T} η + X(u) Z γ\}. \end{array}

(3)

We assume the censoring, measurement time points, and measurement errors are non-informative given the information already provided by α and Z̃, i.e. T ⊥ (C, T^m, e)|(α, Z̃). We also assume e is independent of (T, C, Z̃, α, T^m). More detailed discussion on the underlying assumptions of this model can be found in Tsiatis and Davidian (2004).

2.1 Two-phase sampling model

Preventive vaccine efficacy trials illustrate studies where biomarker data are not collected on all participants by design. For example, in the RV144 HIV vaccine trial, dozens of immune response biomarkers were measured using a two-phase sampling design (Haynes et al, 2012). To obtain valid inferences, appropriate techniques are needed to account for the biased sampling design that over-samples certain groups.

We consider the following sampling design. In the first phase, data $O_{i} = (V_{i}, Δ_{i}, Z_{i}, L_{i}^{T}, A_{i}^{T})^{T}$, i = 1,⋯, N, are collected from N independent subjects from a study population. The vector A_i contains auxiliary variables that may be predictive of the biomarker. In the second phase, a Bernoulli random sample is taken from the N subjects, with sampling probabilities given by π(O_i). Let ξ_i be the binary indicator of being sampled (ξ_i = 1 if sampled and 0 otherwise). Thus the longitudinal biomarker information {W_i, $T_{i}^{m}$, J_i} is assessed only on subjects with ξ_i = 1. The observed data are {V_i, Δ_i, Z_i, L_i, A_i, ξ_i, ξ_iW_i, $ξ_{i} T_{i}^{m}$, ξ_iJ_i} for i = 1,⋯, N.

The sampling probability model can be characterized by a parametric model with finite-dimensional parameter ρ,

ℙr(ξ = 1 \mid O, α, W, T^{m}, J) = ℙr(ξ = 1 \mid O) = π(O; ρ),

(4)

which expresses the missing at random (MAR) assumption. We also assume positive sampling probabilities with 0 < δ < π(O_i; ρ) ≤ 1 for some constant δ > 0 and all i = 1,⋯, N. As the outcome data are collected in the first phase on all subjects, an outcome-dependent sampling scheme could be employed with the sampling probabilities depending on the outcome variable.

For the special case of stratified Bernoulli sampling, the N subjects are divided exclusively and exhaustively into S strata based on O: {𝒪₁,⋯, 𝒪_S}. We use I(O ∈ 𝒪_s) to indicate whether a subject belongs to stratum 𝒪_s. Let N₁,⋯, N_S be the stratum sizes with N₁ + ⋯ + N_S = N. In the second phase, a subject who belongs to stratum 𝒪_s is sampled with probability ρ_s, so that $π(O; ρ) = \sum_{s=1}^{S} I(O \in 𝒪_{s}) ρ_{s}$. We also denote the stratum probabilities as ν_s = ℙr(O ∈ 𝒪_s) > 0, s = 1,⋯, S.
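The stratified Bernoulli design just described can be sketched in a few lines of Python. This is only an illustration: the function name, the use of the event indicator as the stratum label, the event rate of 0.1, and the stratum probabilities are all hypothetical choices made for the example.

```python
import numpy as np

def stratified_bernoulli_sample(strata, rho, rng):
    """Draw independent phase-two indicators xi_i ~ Bernoulli(pi_i).

    strata: integer stratum label (0, ..., S-1) for each subject.
    rho:    length-S vector of stratum sampling probabilities rho_s.
    """
    strata = np.asarray(strata, dtype=int)
    # pi(O_i; rho) = sum_s I(O_i in stratum s) * rho_s, i.e. a lookup by label
    pi = np.asarray(rho, dtype=float)[strata]
    xi = (rng.random(strata.shape[0]) < pi).astype(int)
    return xi, pi

# Hypothetical example: strata defined by the event indicator (0 = control, 1 = case)
rng = np.random.default_rng(0)
delta = rng.binomial(1, 0.1, size=1500)
xi, pi = stratified_bernoulli_sample(delta, rho=[0.33, 1.0], rng=rng)
```

Because the design is Bernoulli rather than fixed-size, the number of sampled subjects per stratum is random, which is what the weights 1/π_i later account for.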

2.2 Prentice surrogate endpoint assessment

In addition to providing assessment of the association of the underlying time-dependent biomarker with instantaneous risk in one or more groups, in randomized trials our methods assess the underlying biomarker as a surrogate endpoint according to the Prentice (1989) definition. Prentice’s main operational criteria for checking this definition in terms of model (3) are: (i) Z has an effect on T, and Z has an effect on the covariate process X̃(τ); (ii) X̃(τ) is associated with the clinical endpoint T in each treatment group; and (iii) Z has no effect on T given the biomarker X̃(τ).

Conditions (i)–(iii) can be assessed based on estimation and testing of the coefficients in different versions of model (3). The key condition (iii) may be checked by first ruling out that the treatment effect on T is modified by the biomarker in (3). If ruled out, then the model λ(u) = λ₀(u) exp{X(u)β + Zη_Z|X + L^T η_L} may be fitted and one would check if η_Z|X is plausibly close to zero, for example based on a confidence interval. All of the model fits should include all known and putative dual predictors L for X(u) and T, in order that the Prentice criteria truly check the Prentice surrogate definition.

The proportion of treatment effect explained (PTE) may be used as a summary measure for the closeness of the biomarker in meeting the Prentice definition, PTE = 1 − η_Z|X /η_Z, where η_Z is the regression coefficient of Z without adjustment for the biomarker (Lin et al, 1997). The re-parameterized PTE from Kobayashi and Kuroki (2014) may also be used, the proportion of the treatment effect captured by the candidate surrogate (PCS), which is guaranteed to be in [0,1] and is less variable: PCS = 1/{1 + (1/PTE − 1)²}.
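As a small numerical illustration of the two summary measures above, the following Python sketch evaluates PTE and PCS from a pair of coefficients. The function names and the coefficient values are hypothetical; the value returned for PTE = 0 is the limit of the PCS formula and is an assumption of this sketch.

```python
def pte(eta_z_given_x, eta_z):
    """Proportion of treatment effect explained: 1 - eta_{Z|X} / eta_Z."""
    return 1.0 - eta_z_given_x / eta_z

def pcs(pte_value):
    """Re-parameterized measure PCS = 1 / {1 + (1/PTE - 1)^2}, which lies in [0, 1]."""
    if pte_value == 0.0:
        return 0.0  # limiting value as PTE -> 0 (assumption of this sketch)
    return 1.0 / (1.0 + (1.0 / pte_value - 1.0) ** 2)

# Hypothetical coefficients: unadjusted log hazard ratio -0.8, adjusted -0.2
example_pte = pte(-0.2, -0.8)   # 0.75
example_pcs = pcs(example_pte)  # 1 / (1 + 1/9) = 0.9
```

Unlike PTE, which can fall outside [0, 1] when the adjusted and unadjusted coefficients differ in sign or the adjusted one is larger in magnitude, PCS maps any nonzero PTE into [0, 1].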

2.3 Natural direct effects/mediation approach

Mediation of the treatment effect through the true biomarker process may be studied using natural direct and indirect effects. Define the counterfactual underlying biomarker history up to and including time t as X̃^z(t) = {X^z(u), 0 ≤ u ≤ t} if treatment had been Z = z. Under the random effects model (1), this trajectory is entirely determined by the random effects. Moreover, the time to event depends on this trajectory entirely through the random effects. Therefore, we use notation $X_{α}^{z}$ to emphasize that the whole biomarker trajectory is determined by α.

The natural direct effect (NDE_z) contrasts a functional of the distributions of $T^{1 X_{α}^{z}}$ and $T^{0 X_{α}^{z}}$, where $X_{α}^{z}$ is the biomarker value if treatment had been z, for z ∈ {0, 1}. Similarly, the natural indirect effect (NIE_z) contrasts $T^{z X_{α}^{1}}$ with $T^{z X_{α}^{0}}$. The total effect (TE) can be decomposed into the sum of NDE_z and NIE_{1−z} for each z = 0, 1. This motivates a “proportion of the total effect mediated by the biomarker” summary measure, for example pNIE_z = 1 − NDE_z/TE, or the PCS_z defined in Kobayashi and Kuroki (2014) that is guaranteed to be in [0,1], PCS_z = 1/{1 + (1/pNIE_z − 1)²}. Web Appendix A in the Online Resource describes how model (3) can be used to define estimands NDE_z and NIE_z of interest, and provides results on identifiability assumptions and estimation.

3 IPW and AIPW Conditional Score Methods

Web Appendix B summarizes properties of IPW and AIPW estimators that undergird our results in this section, including results on efficiency and robustness to outlying sampling weights. We first briefly review the conditional score method with full sampling, which specifies notation needed later.

3.1 Conditional score estimator with full sampling

Let θ = (β, η^T, γ)^T be the regression coefficients in model (3). Define the event counting process N(u) = I(Δ = 1, V ≤ u, J(u) ≥ q) and the at-risk process Y(u) = I(V ≥ u, J(u) ≥ q), where J(u) ≥ q indicates that at least q measurements have been observed up to and including time u. Define the design matrix and the vector of observed longitudinal measurements for each subject up to and including time u as $\tilde{F}(u) = [f(T_{1}^{m}), \dots, f(T_{J(u)}^{m})]^{T}$ and $\tilde{W}(u) = (W_{1}, \dots, W_{J(u)})^{T}$. Then (2) implies W̃(u) = F̃(u)α + ẽ(u). Let α̂(u) = {F̃^T(u)F̃(u)}⁻¹F̃^T(u)W̃(u) be the least squares estimate of α using data up to and including time u. Conditional on {α, Z̃, T^m(u), Y(u) = 1}, the least squares estimate of X(u), X̂(u) = α̂^T(u)f(u), is normally distributed as N{X(u), d(u, σ²)}, where d(u, σ²) = σ²f^T(u){F̃^T(u)F̃(u)}⁻¹f(u). Define Q(u, θ, σ²) = X̂(u) + dN(u)(β + γZ)d(u, σ²). As in Tsiatis and Davidian (2001), the conditional intensity for N(u) can be derived as

\begin{array}{l} \lim_{du \to 0} (du)^{-1} ℙr\{dN(u) = 1 \mid Q(u, θ, σ^{2}), \tilde{Z}, T^{m}(u), Y(u) = 1\} \\ = λ_{0}(u) \exp\{β Q(u, θ, σ^{2}) + η^{T} \tilde{Z} + γ Z Q(u, θ, σ^{2}) - \frac{1}{2} (β + γ Z)^{2} d(u, σ^{2})\}, \end{array}

(5)

which does not depend on the unknown random effects α (see Web Appendix C for a proof).
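The subject-level quantities α̂(u), X̂(u), d(u, σ²) and Q(u, θ, σ²) defined above can be computed with ordinary least squares. The Python sketch below assumes a linear trajectory f(u) = (1, u)^T, so q = 2; the function names and the example data are hypothetical.

```python
import numpy as np

def ls_summary(times, w, u, sigma2, f=lambda t: np.array([1.0, t])):
    """Return (alpha_hat(u), X_hat(u), d(u, sigma2)) using measurements taken at times <= u."""
    times = np.asarray(times, dtype=float)
    w = np.asarray(w, dtype=float)
    keep = times <= u
    fu = f(u)
    F = np.array([f(t) for t in times[keep]])      # F~(u): J(u) x q design matrix
    if F.shape[0] < fu.shape[0]:
        raise ValueError("need J(u) >= q measurements up to time u")
    FtF_inv = np.linalg.inv(F.T @ F)
    alpha_hat = FtF_inv @ F.T @ w[keep]            # least squares estimate of alpha
    x_hat = alpha_hat @ fu                         # X_hat(u) = alpha_hat(u)^T f(u)
    d = sigma2 * (fu @ FtF_inv @ fu)               # conditional variance of X_hat(u)
    return alpha_hat, x_hat, d

def q_stat(x_hat, d, dN, beta, gamma, z):
    """Q(u, theta, sigma2) = X_hat(u) + dN(u) * (beta + gamma * Z) * d(u, sigma2)."""
    return x_hat + dN * (beta + gamma * z) * d

# Hypothetical noise-free subject with alpha = (2, 0.5), evaluated at u = 2.5
alpha_hat, x_hat, d = ls_summary([0, 1, 2, 3], [2.0, 2.5, 3.0, 3.5], u=2.5, sigma2=0.25)
```

Only the three measurements at times 0, 1 and 2 enter the fit here, since the measurement at time 3 occurs after u.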

For notational simplicity, we let

H (u, θ, σ^{2}) = {[Q (u, θ, σ^{2}), {\tilde{Z}}^{T}, Q (u, θ, σ^{2}) Z]}^{T} .

If σ² were known, then similarly as in Tsiatis and Davidian (2001), unbiased estimating equations for θ could be derived as

U_{F}(θ, σ^{2}) = \sum_{i=1}^{N} \int_{0}^{τ} \left\{ H_{i}(u, θ, σ^{2}) - \frac{E_{F}^{(1)}(u, θ, σ^{2})}{E_{F}^{(0)}(u, θ, σ^{2})} \right\} dN_{i}(u) = 0,

(6)

where

E_{i}^{(0)}(u, θ, σ^{2}) = \exp\left\{ H_{i}^{T}(u, θ, σ^{2}) θ - \frac{1}{2} (β + γ Z_{i})^{2} d_{i}(u, σ^{2}) \right\},

(7)

E_{i}^{(1)} (u, θ, σ^{2}) = H_{i} (u, θ, σ^{2}) E_{i}^{(0)} (u, θ, σ^{2}),

(8)

E_{F}^{(r)} (u, θ, σ^{2}) = N^{- 1} \sum_{i = 1}^{N} Y_{i} (u) E_{i}^{(r)} (u, θ, σ^{2}), r = 0, 1.

(9)

In practice, we could estimate σ² as σ̂² by solving $S_{e, F} (σ^{2}) \equiv \sum_{i = 1}^{N} S_{e, i} (σ^{2}) = 0$ , where

S_{e,i}(σ^{2}) = J_{i} I(J_{i} \geq q) \left[ \{\tilde{W}_{i}(V_{i}) - \tilde{F}_{i}(V_{i}) \hat{α}_{i}(V_{i})\}^{T} \{\tilde{W}_{i}(V_{i}) - \tilde{F}_{i}(V_{i}) \hat{α}_{i}(V_{i})\} - σ^{2} (J_{i} - q) \right].

(10)

Then, θ is estimated by solving U_F (θ, σ̂²) = 0.
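Because S_{e,i}(σ²) in (10) is linear in σ², the equation for σ̂² has a closed-form solution: a weighted ratio of residual sums of squares to residual degrees of freedom. The Python sketch below is a minimal illustration assuming a linear trajectory f(u) = (1, u)^T; the function name and the optional `weights` argument are hypothetical, and supplying weights 1/π_i for the sampled subjects gives the inverse probability weighted analogue.

```python
import numpy as np

def sigma2_hat(subjects, q, weights=None, f=lambda t: np.array([1.0, t])):
    """Solve sum_i weight_i * S_{e,i}(sigma2) = 0 for sigma2.

    subjects: list of (times, w) pairs holding all measurements up to V_i.
    weights:  optional per-subject weights (all ones for the full cohort).
    """
    if weights is None:
        weights = np.ones(len(subjects))
    num, den = 0.0, 0.0
    for (times, w), wt in zip(subjects, weights):
        J = len(times)
        if J < q:                      # I(J_i >= q) = 0: subject contributes nothing
            continue
        w = np.asarray(w, dtype=float)
        F = np.array([f(t) for t in times])
        alpha_hat, *_ = np.linalg.lstsq(F, w, rcond=None)
        resid = w - F @ alpha_hat
        num += wt * J * (resid @ resid)
        den += wt * J * (J - q)
    if den == 0.0:
        raise ValueError("no subject has more than q measurements")
    return num / den

# Hypothetical subject: residuals (-1/3, 2/3, -1/3), one residual degree of freedom
example = sigma2_hat([([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])], q=2)
```

Subjects with exactly q measurements are fit perfectly and therefore add zero to both the numerator and the denominator.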

3.2 IPW conditional score estimator

3.2.1 Prespecified sampling probabilities π_i = π(O_i)

We start with the case of correctly and fully specified sampling probabilities. We define the IPW conditional score estimator θ̂_IPW(π) for θ as the solution to $U_{IPW}(θ, \hat{σ}_{IPW}^{2}(π), π) = 0$ where

U_{IPW}(θ, \hat{σ}_{IPW}^{2}(π), π) = \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} \int_{0}^{τ} \left\{ H_{i}(u, θ, \hat{σ}_{IPW}^{2}(π)) - \frac{E_{IPW}^{(1)}(u, θ, \hat{σ}_{IPW}^{2}(π), π)}{E_{IPW}^{(0)}(u, θ, \hat{σ}_{IPW}^{2}(π), π)} \right\} dN_{i}(u).

(11)

Here ${\hat{σ}}_{IPW}^{2} (π)$ estimates σ² by solving $S_{e, IPW} (σ^{2}, π) \equiv \sum_{i = 1}^{N} (ξ_{i} / π_{i}) S_{e, i} (σ^{2}) = 0$ , and

E_{IPW}^{(r)} (u, θ, σ^{2}, π) = \frac{1}{N} \sum_{i = 1}^{N} \frac{ξ_{i}}{π_{i}} Y_{i} (u) E_{i}^{(r)} (u, θ, σ^{2}), r = 0, 1.

(12)

The baseline hazard could be estimated as $d\hat{Λ}_{0}^{IPW}(u) = \frac{1}{N} \sum_{i} dN_{i}(u) / E_{IPW}^{(0)}(u, \hat{θ}_{IPW}(π), \hat{σ}_{IPW}^{2}(π), π)$.
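To make the weighted risk-set averages in (12) concrete, the Python sketch below evaluates the ratio $E_{IPW}^{(1)} / E_{IPW}^{(0)}$ at a single time u from precomputed subject-level inputs. The function name and the array layout are hypothetical, and the example assumes Z̃ = Z so that θ has three components.

```python
import numpy as np

def ipw_ratio(H, d, z, at_risk, xi, pi, theta):
    """Return E_IPW^(1)(u) / E_IPW^(0)(u) at one time u, following (7), (8) and (12).

    H:       (N, p + 1) rows H_i(u) = [Q_i, Z~_i^T, Q_i * Z_i]; rows of unsampled subjects may be NaN.
    d:       (N,) values d_i(u, sigma2); z: (N,) treatment indicators.
    at_risk: (N,) values Y_i(u); xi, pi: (N,) sampling indicators and probabilities.
    theta:   (beta, eta^T, gamma), with beta first and gamma last.
    """
    H, d, z = np.asarray(H, float), np.asarray(d, float), np.asarray(z, float)
    theta = np.asarray(theta, float)
    use = (np.asarray(xi) == 1) & (np.asarray(at_risk) == 1)
    w = 1.0 / np.asarray(pi, float)[use]          # xi_i / pi_i for contributing subjects
    beta, gamma = theta[0], theta[-1]
    e0 = np.exp(H[use] @ theta - 0.5 * (beta + gamma * z[use]) ** 2 * d[use])  # equation (7)
    n = H.shape[0]
    E0 = np.sum(w * e0) / n                       # equation (12) with r = 0
    E1 = (w * e0) @ H[use] / n                    # equation (12) with r = 1, using (8)
    return E1 / E0

# Hypothetical inputs: two sampled subjects and one unsampled subject
H = np.array([[1.0, 0.0, 0.0], [3.0, 1.0, 3.0], [np.nan, np.nan, np.nan]])
ratio = ipw_ratio(H, d=[0.1, 0.2, np.nan], z=[0, 1, 0], at_risk=[1, 1, 1],
                  xi=[1, 1, 0], pi=[0.5, 0.5, 0.5], theta=[0.0, 0.0, 0.0])
```

With θ = 0 every $E_{i}^{(0)}$ equals one, so the ratio reduces to the weighted mean of the H_i rows among sampled subjects at risk.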

To proceed to the theoretical results, we define

M_{i}(θ, σ^{2}) = \int_{0}^{τ} \left\{ H_{i}(u, θ, σ^{2}) - \frac{e^{(1)}(u, θ, σ^{2})}{e^{(0)}(u, θ, σ^{2})} \right\} dD_{i}(u),

(13)

{d D}_{i} (u) = {d N}_{i} (u) - λ_{0} (u) Y_{i} (u) E_{i}^{(0)} (u, θ, σ^{2}) d u,

(14)

e^{(r)}(u, θ, σ^{2}) = 𝔼\{Y(u) E^{(r)}(u, θ, σ^{2})\}, r = 0, 1.

(15)

By Lemma C.1 (Web Appendix C) and the same arguments as for (8a) in Tsiatis and Davidian (2001), we can show that $𝔼\{M(θ_{0}, σ_{0}^{2})\} = 0$, where θ₀ is the true parameter of (3) and $σ_{0}^{2}$ is the true variance of measurement errors. We show next that under regularity conditions, the estimating functions $N^{-1} U_{IPW}(θ, \hat{σ}_{IPW}^{2}(π), π)$ are asymptotically equivalent to $N^{-1} \sum_{i=1}^{N} (ξ_{i} / π_{i}) M_{i}(θ, σ^{2})$. The latter is a mean of i.i.d. random variates for which empirical theoretical results are readily applied. Let $(θ_{0}^{T}, σ_{0}^{2}, ρ_{0}^{T})^{T}$ be the true parameters. For any parameter, for example θ, we use 𝒩(θ₀) to denote a compact neighborhood of θ₀ and 𝒩(τ, θ₀) for [0, τ] × 𝒩(θ₀). Let 𝔼₀(·) and 𝕍ar₀(·) denote the expectation and variance evaluated under the truth. We also use ḟ_θ = ∂f/∂θ to denote the derivative of a function f with respect to the parameter θ. We sometimes omit θ in the subscript and use ḟ when f is fully parameterized by θ.

Lemma 1

Under conditions A1–A5 listed in Assumption A of Web Appendix C,

\sup_{(u, θ^{T}, σ^{2})^{T} \in 𝒩(τ, θ_{0}, σ_{0}^{2})} \left| \frac{E_{IPW}^{(1)}(u, θ, σ^{2}, π)}{E_{IPW}^{(0)}(u, θ, σ^{2}, π)} - \frac{e^{(1)}(u, θ, σ^{2})}{e^{(0)}(u, θ, σ^{2})} \right| \overset{p}{\to} 0 \text{ as } N \to \infty.

Lemma 2

Under conditions A1–A5, $N^{- 1 / 2} U_{IPW} (θ_{0}, σ_{0}^{2}, π)$ is asymptotically linear with

N^{- 1 / 2} U_{IPW} (θ_{0}, σ_{0}^{2}, π) = N^{- 1 / 2} \sum_{i = 1}^{N} \frac{ξ_{i}}{π_{i}} M_{i} (θ_{0}, σ_{0}^{2}) + o_{p} (1) .

Theorem 1

Under conditions A1–A8, as N → ∞, (i) ${\hat{θ}}_{IPW} (π) \overset{p}{\to} θ_{0}$ and (ii) $\sqrt{N} ({\hat{θ}}_{IPW} (π) - θ_{0})$ converges weakly to a normal random variate with mean zero and covariance matrix A⁻¹B(A⁻¹)^T, where

\begin{array}{l} A = 𝔼_{0}(\dot{M}_{θ}), \quad B = 𝔼_{0}\left(\frac{1}{π} R R^{T}\right), \\ R = M(θ_{0}, σ_{0}^{2}) - 𝔼_{0}(\dot{M}_{σ^{2}}) \{𝔼_{0}(\dot{S}_{e})\}^{-1} S_{e}(σ_{0}^{2}). \end{array}

3.2.2 Estimated sampling probabilities π̂_i = π(O_i, ρ̂)

Results in Web Appendix B show that the IPW estimator with prespecified sampling probabilities π is inefficient. It is often recommended to use the estimated π̂ = π̂(ρ̂) to improve efficiency. To implement this we still apply the sampling probability model (4) and the resulting likelihood score functions

\begin{array}{l} S_{π,F}(ρ) = \sum_{i=1}^{N} S_{π,i}(ρ) = \sum_{i=1}^{N} \frac{\partial}{\partial ρ} \log\left[ π(O_{i}; ρ)^{ξ_{i}} \{1 - π(O_{i}; ρ)\}^{1 - ξ_{i}} \right] \\ = \sum_{i=1}^{N} \frac{ξ_{i} - π(O_{i}; ρ)}{π(O_{i}; ρ) \{1 - π(O_{i}; ρ)\}} \frac{\partial π(O_{i}; ρ)}{\partial ρ}. \end{array}

(16)

The IPW estimators $\hat{σ}_{IPW}^{2}(\hat{π})$ and θ̂_IPW(π̂) are obtained by solving S_{e,IPW}(σ², π̂) = 0 and $U_{IPW}(θ, \hat{σ}_{IPW}^{2}(\hat{π}), \hat{π}) = 0$, respectively.
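For the stratified Bernoulli special case of Section 2.1, setting the score (16) to zero gives the observed sampling fraction in each stratum, ρ̂_s = Σ_i ξ_i I(O_i ∈ 𝒪_s)/N_s, because ∂π/∂ρ_s = I(O ∈ 𝒪_s). The Python sketch below illustrates this special case only; the function name and example data are hypothetical, and a general parametric model π(O; ρ) such as logistic regression would need a numerical solver instead.

```python
import numpy as np

def estimate_stratum_probs(strata, xi, n_strata):
    """Maximum likelihood estimates of rho_s under stratified Bernoulli sampling."""
    strata = np.asarray(strata, dtype=int)
    xi = np.asarray(xi, dtype=float)
    n_s = np.bincount(strata, minlength=n_strata)              # stratum sizes N_s
    m_s = np.bincount(strata, weights=xi, minlength=n_strata)  # sampled counts per stratum
    if np.any(n_s == 0):
        raise ValueError("every stratum must contain at least one subject")
    rho_hat = m_s / n_s
    return rho_hat, rho_hat[strata]                            # (rho_hat, pi_hat_i)

# Hypothetical data: stratum 0 has 1 of 3 subjects sampled, stratum 1 has 2 of 2
rho_hat, pi_hat = estimate_stratum_probs([0, 0, 0, 1, 1], [1, 0, 0, 1, 1], n_strata=2)
```

The estimated π̂_i then replace the design values π_i in the weights ξ_i/π_i.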

Theorem 2

Under conditions A1–A8, as N → ∞, (i) ${\hat{θ}}_{IPW} (\hat{π}) \overset{p}{\to} θ_{0}$ and (ii) $\sqrt{N} ({\hat{θ}}_{IPW} (\hat{π}) - θ_{0})$ converges weakly to a normal random variate with mean zero and covariance matrix A⁻¹B^*(A⁻¹)^T, where

B^{*} = B - 𝔼_{0}\left(R \frac{\dot{π}}{π}\right) \{𝔼_{0}(S_{π} S_{π}^{T})\}^{-1} \left\{𝔼_{0}\left(R \frac{\dot{π}}{π}\right)\right\}^{T},

with A, B and R defined in Theorem 1.

3.3 AIPW conditional score estimator

The IPW estimator is easy to implement but is biased if the sampling model is misspecified, which could occur if some selected subjects have happenstance missing data. In Web Appendix B we state the result that the use of estimated sampling probabilities can improve the efficiency of the IPW estimator, but it does not achieve the minimal variance bound unless it is augmented as the AIPW estimator and implemented with a correct model for the full data given O under stratified Bernoulli sampling.

If we temporarily assume the $e^{(r)}(u, θ, σ^{2})$ (r = 0, 1) in (13) are known, then for our problem the double robustness property of the AIPW estimator is expressed as follows: if either the sampling probability model for π(O; ρ) or the model for 𝔼{M(θ, σ²)|O} is correct, then the estimating equations are unbiased for θ. In practice, 𝔼{M(θ, σ²)|O} is estimated as accurately as possible. Given the MAR assumption in (4), we have 𝔼{M(θ, σ²)|ξ = 1, O} = 𝔼{M(θ, σ²)|O}. Therefore we could build a model to estimate it based on the second phase sample. However, since the $e^{(r)}(u, θ, σ^{2})$ (r = 0, 1) are unknown we have to estimate them, with approaches for doing so described below. We first develop the AIPW conditional score estimator and its asymptotic properties assuming π and 𝔼(·|O) related to the estimating equations are fully and correctly specified. Then we move on to the situations where either or both of these models are estimated.

3.3.1 Prespecified sampling probabilities π_i = π(O_i) and conditional expectation 𝔼(·|O)

Similar to Qi et al (2005), we define the AIPW conditional score estimating functions for θ as

\begin{array}{l} U_{AUG}(θ, σ^{2}, π, 𝔼) = \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} \int_{0}^{τ} \left\{ H_{i}(u, θ, σ^{2}) - \frac{E_{AUG}^{(1)}(u, θ, σ^{2}, π)}{E_{AUG}^{(0)}(u, θ, σ^{2}, π)} \right\} dN_{i}(u) + \\ \sum_{i=1}^{N} \left(1 - \frac{ξ_{i}}{π_{i}}\right) \int_{0}^{τ} \left[ 𝔼\{H_{i}(u, θ, σ^{2}) dN_{i}(u) \mid O_{i}\} - \frac{E_{AUG}^{(1)}(u, θ, σ^{2}, π)}{E_{AUG}^{(0)}(u, θ, σ^{2}, π)} 𝔼\{dN_{i}(u) \mid O_{i}\} \right], \end{array}

(17)

where, for r = 0, 1,

E_{AUG}^{(r)}(u, θ, σ^{2}, π) = \frac{1}{N} \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} Y_{i}(u) E_{i}^{(r)}(u, θ, σ^{2}) + \frac{1}{N} \sum_{i=1}^{N} \left(1 - \frac{ξ_{i}}{π_{i}}\right) 𝔼\{Y_{i}(u) E_{i}^{(r)}(u, θ, σ^{2}) \mid O_{i}\}.

Unlike the partial likelihood for classic Cox regression, here we define the at-risk process as Y_i(u) = I(V_i ≥ u, J_i(u) ≥ q), which contains the incomplete data of measurement time points. The estimate θ̂_AUG(π, 𝔼) for θ solves $U_{AUG} (θ, {\hat{σ}}_{AUG}^{2} (π, E), π, E) = 0$ , where ${\hat{σ}}_{AUG}^{2} (π, E)$ solves

S_{e,AUG}(σ^{2}, π, 𝔼) = \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} S_{e,i}(σ^{2}) + \sum_{i=1}^{N} \left(1 - \frac{ξ_{i}}{π_{i}}\right) 𝔼\{S_{e,i}(σ^{2}) \mid O_{i}\} = 0.

(18)

Similarly as in (13), we define

M_{AUG,i}(θ, σ^{2}, π) = \frac{ξ_{i}}{π_{i}} M_{i}(θ, σ^{2}) + \left(1 - \frac{ξ_{i}}{π_{i}}\right) 𝔼\{M_{i}(θ, σ^{2}) \mid O_{i}\}.

(19)

Evidently 𝔼{M_AUG(θ, σ², π)} = 𝔼{M (θ, σ²)} with correctly specified sampling probabilities. We will demonstrate that under the regularity conditions listed in Assumption A and Assumption B (stated in Web Appendix C), N⁻¹U_AUG(θ, σ², π, 𝔼) is asymptotically equivalent to the i.i.d. average $N^{- 1} \sum_{i = 1}^{N} M_{AUG, i} (θ, σ^{2}, π)$ .
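The structure of the augmented contribution in (19) can be illustrated with a short Python sketch. It assumes the per-subject values M_i and the conditional expectations 𝔼{M_i | O_i} have already been computed elsewhere; the function name and the numerical inputs are hypothetical.

```python
import numpy as np

def aipw_contributions(m, m_cond, xi, pi):
    """Per-subject terms (xi/pi) * M_i + (1 - xi/pi) * E{M_i | O_i} from (19).

    m:      (N, k) values M_i; rows for unsampled subjects are ignored and may be NaN.
    m_cond: (N, k) values of E{M_i | O_i}, available for every subject.
    """
    m, m_cond = np.asarray(m, float), np.asarray(m_cond, float)
    xi, pi = np.asarray(xi, float), np.asarray(pi, float)
    ratio = (xi / pi)[:, None]
    m_observed = np.where(xi[:, None] == 1, m, 0.0)   # M_i is only used when xi_i = 1
    return ratio * m_observed + (1.0 - ratio) * m_cond

# Hypothetical values: subject 1 sampled with probability 0.5, subject 2 not sampled
contrib = aipw_contributions(m=[[2.0], [np.nan]], m_cond=[[1.0], [1.0]],
                             xi=[1, 0], pi=[0.5, 0.5])
```

A sampled subject contributes its weighted value minus an adjustment, while an unsampled subject contributes only the conditional expectation, which is how the augmentation recovers information from phase-one data.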

Lemma 3

Under conditions listed in Assumption A and B, as N → ∞,

\sup_{(u, θ^{T}, σ^{2})^{T} \in 𝒩(τ, θ_{0}, σ_{0}^{2})} \left| \frac{E_{AUG}^{(1)}(u, θ, σ^{2}, π)}{E_{AUG}^{(0)}(u, θ, σ^{2}, π)} - \frac{e^{(1)}(u, θ, σ^{2})}{e^{(0)}(u, θ, σ^{2})} \right| \overset{p}{\to} 0.

Lemma 4

Under conditions in Assumption A and B, $N^{- 1 / 2} U_{AUG} (θ_{0}, σ_{0}^{2}, π, E)$ is asymptotically linear with

N^{- 1 / 2} U_{AUG} (θ_{0}, σ_{0}^{2}, π, E) = N^{- 1 / 2} \sum_{i = 1}^{N} M_{AUG, i} (θ_{0}, σ_{0}^{2}, π) + o_{p} (1) .

Theorem 3

Under conditions in Assumption A and B, as N → ∞, (i) ${\hat{θ}}_{AUG} (π, E) \overset{p}{\to} θ_{0}$ and (ii) $\sqrt{N} ({\hat{θ}}_{AUG} (π, E) - θ_{0})$ converges weakly to a normal random variate with mean zero and covariance matrix A⁻¹C(A⁻¹)^T, where

C = B - 𝔼_{0}\left\{ \frac{1 - π}{π} 𝔼_{0}(R \mid O) 𝔼_{0}(R \mid O)^{T} \right\}.

3.3.2 Estimated sampling probabilities π̂_i = π(O_i, ρ̂) and prespecified conditional expectation 𝔼(·|O)

Parallel theoretical results hold when using estimated sampling probabilities, which we relegate to Theorem C.4 in Web Appendix C.

3.3.3 Prespecified sampling probabilities π_i = π(O_i) and estimated 𝔼̂ (·|O)

In practice it is hard to derive the analytical form of 𝔼(·|O). A common approach replaces it with a function h(O; γ) in terms of a finite-dimensional parameter γ. However, when h(O; γ) ≠ 𝔼(·|O) for any γ, the resulting estimator cannot achieve the optimal asymptotic variance given in Theorem 3 and Theorem C.4 (in Web Appendix C). Cao et al (2009) investigated how to find the optimal γ that minimizes the variance of the estimator of θ with h fixed.

Qi et al (2005) investigated the AIPW estimator for the Cox model with time-independent covariates. They proposed to estimate 𝔼(·|O) by the non-parametric Nadaraya-Watson method, and proved that the optimal asymptotic variance can be achieved. We extend their method to our more complicated setting with a time-dependent covariate.

The scores in (17) show that we need to estimate 𝔼{G(u, θ, σ²)|O} as a continuous function of u and integrate it over u, where G(u, θ, σ²) could be $Y(u) E^{(r)}(u, θ, σ^{2})$, dN(u), or H(u, θ, σ²)dN(u). Suppose the predictor variables O are d continuous variables, in which case we use the Nadaraya-Watson estimator, i.e., for any g(u, O; θ, σ²) ≡ 𝔼{G(u, θ, σ²)|O},

\hat{g}(u, O; θ, σ^{2}) \equiv \hat{𝔼}\{G(u, θ, σ^{2}) \mid O\} = \frac{\sum_{j=1}^{N} ξ_{j} G_{j}(u, θ, σ^{2}) K_{H}(O - O_{j})}{\sum_{j=1}^{N} ξ_{j} K_{H}(O - O_{j})},

(20)

where K_H(·) is an s^th order kernel function and H is the bandwidth that is a d × d symmetric and positive definite matrix. If any component of O is discrete, we consider kernel regression with mixed types of predictor variables (Hall et al, 2004), which can be implemented by the R package np. The resulting estimating equations for θ are U_AUG(θ, σ², π, 𝔼̂) = 0, where 𝔼 in (17) is replaced with 𝔼̂ given by (20).
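A minimal Python sketch of the Nadaraya-Watson estimator (20) follows, restricted to continuous predictors. It assumes a Gaussian product kernel with a diagonal bandwidth matrix, which is one simple choice rather than the general s-th order kernel of the text; the function name and inputs are hypothetical, and mixed discrete and continuous predictors are not handled.

```python
import numpy as np

def nw_estimate(o_query, O, G, xi, bandwidth):
    """Kernel-weighted average of G_j over sampled subjects (xi_j = 1), as in (20).

    o_query:   (d,) point O at which to estimate E{G | O}.
    O:         (N, d) phase-one predictors; G: (N,) or (N, k) values, used only where xi = 1.
    bandwidth: (d,) positive bandwidths (diagonal of H, an assumption of this sketch).
    """
    O = np.asarray(O, dtype=float)
    G = np.asarray(G, dtype=float)
    use = np.asarray(xi) == 1
    scaled = (O[use] - np.asarray(o_query, float)) / np.asarray(bandwidth, float)
    # Gaussian product kernel; normalizing constants cancel in the ratio
    k = np.exp(-0.5 * np.sum(scaled ** 2, axis=1))
    return k @ G[use] / np.sum(k)

# Hypothetical data: two sampled subjects placed symmetrically around the query point
g_hat = nw_estimate([1.0], O=[[0.0], [2.0], [1.0]], G=[1.0, 3.0, np.nan],
                    xi=[1, 1, 0], bandwidth=[1.0])
```

If the query point is far from every sampled O_j relative to the bandwidth, all kernel weights can underflow to zero and the ratio is undefined, which is one practical reason bandwidth choice matters.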

However, intuitively such estimation relies on strong assumptions for the functional form of g(u, ·; ·, ·) in u. By examining the two stochastic processes involving dN(u) mentioned above, we find that they are always estimated as zero for time u if there are no events occurring at that time. This implies that, practically, the estimates of their conditional expectations given O at non-event time points are zero. Thus they only need to be estimated at event times. Alternatively, if O includes the event time (V, Δ), then by conditioning on it, U_AUG(θ, σ², π, 𝔼̂) reduces to

\begin{array}{l} U_{AUG}(θ, σ^{2}, π, \hat{𝔼}) = \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} \int_{0}^{τ} \left\{ H_{i}(u, θ, σ^{2}) - \frac{\hat{E}_{AUG}^{(1)}(u, θ, σ^{2}, π)}{\hat{E}_{AUG}^{(0)}(u, θ, σ^{2}, π)} \right\} dN_{i}(u) + \\ \sum_{i=1}^{N} \left(1 - \frac{ξ_{i}}{π_{i}}\right) \int_{0}^{τ} \left\{ \hat{g}^{H}(u, O_{i}; θ, σ^{2}) - \frac{\hat{E}_{AUG}^{(1)}(u, θ, σ^{2}, π)}{\hat{E}_{AUG}^{(0)}(u, θ, σ^{2}, π)} \hat{g}^{J}(u, O_{i}; θ, σ^{2}) \right\} dN_{i}^{*}(u) \end{array}

(21)

where $Y_{i}^{*} (u) = I (V_{i} \geq u), {d N}_{i}^{*} (u) = I (V_{i} = u, Δ_{i} = 1)$ , and, for r = 0, 1,

\begin{array}{rl} \hat{E}_{AUG}^{(r)}(u, θ, σ^{2}, π) & = \frac{1}{N} \sum_{i=1}^{N} \frac{ξ_{i}}{π_{i}} Y_{i}(u) E_{i}^{(r)}(u, θ, σ^{2}) + \frac{1}{N} \sum_{i=1}^{N} \left(1 - \frac{ξ_{i}}{π_{i}}\right) Y_{i}^{*}(u) \hat{g}^{(r)}(u, O_{i}; θ, σ^{2}), \\ g^{J}(u, O; θ, σ^{2}) & = 𝔼\{I(J(u) \geq q) \mid O\}, \\ g^{H}(u, O; θ, σ^{2}) & = 𝔼\{I(J(u) \geq q) H(u, θ, σ^{2}) \mid O\}, \\ g^{(r)}(u, O; θ, σ^{2}) & = 𝔼\{I(J(u) \geq q) E^{(r)}(u, θ, σ^{2}) \mid O\}. \end{array}

Similarly, σ² can be estimated by solving S_e,AUG(σ², π, Ê) = 0 with estimators denoted by θ̂_AUG(π, 𝔼̂) and ${\hat{σ}}_{AUG}^{2} (π, \hat{E})$ .

Theorem 4

Under conditions in Assumption A, B and C, as N → ∞, (i) ${\hat{θ}}_{AUG} (π, \hat{E}) \overset{p}{\to} θ_{0}$ and (ii) $\sqrt{N} ({\hat{θ}}_{AUG} (π, \hat{E}) - θ_{0})$ converges weakly to a normal random variate with mean zero and covariance matrix A⁻¹C(A⁻¹)^T.

3.3.4 Estimated sampling probabilities π̂_i = π(O_i, ρ̂) and estimated 𝔼̂(·|O)

When the sampling probabilities are estimated via setting the score functions in (16) to zero, results parallel to those in Section 3.3.2 hold. We estimate θ and σ² by solving U_AUG(θ, σ², π̂, 𝔼̂) = 0 and S_e,AUG(σ², π̂, 𝔼̂) = 0. We denote the obtained estimators as θ̂_AUG(π̂, 𝔼̂) and ${\hat{σ}}_{AUG}^{2} (\hat{π}, \hat{E})$ .

Theorem 5

Under conditions in Assumptions A, B, and C, as N → ∞, (i) ${\hat{θ}}_{AUG} (\hat{π}, \hat{E}) \overset{p}{\to} θ_{0}$ and (ii) $\sqrt{N} ({\hat{θ}}_{AUG} (\hat{π}, \hat{E}) - θ_{0})$ converges weakly to a normal random variate with mean zero and covariance matrix A⁻¹C(A⁻¹)^T.

The above theorems yield sandwich variance estimators for the IPW and AIPW estimators based on empirical score functions, where parameters are replaced by estimates. The sandwich variance estimators are used in the simulation studies and in the application.
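The sandwich form A⁻¹C(A⁻¹)^T can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function name `sandwich_covariance` and its inputs are hypothetical, A is assumed to be supplied as the average derivative matrix of the estimating function, and C is taken to be the empirical second moment of the per-subject contributions evaluated at the parameter estimates.

```python
import numpy as np

def sandwich_covariance(score_contribs, score_jacobian):
    """Approximate Cov(theta_hat) by A^{-1} C (A^{-1})^T / N.

    score_contribs : (N, p) array of per-subject estimating-function values
                     evaluated at the parameter estimates (hypothetical input).
    score_jacobian : (p, p) array, average derivative of the contributions
                     with respect to the parameters (the matrix A).
    """
    U = np.asarray(score_contribs, dtype=float)
    A = np.asarray(score_jacobian, dtype=float)
    n = U.shape[0]
    C = U.T @ U / n              # empirical second moment of the contributions
    A_inv = np.linalg.inv(A)
    # A enters twice, so its sign convention does not affect the result.
    return A_inv @ C @ A_inv.T / n
```

For a sample mean, where each contribution is x_i minus the mean and A = −1, this reduces to the usual variance of the mean, which is a convenient sanity check.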

4 Simulation Studies of the IPW and AIPW Methods

We conduct simulation studies that mimic disease prevention trials such as vaccine efficacy trials. We generate the event time data (Δ, V) from Cox models with different combinations of the time-varying biomarker and treatment assignment to vaccine or placebo. We carry out 500 simulation runs for each scenario. Our primary goal is to evaluate the IPW and AIPW conditional score methods in cohort studies with a two-phase sampling design. For comparison, we also calculate the conditional score estimator based on full cohort data (Full), which serves as the unobtainable benchmark, and conduct the naive complete-case (CC) analysis, which uses only subjects selected into the second phase, without any weighting. While bandwidth selection is important, the AIPW method requires fitting the kernel regression across time points, making it highly computationally intensive to seek the optimal bandwidth for each regression. Therefore the simulation studies use a fixed bandwidth determined from the variance of the predictor variables over time.
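To make the fixed-bandwidth kernel regression concrete, the following is a minimal sketch of a Nadaraya-Watson estimator. It is an illustration under assumptions rather than the code used in the paper: the Gaussian product kernel, the function name `nadaraya_watson`, and the way the bandwidth is passed in are all hypothetical choices, and the sketch treats every predictor as continuous.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of y_train at each query point.

    x_train   : (n,) or (n, d) predictors observed for the fitted subjects
    y_train   : (n,) responses to be smoothed
    x_query   : (m,) or (m, d) points at which to predict
    bandwidth : scalar or length-d array of fixed bandwidths (assumed given)
    """
    x_train = np.asarray(x_train, dtype=float).reshape(len(x_train), -1)
    x_query = np.asarray(x_query, dtype=float).reshape(len(x_query), -1)
    y_train = np.asarray(y_train, dtype=float)
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), (x_train.shape[1],))
    # Scaled differences between every query and training point: shape (m, n, d).
    z = (x_query[:, None, :] - x_train[None, :, :]) / h
    # Gaussian product kernel weights: shape (m, n).
    w = np.exp(-0.5 * np.sum(z ** 2, axis=2))
    # Weighted average; undefined if all weights underflow to zero.
    return (w @ y_train) / w.sum(axis=1)
```

In an AIPW setting the training rows would be the phase-two subjects and the query rows the subjects whose augmentation terms are needed; handling of discrete predictors such as Δ is deliberately left out of this sketch.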

We conduct three main simulation studies. Study I assesses the Cox model for a biomarker in one treatment group only, with the objective of comparing the IPW and AIPW estimators using various sets of auxiliary variables to estimate the augmentation terms. Study II assesses the Cox model with both the biomarker and the treatment group as covariates. It also studies the impact of the number of biomarker measurements and of misspecified measurement error models on estimator performance. Study III assesses the Cox model that also includes the interaction term for biomarker with treatment. All IPW and AIPW methods are implemented using both the prespecified true sampling probabilities (π) and the estimated sampling probabilities (π̂). The results are generally similar, with the latter sometimes slightly more efficient, as expected from the theory (results reported in Web Appendix D Tables 1–6). For brevity we focus on reporting the true π results. In all simulation studies, we evaluate the bias, Monte Carlo standard deviation (SD), and average of the estimated standard errors (ASE) of parameter estimates. All three studies use a full cohort sample size of N = 1500, with the phase II sample drawn with Bernoulli sampling probabilities ℙr(ξ = 1|Δ = 1) = 1 and ℙr(ξ = 1|Δ = 0) = 0.33, resulting in about 40% of subjects with biomarker data (Table 1). Web Appendix D Tables 1–6 show detailed results; below we report selected key results.
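The case-control Bernoulli sampling step can be sketched as follows. This is an illustrative sketch only: the function name `sample_phase_two`, the seed, and the way event indicators are generated (a flat 10% event probability standing in for the Cox-model event times) are assumptions made for the example.

```python
import numpy as np

def sample_phase_two(delta, p_case=1.0, p_control=0.33, rng=None):
    """Draw phase-two indicators xi with Pr(xi=1 | delta) depending on case status."""
    rng = np.random.default_rng() if rng is None else rng
    delta = np.asarray(delta)
    pi = np.where(delta == 1, p_case, p_control)   # sampling probabilities
    xi = (rng.random(delta.shape[0]) < pi).astype(int)
    return xi, pi

rng = np.random.default_rng(2016)
# Stand-in event indicators with roughly a 10% event rate (assumption for illustration).
delta = (rng.random(1500) < 0.10).astype(int)
xi, pi = sample_phase_two(delta, rng=rng)
ipw_weights = xi / pi          # inverse probability weights; zero outside phase two
print("phase-two fraction:", xi.mean())
```

With a 10% event rate, the expected phase-two fraction is 0.10 + 0.90 × 0.33 ≈ 0.40, in line with the roughly 40% stated above.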

Table 1.

Sample sizes for Phase I (average N) and Phase II (average n) under case-control Bernoulli sampling for Simulation Studies I–III.

Simulation Study	Parameter values	Case (Δ = 1) N (n)	Control (Δ = 0) N (n)
I	β = 0	156 (156)	1344 (443)
I	β = −ln 2	151 (151)	1349 (445)
I	β = −ln 4	158 (158)	1342 (443)
II	(β, η) = (−ln 2, 0)	156 (156)	1344 (443)
II	(β, η) = (−ln 2, −ln 2)	157 (157)	1343 (443)
III	(β, η, γ) = (−ln 2, −ln 2, 0)	157 (157)	1343 (443)


4.1 Simulation Study I

We consider the model λ(u) = λ₀(u) exp{βX(u)}, where X(u) is characterized by a linear random effects model X(u) = α₀ + α₁u. We simulate X(u) to imitate log₁₀CD4 cell count in the ACTG 175 study as described in Song et al (2002). The random effects (α₀, α₁)^T are generated from a bivariate normal distribution with 𝔼(α) = (2.5915, −0.00315)^T and ℂov(α) = D with elements (D₁₁, D₁₂, D₂₂)^T = (0.02408, −0.0008, 0.000014)^T. The variance of the measurement error e is σ² = 0.01. This set-up represents a noise-to-signal ratio of 𝕍ar(e)/𝕍ar{X(0)} ≈ 0.42, approximately the same as in ACTG 175. The scheduled visits for measuring X(u) are at baseline and at weeks 2, 4, 8, 20, 32, 44, 56, 68, 80, where we create random uniform variability in visit attendance ± one-half week. Event times are generated from the Cox model with hazard ratios e^β = {1, 0.5, 0.25}. Censoring times follow the exponential distribution Exp(1/180) and are subject to administrative censoring at Week 85. We choose the baseline hazards to yield an event rate of around 10%. Specifically, the proportions of subjects dropping off the study during follow-up and completing the study free of events are (35.5%, 54.1%), (36.0%, 54.0%) and (36.2%, 53.3%) when the hazard ratios are 1, 0.5 and 0.25, respectively. The average number of biomarker measurements per subject is around eight.
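The stated noise-to-signal ratio, visit schedule, and censoring mechanism can be checked with a short sketch. Several choices here are assumptions rather than details from the paper: Exp(1/180) is read as an exponential distribution with rate 1/180 (mean 180 weeks), the baseline visit is left unjittered, and events are ignored when counting visits.

```python
import numpy as np

rng = np.random.default_rng(175)
n = 1500

# Noise-to-signal ratio: Var(e) / Var{X(0)}, where Var{X(0)} = D11.
sigma2, d11 = 0.01, 0.02408
noise_to_signal = sigma2 / d11

# Scheduled visit weeks with uniform jitter of +/- one-half week.
scheduled = np.array([0, 2, 4, 8, 20, 32, 44, 56, 68, 80], dtype=float)
visits = scheduled + rng.uniform(-0.5, 0.5, size=(n, scheduled.size))
visits[:, 0] = 0.0             # assumption: baseline visit occurs exactly at week 0

# Dropout times (assumed rate 1/180) with administrative censoring at Week 85.
dropout = rng.exponential(scale=180.0, size=n)
censor = np.minimum(dropout, 85.0)

# Visits attended before censoring (events ignored in this sketch).
observed = visits <= censor[:, None]
mean_visits = observed.sum(axis=1).mean()
print(round(noise_to_signal, 3), round(mean_visits, 1))
```

Under these assumptions the ratio evaluates to 0.01/0.02408 ≈ 0.42, and the average number of attended visits before accounting for events is somewhat above eight, consistent with the description above.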

For the AIPW method, we evaluate several sets of auxiliary predictor variables A = (A₁, A₂)^T that serve to estimate the augmentation terms in the nonparametric kernel regression (Nadaraya-Watson). We generate null, moderate, and strong correlation with the biomarker, quantified by correlation coefficient R = {0, 0.5, 0.95}. For R = 0, A is generated independently from α and (Δ, V), from A₁ ~ N {𝔼(α₁), 1} and A₂ ~ N{𝔼(α₁+40α₂), 1}. For R = 0.5 and 0.95, A₁ = α₁ + e₁, A₂ = α₁ + 40α₂ + e₂, $e_{1} ~ N (0, d_{1}^{2}), e_{2} ~ N (0, d_{2}^{2})$ , with $d_{1}^{2} = (1 / R^{2} - 1) V a r (α_{1})$ and $d_{2}^{2} = (1 / R^{2} - 1) V a r (α_{1} + 40 α_{2})$ . In total, we consider eleven sets of predictor variables in the kernel regression: Δ, (Δ, V), A, (Δ, A) and (Δ, V, A), with A set with R = 0, 0.5, 0.95.
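The variance choice d² = (1/R² − 1)𝕍ar(·) is what makes the auxiliary variable have correlation R with its target: if A = W + e with e independent of W, then ℂorr(A, W) = [𝕍ar(W)/{𝕍ar(W) + d²}]^{1/2} = R. A minimal sketch checking this numerically follows; the function name `make_auxiliary`, the seed, the sample size, and the use of the sample variance in place of the population variance are assumptions for illustration.

```python
import numpy as np

def make_auxiliary(latent, R, rng):
    """Return latent + noise, with noise variance chosen so Corr(aux, latent) = R."""
    latent = np.asarray(latent, dtype=float)
    if not 0 < R <= 1:
        raise ValueError("R must lie in (0, 1]; the R = 0 case is generated independently")
    d2 = (1.0 / R ** 2 - 1.0) * latent.var()     # d^2 = (1/R^2 - 1) Var(latent)
    return latent + rng.normal(0.0, np.sqrt(d2), size=latent.shape)

rng = np.random.default_rng(1)
# Stand-in for one normally distributed random effect (illustrative values).
latent = rng.normal(2.5915, np.sqrt(0.02408), size=20000)
for R in (0.5, 0.95):
    aux = make_auxiliary(latent, R, rng)
    print(R, np.corrcoef(latent, aux)[0, 1])
```

The printed sample correlations should fall close to the target values 0.5 and 0.95, up to Monte Carlo error.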

As reported in Figure 1, the CC analysis, as expected, generates substantially biased estimates when β is large in magnitude. We observe slightly larger biases for the AIPW(Δ, V, A) estimators when β = −ln 4. When the auxiliary variables are highly correlated with X(u), including them provides less variable AIPW estimates than other methods, especially when X(u) is strongly associated with event time. The 95% coverage probabilities are slightly below the nominal level when β = −ln 4, suggesting that the variance is underestimated by the sandwich variance estimator.
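The performance measures reported for each scenario (bias, SD, ASE, 95% coverage, and relative efficiency against the Full estimator) can be computed from the per-run estimates as in the sketch below. The function names are hypothetical, and the coverage calculation assumes Wald intervals of the form estimate ± 1.96 × standard error, which is an assumption about how intervals are formed.

```python
import numpy as np

def summarize_simulation(estimates, std_errors, truth, z=1.959964):
    """Bias, Monte Carlo SD, average estimated SE, and Wald-interval coverage."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    covered = np.abs(est - truth) <= z * se
    return {
        "bias": est.mean() - truth,
        "SD": est.std(ddof=1),
        "ASE": se.mean(),
        "coverage": covered.mean(),
    }

def relative_efficiency(full_estimates, method_estimates):
    """Monte Carlo variance of the Full estimator divided by that of a method."""
    return np.var(full_estimates, ddof=1) / np.var(method_estimates, ddof=1)
```

A value of ASE noticeably below SD, together with coverage below 95%, is the pattern that would indicate underestimation of the variance.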

Fig. 1 — Simulation results for Simulation Study I: λ(u) = λ₀(u) exp{βX(u)} and correctly specified error distribution. The variables enclosed in the parentheses of AIPW() indicate which variables are used in estimating the augmentation terms, and indicate the strength of correlation with X(u) (R = 0, 0.5, 0.95). Relative efficiency is calculated as the Monte Carlo variance of β̂ from Full divided by that from IPW or AIPW method.

4.2 Simulation Study II

We next consider the Cox model λ(u) = λ₀(u) exp{βX(u) + ηZ}. We can assess the Prentice criteria by examining if η is plausibly close to zero. We model X(u) with a linear random effects model as above and generate Z ~ Bernoulli(0.5). The random effects α are simulated from a bivariate normal distribution with 𝔼(α|Z = 0) = (2.5915, − 0.00145)^T, 𝔼(α|Z = 1) = (2.5915, − 0.00315)^T and ℂov(α|Z) = D with elements (D₁₁, D₁₂, D₂₂)^T = (0.02408, −0.0008, 0.000014)^T. We consider two sets of hazard ratios (e^β, e^η)^T = {(0.5, 1)^T, (0.5, 0.5)^T}.

To study the influence of i) the number of measurement time points; and ii) misspecification of the measurement error model, we consider the following four scenarios, all with random uniform variability in visit attendance ± one-half week:

II(a): X(u) is measured at weeks 0, 2, 4, 8, 20, 32, 44, 56, 68, 80; e(u) ~ N(0, 0.01).
II(b): X(u) is measured at weeks 0, 8, 44, 80; e(u) ~ N(0, 0.01).
II(c): X(u) is measured at weeks 0, 2, 4, 8, 20, 32, 44, 56, 68, 80; e(u) ~ Exp(10) − 0.1.
II(d): X(u) is measured at weeks 0, 2, 4, 8, 20, 32, 44, 56, 68, 80; measurement error depends on the value of the biomarker; e(u) ~ N(0, 0.01) if X(u) > 2.5 and e(u) ~ N(0, 0.05) if X(u) ≤ 2.5.
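The four measurement error mechanisms can be sketched as a single sampler. This is an illustration under stated assumptions: the second argument of N(·, ·) is read as a variance (consistent with σ² = 0.01 in Section 4.1), Exp(10) is read as an exponential distribution with rate 10 (mean 0.1 and variance 0.01, so that subtracting 0.1 centers it at zero), and the function name `draw_error` is hypothetical.

```python
import numpy as np

def draw_error(scenario, x_true, rng):
    """Draw measurement errors e(u) for scenarios 'a' to 'd' given true values X(u)."""
    x_true = np.asarray(x_true, dtype=float)
    if scenario in ("a", "b"):
        # Normal error with variance 0.01 (II(a) and II(b) differ only in visit times).
        return rng.normal(0.0, np.sqrt(0.01), size=x_true.shape)
    if scenario == "c":
        # Centered exponential: right-skewed, mean 0, variance 0.01.
        return rng.exponential(scale=0.1, size=x_true.shape) - 0.1
    if scenario == "d":
        # Heteroscedastic normal error: larger variance when X(u) <= 2.5.
        sd = np.where(x_true > 2.5, np.sqrt(0.01), np.sqrt(0.05))
        return rng.normal(0.0, 1.0, size=x_true.shape) * sd
    raise ValueError("scenario must be one of 'a', 'b', 'c', 'd'")
```

Under this reading, scenarios II(a) and II(c) share the same error mean and variance and differ only in distributional shape, whereas II(d) changes the variance itself as a function of the biomarker.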

Studies II(c) and II(d) assess the influence of the misspecified error models. The proportions of subjects dropping off the study during follow-up and completing the study event-free are (35.9%, 53.7%) and (36.1%, 53.4%) when the hazard ratios (e^β, e^η)^T are (0.5, 1)^T and (0.5, 0.5)^T, respectively. The average number of biomarker measurements per subject in Studies II(a), (c), and (d) is 8.3, and in Study II(b) is 3.3.

Figure 2 shows results for Studies II(a) and II(b). For II(b), AIPW does not yield an efficiency gain over IPW for β̂ unless A is very strongly correlated with the biomarker, suggesting that when a limited number of biomarker measurements are available, using the more complex AIPW method can yield poorer results than the simpler IPW method.

Fig. 2 — Simulation results for Simulation Studies II(a) and II(b): λ(u) = λ₀(u) exp{βX (u)+ ηZ} and correctly specified error distribution; II(b) with few measurement times. The variables enclosed in the parentheses of AIPW() indicate which variables are used in estimating the augmentation terms, and indicate the strength of correlation with X(u) (R = 0, 0.5, 0.95).

Figure 3 shows the results of Studies II(c) and II(d). For II(c), where the error distribution has a heavy right tail, the bias for β̂ is slightly greater than for II(a), where the error distribution is correctly specified. In contrast, bias is near zero for η̂, the same as for II(a), which occurs because Z has no measurement error. The 95% coverage probabilities are very close to the nominal level except for the CC method. There are similar patterns of relative efficiency to those in II(a), suggesting that unless very strong predictors of the biomarker are available, the simple IPW method may be recommended. The influence of misspecified measurement error seems to center on the bias of β̂, but the level of bias is acceptable, with relative bias up to 7.4%.

Fig. 3 — Simulation results for Simulation Studies II(c) and II(d): λ(u) = λ₀(u) exp{βX (u)+ ηZ} and misspecified error distribution. The variables enclosed in the parentheses of AIPW() indicate which variables are used in estimating the augmentation terms, and indicate the strength of correlation with X(u) (R = 0, 0.5, 0.95).

In II(d) where the measurement error depends on the biomarker, bias of β̂ is greater than in II(c), especially in the setting with (β, η)^T = (−ln 2, −ln 2)^T, with relative bias as high as 10.7%.

4.3 Simulation Study III

Lastly, we consider the Cox model λ(u) = λ₀(u) exp{βX(u) + ηZ + γX(u)Z}. Data are simulated as for II(a) with hazard ratios (e^β, e^η)^T = (0.5, 0.5)^T, and we specify no interaction, e^γ = 1. As reported in Figure 4, estimator performance for β and η is fairly similar to the other simulations, with similar conclusions. There is negligible bias for estimating the interaction coefficient γ.

Fig. 4 — Simulation results for Simulation Study III: λ(u) = λ₀(u) exp{βX (u) + ηZ + γX(u)Z} and correctly specified error distribution. The variables enclosed in the parentheses of AIPW() indicate which variables are used in estimating the augmentation terms, and indicate the strength of correlation with X(u) (R = 0, 0.5, 0.95).

5 Application to ACTG 175

We apply the methods to ACTG 175, a randomized trial comparing four antiretroviral regimens (zidovudine only, zidovudine+didanosine, zidovudine+zalcitabine and didanosine only) in N = 2,467 HIV-infected subjects with CD4 cell counts 200–500 per cubic millimeter (Hammer et al, 1996). The study design measured CD4 cell counts in all subjects every 12 weeks starting at Week 8, and followed subjects for occurrence of the composite clinical endpoint of AIDS or death, whichever occurred first. A total of 308 endpoints/cases were observed. The median follow-up time was 143 weeks. Addressing the same objectives as Song et al (2002), we study treatment Z = 1 as three regimens pooled (zidovudine+didanosine, zidovudine+zalcitabine, and didanosine) versus Z = 0 as zidovudine only, and study the inherent log₁₀CD4 cell count trajectory X(u) as a correlate of risk and as a Prentice surrogate endpoint for the composite clinical endpoint.

We created a case-control sample from the full study cohort with post-randomization CD4 cell data available, generating a study where the CD4 cell counts were measured only on this case-control sample. This sample included all cases and a simple random sample of 14.3% of controls, yielding 306 cases and 306 controls. The average number of CD4 measurements per subject was 9.3. Web Appendix Figure 1 shows observed X(u) trajectories by case/control status and treatment group, showing that cases had a steeper drop over time than controls, and Z = 1 subjects had higher average X(u) than Z = 0 subjects. Web Appendix Figure 2 shows that higher X(u) is associated with lower rates of AIDS/death.

We explored the functional form of X(u) over time using linear and quadratic weighted mixed effects models (fitted by maximum likelihood estimation with the lme() R function). Figure 5 and Web Appendix Table 7 support that the quadratic model provides a better fit. Accordingly, we select the random effects model (1) for log₁₀CD4 as X(u) = α₀ + α₁u + α₂u².

Fig. 5 — ACTG 175: Spaghetti plot of observed log₁₀CD4 cell counts X(u) from 10 randomly selected subjects with fitted lines from linear and quadratic mixed effects models.

We now check the Prentice criteria using the joint model, where we adjust for the average baseline log₁₀CD4 cell count measured at the two time points before randomization, as it is a dual predictor of X(u) and the clinical endpoint. Table 2 shows estimated coefficients from the IPW and AIPW methods using estimated π̂ and using treatment and event status for estimating the augmentation terms. Five joint models are fit to evaluate the Prentice criteria, using five different independent variable sets. The IPW and AIPW methods yield similar point estimates, and for brevity we only interpret the results of the IPW analysis.

Table 2.

ACTG 175: Estimates and 95% confidence intervals (CIs) for the coefficients in the joint Cox model, to assess continuous log₁₀CD4 cell count as a time-dependent Prentice surrogate endpoint.

Group	Model	log₁₀CD4	Trt	log₁₀CD4 × Trt	bsl.log₁₀CD4
	Model 0		−0.37 (−0.61, −0.13)		

IPW(π̂)^a
	Model 1	−2.14 (−2.81, −1.47)	−3.33 (−5.31, −1.34)	1.42 (0.46, 2.38)	−3.60 (−5.16, −2.03)
	Model 2	−1.04 (−1.93, −0.15)	−0.44 (−0.79, −0.08)		−3.61 (−5.70, −1.51)
Trt Group Z = 1	Model 3	−0.73 (−1.68, 0.22)			−4.23 (−6.46, −2.00)
Control Group Z = 0	Model 4	−2.89 (−3.60, −2.18)			−1.03 (−3.68, 1.62)

AIPW(π̂)^b
	Model 1	−2.35 (−3.22, −1.47)	−2.93 (−6.29, 0.43)	1.36 (−0.22, 2.94)	−1.55 (−2.90, −0.20)
	Model 2	−1.17 (−1.94, −0.39)	−0.35 (−0.67, −0.02)		−2.69 (−6.61, 1.23)
Trt Group Z = 1	Model 3	−0.92 (−2.24, 0.41)			−3.14 (−7.91, 1.62)
Control Group Z = 0	Model 4	−3.03 (−3.84, −2.23)			−0.57 (−9.33, 8.20)


^a PTE = −0.173 (95% CI: −1.026, 0.680), where PTE = 1 − θ_trt,₂/θ_trt,₀ and θ_trt,j is the coefficient for Trt in Model j. PCS = 0.021 (95% CI: 0.000, 0.586), with PCS as in Kobayashi and Kuroki (2014). pNIE₀ = 0.079 (95% CI: −0.836, 0.999), where pNIE₀ = 1 − NDE₀(26)/TE(26) and ${NDE}_{0} (26) = E \left\{ I (T^{1 X_{α}^{0}} \geq 26) - I (T^{0 X_{α}^{0}} \geq 26) \right\}$. We choose the Month 26 time point because it is a late follow-up time.

^b For the AIPW method, the treatment indicator Z and the event indicator Δ were used in estimating the augmentation terms.

Model 0 shows that treatment is significantly associated with AIDS/death, such that Prentice criterion (i) holds. Models 3 and 4 indicate that log₁₀CD4 cell count is significantly associated with AIDS/death in the control group (Z = 0) but not in the treatment group (Z = 1), and Model 1 shows that the interaction is significant. Therefore the analysis rejects Prentice criteria (ii) and (iii). The surrogate-quality summary measures PTE and PCS = 1/{1 + (1/PTE − 1)²} are both estimated to be near zero (−0.173 and 0.021, respectively). In addition, the mediation measure pNIE₀ is estimated to be near zero (0.079), indicating little evidence for mediation. In Web Appendix Table 8, we apply the conditional score method to all subjects from the ACTG 175 trial with CD4 data available. The results are generally comparable to those from the IPW or AIPW method, except that log₁₀CD4 cell count becomes significantly associated with AIDS/death in the treatment group (Z = 1).
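The two surrogate-quality summaries can be reproduced from the formulas given above. The sketch below is illustrative: the function names are hypothetical, and the PTE recomputed from Table 2 uses coefficients rounded to two decimals, so it is expected to differ slightly from the reported −0.173.

```python
def proportion_treatment_effect(theta_trt_adjusted, theta_trt_unadjusted):
    """PTE = 1 - theta_trt,2 / theta_trt,0 (Trt coefficients from Models 2 and 0)."""
    return 1.0 - theta_trt_adjusted / theta_trt_unadjusted

def pcs_from_pte(pte):
    """PCS = 1 / {1 + (1/PTE - 1)^2}, as stated in the text."""
    return 1.0 / (1.0 + (1.0 / pte - 1.0) ** 2)

# PTE from the rounded IPW Trt coefficients in Table 2 (Model 2: -0.44, Model 0: -0.37).
pte_rounded = proportion_treatment_effect(-0.44, -0.37)
# PCS evaluated at the reported PTE of -0.173.
pcs = pcs_from_pte(-0.173)
print(pte_rounded, pcs)
```

Evaluating the PCS formula at PTE = −0.173 gives approximately 0.021, matching the reported value; the rounded-coefficient PTE is approximately −0.19.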

6 Discussion

Joint models of longitudinal biomarkers measured with error and an event time have been studied extensively for full cohort data. We generalized the conditional score method to case-cohort/two-phase sampling design cohort studies with Bernoulli sampling, where only subjects in the second phase sample have biomarker data. We developed IPW and AIPW techniques, providing theoretical results on consistency and asymptotic normality with resulting sandwich variance estimators, and studied their performance under correct and incorrect modeling assumptions. Focusing on nonparametric kernel regression as a component of the AIPW method, we found that AIPW outperforms IPW when there is an auxiliary variable strongly correlated with the biomarker. However, with only weak correlations, including auxiliaries can reduce efficiency and even increase bias, especially when the number of biomarker measurements per subject is limited.

Our simulation studies of misspecified measurement error models suggest that when the measurement error is non-normal but still random, the conditional score estimators are slightly biased. More serious problems arise when the measurement error violates the homoscedasticity assumption. Where deviation from the assumption of normal measurement errors is detected, it is recommended to transform the original biomarker to meet the assumption more closely. However, it is challenging to check this assumption based on observed biomarker data because of the need to first specify a correct distribution for the inherent true biomarker. Replicate samples from subjects and validation studies of the assay used to measure the biomarker can be helpful for checking and specifying a credible measurement error model.

In addition to the simulation studies of the case-control Bernoulli sampling design presented in Section 4, we also studied two-phase sampling designs that over-sample control subjects with auxiliary values that predict extreme values of the biomarker. These designs attempt to increase efficiency by increasing the variability in the longitudinal biomarker process. (See Gilbert et al (2014) for a study of this concept.) However, the resulting estimators did not provide any efficiency gains over the simpler case-control design. We conjecture that this is because we focused on scenarios with a substantial amount of measurement error and that sample all cases, which dominate the variance of the influence functions. Additional research of interest would seek scenarios where an auxiliary covariate-dependent two-phase sampling design would improve efficiency.

Supplementary Material

10985_2016_9364_MOESM1_ESM

NIHMS772253-supplement-10985_2016_9364_MOESM1_ESM.pdf (448.5 KB, pdf)

Acknowledgments

This work was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health [Award Numbers R37AI054165, UM1AI068635], and by the Bill and Melinda Gates Foundation [Award Number OPP1110049]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or BMGF. The authors thank Ching-Yun Wang for statistical input, the AIDS Clinical Trials Group (ACTG) operations, laboratory, and statistical data management centers for generating and providing the ACTG 175 data, and the ACTG 175 study participants and investigators, in particular protocol chair Scott Hammer, who reviewed and approved this work.

Footnotes

Supplementary Material

Supplementary materials are provided in the Online Resource, including the methods for mediation analysis, proofs of theorems and lemmas, and additional simulation and ACTG 175 results.

Contributor Information

Rong Fu, Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.

Peter B. Gilbert, Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.

References

Cao W, Tsiatis A, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika. 2009;96(3):723–734. doi: 10.1093/biomet/asp033.
Dafni U, Tsiatis A. Evaluating surrogate markers of clinical outcome when measured with error. Biometrics. 1998;54(4):1445–1462.
DeGruttola V, Tu X. Modeling progression of CD4-lymphocyte count and its relationship to survival time. Biometrics. 1994;50(4):1003–1014.
Gilbert P, Yu X, Rotnitzky A. Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials. Statistics in Medicine. 2014;33(6):901–917. doi: 10.1002/sim.6006.
Guo X, Carlin B. Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician. 2004;58(1):16–24.
Hall P, Racine J, Li Q. Cross-validation and the estimation of conditional probability densities. Journal of the American Statistical Association. 2004;99(468):1015–1026.
Hammer S, Katzenstein D, Hughes M, Gundacker H, Schooley R, Haubrich R, Henry W, Lederman M, Phair J, Niu M, Hirsch M, Merigan T. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine. 1996;335(15):1081–1090. doi: 10.1056/NEJM199610103351501.
Haneuse S, Saegusa T, Lumley T. osDesign: An R package for the analysis, evaluation, and design of two-phase and case-control studies. Journal of Statistical Software. 2011;43(11). doi: 10.18637/jss.v043.i11.
Haynes B, Gilbert P, McElrath M, Zolla-Pazner S, Tomaras G, Alam S, Evans D, Montefiori D, Karnasuta C, Sutthent R, Liao H, DeVico A, Lewis CGK, Williams Pinter A, Fong Y, Janes H, DeCamp A, Huang Y, Rao M, Billings E, Karasavvas N, Robb M, Ngauy V, de Souza M, Paris R, Ferrari G, Bailer R, Soderberg K, Andrews C, Berman P, Frahm N, De Rosa S, Alpert M, Yates N, Shen X, Koup R, Pitisuttithum P, Kaewkungwal J, Nitayaphan S, Rerks-Ngarm S, Michael N, Kim J. Immune-correlates analysis of an HIV-1 vaccine efficacy trial. The New England Journal of Medicine. 2012;366(14):1275–1286. doi: 10.1056/NEJMoa1113425.
Henderson R, Diggle P, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics. 2000;1(4):465–480. doi: 10.1093/biostatistics/1.4.465.
Kobayashi F, Kuroki M. A new proportion measure of the treatment effect captured by candidate surrogate endpoints. Statistics in Medicine. 2014;33(19):3338–3353. doi: 10.1002/sim.6180.
Lin D, Fleming T, DeGruttola V. Estimating the proportion of treatment effect explained by a surrogate marker. Statistics in Medicine. 1997;16(13):1515–1527. doi: 10.1002/(sici)1097-0258(19970715)16:13<1515::aid-sim572>3.0.co;2-1.
Prentice R. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika. 1982;69(2):331–342.
Prentice R. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika. 1986;73(1):1–11.
Prentice R. Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine. 1989;8(4):431–440. doi: 10.1002/sim.4780080407.
Qi L, Wang C, Prentice R. Weighted estimators for proportional hazards regression with missing covariates. Journal of the American Statistical Association. 2005;100(472):1250–1263.
Rizopoulos D, Verbeke G, Lesaffre E. Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. Journal of the Royal Statistical Society Series B (Statistical Methodology). 2009;71(3):637–654.
Self S, Pawitan Y. Modeling a marker of disease progression and onset of disease. In: Jewell N, Dietz K, Farewell V, editors. AIDS Epidemiology. Birkhäuser; Boston: 1992. pp. 231–255.
Song X, Davidian M, Tsiatis A. An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics. 2002;3(4):511–528. doi: 10.1093/biostatistics/3.4.511.
Tsiatis A, Davidian M. A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika. 2001;88(2):447–458. doi: 10.1093/biomet/88.2.447.
Tsiatis A, Davidian M. Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica. 2004;14:809–834.
Tsiatis A, DeGruttola V, Wulfsohn M. Modeling the relationship of survival to longitudinal data measured with error. Applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association. 1995;90(429):27–37.
Wang C. Corrected score estimator for joint modeling of longitudinal and failure time data. Statistica Sinica. 2006;16:235–253.
Wang C, Xie S, Prentice R. Recalibration based on an approximate relative risk estimator in Cox regression with missing covariates. Statistica Sinica. 2001;11:1081–1104.
Wu L, Liu W, Yi G, Huang Y. Analysis of longitudinal and survival data: joint modeling, inference methods, and issues. Journal of Probability and Statistics. 2012. doi: 10.1155/2012/640153.
Wulfsohn M, Tsiatis A. A joint model for survival and longitudinal data measured with error. Biometrics. 1997;53(1):330–339.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

10985_2016_9364_MOESM1_ESM

NIHMS772253-supplement-10985_2016_9364_MOESM1_ESM.pdf^{(448.5KB, pdf)}

[R1] Cao W, Tsiatis A, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika. 2009;96(3):723–734. doi: 10.1093/biomet/asp033. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] Dafni U, Tsiatis A. Evaluating surrogate markers of clinical outcome when measured with error. Biometrics. 1998;54(4):1445–1462. [PubMed] [Google Scholar]

[R3] DeGruttola V, Tu X. Modeling progression of CD4-lymphocyte count and its relationship to survival time. Biometrics. 1994;50(4):1003–1014. [PubMed] [Google Scholar]

[R4] Gilbert P, Yu X, Rotnitzky A. Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials. Statistics in Medicine. 2014;33(6):901–917. doi: 10.1002/sim.6006. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] Guo X, Carlin B. Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician. 2004;58(1):16–24. [Google Scholar]

[R6] Hall P, Racine J, Li Q. Cross-validation and the estimation of conditional probability densities. Journal of the American Statistical Association. 2004;99(468):1015–1026. [Google Scholar]

[R7] Hammer S, Katzenstein D, Hughes M, Gundacker H, Schooley R, Haubrich R, Henry W, Lederman M, Phair J, Niu M, Hirsch M, Merigan T. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine. 1996;335(15):1081–1090. doi: 10.1056/NEJM199610103351501. [DOI] [PubMed] [Google Scholar]

[R8] Haneuse S, Saegusa T, Lumley T. osDesign: An R package for the analysis, evaluation, and design of two-phase and case-control studies. Journal of Statistical Software. 2011;43(11) doi: 10.18637/jss.v043.i11. pii: v43/i11/paper. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] Haynes B, Gilbert P, McElrath M, Zolla-Pazner S, Tomaras G, Alam S, Evans D, Montefiori D, Karnasuta C, Sutthent R, Liao H, DeVico A, Lewis CGK, Williams Pinter A, Fong Y, Janes H, DeCamp A, Huang Y, Rao M, Billings E, Karasavvas N, Robb M, Ngauy V, de Souza M, Paris R, Ferrari G, Bailer R, Soderberg K, Andrews C, Berman P, Frahm N, De Rosa S, Alpert M, Yates N, Shen X, Koup R, Pitisuttithum P, Kaewkungwal J, Nitayaphan S, Rerks-Ngarm S, Michael N, Kim J. Immune-correlates analysis of an HIV-1 vaccine efficacy trial. The New England Journal of Medicine. 2012;366(14):1275–1286. doi: 10.1056/NEJMoa1113425. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] Henderson R, Diggle P, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics. 2000;1(4):465–480. doi: 10.1093/biostatistics/1.4.465. [DOI] [PubMed] [Google Scholar]

[R11] Kobayashi F, Kuroki M. A new proportion measure of the treatment effect captured by candidate surrogate endpoints. Statistics in Medicine. 2014;33(19):3338–3353. doi: 10.1002/sim.6180. [DOI] [PubMed] [Google Scholar]

[R12] Lin D, Fleming T, DeGruttola V. Estimating the proportion of treatment effect explained by a surrogate marker. Statistics in Medicine. 1997;16(13):1515–1527. doi: 10.1002/(sici)1097-0258(19970715)16:13<1515::aid-sim572>3.0.co;2-1. [DOI] [PubMed] [Google Scholar]

[R13] Prentice R. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika. 1982;69(2):331–342. [Google Scholar]

[R14] Prentice R. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika. 1986;73(1):1–11. [Google Scholar]

[R15] Prentice R. Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine. 1989;8(4):431–440. doi: 10.1002/sim.4780080407. [DOI] [PubMed] [Google Scholar]

[R16] Qi L, Wang C, Prentice R. Weighted estimators for proportional hazards regression with missing covariates. Journal of the American Statistical Association. 2005;100(472):1250–1263. [Google Scholar]

[R17] Rizopoulos D, Verbeke G, Lesaffre E. Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. Journal of the Royal Statistical Society Series B (Statistical Methodology) 2009;71(3):637–654. [Google Scholar]

[R18] Self S, Pawitan Y. Modeling a marker of disease progression and onset of disease. In: Jewell N, Dietz K, Farewell V, editors. AIDS Epidemiology. Birkhäuser; Boston: 1992. pp. 231–255. [Google Scholar]

[R19] Song X, Davidian M, Tsiatis A. An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics. 2002;3(4):511–528. doi: 10.1093/biostatistics/3.4.511. [DOI] [PubMed] [Google Scholar]

[R20] Tsiatis A, Davidian M. A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika. 2001;88(2):447–458. doi: 10.1093/biostatistics/3.4.511. [DOI] [PubMed] [Google Scholar]

[R21] Tsiatis A, Davidian M. Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica. 2004;14:809–834. [Google Scholar]

[R22] Tsiatis A, DeGruttola V, Wulfsohn M. Modeling the relationship of survival to longitudinal data measured with error. Applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association. 1995;90(429):27–37. [Google Scholar]

[R23] Wang C. Corrected score estimator for joint modeling of longitudinal and failure time data. Statistica Sinica. 2006;16:235–253. [Google Scholar]

[R24] Wang C, Xie S, Prentice R. Recalibration based on an approximate relative risk estimator in Cox regression with missing covariates. Statistica Sinica. 2001;11:1081–1104. [Google Scholar]

[R25] Wu L, Liu W, Yi G, Huang Y. Analysis of longitudinal and survival data: joint modeling, inference methods, and issues. Journal of Probability and Statistics. 2012. doi: 10.1155/2012/640153.

[R26] Wulfsohn M, Tsiatis A. A joint model for survival and longitudinal data measured with error. Biometrics. 1997;53(1):330–339.
