Author manuscript; available in PMC: 2013 May 8.
Published in final edited form as: Biometrics. 2011 Oct 17;68(2):397–407. doi: 10.1111/j.1541-0420.2011.01690.x

Hazard Ratio Estimation for Biomarker-Calibrated Dietary Exposures

PAMELA A SHAW 1, ROSS L PRENTICE 2
PMCID: PMC3648661  NIHMSID: NIHMS468302  PMID: 22004367

Summary

Uncertainty concerning the measurement error properties of self-reported diet has important implications for the reliability of nutritional epidemiology reports. Biomarkers based on the urinary recovery of expended nutrients can provide an objective measure of short-term nutrient consumption for certain nutrients and, when applied to a subset of a study cohort, can be used to calibrate corresponding self-report nutrient consumption assessments. A non-standard measurement error model that makes provision for systematic error and subject-specific error, along with the usual independent random error, is needed for the self-report data. Three estimation procedures for hazard ratio (Cox model) parameters are extended for application to this more complex measurement error structure. These procedures are risk set regression calibration, conditional score, and nonparametric corrected score. An estimator for the cumulative baseline hazard function is also provided. The performance of each method is assessed in a simulation study. The methods are then applied to an example from the Women's Health Initiative Dietary Modification Trial.

Keywords: Biomarkers, Conditional score, Cox regression, Measurement error, Nonparametric corrected score, Nutrition assessment, Regression calibration

1. Introduction

International reviews of diet and chronic disease data report many possible diet-disease associations, but few that are firmly established (World Cancer Research Fund/American Institute for Cancer Research, 2007). Studies of the same association in different populations, or using differing dietary assessment methods, often yield conflicting results. There are now cohorts where diet is assessed with both food frequency and food record data. In these contexts a positive association between dietary fat and breast cancer (Bingham et al., 2003; Freedman et al., 2006) and an inverse association between dietary fiber and colorectal cancer (Dahm et al., 2010) were found when food records were used, but these associations were not apparent when food frequency questionnaire data were substituted. These reports strongly suggest that the measurement error properties of the dietary assessment methods used need to be assessed and accommodated if reliable diet and disease associations are to be obtained.

Some nutritional epidemiology observational studies have attempted to address the measurement error issue by using one self-report assessment to calibrate another. However, a basic requirement for a `reference' estimate to be used to calibrate (or correct) another assessment is that of independent measurement errors. Since measurement errors for two self-report assessments instead may be strongly positively correlated, recent efforts have instead focused on biomarkers of nutrient consumption. For example, a doubly-labeled water technique (Schoeller, 1988) provides a reliable assessment of short-term total energy expenditure, while urinary nitrogen yields a good biomarker of short-term protein expenditure (Bingham and Cummings, 1985). These `recovery biomarkers' (Kaaks, 1997) also provide estimates of consumption among weight-stable persons. Available biomarker study data document important systematic bias in relation to body mass index for energy, protein and percent of energy from protein for self-reported food frequency data (Heitman and Lissner, 1995; Subar et al., 2003; Neuhouser et al., 2008).

Building on the work of several authors (Prentice, 1996; Carroll et al., 1998; Kipnis et al., 2001; Jiang et al., 2001), Prentice et al. (2002) proposed a general measurement error model for self-reported dietary intake which incorporates location and scale bias terms that may depend on observed covariates. This model also allows the measurement error variance to depend on observed subject characteristics. Sugar et al. (2007) considered this model for covariate measurement error and developed methods for odds ratio estimation (logistic regression model). Many potential applications, however, involve time-to-event outcomes. Here we consider hazard ratio (Cox model) estimators that accommodate this general error structure in order to relate nutrient consumption to a time-to-disease outcome.

Several methods have been proposed for Cox regression with mismeasured covariates, under a classical additive measurement model. These include regression calibration (Prentice, 1982; Wang et al., 1997), risk set regression calibration (Xie et al., 2001), parametric, semiparametric and nonparametric likelihood procedures (Hu et al., 1998), conditional scores (Tsiatis and Davidian, 2001), parametric corrected scores (Nakamura, 1992), and nonparametric corrected scores procedures (Huang and Wang, 2000, 2006; Hu and Lin, 2002, 2004; Gorfine et al., 2004; Song and Huang, 2005). There has also been consideration of more general error models (Hu and Lin, 2002; Liao et al., 2011) under the assumption that a `validation' subsample is available where the covariate of interest is precisely measured. In our setting, the precisely measured covariate is not obtainable.

Here our interest focuses on a study cohort with available self-report data and a biomarker subsample. Our applications to energy and protein consumption in relation to cancer (Prentice et al., 2009) and to cardiovascular disease incidence (Prentice et al., 2011) indicate that these assessments can be much improved by using such study subject characteristics as body mass index, age and ethnicity to augment the food frequency self-report assessments. The disease occurrence rates are low (<5%) and censoring rates are high in the context of these studies, configurations under which bias in the regression calibration estimators is expected to be negligible. In this paper we extend risk set regression calibration, conditional score and nonparametric corrected score procedures to the measurement model of Prentice et al. (2002) that allows for these types of dependencies. We evaluate and compare the performance of these procedures in simulation study and application.

2. Methods

Let Xi be the covariate of interest measured with error, a nutritional intake in our setting, and Zi a vector of precisely measured covariates, for subjects i = 1, … , n. The usual proportional hazards model for the continuous failure time Ti ≥ 0 is assumed. The hazard rate λi(t) for individual i at time t is given by λ0(t)exp(β1Xi + β2Zi), where λ0(t) is an arbitrary `baseline' hazard function. Assume the right-censoring time C is independent of T given (X, Z). Let Ni(t) denote the counting process for observed events, Yi(t) = I{Ui = min(Ti, Ci) ≥ t} be the at-risk indicator at time t, and Δi = I(Ti ≤ Ci). A finite follow-up interval [0, M] is assumed.

2.1 Measurement Error Model

Instead of observing Xi, ki replicates of a self-reported Qij are observed, where Qij follows the general measurement error model (Prentice et al., 2002)

$$Q_{ij} = \delta_0 + \delta_1 X_i + \delta_2 Z_i + \delta_3 Z_i X_i + \gamma_i + \xi_{ij}, \tag{1}$$

for i = 1, … , n and j = 1, … , ki. Here, ξij is mean-zero random error and γi is a mean-zero random effect that allows errors in repeat assessments for subject i to be correlated. The δ parameters determine the systematic bias of the assessment, including scale and location bias dependent on Zi. A variance of the form a exp(bVi) is considered for γi to allow the subject-specific error variance to depend on Vi, the categorical components of Zi. Note that, more generally, Vi could be a categorical characteristic derived from Zi. For regression calibration, there may be continuous and discrete components of Zi. For the conditional and corrected score methods that follow, all components of Zi that impact the scale bias (i.e., have a nonzero δ3 coefficient) are assumed to be discrete. The error ξij is assumed independent of the other random variables on the right side of (1).

For i = 1, … , n, we also assume there are κi replicates of an additional covariate Wij, for j = 1, … , κi, a biomarker that obeys the classical measurement error model

$$W_{ij} = X_i + \epsilon_{ij}. \tag{2}$$

Importantly, the mean-zero error ∊ij is assumed independent of Xi and other terms on the right side of (1). Typically, due to expense, the biomarker would only be measured on a random subset of subjects, called the biomarker subset. Let Ri be the indicator that subject i is in the biomarker subset.

We assume that (Ni, Yi, Xi, Zi, Ri, γi, ki, κi, ∊i1, … , ∊iκi, ξi1, … , ξiki) are i.i.d. random vectors and that (Ri, ki, κi) may depend on observed baseline covariates Zi, but is otherwise independent of all other random variables in the survival and error models. With the additional assumptions that P(R = 1) > 0 and P(V = υ) > 0 for all υ ∈ {υ|V = υ}, one has ρυ = P(R = 1, V = υ) > 0 from the independence of R and V. With this result and the SLLN, we have the necessary regularity that nυ/n → ρυ as n → ∞ for all υ, where nυ is the number of individuals with R = 1 and V = υ. If P(R = 1) < 1, similar regularity holds for the non-biomarker subset (R = 0).

For ease of notation, X and Z are specialized to be univariate, with Z being categorical, and we assume all n individuals have k replicates of Q and all individuals in the biomarker subset have κ observations of W. The error model nuisance parameters are estimated separately using method of moments (Sugar et al., 2007). To ensure identifiability of all parameters in the measurement error model, it is enough that only a random subset of the biomarker cohort has κ > 1 replicates of W and a random subset of the `main' cohort has k > 1 replicates of Q. Importantly, for regression calibration, only single measures of Q and W are needed. The variance of γ in (1) is assumed to follow the model Σγ = a exp(bZ).
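As an illustration of the error structure in (1) and (2), the following minimal sketch draws self-report and biomarker replicates with Gaussian errors. The function name, the default variances, the bivariate normal construction of (X, Z), and the dichotomization of Z at zero are assumptions made for the example; they are not a description of the authors' software.

```python
import numpy as np

def simulate_error_model(n, delta, a, b, var_xi=0.5, var_eps=0.5,
                         k=2, kappa=2, n_biomarker=250, rho=0.5, seed=0):
    """Draw (X, Z, Q, W, R) under models (1) and (2) with Gaussian errors.

    All argument names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    d0, d1, d2, d3 = delta
    # X and a latent normal variable are correlated; Z is the indicator
    # that the latent variable exceeds its population median (zero).
    cov = np.array([[1.0, rho], [rho, 1.0]])
    xz = rng.multivariate_normal(np.zeros(2), cov, size=n)
    X = xz[:, 0]
    Z = (xz[:, 1] > 0).astype(float)
    # Subject-specific effect gamma_i with variance a * exp(b * Z_i).
    gamma = rng.normal(0.0, np.sqrt(a * np.exp(b * Z)))
    # Independent random error xi_ij for each of the k self-report replicates.
    xi = rng.normal(0.0, np.sqrt(var_xi), size=(n, k))
    Q = (d0 + d1 * X + d2 * Z + d3 * Z * X + gamma)[:, None] + xi
    # Biomarker with classical error, observed only in the random subset R = 1.
    R = np.zeros(n, dtype=bool)
    R[rng.choice(n, size=n_biomarker, replace=False)] = True
    W = X[:, None] + rng.normal(0.0, np.sqrt(var_eps), size=(n, kappa))
    W[~R] = np.nan
    return X, Z, Q, W, R

X, Z, Q, W, R = simulate_error_model(1000, (0.0, 0.9, -0.2, -0.3), 0.5, np.log(2))
```

Rows of `W` outside the biomarker subset are set to missing, mirroring the design in which the biomarker is measured only when Ri = 1.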

2.2 Risk Set Regression Calibration

In this section, we extend the risk set regression calibration (RRC) estimator of Xie et al. (2001) to the generalized measurement error model (Section 2.1). For this model, the unobserved Xi are estimated separately depending on membership in the biomarker subset.

For an observed failure time t, define

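$$\hat{X}_i(t) = \begin{cases} \hat{E}\{X_i \mid Y_i(t) = 1, W_{i\cdot}, Q_{i\cdot}, Z_i\} & \text{if } R_i = 1 \\ \hat{E}\{X_i \mid Y_i(t) = 1, Q_{i\cdot}, Z_i\} & \text{if } R_i = 0 \end{cases} \tag{3}$$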

where $W_{i\cdot} = \kappa^{-1}\sum_{j=1}^{\kappa} W_{ij}$, $Q_{i\cdot} = k^{-1}\sum_{j=1}^{k} Q_{ij}$, and `^' denotes estimate. Sugar et al. (2007) discuss a class of $n^{1/2}$-consistent estimators for the nuisance parameters pertaining to the same measurement error model considered here. Explicit moment plug-in estimates for the nuisance parameters are given in the Supplementary Materials. The nuisance parameters can also be estimated by performing linear regression of the biomarker on the self-report and other observed covariates in the measurement error model, as discussed in the appendix of Neuhouser et al. (2008). This method requires no replicates of the error-prone W or Q.
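The linear-regression route to calibration mentioned above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it fits the regression once in the biomarker subset (so it corresponds to the Ri = 0 branch of (3) evaluated at t = 0), it does not use the biomarker value itself for subset members, and the function name and design matrix (intercept, Q, Z, and a Q by Z interaction) are assumptions.

```python
import numpy as np

def calibrate(q_bar, z, w_bar, in_biomarker):
    """Regress the biomarker on the self-report and Z in the biomarker subset,
    then predict a calibrated exposure for every cohort member.

    Because the biomarker error is independent of (Q, Z), the fitted mean of W
    given (Q, Z) also estimates the mean of X given (Q, Z).
    """
    design = np.column_stack([np.ones_like(q_bar), q_bar, z, q_bar * z])
    coef, *_ = np.linalg.lstsq(design[in_biomarker], w_bar[in_biomarker],
                               rcond=None)
    return design @ coef, coef
```

For risk set regression calibration, the same fit would be repeated at each observed failure time using only subjects still at risk.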

The RRC estimator is found by solving the following estimating equation for β = (β1, β2)′

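$$n^{-1}\sum_{i=1}^{n}\int_0^M \left[\{\hat{X}_i(t), Z_i\}' - \frac{\sum_{j=1}^{n} Y_j(t)\{\hat{X}_j(t), Z_j\}'\exp\{\beta_1\hat{X}_j(t) + \beta_2 Z_j\}}{\sum_{j=1}^{n} Y_j(t)\exp\{\beta_1\hat{X}_j(t) + \beta_2 Z_j\}}\right] dN_i(t) = 0. \tag{4}$$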

The ordinary regression calibration (RC) estimator is found from an equation similar to (4), except that X̂j is estimated only once (at t = 0) instead of being re-estimated for each risk set. The RC and RRC estimators are generally not consistent for the true β, even if Xi is normally distributed, as the distribution of Xi | {Yi(t) = 1, Wi·, Qi·, Zi} is typically not normal (Prentice, 1982). In the classical measurement error setting, regression calibration provides an estimate of β with little asymptotic bias, provided that β is small to moderate and failure probabilities are small (Prentice, 1982; Xie et al., 2001). In many settings this simple estimator substantially eliminates the naive estimator bias and has good efficiency. Issues of bias will be explored for the proposed RRC estimator using simulation studies. Regularity conditions sufficient for asymptotic normality are listed in the Supplementary Materials.
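To make the structure of estimating equation (4) concrete, the sketch below solves the Cox partial likelihood score for a single scalar covariate by Newton-Raphson. It is a simplification under stated assumptions: Z is omitted, failure times are assumed untied, and the function names are hypothetical. Ordinary RC would pass a calibrated exposure (estimated once) as `x`; RRC would additionally re-estimate that exposure within each risk set, which this sketch does not do.

```python
import numpy as np

def cox_score_info(beta, x, time, event):
    """Partial likelihood score and information for one covariate."""
    order = np.argsort(-time)          # decreasing time: risk sets accumulate
    x, event = x[order], event[order]
    w = np.exp(beta * x)
    s0 = np.cumsum(w)                  # sum of exp(beta * x_j) over the risk set
    s1 = np.cumsum(w * x)
    s2 = np.cumsum(w * x * x)
    x_bar = s1 / s0                    # risk-set weighted mean of the covariate
    score = np.sum(event * (x - x_bar))
    info = np.sum(event * (s2 / s0 - x_bar ** 2))
    return score, info

def cox_fit(x, time, event, beta=0.0, tol=1e-8, max_iter=50):
    """Newton-Raphson solution of the score equation."""
    for _ in range(max_iter):
        score, info = cox_score_info(beta, x, time, event)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

Replacing `x` with the unadjusted self-report would give the naive estimator, and replacing it with the true exposure would give the benchmark labelled `True' in the simulation tables.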

2.3 Conditional Score

Stefanski and Carroll (1987) developed the conditional score estimator for generalized linear models. In the conditional score approach, a joint probability model for the mismeasured covariates and the response variable Y is assumed, and the unobserved covariates are treated as parameters. The conditional score estimating equation is obtained by conditioning the derived estimating equation on the sufficient statistics for the unobserved covariates, the Xi in our setting. Tsiatis and Davidian (2001) adapted this approach to the partial likelihood score, assuming the mismeasured covariates follow a linear mixed effects model with classical normal measurement error. Here, their conditional score method is extended to the generalized error model described above.

First consider an individual in the biomarker subset. Assuming normally distributed errors, one can condition the likelihood of {dNi(t), Qi, Wi} given {Xi, Zi, Yi(t) = 1} on the statistic

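$$\zeta_i = \frac{\beta_1\Sigma_{ei}\Sigma_{\epsilon i}\,dN_i(t) + \Sigma_{\epsilon i}(\delta_1 + \delta_3 Z_i)(Q_{i\cdot} - \delta_0 - \delta_2 Z_i) + \Sigma_{ei}W_{i\cdot}}{\Sigma_{ei} + \Sigma_{\epsilon i}(\delta_1 + \delta_3 Z_i)^2},$$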

where Σ∊i = Σ∊/κ is the variance of the error in Wi· and Σei = var(Qi·|Xi, Zi) = a exp(bZi) + Σξ/k. The resulting conditional intensity

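$$\lim_{dt \to 0} dt^{-1} P\{dN_i(t) = 1 \mid \zeta_i, Z_i, Y_i(t)\} = \lambda_0(t)\exp\left\{\beta_1\zeta_i - \frac{\beta_1^2\Sigma_{ei}\Sigma_{\epsilon i}}{2\{\Sigma_{ei} + \Sigma_{\epsilon i}(\delta_1 + \delta_3 Z_i)^2\}} + \beta_2 Z_i\right\} Y_i(t)$$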

does not depend on the unobserved Xi. Similarly, for a non-member of the biomarker cohort, conditioning on the statistic $\zeta_i = \beta_1\Sigma_{ei}(\delta_1 + \delta_3 Z_i)^{-2}\,dN_i(t) + (\delta_1 + \delta_3 Z_i)^{-1}(Q_{i\cdot} - \delta_0 - \delta_2 Z_i)$ gives the conditional intensity

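$$\lim_{dt \to 0} dt^{-1} P\{dN_i(t) = 1 \mid \zeta_i, Z_i, Y_i(t)\} = \lambda_0(t)\exp\left\{\beta_1\zeta_i - \frac{\beta_1^2\Sigma_{ei}}{2(\delta_1 + \delta_3 Z_i)^2} + \beta_2 Z_i\right\} Y_i(t).$$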

As in the classical measurement error case, it can be shown that ζi in both cases above is of the form ζi = W̃i + β1Ẽi dNi(t), where W̃i | Xi has mean Xi and variance Ẽi. For the biomarker cohort, W̃i is a weighted average of the self-report measure Qi·, recentered and rescaled so that it is unbiased for Xi at the true value of the nuisance parameters, and the biomarker Wi·, where the weights are inversely proportional to the error variance in these two variables. For the non-members of the biomarker cohort, W̃i is simply the recentered and rescaled Qi·. Now define E0i(t; β, ϕ) = exp(β1ζi − β1²Ẽi/2 + β2Zi)Yi(t), where ϕ is the vector of error model nuisance parameters. Proceeding in a manner similar to Tsiatis and Davidian (2001), the estimating equation for β1 is given by

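$$\sum_{z \in \{Z\}}\sum_{i=1}^{n_z}\int_0^M \left\{\zeta_{zi} - \frac{\sum_{j=1}^{n_z}\zeta_{zj}E_{0j}(t, \beta, \hat{\phi})}{\sum_{j=1}^{n_z}E_{0j}(t, \beta, \hat{\phi})}\right\} dN_i(t) = 0, \tag{5}$$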

where for the Z = z stratum, nz denotes the number of individuals and subscript zi denotes the i-th member. This conditional score equation reduces to the ordinary stratified Cox partial likelihood score equation when there is no measurement error, i.e. when δ = (0, 1, 0, 0) and Σξ = Σ∊ = 0. As discussed in Section 2.1, a plug-in estimate for the vector of the measurement error nuisance parameters can be obtained separately. Details of this derivation and regularity conditions for equation (5) are provided in the Supplementary Materials.
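The statement that the conditioning statistic for a biomarker subset member is built from an inverse-variance weighted average can be checked directly. The short derivation below is a sketch using the shorthand $d_i = \delta_1 + \delta_3 Z_i$ and $Q_i^{*}$ for the recentered and rescaled self-report; both symbols are introduced here for convenience and do not appear in the original notation.

```latex
\begin{align*}
Q_i^{*} &= \frac{Q_{i\cdot} - \delta_0 - \delta_2 Z_i}{d_i},
  \qquad \operatorname{var}(Q_i^{*} \mid X_i, Z_i) = \frac{\Sigma_{ei}}{d_i^{2}},
  \qquad \operatorname{var}(W_{i\cdot} \mid X_i) = \Sigma_{\epsilon i}, \\[4pt]
\tilde{W}_i &= \frac{(d_i^{2}/\Sigma_{ei})\,Q_i^{*} + (1/\Sigma_{\epsilon i})\,W_{i\cdot}}
                    {d_i^{2}/\Sigma_{ei} + 1/\Sigma_{\epsilon i}}
             = \frac{\Sigma_{\epsilon i}\, d_i\,(Q_{i\cdot} - \delta_0 - \delta_2 Z_i) + \Sigma_{ei} W_{i\cdot}}
                    {\Sigma_{ei} + \Sigma_{\epsilon i}\, d_i^{2}}, \\[4pt]
\tilde{E}_i &= \left(\frac{d_i^{2}}{\Sigma_{ei}} + \frac{1}{\Sigma_{\epsilon i}}\right)^{-1}
             = \frac{\Sigma_{ei}\Sigma_{\epsilon i}}{\Sigma_{ei} + \Sigma_{\epsilon i}\, d_i^{2}}.
\end{align*}
```

Adding $\beta_1\tilde{E}_i\,dN_i(t)$ to $\tilde{W}_i$ recovers the statistic $\zeta_i$ given for the biomarker subset, and the quantity $\beta_1^2\tilde{E}_i/2$ matches the correction term in the corresponding conditional intensity. For a non-member, only the first component is available, so $\tilde{W}_i = Q_i^{*}$ and $\tilde{E}_i = \Sigma_{ei}/d_i^{2}$.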

A second estimator β̂csw using a conditional score approach can be obtained by taking a weighted combination of two estimating functions: 1) the conditional score function for β of Tsiatis and Davidian (2001), which assumes classical measurement error, using only the biomarker data Wij, and 2) the left-hand side of the proposed conditional score equation (5), using only the self-report data Qij. For the latter conditional score, the biomarker data are still needed to estimate the nuisance parameters. Subject to normality and regularity conditions, every weighted average of these two conditional score estimating equations would be consistent for β. One could choose the weight w that minimizes the variance of β̂csw. This strategy is evaluated in the simulation study that follows. Note, however, this approach may not be practical if the biomarker subsample includes few uncensored failure times.

2.4 Nonparametric Corrected Score

The idea behind the corrected score approach for consistent estimation with mismeasured covariates is to derive the necessary adjustment to the estimating equation with the error prone covariate so that it has the same expected value as the desired estimating equation with the true covariate and outcome of interest. Nakamura (1992) and Buzas (1998) developed a parametric corrected score for Cox regression. Huang and Wang (2000, 2006) developed nonparametric corrected scores for Cox regression assuming classical measurement error. The estimator of Huang and Wang (2006) requires replicate mismeasured covariates only on a subset and is extended here to the error model in Section 2.1.

Define X̃i to be the `main' instrument Qi· recentered and rescaled by the nuisance parameters δ = (δ0, δ1, δ2, δ3) from the error model in equation (1), i.e. X̃i(δ) = (Qi· − δ0 − δ2Zi)/(δ1 + δ3Zi). At the true parameter value δ0 = (δ00, δ10, δ20, δ30), the variable X̃i is composed of Xi plus an error term. That is, X̃i(δ0) = Xi + (γi + ξi·)/(δ10 + δ30Zi) = Xi + νi, where νi given Zi = z has zero mean and variance (Σγi· + Σξi·)/(δ10 + δ30z)². With this transformed covariate X̃i(δ), one can adapt the corrected score approach of Huang and Wang (2006). For consistent estimation, the method of Huang and Wang (2000, 2006) requires there to be individuals with at least two `error prone' measures observed which are conditionally independent given Xi and whose errors are independent of Xi and the at-risk process. The distribution of the error in X̃i depends on Zi, so if Zi, either through correlation with Xi or independently, is associated with the hazard, then the error νi will be correlated with both Xi and I{Yi(t) = 1}. Assuming discreteness and conditioning on the value of Zi, νi is independent of Xi, the failure time, and ∊i. Thus by stratifying the partial likelihood score on Z, a technique similar to Huang and Wang (2006) can be applied to achieve consistency.
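The recentering and rescaling step is a one-line transformation. The sketch below, with a hypothetical function name, applies it and illustrates that an error-free self-report is mapped back to the true exposure exactly, which is the sense in which X̃i is unbiased for Xi at the true δ.

```python
import numpy as np

def rescale_self_report(q_bar, z, delta):
    """Return (Q - delta0 - delta2 * Z) / (delta1 + delta3 * Z)."""
    d0, d1, d2, d3 = delta
    return (q_bar - d0 - d2 * z) / (d1 + d3 * z)

# With gamma and xi set to zero, the transformation inverts model (1).
delta = (0.0, 0.9, -0.2, -0.3)
x = np.array([0.5, -1.0, 2.0])
z = np.array([0.0, 1.0, 1.0])
q = delta[0] + delta[1] * x + delta[2] * z + delta[3] * z * x
x_tilde = rescale_self_report(q, z, delta)
```

Because the divisor δ1 + δ3Zi differs across strata, any error in Q is inflated by a stratum-specific factor, which is why the text stratifies on Z.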

To derive the corrected score, first note at δ = δ0 the solution to the following estimating equation based only on individuals in the biomarker subset:

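$$\sum_{z \in \{Z\}}\sum_{i=1}^{n_z}\int_0^M R_{zi}\left[\tilde{X}_{zi}(\delta) - \frac{\sum_{j=1}^{n_z} Y_{zj}(t)\,W_{zj}\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}{\sum_{j=1}^{n_z} Y_{zj}(t)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}\right] dN_{zi}(t) = 0$$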

is consistent for β1; where for the Z = z stratum, nz denotes the number of individuals and subscript zi denotes the i-th member. This equation can be rewritten as

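$$\sum_{z \in \{Z\}}\sum_{i=1}^{n_z}\int_0^M R_{zi}\left[\tilde{X}_{zi}(\delta) - \frac{\sum_{j=1}^{n_z} Y_{zj}(t)\,\tilde{X}_{zj}(\delta)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}{\sum_{j=1}^{n_z} Y_{zj}(t)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}} + \frac{\sum_{j=1}^{n_z} Y_{zj}(t)\{\tilde{X}_{zj}(\delta) - W_{zj}\}\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}{\sum_{j=1}^{n_z} Y_{zj}(t)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}\right] dN_{zi}(t) = 0.$$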

This suggests the following corrected score equation based on the entire cohort:

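$$\sum_{z \in \{Z\}}\sum_{i=1}^{n_z}\int_0^M \left[\tilde{X}_{zi}(\delta) + \hat{D}_z(\theta, t) - \frac{\sum_{j=1}^{n_z} Y_{zj}(t)\,\tilde{X}_{zj}(\delta)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}{\sum_{j=1}^{n_z} Y_{zj}(t)\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}\right] dN_{zi}(t) = 0, \tag{6}$$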

where

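$$\hat{D}_z(\theta, t) = \frac{\sum_{j=1}^{n_z} Y_{zj}(t)\,R_{zj}\{\tilde{X}_{zj}(\delta) - W_{zj}\}\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}{\sum_{j=1}^{n_z} Y_{zj}(t)\,R_{zj}\exp\{\beta_1\tilde{X}_{zj}(\delta)\}}$$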

and θ = (β1, δ). The estimate of Dz(θ, t) is a nonparametric moment estimator using data from individuals in the biomarker sub-cohort with Zi = z. If the value of the nuisance parameter δ is not known, a separate moment estimator can again be used as a plug-in. Notably, a subset of individuals with at least one measure of both W and Q at risk at time t is all that is necessary to estimate D̂z(θ, t). The solution β̂np to equation (6) is referred to as the nonparametric corrected score estimator.
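The correction term in (6) is a weighted mean of the difference between the rescaled self-report and the biomarker among biomarker subset members still at risk. A minimal sketch for a single Z stratum follows; the function and argument names are hypothetical, and `follow_up` stands for the observed time Ui = min(Ti, Ci).

```python
import numpy as np

def d_hat(t, beta1, x_tilde, w_bar, follow_up, in_biomarker):
    """Moment estimate of D_z(theta, t) within one stratum of Z."""
    use = in_biomarker & (follow_up >= t)   # at risk at t and has a biomarker
    if not use.any():
        raise ValueError("no biomarker subset member is at risk at time t")
    weights = np.exp(beta1 * x_tilde[use])
    return np.sum(weights * (x_tilde[use] - w_bar[use])) / np.sum(weights)
```

Subjects outside the biomarker subset never enter the sums, so missing biomarker values for them do not affect the result.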

As was done for the conditional score approach in Section 2.3, a second potentially more efficient nonparametric estimator β̂npw can be obtained by taking a weighted average of the above score equation (6) and the nonparametric score equation for classical measurement error (Huang and Wang, 2000, 2006) based on the biomarker data alone. The weight w can be chosen to minimize the sample variance of β̂npw.

3. Estimation of Cumulative Baseline Hazard Function

For the assumed Cox model, the Breslow estimator for the cumulative baseline hazard is

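$$\hat{\Lambda}_0(t) = \int_0^t \frac{dN(u)}{\sum_{j=1}^{n} Y_j(u)\exp(\hat{\beta}_1 X_j + \hat{\beta}_2 Z_j)} = \sum_{T_i \le t}\frac{\Delta_i}{\sum_{j \in \mathcal{R}_i}\exp(\hat{\beta}_1 X_j + \hat{\beta}_2 Z_j)},$$

where $\mathcal{R}_i$ denotes the risk set at time $T_i$.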
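For reference, the Breslow estimator with error-free covariates can be computed as in the following sketch. The function name and array layout are assumptions; the loop evaluates the risk-set sum at each observed failure time no later than t.

```python
import numpy as np

def breslow(t, beta, covariates, time, event):
    """Breslow estimate of the cumulative baseline hazard at time t.

    covariates: (n, p) array; beta: (p,) array; event: boolean array.
    """
    risk_score = np.exp(covariates @ beta)
    total = 0.0
    for i in np.flatnonzero(event & (time <= t)):
        # Denominator: sum of exp(linear predictor) over subjects at risk.
        total += 1.0 / risk_score[time >= time[i]].sum()
    return total
```

With β = 0 the estimate reduces to the Nelson-Aalen estimator, which gives a convenient sanity check.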

Huang and Wang (2000) provided a nonparametric consistent estimator for Λ0 under classical measurement error using a representation of this estimator as a functional of empirical processes. This estimator, unlike their β̂, requires additional assumptions of mean zero and symmetric error. Making these assumptions for ∊ only in (2), we extend their estimator to accommodate error model (1). We adopt their notation, using ε̂ to denote the sample empirical mean and I(U = min(T, C) ≥ u) in place of Y(u) to highlight the connection between their estimator and equation (7) below. For notational simplicity, assume two repeat measures of Q on everyone (ki = 2). One approach to estimating Λ0 involves stratifying on values of Z, so that approximately λz0(t) = λ0(t)exp(β2Z), where Z denotes a `representative' Z-value in stratum z. A consistent estimator for Λz0(t) for stratum Z = z is

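$$\hat{\Lambda}^{NP}_{z0}(t; \hat{\beta}_1, \hat{\delta}) = \left(\hat{\varepsilon}\left[I(Z = z)\,R\exp\left\{\hat{\beta}_1\frac{W^{(1)} - W^{(2)}}{2}\right\}\right]\right)^{-1} \times \hat{\varepsilon}\left(I(Z = z)\,R\exp\left[\hat{\beta}_1\left\{\tilde{X}(\hat{\delta}) - \frac{W^{(1)} + W^{(2)}}{2}\right\}\right]\right)\int_0^t \frac{d\hat{\varepsilon}\{I(Z = z)\,\Delta\,I(U \le u)\}}{\hat{\varepsilon}[I(Z = z)\exp\{\hat{\beta}_1\tilde{X}(\hat{\delta})\}\,I(U \ge u)]} \tag{7}$$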

where β̂1 is the solution to (6). Stratification is useful, as in (6), because the error in X̃ij depends on values of Z. For the RC and RRC estimators, this also leads to the convenient overall estimator of Λ0, $\hat{\Lambda}_0(t) = \sum_{z \in \{Z\}}\int_0^t \exp(-\hat{\beta}_2 z)\,n_z(u)\,n(u)^{-1}\,\hat{\Lambda}^{NP}_{z0}(du; \hat{\beta}_1, \hat{\delta})$, where nz(u) and n(u) denote the risk set size in stratum Z = z and the overall risk set size, respectively, at time u. Equation (7) relies only on X̃ derived from the error model and a suitable estimate of β, and it can be used in conjunction with any of the hazard ratio estimators described above. Details showing consistency are provided in the Supplementary Materials.

4. Simulation Study

Through simulation, the relative performance of the risk set regression calibration (RRC), conditional score (CS), and nonparametric corrected score (NP) estimators is studied. Properties of these estimators will be compared to the `true' method, Cox regression on the unobserved true exposure Xi; the naive method, Cox regression on the error-prone Qi·; and ordinary regression calibration (RC). For the CS and NP methods, we examine the performance of the weighted estimator described in Sections 2.3 and 2.4 compared with the classical measurement error versions of these estimators based on the biomarker data alone. We compare performance for different scenarios that vary the magnitude of the relative risk parameter β, the random and systematic subject-specific measurement error nuisance parameters, and the assumed covariate and error distributions. We also consider versions of these estimators that ignore the dependence of subject-specific error variance on the observed covariate Z. Standard errors for the error-correction estimators are estimated using a bootstrap procedure.

For this simulation study, the cohort size is set at 1000 and the randomly selected biomarker subset at 250. Individuals have two copies of the main instrument Qij; biomarker cohort members have two copies of the biomarker Wij. Let η = (δ0, δ1, δ2, δ3, a, b), from the measurement error model (1). The scenario η = (0, 0.9, −0.2, −0.3, 0.5, log 2) represents a moderate amount of subject-specific error, with 10% scale bias for Zi = 0 and 40% for Zi = 1. The variance for γi, the subject-specific random effect term in (1), is allowed to vary between 0.5 (Zi = 0) and 1 (Zi = 1), while the variances of ∊ij and ξij are fixed at 0.5. The scenario η = (0, 0.5, −0.2, −0.2, 0.5, log 2) represents strong subject-specific bias, with 50% scale bias for Zi = 0 and 70% for Zi = 1; variances for error terms γi, ∊ij and ξij are kept as before. Results are presented for β = log 1.5, log 3. Note that log 3 is quite extreme, with a hazard ratio of 3 for a unit increase in a standard normal exposure variable. It is included here because the regression calibration estimator is known to perform less well in Cox regression for large β (Xie et al., 2001). Survival data are generated with an exponential distribution with unit rate and a fixed censoring time of t0 = 1, resulting in roughly 40% censoring. Much larger censoring rates, with associated smaller biases for regression calibration procedures, attend the type of application that motivated this research.
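The scale bias percentages and subject-specific variances quoted for the two scenarios follow directly from η. The sketch below reproduces that arithmetic; the function name and the returned dictionary keys are hypothetical.

```python
import numpy as np

def scenario_summary(eta):
    """Scale bias (percent) and var(gamma) for Z = 0 and Z = 1."""
    d0, d1, d2, d3, a, b = eta
    z = np.array([0.0, 1.0])
    scale = d1 + d3 * z                      # multiplier of X in model (1)
    return {
        "scale_bias_pct": 100.0 * (1.0 - scale),
        "var_gamma": a * np.exp(b * z),      # a * exp(b * Z)
    }

moderate = scenario_summary((0, 0.9, -0.2, -0.3, 0.5, np.log(2)))
strong = scenario_summary((0, 0.5, -0.2, -0.2, 0.5, np.log(2)))
```

Under the moderate scenario the multiplier of X is 0.9 for Zi = 0 and 0.6 for Zi = 1; under the strong scenario it is 0.5 and 0.3, which correspond to the stated 10%/40% and 50%/70% scale biases.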

Mean bias, bootstrap standard deviation, empirical standard deviation (across simulations), root mean squared error, and empirical coverage probability for the bootstrap 95% confidence intervals are provided. Bootstrap estimates are based on 100 bootstrap samples and the empirical results are based on 1000 simulations.
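The summary measures reported in the tables can be computed from the per-simulation estimates as in the sketch below. It assumes Wald-type intervals (estimate ± 1.96 bootstrap standard deviations); the text does not say how the bootstrap intervals were formed, so that choice, like the function name, is an assumption.

```python
import numpy as np

def summarize(estimates, boot_sds, beta_true):
    """Bias, SD, BSD, RMSE and coverage (percent) across simulated data sets."""
    lower = estimates - 1.96 * boot_sds
    upper = estimates + 1.96 * boot_sds
    return {
        "bias": estimates.mean() - beta_true,
        "sd": estimates.std(ddof=1),             # empirical SD across simulations
        "bsd": boot_sds.mean(),                  # average bootstrap SD
        "rmse": np.sqrt(np.mean((estimates - beta_true) ** 2)),
        "cp": 100.0 * np.mean((lower <= beta_true) & (beta_true <= upper)),
    }
```

Since the mean squared error is the squared bias plus the variance, the tabulated RMSE should be close to the square root of Bias² + SD² in every row.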

Table 1 presents the results under normal covariate and error distributions. Xi and Zi are generated from a bivariate normal with zero mean, unit variance, and ρ = 0.5. Zi is converted to a binary indicator variate for being above the median. The upper left of Table 1 shows the results for β = log 1.5 and moderate systematic error. The naive estimate has a bias of −0.217 (54% reduction from target) and a smaller standard error than for the estimate based on the true exposure, leading to 0% coverage for a nominal 95% confidence interval. For this scenario, with moderate error and β = log 1.5, all of the measurement error corrected methods had small biases and came close to the nominal 95% coverage. The nonparametric estimators had the largest sample standard error. For the strong systematic error scenario and β = log 1.5 (top right), results were similar. The bias for the RRC is somewhat lower than for RC, but the RC estimator had the smallest mean-squared error of all the methods.

Table 1.

Simulation results for the general measurement error model with Gaussian subject-specific and random error. For 1000 simulated data sets, the mean bias, empirical standard deviation (SD), bootstrap standard deviation (BSD), root mean squared error (RMSE) and estimated 95% coverage probability (CP) are given for β = log 1.5, log 3.

(M) = Moderate Subject-Specific Bias; (S) = Strong Subject-Specific Bias.

| β | Method | Bias (M) | SD (M) | BSD (M) | RMSE (M) | CP (M) | Bias (S) | SD (S) | BSD (S) | RMSE (S) | CP (S) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| log 1.5 | True | 0.000 | 0.041 | 0.041 | 0.041 | 96.5 | 0.000 | 0.041 | 0.041 | 0.041 | 96.5 |
| log 1.5 | Naive | −0.217 | 0.032 | 0.032 | 0.219 | 0.0 | −0.265 | 0.037 | 0.037 | 0.268 | 0.0 |
| log 1.5 | RC | −0.014 | 0.063 | 0.062 | 0.065 | 92.9 | −0.016 | 0.069 | 0.069 | 0.071 | 93.3 |
| log 1.5 | RRC | −0.012 | 0.068 | 0.073 | 0.069 | 95.4 | −0.013 | 0.073 | 0.073 | 0.074 | 93.4 |
| log 1.5 | CS B | 0.010 | 0.099 | 0.102 | 0.099 | 95.8 | 0.010 | 0.099 | 0.102 | 0.099 | 95.8 |
| log 1.5 | CS W | 0.011 | 0.091 | 0.096 | 0.092 | 96.5 | 0.010 | 0.102 | 0.105 | 0.102 | 95.6 |
| log 1.5 | NP B | 0.014 | 0.123 | 0.128 | 0.124 | 96.9 | 0.014 | 0.123 | 0.128 | 0.124 | 96.9 |
| log 1.5 | NP W | 0.012 | 0.119 | 0.123 | 0.120 | 96.7 | 0.007 | 0.122 | 0.126 | 0.122 | 95.7 |
| log 3 | True | 0.001 | 0.050 | 0.051 | 0.050 | 95.4 | 0.001 | 0.050 | 0.051 | 0.050 | 95.4 |
| log 3 | Naive | −0.678 | 0.035 | 0.034 | 0.679 | 0.0 | −0.806 | 0.039 | 0.038 | 0.806 | 0.0 |
| log 3 | RC | −0.190 | 0.095 | 0.094 | 0.213 | 46.2 | −0.219 | 0.098 | 0.097 | 0.240 | 38.5 |
| log 3 | RRC | −0.120 | 0.107 | 0.119 | 0.161 | 79.7 | −0.146 | 0.107 | 0.108 | 0.181 | 69.9 |
| log 3 | CS B | 0.035 | 0.182 | 0.198 | 0.186 | 97.4 | 0.035 | 0.182 | 0.198 | 0.186 | 97.4 |
| log 3 | CS W | 0.023 | 0.171 | 0.183 | 0.172 | 96.5 | 0.025 | 0.180 | 0.195 | 0.181 | 97.6 |
| log 3 | NP B | 0.070 | 0.255 | 0.289 | 0.265 | 96.6 | 0.070 | 0.255 | 0.289 | 0.265 | 96.6 |
| log 3 | NP W | 0.056 | 0.246 | 0.275 | 0.252 | 96.3 | 0.041 | 0.262 | 0.276 | 0.265 | 95.2 |

True: Cox regression with true X; Naive: Cox regression with unadjusted Q; RC: ordinary regression calibration; RRC: risk set regression calibration; CS B: CS from Tsiatis and Davidian (2001) using biomarker W data only; CS W: weighted combination of CS using Q only and W only; NP B: NP equation from Huang and Wang (2000) using only W; NP W: weighted combination of NP using Q only and W only.

The lower half of Table 1 shows the results for the same error settings, with β = log 3. As expected, bias of the regression calibration estimators increased, but RRC had less bias and better RMSE than RC. For the larger β, CS had the smallest bias, good nominal coverage, and nearly the same RMSE as RRC. The RRC estimator had the lowest mean squared error for the large β for both subject-specific error scenarios, though it had appreciable bias and poorer coverage for the larger value of β than either the CS or NP estimator.

The CS and NP estimators had variances that were appreciably larger than the other estimators. It is noteworthy for this setting, with relatively large error in Qij compared to Wij and a substantial number of events observed in the biomarker subset, the NP and CS estimators based on the complete data had modest to no gain in efficiency over their counterparts, CS B and NP B, based on the biomarker subset alone. Sugar et al. (2007) also observed the CS estimator in the context of logistic regression with the same general measurement error model was highly variable, particularly in presence of strong subject-specific bias. For classical measurement error, the nonparametric corrected estimator has been observed to have numerical instability and problems due to multiple roots (Song and Huang, 2005; Carroll et al., 2006), with these problems getting worse as the measurement error variance increases (Song and Huang, 2005). In the case of multiple roots, the root closest to the RC estimate was selected. These simulations suggest that for a similar setting of moderate sample sizes, potentially large and normal error, the RRC and conditional score methods perform better overall, with conditional score preferred for very extreme β values.

To explore robustness, a similar set of simulations was repeated with skewed distributions. The systematic and random error terms in Wij and Qij were generated from a unit exponential distribution, reflected about zero to create left skewness and offset by its mean to create mean-zero errors. The same bivariate normal distribution was used to create X and Z as above, and then both were exponentiated to create skewed lognormal random variables. Results are shown in Table 2. As expected, the regression calibration estimators, which rely on approximate normality, have more bias, particularly for the larger β. For the extreme β, however, the RRC estimator was much less affected. The conditional score methods, which rely on Gaussian error for consistency, had noticeably larger bias and relatively larger variance. The weighted CS estimator did not improve on the CS estimator based on the biomarker alone, likely due to the larger amount of error and skewness in Q. The displayed CS W estimator had a weight of 0.1 for the score using information from Q, having the smallest variance among the (nontrivial) decile weights. The skewness had a larger impact on the relative performance of the RC, RRC, and CS estimators (in terms of bias and variance) for the larger β. The performance of the NP estimator, as expected, was unaffected by the skewness in the distributions, with little small-sample bias, good nominal coverage, and the smallest or nearly the smallest RMSE for all scenarios. The RRC estimator, even with large β and strong systematic error, had reasonable coverage and bias less than 15%.
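The skewed error construction described above (a unit exponential reflected about zero and shifted by its mean) can be sketched as follows. The function name is hypothetical, and no rescaling of the variance is applied because the text does not mention one.

```python
import numpy as np

def left_skewed_error(rng, size):
    """Mean-zero, left-skewed error: reflect a unit exponential and add its mean."""
    # -E has mean -1 for E ~ Exp(1); adding 1 recenters the draw at zero.
    return 1.0 - rng.exponential(1.0, size=size)

rng = np.random.default_rng(0)
errors = left_skewed_error(rng, 200_000)
```

The resulting variable is bounded above by 1 and has a long left tail, so large negative errors are more likely than large positive ones.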

Table 2.

Simulation results for the general measurement error model with skewed distributions for the model covariates as well as the subject-specific and random error. For 1000 simulated data sets, the mean bias, empirical standard deviation (SD), bootstrap standard deviation (BSD), root mean squared error (RMSE) and estimated 95% coverage probability (CP) are given for β = log 1.5, log 3.

(M) = Moderate Subject-Specific Bias; (S) = Strong Subject-Specific Bias.

| β | Method | Bias (M) | SD (M) | BSD (M) | RMSE (M) | CP (M) | Bias (S) | SD (S) | BSD (S) | RMSE (S) | CP (S) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| log 1.5 | True | 0.002 | 0.020 | 0.020 | 0.020 | 94.6 | 0.002 | 0.020 | 0.020 | 0.020 | 94.6 |
| log 1.5 | Naive | −0.082 | 0.022 | 0.022 | 0.085 | 3.8 | −0.072 | 0.042 | 0.034 | 0.083 | 45.4 |
| log 1.5 | RC | −0.025 | 0.099 | 0.079 | 0.102 | 86.9 | −0.045 | 0.089 | 0.069 | 0.099 | 88.7 |
| log 1.5 | RRC | 0.016 | 0.096 | 0.121 | 0.097 | 93.7 | 0.044 | 0.076 | 0.071 | 0.088 | 82.1 |
| log 1.5 | CS B | 0.012 | 0.048 | 0.050 | 0.050 | 94.9 | 0.012 | 0.048 | 0.050 | 0.050 | 94.9 |
| log 1.5 | CS W | 0.028 | 0.061 | 0.060 | 0.067 | 93.5 | 0.054 | 0.075 | 0.076 | 0.092 | 88.4 |
| log 1.5 | NP B | 0.011 | 0.057 | 0.061 | 0.058 | 96.2 | 0.011 | 0.057 | 0.061 | 0.058 | 96.2 |
| log 1.5 | NP W | 0.008 | 0.055 | 0.061 | 0.057 | 96.6 | 0.009 | 0.056 | 0.060 | 0.057 | 96.4 |
| log 3 | True | 0.003 | 0.038 | 0.037 | 0.038 | 94.5 | 0.003 | 0.038 | 0.037 | 0.038 | 94.5 |
| log 3 | Naive | −0.491 | 0.032 | 0.029 | 0.492 | 0.0 | −0.590 | 0.051 | 0.038 | 0.592 | 0.0 |
| log 3 | RC | −0.343 | 0.193 | 0.152 | 0.394 | 44.4 | −0.458 | 0.157 | 0.122 | 0.485 | 0.7 |
| log 3 | RRC | −0.077 | 0.197 | 0.201 | 0.211 | 94.3 | −0.116 | 0.167 | 0.161 | 0.203 | 90.9 |
| log 3 | CS B | 0.136 | 0.160 | 0.184 | 0.210 | 99.0 | 0.136 | 0.160 | 0.184 | 0.210 | 99.0 |
| log 3 | CS W | 0.138 | 0.226 | 0.225 | 0.265 | 94.9 | 0.112 | 0.231 | 0.249 | 0.257 | 98.0 |
| log 3 | NP B | 0.068 | 0.210 | 0.243 | 0.221 | 96.6 | 0.068 | 0.210 | 0.242 | 0.221 | 96.6 |
| log 3 | NP W | 0.059 | 0.195 | 0.234 | 0.204 | 97.3 | 0.064 | 0.209 | 0.237 | 0.219 | 96.6 |

True: Cox regression with true X; Naive: Cox regression with unadjusted Q; RC: ordinary regression calibration; RRC: risk set regression calibration; CS B: CS from Tsiatis and Davidian (2001) using biomarker W data only; CS W: weighted combination of CS using Q only and W only; NP B: NP equation from Huang and Wang (2000) using only W; NP W: weighted combination of NP using Q only and W only.
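The summary columns reported in the simulation tables can be computed from per-data-set results along the following lines. This is a sketch under stated assumptions rather than the authors' code: the function name `summarize` is hypothetical, BSD is taken to be the average of the per-data-set bootstrap standard deviations, and coverage is based on a normal-approximation interval (estimate ± 1.96 × bootstrap SD), neither of which is spelled out in this section.

```python
import math
import statistics

def summarize(estimates, std_errors, beta_true, z=1.96):
    """Summarize one estimator across simulated data sets.

    estimates  : point estimates of beta, one per data set
    std_errors : matching standard error estimates (e.g. bootstrap)
    beta_true  : the value of beta used to generate the data
    """
    bias = statistics.fmean(estimates) - beta_true
    sd = statistics.stdev(estimates)        # empirical SD
    bsd = statistics.fmean(std_errors)      # average estimated SD (assumed)
    squared_errors = [(b - beta_true) ** 2 for b in estimates]
    rmse = math.sqrt(statistics.fmean(squared_errors))
    covered = sum(
        1 for b, se in zip(estimates, std_errors)
        if b - z * se <= beta_true <= b + z * se
    )
    cp = 100.0 * covered / len(estimates)   # coverage, in percent
    return {"bias": bias, "sd": sd, "bsd": bsd, "rmse": rmse, "cp": cp}
```

With this definition, RMSE squared equals the squared bias plus the (population) variance of the estimates, which is why estimators with small bias have RMSE close to SD in the tables.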

Table 3 compares the proposed estimators to those incorrectly based on the error model (1) without dependence of var(γi) on Zi. Scenarios are the same as in Table 1, except the impact of Z in var(γi) was increased with b = log 4. It is interesting to note that the misspecified RC and RRC estimators have similar bias but increased standard errors compared to their correctly specified versions. The misspecified CS estimator has increased bias and standard errors, with similar coverage. The NP estimator, because it uses rescaling by Zi, induces and adjusts for error variance that depends on Z, so misspecification is not possible.

Table 3.

Simulation study comparing the proposed estimators with misspecified versions of the measurement error variance, under the general measurement error model with Gaussian distributions for the model covariates as well as for the subject-specific and random error. For 1000 simulated data sets, the mean bias, empirical standard deviation (SD), bootstrap standard deviation (BSD), root mean squared error (RMSE) and estimated 95% coverage probability (CP) are given for β = log 1.5, log 3.

Moderate Subject-Specific Bias (first five columns); Strong Subject-Specific Bias (last five columns)
β = log 1.5 Bias SD BSD RMSE CP Bias SD BSD RMSE CP
True 0.000 0.041 0.041 0.041 96.5 0.000 0.041 0.041 0.041 96.5
Naive −0.262 0.028 0.028 0.263 0.0 −0.308 0.031 0.031 0.309 0.0
RC −0.014 0.064 0.063 0.066 93.1 −0.016 0.069 0.070 0.071 93.2
RC-M −0.008 0.071 0.071 0.071 95.0 −0.011 0.080 0.082 0.081 95.6
RRC −0.014 0.069 0.074 0.070 94.7 −0.014 0.073 0.073 0.074 93.1
RRC-M 0.007 0.077 0.077 0.077 95.9 −0.004 0.083 0.084 0.083 96.2
CS W 0.014 0.098 0.102 0.099 96.0 0.008 0.105 0.107 0.105 95.5
CS W-M −0.040 0.101 0.111 0.109 94.3 −0.020 0.153 0.146 0.155 93.9
NP W 0.008 0.119 0.125 0.120 96.6 0.000 0.123 0.127 0.123 95.5
β = log 3 Bias SD BSD RMSE CP Bias SD BSD RMSE CP
True 0.001 0.050 0.051 0.050 95.4 0.001 0.050 0.051 0.050 95.4
Naive −0.790 0.031 0.029 0.791 0.0 −0.901 0.032 0.031 0.901 0.0
RC −0.193 0.096 0.095 0.216 45.9 −0.220 0.099 0.097 0.241 38.0
RC-M −0.196 0.102 0.101 0.221 47.6 −0.210 0.105 0.105 0.235 46.4
RRC −0.131 0.106 0.118 0.169 77.8 −0.150 0.107 0.108 0.184 68.2
RRC-M −0.088 0.110 0.109 0.141 83.8 −0.157 0.105 0.106 0.189 63.2
CS W 0.020 0.177 0.189 0.178 96.8 0.026 0.178 0.195 0.180 97.8
CS W-M 0.061 0.253 0.249 0.261 95.6 0.067 0.191 0.219 0.202 97.6
NP W 0.044 0.246 0.277 0.250 96.1 0.023 0.259 0.278 0.260 94.7

True: Cox regression with true X; Naive: Cox regression with unadjusted Q; RC: ordinary regression calibration; RRC: risk set regression calibration; CS W: weighted combination of conditional score using Q only and W only; NP W: weighted combination of the nonparametric estimator using Q only and W only. Misspecified error-correction estimators are noted by a “–M”.

5. WHI Example

The Women's Health Initiative (WHI) Dietary Modification (DM) trial followed 48,835 women for an average of 8.1 years and examined whether a low-fat dietary pattern intervention could lower the risk of breast and colorectal cancer (Women's Health Initiative Study Group, 1998; Prentice et al., 2006). Prentice et al. (2006) reported a nonsignificant reduction in breast cancer of 9% (logrank p=0.07) for the intervention compared to the control (usual diet) arm. There was no suggested reduction for colorectal cancer (Beresford et al., 2006). An important question is whether the equivocal breast cancer finding reflects a lack of efficacy or a lack of adherence to the diet, but actual diet is not obtainable. Instead, the primary tool for measuring diet was a self-reported food frequency questionnaire (FFQ), an instrument known to be subject to both random and subject-specific reporting errors.

The DM trial included a Nutritional Biomarker Substudy (NBS) which collected self-reported intake along with several objective biomarkers on 544 weight-stable women randomly selected at a representative set of 12 of the 40 participating clinical centers. The NBS protocol included the doubly-labeled water recovery marker for total energy consumption (Schoeller, 1988). There were 110 women recruited from early NBS enrollees who had repeat biomarker measures, allowing the general measurement error model to be identifiable. Further details of the NBS study and an analysis of the measurement error in the WHI dietary instruments were reported by Neuhouser et al. (2008), who found BMI to be a strong determinant of subject-specific bias in this cohort. In this illustrative example, we fit the measurement error model in equation (1) with possible dependence on obesity status (BMI ≥ 30) and apply the developed methods to provide error-adjusted estimates of the risk of breast cancer associated with total energy consumption. Because the baseline FFQ was used to determine eligibility in the DM trial by requiring a minimum of 32% estimated calories from fat, the baseline for this analysis was taken as one year after enrollment, at which time another FFQ was obtained. We analyze data from the usual diet (control) arm.

5.1 Results

There were 25803 women in the DM control group included in this analysis, 884 of whom developed breast cancer following the 1-year FFQ collection. The estimate (95% confidence interval (CI)) for the breast cancer hazard ratio associated with a 20% increase in energy intake was: 1.00 (0.97, 1.04) for the naive estimate, 1.24 (1.03, 1.48) for RC, 1.23 (1.03, 1.48) for RRC, 1.30 (0.80, 2.10) for CS and 1.43 (0.95, 2.15) for NP. Note the hazard ratio for a fractional increase in intake is constant under the Cox model applied, since the log-hazard ratio was assumed to be a linear function of log consumption. A 20% increase is roughly the difference between the third and first quartile of energy consumption, as measured by the recovery biomarker (2268 versus 1869 calories). The RC and RRC estimates are nearly identical, since for this relatively rare disease with most censoring at the planned study termination, the distribution for E(X|W, Q, Z) changes very little across risk sets. The NP hazard ratio estimate is slightly larger than the regression calibration estimates, and its 95% confidence interval is much wider. The CS estimate is the most variable of the error-corrected estimates and had some numerical problems, with skewness in the bootstrap estimates and more than 4% of the bootstrap iterations failing to find a root to the score equation.
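To make the constancy of the hazard ratio for a fractional increase explicit, write the hazard as a function of energy consumption x on its original scale, suppressing other covariates. The symbols x and c are introduced here only for illustration. For any multiplicative increase c > 0:

```latex
\frac{\lambda(t; cx)}{\lambda(t; x)}
  = \frac{\lambda_0(t)\exp\{\beta \log(cx)\}}{\lambda_0(t)\exp\{\beta \log x\}}
  = \exp(\beta \log c)
  = c^{\beta}
```

The ratio does not depend on x, so with c = 1.2 the hazard ratio for a 20% increase in intake is 1.2^β at every level of consumption.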

To help interpret the above estimates, a simulation study was built on the observed WHI data. The simulation cohort had 25000 individuals, with 540 in the biomarker sample and 110 in its reliability subset (both randomly selected). The self-reported and biomarker values for log-energy were well approximated by Gaussian distributions in the NBS (Figure 1, Supplementary Materials). BMI and X were generated as multivariate normal variates on the log scale. Roughly 25% of the simulated cohort were obese. In the fitted error model (1), δ3 was estimated to be nearly zero, so the model was applied without this term. The variance of ∊ was 30% of the total variance in W. The variance of the subject-specific bias plus random error terms was extreme, about 95% of the total variance of Q. Survival time was generated according to a proportional hazards model dependent on log energy consumption, with an exponential baseline survival distribution and an overall event rate of about 3%.
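The survival-time generation step can be sketched as below. This is an illustration only, not the authors' simulation code: the function name, the fixed administrative censoring time of 8 time units, the centered normal log-exposure with SD 0.15, and the way the baseline rate is tuned to give roughly 3% events are all assumptions made for this example, and the measurement error model for Q and W is not simulated here.

```python
import math
import random

def simulate_cohort(n, beta, baseline_rate, censor_time, x_sd, seed=0):
    """Simulate (x, time, event) under a proportional hazards model.

    Event times are exponential with rate baseline_rate * exp(beta * x),
    where x is a centered log-exposure. Follow-up stops at censor_time
    (administrative censoring, an assumption of this sketch).
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        x = rng.gauss(0.0, x_sd)
        rate = baseline_rate * math.exp(beta * x)
        t = rng.expovariate(rate)
        cohort.append((x, min(t, censor_time), t <= censor_time))
    return cohort

# Hypothetical settings: choose the baseline rate so that roughly 3% of
# subjects with x = 0 have an event before the end of follow-up.
censor_time = 8.0
baseline_rate = -math.log(1.0 - 0.03) / censor_time
cohort = simulate_cohort(25_000, beta=1.25, baseline_rate=baseline_rate,
                         censor_time=censor_time, x_sd=0.15, seed=1)
event_rate = sum(event for _, _, event in cohort) / len(cohort)
```

Because the exponential baseline has a constant hazard, the baseline rate alone controls the overall event proportion for a given follow-up length.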

Table 4 shows results for β = 0 and 1.25; i.e. the β for which a 20% increase in consumption leads to a hazard ratio of 1 and 1.26, respectively. None of the estimators showed evidence of bias under the null. The naive estimator under β = 1.25 had extreme bias, nearly 95% of the true value, and small standard error, leading to 0% empirical coverage. The RC and RRC estimators performed well for β = 1.25, having no detectable bias. There were some numerical problems for the CS and NP estimators, with more than 15% of the simulations failing to find a solution. These estimators were quite variable and had some small sample bias, though the coverage probabilities were close to the nominal 95%. Using the empirical mean and SD across simulations, the estimated hazard ratio (empirical 95% CI) for a 20% increase in intake is: 1.01 (0.98, 1.05) for the naive method, 1.25 (0.98, 1.60) for RC, 1.25 (0.98, 1.60) for RRC, 1.18 (0.69, 2.03) for CS, 1.23 (0.73, 2.08) for NP. The confidence intervals for the regression calibration estimators are considerably narrower than for the other two error-corrected estimators. The CS estimator had the largest variance. This simulation study suggests for this example, where there was a sizable biomarker subset, that each error correction method is providing approximately unbiased estimates of the risk associated with energy intake, with RC or RRC preferred due to their comparatively better efficiency. Another advantage of regression calibration is that standard software can be used to find the parameter estimates and replicates of neither the biomarker nor self-report are required.
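The conversion from a log-consumption coefficient to a hazard ratio for a 20% increase in intake can be sketched as follows. The function name is hypothetical, and the interval is assumed to be a normal-approximation interval (mean ± 1.96 SD on the coefficient scale, then exponentiated); the text does not state the exact construction, so results should be expected to agree with the reported values only up to rounding of the tabulated inputs.

```python
import math

def hazard_ratio_for_increase(beta_mean, beta_sd, fraction=0.20, z=1.96):
    """Hazard ratio and 95% interval for a fractional increase in intake.

    beta_mean, beta_sd : mean and SD of the log-hazard ratio per unit of
                         log consumption
    fraction           : fractional increase in intake (0.20 for 20%)
    """
    scale = math.log(1.0 + fraction)
    center = beta_mean * scale
    half_width = z * beta_sd * scale
    return (math.exp(center),
            math.exp(center - half_width),
            math.exp(center + half_width))

# True coefficient: beta = 1.25 corresponds to a hazard ratio near 1.26.
hr_true, _, _ = hazard_ratio_for_increase(1.25, 0.0)

# RC row of Table 4 for beta = 1.25: mean estimate = 1.25 + bias, SD = 0.691.
hr_rc, lo_rc, hi_rc = hazard_ratio_for_increase(1.25 - 0.006, 0.691)
```

Applied to the RC row, this gives values close to the reported 1.25 (0.98, 1.60).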

Table 4.

Results of simulations designed to emulate hazard ratio estimation for energy in relation to breast cancer in the Women's Health Initiative, using 500 simulated data sets. The mean bias, empirical standard deviation (SD), root mean squared error (RMSE), type I error (α) for β = 0, and estimated 95% coverage probability (CP) for β = 1.25 are given.

β = 0 Bias SD RMSE α β = 1.25 Bias SD RMSE CP
True 0.004 0.296 0.296 0.046 True 0.006 0.302 0.302 95.4
Naive −0.008 0.105 0.105 0.056 Naive −1.181 0.106 1.189 0.0
RC −0.005 0.709 0.709 0.050 RC −0.006 0.691 0.691 96.2
RRC −0.005 0.709 0.709 0.050 RRC −0.006 0.690 0.691 96.2
CS −0.052 1.507 1.507 0.052 CS −0.318 1.510 1.543 93.1
NP 0.014 1.528 1.528 0.042 NP −0.217 1.471 1.487 95.0

True: Cox regression with true X; Naive: Cox regression with unadjusted Q; RC: ordinary regression calibration; RRC: risk set regression calibration; CS: conditional score; NP: nonparametric corrected score.

6. Discussion

In this work, three methods for hazard ratio estimation were extended for the setting where the exposure of interest was measured with subject-specific bias and random error in the main cohort and additionally with classical measurement error on a subset. We also provided an estimator of the cumulative baseline hazard function. This error model is more flexible than those previously considered for these methods, does not rely on a validation subset, and accommodates sources of intake assessment error that vary across individuals.

The relocation and scaling parameters (the δ's in (1)) are crucial measurement model generalizations; the allowance for correlation between replicate error prone measurements (Q) is also important for some of the estimation procedures (conditional and nonparametric scores), while allowing the random effect variance to depend on Z may often be less important (Table 3). The risk set regression calibration estimator is a straightforward adaptation of its counterpart for classical measurement error. The conditional and nonparametric scores required more detailed calculations and rely on a subtle, but necessary, stratification of the Cox model for proper error correction. Risk set regression calibration is an approximate method that typically incorporates some asymptotic bias. For consistency, the proposed conditional score estimator relies on a normality assumption for the error terms, but does not need a distributional assumption for the unobserved true covariate. The nonparametric score method made no distributional assumptions for the error terms or the unobserved covariate.

Despite its lack of technical consistency, risk set regression calibration had the smallest mean-squared error in nearly all simulation scenarios considered, often by a considerable margin. Bias for both regression calibration estimators was more noticeable for extreme β, particularly when covariates and error terms had skewed distributions; however, risk set regression calibration reduced the bias considerably. Under Gaussian error, the conditional score method had little small sample bias, maintained nominal 95% coverage, and with large β and stronger subject-specific error, had close to the smallest mean-squared error amongst the estimators. The conditional score estimator, however, was not robust to departures from normality, with bias in the presence of skewness increasing for larger β. The nonparametric method was robust to departures from normality with little small sample bias and good nominal coverage, and this was true for the more extreme β and subject-specific bias. Typically, however, the nonparametric estimates had substantially larger variance than regression calibration.

The mean-variance tradeoff between robust nonparametric or semiparametric methods and efficiency of parametric approaches is a familiar one for estimation. In light of the relative success of regression calibration, it is of interest to consider other approximate methods for this setting. Hu et al. (1998) presented a semiparametric likelihood approach for Cox regression with classical covariate measurement error that uses flexible distributional assumptions on the unobserved covariate and in some settings performed better than regression calibration. Alternatively, increasing the number of moments used for regression calibration and transforming data to improve the normal approximation could also be useful, and computationally less burdensome than the available likelihood approaches. As illustrated above, simulation studies can guide the practitioner's choice of an appropriate method.

In the WHI example, the association under study appeared moderate in size, the disease relatively rare, the measurement error substantial, and the number of disease events fairly large. The simulations based on this example suggest these methods provide quite adequate estimates of the hazard ratio. These simulations also demonstrated that for the naive analysis, the error in the self-reported data was large enough to cause extreme bias and obscure a clinically relevant association between caloric intake and disease incidence. This finding is consistent with Prentice et al. (2009), who reported for this cohort that several cancer endpoints had no association with self-reported energy intake, whereas with calibrated energy intake there were associations of public health importance. Variation in the self-report intake variable was largely due to measurement error in this setting. This likely contributed to the numerical problems seen for conditional and nonparametric score estimators.

Because of the large number of nuisance parameters in the measurement error model, it is difficult to do an exhaustive exploration of the relative performance of these methods. For settings with error properties much different than those studied here, further study may be needed to understand which method is most appropriate. Some interpretational challenges arise with any of the estimation procedures considered here if one or more covariates (Zi) determining the subject-specific bias in (1) are also important mediators of the association between Xi and the hazard ratio. See Prentice and Huang (2011) for further discussion of this issue, and for data analysis options. The censoring mechanism could also impact the relative performance of the methods. Regression calibration methods work best when censoring times tend to be short, as longer follow-up will lead to X-distributions that depart more extensively from baseline in risk sets at later failure times. Conditional score and nonparametric estimators on the other hand, which do not rely on distributional assumptions for the unobserved covariate X, are unlikely to be much affected by variations in the censoring distributions. The development of estimation procedures that can accommodate departures from independent censoring, for example through inverse censoring probability weighting, would also be of interest in the context of our measurement model.

The bootstrap variance estimator was studied for the proposed estimators. Robust variance estimators have been developed in the case of classical measurement error for each of the methods studied (Wang, 1999; Xie et al., 2001; Tsiatis and Davidian, 2001; Huang and Wang, 2006) and a similar approach could be taken for the proposed methods. A general discussion of sandwich estimators in the context of measurement error is provided by Carroll et al. (2006, Appendix A.6). For ease of implementation, the bootstrap estimator is preferred.
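A generic subject-level bootstrap standard deviation can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: the function name and its arguments are hypothetical, and a faithful implementation for the present setting would also need to preserve the biomarker and reliability subsample structure when resampling and to handle bootstrap replicates for which an estimating equation has no root.

```python
import random
import statistics

def bootstrap_sd(subjects, estimator, n_boot=200, seed=0):
    """Bootstrap standard deviation of estimator(subjects).

    subjects  : list with one record per subject
    estimator : function mapping a list of records to a number
    """
    rng = random.Random(seed)
    n = len(subjects)
    replicates = []
    for _ in range(n_boot):
        # Resample subjects with replacement, keeping the sample size.
        resample = [subjects[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    return statistics.stdev(replicates)

# Toy usage with the sample mean standing in for a hazard ratio estimator.
rng = random.Random(42)
toy_data = [rng.gauss(0.0, 1.0) for _ in range(400)]
toy_bsd = bootstrap_sd(toy_data, statistics.fmean, n_boot=500, seed=7)
```

In the toy usage the result should be near the theoretical standard error of the mean, 1/√400 = 0.05, which provides a quick sanity check on the resampling loop.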

Across all scenarios studied, the nonparametric score method typically maintained the smallest bias and best nominal coverage. The risk set regression calibration estimator had good relative performance, in terms of maintaining the smallest mean-squared error. The numerical stability and ease of implementation in standard software make regression calibration attractive for settings with substantial measurement error. In settings where a very large hazard ratio is expected, along with potentially skewed distributions for the involved covariates, the nonparametric score estimator may be preferred. Simulation can be used to explore properties of the estimators for the data structure in a given application, and in particular, explore which method has the best numerical performance given the observed error structure in the data and the expected size of the hazard ratio for the exposure of interest.

Supplementary Material

Supp Material S1

Acknowledgements

This work was partially supported by the National Institute of Allergy and Infectious Diseases, grants CA53996 and CA119171 from the National Cancer Institute, and by contract N01-WH22110 from the National Heart, Lung and Blood Institute. The authors wish to thank the Women's Health Initiative investigators for access to the data used to illustrate the methods presented here. A list of WHI investigators can be found at www.whiscience.org. The WHI program is funded via contract by the National Heart, Lung and Blood Institute, National Institutes of Health.

Footnotes

Supplementary Materials: The Supplementary Materials referenced in Sections 2, 3 and 4 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.

References

  1. Beresford SAA, Johnson KC, Ritenbaugh C, Lasser NL, Snetselaar L, Black HR, et al. Low-fat dietary pattern and risk of colorectal cancer: The Women's Health Initiative randomized controlled dietary modification trial. Journal of the American Medical Association. 2006;295:643–654. doi: 10.1001/jama.295.6.643.
  2. Bingham SA, Cummings JH. Urine nitrogen as an independent validatory measure of dietary intake: A study of nitrogen balance in individuals consuming their normal diet. American Journal of Clinical Nutrition. 1985;42:1276–1289. doi: 10.1093/ajcn/42.6.1276.
  3. Bingham SA, Luben R, Welch A, Wareham N, Khaw KT, Day N. Are imprecise methods obscuring a relation between fat and breast cancer? Lancet. 2003;362:212–214. doi: 10.1016/S0140-6736(03)13913-X.
  4. Buzas JS. Unbiased scores in proportional hazards regression with covariate measurement error. Journal of Statistical Planning and Inference. 1998;67:247–257.
  5. Carroll RJ, Freedman LS, Kipnis V, Li L. A new class of measurement error models, with applications to dietary data. Canadian Journal of Statistics. 1998;26:467–477.
  6. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement error in nonlinear models: A modern perspective. 2nd edition. Chapman and Hall; Boca Raton, Florida: 2006.
  7. Dahm CC, Keogh RH, Spencer EA, Greenwood DC, Key TJ, Fentiman IS, Shipley MJ, Brunner EJ, Cade JE, Burley VJ, Mishra G, Stephen AM, Kuh D, White IR, Luben R, Lentjes MAH, Khaw KT, Rodwell SA. Dietary fiber and colorectal cancer: A nested case-control study using food diaries. Journal of the National Cancer Institute. 2010;102:614–626. doi: 10.1093/jnci/djq092.
  8. Freedman LS, Potischman N, Kipnis V, Midthune D, Schatzkin A, Thompson FE, Troiano RP, Prentice R, Patterson R, Carroll R, Subar AF. A comparison of two dietary instruments for evaluating the fat-breast cancer relationship. International Journal of Epidemiology. 2006;35:1011–1021. doi: 10.1093/ije/dyl085.
  9. Gorfine M, Hsu L, Prentice R. Nonparametric correction for covariate measurement error in a stratified Cox model. Biostatistics. 2004;5:75–87. doi: 10.1093/biostatistics/5.1.75.
  10. Heitman B, Lissner L. Dietary underreporting by obese individuals - is it specific or non-specific. British Medical Journal. 1995;311:986–989. doi: 10.1136/bmj.311.7011.986.
  11. Hu C, Lin DY. Cox regression with covariate measurement error. Scandinavian Journal of Statistics. 2002;29:637–655.
  12. Hu C, Lin DY. Semi-parametric failure time regression with replicates of mismeasured covariates. Journal of the American Statistical Association. 2004;99:105–117.
  13. Hu P, Tsiatis A, Davidian M. Estimating the parameters in the Cox model when covariate variables are measured with error. Biometrics. 1998;54:1407–1419.
  14. Huang Y, Wang CY. Cox regression with accurate covariates unascertainable: A nonparametric correction approach. Journal of the American Statistical Association. 2000;95:1209–1219.
  15. Huang Y, Wang CY. Errors-in-covariates effect on estimating functions: Additivity in limit and nonparametric correction. Statistica Sinica. 2006;16:861–881.
  16. Jiang W, Kipnis V, Midthune D, Carroll R. Parameterization and inference for nonparametric regression problems. Journal of the Royal Statistical Society, Series B. 2001;63:583–591.
  17. Kaaks R. Biochemical markers as additional measurements in studies of the accuracy of dietary questionnaire measurements: Conceptual issues. American Journal of Clinical Nutrition. 1997;65:1232S–1239S. doi: 10.1093/ajcn/65.4.1232S.
  18. Kipnis V, Midthune D, Freedman LS, Bingham S, Schatzkin A, Subar A, Carroll RJ. Empirical evidence of correlated biases in dietary assessment instruments and its implications. American Journal of Epidemiology. 2001;153:394–403. doi: 10.1093/aje/153.4.394.
  19. Liao L, Zucker DM, Li Y, Spiegelman D. Survival analysis with error-prone time-varying covariates: A risk set calibration approach. Biometrics. 2011;67:50–58. doi: 10.1111/j.1541-0420.2010.01423.x.
  20. Nakamura T. Proportional hazards model with covariates subject to measurement error. Biometrics. 1992;48:829–838.
  21. Neuhouser ML, Tinker L, Shaw PA, Schoeller D, Bingham SA, Van Horn L, Beresford SAA, Caan B, Thompson C, Satterfield S, Kuller L, Heiss G, Smit E, Sarto G, Ockene J, Stefanick ML, Assaf A, Runswick S, Prentice RL. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women's Health Initiative. American Journal of Epidemiology. 2008;167:1247–1259. doi: 10.1093/aje/kwn026.
  22. Prentice RL. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika. 1982;69:331–342.
  23. Prentice RL. Measurement error and results from analytic epidemiology: Dietary fat and breast cancer. Journal of the National Cancer Institute. 1996;88:1738–1747. doi: 10.1093/jnci/88.23.1738.
  24. Prentice RL, Caan B, Chlebowski RT, Patterson R, Kuller LH, Ockene JK, et al. Low-fat dietary pattern and risk of invasive breast cancer: The Women's Health Initiative randomized controlled dietary modification trial. Journal of the American Medical Association. 2006;295:629–642. doi: 10.1001/jama.295.6.629.
  25. Prentice RL, Huang Y. Measurement error modeling and nutritional epidemiology association analyses. Canadian Journal of Statistics. 2011. Early online access, Epub 2011 Jul 27. doi: 10.1002/cjs.10116.
  26. Prentice RL, Huang Y, Kuller L, Tinker L, Van Horn L, Stefanick M, Sarto G, Ockene J, Johnson K. Biomarker-calibrated energy and protein consumption and cardiovascular disease risk among postmenopausal women. Epidemiology. 2011;22:170–179. doi: 10.1097/EDE.0b013e31820839bc.
  27. Prentice RL, Shaw PA, Bingham S, Beresford SAA, Caan B, Neuhouser ML, Patterson RE, Stefanick ML, Satterfield S, Thomson CA, Snetselaar L, Thomas A, Tinker LF. Biomarker-calibrated energy and protein consumption and increased cancer risk among postmenopausal women. American Journal of Epidemiology. 2009;169:977–989. doi: 10.1093/aje/kwp008.
  28. Prentice RL, Sugar E, Wang CY, Neuhouser M, Patterson R. Research strategies and the use of nutrient biomarkers in studies of diet and chronic disease. Public Health Nutrition. 2002;5:977–984. doi: 10.1079/PHN2002382.
  29. Schoeller DA. Measurement of energy expenditure in free-living humans by using doubly labeled water. Journal of Nutrition. 1988;118:1278–1289. doi: 10.1093/jn/118.11.1278.
  30. Song X, Huang Y. On corrected score approach for proportional hazards model with covariate measurement error. Biometrics. 2005;61:702–714. doi: 10.1111/j.1541-0420.2005.00349.x.
  31. Stefanski LA, Carroll RJ. Conditional scores and optimal scores for general linear measurement-error models. Biometrika. 1987;74:703–716.
  32. Subar AF, Kipnis V, Troiano RP, Midthune D, Schoeller D, Bingham S, Sharbaugh C, Trabulsi J, Runswick S, Ballard-Barbash R, Sunshine J, Schatzkin A. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: The OPEN study. American Journal of Epidemiology. 2003;158:1–13. doi: 10.1093/aje/kwg092.
  33. Sugar EA, Wang CY, Prentice RL. Logistic regression with exposure biomarkers and flexible measurement error. Biometrics. 2007;63:143–151. doi: 10.1111/j.1541-0420.2006.00632.x.
  34. Tsiatis AA, Davidian M. A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika. 2001;88:447–458.
  35. Wang CY. Robust sandwich covariance estimation for regression calibration estimator in Cox regression with measurement error. Statistics & Probability Letters. 1999;45:371–378.
  36. Wang CY, Hsu L, Feng ZD, Prentice RL. Regression calibration in failure time regression. Biometrics. 1997;53:131–145.
  37. Women's Health Initiative Study Group. Design of the Women's Health Initiative clinical trial and observational study. Controlled Clinical Trials. 1998;19:61–109. doi: 10.1016/s0197-2456(97)00078-0.
  38. World Cancer Research Fund/American Institute for Cancer Research. Food, nutrition, physical activity, and the prevention of cancer: A global perspective. American Institute for Cancer Research; Washington, DC: 2007.
  39. Xie SX, Wang CY, Prentice RL. A risk set calibration method for failure time regression by using a covariate reliability sample. Journal of the Royal Statistical Society, Series B. 2001;63:855–870.
