Summary
Studies of clinical characteristics frequently measure covariates with a single observation. This may be a mis-measured version of the “true” phenomenon due to sources of variability like biological fluctuations and device error. Descriptive analyses and outcome models that are based on mis-measured data generally will not reflect the corresponding analyses based on the “true” covariate. Many statistical methods are available to adjust for measurement error. Imputation methods like regression calibration and moment reconstruction are easily implemented but are not always adequate. Sophisticated methods have been proposed for specific applications like density estimation, logistic regression, and survival analysis. However, it is frequently infeasible for an analyst to adjust each analysis separately, especially in preliminary studies where resources are limited. We propose an imputation approach called Moment Adjusted Imputation (MAI) that is flexible and relatively automatic. Like other imputation methods, it can be used to adjust a variety of analyses quickly, and it performs well under a broad range of circumstances. We illustrate the method via simulation and apply it to a study of systolic blood pressure and health outcomes in patients hospitalized with acute heart failure.
Keywords: Conditional score, Measurement error, Non-linear models, Regression calibration
1. Introduction
In clinical studies, biological covariates are often measured only at baseline, and this measurement includes noise due to natural fluctuations or other sources. The quantity of interest may be the average over fluctuations. For example, using data from the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF) registry, Gheorghiade et al. (2006) studied the relationship between systolic blood pressure at hospital admission and mortality in patients with acute heart failure, modeling in-hospital mortality with logistic regression and post-discharge mortality with Cox proportional hazards models. Blood pressure was determined by a single, in-hospital measurement; however, many studies have demonstrated large fluctuations in systolic blood pressure, and the average of many longitudinal measurements is more strongly correlated with outcomes (Brueren et al., 1997; Pickering et al., 2005; Marshall, 2008). In other words, outcomes are more directly related to an underlying blood pressure level, averaged over fluctuations. Similarly, in descriptive analysis, an unobserved “true” value may be more relevant than a noisy baseline measure.
Measurement error in covariates can also be attributed to device error, assay error, and instrumentation. Complications arise when an imperfect measurement, W, is observed in place of a latent variable, X, and desired inferences involve X. When interest focuses on estimation of the density of X, this presents an obvious problem because the mis-measured data, W, are over-dispersed relative to the distribution of X (Eddington, 1940; Tukey, 1974). When X is a covariate in a regression model for an outcome Y, estimators obtained when W is substituted for X may be substantially biased (Liu et al., 1978) and statistical power may be compromised (Freudenheim and Marshall, 1988; Carroll et al., 2006).
Many strategies to adjust for measurement error depend on the specific modeling or regression context. Correction for measurement error in linear model covariates is commonly achieved by regression calibration (RC) (Carroll and Stefanski, 1990; Gleser, 1990), which substitutes an estimate of the conditional mean E(X∣W) for the unknown X. The resulting linear regression estimates the underlying parameters of interest. When the distribution of X is of interest, E(X∣W) is over-corrected in terms of having reduced spread (Eddington, 1940; Tukey, 1974). Regression calibration is also implemented in non-linear models because of its simplicity, but it is typically most effective for generalized linear models when the measurement error is not large (Rosner, Spiegelman, and Willett, 1989; Carroll et al., 2006). Other approaches for non-linear models are described by Carroll et al. (2006) and include structural models, which regard X as a random variable, and functional models, in which X is treated as a fixed parameter. Structural methods like maximum likelihood yield efficient estimation but require that the density of X, fX(x), be known or well approximated. Conditional score methods use estimating equations derived from the distribution of the observed data conditional on sufficient statistics for the unobserved X and include estimators that are efficient among functional methods. These have been developed for generalized linear models (Stefanski and Carroll, 1985, 1987), survival analysis (Tsiatis and Davidian, 2001), and joint models for longitudinal data and a primary endpoint (Li, Zhang, and Davidian, 2004).
The preceding methods target estimation of parameters in a specific regression context; a different method must be implemented for every type of regression model in which X is used. This would be burdensome in the OPTIMIZE-HF study, where the mis-measured variable is used in multiple analyses. An alternative approach is to focus on re-creating the true X from the observed W, at least approximately, as the primary quantity of interest or as a means to improving parameter estimation. This has been explored from a Bayesian perspective (Tukey, 1974; Louis, 1984; Shen and Louis, 1998; Freedman et al., 2004). Freedman et al. (2004) aim to replace the mis-measured data W with estimators that have asymptotically the same joint distribution with Y as does X. They implement a more practical approximation to this idea by focusing on only the first two moments of the joint distribution. Their moment reconstruction (MR) method is based on adjusted data, XMR, defined so that E(XMR), Var(XMR) and Cov(XMR, Y) are equal to the corresponding moments of X. They impute estimates, X̂MR, in a range of applications and demonstrate that this approach yields good results for normally distributed X (Freedman et al., 2004, 2008).
When X has a normal distribution, it may suffice to match two moments; bias in linear and logistic regression parameter estimators will be eliminated. More generally, this is not adequate, and an extension of MR to higher-order moments and cross-products is suggested. To the best of our knowledge, the idea of computing higher-order, moment-adjusted estimates of the true X originated with the unpublished dissertation research of Bay (1997). We expand on this work, calling the method Moment Adjusted Imputation (MAI). Our method retains the convenience of other imputation methods, in that, once the adjusted values are obtained, it can be used across a variety of analyses on the same data set using standard software.
In this paper, we demonstrate the benefit of MAI, particularly for density estimation and logistic regression where X is non-normal. In Section 2, we define the MAI algorithm and relate it to other imputation methods. In Sections 3 and 4, we use simulation studies to compare adjustment procedures for kernel density estimation and for estimating non-linear regression coefficients, respectively. In Section 5, we adjust the previous OPTIMIZE-HF analysis to account for measurement error and obtain estimates that describe features of the “true” blood pressure.
2. The Moment Adjusted Imputation Method
We describe the method of Bay (1997). Consider mis-measured observations Wi, i = 1, …, n, assumed independent across i, where Wi = Xi + Ui with Ui ∼ N(0, σui²), U = (U1, …, Un)T independent of X = (X1, …, Xn)T, and the Ui mutually independent. The Wi may be mis-measured versions of a scalar covariate Xi in a regression model or subject-specific estimates of scalar random effects Xi in a mixed effects model with estimation uncertainty represented by Ui. No assumptions are made about the unobservable latent variables X1, …, Xn; they could be an independent, identically distributed (iid) random sample from some unknown distribution, as in a structural model, or fixed constants. We focus on the iid case, where notation is simpler. Assume that the σui² are known, as is common in measurement error models.
The objective is to construct adjusted versions of the Wi, X̂i, say, where the first M sample moments of X̂i unbiasedly estimate the corresponding moments of Xi; i.e., E(n−1 Σi X̂i^r) = E(Xi^r), r = 1, …, M. The distribution of X̂i thus approximates that of Xi up to M moments. We can also match cross-product moments between model variables. Consider the simple linear regression model, E(Yi∣Xi) = β0 + βX Xi. The naive estimator of β = (β0, βX)T, based on Wi, is β̂N = A−1(Σi Yi, Σi WiYi)T, where A is the 2 × 2 matrix with rows (n, Σi Wi) and (Σi Wi, Σi Wi²). Inspection of β̂N suggests matching the first two moments of Xi and the cross-product with Yi, so that M = 2 and E(n−1 Σi X̂iYi) = E(XiYi). Freedman et al. (2004) match these same moments. It is straightforward to show that β̂ = Â−1(Σi Yi, Σi X̂iYi)T, where Â is A with Wi replaced by X̂i, is consistent for β. Similarly, in the multiple linear regression model E(Yi∣Xi, Zi) = β0 + βXXi + βZZi, with error-free covariate Zi, both βX and βZ are consistently estimated if the X̂i also satisfy E(n−1 Σi X̂iZi) = E(XiZi). In non-linear models, where parameter estimators depend on higher-order moments and cross-products, we propose to match these as well.
In general, we wish to find X̂i with E(n−1 Σi X̂i^r Vik) = E(Xi^r Vik) for r = 1, 2, …, Mk, where Vik is the (i, k) element of V = (1, Y, Z), 1 is a n × 1 vector of ones, Y = (Y1, …, Yn)T, and Z is a n × (K − 2) matrix whose columns are the values of K − 2 error-free covariates, for i = 1, …, n and k = 1, …, K. Because V includes a vector of ones, matching cross-products with the columns of V includes matching moments. We make the common surrogacy assumption that Wi is conditionally independent of Vik given Xi (Carroll et al., 2006, Section 2.5) and use this in the following implementation. Note that V could be defined to include higher powers of the components of Y and Z, thus matching moments of the form E(Xi^r Yi^s). We do not take this approach here, as we have found it unnecessary to achieve good results.
2.1 Implementation
The first step is to find estimators m̂rk so that E(m̂rk) = E(X^r Vk), k = 1, …, K. Based on the normality of Ui, unbiased estimators for the moments of Xi can be found as follows. Define Hermite polynomials by the recursive formulæ H0(z) = 1, H1(z) = z, Hr(z) = zHr−1(z) − (r − 1)Hr−2(z) for r ≥ 2 (Cramer, 1957). Stulajter (1978) proved that, if W ∼ N(μ, σ²), then E{σ^r Hr(W/σ)} = μ^r (Stefanski, 1989; Cheng and Van Ness, 1999). Letting Ψr(Wi) = σui^r Hr(Wi/σui), we have E{Ψr(Wi)∣Xi} = Xi^r, and we take m̂rk = n−1 Σi Ψr(Wi)Vik. The estimators are unbiased under the surrogacy assumption because E{Ψr(Wi)Vik} = E[Vik E{Ψr(Wi)∣Xi, Vik}] = E(Xi^r Vik).
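As a concrete illustration, here is a minimal R sketch of this moment step, assuming a common known error variance su2; the function name hermite_moment is ours, not the paper's.

```r
# Unbiased estimation of E(X^r) from W = X + U with U ~ N(0, su2),
# via the identity E{ s^r H_r(W/s) | X } = X^r for s = sqrt(su2).
hermite_moment <- function(W, su2, r) {
  s <- sqrt(su2)
  z <- W / s
  H <- list(rep(1, length(W)), z)          # H_0(z) = 1, H_1(z) = z
  if (r >= 2) for (k in 2:r) {
    H[[k + 1]] <- z * H[[k]] - (k - 1) * H[[k - 1]]   # Hermite recursion
  }
  mean(s^r * H[[r + 1]])                   # sample average estimates E(X^r)
}
# Example: for r = 2 this returns mean(W^2) - su2, removing the inflation
# that the naive second moment of W would carry.
```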
The adjusted X̂i are obtained by minimizing Σi (X̂i − Wi)² subject to constraints on the moments and cross-products. For the kth column of V, Mk constraints are imposed. For a particular matrix V, the vector M = (M1, …, MK) describes the number of cross-products matched with each of its columns. Using Lagrange multipliers Λ = (λ11, …, λMKK), the objective function is

QMK(X1, …, Xn, Λ) = Σi (Xi − Wi)² + Σk Σr λrk (n−1 Σi Xi^r Vik − m̂rk),  (1)

where the sums over k and r run over k = 1, …, K and r = 1, …, Mk.
We take the derivative of QMK with respect to (X1, …, Xn, Λ), equate this to 0, and solve for (X̂1, …, X̂n, Λ̂) by Newton-Raphson (see Web Appendix A). The resulting adjusted data are defined implicitly as X̂i = h(Wi, Vi, Λ̂). The solution X̂i is then substituted for Xi in the standard methods of estimation that would be performed if Xi were observed.
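As an illustration of the algorithm, the following is a minimal sketch for the special case M = (2, 1) with V = (1, Y) and common error variance su2. Here the stationarity condition of (1) is linear in each Xi, so the Xi can be eliminated analytically and Newton-Raphson applied to the three constraint equations in the (rescaled) multipliers alone; the name mai_21 is ours, and the general case requires the full system of Web Appendix A.

```r
mai_21 <- function(W, Y, su2) {
  # Unbiased targets: E(X), E(X^2) (Hermite H_2 adjustment), and E(XY)
  m1  <- mean(W)
  m2  <- mean(W^2) - su2
  mxy <- mean(W * Y)                 # unbiased for E(XY) under surrogacy
  # Stationarity of (1), multipliers rescaled by n:
  #   2(X_i - W_i) + l1 + 2*l2*X_i + l3*Y_i = 0
  xhat <- function(l) (2 * W - l[1] - l[3] * Y) / (2 + 2 * l[2])
  g <- function(l) {                 # constraint residuals
    x <- xhat(l)
    c(mean(x) - m1, mean(x^2) - m2, mean(x * Y) - mxy)
  }
  l <- c(0, 0, 0)
  for (it in 1:100) {
    r <- g(l)
    if (max(abs(r)) < 1e-10) break
    J <- sapply(1:3, function(j) {   # numerical Jacobian, central differences
      e <- rep(0, 3); e[j] <- 1e-6
      (g(l + e) - g(l - e)) / 2e-6
    })
    l <- l - solve(J, r)             # Newton-Raphson step
  }
  xhat(l)
}
```

In this special case the adjusted values replicate regression calibration (see Section 2.2); the value of the template is that the same constrained-minimization structure extends to higher-order matching.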
In a simple case it is possible to obtain an analytical solution that minimizes objective function (1). For a scalar X in the absence of additional covariates, the estimator that matches two moments is X̂i = Wiα̂ + W̄(1 − α̂), where W̄ = n−1 Σi Wi, σ̂w² = n−1 Σi (Wi − W̄)², σ̄u² = n−1 Σi σui², σ̂x² = σ̂w² − σ̄u², and α̂ = (σ̂x²/σ̂w²)^{1/2}. It is easy to see that the X̂i are not independent because they depend on estimated moments. In the general case, X̂i = h(Wi, Vi, Λ̂) are dependent for the same reason. In applications where X̂i is substituted for Xi, the usual standard errors, which assume independent data, should not be used. We recommend that standard errors for analyses involving X̂i be obtained by bootstrapping. In specific applications, below, we derive modifications of the usual standard errors.
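A direct transcription of this closed form, assuming a common known error variance su2 (mai_2 is our label for the hypothetical helper):

```r
# Two-moment MAI: shrink toward the sample mean with exponent-1/2 factor.
mai_2 <- function(W, su2) {
  wbar  <- mean(W)
  s2w   <- mean((W - wbar)^2)              # sigma-hat_w^2
  alpha <- sqrt(max(s2w - su2, 0) / s2w)   # alpha-hat = (sigma_x^2/sigma_w^2)^(1/2)
  alpha * W + (1 - alpha) * wbar
}
```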
2.2 Implications
The MAI estimator, X̂i = Wiα̂ + W̄(1 − α̂), is familiar. A very similar estimator was developed via a multi-stage loss function by Louis (1984), with the intention of matching only two moments. Except for the exponent 1/2 in α̂, X̂i resembles the empirical Bayes estimator, which is known to have variance smaller than the posterior expected variance (Louis, 1984). MAI puts more weight on Wi than does empirical Bayes and thus provides an alternative to empirical Bayes when one is concerned about the problem of over-shrinkage.
MAI maintains the desirable properties of RC and MR with additional benefits. In fact, in simple linear regression we can replicate the RC and MR parameter estimates by matching two moments and a cross-product with the response, so M = (M1, M2) = (2, 1) for V = (1, Y). The adjusted data are then linear in (Wi, Yi), X̂i = λ̂0 + λ̂1Wi + λ̂2Yi, with coefficients determined by the three moment constraints. The resulting slope estimator is equivalent to the RC estimator, which rescales the naive slope by σ̂w²/(σ̂w² − σ̄u²), and is also identical to MR (Bay, 1997; Freedman et al., 2004). In addition, the X̂i converge in probability to X* = λ0 + λ1W + λ2Y, where (λ0, λ1, λ2) are such that E(X*) = E(X), Var(X*) = Var(X), and Cov(X*, Y) = Cov(X, Y). (X*, Y) has the same distribution as (X, Y) when this is multivariate normal. This allows for consistent estimation of the regression error variance (Freedman et al., 2004). RC and MR are not identical in logistic regression, where only the latter may be consistent under normality (Freedman et al., 2004). When Y is binary and (X∣Y) is normal, letting M = (2, 2) produces X̂ that converge to some X**, and (X**, Y) has the same distribution as (X, Y). As with MR, this leads to consistency of parameter estimation in linear logistic regression, but also in quadratic logistic regression (see Web Appendix B). We do not expect parameter estimators based on imputation methods to be consistent outside of specific cases in linear and logistic regression. However, substantial reduction in bias may be achieved with reasonable convenience.
The proposed method replicates the adjusted estimator of Cheng and Schneeweiss (1998) for polynomial regression, which is obtained without adjusting data. In polynomial regression, there is a closed form solution for the coefficients that depends only on sample moments and cross-products, so the unbiased estimators m̂rk can be substituted directly into this solution. These authors use the same Hermite polynomials to obtain m̂rk for normally distributed measurement error; because MAI creates adjusted data with these unbiased moments, it replicates their estimator. For consistent estimation of a quadratic polynomial regression in X, it is necessary to substitute unbiased estimators for four moments and second-order cross-products, i.e., M = (4, 2) for V = (1, Y). Although closed form solutions are not available for many non-linear models, non-linearity is often well approximated by a lower-order polynomial. This suggests that these estimators are largely determined by lower-order moments and cross-products, so MAI could result in negligible bias.
2.3 Practical considerations
In practice, the m̂rk may not be a valid moment sequence. If our aim is only to match moments, but not cross-products (k = 1), it is well known (Shohat and Tamarkin, 1943) that a sequence of 2q + 1 moments m0 = 1, m1, …, m2q is valid if the Hankel determinants satisfy

det{(mi+j)i,j=0,…,p} ≥ 0 for p = 1, …, q.
Checking these determinants will identify the number of valid moments for a given data set. We address how many of these valid moments should be used in the sequel.
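A small sketch of this determinant check (hankel_ok is our own helper; m holds the estimated moments m1, …, m2q, with m0 = 1 implicit):

```r
# Returns, for p = 1, ..., q, whether the order-p Hankel determinant is >= 0;
# the largest p with all checks TRUE indicates how many moments are usable.
hankel_ok <- function(m) {
  q  <- floor(length(m) / 2)
  mm <- c(1, m)                                   # prepend m_0 = 1
  sapply(1:q, function(p) {
    Hk <- outer(0:p, 0:p, function(i, j) mm[i + j + 1])  # Hankel matrix
    det(Hk) >= 0
  })
}
```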
For the purpose of matching an arbitrary number of moments and cross-products, it is less clear how to identify a valid collection. For this discussion, let there be a single error-free covariate Z. Consider a limited set of M = (M1, M2 = M1/2, M3 = M1/2), where the number of moments, M1, is even and the order of cross-products is (M1/2 + 1). This corresponds to matching the variance-covariance matrix of (1, X, …, XM1/2, Y, Z), and therefore has a nice interpretation. It is also the set of moments that must be matched to achieve consistent parameter estimation in polynomial regression of order M1/2. In addition, simulations indicate that letting M1 be odd can lead to “outlying” X̂i if the true distribution of Xi is extremely skewed (Bay, 1997). When M1 = 4, we obtain m̂11, m̂21, m̂31, m̂41, m̂12, m̂22, m̂13, and m̂23, with expectations E(X), E(X²), E(X³), E(X⁴), E(XY), E(X²Y), E(XZ), and E(X²Z), respectively. This set of moments defines the estimated variance-covariance matrix for (X, X², Y, Z) (the row and column for the constant are trivial and omitted), given by

  [ m̂21 − m̂11²     m̂31 − m̂11m̂21   m̂12 − m̂11Ȳ   m̂13 − m̂11Z̄ ]
  [ m̂31 − m̂11m̂21  m̂41 − m̂21²     m̂22 − m̂21Ȳ   m̂23 − m̂21Z̄ ]
  [ m̂12 − m̂11Ȳ    m̂22 − m̂21Ȳ    s²Y            sYZ ]
  [ m̂13 − m̂11Z̄    m̂23 − m̂21Z̄    sYZ            s²Z ]

where Ȳ, Z̄, s²Y, s²Z, and sYZ are the sample means, variances, and covariance of the error-free variables Y and Z.
We propose checking that this matrix is positive definite to verify a valid set of moment estimates. The MAI data are denoted X̂M. For example, adjusted data based on matching two moments are denoted X̂2, and X̂4,2 denotes adjusted data derived by matching four moments and two cross-products.
Occasionally, numerical problems may arise when these moments cannot be matched with a data set of size n. Even if the moments form a valid sequence, there may not exist an empirical distribution function taking jumps of size 1/n at each of n points that has these moments. When this occurs, the Newton-Raphson algorithm will not converge to a solution. It is rare to encounter this problem when only moments, but not cross-products, are involved. When it does occur, we propose matching fewer moments.
3. Histogram and Kernel Density Estimation
In this and the next section, we demonstrate the utility of MAI in several representative analysis contexts. A simple and useful application is to adjust mis-measured data to approximate an underlying error-free variable when interest focuses on its distribution. A histogram or kernel density estimate (KDE) based on mis-measured data is too flat and dispersed. We illustrate for three distributions of X: N(0,1); chi-square with 4 degrees of freedom; and a bimodal mixture of normals, 0.30N(0,1) + 0.70N(5,1), the latter two standardized to have mean 0 and variance 1. Measurement error with variance σu² = 1 is added to X, corresponding to large measurement error, with reliability ratio (RR) Var(X)/Var(W) = 0.5. Using the default bandwidth in the R density() function, KDEs for a simulated data set for each distribution are displayed in Figure 1. Those based on X̂4 have features more like the density estimates that would be obtained from the true X. Because the normal distribution is completely defined by its first two moments, there is no benefit to matching additional moments. However, there is great improvement from matching four moments when X is chi-square or bimodal, as the KDE based on X̂4 is substantially closer to that based on X.
Figure 1.
Kernel Density Estimation; solid line: KDE of X, dark-dashed line: KDE of X̂4, light-dotted lines: KDE of X̂2 and W; n = 2000 and RR = 0.50.
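To give a concrete flavor of this comparison, here is a sketch for the bimodal mixture using the closed-form two-moment helper mai_2 from Section 2 (the four-moment version X̂4 requires the general Newton-Raphson algorithm):

```r
set.seed(1)
n   <- 2000
X   <- ifelse(runif(n) < 0.3, rnorm(n, 0, 1), rnorm(n, 5, 1))
X   <- (X - mean(X)) / sd(X)            # standardize: mean 0, variance 1
su2 <- 1                                # RR = Var(X)/Var(W) = 0.5
W   <- X + rnorm(n, 0, sqrt(su2))
plot(density(X), lwd = 2, main = "")    # KDE of the true X
lines(density(W), lty = 3)              # mis-measured: too flat, over-dispersed
lines(density(mai_2(W, su2)), lty = 2)  # two-moment adjustment
```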
We conducted a Monte Carlo simulation to investigate more generally the extent of improvement possible and to identify the best number of moments to match. Several situations were considered, including the three distributions for X above; two levels of measurement error with RRs of 0.75 and 0.5; and three sample sizes, n = 300, 1000, and 2000, typical for measurement error applications (Stefanski and Carroll, 1985; Freedman et al., 2004).
We compare KDE based on MAI to that based on alternative methods of obtaining adjusted data. The first is regression calibration, where E(Xi∣Wi) is estimated by the best linear unbiased predictor X̂RC,i = μ̂x + (σ̂wx/σ̂w²)(Wi − μ̂x), and μ̂x, σ̂wx, and σ̂w² are estimates obtained by the method of moments. This is the empirical Bayes estimator for Xi when X is normally distributed; however, the same linear estimator is used regardless of the actual distribution of X. To account for non-normality, we also consider a different estimator of E(Xi∣Wi) obtained by assuming Xi has density represented by the flexible family of semi-nonparametric (SNP) densities fX(x∣μ, σ, α) (Gallant and Nychka, 1987; Davidian and Gallant, 1992, 1993; Zhang and Davidian, 2001; Carroll et al., 2006), which involve parameters μ, σ, and α and can approximate many potential latent variable distributions. The family has a convenient form so that the corresponding density fW(w∣μ, σ, α) of W can be obtained by integration over x, and the parameters (μ, σ, α) can be estimated from the observed data by maximum likelihood. This approach naturally provides density estimation in the form of fX(x∣μ̂, σ̂, α̂). We took the extra step of estimating E(Xi∣Wi) as X̂SNP,i = ∫ x fX,W(x, Wi∣μ̂, σ̂, α̂) dx / fW(Wi∣μ̂, σ̂, α̂).
The versions of X̂i are evaluated according to their closeness to the underlying Xi, as measured by MSE(X̂) = B−1 Σb n−1 Σi (X̂i − Xi)² over B simulated data sets, and by the integrated squared error between the empirical distribution functions, ISE(GX̂) = B−1 Σb ∫ {GX̂(t) − GX(t)}² dt, where GX̂(t) = n−1 Σi I(X̂i ≤ t), −∞ < t < ∞. When X is assumed to have the SNP density, we can estimate the cumulative distribution function, cdfSNP, directly from fX(x∣μ̂, σ̂, α̂). Density estimation using the SNP family is well established, so we calculate ISE(cdfSNP) as a gold standard. We report the MSE ratio MSE(W)/MSE(X̂) and ISE ratio ISE(GW)/ISE(GX̂), so that larger ratios indicate a greater reduction in error. Standard errors for these ratios are obtained by the delta method and are reported as a “coefficient of variation,” the standard error of the ratio divided by the ratio itself.
Results for n = 1000 are displayed in Table 1; those for n = 300 and n = 2000 are similar (see Web Appendix C). For normally distributed X, there is very little difference in the MSE ratio based on matching two or four moments, and there is no additional benefit from matching six. When it is not important to get each X̂i close to the original Xi, but we instead want an ensemble X̂1, …, X̂n with distribution similar to that of X1, …, Xn, the ISE ratios indicate that it is better to match only two moments. However, when fX(x) is chi-square or the bimodal normal mixture, both ratios indicate that it is better to match four moments rather than two or six. Moments greater than four may not be as essential in describing distributions, and their estimators are likely to be highly variable. A general recommendation is to match four moments.
Table 1.
Simulation results for three latent variable distributions, fX(x); two reliability ratios (RR); B = 500 simulated data sets; and n = 1000. Statistics reported: (a) MSE(W)/MSE(X̂), where MSE(X̂) = B−1 Σb n−1 Σi (X̂i − Xi)² (coefficient of variation ≈ 0.001), and (b) ISE(GW)/ISE(GX̂), where ISE(GX̂) = B−1 Σb ∫ {GX̂(t) − GX(t)}² dt for GX̂(t) = n−1 Σi I(X̂i ≤ t) (coefficient of variation ≈ 0.02). Adjusted data X̂: RC, regression calibration; MAI matching 2, 4, or 6 moments, respectively; and SNP, semi-nonparametric. cdfSNP is the estimated cumulative distribution function calculated by integrating the estimated SNP density.
| Distribution | RR | X̂RC | X̂2 | X̂4 | X̂6 | X̂SNP | cdfSNP |
|---|---|---|---|---|---|---|---|
| (a) MSE(W)/MSE(X̂) | | | | | | | |
| Normal | 0.75 | 1.33 | 1.24 | 1.24 | 1.23 | 1.32 | - |
| | 0.50 | 1.99 | 1.71 | 1.70 | 1.55 | 1.99 | - |
| Chi Sq df=4 | 0.75 | 1.33 | 1.24 | 1.38 | 1.37 | 1.43 | - |
| | 0.50 | 1.99 | 1.71 | 1.88 | 1.79 | 2.15 | - |
| Bimodal | 0.75 | 1.33 | 1.24 | 1.50 | 1.43 | 1.64 | - |
| | 0.50 | 2.00 | 1.71 | 1.79 | 1.64 | 2.15 | - |
| (b) ISE(GW)/ISE(GX̂) | | | | | | | |
| Normal | 0.75 | 1.09 | 7.90 | 7.03 | 4.20 | 1.08 | 5.99 |
| | 0.50 | 1.34 | 23.72 | 11.65 | 0.82 | 1.33 | 9.64 |
| Chi Sq df=4 | 0.75 | 1.39 | 2.41 | 6.96 | 5.15 | 0.61 | 1.08 |
| | 0.50 | 1.74 | 4.39 | 10.99 | 3.15 | 1.19 | 1.91 |
| Bimodal | 0.75 | 0.81 | 1.32 | 5.05 | 1.94 | 2.35 | 2.71 |
| | 0.50 | 0.86 | 1.90 | 4.13 | 1.29 | 1.48 | 4.99 |
Comparison of MAI, X̂4, to X̂RC and X̂SNP shows that the latter two methods do a better job of getting each X̂i close to Xi and have slightly larger MSE ratios than those for X̂4, regardless of the distribution of X. It is not surprising that conditional expectations yield good estimation of the individual X̂i. However, these expectations are known to be less variable than the original data, so the distributions of X̂RC and X̂SNP may not resemble that of X (Eddington, 1940; Tukey, 1974). In fact, the ISE ratios for these methods are close to 1 or even smaller than 1, indicating that the empirical distributions based on these X̂ are no better than that based on W. Thus, when interest focuses on density estimation, imputation of X̂RC or X̂SNP is inadequate. ISE ratios for X̂4 are at least as large as those of the gold standard, cdfSNP, particularly for chi-square X. The SNP density estimator used here is based on the normal distribution and is not ideal for estimating skewed densities. ISE ratios for X̂4 reflect large improvement in density estimation relative to W, confirming the impression suggested by Figure 1. When the ensemble of measurements is of interest, rather than the individual specific observations, MAI provides an attractive alternative.
4. Regression Models
We evaluate various methods that adjust for covariate measurement error in common non-linear regression models over a range of conditions, including the distribution of X (N(0,1), standardized chi-square with df = 4, standardized bimodal mixture of normals; see Section 3); sample size (n = 300, 1000, 2000); measurement error variance (moderate, σu² = 1/3, RR = 0.75; large, σu² = 1, RR = 0.50); and underlying model parameters (levels depend on the specific model). An additional error-free covariate Z was generated from the same distribution as X such that Corr(Z, X) = 0.4. The underlying model parameters control the extent to which the model deviates from linearity. To get a general understanding of performance in different circumstances, we varied the model parameters and the strength of the covariate effects.
We consider three imputation methods as well as the conditional score method (Stefanski and Carroll, 1987; Tsiatis and Davidian, 2001). For RC, X̂RC is the estimated best linear unbiased predictor of E(X∣W, Z). We use the modification of MR proposed by Freedman et al. (2004) that involves conditioning on the error-free covariate, so that X̂MR,i = Ê(W∣Yi, Zi)(1 − Ĝi) + WiĜi, where Ĝi = {V̂ar(X∣Yi, Zi)/V̂ar(W∣Yi, Zi)}^{1/2}. For MAI, we use X̂4,2,2, for which E(X), E(X²), E(X³), E(X⁴), E(XY), E(X²Y), E(XZ), and E(X²Z) are matched. Some alternatives are discussed below in the context of specific models.
When X̂4,2,2 are used in a regression model, the usual standard errors for regression parameter estimates are not correct. As described previously, standard errors can be obtained by bootstrapping. When the estimators are M-estimators, standard errors may be obtained by the empirical sandwich approach. The equations that determine Λ̂ can be stacked with the usual equations in which the unknown Xi are replaced with h(Wi, Vi, Λ̂) (Bay, 1997; Stefanski and Boos, 2002; Carroll et al., 2006); see Web Appendix D.
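To make the bootstrap recommendation concrete, here is a minimal sketch for a logistic fit, reusing the hypothetical mai_21 helper from Section 2 (the same pattern applies with X̂4,2,2). The key point is that the adjustment is recomputed within every resample, since the X̂i depend on moments estimated from the full sample.

```r
boot_se <- function(Y, W, Z, su2, B = 200) {
  ests <- replicate(B, {
    idx <- sample(length(Y), replace = TRUE)   # resample (Y, W, Z) jointly
    Xh  <- mai_21(W[idx], Y[idx], su2)         # re-adjust within the resample
    coef(glm(Y[idx] ~ Xh + Z[idx], family = binomial))
  })
  apply(ests, 1, sd)                           # bootstrap SE per coefficient
}
```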
4.1 Logistic Regression
The model for the outcome is P(Y = 1∣X, Z) = F(β0 + βXX + βZZ) where F(υ) = {1 + exp(−υ)}−1. We simulated data from two parameter settings, (β0, βX, βZ) = (−1.5, 1, 1) and (β0, βX, βZ) = (−0.6, 0.3, 0.3). The first is similar to Freedman et al. (2004) and corresponds to substantial non-linearity, strong covariate effects, and event rate P(Y = 1) ≈ 0.30. For the second, P(Y = 1∣X, Z) is nearly linear in the range of X, the effect of X is moderate, and the event rate P(Y = 1) ≈ 0.36. The observed data are Yi, Wi, and Zi, for i = 1,…, n.
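A minimal simulation sketch of the first setting with normal X, comparing the naive fit to an imputation fit; for brevity it uses the M = (2, 1) helper mai_21 sketched in Section 2 rather than the paper's X̂4,2,2:

```r
set.seed(2)
n   <- 2000
su2 <- 1                                      # RR = 0.50
X   <- rnorm(n)
Z   <- 0.4 * X + sqrt(1 - 0.4^2) * rnorm(n)   # Corr(Z, X) = 0.4
Y   <- rbinom(n, 1, plogis(-1.5 + X + Z))     # (beta0, betaX, betaZ) = (-1.5, 1, 1)
W   <- X + rnorm(n, 0, sqrt(su2))
coef(glm(Y ~ W + Z, family = binomial))       # naive: beta_X attenuated
Xh  <- mai_21(W, Y, su2)
coef(glm(Y ~ Xh + Z, family = binomial))      # measurement-error adjusted fit
```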
Boxplots of the estimated coefficients β̂X from B = 500 simulations are displayed in Figure 2 for the case where (β0, βX, βZ) = (−1.5, 1, 1), n = 2000, and σu² = 1. When X is normally distributed, the RC estimator for βX shows slight bias, but has the least variability. The other methods are unbiased and have similar variability. RC and MR are expected to perform well when X is normally distributed, and there is nothing to be gained from information about higher-order moments. However, it is reassuring to see that the increase in variability from matching additional moments is not substantial. When the latent variable distribution is either chi-square or bimodal, the RC and MR estimators are biased. Only MAI and CS appear unbiased, and these have similar variability.
Figure 2.
Boxplots of β̂X from B = 500 simulated data sets, for three distributions of X, where P(Y = 1∣X, Z) = F(β0 + βXX + βZZ) with true values (β0, βX, βZ) = (−1.5, 1, 1), σu² = 1, and n = 2000. Method: W, naive; RC, regression calibration; MR, moment reconstruction; MAI, moment adjusted imputation with M = (4, 2, 2); CS, conditional score.
Other results are presented in Web Appendix E. When the underlying coefficients are (β0, βX, βZ) = (−1.5, 1, 1), results for other sample sizes and error levels are similar to those in Figure 2. When the underlying coefficients are (β0, βX, βZ) = (−0.6, 0.3, 0.3), the model is nearly linear, and all adjustment methods are nearly identical in terms of estimator bias and variability. The measurement error in X also affects estimation of the error-free covariate effect, βZ: the naive estimator for βZ is biased. This can be corrected by adjusting for measurement error in X. However, in our simulations the regression calibration estimator for βZ is biased, even when the latent variable is normally distributed. The other adjustment procedures perform similarly and demonstrate negligible bias in βZ (Web Appendix E).
Moment matching is thus a good alternative in logistic regression when the underlying latent variable distribution is unknown. We recommend matching four moments and two cross-products with important covariates; this level of matching was necessary to render bias negligible in our simulations (see Web Appendix E). In Web Appendix E, Tables 10 and 11, we compare the sandwich and bootstrap variance estimators to the Monte Carlo variance of the parameter estimates. The sandwich estimator appears reasonable across a variety of settings (Table 10), and the bootstrap variance is similar except in the case of large measurement error and small sample size (Table 11).
4.2 Cox Proportional Hazard Model
Another common non-linear model is the Cox proportional hazard model for a time to event outcome. For subject i = 1, …,n, let Ti denote failure time and Ci denote censoring time. The failure time Ti is not available for all subjects, but instead Yi = min(Ti, Ci) and δi = I(Ti ≤ Ci) are observed. The hazard of failure λ(t∣X, Z) is related to the covariates by
λ(t∣X, Z) = λ0(t) exp(βXX + βZZ),

where λ0(t) is an underlying baseline hazard function.
We consider two scenarios. The first is similar to that of Wang (2006), where failure times occur according to the hazard function λ(t∣X, Z) with (λ0, βX, βZ) = (0.2, 0.7, 0.7), and 50% of subjects are censored uniformly. This implies a very strong covariate effect with a hazard ratio of 2 for each unit change in X and hazard ratio of 66 for the largest value of X compared to the smallest [exp(0.7) ≈ 2 and exp{0.7 range(X)} ≈ 66]. As a moderate alternative, we generated failure times from λ(t∣X, Z) with (λ0, βX, βZ) = (1.0, 0.3, 0.3) and 40% censoring, corresponding to a hazard ratio of 1.4 for each unit change in X and a hazard ratio of 6 overall [exp(0.3) ≈ 1.4 and exp{0.3 range(X)} ≈ 6].
In logistic regression, the available data consist of (Y, W, Z), whereas for time-to-event data we have (Y, δ, W, Z). In this case, we extend the moment matching to include δ and target the joint distribution of (Y, δ, X, Z). We considered several approaches. The simplest matches the variance-covariance matrix of (1, X, …, XM1/2, Y, δ, Z). We could also match this matrix within each level of δ. Alternatively, we could match on risk sets, that is, re-match (1, X, …, XM1/2, Y, δ, Z) at different points in time among the subjects still at risk. We tried all of these and saw little difference in the results, so we recommend the first and simplest method. The adjusted data are X̂4,2,2,2, for which E(X), E(X²), E(X³), E(X⁴), E(XY), E(X²Y), E(Xδ), E(X²δ), E(XZ), and E(X²Z) are matched. We compared this to a lesser adjustment matching only E(X), E(X²), E(XY), E(Xδ), and E(XZ), and observed more bias with similar variability under the lesser adjustment. Accordingly, we use X̂4,2,2,2, and results are presented for this version of MAI only.
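A minimal data-generation and fitting sketch for the moderate scenario follows; it assumes the survival package and, for brevity, substitutes the simpler two-moment helper mai_2 for the paper's X̂4,2,2,2:

```r
library(survival)
set.seed(3)
n     <- 2000
su2   <- 1
X     <- rnorm(n)
Z     <- 0.4 * X + sqrt(1 - 0.4^2) * rnorm(n)       # Corr(Z, X) = 0.4
tfail <- rexp(n, rate = 1.0 * exp(0.3 * X + 0.3 * Z))  # lambda0 = 1
tcens <- runif(n, 0, quantile(tfail, 0.8))           # crude uniform censoring
Y     <- pmin(tfail, tcens)
d     <- as.numeric(tfail <= tcens)                  # event indicator delta
W     <- X + rnorm(n, 0, sqrt(su2))                  # RR = 0.50
coxph(Surv(Y, d) ~ W + Z)                            # naive: attenuated
coxph(Surv(Y, d) ~ mai_2(W, su2) + Z)                # simple adjusted fit
```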
Boxplots of the estimated coefficients β̂X from B = 500 simulations, where n = 2000 and σu² = 1, are shown in Figure 3. When the true parameter values are λ0 = 0.2, βX = 0.7, and βZ = 0.7, only the CS estimator shows no evidence of bias, regardless of the distribution of X. Bias in the RC and MAI estimators is evident, though relatively small. The variability of these estimators is similar, though the RC estimator is somewhat less variable. When the true parameter values are λ0 = 1, βX = 0.3, and βZ = 0.3, and X has a normal distribution, all of the methods have similar variability and no detectable bias. However, for a chi-square or bimodal latent variable, the RC estimator is biased. Both the MAI and CS estimators appear unbiased and have similar variability. Results for other sample sizes and levels of measurement error are similar (Web Appendix F).
Figure 3.
Boxplots of β̂X from B = 500 simulated data sets and three distributions of X, where λ(t∣X, Z) = λ0(t) exp(βXX + βZZ) with true values {λ0(t), βX, βZ} = (0.2, 0.7, 0.7) or {λ0(t), βX, βZ} = (1, 0.3, 0.3), σu² = 1, and n = 2000. Method: W, naive; RC, regression calibration; MR, moment reconstruction; MAI, moment adjusted imputation with M = (4, 2, 2, 2); CS, conditional score.
As in logistic regression, the measurement error in X impacts estimation of βZ, and the naive estimator is biased. In Web Appendix F, we see that the RC estimator for βZ is over-corrected, particularly for the larger underlying parameter values. MAI and CS estimators are nearly unbiased and have similar variability to RC.
In our simulations, the CS approach is preferable. However, this method may be excessively time-consuming or infeasible for complicated Cox models. Imputation approaches, although imperfect, offer a practical solution. Both RC and MAI are easy to implement and yield substantial improvement over the naive method. For estimation of βX, neither can be recommended over the other based on our simulations. However, when βZ is also of interest, MAI is preferred.
5. Application to OPTIMIZE-HF
We carry out the OPTIMIZE-HF analyses performed by Gheorghiade et al. (2006), accounting for measurement error. The data set includes information on n = 48,612 subjects, aged 18 or older, with heart failure. There are two outcomes of interest, in-hospital mortality and post-discharge mortality. We use the models reported by Gheorghiade et al. (2006), which include baseline systolic blood pressure, many baseline covariates, and linear splines and truncation that account for non-linearity in continuous covariates. Their model for in-hospital mortality is the logistic regression

logit{P(in-hospital death∣X, Z)} = β0 + β1S + βZᵀZ,  (5)
where Z includes error-free covariates listed in Web Appendix G, Table 17; S is a truncated version of blood pressure, i.e., S = −{XI(X < 160) + 160I(X ≥ 160)}; and X represents true systolic blood pressure in 10-mm Hg units. A pre-specified subset of patients (n = 5,791) was followed for 60 to 90 days after discharge. In this group, post-discharge mortality is described by the Cox proportional hazards model

λ(t∣X, Z) = λ0(t) exp(β1S1 + β2S2 + βZᵀZ),  (6)
where Z includes error-free covariates listed in Web Appendix G, Table 18, and S1 and S2 fit a linear spline to blood pressure, i.e., S1 = −{XI(X < 140) + 140I(X ≥ 140)} and S2 = −{0I(X < 140) + XI(X ≥ 140)}. Gheorghiade et al. (2006) fit these models using observed baseline systolic blood pressure, W, in place of X. We adjust the mis-measured W by matching four moments and two cross-product moments with the response, and impute X̂ in place of X.
Adjustment procedures assume that the measurement error variance, σu², is known. In practice, σu² is usually replaced by a good estimate. It would be best to estimate the measurement error variance from replicate measures of systolic blood pressure taken over a period of time. Replicate measures were not available in the OPTIMIZE-HF data set; however, variability in blood pressure has been extensively studied. One source is the Framingham data set (Carroll et al., 2006), which includes four measurements of blood pressure, two taken at the first exam and two taken at a second exam. The average standard deviation across the four measurements is 9 mm Hg, which corresponds to a reliability ratio of about 0.75. Based on information from other external studies, the measurement error may actually be larger (Marshall, 2008). For the purpose of illustration, we use a reliability ratio of 0.75 for adjustment. In practice, it is critical to obtain replicate data for this purpose.
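Had replicate readings been available, the estimate is simple; a minimal sketch, where reps is a hypothetical n × m matrix of replicate measurements per subject:

```r
# Average within-subject sample variance estimates the measurement error
# variance su2, assuming replicates differ only through independent error.
su2_hat <- function(reps) mean(apply(reps, 1, var))
```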
The estimated density of baseline systolic blood pressure is altered by adjustment (Figure 4). The adjusted version shows a higher peak and thinner tails, conveying the impression that patients' blood pressures are more similar to one another than the raw measurements suggest.
Figure 4.
Kernel Density Estimate of systolic blood pressure for the OPTIMIZE-HF study; light-dotted lines: KDE of W, dark-dashed line: KDE of X̂4
In Table 2, we compare the MAI odds ratios and hazard ratios to those obtained by Gheorghiade et al. (2006). We report odds ratios per 10-mm Hg change in S, and hazard ratios per 10-mm Hg change in S1 and S2. Wald-type 95% confidence intervals for the odds ratios and hazard ratios are based on standard errors from 1000 bootstrap samples. The adjusted estimates indicate a stronger effect of systolic blood pressure. The RC estimates move in the same direction as MAI, but are closer to the naive estimates. The impact of adjustment is not substantial in this case; however, we have assumed relatively moderate measurement error. Many studies have reported higher variability in replicate blood pressure measurements, and adjustment could be more important in estimating effect size.
Table 2.
Parameter estimates for the OPTIMIZE-HF data analysis. Confidence intervals, based on bootstrap standard errors, are given in parentheses. Method: Unadjusted; RC, regression calibration; MAI, moment adjusted imputation.
| | Unadjusted | RC | MAI |
|---|---|---|---|
| (a) Odds ratios for logistic regression (5) | |||
| β1 | 1.21 (1.17, 1.25) | 1.28 (1.22, 1.33) | 1.36 (1.44, 1.52) |
| (b) Hazard ratios for Cox model (6) | | | |
| β1 | 1.18 (1.10, 1.26) | 1.24 (1.13, 1.36) | 1.38 (1.22, 1.54) |
| β2 | 1.08 (1.01, 1.15) | 1.10 (1.00, 1.21) | 1.06 (0.94, 1.17) |
Further adjustment for measurement error could be implemented by matching cross-products with all, or some, of the 18 covariates listed in Web Appendix G. We did not do so for two reasons. First, most of these covariates are only weakly correlated with systolic blood pressure, and we expect little gain from matching moments with such covariates. Second, the only covariate strongly correlated with systolic blood pressure is diastolic blood pressure, which, realistically, is also measured with error; moreover, the measurement error due to biological fluctuations in the two variables is likely correlated. Appropriate adjustment would involve a multivariate version of MAI. Extension to this case is possible; the implementation involves non-trivial considerations, and we will report this work elsewhere.
6. Discussion
We have introduced MAI as a means of adjusting mis-measured data to reflect the latent variable distribution and to improve parameter estimation in non-linear regression models. The method does not require any assumptions on the latent variable distribution. Under the general recommendation of matching four moments, it performs well for a variety of distributions. For density estimation, MAI is typically superior to the simple alternatives we considered. In simulations of logistic regression, the method is similar to MR when the latent variable is normally distributed, but is a superior imputation method when the latent variable is non-normal. In the Cox proportional hazards model, RC and MAI provide substantial improvement over the naive approach, but do not eliminate bias. Of the functional approaches that we considered, the conditional score is the only method that eliminates bias in Cox model parameter estimators. The performance of MAI may differ for other latent variable distributions, such as heavy-tailed distributions. Investigation of tail effects would require larger samples than those considered here.
The OPTIMIZE-HF study of systolic blood pressure of Gheorghiade et al. (2006) is illustrative of a typical data analysis. The mis-measured variable is included in descriptive analyses and in multiple, complex models involving splines to account for non-linearity. In practice, models could include splines, squared terms, or interactions with the latent variable of interest. These are easily accommodated by imputation methods, whereas other approaches such as the conditional score may be difficult or impossible to implement. In these circumstances, an imputation approach may be desirable.
We have developed MAI for the case of normally distributed measurement error. The method depends on correct specification of the measurement error distribution. Analysts should take care to verify that normality of measurement error is a reasonable assumption. The MAI method can be applied for other types of measurement error, as long as the distribution is known. Work on such extensions is reported elsewhere.
Acknowledgments
This work was supported by NIH grants R01CA085848, T32HL079896, R37AI031789, and P01CA142538, and NSF grants DMS 0906421 and DMS 0504283.
Footnotes
Supplementary Materials: Web Appendices A-G and R code implementing the methods are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.
References
- Bay J. Unpublished doctoral dissertation. 1997.
- Brueren MM, van Limpt P, Schouten HJA, de Leeuw PW, van Ree JW. Is a series of blood pressure measurements by the general practitioner or the patient a reliable alternative to ambulatory blood pressure measurement? A study in general practice with reference to short-term and long-term between-visit variability. American Journal of Hypertension. 1997;10:879–885. doi: 10.1016/s0895-7061(97)00125-8.
- Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models: A Modern Perspective. 2nd edition. Boca Raton, Florida: Chapman and Hall; 2006.
- Carroll RJ, Stefanski LA. Approximate quasi-likelihood estimation in models with surrogate predictors. Journal of the American Statistical Association. 1990;85:652–663.
- Cheng CL, Schneeweiss H. Polynomial regression with errors in the variables. Journal of the Royal Statistical Society, Series B. 1998;60:189–199.
- Cheng CL, Van Ness J. Statistical Regression with Measurement Error. London: Arnold Publishers; 1999.
- Cramer H. Mathematical Methods of Statistics. Princeton: Princeton University Press; 1957.
- Davidian M, Gallant AR. Smooth nonparametric maximum likelihood estimation for population pharmacokinetics, with application to quinidine. Journal of Pharmacokinetics and Biopharmaceutics. 1992;20:529–556. doi: 10.1007/BF01061470.
- Davidian M, Gallant AR. The nonlinear mixed effects model with a smooth random effects density. Biometrika. 1993;80:475–488.
- Eddington AS. The correction of statistics for accidental error. Monthly Notices of the Royal Astronomical Society. 1940;100:354–361.
- Freedman LS, Fainberg V, Kipnis V, Midthune D, Carroll RJ. A new method for dealing with measurement error in explanatory variables of regression models. Biometrics. 2004;60:172–181. doi: 10.1111/j.0006-341X.2004.00164.x.
- Freedman LS, Midthune D, Carroll RJ, Kipnis V. A comparison of regression calibration, moment reconstruction and imputation for adjusting for covariate measurement error in regression. Statistics in Medicine. 2008;27:5195–5216. doi: 10.1002/sim.3361.
- Freudenheim JL, Marshall JR. The problem of profound mismeasurement and the power of epidemiologic studies of diet and cancer. Nutrition and Cancer. 1988;11:243–250. doi: 10.1080/01635588809513994.
- Gallant AR, Nychka DW. Seminonparametric maximum likelihood estimation. Econometrica. 1987;55:363–390.
- Garcia-Vera M, Sanz J. How many self-measured blood pressure readings are needed to estimate hypertensive patients' true blood pressure? Journal of Behavioral Medicine. 1999;22:93–113. doi: 10.1023/a:1018703819773.
- Gheorghiade M, Abraham WT, Albert NM, Greenberg BH, O'Connor CM, She L, Stough WG, Yancy CW, Young JB, Fonarow GC. Systolic blood pressure at admission, clinical characteristics, and outcomes in patients hospitalized with acute heart failure. Journal of the American Medical Association. 2006;296:2217–2226. doi: 10.1001/jama.296.18.2217.
- Gleser LJ. In: Brown PJ, Fuller WA, editors. Statistical Analysis of Measurement Error Models and Applications. Providence: American Mathematical Society; 1990.
- Huang X, Stefanski LA, Davidian M. Latent-model robustness in structural measurement error models. Biometrika. 2006;93:53–64.
- Li E, Zhang D, Davidian M. Conditional estimation for generalized linear models when covariates are subject-specific parameters in a mixed model for longitudinal measurements. Biometrics. 2004;60:1–7. doi: 10.1111/j.0006-341X.2004.00170.x.
- Liu K, Stamler J, Dyer A, McKeever J, McKeever P. Statistical methods to assess and minimize the role of intra-individual variability in obscuring the relationship between dietary lipids and serum cholesterol. Journal of Chronic Diseases. 1978;31:399–418. doi: 10.1016/0021-9681(78)90004-8.
- Louis TA. Estimating a population of parameter values using Bayes and empirical Bayes methods. Journal of the American Statistical Association. 1984;79:393–398.
- Marshall TP. Blood pressure variability: the challenge of variability. American Journal of Hypertension. 2008;21:3–4. doi: 10.1038/ajh.2007.20.
- Pickering TG, Hall JE, Appel LJ, Falkner BE, Graves J, Hill MN, Jones DW, Kurtz T, Sheps SG, Roccella EJ. Recommendations for blood pressure measurement in humans and experimental animals: Part 1: Blood pressure measurement in humans: A statement for professionals from the Subcommittee of Professional and Public Education of the American Heart Association Council on High Blood Pressure Research. Hypertension. 2005;45:142–161. doi: 10.1161/01.HYP.0000150859.47929.8e.
- Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine. 1989;8:1051–1070. doi: 10.1002/sim.4780080905.
- Shen W, Louis TA. Triple-goal estimates in two-stage hierarchical models. Journal of the Royal Statistical Society, Series B. 1998;60:455–471.
- Shohat JA, Tamarkin JD. The Problem of Moments. Providence, R.I.: American Mathematical Society; 1943.
- Stefanski LA. Unbiased estimation of a nonlinear function of a normal mean with application to measurement error models. Communications in Statistics, Series A. 1989;18:4335–4358.
- Stefanski LA, Boos DD. The calculus of M-estimation. The American Statistician. 2002;56:29–38.
- Stefanski LA, Carroll RJ. Covariate measurement error in logistic regression. The Annals of Statistics. 1985;13:1335–1351.
- Stefanski LA, Carroll RJ. Conditional scores and optimal scores for generalized linear measurement error models. Biometrika. 1987;74:703–716.
- Stefanski LA, Novick SJ, Devanarayan V. Estimating a nonlinear function of a normal mean. Biometrika. 2005;92:732–736.
- Stulajter F. Nonlinear estimators of polynomials in mean values of a Gaussian stochastic process. Kybernetika. 1978;14:206–220.
- Tsiatis AA, Davidian M. A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika. 2001;88:447–458.
- Tukey JW. Named and faceless values: an initial exploration in memory of Prasanta C. Mahalanobis. Sankhya A. 1974;36:125–176.
- Wang CY. Corrected score estimator for joint modeling of longitudinal and failure time data. Statistica Sinica. 2006;16:235–253.
- Wang CY, Wang N, Wang S. Regression analysis when covariates are regression parameters of a random effects model for observed longitudinal measurements. Biometrics. 2000;56:487–495. doi: 10.1111/j.0006-341x.2000.00487.x.
- Zhang D, Davidian M. Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics. 2001;57:795–802. doi: 10.1111/j.0006-341x.2001.00795.x.