SUMMARY
One of the main limitations of causal inference methods is that they rely on the assumption that all variables are measured without error. A popular approach for handling measurement error is simulation-extrapolation (SIMEX). However, its use for estimating causal effects has been examined only in the context of an additive, non-differential, and homoscedastic classical measurement error structure. In this article we extend the SIMEX methodology, in the context of a mean-reverting measurement error structure, to a doubly robust estimator of the average treatment effect when a single covariate is measured with error but the outcome and treatment indicator are not. Throughout this article we assume that an independent validation sample is available. Simulation studies suggest that our method performs better than a naive approach that simply uses the covariate measured with error.
Keywords: Average treatment effect (ATE), Causal inference, Doubly robust, Mean reverting measurement error, Measurement error, Propensity score, SIMEX
1. INTRODUCTION
In many fields measurement error tends to be the rule rather than the exception. Methods such as simulation-extrapolation (SIMEX) (Cook and Stefanski, 1994), regression calibration (Rosner and others, 1990), and multiple imputation (Cole and others, 2006; Guo and others, 2012) have been developed to mitigate the impact of measurement error in the estimation of coefficients, but limited work has been done to extend these approaches to a causal inference context.
Causal inference has provided a number of tools and a clear conceptual framework in which causal effects can be estimated. The development of causal methods has had an important impact on the design of clinical trials and sampling designs, and the framework has been extended to observational studies and other study designs where formal randomization is not possible. In particular, since propensity-score based methods were first introduced by Rosenbaum and Rubin (1983), a wide range of methods based on propensity scores have been developed to estimate treatment effects in non-experimental studies. Methods such as matching, weighting, or subclassification (Stuart, 2010) allow for comparison of treatment and control groups that are similar based on a set of observed characteristics. One can also use doubly robust estimators (Rotnitzky and others, 1998), which utilize models of treatment assignment (the propensity score) and of the outcome, to estimate a treatment effect. One benefit of doubly robust estimators is that they have an asymptotically normal distribution and, furthermore, they are consistent if either the model for the propensity score or the model for the conditional mean of the outcome (but not necessarily both) is correctly specified. Nevertheless, all of these methods rely on the assumption that all variables (covariates, treatment indicator, and observed outcome) are measured without error.
Steiner and others (2011) have shown that measurement error in the covariates can lead to bias in the treatment effect estimator when the true propensity score model depends on the unobserved true covariates. McCaffrey and others (2013) have also shown that when at least one of the covariates is measured with error, balance between the treatment and control groups on the true (unobserved) covariate is not always achieved. There is a need to extend and develop methodology to account for measurement error in a causal inference context. Stürmer and others (2005) propose a propensity score calibration method to account for unmeasured confounders (i.e., the true covariate in the context of measurement error). This method is related to the regression calibration approach and relies on a validation sample. Nevertheless, their method can only be applied when the validation sample has information regarding the set of all relevant covariates and the treatment assignment. McCaffrey and others (2013) propose a measurement-error bias-corrected inverse probability of treatment weighting estimator. This method requires either the distribution of the measurement error or the unobserved true covariate to be known, and the propensity score model to be correctly specified. Webb-Vargas and others (2015) implement a multiple imputation approach to correct for covariate measurement error in propensity score estimation, and compute a doubly robust estimator of the treatment effect. However, this method requires knowledge regarding the joint distribution of the variables (covariates, outcome, and treatment indicator). Furthermore, convergence problems have been reported when many binary confounders are included. Finally, Lockwood and McCaffrey (2015) extended the SIMEX methodology to causal inference in the context of typical classical measurement error (i.e., the error term is additive, non-differential, and homoscedastic).
In this article we show that the SIMEX methodology can be extended to a more complex measurement error structure that has the typical classical measurement error structure as a special case.
Mean-reverting measurement error structures have been described by Bound and Krueger (1991) in the context of longitudinal data of earnings. As stated by Akee (2011), mean-reverting measurement error in the context of self-reported earnings implies that “the higher the true value of earnings, the more likely an individual is to under report her earnings and vice versa.” In general terms and in the context of self-reported variables, a mean-reverting measurement error implies that the units with larger values of a given variable tend to underreport such values whereas units with smaller values tend to overreport their true value. Mean-reverting measurement error is traditionally modeled with a structure similar to the typical measurement error, with the exception that the measurement error is negatively correlated with the true value of the mismeasured covariate. In this article we present an alternative way to model a mean-reverting measurement error. The advantage of our proposed parameterization is twofold: (i) it allows for “mean-diverging” measurement error (i.e., higher values of the true covariate are associated with even larger reported values and vice versa) and (ii) the typical classical measurement error structure can be conceived as a special case of this more general measurement error structure.
We propose to extend the SIMEX methodology to a doubly robust estimator of the average treatment effect (ATE), when a covariate is measured with error (under a mean-reverting measurement error structure) but the treatment, outcome, and the rest of the covariates are measured without error. This method does not require assumptions regarding the joint distributions of the variables. Additionally, the validation data only needs to have information regarding the true covariate and the faulty measured version. Furthermore, the measurement error structure (i.e., reverting, diverging, or typical) does not have to be specified beforehand. Validation samples are not uncommon in studies where acquiring the true value of the covariate of interest is too expensive, time consuming, or invasive (see Pettersen and others, 2012; Saint-Maurice and others, 2014). Work by Robins (2003), Cole and others (2006), Goetghebeur and Vansteelandt (2005), or Edwards and others (2015) has examined the consequences of measurement error in the outcome and/or the exposure.
This article is organized as follows: Section 2 presents definitions and working models, Section 3 introduces a doubly robust estimator and summarizes the SIMEX method, Section 4 deals with the asymptotics, Section 5 presents the results of a simulation study, Section 6 presents an application of the method using the National Longitudinal Study of Adolescent to Adult Health (Add Health), and Section 7 presents our conclusions.
2. DEFINITIONS
2.1. Measurement error structure
Different measurement error structures have been proposed in the literature and most of them can be grouped in two categories: classical and Berkson. Classical measurement error structures assume that the true value of a covariate is not observed but a faulty version of it is available (which in the literature is referred to as a “surrogate”). In contrast, Berkson measurement error happens “when a group’s average is assigned to each individual suiting the group’s characteristics. The group’s average is thus the ‘measured value’, that is, the value that enters the analysis, and the individual latent value is the ‘true value’.” (see Heid and others, 2004). Besides the technical differences between classical and Berkson type error, the main difference between these two structures is related to the consequences in the estimation of parameters. For example, in the context of linear regression, it can be shown that under classical measurement error structures regression coefficients will be inconsistent. However, under Berkson error structures the estimators, although inefficient, will be consistent.
Throughout this article we assume a measurement error structure that belongs to the classical type and that affects only one of the covariates. If $X_i$ is defined as the true and unobserved value of a covariate for unit $i$, the observed surrogate measure of $X_i$, say $W_i$, is assumed to be of the form:

$$W_i = X_i + \gamma (X_i - \mu_X) + \sigma \epsilon_i, \qquad (2.1)$$

where $\mu_X$ is the expected value of the mismeasured covariate. Notice that different configurations of $\gamma$ and $\sigma$ may lead to different measurement error structures. For example, if $\gamma = 0$ the measurement error structure follows a typical classical measurement error structure. Furthermore, we could potentially find combinations of values for $\gamma$ and $\sigma$ such that the measurement error could be either mean reverting or mean diverging. Negative values of $\gamma$ are associated with mean-reverting measurement error structures while positive values of $\gamma$ are associated with mean-diverging structures. Observe that if $\gamma = -1$, $W_i$ represents random deviations from the mean of $X_i$. Also notice that for a given value of $\sigma$ and depending on the value of $\gamma$, it is possible that the variance of the surrogate $W_i$ will be smaller than the variance of the true covariate $X_i$. Therefore, the notion of reliability, $\mathrm{Var}(X_i)/\mathrm{Var}(W_i)$, is no longer fully informative. Under this measurement error parameterization $\sigma$ represents the difference in the reported values among units with the same true value of the covariate $X_i$. We assume that $\epsilon_i$ is a random variable following a normal distribution with mean zero and unit variance that is independent of $X_i$ for all $i$.
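The behaviour described above can be explored numerically. The sketch below is an illustration only: it assumes, following the verbal description in this section, that the surrogate equals the true value plus a multiple of its deviation from the mean plus scaled standard normal noise. The names `simulate_surrogate`, `variance_ratio`, `slope`, and `sigma` are hypothetical and do not come from the article, and the sample mean stands in for the population mean.

```python
import random
import statistics


def simulate_surrogate(x, slope, sigma, rng):
    """Return surrogates w_i = x_i + slope * (x_i - mean(x)) + sigma * e_i.

    Assumed form of the measurement error structure: slope < 0 is mean
    reverting, slope > 0 is mean diverging, slope = 0 is classical error.
    """
    mu = statistics.fmean(x)  # sample mean used in place of the true mean
    return [xi + slope * (xi - mu) + sigma * rng.gauss(0.0, 1.0) for xi in x]


def variance_ratio(x, w):
    """Ratio of the surrogate variance to the true-covariate variance."""
    return statistics.variance(w) / statistics.variance(x)


rng = random.Random(2017)
x = [rng.gauss(25.0, 4.0) for _ in range(20000)]  # hypothetical true covariate

for slope in (-0.5, 0.0, 0.5):
    w = simulate_surrogate(x, slope, sigma=1.0, rng=rng)
    # Under the assumed form the ratio is about (1 + slope)^2 + sigma^2 / Var(x)
    print(slope, round(variance_ratio(x, w), 3))
```

With a negative slope the surrogate can be less variable than the true covariate, which is the situation in which a reliability ratio stops being informative.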
2.2. Working models
The goal of this article is to estimate the ATE of a binary treatment (
) on an observed outcome (
, with
) when a set of covariates (
) are available (with
and
).
2.2.1. Propensity score.
We define the propensity score for unit
as the probability of receiving treatment given the covariates
. Explicitly:
where:
| (2.2) |
and
is a parametric model that includes an intercept. We also assume the first derivatives are defined, namely:
.
2.2.2. Conditional mean model.
We assume the conditional mean model to be:
| (2.3) |
with
, and
. We define:
;
;
;
. We let
, notice that under this model specification, and under the assumptions and regularity conditions described in Abadie and Imbens (2016) the ATE is equal to 
3. DOUBLY ROBUST ESTIMATORS AND SIMEX
3.1. Doubly robust estimators
Under regularity conditions, if (2.2) or (2.3) are correctly specified, then it can be shown that there exists a consistent and normally distributed estimator for
, with
(see Rotnitzky and others, 1998). Doubly robust estimators can be formulated in the context of estimating equations (see Robins and others, 2007). Define
; and let
. We now define the following vector:
where
is a vector of parameters in
. If the propensity score or the conditional mean model are correctly specified then
. Thus if we define
, such that
then
will be a consistent estimator for
and will have an asymptotically normal distribution. Additionally, we assume that if both models are incorrectly specified, the resulting estimator will converge to
which is not necessarily equal to
and the estimator will follow an asymptotically normal distribution. Explicitly, if we define
to be the solution to
then we can conclude that
, where
may be different from the true vector of parameters
.
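To make the doubly robust idea concrete, the sketch below computes a common augmented inverse-probability-weighted (AIPW) form of the ATE from propensity scores and outcome predictions that have already been fitted. This is a standard textbook form and is not necessarily the exact estimating-equation formulation used in this article; the function name `aipw_ate` and its arguments are hypothetical.

```python
def aipw_ate(y, t, ps, m1, m0):
    """AIPW estimate of the average treatment effect.

    y: observed outcomes; t: binary treatment indicators (0/1);
    ps: fitted propensity scores; m1, m0: fitted outcome means under
    treatment and under control for every unit.
    """
    total = 0.0
    for yi, ti, ei, m1i, m0i in zip(y, t, ps, m1, m0):
        # Outcome-model prediction plus a weighted residual correction
        treated = m1i + ti * (yi - m1i) / ei
        control = m0i + (1 - ti) * (yi - m0i) / (1.0 - ei)
        total += treated - control
    return total / len(y)


y = [3.0, 1.0, 4.0, 2.0]
t = [1, 0, 1, 0]
ps = [0.5, 0.5, 0.5, 0.5]

# Outcome model equal to the group means
print(aipw_ate(y, t, ps, m1=[3.5] * 4, m0=[1.5] * 4))
# Deliberately wrong outcome model (all zeros) with the same propensity scores
print(aipw_ate(y, t, ps, m1=[0.0] * 4, m0=[0.0] * 4))
```

In this toy example both calls return the same value, because the weighted residual term compensates for the misspecified outcome model when the propensity scores are right.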
3.2. SIMEX
From this point forward, we assume that the propensity score is correctly specified. Given that we are implementing a doubly robust estimator of the ATE, if our method is able to account for the impact of the measurement error in the prediction of the propensity score the final SIMEX estimator of the treatment effect will be consistent even when the model for the conditional mean of the observed outcome is misspecified. Therefore, we can estimate the vector of unknown true parameters
with
, by solving the following unbiased estimating equations:
. Since the variable
is not observed, the solution to
will lead to an inconsistent estimator of the vector of parameters of interest. Therefore the solution to the estimating equations is a consistent estimator for some other vector of parameters, say
This property of convergence to some vector of parameters is fundamental for the implementation of the SIMEX methodology. Since convergence is always achieved (even in the presence of measurement error), we can artificially increase the measurement error in the surrogate
, and evaluate the trend of the bias as a function of such increments. Then we can extrapolate to the case of no measurement error. We now describe the two steps in SIMEX: simulation and extrapolation.
3.2.1. Simulation step.
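Let $B$ be a large fixed positive integer. Then for each unit $i$ we generate $B$ random standard normal variables, $\epsilon_{b,i}$, indexed by $b = 1, \ldots, B$. We define the remeasured surrogate $W_{b,i}(\lambda)$ as $W_{b,i}(\lambda) = W_i + \sqrt{\lambda}\,\sigma\,\epsilon_{b,i}$ for some fixed $\lambda \geq 0$, where $W_i$ is the observed surrogate (for simplicity, $\sigma$ is assumed to be known). Let $\hat{\theta}_b(\lambda)$ be the solution to the estimating equations when $W_{b,i}(\lambda)$ is used in place of the unobserved covariate. Then we can define $\hat{\theta}(\lambda) = B^{-1}\sum_{b=1}^{B}\hat{\theta}_b(\lambda)$. Now we can repeat this procedure for a grid of values $0 \leq \lambda_1 < \lambda_2 < \cdots < \lambda_M$ to obtain $\hat{\theta}(\lambda_1), \ldots, \hat{\theta}(\lambda_M)$. This sequence allows us to evaluate the trend of the bias as a function of the increments in the measurement error and extrapolate to the case of no measurement error, when $\lambda = -1$.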
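The simulation step can be sketched for a deliberately simple estimator. The code below uses the slope of a simple linear regression as a stand-in for the doubly robust estimator, adds extra noise with variance proportional to a grid of inflation factors, and averages the naive estimate over replicates. The names `ols_slope`, `simex_simulation_step`, `lambdas`, and `n_rep` are hypothetical, and the toy data use classical error only.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simex_simulation_step(w, y, sigma, lambdas, n_rep, rng):
    """Average the naive estimate over n_rep remeasured data sets per lambda."""
    averaged = []
    for lam in lambdas:
        estimates = []
        for _ in range(n_rep):
            # Remeasured surrogate: extra noise with variance lam * sigma^2
            w_b = [wi + math.sqrt(lam) * sigma * rng.gauss(0.0, 1.0) for wi in w]
            estimates.append(ols_slope(w_b, y))
        averaged.append(statistics.fmean(estimates))
    return averaged


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]          # hypothetical true covariate
w = [xi + 0.5 * rng.gauss(0.0, 1.0) for xi in x]       # surrogate, error sd 0.5
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]       # outcome, true slope 2
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
for lam, est in zip(grid, simex_simulation_step(w, y, 0.5, grid, 10, rng)):
    print(lam, round(est, 3))
```

The printed estimates drift further from the true slope as the added error grows, which is the trend that the extrapolation step then projects back.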
3.2.2. Extrapolation step.
Cook and Stefanski (1994) proposed to compute the SIMEX estimator as $\hat{\theta}_{\mathrm{SIMEX}} = \mathcal{G}(-1, \hat{\Gamma})$, where $\mathcal{G}(\lambda, \Gamma)$ is a parametric model for the vectors $\hat{\theta}(\lambda)$ as a function of $\lambda$ and $\Gamma$ is a finite-dimensional vector of coefficients associated with the model. If at least one, but not necessarily both, of the working models (i.e., the propensity score or the conditional mean model) is correctly specified and the parametric model $\mathcal{G}(\lambda, \Gamma)$ is also correctly specified, then $\hat{\theta}_{\mathrm{SIMEX}}$ will be a consistent estimator of the true vector of parameters (see Carroll and others, 1996).
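The extrapolation step can be illustrated with a quadratic extrapolant, a common choice in the SIMEX literature; the article does not state here which parametric model it uses, so the quadratic form is an assumption. The helper names `fit_quadratic` and `simex_extrapolate` are hypothetical.

```python
def fit_quadratic(lams, thetas):
    """Least-squares coefficients (a, b, c) of theta = a + b*lam + c*lam^2."""
    # Normal equations for the design matrix with columns 1, lam, lam^2
    s = [sum(l ** k for l in lams) for k in range(5)]
    rhs = [sum(t * l ** k for l, t in zip(lams, thetas)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def simex_extrapolate(lams, thetas, target=-1.0):
    """Evaluate the fitted quadratic at the target (no added or original noise)."""
    a, b, c = fit_quadratic(lams, thetas)
    return a + b * target + c * target ** 2


# Averaged estimates on a grid of inflation factors (made-up numbers)
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
averaged = [1.60, 1.45, 1.33, 1.23, 1.14]
print(simex_extrapolate(grid, averaged))
```

The extrapolated value is only as good as the assumed extrapolant, which is the limitation discussed in the conclusions.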
4. ASYMPTOTICS
When Cook and Stefanski (1994) presented SIMEX, they suggested a bootstrap procedure to compute standard errors of the SIMEX estimator. A few years later, Carroll and others (1996) derived the asymptotic distribution of the SIMEX estimator under a typical classical measurement error structure and the assumption that the variance of the measurement error, $\sigma^2$, is known; Grace (2008) extended these results to longitudinal data. Carroll and others (1996) provided guidelines to estimate the distribution of the SIMEX estimator when the variance of the measurement error is unknown but can be estimated with an asymptotically normal estimator. Furthermore, following closely the derivation presented by Carroll and others (1996), a valid asymptotic distribution of the ATE can be derived even when the measurement error has the structure presented in equation (2.1). Thus, we only need to specify a valid estimator for the variance of the measurement error when a validation sample is available. Notice that in the validation sample, both the true covariate $X_j$ and its surrogate $W_j$ are observed. We denote with $n_v$ the sample size of the validation sample and express equation (2.1) as a simple linear regression of $W_j$ on $X_j$ for $j = 1, \ldots, n_v$. Therefore the variance of the measurement error can be estimated by the sample variance of the residuals of the simple linear regression of $W_j$ on $X_j$. In other words, the estimator $\hat{\sigma}^2$ of $\sigma^2$ is the sample variance of $e_1, \ldots, e_{n_v}$, where $e_j$ represents the $j$th residual of the simple linear regression of $W_j$ on $X_j$. It can be shown that $\hat{\sigma}^2$ is asymptotically normal, with an asymptotic variance that can be computed using influence functions.
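A minimal sketch of the validation-sample step follows. It regresses the surrogate on the true covariate and returns the residual standard deviation as an estimate of the measurement error scale. The divisor `n - 2` is an assumed convention (the article only says "sample variance of the residuals"), and the name `estimate_error_sd` is hypothetical.

```python
import statistics


def estimate_error_sd(x_val, w_val):
    """Regress the surrogate on the true covariate in the validation sample.

    Returns (intercept, slope, sigma_hat), where sigma_hat**2 is the residual
    variance with divisor n - 2 (an assumed convention).
    """
    n = len(x_val)
    mx, mw = statistics.fmean(x_val), statistics.fmean(w_val)
    sxx = sum((x - mx) ** 2 for x in x_val)
    sxw = sum((x - mx) * (w - mw) for x, w in zip(x_val, w_val))
    slope = sxw / sxx
    intercept = mw - slope * mx
    rss = sum((w - intercept - slope * x) ** 2 for x, w in zip(x_val, w_val))
    return intercept, slope, (rss / (n - 2)) ** 0.5


# Tiny made-up validation sample: true values and their reported versions
x_val = [1.0, 2.0, 3.0, 4.0]
w_val = [1.1, 1.9, 3.2, 3.8]
print(estimate_error_sd(x_val, w_val))
```

A fitted slope below one is what a mean-reverting structure would produce, and a slope above one would point to a mean-diverging structure, so the type of structure does not need to be fixed in advance.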
It is important to note that the measurement error structure presented in Section 2.1 is not innocuous: it can be shown that even after applying the SIMEX methodology, when the measurement error has the structure defined in equation (2.1), the estimator of the coefficient associated with the covariate measured with error will be inconsistent. A motivating example showing this and a procedure to obtain a consistent estimator of the coefficient associated with the mismeasured variable are available in Appendix A in supplementary material available at Biostatistics online.
5. SIMULATION STUDY
To evaluate the performance of our estimator we conduct a simulation study to compare bias, mean squared error (MSE) and coverage of three different estimators of the treatment effect
: (i) the estimator obtained by using
, the true measure of the covariate, (ii) a naive estimator, which ignores the measurement error and simply uses
, and (iii) the SIMEX estimator for the treatment effect. The three methods implement a doubly robust approach using propensity score weights. A total of
simulation iterations were implemented. We set
as a quadratic function; explicitly
. Details of the data generating process can be found in Appendix B in supplementary material available at Biostatistics online. In the simulation study, we fit a correctly specified propensity score model, but we fit the following model for the conditional mean:
. Notice that we have purposely omitted
and
, thus incorrectly specifying the conditional mean model. Since we are using a doubly robust estimator, it holds that
will converge to
if our method is able to account for the impact of the measurement error in the prediction of the propensity score. We evaluate the performance of the SIMEX estimator described in Section 3.2 and compare it to that of the naive estimator and the estimator obtained when the covariate measured without error is used. Figure 1 summarizes our main findings.
Fig. 1.
Absolute bias, coverage, and MSE as functions of
for different levels of correlation of the covariates and effect size of the unobserved variable in the propensity score (simulation study).
As expected, when the true covariate is used in the estimation, performance is very good and is used as the baseline for comparisons. The naive method (ignoring the measurement error) leads to biases in the estimated treatment effect across all the settings considered. Furthermore, the bias decreases as
increases. In terms of bias, the SIMEX estimator outperforms the naive method, and this result holds for all correlation levels of
and
, and across the different coefficients on
in the true treatment assignment (propensity score) model. Similar patterns are observed for MSE and coverage, with the SIMEX estimator performing better than the naive approach.
In general, we observe that the SIMEX approach performs better when the coefficient on the true covariate in the propensity score model is small and when the correlation between the covariates is relatively low. Notice that all methods can produce coverage above 95%. This is due to the fact that the estimated propensity score is used in the computation of the estimators’ weights, and thus the standard errors are overestimated (see Rubin and Thomas, 1996; Rubin and Stuart, 2006). Notice that the same conclusions hold even when the measurement error follows a typical classical structure, which implies that the measurement error structure defined in Section 2 can easily accommodate a typical measurement error structure. In general we observe that, as expected, the estimator that uses the true covariate as a regressor performs better than the SIMEX and the naive estimators across all simulated scenarios. The performance of the naive method suggests that ignoring the measurement error in a covariate can induce bias in the estimated treatment effect, which translates into higher MSE and poor coverage. This situation is exacerbated when the impact of the true covariate in the propensity score is large and the correlation between the covariates is high. The simulation study suggests that implementing the SIMEX methodology can help to mitigate the consequences associated with measurement error.
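The three performance measures used in this section can be computed from the replicate-level output of any simulation. The sketch below assumes Wald-type intervals built from a normal quantile, which the article does not state explicitly; the function name `performance` and its arguments are hypothetical.

```python
import statistics


def performance(estimates, std_errors, truth, z=1.96):
    """Absolute bias, MSE, and empirical coverage across simulation replicates."""
    bias = abs(statistics.fmean(estimates) - truth)
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    # A replicate "covers" when the truth lies inside estimate +/- z * se
    covered = [abs(e - truth) <= z * se for e, se in zip(estimates, std_errors)]
    return bias, mse, sum(covered) / len(covered)


# Three made-up replicates of a treatment-effect estimate with true value 2
print(performance([1.0, 2.0, 3.0], [0.1, 0.1, 1.0], truth=2.0))
```

Overestimated standard errors widen every interval, which is one way empirical coverage can exceed the nominal level.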
6. APPLICATION
The National Longitudinal Study of Adolescent to Adult Health (Add Health) is a multi-year longitudinal study of a nationally representative sample of adolescents in the United States that began during the 1994–1995 school year, when the adolescents were in grades 7–12. Information regarding a wide range of topics (e.g., socioeconomic factors, relationships, psychological, and physical health, etc.) was collected during four waves. For details see Harris and others (2009). In this application, we estimate the effect of depression (the exposure) on sexual health, where body mass index (BMI) is the confounder measured with error.
We use the Add Health data to evaluate the performance of the SIMEX estimator in a realistic data context. For this application, we use the publicly available Add Health data of subjects who participated in Waves I and II. This dataset presents a unique feature: during the second wave BMI was both measured and self-reported. Thus we can compute the treatment effect using the true BMI, and compare the result to those obtained by implementing SIMEX and those obtained using the naive approach (using the self-reported BMI). To do this, we artificially construct a validation sample by randomly selecting
of the observations (this is the same relative sample size used in the simulation study). The variance ratio of the self-reported BMI to the measured BMI is equal to
which indicates that the measurement error structure cannot follow the typical classical structure, since under that structure the variance of the surrogate is always larger than the variance of the true covariate. Furthermore, the correlation between the self-reported and the measured BMI is
and the $R^2$ associated with a simple linear regression of the self-reported measure on the measured BMI is
. This indicates that the self-reported BMI is a highly reliable measure of the true BMI and therefore we do not expect significant differences between the different treatment effect estimations. In fact, the estimated treatment effect was 0.052 and statistically insignificant regardless of the approach used to estimate it.
Thus, we propose a data-based simulation study where all the covariates are obtained from the Add Health data, but the outcome, the exposure, and the variable measured with error are simulated. By controlling the data generating process we should be able to assess the performance of the SIMEX estimator in a more complex data structure.
6.1. Data-based simulation set-up
The Add Health data contains the measured weight and height of all the adolescents in Wave II, and so a highly reliable measure of the BMI can be obtained. Plankey and others (1997) and Stommel and Schoenborn (2009) model self-reported BMI in the context of mean reverting measurement error. In order to evaluate the performance of the SIMEX estimator, we construct a self-reported BMI,
, as
with independently and identically distributed errors
and variance equal to
. Both the
, the
and
are estimated from the available data.
The set of covariates measured without error,
, are listed in Table 1. These variables have been suggested by Goodman and Whitaker (2002) to have an effect on depression, which constitutes the exposure. Goodman and Whitaker (2002) also link depression with BMI. Thus, we construct an indicator of depression status,
, assuming that
, with
. All coefficients, except the one associated with BMI, are chosen based on an estimated logistic regression using the available data. In Section 5 we have shown that a strong association between the true values of the unobserved covariate BMI and the exposure (depression) affects the performance of the SIMEX estimator; in other words, the stronger the association between the mismeasured covariate and the exposure, the more compromised the performance (in terms of bias, coverage, and MSE) of the SIMEX estimator. Thus we increased the association between BMI and depression by a factor of 20, which implies that
(the coefficient on BMI in the propensity score model) is equal to
.
Table 1. Empirical example using Add Health data: covariates used in propensity score and outcome models

| Wave | Covariate | Description |
|---|---|---|
| I | Race | Indicator variable indicating if the respondent did not answer that his or her race is White |
| I | Welfare | Indicator variable indicating that at least one of the subject’s parents responded that he or she receives government assistance |
| I | Parent’s education | Indicator variable that identifies if at least one parent completed college or higher level education |
| II | Heavy drinking or smoking | Indicator variable that takes the value 1 if the adolescent is either a heavy smoker or a heavy drinker or both. Heavy smokers are those individuals for whom the amount of cigarettes smoked in the last month is in the top quartile. Heavy drinkers are those individuals who have had at least three drinks per week. These cutoffs are based on those used in Goodman and Whitaker (2002) |
| II | Delinquent behavior | Indicator variable that identifies if the adolescent was seriously involved in criminal activities, as measured by scoring in the top quartile of the Add Health delinquency scale. This dichotomization follows the same criteria as Goodman and Whitaker (2002) |
Wingood and others (2002) suggest that BMI and depression affect sexual health. We generate the outcome variable, number of different sexual partners in the last year,
, from a normal distribution. That is,
, with:
. For simplicity we set
. Out of the
complete cases we randomly select
adolescents that will constitute the validation sample (i.e., the sample where BMI and the generated variable measured with error,
, are observed). The remaining
observations constitute the main sample, the sample where BMI is not observed, but its surrogate is. The doubly robust estimator is computed using the data from the main sample. We ran a total of 1000 iterations.
6.2. Data-based simulation results
The main results from the data-based simulation are summarized in Table 2. The first column of Table 2 shows the names of the covariates used in the outcome model, the second column displays the true values of the parameters, the third column gives the average value of the estimated parameter (across the 1000 iterations), the fourth column gives the percentage bias (in absolute value), the fifth column shows the empirical coverage of the 95% confidence interval, and the last column gives the MSE. Part I of Table 2 shows the estimation results using the measured BMI. As expected, the bias is negligible and the empirical coverage of the confidence intervals is close to 95% for all the covariates included in the outcome model. Part III of Table 2 shows the estimation results associated with the naive method (i.e., using the generated self-reported BMI). The performance of the naive estimator is far from ideal: the estimators of the coefficients associated with depression, race, and delinquent behavior have on average biases larger than 10%. This is particularly important in the estimated treatment effect (i.e., the coefficient associated with the depression indicator), where the bias is about 32%. Part II of Table 2 presents the estimation results obtained by implementing the SIMEX method. On average almost all of the covariate coefficients have biases below 2% (the only exception is the estimator associated with BMI, which has a bias of 3.29%). The results shown in Table 2 incorporate the correction suggested in Appendix A in supplementary material available at Biostatistics online. The bias of the SIMEX estimator associated with the estimation of the treatment effect is 5.01%; in other words, SIMEX was able to remove about 84% of the bias associated with the naive estimation of the treatment effect. It is important to notice that the standard errors associated with the SIMEX estimators tend to be larger than the ones obtained by the other two methods. This could potentially translate into a power loss; nevertheless, the comparison of the MSE of the SIMEX estimators to that of the naive approach suggests that the efficiency loss is negligible.
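The share of bias removed by SIMEX follows directly from the percentage biases reported for the treatment effect in Table 2 (31.72 for the naive estimator and 5.01 for SIMEX). The short calculation below reproduces that arithmetic; the function name `bias_removed` is hypothetical.

```python
def bias_removed(naive_bias, corrected_bias):
    """Percentage of the naive bias that the corrected estimator removes."""
    return 100.0 * (naive_bias - corrected_bias) / naive_bias


# Percentage biases of the treatment-effect estimate taken from Table 2
print(round(bias_removed(31.72, 5.01), 1))  # 84.2
```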
Table 2. Estimation results from the data-based simulation

| Part | Parameter | True value | Average | Bias (%) | Coverage (%) | MSE |
|---|---|---|---|---|---|---|
| I (True) | Depression | 1.5 | 1.50 | 0.21 | 94.8 | 0.006 |
| I (True) | Race | 1.5 | 1.50 | 0.06 | 95.5 | 0.007 |
| I (True) | Welfare | 1.5 | 1.51 | 0.39 | 94.1 | 0.016 |
| I (True) | Parents’ education | 1.5 | 1.50 | 0.10 | 93.6 | 0.010 |
| I (True) | BMI | 1.5 | 1.50 | 0.00 | 94.2 | 0.000 |
| I (True) | Delinquent behavior | 1.5 | 1.51 | 0.34 | 94.1 | 0.009 |
| I (True) | Heavy drinking or smoking | 1.5 | 1.50 | 0.06 | 95.4 | 0.008 |
| II (SIMEX) | Depression | 1.5 | 1.42 | 5.01 | 93.9 | 0.006 |
| II (SIMEX) | Race | 1.5 | 1.52 | 1.44 | 96.1 | 0.007 |
| II (SIMEX) | Welfare | 1.5 | 1.48 | 1.55 | 96.5 | 0.016 |
| II (SIMEX) | Parents’ education | 1.5 | 1.47 | 1.78 | 94.4 | 0.010 |
| II (SIMEX) | BMI | 1.5 | 1.55 | 3.29 | 100 | 0.000 |
| II (SIMEX) | Delinquent behavior | 1.5 | 1.52 | 1.47 | 95.7 | 0.009 |
| II (SIMEX) | Heavy drinking or smoking | 1.5 | 1.49 | 0.62 | 96.9 | 0.008 |
| III (Naive) | Depression | 1.5 | 1.98 | 31.72 | 11.9 | 0.249 |
| III (Naive) | Race | 1.5 | 1.32 | 11.8 | 81.4 | 0.053 |
| III (Naive) | Welfare | 1.5 | 1.57 | 4.63 | 95.7 | 0.055 |
| III (Naive) | Parents’ education | 1.5 | 1.64 | 9.32 | 87.9 | 0.051 |
| III (Naive) | BMI | 1.5 | 1.38 | 7.94 | 0.0 | 0.014 |
| III (Naive) | Delinquent behavior | 1.5 | 1.31 | 12.96 | 81.1 | 0.061 |
| III (Naive) | Heavy drinking or smoking | 1.5 | 1.59 | 5.86 | 94.2 | 0.032 |
7. CONCLUSIONS
In this article we propose a new structure of measurement error that has the typical classical measurement error structure as a special case. We found that using a covariate measured with error can lead to biases in the estimation of the ATE in non-experimental studies even when a doubly robust estimator is utilized. Our theoretical results and simulation study suggest that the SIMEX estimator can help to mitigate this problem, and a data-based simulation suggests that the SIMEX estimator can remove up to 84% of the bias introduced in the estimation of the treatment effect by using the covariate measured with error. It is important to highlight that the SIMEX estimator also helps to reduce the bias of the other estimated coefficients in the outcome model.
Compared to other methods that address measurement error in a causal framework (see Stürmer and others, 2005), our use of SIMEX only requires information related to the covariate and its surrogate in the validation sample. The data-based simulations suggest that this methodology can be applied to complex data structures with multiple binary covariates, which is an improvement over the Multiple Imputation for External Calibration approach (see Webb-Vargas and others, 2015), which assumes joint multivariate normality of the covariates. Future work should further investigate the relative performance of these methods under a wider range of settings.
The main limitation of the SIMEX approach is the assumption that the parametric extrapolation model is correctly specified. This assumption is not testable, and future work should investigate how robust the SIMEX estimator is to different model specifications (see Cook and Stefanski, 1994). In addition, the method presented in this article only considers the case of a linear outcome model; further work will concentrate on extending this approach to different parameterizations, such as general linear models.
In conclusion, we have shown that estimating an ATE using a doubly robust estimator in non-experimental studies can lead to significant biases when a mismeasured covariate is used in the estimation. However, the SIMEX estimator can be used to mitigate this problem. This extension is particularly relevant to public health research, where measurement error tends to be the rule rather than the exception.
SUPPLEMENTARY MATERIAL
Supplementary material is available at http://biostatistics.oxfordjournals.org.
ACKNOWLEDGMENTS
The authors thank Elizabeth M. Sweeney and John Muschelli for comments that greatly improved this manuscript.
Conflict of Interest: None declared.
FUNDING
National Institute of Mental Health (R01MH099010; PI to E.A.S.).
REFERENCES
- Abadie Alberto and Imbens Guido W. (2016). Matching on the estimated propensity score. Econometrica 84 (1), 781.
- Akee Randall. (2011). Errors in self-reported earnings: the role of previous earnings volatility and individual characteristics. Journal of Development Economics 96 (2), 409.
- Bound John and Krueger Alan B. (1991). The extent of measurement error in longitudinal earnings data: do two wrongs make a right? Journal of Labor Economics 9, 1.
- Carroll Raymond J, Küchenhoff Helmut, Lombard F and Stefanski Leonard A. (1996). Asymptotics for the SIMEX estimator in nonlinear measurement error models. Journal of the American Statistical Association 91 (433), 242.
- Cole Stephen R, Chu Haitao and Greenland Sander. (2006). Multiple-imputation for measurement-error correction. International Journal of Epidemiology 35 (4), 1074.
- Cook John R and Stefanski Leonard A. (1994). Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association 89 (428), 1314.
- Edwards Jessie K, Cole Stephen R, Westreich Daniel, Crane Heidi, Eron Joseph J, Mathews W Christopher, Moore Richard, Boswell Stephen L, Lesko Catherine R, Mugavero Michael J and others. (2015). Multiple imputation to account for measurement error in marginal structural models. Epidemiology 26 (5), 645.
- Goetghebeur Els and Vansteelandt Stijn. (2005). Structural mean models for compliance analysis in randomized clinical trials and the impact of errors on measures of exposure. Statistical Methods in Medical Research 14 (4), 397.
- Goodman Elizabeth and Whitaker Robert C. (2002). A prospective study of the role of depression in the development and persistence of adolescent obesity. Pediatrics 110 (3), 497.
- Yi Grace Y. (2008). A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates. Biostatistics 9 (3), 501.
- Guo Ying, Little Roderick J and McConnell Daniel S. (2012). On using summary statistics from an external calibration sample to correct for covariate measurement error. Epidemiology 23 (1), 165.
- Harris K. M., Halpern C. T., Whitsel E., Hussey J., Tabor J., Entzel P. and Udry J. R. (2009). The National Longitudinal Study of Adolescent to Adult Health: research design. http://www.cpc.unc.edu/projects/addhealth/design (Accessed May 15, 2015).
- Heid IM, Küchenhoff H, Miles J, Kreienbrock L and Wichmann HE. (2004). Two dimensions of measurement error: classical and Berkson error in residential radon exposure assessment. Journal of Exposure Science and Environmental Epidemiology 14 (5), 365.
- Lockwood J and McCaffrey D. (2015). Simulation-extrapolation for estimating means and causal effects with mismeasured covariates. Observational Studies 1, 241–290.
- McCaffrey Daniel F, Lockwood JR and Setodji Claude M. (2013). Inverse probability weighting with error-prone covariates. Biometrika 100, ast022.
- Pettersen Betty J, Anousheh Ramtin, Fan Jing, Jaceldo-Siegl Karen and Fraser Gary E. (2012). Vegetarian diets and blood pressure among white subjects: results from the Adventist Health Study-2 (AHS-2). Public Health Nutrition 15 (10), 1909.
- Plankey Michael W, Stevens June, Flegal Katherine M and Rust Philip F. (1997). Prediction equations do not eliminate systematic error in self-reported body mass index. Obesity Research 5 (4), 308.
- Robins James M. (2003). General methodological considerations. Journal of Econometrics 112 (1), 89.
- Robins James, Sued Mariela, Lei-Gomez Quanhong and Rotnitzky Andrea. (2007). Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Statistical Science 22 (4), 544.
- Rosenbaum Paul R and Rubin Donald B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 (1), 41.
- Rosner B, Spiegelman D and Willett WC. (1990). Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology 132 (4), 734.
- Rotnitzky Andrea, Robins James M and Scharfstein Daniel O. (1998). Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association 93 (444), 1321.
- Rubin Donald B and Stuart Elizabeth A. (2006). Affinely invariant matching methods with discriminant mixtures of proportional ellipsoidally symmetric distributions. The Annals of Statistics 34, 1814.
- Rubin Donald B and Thomas Neal. (1996). Matching using estimated propensity scores: relating theory to practice. Biometrics 52, 249.
- Saint-Maurice Pedro F, Welk Gregory J, Beyler Nicholas K, Bartee Roderick T and Heelan Kate A. (2014). Calibration of self-report tools for physical activity research: the Physical Activity Questionnaire (PAQ). BMC Public Health 14 (1), 1.
- Steiner Peter M, Cook Thomas D and Shadish William R. (2011). On the importance of reliable covariate measurement in selection bias adjustments using propensity scores. Journal of Educational and Behavioral Statistics 36 (2), 213.
- Stommel Manfred and Schoenborn Charlotte A. (2009). Accuracy and usefulness of BMI measures based on self-reported weight and height: findings from the NHANES & NHIS 2001-2006. BMC Public Health 9 (1), 1.
- Stuart Elizabeth A. (2010). Matching methods for causal inference: a review and a look forward. Statistical Science: A Review Journal of the Institute of Mathematical Statistics 25 (1), 1.
- Stürmer Til, Schneeweiss Sebastian, Avorn Jerry and Glynn Robert J. (2005). Adjusting effect estimates for unmeasured confounding with validation data using propensity score calibration. American Journal of Epidemiology 162 (3), 279.
- Webb-Vargas Yenny, Rudolph Kara E, Lenis David, Murakami Peter and Stuart Elizabeth A. (2015). An imputation-based solution to using mismeasured covariates in propensity score analysis. Statistical Methods in Medical Research. pii: 0962280215588771.
- Wingood Gina M, DiClemente Ralph J, Harrington Kathy and Davies Susan L. (2002). Body image and African American females’ sexual health. Journal of Women’s Health & Gender-Based Medicine 11 (5), 433.