Abstract
We describe an estimator of the parameter indexing a model for the conditional odds ratio between a binary exposure and a binary outcome given a high-dimensional vector of confounders, when the exposure and a subset of the confounders are missing, not necessarily simultaneously, in a subsample. We argue that a recently proposed estimator restricted to complete cases confers more protection against model misspecification than existing ones, in the sense that the set of data laws under which it is consistent strictly contains each set of data laws under which each of the previous estimators is consistent.
Keywords: Inverse probability weighted, Logistic regression, Missing at random, Model misspecification
1. Introduction
A common aim of epidemiologic observational studies is to evaluate the causal effect of a dichotomous point exposure A on the risk of a disease outcome encoded in a binary variable Y. In an effort to account for confounding bias in such nonexperimental studies, investigators routinely collect, and adjust for in the data analysis, a large number of confounding factors encoded in a vector L. A common target of inference is a finite-dimensional parameter ψ indexing a model OR(L; ψ) for the conditional log-odds ratio function:
OR(L) = log[{pr(Y = 1 | A = 1, L) pr(Y = 0 | A = 0, L)}/{pr(Y = 1 | A = 0, L) pr(Y = 0 | A = 1, L)}] = OR(L; ψ), (1)
where OR(· ; ·) is a known function and the true value of ψ is unknown. Under the assumption of no unmeasured confounders, OR(L) quantifies, on the log-odds ratio scale, the causal effect of A on Y. The function OR(·) is attractive as it is invariant to alterations in the marginal distributions of (Y, L) or of (A, L). As a consequence, ψ can be consistently estimated from data (Yi, Ai, Li) (i = 1, . . . , n) collected under a variety of commonly employed epidemiological designs, in particular, simple random sample designs and case-control designs unmatched or matched on some components of L.
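This invariance is easy to check numerically. The sketch below (with an arbitrary joint law of (Y, A) at a fixed value of L; all numbers are illustrative, not from the paper) tilts the marginal law of A, and then of Y, and verifies that the conditional log-odds ratio is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_or(p):
    # p[y, a]: joint probabilities of (Y, A) at a fixed confounder value L
    return np.log(p[1, 1] * p[0, 0] / (p[1, 0] * p[0, 1]))

p = rng.dirichlet(np.ones(4)).reshape(2, 2)  # arbitrary joint law of (Y, A) given L

# Tilt the marginal law of A: multiply each column by g(a) and renormalise.
g = np.array([0.3, 2.5])
q = p * g[np.newaxis, :]
q /= q.sum()

# Tilt the marginal law of Y similarly.
h = np.array([5.0, 0.1])
r = p * h[:, np.newaxis]
r /= r.sum()

assert np.isclose(log_or(p), log_or(q))
assert np.isclose(log_or(p), log_or(r))
```

The cancellation is exact: any factor depending on A alone, or on Y alone, appears in both the numerator and the denominator of the odds ratio.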
In this note, we consider the estimation of ψ when Y and a subset Lobs of the confounders L are always observed but A and the remaining components Lmis of L are not fully observed in all study participants: perhaps only A is missing in some units, only a random subset of Lmis is missing in others, and both are missing in a third subsample. We examine inference under the assumptions that:
pr(R = 1 | A, Y, L) = pr(R = 1 | Lobs, Y) (2)
and pr(R = 1 | Lobs, Y) > 0, where R = 1 if A and Lmis are jointly observed, and R = 0 otherwise. When A and Lmis can only be either both observed or both missing simultaneously, (2) is a particular instance of the assumption that the data are missing at random (Little & Rubin, 2002) which is routinely made in the literature on missing covariate data (Robins et al., 1994; Zhao et al., 1996; Lipsitz et al., 1998; Little & Rubin, 2002; Parzen et al., 2002). For concreteness, we assume that (Ri, Yi, Ai, Li) (i = 1, . . . , n) are independent and identically distributed. Our discussion equally applies when the data are collected from a case-control design unmatched or matched on some of the confounders L. We argue that an estimator of ψ recently proposed by Tchetgen Tchetgen et al. (2010), computed just from the complete-case units, i.e., those for which A and Lmis are observed, is more robust than currently available estimators in the sense that the set of data laws under which it is consistent strictly contains each set of data laws under which each of the previous estimators is consistent.
2. Dimension reducing estimation strategies
The right-hand side of (1) and the baseline log-odds function
oddsY(L) = log{pr(Y = 1 | A = 0, L)/pr(Y = 0 | A = 0, L)} (3)
determine the conditional probability function pr(Y = 1 | A, L). In the absence of missing data, to ensure maximal robustness against model misspecification, it is desirable to estimate ψ under a model that assumes just (1) rather than under a model that parameterizes the regression function pr(Y = 1 | A, L) and consequently parameterizes the right-hand sides of both (1) and (3). Tchetgen Tchetgen et al. (2010) showed that due to the curse of dimensionality this is not feasible when, as is often the case in applications seeking to render the assumption of no unmeasured confounders plausible, L is a vector with several components. The situation is aggravated when (A, Lmis) are missing by happenstance, as then the conditional missingness probability function pr(R = 1 | Lobs, Y) is also unknown. The following dimension reduction strategies appear to be available. One strategy is to assume (a) a parametric model oddsA(L; α) for
oddsA(L) = log{pr(A = 1 | Y = 0, L)/pr(A = 0 | Y = 0, L)}. (4)
Model (1) coupled with model oddsA(L; α) for (4) renders the parametric retrospective logistic regression model for A on Y and L,
pr(A = 1 | Y, L; ψ, α) = [1 + exp{−OR(L; ψ)Y − oddsA(L; α)}]⁻¹. (5)
The problem then reduces to estimation of a parametric logistic regression function with incomplete observations of the outcomes A and/or a subset Lmis of the covariates, when the probability of observing both A and Lmis depends only on the observed model covariates Y and Lobs. Under such conditions, it is well known that the standard logistic regression estimator of ψ, say ψ̃retro, restricted to the complete-case units, i.e., the units with no missing data, is consistent and asymptotically normal. In fact, when Lobs = L so that Lmis is empty, ψ̃retro is semiparametric efficient under the model that assumes (5) and condition (2) on the missing data mechanism, so under just these assumptions, no information is lost in large samples by discarding incomplete units (Robins et al., 1994).
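As an illustration, the sketch below simulates the simplest version of this setting, with Lobs = L so that only A is missing; the data-generating values (ψ = 1 and the various intercepts and slopes) are arbitrary choices, not taken from the paper. It fits the retrospective logistic regression of A on (Y, L) by Newton-Raphson using the complete cases only; the coefficient on Y estimates ψ:

```python
import numpy as np

rng = np.random.default_rng(0)
n, psi_true = 20000, 1.0

# Simulated full data: L is always observed, A is missing when R = 0.
L = rng.normal(size=n)
Y = rng.binomial(1, 1 / (1 + np.exp(-0.3 * L)))
A = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + psi_true * Y + 0.4 * L))))
# Missingness depends only on the observed (Y, L), as in assumption (2).
R = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 0.5 * Y - 0.5 * L))))

def fit_logistic(X, y, n_iter=25):
    # Plain Newton-Raphson for logistic regression.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        beta += np.linalg.solve((X * (p * (1 - p))[:, None]).T @ X,
                                X.T @ (y - p))
    return beta

cc = R == 1  # complete-case units
X = np.column_stack([np.ones(cc.sum()), Y[cc], L[cc]])
psi_retro = fit_logistic(X, A[cc])[1]  # coefficient on Y estimates psi
```

With n = 20000 the complete-case fit recovers ψ up to Monte Carlo error, even though roughly a third of the units are discarded.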
An alternative strategy would be to assume (b) a parametric model oddsY (L; η) for oddsY (L), which, in conjunction with model (1), renders the parametric logistic regression model for Y on A and L,
pr(Y = 1 | A, L; ψ, η) = [1 + exp{−OR(L; ψ)A − oddsY(L; η)}]⁻¹. (6)
The problem then reduces to estimation under a logistic regression model with a subset of the covariates incompletely observed and probability of complete observations that depends at most on the outcome and on the always observed covariates. However, when L is high-dimensional, this dimension-reducing strategy is insufficient to avoid the curse of dimensionality and further model restrictions are needed. One option is to postulate, in addition to (b), a parametric model f (A, Lmis | Lobs; υ) for f (A, Lmis | Lobs) and to conduct maximum likelihood estimation of the model parameters (ψ, η, υ). This strategy, however, leads to even more restrictions than the sole restriction (a) required by the retrospective logistic regression estimator ψ̃retro, because it leads not only to a parametric model for f (A | Y, L) but also for the joint law f (Y, A, Lmis | Lobs).
Alternatively, one might consider assuming (b) and (c) a parametric model pr(R = 1 | Lobs, Y ; λ) for pr(R = 1 | Lobs, Y), thus arriving at the problem of estimating the logistic regression model (6) under a parametric model for the probability of complete observations. In such a setting, the weighted logistic regression estimator of ψ, say ψ̃ipw, fitted in model (6) to the complete-case units with weights equal to π̂⁻¹ = pr(R = 1 | Lobs, Y; λ̂)⁻¹, where λ̂ is the maximum likelihood estimator of λ, is consistent and asymptotically normal (Robins et al., 1994). This estimator, known as the inverse probability weighted estimator, was used, for example, in Moore et al. (2009), in a setting in which only A was missing.
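A sketch of this two-step procedure under the same simplification as before (only A missing, Lobs = L; all parameter values arbitrary): logistic regression of R on (Y, L) over all units gives λ̂ and π̂, and a weighted logistic regression of Y on (A, L) over the complete cases gives ψ̃ipw:

```python
import numpy as np

rng = np.random.default_rng(1)
n, psi_true = 20000, 1.0

# Simulated data in which the prospective model (6) holds exactly.
L = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-0.2 * L)))
Y = rng.binomial(1, 1 / (1 + np.exp(-(-0.3 + psi_true * A + 0.5 * L))))
R = rng.binomial(1, 1 / (1 + np.exp(-(0.8 + 0.4 * Y - 0.3 * L))))

def fit_logistic(X, y, w=None, n_iter=25):
    # Newton-Raphson for (optionally weighted) logistic regression.
    w = np.ones(len(y)) if w is None else w
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        beta += np.linalg.solve((X * (w * p * (1 - p))[:, None]).T @ X,
                                X.T @ (w * (y - p)))
    return beta

# Step 1: maximum likelihood fit of the missingness model (c) on all units.
Xr = np.column_stack([np.ones(n), Y, L])
pi_hat = 1 / (1 + np.exp(-Xr @ fit_logistic(Xr, R)))

# Step 2: inverse probability weighted fit of (6) on the complete cases.
cc = R == 1
Xy = np.column_stack([np.ones(cc.sum()), A[cc], L[cc]])
psi_ipw = fit_logistic(Xy, Y[cc], w=1 / pi_hat[cc])[1]  # coefficient on A
```

Weighting the complete cases by π̂⁻¹ undoes the outcome- and covariate-dependent selection induced by conditioning on R = 1.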
It is possible to construct a so-called augmented inverse probability weighted estimator of ψ, say ψ̃aipw, which is more robust than ψ̃ipw. Like ψ̃ipw, the estimator ψ̃aipw is consistent if (b) and (c) hold, but unlike ψ̃ipw, it is also consistent even if (c) fails, provided (a), (b) and (d) a parametric model f (Lmis | Lobs, Y ; τ) for f (Lmis | Lobs, Y) hold. To compute ψ̃aipw one first computes, using only the complete-case units, the maximum likelihood estimators τ̂ and (α̂retro, ψ̃retro) of τ and (α, ψ) under models (d) and (5), respectively. The estimator ψ̃aipw of ψ is the subvector of (ψ̃ᵀ, η̃ᵀ)ᵀ solving

En[R π̂⁻¹ S(ψ, η) − (R − π̂) π̂⁻¹ Ê{S(ψ, η) | Y, Lobs}] = 0,

where S(ψ, η) = (1, A, Lᵀ)ᵀ{Y − pr(Y = 1 | A, L; ψ, η)}, Ê(· | Y, Lobs) is the expectation with respect to f̂(A, Lmis | Lobs, Y) = f (A | L, Y ; α̂retro, ψ̃retro) f (Lmis | Lobs, Y ; τ̂) and here and throughout En(·) denotes the empirical mean operator. Tchetgen Tchetgen (2009) developed a simple algorithm for implementing this estimator when L is fully observed.
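The construction can be sketched for the simplest case in which only A is missing (Lmis empty), so that the augmentation term averages S over the two values of A under the fitted retrospective law. The estimating equation used below is the generic augmented inverse probability weighted form of Robins et al. (1994), solved numerically; all data-generating values are arbitrary:

```python
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(2)
n, psi_true = 20000, 1.0

# Joint law f(Y, A | L) proportional to exp(psi*A*Y + eta_y(L)*Y + eta_a(L)*A):
# both the prospective and the retrospective conditionals are then logistic.
L = rng.normal(size=n)
eta_y, eta_a = -0.3 + 0.5 * L, 0.2 - 0.4 * L
cells = np.stack([np.zeros(n), eta_a, eta_y, psi_true + eta_y + eta_a], axis=1)
probs = np.exp(cells)
probs /= probs.sum(axis=1, keepdims=True)
draw = (probs.cumsum(axis=1) < rng.uniform(size=(n, 1))).sum(axis=1)
Y, A = draw // 2, draw % 2  # cell order: (0,0), (0,1), (1,0), (1,1)
R = rng.binomial(1, 1 / (1 + np.exp(-(0.6 + 0.3 * Y - 0.2 * L))))

expit = lambda x: 1 / (1 + np.exp(-x))

def fit_logistic(X, y, n_iter=25):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        beta += np.linalg.solve((X * (p * (1 - p))[:, None]).T @ X,
                                X.T @ (y - p))
    return beta

cc = R == 1

# Retrospective fit on complete cases gives the fitted law of A given (Y, L).
b = fit_logistic(np.column_stack([np.ones(cc.sum()), Y[cc], L[cc]]), A[cc])
p_a1 = expit(b[0] + b[1] * Y + b[2] * L)  # fitted pr(A = 1 | Y, L)

# Missingness model fit on all units.
lam = fit_logistic(np.column_stack([np.ones(n), Y, L]), R)
pi_hat = expit(lam[0] + lam[1] * Y + lam[2] * L)

def score(theta, a):
    # S(psi, eta) = (1, a, L)' * {Y - expit(psi*a + eta0 + eta1*L)}
    psi, e0, e1 = theta
    resid = Y - expit(psi * a + e0 + e1 * L)
    return np.stack([resid, a * resid, L * resid])

def estimating_eq(theta):
    s_obs = score(theta, np.where(cc, A, 0))  # A value irrelevant where R = 0
    s_aug = (1 - p_a1) * score(theta, 0) + p_a1 * score(theta, 1)
    U = (R / pi_hat) * s_obs - ((R - pi_hat) / pi_hat) * s_aug
    return U.mean(axis=1)

sol = root(estimating_eq, np.zeros(3))
psi_aipw = sol.x[0]
```

The first term is the inverse probability weighted score; the augmentation term recovers information from the incomplete units and delivers the extra layer of protection described above.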
In summary, of the dimension reducing procedures examined so far, the two which are consistent under the weakest modelling assumptions are:

(i) the estimator ψ̃retro, which requires that model (a) be correct;

(ii) the estimator ψ̃aipw, which requires that either models (b) and (c), or models (a), (b) and (d), be correct.
3. The proposed estimator
We now argue that an estimator of ψ recently derived in Tchetgen Tchetgen et al. (2010) computed just from the complete-case units confers more robustness against model misspecification than the estimators ψ̃retro and ψ̃aipw. To compute this estimator, one first computes the standard logistic regression estimators α̂ of α and η̂ of η in models (a) and (b) using only data from complete-case units. One then discards the separate estimators of ψ obtained from the fit of each model and instead computes the desired estimator, say ψ̂, by solving the estimating equations En{U(ψ; η̂, α̂)} = 0, where U is the doubly robust estimating function derived in Tchetgen Tchetgen et al. (2010).
Solving the estimating equation may appear computationally challenging, but this is not the case as Tchetgen Tchetgen & Rotnitzky (2011) provide an iterative algorithm that implements ψ̂ with standard logistic regression software. The algorithm is reproduced in the online Supplementary Material for completeness. These authors showed that in the absence of missing data, ψ̂ is consistent and asymptotically normal for ψ provided either oddsA(L; α) is a correct model for oddsA(L) or oddsY(L; η) is a correct model for oddsY(L), but not necessarily both assertions hold. This result immediately implies that with missing data, ψ̂ is an estimator of the parameter ψ indexing a model OR(L; ψ) for the complete-case log-OR function:

ORcc(L) = log[{pr(Y = 1 | A = 1, L, R = 1) pr(Y = 0 | A = 0, L, R = 1)}/{pr(Y = 1 | A = 0, L, R = 1) pr(Y = 0 | A = 1, L, R = 1)}],

which is consistent and asymptotically normal for ψ provided either oddsA(L; α) is a correct model for oddsAcc(L) or oddsY(L; η) is a correct model for oddsYcc(L), but not necessarily both, where

oddsAcc(L) = log{pr(A = 1 | Y = 0, L, R = 1)/pr(A = 0 | Y = 0, L, R = 1)},
oddsYcc(L) = log{pr(Y = 1 | A = 0, L, R = 1)/pr(Y = 0 | A = 0, L, R = 1)}

are the complete-case baseline log-odds functions.
Under (2), ORcc(L) = OR(L) because the full-data law f (A, Y, L) differs from the complete-case data law f (A, Y, L | R = 1) only in that f (Y, Lobs) is distinct from f (Y, Lobs | R = 1) and, as noted earlier, OR(L) remains unchanged under departures from the marginal law of (Y, L). Thus, a model OR(L; ψ) for ORcc(L) is also a model for OR(L). Furthermore, because oddsAcc(L) depends solely on pr(A = 1 | Y = 0, L, R = 1) and this is equal to pr(A = 1 | Y = 0, L) under (2), then oddsAcc(L) = oddsA(L) and thus model oddsA(L; α) is also a model for oddsA(L).
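Both equalities can be confirmed by direct enumeration. The sketch below builds an arbitrary joint law f(Y, A, L) with a single binary confounder (so Lobs = L), reweights each (Y, A) cell by a missingness probability that is free of A, and checks that neither the baseline log-odds of A nor the conditional log-odds ratio changes under conditioning on R = 1:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)  # f[y, a, l]: joint law of (Y, A, L)
pi = rng.uniform(0.2, 0.9, size=(2, 2))         # pr(R = 1 | Y = y, L = l): free of A

def logit(p):
    return np.log(p / (1 - p))

for l in (0, 1):
    # Full-data baseline odds of A: logit pr(A = 1 | Y = 0, L = l).
    odds_a = logit(f[0, 1, l] / f[0, :, l].sum())
    # Complete-case law: each (y, a) cell reweighted by pi[y, l].
    fcc = f[:, :, l] * pi[:, l][:, None]
    odds_a_cc = logit(fcc[0, 1] / fcc[0, :].sum())
    assert np.isclose(odds_a, odds_a_cc)  # oddsAcc(L) = oddsA(L)
    # The conditional log odds ratio is likewise unchanged: ORcc(L) = OR(L).
    lor = np.log(f[1, 1, l] * f[0, 0, l] / (f[1, 0, l] * f[0, 1, l]))
    lor_cc = np.log(fcc[1, 1] * fcc[0, 0] / (fcc[1, 0] * fcc[0, 1]))
    assert np.isclose(lor, lor_cc)
```

The factor pi[y, l] enters the numerator and denominator of each quantity symmetrically in A, so it cancels exactly, mirroring the argument in the text.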
We thus arrive at the conclusion that under (2), ψ̂ is consistent for ψ if one of Conditions 1 and 2 below holds, but not necessarily both.
Condition 1. Model (a) is correctly specified.
Condition 2. A parametric model for oddsYcc(L) is correctly specified.
The identity

oddsYcc(L) = oddsY(L) + log{pr(R = 1 | Lobs, Y = 1)/pr(R = 1 | Lobs, Y = 0)} (7)

implies that a parametric model for the full-data baseline log-odds oddsY(L) and another for pr(R = 1 | L, Y) = pr(R = 1 | Lobs, Y) determine a parametric model for oddsYcc(L). Consequently Condition 2 is met, in particular, if Condition 3 is met and a parametric model for oddsYcc(L) is constructed by combining (b) and (c) via the identity (7).

Condition 3. Models (b) and (c) are correctly specified.
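The identity relating oddsYcc to oddsY and the missingness probabilities can likewise be checked by enumeration. In the sketch below (arbitrary positive cell probabilities, a single binary confounder with Lobs = L), the complete-case baseline log-odds of Y equals the full-data value plus the log of the ratio of missingness probabilities at Y = 1 and Y = 0:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)  # f[y, a, l]: joint law of (Y, A, L)
pi = rng.uniform(0.2, 0.9, size=(2, 2))         # pr(R = 1 | Y = y, L = l): free of A

for l in (0, 1):
    # Full-data baseline log-odds of Y: log pr(Y=1 | A=0, L=l)/pr(Y=0 | A=0, L=l).
    odds_y = np.log(f[1, 0, l] / f[0, 0, l])
    # Complete-case conditional law of Y given (A = 0, L = l, R = 1).
    fcc = f[:, :, l] * pi[:, l][:, None]
    p_y_cc = fcc[:, 0] / fcc[:, 0].sum()
    odds_y_cc = np.log(p_y_cc[1] / p_y_cc[0])
    # Identity: oddsYcc(l) = oddsY(l) + log{pi(1, l)/pi(0, l)}.
    assert np.isclose(odds_y_cc, odds_y + np.log(pi[1, l] / pi[0, l]))
```

This is exactly the construction behind Condition 3: a model for oddsY and a model for the missingness probabilities jointly induce a model for oddsYcc.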
In summary, (iii) consistency of ψ̂ requires that either model (a) or models (b) and (c) be correct.
Contrasting (iii) with (i) and (ii), we observe that the conditions for consistency of ψ̂ are met if and only if either the conditions for consistency of ψ̃retro or those for consistency of ψ̃aipw are met, but not necessarily both. We thus conclude that ψ̂ confers more protection against model misspecification than either of the estimators ψ̃retro and ψ̃aipw. This in turn implies that ψ̂ confers more protection against model misspecification than the prospective parametric likelihood-based estimator and the estimator ψ̃ipw discussed in § 2, as consistency of the former requires even more stringent conditions than those for consistency of ψ̃retro and consistency of the latter requires even more stringent conditions than those for consistency of ψ̃aipw.
4. Simulation study
We compared ψ̂ with ψ̃retro and ψ̃aipw in a simulation study in which A is sometimes missing but Lobs = L is always observed. We simulated 500 random samples of (Y, RA, R, L) of size 2000 according to L1 ∼ N(0, 0.9²), f (Y, A | L) ∝ exp(−0.5Y + 0.5Y L1 + 0.4Y L2 + 0.7Y L1 L2 + AY + 0.8A − 0.4AL1 + 0.5AL2 + 0.6AL1 L2) and R | (Y, A, L) ∼ Ber[{1 + exp(0.3L1 + 0.3L2Y + 0.4Y L1 L2)}⁻¹]. Table 1 reports the results for the estimators ψ̂, ψ̃retro and ψ̃aipw in model OR(L; ψ) = ψ computed under all eight combinations of correct or incorrect models for oddsA(L), oddsY(L) and pr(R = 1 | L, Y), labelled (a), (b) and (c), respectively. Correct models used oddsA(L; α) = α0 + α1L1 + α2L2 + α3L1L2, oddsY(L; η) = η0 + η1L1 + η2L2 + η3L1L2 and pr(R = 1 | L, Y ; λ) = {1 + exp(−λ0 − λ1L1 − λ2L2 − λ3Y L2 − λ4Y L1 L2)}⁻¹. Incorrect models omitted the terms in L1L2. According to theory, ψ̂ is consistent in lines 1–5, ψ̃retro is consistent in lines 1, 2, 3 and 5, and ψ̃aipw is consistent in lines 1, 2 and 4. The Monte Carlo bias of each estimator reported in the corresponding lines is small, but the biases of ψ̃retro in line 4 and of ψ̃aipw in lines 3 and 5 are substantial, thus illustrating the robustness advantage of ψ̂. According to theory, none of the three estimators is consistent in lines 6–8, a fact reflected in the substantial Monte Carlo bias observed for all three estimators in these lines, with the exception of the biases of ψ̃aipw and ψ̂ in line 6. This exception is not supported by theory and is thus likely particular to our data generating process.
Table 1.
Simulation results
| Line | Models correct | | ψ̃retro | ψ̃aipw | ψ̂ |
|---|---|---|---|---|---|
| 1 | (a), (b), (c) | Bias | −0.001 | 0.0003 | 0.0007 |
| | | Variance | 0.0285 | 0.029 | 0.029 |
| 2 | (a), (b) | Bias | −0.001 | 0.00005 | 0.003 |
| | | Variance | 0.028 | 0.031 | 0.029 |
| 3 | (a), (c) | Bias | −0.001 | 0.242 | 0.00041 |
| | | Variance | 0.028 | 0.029 | 0.029 |
| 4 | (b), (c) | Bias | 0.184 | −0.043 | −0.010 |
| | | Variance | 0.024 | 0.028 | 0.025 |
| 5 | (a) | Bias | −0.001 | −0.240 | −0.005 |
| | | Variance | 0.028 | 0.028 | 0.027 |
| 6 | (b) | Bias | 0.184 | −0.028 | 0.006 |
| | | Variance | 0.024 | 0.033 | 0.029 |
| 7 | (c) | Bias | 0.184 | 0.210 | 0.190 |
| | | Variance | 0.024 | 0.031 | 0.028 |
| 8 | – | Bias | 0.184 | 0.215 | 0.193 |
| | | Variance | 0.024 | 0.026 | 0.024 |

(a), model for oddsA(L); (b), model for oddsY(L); (c), model for pr(R = 1 | Y, Lobs).
As indicated in § 2, according to theory, when oddsA(L) is correctly modelled, as in lines 1, 2, 3 and 5, ψ̃retro is indeed asymptotically efficient, in spite of being based only on complete-case units. A comparison of the Monte Carlo variances of ψ̃retro and ψ̃aipw in lines 1 and 2, corresponding to cases in which both are consistent, confirms this result. A comparison of the Monte Carlo variances of ψ̃retro and ψ̂ in lines 1, 2, 3 and 5, the cases in which both are consistent, further indicates that the new complete-case estimator ψ̂ does not incur any sizeable efficiency loss. Furthermore, a comparison of the variances of ψ̂ and ψ̃aipw in line 4, a situation in which both are consistent, additionally illustrates the fact that even though ψ̃aipw uses data from incomplete-case units, it can be less efficient than ψ̂ when the model for oddsA(L) is incorrectly specified.
5. Extensions to nonbinary outcomes and/or treatments
For the situations in which A and/or Y are not binary, a number of recent articles have focused attention on the estimation of the parameter ψ indexing an odds ratio model OR(Y, A, L; ψ) for OR(Y, A, L) = log[{ f (Y | A, L) f (y0 | a0, L)}/{ f (y0 | A, L) f (Y | a0, L)}] where (y0, a0) is a user-specified point in the sample space (Chen, 2007; Osius, 2009; Tchetgen Tchetgen et al., 2010; Tchetgen Tchetgen, 2010). As in the binary case, OR(Y, A, L) is an attractive target of inference as it is invariant to alterations in the marginal laws of (Y, L) or of (A, L). Tchetgen Tchetgen et al. (2010) describe estimators of ψ that are consistent and asymptotically normal provided one of the following two models is correctly specified, but not necessarily both.
Model 1. A model oddsA(A, L; α) for oddsA(A, L) = log{ f (A|y0, L)/ f (a0| y0, L)}.
Model 2. A model oddsY (Y, L; η) for oddsY (Y, L) = log{ f (Y |a0, L)/ f (y0|a0, L)}.
Just as in the binary case, under the missing data conditions of this note, the estimators of Tchetgen Tchetgen et al. (2010) restricted to the complete-case units are consistent and asymptotically normal for ψ so long as oddsA(A, L; α) is a correctly specified model for oddsA(A, L) or oddsY(Y, L; η) is now a correctly specified model for oddsYcc(Y, L) = log{ f (Y | a0, L, R = 1)/ f (y0 | a0, L, R = 1)}, but not necessarily both.
Supplementary material
Supplementary material available at Biometrika online includes an iterative algorithm.
Acknowledgments
The authors were funded by grants from the U.S. National Institutes of Health. The authors wish to thank the editor and the reviewers for helpful comments. Andrea Rotnitzky is also affiliated with the Harvard School of Public Health.
References
- Chen YH. A semi-parametric odds ratio model for measuring association. Biometrics. 2007;63:413–21. doi: 10.1111/j.1541-0420.2006.00701.x.
- Lipsitz S, Parzen M, Ewell M. Inference using conditional logistic regression with missing covariates. Biometrics. 1998;54:295–303.
- Little R, Rubin D. Statistical Analysis with Missing Data. 2nd ed. New York: John Wiley; 2002.
- Moore CG, Lipsitz SR, Addy CL, Hussey JR, Fitzmaurice G, Natarajan S. Logistic regression with incomplete covariate data in complex survey sampling: application of re-weighted estimating equations. Epidemiology. 2009;20:382–90. doi: 10.1097/EDE.0b013e318196cd65.
- Osius G. Asymptotic inference for semiparametric association models. Ann Statist. 2009;37:459–89.
- Parzen M, Lipsitz S, Ibrahim J, Lipshultz S. A weighted estimating equation for linear regression with missing covariate data. Statist Med. 2002;21:2421–36. doi: 10.1002/sim.1195.
- Robins JM, Rotnitzky A, Zhao LP. Estimation of regression coefficients when some regressors are not always observed. J Am Statist Assoc. 1994;89:846–66.
- Tchetgen Tchetgen E. A simple implementation of doubly robust estimation in logistic regression with covariates missing at random. Epidemiology. 2009;20:391–4. doi: 10.1097/EDE.0b013e3181a0acc7.
- Tchetgen Tchetgen E, Robins J, Rotnitzky A. On doubly robust estimation in a semi-parametric odds ratio model. Biometrika. 2010;97:171–80. doi: 10.1093/biomet/asp062.
- Tchetgen Tchetgen E, Rotnitzky A. Double-robust adjustment for confounding in cohort and case-control studies. Statist Med. 2011;30:335–47. doi: 10.1002/sim.4103.
- Tchetgen Tchetgen E. On the interpretation, robustness, and power of varieties of case-only tests of gene-environment interaction. Am J Epidemiol. 2010;172:1335–8. doi: 10.1093/aje/kwq359.
- Zhao LP, Lipsitz S, Lew D. Regression analysis with missing covariate data using estimating equations. Biometrics. 1996;52:1165–82.
