Instrumental variable estimation of causal odds ratios using structural nested mean models

Roland A Matsouaka; Eric J Tchetgen Tchetgen

Biostatistics. 2017 Feb 6;18(3):465–476. doi: 10.1093/biostatistics/kxw059

PMCID: PMC5862265 PMID: 28334061

Summary

We consider estimating causal odds ratios using an instrumental variable under a logistic structural nested mean model (LSNMM). Current methods for LSNMMs either rely heavily on possibly “uncongenial” modeling assumptions or involve intricate numerical challenges, which have impeded their use. In this article, we present an alternative method that ensures a congenial parametrization, circumvents the computational complexity of existing methods, and is easy to implement. We illustrate the proposed method by (1) estimating the causal effect of years of education on earnings using data from the NLSYM and (2) assessing the impact that moving families from high-poverty to low-poverty neighborhoods had on lifetime major depressive disorder among adolescents in the “Moving to Opportunity (MTO) for Fair Housing Demonstration Project” from the Department of Housing and Urban Development.

Keywords: Causality, Instrumental variable, Odds ratio, Confounding, Structural model, Non-compliance

1. Introduction

Instrumental variable (IV) methods are used to estimate, under certain assumptions, the effects of an exposure on an outcome when unobserved confounding is suspected to be present. An IV is a pre-exposure variable that, conditional on a set of measured baseline covariates, is (1) associated with the exposure, (2) associated with the outcome only through the exposure, that is, there is no direct effect of the IV on the outcome upon intervening on exposure (also known as the exclusion restriction assumption), and (3) independent of any unmeasured confounding variable of the effects of the exposure on the outcome (Vansteelandt and others, 2011).

In clinical trials affected by non-compliance, random assignment to treatment (the exposure in this context) is often used as an IV, whereas in observational studies the choice of a valid IV is challenging (Martens and others, 2006; Murray, 2006; Glymour and others, 2012). Nevertheless, once an appropriate IV has been identified, it may potentially be used to recover a consistent estimate of the effect of an exposure despite the presence of unmeasured confounding.

Compared to IV methods for continuous outcomes, those for binary outcomes have received far less attention. Our aim is to estimate the conditional causal odds ratio (COR), that is, the effect of an exposure on a binary outcome, conditional on the exposure level, IV, and measured covariates under an LSNMM. As demonstrated by Robins (1994), one cannot identify the COR solely based on the standard IV assumptions (1)–(3). In fact, Robins and Rotnitzky (2004) show explicitly that the resulting causal model is overparametrized, in the sense that the likelihood for the LSNMM depends on more unknown parameters than the observed data likelihood and thus the model must somehow be restricted for identification purposes.

Any viable approach to identify and estimate the COR requires additional restrictions beyond the standard IV assumptions (Vansteelandt and Goetghebeur, 2005; Liu and others, 2015).

Both Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) make the “no-current interaction assumption” (Assumption (4)), but develop distinct approaches for estimating the COR parameter of an LSNMM under Assumptions (1)–(4). Vansteelandt and Goetghebeur (2003) posit a so-called association model for the exposure, IV and covariates with the binary outcome. However, when the association model is not saturated, it is prone to model incompatibility with the LSNMM (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). Robins and Rotnitzky (2004) develop an alternative approach that avoids this limitation using a parametrization that is always compatible. Unfortunately, their approach can be computationally prohibitive and challenging, especially with a continuous IV. It requires repeatedly solving a complicated integral equation numerically for each observation used in the process of finding an estimator of the LSNMM, which makes it difficult to implement.

The purpose of this article is to develop a different strategy to estimate the parameters of an LSNMM under assumptions (1)–(4) using a novel parametrization, which resolves the aforementioned difficulties with previous estimators of an LSNMM. Unlike Vansteelandt and Goetghebeur (2003), but similar to Robins and Rotnitzky (2004), we use a (variation independent) compatible parametrization of the observed data likelihood. Furthermore, unlike Robins and Rotnitzky (2004), our approach does not involve solving iterative integral equations and is thus readily implementable regardless of the nature or the dimension of the exposure, IV, and covariates.

Using the potential outcome framework, in Section 2, we lay out the IV conditions, present the causal effect of interest, and describe identification assumptions. We continue in Section 3 with a review of existing estimation methods for the LSNMM. In Section 4, we introduce the new parametrization, show how it contrasts with the approach of Robins and Rotnitzky (2004), and demonstrate some of its important properties. Using the proposed parametrization, we also present a straightforward maximum likelihood approach for estimating the LSNMM. In addition, we provide a goodness-of-fit (GOF) test statistic which is useful to evaluate parametric assumptions about nuisance parameters of the fitted likelihood model under Assumptions (1) through (4). In Section 5, we report a simulation study for both a binary and a continuous exposure, respectively, using continuous baseline covariates. We also closely examine the finite-sample performance of the proposed GOF test statistic. Finally, we apply our method to two different data sets to assess the impact of years of education on wages in the United States National Longitudinal Survey of Young Men (NLSYM) study and to evaluate the effects of moving from high-poverty to low-poverty neighborhoods on lifetime major depressive disorder among youth using data from the US Department of Housing and Urban Development “Moving to Opportunity (MTO) for Fair Housing Demonstration Project” (reported in the online supplementary material available at Biostatistics online).

2. Notation and assumptions

Suppose we observe $n$ independent and identically distributed copies of the vector $O = (Z, X, Y, L)$, where $Z$ is an IV for the effect of an exposure $X$ on an outcome $Y$ given a set of measured baseline covariates $L$. The vector $L$ includes all measured confounders of the effects under study. We define the potential outcome $Y_{xz}$ as the outcome we would have observed had, possibly contrary to fact, $X$ been set to $x$ and $Z$ set to $z$ by external intervention. Likewise, $Y_{x}$ denotes the outcome had $X$ been set to $x$. To identify causal effects, we make the consistency assumption that if a person is assigned to a specific exposure $X = x$ and an IV level $Z = z$, their observed outcome $Y$ coincides with their potential outcome $Y_{xz}$ (Robins and Rotnitzky, 2004). The IV conditions presented in the introduction may now be formally expressed as: (1) non-null association between $Z$ and $X$; (2) exclusion restriction: $Y_{xz} = Y_{x}$ almost surely for all $x$ and $z$; and (3) independence of potential outcomes and IV: $Y_{x} \perp Z \mid L$ for all $x$. The notation $A \perp B \mid C$ indicates stochastic independence between variables $A$ and $B$ given $C$.

We define an LSNMM as

$$\operatorname{logit} E(Y_{x} \mid x, z, l) - \operatorname{logit} E(Y_{0} \mid x, z, l) = \tilde{\gamma}(x, z, l), \tag{2.1}$$

where $\tilde{\gamma}(0, z, l) = 0$ for any $(z, l)$. Throughout the article, we will index the LSNMM with a finite dimensional parameter $\psi$, that is, $\tilde{\gamma}(x, z, l) = \tilde{\gamma}(x, z, l; \psi)$. The primary inferential goal of this article is to identify and estimate the parameter $\psi$. The function $\tilde{\gamma}(x, z, l)$ represents a causal contrast; it compares the average potential outcomes $Y_{x}$ and $Y_{0}$, on the log odds scale, for the subset of the population with $X = x$, $Z = z$, and $L = l$. As such, $\exp\{\tilde{\gamma}(x, z, l)\}$ characterizes a COR as a function of $(x, z, l)$ and may be used to evaluate heterogeneity of the exposure causal effect by the IV and other pre-exposure variables. Our focus on CORs aligns with a common practice in the analysis of observational studies where measured baseline covariates are used to control for confounders. CORs are of interest in settings where the effect of reducing the exposure to $x = 0$ is investigated within subgroups of patients who share similar characteristics. Moreover, such conditional exposure effects are likely to be more transportable across populations (Vansteelandt and Keiding, 2011; Burgess, 2013).

Assumptions (1)–(3) do not suffice to identify $\psi$ (see Robins and others, 2000). Both Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) make as an additional assumption (4) the “no current treatment value interaction” assumption: $\tilde{\gamma}(x, z, l) = \gamma(x, l)$ almost surely, for a function $\gamma$ of $x$ and $l$ only, that is, there is no effect modification of the exposure by the IV. Therefore, conditional on the observed covariates, the effect of treatment is constant for treated individuals across levels of the IV $Z$. Familiar choices for $\gamma(x, l; \psi)$ are linear in $x$, possibly with an interaction between $x$ and a component of $l$. Throughout this article, we assume that $\gamma(x, l; \psi)$ is correctly specified with unknown parameter $\psi$, which is identified under assumptions (1)–(4).

Note that Assumption (4) is not empirically testable without making an additional assumption. This is because under assumptions (1)–(4) the likelihood is in fact a perfect fit to the observed data (see Tchetgen Tchetgen and Vansteelandt, 2013 for further details). Vansteelandt and Goetghebeur (2005) and Clarke and Windmeijer (2010) studied the impact on inference of possible violation of Assumption (4). Alternative identification conditions other than (4) are of great interest and constitute an ongoing research topic (Tchetgen Tchetgen and Vansteelandt, 2013). Richardson and Robins (2010) and Richardson and others (2011) elucidated the issue of identification through a careful analysis of a basic binary IV model.

3. Review of SNMM estimations for binary outcomes

3.1. Double-logistic estimator

3.1.1. Estimation:

From (2.1), $E(Y_{0} \mid x, z, l) = \operatorname{expit}\{\operatorname{logit} E(Y \mid x, z, l) - \tilde{\gamma}(x, z, l)\}$, where $\operatorname{expit}(a) = e^{a}/(1 + e^{a})$. Using this relationship, Vansteelandt and Goetghebeur (2003) suggested modeling the observed association to derive an unbiased estimating equation for $\psi$. They developed the so-called double-logistic estimator and showed that $\psi$ can be identified if one postulates an association model for $E(Y \mid X, Z, L)$ using the observed outcome $Y$, the exposure $X$, and the IV $Z$.
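The relationship between the observed association and the counterfactual mean can be illustrated with a minimal sketch. The function name `counterfactual_mean` and the numbers are hypothetical; the only ingredient taken from the text is that subtracting the causal contrast from the observed log odds gives the log odds of $Y_0$ in the same stratum.

```python
import math


def logit(p: float) -> float:
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def expit(a: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-a))


def counterfactual_mean(p_obs: float, gamma: float) -> float:
    """E(Y_0 | x, z, l) implied by the LSNMM: expit(logit(p_obs) - gamma)."""
    return expit(logit(p_obs) - gamma)


# Hypothetical stratum: observed risk 0.40 and a causal log odds ratio of log(2).
# Halving the odds 0.4/0.6 gives odds 1/3, i.e. a counterfactual risk of 0.25.
p0 = counterfactual_mean(0.40, math.log(2.0))
```

With a contrast of zero the function returns the observed risk unchanged, which is the no-effect case.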

To estimate $\psi$, an estimator of the association model parameters is first obtained from the observed data, using for instance maximum likelihood estimation. Then, combining the association model with the LSNMM (2.1) yields an unbiased prediction of the average counterfactual outcome $Y_{0}$ within levels of $(X, Z, L)$ for each subject $i$, $i = 1, \ldots, n$. A consistent point estimator $\hat{\psi}$ of $\psi$ can finally be obtained by solving an estimating equation indexed by an arbitrary function of $Z$ and $L$. The choice of this function does not affect consistency but does affect efficiency (see Robins, 1994; Clarke and others, 2015; or Vansteelandt and others, 2011 for optimal choices that yield an efficient estimator of $\psi$).
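The two-step logic can be sketched at the population level, which avoids sampling noise. Everything below is an illustrative assumption rather than the authors' implementation: a randomized binary IV, a binary exposure with one-sided non-compliance, no covariates, a saturated association model, a contrast of the form $\psi x$, and the index function $Z - P(Z = 1)$. The root of the estimating function is found by plain bisection.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(a):
    return 1 / (1 + math.exp(-a))


# Hypothetical population: P(Z = 1) = 0.5, and Z = 0 implies X = 0.
P_Z1 = 0.5
P_X1_GIVEN_Z = {0: 0.0, 1: 0.6}
PSI_TRUE = 0.7  # assumed causal log odds ratio, gamma(x; psi) = psi * x
# P(Y_0 = 1 | X = x, Z = z), chosen so that E(Y_0 | Z = z) = 0.30 in both arms.
P_Y0 = {(0, 0): 0.30, (0, 1): 0.15, (1, 1): 0.40}
# Saturated association model P(Y = 1 | X = x, Z = z) implied by the LSNMM.
ASSOC = {(x, z): expit(logit(p) + PSI_TRUE * x) for (x, z), p in P_Y0.items()}


def estimating_function(psi):
    """E[(Z - P(Z = 1)) * expit(logit P(Y = 1 | X, Z) - psi * X)]."""
    total = 0.0
    for z, pz in ((0, 1 - P_Z1), (1, P_Z1)):
        for x in (0, 1):
            px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
            if px == 0.0:
                continue  # empty cell under one-sided non-compliance
            y0_pred = expit(logit(ASSOC[(x, z)]) - psi * x)
            total += pz * px * (z - P_Z1) * y0_pred
    return total


def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


psi_hat = bisect(estimating_function, -5.0, 5.0)
```

Because the association model is saturated here, it is compatible with the LSNMM and the root coincides with `PSI_TRUE`; with a restricted association model a root need not exist.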

3.1.2. Models congeniality:

Whether this method yields a consistent estimator of $\psi$ depends largely on how well the association model reproduces major features of the data. When the association model is not saturated, it can be incompatible with the LSNMM (2.1) and lead to an inconsistent estimator (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). Even worse, for some parameter values, there may be no solution to the estimating equation. Model incompatibility or lack of model congeniality (to use the terminology of Meng (1994) and Vansteelandt and others (2011)) arises when two models for the observed data cannot hold simultaneously for all parameter values allowed by the respective models. As a consequence, there is no data generating mechanism for which both can hold, which is worse than model misspecification. In the latter, there exist data generating mechanisms corresponding to the model; however, none of them coincides with the mechanism leading to the observed data. With a large number of covariates, it is a daunting task to fit the saturated model that includes all available covariates and all possible higher order interactions. Therefore, one may instead need to use a parsimonious model that is flexible enough to capture important features of the data. Unfortunately, it is exactly when parametric restrictions are imposed on the association model, particularly with respect to main effects and their interactions with some relevant covariates in $L$, that we may have major inconsistencies between the association model and the LSNMM (2.1), leading to a lack of model congeniality and a noisy COR estimate.

3.2. Congenial parametrization of Robins and Rotnitzky

To guarantee a parametrization that is always congenial, Robins and Rotnitzky (2004) proposed one based on the contrast $q(x, z, l) = \operatorname{logit} E(Y_{0} \mid x, z, l) - \operatorname{logit} E(Y_{0} \mid X = 0, z, l)$, which encodes the degree of unobserved confounding and is referred to as the selection bias function. In the absence of exposure (i.e., $x = 0$) or if there is no unmeasured confounding (i.e., $Y_{0} \perp X \mid Z, L$), $q(x, z, l) = 0$.

Using the selection bias function $q(x, z, l)$, we have

$$\operatorname{logit} P(Y = 1 \mid x, z, l) = \gamma(x, l) + q(x, z, l) + v(z, l), \tag{3.1}$$

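where $v(z, l)$ is the unique solution to the integral equation $\int \operatorname{expit}\{q(x, z, l) + v(z, l)\}\, dF(x \mid z, l) = \operatorname{expit}\{t(l)\}$, with $t(l) = \operatorname{logit} P(Y_{0} = 1 \mid l)$, $\operatorname{expit}(a) = e^{a}/(1 + e^{a})$, and $F(x \mid z, l)$ the cumulative distribution function (CDF) of the conditional (exposure) density $f(x \mid z, l)$ of the random variable $X$ given $Z$ and $L$. In other words, $v(z, l)$ is a functional of $q$, $t$, and $F$ implicitly defined by the integral equation that must be solved for each observation.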

Estimation and numerical optimization burden: Parametric working models for $q(x, z, l)$, $t(l)$, and $f(x \mid z, l)$ are postulated to make inference. The MLE maximizes the resulting observed-data likelihood, where $v(z, l)$ is evaluated at the solution of the integral equation. As previously stated, the parametrization of Robins and Rotnitzky (2004) has the advantage of providing an association model that is always compatible with the LSNMM (2.1). Unfortunately, for most choices of models for $q(x, z, l)$, $t(l)$, and $f(x \mid z, l)$, the required integral equation cannot be solved for $v(z, l)$ in closed form, except when $X$ is binary.

Numerical optimization of the joint density of the observables under this parametrization involves finding, for each candidate parameter value, a numerical solution of the integral equation for each observed $(z_{i}, l_{i})$, within each iteration of the algorithm (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). In fact, when the exposure takes more than two values, or is continuous or multivariate, this approach is computationally challenging, particularly when the IV is continuous and there is a large number of covariates $L$. To put things in perspective, if we have 500 subjects in the data set and the optimization algorithm requires 100 iterations to converge, we will need to solve 500 × 100 = 50 000 integral equations in total to find the final estimate of the causal odds ratio. This numerical drawback has impeded the widespread use of this approach, despite its sound mathematical and theoretical underpinning.
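To make the per-observation burden concrete, here is a minimal sketch of one such solve for a discrete exposure, where the integral reduces to a sum. It assumes the integral equation has the form $\sum_x \operatorname{expit}\{q(x, z, l) + v\} f(x \mid z, l) = \operatorname{expit}\{t(l)\}$ with $t(l)$ the log odds of $Y_0$ given $l$; the function name `solve_v` and all numbers are hypothetical.

```python
import math


def expit(a):
    return 1 / (1 + math.exp(-a))


def solve_v(q_values, pmf, t, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve sum_x expit(q(x) + v) * f(x | z, l) = expit(t) for v by bisection."""
    target = expit(t)

    def g(v):
        # Increasing in v, negative at lo and positive at hi.
        return sum(p * expit(q + v) for q, p in zip(q_values, pmf)) - target

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs for a single (z, l), with exposure levels x = 0, 1, 2.
q_values = [0.0, 0.4, 0.9]  # selection bias q(x, z, l), with q(0, z, l) = 0
pmf = [0.5, 0.3, 0.2]       # f(x | z, l)
t = -0.8                    # assumed logit P(Y_0 = 1 | l)
v = solve_v(q_values, pmf, t)

# One root-finding problem per subject per optimizer iteration.
n_subjects, n_iterations = 500, 100
n_solves = n_subjects * n_iterations
```

Each call is cheap for a three-level exposure, but with a continuous exposure the sum becomes a numerical integral inside every bisection step, repeated `n_solves` times.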

4. New parametrization

We now propose a different congenial parametrization that obviates the need to solve integral equations. Let $f(x \mid Y_{0} = 0, z, l)$ denote the conditional density function of the random variable $X$ given $Y_{0} = 0$, $Z = z$, and $L = l$ (or the probability mass function if $X$ is discrete) and $F(x \mid Y_{0} = 0, z, l)$ its corresponding CDF. While Robins and Rotnitzky (2004) parametrize the conditional density function $f(x \mid z, l)$ (among other things) to get to the parametric model (3.1), our parametrization uses the conditional density function $f(x \mid Y_{0} = 0, z, l)$. All proofs related to this section are given in the supplementary material available online at http://www.biostatistics.oxfordjournals.org.

Define $\bar{q}(z, l) = \log \int \exp\{q(x, z, l)\}\, dF(x \mid Y_{0} = 0, z, l)$. We show (in the supplementary material available at Biostatistics online) that $v(z, l)$ is equal to $t(l) - \bar{q}(z, l)$. The parametric model (3.1) becomes

$$\operatorname{logit} P(Y = 1 \mid x, z, l) = \operatorname{logit} f_{y}(1 \mid x, z, l) = \gamma(x, l; \psi) + q(x, z, l) - \bar{q}(z, l) + t(l).$$
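The appearance of $-\bar{q}(z, l) + t(l)$ in place of $v(z, l)$ can be motivated by a short calculation. This is a sketch under stated assumptions, not the proof given in the supplementary material: it assumes $\bar{q}(z, l) = \log \int \exp\{q(x, z, l)\}\, dF(x \mid Y_0 = 0, z, l)$, $t(l) = \operatorname{logit} P(Y_0 = 1 \mid l)$, and $q(x, z, l) + v(z, l) = \operatorname{logit} P(Y_0 = 1 \mid x, z, l)$.

```latex
\begin{aligned}
\frac{P(Y_0 = 1 \mid x, z, l)}{P(Y_0 = 0 \mid x, z, l)}
  &= \frac{f(x \mid Y_0 = 1, z, l)}{f(x \mid Y_0 = 0, z, l)}\, e^{t(l)}
  && \text{(Bayes' rule and IV condition (3))} \\
e^{q(x, z, l) + v(z, l)}
  &= \frac{f(x \mid Y_0 = 1, z, l)}{f(x \mid Y_0 = 0, z, l)}\, e^{t(l)}
  && \text{(definition of } q \text{ and } v\text{)} \\
1 = \int f(x \mid Y_0 = 1, z, l)\, dx
  &= e^{v(z, l) - t(l)} \int e^{q(x, z, l)}\, dF(x \mid Y_0 = 0, z, l)
   = e^{v(z, l) - t(l) + \bar{q}(z, l)} \\
v(z, l) &= t(l) - \bar{q}(z, l)
\end{aligned}
```

Under these assumptions $v(z, l)$ is available in closed form once $\bar{q}(z, l)$ is, so no root-finding step is needed.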

Under this parametrization, we are free to choose models for Inline graphic , , and The density is determined by Under Assumptions (1)–(3) and the proposed parametrization, we have the following key result:

Theorem 1: We have Inline graphic is equal to if and only if

This result gives one the freedom to posit variation independent parametric models for Inline graphic , , and such that the marginalization property of the result will hold for all parameter values, even if all models are incorrect.
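The marginalization property can be checked numerically for a discrete exposure. The sketch below assumes the parametrization is built from the density of $X$ given $Y_0 = 0$, the selection bias function $q$, and $t(l) = \operatorname{logit} P(Y_0 = 1 \mid l)$, with $\bar{q}$ the log of the mean of $e^{q}$ under that density; all numbers and variable names are hypothetical.

```python
import math


def expit(a):
    return 1 / (1 + math.exp(-a))


# Hypothetical ingredients for a single (z, l), exposure levels x = 0, 1, 2.
f0 = [0.5, 0.3, 0.2]  # f(x | Y_0 = 0, z, l)
q = [0.0, 0.4, 0.9]   # q(x, z, l), with q(0, z, l) = 0
t = -0.8              # t(l), assumed to be logit P(Y_0 = 1 | l)

q_bar = math.log(sum(p * math.exp(qx) for p, qx in zip(f0, q)))
prior = expit(t)                                           # P(Y_0 = 1 | l)
f1 = [p * math.exp(qx - q_bar) for p, qx in zip(f0, q)]    # f(x | Y_0 = 1, z, l)
f = [prior * a + (1 - prior) * b for a, b in zip(f1, f0)]  # f(x | z, l), a two-component mixture
p_y0 = [expit(qx - q_bar + t) for qx in q]                 # P(Y_0 = 1 | x, z, l)
marginal = sum(p * fx for p, fx in zip(p_y0, f))           # P(Y_0 = 1 | z, l)
```

Whatever values are chosen for `f0`, `q`, and `t`, `marginal` equals `expit(t)`, so the implied law of $Y_0$ given $(z, l)$ does not depend on $z$, and no integral equation is solved along the way.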

4.1. Maximum likelihood estimation

Let Inline graphic and the density functions of the random variables and , respectively. The observed data likelihood factorizes as To draw inference under the new parametrization, we obtain the maximum likelihood estimator (MLE) of using parametric models and Using models for and is derived from Inline graphic . Thus, the likelihood becomes Such a likelihood can be maximized using PROC NLMIXED in SAS or the optim function in R. For most choices of the selection bias function and of the distribution we make in practice, the integral will have a closed form solution. Nevertheless, when a choice of Inline graphic and does not lead to a closed form expression of the latter can be approximated numerically, say using Gauss-Hermite quadrature integral approximation (Liu and Pierce, 1994) or by Monte-Carlo simulation, and be easily incorporated in any standard software code.
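As an illustration of the quadrature option, the following sketch approximates a quantity of the form $\log E[\exp\{q(X)\}]$ for a normally distributed $X$ with a three-point Gauss-Hermite rule and compares it with the closed form available when $q$ is linear. The normal law, the linear $q$, the function name `q_bar_normal`, and the parameter values are assumptions made for the example; in practice one would use more nodes from a library routine such as `numpy.polynomial.hermite.hermgauss`.

```python
import math

# Three-point Gauss-Hermite rule for integrals of the form
# integral of exp(-u^2) * g(u) du over the real line.
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6,
    2 * math.sqrt(math.pi) / 3,
    math.sqrt(math.pi) / 6,
)


def q_bar_normal(q, mu, sigma):
    """Approximate log E[exp(q(X))] for X ~ N(mu, sigma^2)."""
    total = sum(
        w * math.exp(q(mu + math.sqrt(2.0) * sigma * u))
        for u, w in zip(NODES, WEIGHTS)
    )
    return math.log(total / math.sqrt(math.pi))


# Hypothetical linear selection bias q(x) = theta * x.
theta, mu, sigma = 0.5, 0.0, 1.0
approx = q_bar_normal(lambda x: theta * x, mu, sigma)
# Closed form from the normal moment generating function.
exact = theta * mu + 0.5 * (theta * sigma) ** 2
```

Even three nodes agree with the closed form to about three decimal places here, which suggests why the quadrature step adds little cost inside a likelihood maximizer.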

Finally, the MLE of $\psi$ is uncorrelated with that of the parameters of the law of $Z$ given $L$. In fact, we need not estimate the latter to obtain an estimate of the former. Thus, the MLE of $\psi$ cannot exploit any prior information about the law of $Z$ given $L$, such as the known randomization probability in a randomized experiment. However, as we show in Section D of the supplementary material, available at Biostatistics online, one can leverage knowledge about the law of $Z$ given $L$ to construct a GOF test statistic for nuisance models of the likelihood function derived in this section, which is asymptotically normal with mean zero only if the likelihood is correctly specified. The GOF statistic is based on an influence function for $\psi$ in a model where the likelihood is otherwise unrestricted, and therefore, it naturally accounts for variability of all unknown nuisance parameters under the null of no model misspecification.

5. Simulation study

In this section, we provide a data generating process following our proposed parametrization. We sampled the baseline covariates Inline graphic from independent bivariate normal distributions such that and with correlation coefficient of 0.5. Then, we generated a binary IV and defined and . In addition, we specified the distribution generated with density and derived where Finally, we made the simple choice

Overall, we generated a total of 2000 data sets of size Inline graphic and estimated , the empirical type I error, and the power of the GOF test statistic, that is, the proportion of simulated data sets for which . We run the simulations using SAS PROC NLMIXED.

5.1. Binary exposure

Let Inline graphic and . We have and We generated the binary exposure

For each parameter, we report in Table 1 the bias, the mean square error (MSE), and the coverage probability, that is, the proportion of 95% confidence intervals that covered the true parameter. These results highlight the good performance of our approach, with small bias and small MSE. Furthermore, coverage probabilities hover around 95%, reflecting good coverage. The Monte-Carlo type I error rate of the GOF test statistic is equal to 0.012, indicating that, under a correctly specified model, the GOF test rejects the null hypothesis less often than the nominal level. This may partially reflect the conservative variance estimator used to construct the test statistic.
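The three summaries can be computed from the replicate-level estimates as in the sketch below. The helper `summarize` and the toy replicates are hypothetical and only illustrate the definitions; they are not the simulation code used for Table 1, and Wald-type intervals are an assumption.

```python
def summarize(estimates, std_errors, truth, z_crit=1.96):
    """Monte-Carlo bias, MSE, and coverage of Wald-type 95% intervals."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mse = sum((e - truth) ** 2 for e in estimates) / n
    covered = sum(
        1
        for e, s in zip(estimates, std_errors)
        if e - z_crit * s <= truth <= e + z_crit * s
    )
    return {"bias": bias, "mse": mse, "coverage": covered / n}


# Toy replicates: four estimates of a parameter whose true value is 1.0.
estimates = [0.9, 1.1, 1.0, 1.6]
std_errors = [0.2, 0.2, 0.2, 0.2]
summary = summarize(estimates, std_errors, truth=1.0)
```

In this toy input the last interval misses the truth, so coverage is 3 out of 4, the bias is 0.15, and the MSE is 0.095.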

Table 1.

Simulation Results: Binary Exposure

Bias	MSE	Coverage	S.E.
0.002	0.102	0.96	0.319
0.004	0.106	0.95	0.326
0.005	0.007	0.95	0.083
0.002	0.001	0.95	0.038
0.001	0.001	0.94	0.033
0.001	0.003	0.95	0.054
0.006	0.031	0.96	0.177
0.000	0.002	0.95	0.041
0.001	0.002	0.95	0.040
0.001	0.004	0.95	0.065
0.000	0.001	0.94	0.032
0.001	0.000	0.95	0.030

Corresponding GOF test: Type I error = 0.012

5.2. Continuous exposure

Consider Inline graphic such that We have We show in Section E of the supplementary material available at Biostatistics online that follows a mixture of two normal distributions with density where, for Estimation results are summarized in Table 2. Similar to binary exposure, the results confirm small bias and MSE as well as good coverage probability. This indicates that our approach performs very well. The realized type I error for the specification test of nuisance models as a goodness-of-fit test gave a type I error equal to 0.039, which—although somewhat better than for binary exposure—is still conservative.

Table 2.

Simulation Results: Continuous Exposure

Bias	MSE	Coverage	S.E.
0.001	0.001	0.95	0.042
0.000	0.003	0.95	0.052
0.000	0.004	0.94	0.062
0.000	0.000	0.95	0.016
0.000	0.000	0.95	0.014
0.001	0.002	0.94	0.042
0.001	0.000	0.95	0.010
0.003	0.029	0.94	0.172
0.000	0.005	0.95	0.073
0.002	0.003	0.95	0.054
0.004	0.019	0.95	0.138
0.001	0.001	0.95	0.041
0.002	0.001	0.95	0.040

Corresponding GOF test: Type I error = 0.039.

5.3. Power of the goodness-of-fit test

In addition to the type I error, we also assessed the power of the GOF test statistic to detect the presence of model misspecification under various departures from the assumed likelihood model. We considered different misspecifications of models for Inline graphic and The results, presented in Table 3, show that the power of the GOF test varies depending on the type of misspecification. Greatest power is achieved when is omitted from the model Relatively lower power was observed for leaving out interaction term from Moderate power () was observed for related misspecifications of Inline graphic

Table 3.

Goodness-of-fit Test: Power to Detect Departures from the True Models

Missing covariates	Parameter Values	Power
(1) Binary Exposure
	0.6, 1.5	0.41
	1.5	0.40
	0.8	0.03
	1.5	0.89
(2) Continuous Exposure
	0.5	0.95
	0.6, 1.5	0.62
	0.6	0.43
	0.8	0.06
	0.6	0.88

Missing covariates: covariates (with corresponding parameter values) used in the generated model, but omitted in the fitted model.

We observed similar patterns for both binary and continuous exposures (Table 3). The power of the proposed GOF test ranges a fair amount depending on the form of model misspecification. For instance, when we specify only the main effects of the baseline covariates and ignore interaction terms in the model for Inline graphic , the power to reject the misspecified model is low. However, when the true model does not have an interaction and a main effect term for the true model for is omitted, the GOF test rejects the posited model with substantially higher power. Finally, omitting the quadratic term in the model for Inline graphic results in relatively moderate power.

6. Data application

To illustrate the proposed method, we analyze two different data sets, one with a continuous exposure and the other with a binary exposure. The first application uses data from the 1976 subset of the United States NLSYM, looking at the effect of years of education on earnings based on a sample of 3010 working men aged 24–34 (Card, 1995).

Our second application considers the impact that moving low-income families from high-poverty to low-poverty neighborhoods had on lifetime major depressive disorder among adolescents in the MTO study (see Section G of the supplementary material available at Biostatistics online).

The effect of years of education on wages

There has been a longstanding interest in the causal impact of duration of schooling on earnings (Card, 1995; Heckman and others, 2006). Following Card (1995), we use the indicator $Z$ of whether a study participant lived in the proximity of a four-year college in 1966 as an IV to study the effect of years of education on hourly wages. For illustration purposes, we use this classical example to estimate the effect of education on $Y$, the indicator of earning an hourly wage greater than or equal to the median (i.e., 537.5 cents).

We consider 12 years of education as the reference point and define $X$ = Years of Education − 12. The vector $L$ of potential confounding variables includes mother’s and father’s years of education as well as indicators of whether the person is black; lived with both natural parents, with one natural parent and one step parent, or with mother only at age 14; lived in one of the nine regions of residence or in a standard metropolitan statistical area (SMSA); and had a family library card at age 14. Missing covariate values were imputed using simple imputation.

Although Card (1995) makes a compelling case for the validity of this choice of IV, one cannot rule out with certainty that there may be other factors, such as family or neighborhood characteristics or changes in the institutional structure of the education system, that are associated with the instrument $Z$ and can affect hourly wages apart from years of education. Nevertheless, for expository purposes, we focus our illustration on this instrument.

To run the LSNMM, we consider the following parametric models: Inline graphic and .

The p-value of the goodness-of-fit test statistic is 0.26. This indicates no evidence that the data are inconsistent with the likelihood model we have used, assuming the model for the causal contrast $\gamma(x, l; \psi)$ is correctly specified.

Table 4 reports the parameter estimates for the causal contrast; the estimate in the first row is 0.58 (95% CI: 0.23–0.94). Based on these results, for given covariate values we can infer the effects of years of education on the probability of earning an hourly wage greater than or equal to the median wage in 1976. For example, for a black study participant with a high-school diploma, who lived with a high-school-educated single mother (i.e., 12 years of education) in a metropolitan area and whose family had a library card when he was 14 years old, the odds of earning an hourly wage greater than or equal to the median (hourly wage in 1976) would have been 3.5 times what they are currently had he received 15 years of education instead. That is, for a study participant with $x = 3$ and these covariate values, the corresponding odds ratio is 3.5.

Table 4.

Effect of Years of Education on Earning Abilities

Parameter	Estimate	S.E.	P-value	95% CI lower	95% CI upper
)	0.58	0.18	0.0012	0.23	0.94
black	0.06	0.07	0.44	-0.09	0.20
library card	-0.20	0.07	0.004	-0.33	-0.06
lived with mom and dad	0.14	0.10	0.17	-0.06	0.34
mother’s education	0.001	0.004	0.83	-0.01	0.01
region 2	0.10	0.15	0.48	-0.19	0.40
region 3	-0.10	0.14	0.49	-0.38	0.18
region 4	-0.20	0.17	0.22	-0.53	0.13
region 5	0.11	0.14		-0.17	0.40
region 6	0.04	0.16	0.81	-0.27	0.34
region 7	0.27	0.15		-0.04	0.57
region 8	-0.19	0.23		-0.66	0.27
region 9	-0.34	0.18	0.06	-0.69	0.01
lived with single mom	0.12	0.54	0.01	0.17	0.32
SMSA	-0.13	0.06	0.023	-0.24	-0.02
lived with step dad	-0.18	0.11		-0.40	0.05

The binary outcome $Y$ is the indicator of whether an hourly wage is greater than or equal to the median wage in 1976 of 537.5 cents.
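The interval and p-value columns of Table 4 can be related to the estimate and standard error columns with a short calculation. The sketch below assumes Wald-type intervals and a two-sided normal p-value, which the text does not state explicitly; the helper `wald_summary` is hypothetical.

```python
import math


def wald_summary(estimate, se, z_crit=1.96):
    """Wald-type 95% interval and two-sided normal p-value."""
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate - z_crit * se, estimate + z_crit * se, p_value


# First row of Table 4: estimate 0.58 with standard error 0.18.
lower, upper, p_value = wald_summary(0.58, 0.18)
```

This gives an interval of roughly (0.23, 0.93) and a p-value near 0.001, close to the reported 0.23, 0.94, and 0.0012; small differences of this size would be expected if the table was computed from unrounded values.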

In addition, as shown in the full table (Section F of the supplementary material available at Biostatistics online), the selection bias function is significantly different from zero providing explicit empirical evidence of the impact of unobserved confounding. This bias does not appear to depend on the interaction between years of education and the IV, but does depend on years of education, on the interaction between years of education and the family having a library card, and on the interaction between years of education and geographic location (region 9).

7. Conclusion

In this article, we have presented a new parametrization for a logistic structural nested mean model (LSNMM) for a binary outcome and we have proposed a corresponding maximum likelihood approach for estimation. Our approach builds upon the theoretical framework of Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004). Unlike Vansteelandt and Goetghebeur (2003), and similar to Robins and Rotnitzky (2004), our approach yields a parametric model that is guaranteed to always be congenial (or compatible) with the LSNMM. However, unlike Robins and Rotnitzky (2004), we obviate the need to numerically solve integral equations, which can be computationally cumbersome and is not easily scalable with the dimension of the exposure $X$. In addition, a key attraction of our approach is that it is readily implemented using standard statistical software. Our simulation results confirm the good performance of the proposed approach. To illustrate our approach, we applied it using two different data sets, one with a binary exposure (whether a single-mother moved with her family out of a poor neighborhood) and the other with a continuous exposure (the number of years of education).

Our simulations showed that the proposed GOF is quite conservative in the settings we considered and its power to detect possible departures from the assumed model can be moderate to low. In some settings, the low power of the GOF statistic may also reflect the conservative estimate of variance used to standardize the statistic. The main advantage of the current GOF is its simplicity. In future work, we plan to further study the performance of the GOF statistic when standardized by a consistent estimator of its variance, which was not considered in the foregoing due to severe computational roadblocks.

As previously discussed, the MLE of $\psi$ is uncorrelated with that of the parameters of the law of $Z$ given $L$. In fact, we need not estimate the latter to obtain an estimate of the former. This, in turn, implies that the MLE of $\psi$ cannot exploit any prior information about the law of $Z$ given $L$, such as the known randomization probability in a randomized experiment. This is a notable limitation of the likelihood approach. To remedy this problem, Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) propose methods that are doubly robust under the sharp null hypothesis of no exposure causal effect by explicitly using any available knowledge about the law of $Z$ given $L$. Robins and Rotnitzky (2004), in particular, propose to use an influence function of $\psi$ for inference, in the semiparametric model defined by Assumptions (1)–(4) only, which is endowed with the above robustness property, but suffers the same computational limitations as their likelihood approach.

An alternative approach to the likelihood can be obtained by solving the estimating equation of Robins and Rotnitzky (2004) under our proposed parametrization, which is not further pursued here. In addition, to assess the exclusion restriction Assumption (2), Robins and Rotnitzky (2004) suggest as a possible analytical strategy a sensitivity analysis that varies their so-called “weak exclusion assumption function” over a possible range. Such an approach can also be used with our proposed parametrization. Finally, the method we have described in this article assumes random sampling and therefore would not be directly applicable for case-control sampling or for other outcome dependent sampling designs. A straightforward adjustment for the sampling design entails applying inverse-probability weighting (IPW) for selection into the sample. However, weighting may potentially be inefficient. A more efficient approach that makes use of the recent developments in the analysis of secondary outcomes in case-control studies (see Sofer and others, 2014; Tchetgen Tchetgen, 2014), extending the methods presented herein, will be described elsewhere.

It is worth noting that direct comparison of the three approaches discussed herein would be difficult if not impossible in the context of simulation since it would be challenging to posit models for the nuisance parameters that agree under the different parametrizations. However, in the absence of covariates, or in the presence of low dimensional covariates allowing for the use of saturated models, we generally expect all methods would perform reasonably well in practice.

Acknowledgments

The authors acknowledge research support from the National Institutes of Health (NIH). We are also grateful to Nicole Schmidt for helpful comments and suggestions regarding the MTO data. Conflict of Interest: None declared.

Funding

Both authors were supported by NIH grants 1R01MD006064, 1R21HD066312 (T. Osypuk, PI) and R21ES019712 (R. A. Matsouaka) and 1R21ES019712, R01AI104459 and R01HL080644 (E. J. Tchetgen Tchetgen).

Supplementary materials

Supplementary material is available online at http://biostatistics.oxfordjournals.org.

References

Burgess S. (2013). Identifying the odds ratio estimated by a two-stage IV analysis with a logistic regression model. Statistics in Medicine 32, 4726–4747.
Card D. (1995). Using geographic variation in college proximity to estimate the return to schooling. In: Christofides L. N., Grant E. K. and Swidinsky R. (editors), Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp. Toronto: University of Toronto Press, pp. 201–222.
Clarke P. S. and Windmeijer F. (2010). Identification of causal effects on binary outcomes using structural mean models. Biostatistics 11, 756–770.
Clarke P. S., Palmer T. M. and Windmeijer F. (2015). Estimating structural mean models with multiple IVs using the generalised method of moments. Statistical Science 30(1), 96–117.
Glymour M. M., Tchetgen E. J. T. and Robins J. M. (2012). Response to letters on “Credible Mendelian randomization studies: approaches for evaluating the instrumental variable assumptions”. American Journal of Epidemiology 176, 458–459.
Heckman J. J., Lochner L. J. and Todd P. E. (2006). Earnings functions, rates of return and treatment effects: The Mincer equation and beyond. Handbook of the Economics of Education 1, 307–458.
Liu Q. and Pierce D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika 81, 624–629.
Liu L., Miao W., Sun B., Robins J. and Tchetgen Tchetgen E. J. (2015). Doubly Robust Estimation of a Marginal Average Effect of Treatment on the Treated With an Instrumental Variable. Harvard University, Biostatistics Working Paper Series Paper 191.
Martens E. P., Pestman W. R., de Boer A., Belitser S. V. and Klungel O. H. (2006). Instrumental variables: application and limitations. Epidemiology 17, 260–267.
Meng X. L. (1994). Multiple-imputation inferences with uncongenial sources of input. Statistical Science 9(4), 538–558.
Murray M. P. (2006). Avoiding invalid instruments and coping with weak instruments. The Journal of Economic Perspectives 20, 111–132.
Richardson T. S. and Robins J. M. (2010). Analysis of the binary instrumental variable model. In: Dechter R., Geffner H. and Halpern J. Y. (editors), Heuristics, Probability and Causality. A Tribute to Judea Pearl. College Publications, pp. 415–444.
Richardson T. S., Evans R. J. and Robins J. M. (2011). Transparent parameterizations of models for potential outcomes. Bayesian Statistics 9, 569–610.
Robins J. M. (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics - Theory and Methods 23, 2379–2412.
Robins J. and Rotnitzky A. (2004). Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika 91, 763–783.
Robins J. M., Rotnitzky A. and Scharfstein D. O. (2000). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran E. and Berry D. (editors), Statistical Models in Epidemiology, the Environment, and Clinical Trials. New York: Springer, pp. 1–94.
Sofer T., Cornelis M. C., Kraft P. and Tchetgen E. J. T. (2014). Control Function Assisted IPW Estimation with a Secondary Outcome in Case-Control Studies. Harvard University, Biostatistics Working Paper Series Paper 174.
Tchetgen Tchetgen E. J. (2014). A general regression framework for a secondary outcome in case–control studies. Biostatistics 15, 117–128.
Tchetgen Tchetgen E. J. and Vansteelandt S. (2013). Alternative identification and inference for the effect of treatment on the treated with an instrumental variable. Harvard University, Biostatistics Working Paper Series Paper 166.
Vansteelandt S. and Goetghebeur E. (2003). Causal inference with generalized structural mean models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 817–835.
Vansteelandt S. and Goetghebeur E. (2005). Sense and sensitivity when correcting for observed exposures in randomized clinical trials. Statistics in Medicine 24, 191–210.
Vansteelandt S. and Keiding N. (2011). Invited commentary: G-computation-lost in translation? American Journal of Epidemiology 173, 739–742.
Vansteelandt S., Bowden J., Babanezhad M. and Goetghebeur E. (2011). On instrumental variables estimation of causal odds ratios. Statistical Science 26, 403–422.
