Biostatistics (Oxford, England). 2017 Feb 6;18(3):465–476. doi: 10.1093/biostatistics/kxw059

Instrumental variable estimation of causal odds ratios using structural nested mean models

Roland A Matsouaka, Eric J Tchetgen Tchetgen
PMCID: PMC5862265  PMID: 28334061

Summary

We consider estimating causal odds ratios using an instrumental variable under a logistic structural nested mean model (LSNMM). Current methods for LSNMMs either rely heavily on possibly “uncongenial” modeling assumptions or involve intricate numerical challenges, which have impeded their use. In this article, we present an alternative method that ensures a congenial parametrization, circumvents the computational complexity of existing methods, and is easy to implement. We illustrate the proposed method to (1) estimate the causal effect of years of education on earnings using data from the NLSYM and (2) assess the impact that moving families from high-poverty to low-poverty neighborhoods had on lifetime major depressive disorder among adolescents in the “Moving to Opportunity (MTO) for Fair Housing Demonstration Project” from the Department of Housing and Urban Development.

Keywords: Causality, Instrumental variable, Odds ratio, Confounding, Structural model, Non-compliance

1. Introduction

Instrumental variable (IV) methods are used to estimate, under certain assumptions, the effects of an exposure on an outcome when unobserved confounding is suspected to be present. An IV is a pre-exposure variable that, conditional on a set of measured baseline covariates, is (1) associated with the exposure, (2) associated with the outcome only through the exposure, that is, there is no direct effect of the IV on the outcome upon intervening on exposure (also known as the exclusion restriction assumption), and (3) independent of any unmeasured confounding variable of the effects of the exposure on the outcome (Vansteelandt and others, 2011).

In clinical trials affected by non-compliance, random assignment to treatment (the exposure in this context) is often used as an IV, whereas in observational studies the choice of a valid IV is challenging (Martens and others, 2006; Murray, 2006; Glymour and others, 2012). Nevertheless, once an appropriate IV has been identified, it may potentially be used to recover a consistent estimate of the effect of an exposure despite the presence of unmeasured confounding.

Compared to IV methods for continuous outcomes, those for binary outcomes have received far less attention. Our aim is to estimate the conditional causal odds ratio (COR), that is, the effect of an exposure on a binary outcome, conditional on the exposure level, IV, and measured covariates under an LSNMM. As demonstrated by Robins (1994), one cannot identify the COR solely based on the standard IV assumptions (1)–(3). In fact, Robins and Rotnitzky (2004) show explicitly that the resulting causal model is overparametrized, in the sense that the likelihood for the LSNMM depends on more unknown parameters than the observed data likelihood and thus the model must somehow be restricted for identification purposes.

Any viable approach to identify and estimate the COR requires additional restrictions beyond the standard IV assumptions (Vansteelandt and Goetghebeur, 2005; Liu and others, 2015).

Both Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) make the “no-current interaction assumption” (Assumption (4)), but develop distinct approaches for estimating the COR parameter of an LSNMM under Assumptions (1)–(4). Vansteelandt and Goetghebeur (2003) posit a so-called association model for the exposure, IV and covariates with the binary outcome. However, when the association model is not saturated, it is prone to model incompatibility with the LSNMM (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). Robins and Rotnitzky (2004) develop an alternative approach that avoids this limitation using a parametrization that is always compatible. Unfortunately, their approach can be computationally prohibitive and challenging, especially with a continuous IV. It requires repeatedly solving a complicated integral equation numerically for each observation used in the process of finding an estimator of the LSNMM, which makes it difficult to implement.

The purpose of this article is to develop a different strategy to estimate the parameters of an LSNMM under assumptions (1)–(4) using a novel parametrization, which resolves the aforementioned difficulties with previous estimators of an LSNMM. Unlike Vansteelandt and Goetghebeur (2003), but similar to Robins and Rotnitzky (2004), we use a (variation independent) compatible parametrization of the observed data likelihood. Furthermore, unlike Robins and Rotnitzky (2004), our approach does not involve solving iterative integral equations and is thus readily implementable regardless of the nature or the dimension of the exposure, IV, and covariates.

Using the potential outcome framework, in Section 2, we lay out the IV conditions, present the causal effect of interest, and describe identification assumptions. We continue in Section 3 with a review of existing estimation methods for the LSNMM. In Section 4, we introduce the new parametrization, show how it contrasts with the approach of Robins and Rotnitzky (2004), and demonstrate some of its important properties. Using the proposed parametrization, we also present a straightforward maximum likelihood approach for estimating the LSNMM. In addition, we provide a goodness-of-fit (GOF) test statistic which is useful to evaluate parametric assumptions about nuisance parameters of the fitted likelihood model under Assumptions (1) through (4). In Section 5, we report a simulation study for both a binary and a continuous exposure, respectively, using continuous baseline covariates. We also closely examine the finite-sample performance of the proposed GOF test statistic. Finally, we apply our method to two different data sets to assess the impact of years of education on wages in the United States National Longitudinal Survey of Young Men (NLSYM) study and to evaluate the effects of moving from high-poverty to low-poverty neighborhoods on lifetime major depressive disorder among youth using data from the US Department of Housing and Urban Development “Moving to Opportunity (MTO) for Fair Housing Demonstration Project” (reported in the online supplementary material available at Biostatistics online).

2. Notation and assumptions

Suppose we observe Inline graphic independent and identically distributed copies of the vector Inline graphic, where Inline graphic is an IV for the effect of an exposure Inline graphic on an outcome Inline graphic given a set of measured baseline covariates Inline graphic. The vector Inline graphic includes all measured confounders of the effects of Inline graphic on Inline graphic and of the effects of Inline graphic on Inline graphic. We define the potential outcome Inline graphic as the outcome we would have observed had, possibly contrary to fact, Inline graphic been set to Inline graphic and Inline graphic set to Inline graphic by external intervention. Likewise, Inline graphic denotes the outcome had Inline graphic been set to Inline graphic. To identify causal effects, we make the consistency assumption that if a person is assigned to a specific exposure Inline graphic and an IV level Inline graphic, their observed outcome Inline graphic coincides with their potential outcome Inline graphic (Robins and Rotnitzky, 2004). The IV conditions presented in the introduction may now be formally expressed as: (1) non-null association between Inline graphic and Inline graphicInline graphicInline graphic; (2) exclusion restriction: Inline graphic almost surely for all Inline graphic; and (3) independence of potential outcomes and IV: Inline graphicInline graphicInline graphic for all Inline graphic. The notation Inline graphicInline graphicInline graphic indicates stochastic independence between variables A and B given C.

We define an LSNMM as

$$\mathrm{logit}\left[E(Y_x \mid x, z, l)\right] - \mathrm{logit}\left[E(Y_0 \mid x, z, l)\right] = \tilde{\gamma}(x, z, l), \tag{2.1}$$

where Inline graphic for any Inline graphic Throughout the article, we will index the LSNMM with a finite dimensional parameter Inline graphic, that is, Inline graphic The primary inferential goal of this article is to identify and estimate the parameter Inline graphic. The function Inline graphic represents a causal contrast; it compares the average potential outcomes, on the log odds scale, for the subset of the population with Inline graphic. As such, Inline graphic characterizes a COR as a function of Inline graphic and may be used to evaluate heterogeneity of the exposure causal effect by the IV and other pre-exposure variables. Our focus on CORs aligns with a common practice in the analysis of observational studies where measured baseline covariates are used to control for confounders. CORs are of interest in settings where the effect of reducing the exposure to Inline graphic is investigated within subgroups of patients who share similar characteristics Inline graphic. Moreover, such conditional exposure effects are likely to be more transportable across populations (Vansteelandt and Keiding, 2011; Burgess, 2013).

Assumptions (1)–(3) do not suffice to identify Inline graphic (see Robins and others, 2000). Both Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) make an additional assumption (4), the “no current treatment value interaction” assumption: Inline graphic almost surely, for Inline graphic a function of Inline graphic and Inline graphic only, that is, there is no effect modification of the exposure by the IV. Therefore, conditional on the observed covariates, the effect of treatment is constant for treated individuals across levels of the IV Inline graphic. Familiar choices for Inline graphic are Inline graphic or Inline graphic, where Inline graphic is a component of Inline graphic. Throughout this article, we assume that Inline graphic is correctly specified with unknown parameter Inline graphic, which is identified under Assumptions (1)–(4).

Assumption (4) is not empirically testable without making an additional assumption. This is because under Assumptions (1)–(4) the likelihood is in fact a perfect fit to the observed data (see Tchetgen Tchetgen and Vansteelandt, 2013 for further details). Vansteelandt and Goetghebeur (2005) and Clarke and Windmeijer (2010) studied the impact on inference of possible violation of Assumption (4). Alternative identification conditions other than (4) are of great interest and constitute an ongoing research topic (Tchetgen Tchetgen and Vansteelandt, 2013). Richardson and Robins (2010) and Richardson and others (2011) elucidated the issue of identification and provided a careful analysis of identification in a basic binary IV model.

3. Review of SNMM estimations for binary outcomes

3.1. Double-logistic estimator

3.1.1. Estimation:

From (2.1), Inline graphic, where Inline graphic. Using this relationship, Vansteelandt and Goetghebeur (2003) suggested modeling the observed association Inline graphic to derive an unbiased estimating equation for Inline graphic. They developed the so-called double-logistic estimator and showed that Inline graphic can be identified if one postulates an association model Inline graphic using the observed outcome Inline graphic, the exposure Inline graphic and the IV Inline graphic

To estimate Inline graphic, an estimator Inline graphic of Inline graphic is first obtained from Inline graphic using for instance maximum likelihood estimation. Then, combining the association model Inline graphic with the LSNMM (2.1) yields an unbiased prediction Inline graphic of the average Inline graphic of counterfactual outcome Inline graphic within levels of Inline graphic for each subject Inline graphic, Inline graphic. A consistent point estimator Inline graphic of Inline graphic can finally be obtained by solving the estimating equation Inline graphic where Inline graphic is an arbitrary function of Inline graphic and Inline graphic The choice of Inline graphic does not affect consistency but does affect efficiency (see Robins, 1994; Clarke and others, 2015; or Vansteelandt and others, 2011 for optimal choices that yield an efficient estimator of Inline graphic).
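The two-stage procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a randomized binary IV so that the arbitrary function of the IV and covariates can be taken as the IV minus its mean, a linear contrast equal to psi times the exposure, and an estimating function that sums (z - mean of z) times the predicted exposure-free outcome mean. The names `exposure_free_mean`, `estimating_function`, `solve_psi`, the toy population, and the bisection bracket are all hypothetical.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    return math.log(p / (1.0 - p))

def exposure_free_mean(m, x, psi):
    """Remove the exposure effect psi * x from an outcome probability m on the logit scale."""
    return expit(logit(m) - psi * x)

def estimating_function(psi, rows):
    """rows holds (weight, z, x, m) tuples; weights sum to one, m is the fitted association-model probability."""
    z_bar = sum(w * z for w, z, _, _ in rows)
    return sum(w * (z - z_bar) * exposure_free_mean(m, x, psi) for w, z, x, m in rows)

def solve_psi(rows, lo=-5.0, hi=5.0, n_iter=100):
    """Bisection; assumes the estimating function changes sign exactly once on [lo, hi]."""
    f_lo = estimating_function(lo, rows)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = estimating_function(mid, rows)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical toy population with no unmeasured confounding and true psi = 0.7:
# P(Z=1) = 0.5, P(X=1 | Z=0) = 0.2, P(X=1 | Z=1) = 0.7, logit P(Y=1 | x) = -1 + 0.7 x.
TRUE_PSI = 0.7
rows = []
for z, p_x1 in ((0, 0.2), (1, 0.7)):
    for x in (0, 1):
        weight = 0.5 * (p_x1 if x == 1 else 1.0 - p_x1)
        rows.append((weight, z, x, expit(-1.0 + TRUE_PSI * x)))

psi_hat = solve_psi(rows)
```

In this toy population the association model is correct by construction, so the root coincides with the true value; the incompatibility issues discussed next arise when a non-saturated association model is fitted to real data.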

3.1.2. Model congeniality:

Whether this method yields a consistent estimator of Inline graphic depends largely on how well the association model reproduces major features of the data. When the association model is not saturated, it can be incompatible with the LSNMM (2.1) and lead to an inconsistent estimator (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). Even worse, for some values of Inline graphic, there may be no solution Inline graphic to the estimating equation. Model incompatibility or lack of model congeniality (to use the terminology of Meng (1994) and Vansteelandt and others (2011)) arises when two models for the observed data cannot hold simultaneously for all parameter values allowed by the respective models. As a consequence, there is no data generating mechanism for which both can hold, which is worse than model misspecification. Under misspecification, there exist data generating mechanisms corresponding to the model; however, none of them coincides with the mechanism leading to the observed data. With a large number of covariates, it is a daunting task to fit the saturated model that includes all available covariates and all possible higher order interactions. Therefore, one may instead need to use a parsimonious model that is flexible enough to capture important features of the data. Unfortunately, it is exactly when parametric restrictions are imposed on the association model (particularly with respect to the main effects of Inline graphic along with its interactions with some relevant covariates in Inline graphic) that we may have major inconsistencies between the association model Inline graphic and the LSNMM (2.1), leading to a lack of model congeniality and a noisy COR estimate.

3.2. Congenial parametrization of Robins and Rotnitzky

To guarantee a parametrization that is always congenial, Robins and Rotnitzky (2004) proposed one based on the contrast Inline graphic, which encodes the degree of unobserved confounding and is referred to as the selection bias function. In the absence of exposure (i.e., Inline graphic) or if there is no unmeasured confounding (i.e., Inline graphicInline graphicInline graphic), Inline graphic.

Using the selection bias function Inline graphic, we have

$$\mathrm{logit}\, P(Y = 1 \mid x, z, l) = \gamma(x, l) + q(x, z, l) + v(z, l), \tag{3.1}$$

where Inline graphic is the unique solution to the integral equation Inline graphic with Inline graphic and Inline graphic the cumulative distribution function (CDF) of the conditional (exposure) density function Inline graphic of the random variable Inline graphic. In other words, Inline graphic is a functional of Inline graphic and Inline graphic implicitly defined by the integral equation that must be solved for each observation.

Estimation and numerical optimization burden: Parametric working models Inline graphic, Inline graphic, Inline graphic are postulated to make inference. The MLE of Inline graphic maximizes Inline graphic where Inline graphic is evaluated at Inline graphic, the solution of the integral equation. As previously stated, the parametrization of Robins and Rotnitzky (2004) has the advantage of providing an association model that is always compatible with the LSNMM (2.1). Unfortunately, for most choices of models for Inline graphic, Inline graphic and Inline graphic, the required integral equation cannot be solved for Inline graphic in closed form, except when Inline graphic is binary.

Numerical optimization of the joint density of the observables under this parametrization involves finding, for each Inline graphic, numerical solutions Inline graphic from the integral equation for each observed Inline graphic, within each iteration of the algorithm (Robins and Rotnitzky, 2004; Vansteelandt and others, 2011). In fact, when the exposure takes more than two values, or is continuous or multivariate, this approach is computationally challenging, particularly when the IV is continuous and there is a large number of covariates Inline graphic. To put things in perspective, this means if we have 500 subjects in the data set and if the optimization algorithm requires 100 iterations to converge, we will need to solve 50 000 integral equations in total to find the final estimate of the causal odds ratio. This numerical drawback has impeded the widespread use of this approach, despite its mathematical and theoretical underpinning.
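To make the numerical burden concrete, the following Python sketch solves one such equation by bisection for a discrete exposure. It is an illustration only: the form of the equation used here (the exposure-averaged value of expit{q + v} must equal a target exposure-free mean) is an assumption based on the description above, since the displayed equation is not recoverable from this text, and `solve_v`, its arguments, and the numeric inputs are hypothetical.

```python
import math

def expit(a):
    # Numerically stable logistic function
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def solve_v(q_values, x_probs, target, lo=-30.0, hi=30.0, tol=1e-12):
    """Find v such that sum_x expit(q(x) + v) * f(x) equals target.

    The left-hand side is increasing in v, so bisection is sufficient.
    """
    def gap(v):
        return sum(p * expit(q + v) for q, p in zip(q_values, x_probs)) - target

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# One solve for one subject at one iteration (hypothetical inputs)
q_values = [0.0, 0.8]        # selection bias function at x = 0 and x = 1
x_probs = [0.6, 0.4]         # exposure distribution given the IV and covariates
target = expit(-0.5)         # exposure-free outcome mean
v_hat = solve_v(q_values, x_probs, target)
residual = sum(p * expit(q + v_hat) for q, p in zip(q_values, x_probs)) - target

# The count quoted in the text: 500 subjects times 100 optimizer iterations
n_solves = 500 * 100
```

Each call is cheap for a binary exposure, but with a continuous exposure the sum becomes an integral that must itself be approximated inside every root-finding step, for every subject and every iteration.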

4. New parametrization

We now propose a different congenial parametrization that obviates the need to solve integral equations. Let Inline graphic denote the conditional density function of the random variable Inline graphic (or the probability mass function if Inline graphic is discrete) and Inline graphic its corresponding CDF. While Robins and Rotnitzky (2004) parametrize the conditional density function Inline graphic (among other things) to get to the parametric model (3.1), our parametrization uses the conditional density function Inline graphic. All proofs related to this section are given in the supplementary material available online at http://www.biostatistics.oxfordjournals.org.

Define Inline graphic We show (in the supplementary material available at Biostatistics online) that Inline graphic is equal to Inline graphic The parametric model (3.1) becomes,

$$\mathrm{logit}\, P(Y = 1 \mid x, z, l) = \mathrm{logit}\, f_y(1 \mid x, z, l) = \gamma(x, l; \psi) + q(x, z, l) - \bar{q}(z, l) + t(l).$$

Under this parametrization, we are free to choose models for Inline graphic, Inline graphic, and Inline graphic The density Inline graphic is determined by Inline graphic Under Assumptions (1)–(3) and the proposed parametrization, we have the following key result:

Theorem 1: We have Inline graphic is equal to Inline graphic if and only if Inline graphic

This result gives one the freedom to posit variation independent parametric models for Inline graphic, Inline graphic, and Inline graphic such that the marginalization property of the result will hold for all parameter values, even if all models are incorrect.
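The marginalization property can be checked numerically. The Python sketch below is an illustration under assumptions that go beyond what survives in this text: it takes the quantity written as q-bar to be the log of the average of exp{q} over the exposure distribution among subjects whose exposure-free outcome is zero, takes t to be the logit of the exposure-free outcome mean, uses a binary exposure and binary IV, and drops covariates for brevity. All function names and numeric values are hypothetical.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def q(x, z):
    # Hypothetical selection bias function; q(0, z) = 0 by construction
    return x * (0.8 + 0.5 * z)

def f_x_given_y0_zero(x, z):
    # Hypothetical model for P(X = x | Y0 = 0, Z = z)
    p_x1 = expit(-0.3 + 1.2 * z)
    return p_x1 if x == 1 else 1.0 - p_x1

def q_bar(z):
    # Log of the average of exp(q) under f(x | Y0 = 0, z)
    return math.log(sum(math.exp(q(x, z)) * f_x_given_y0_zero(x, z) for x in (0, 1)))

def f_x_given_z(x, z, t):
    # Exposure distribution implied by the three chosen pieces (a two-component mixture)
    pi0 = expit(t)
    f0 = f_x_given_y0_zero(x, z)
    return (1.0 - pi0) * f0 + pi0 * math.exp(q(x, z) - q_bar(z)) * f0

def mean_y0_given_z(z, t):
    # Average of expit(q - q_bar + t) over the implied exposure distribution
    return sum(expit(q(x, z) - q_bar(z) + t) * f_x_given_z(x, z, t) for x in (0, 1))

T = -0.5
marginal_means = [mean_y0_given_z(z, T) for z in (0, 1)]
```

Both entries of `marginal_means` equal expit(T) up to rounding, whatever values are chosen for the coefficients inside `q` and `f_x_given_y0_zero`, which is the sense in which the pieces are variation independent.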

4.1. Maximum likelihood estimation

Let Inline graphic and Inline graphic be the density functions of the random variables Inline graphic and Inline graphic, respectively. The observed data likelihood factorizes as Inline graphic. To draw inference under the new parametrization, we obtain the maximum likelihood estimator (MLE) of Inline graphic using parametric models Inline graphic and Inline graphic. Using models for Inline graphic and Inline graphic, Inline graphic is derived from Inline graphic. Thus, the likelihood becomes Inline graphic. Such a likelihood can be maximized using PROC NLMIXED in SAS or the optim function in R. For most choices of the selection bias function Inline graphic and of the distribution Inline graphic we make in practice, the integral Inline graphic will have a closed form solution. Nevertheless, when a choice of Inline graphic and Inline graphic does not lead to a closed form expression of Inline graphic, the latter can be approximated numerically, for instance by Gauss-Hermite quadrature (Liu and Pierce, 1994) or by Monte-Carlo simulation, and be easily incorporated in any standard software code.
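As a rough guide to what such code might look like, here is a Python sketch of the log-likelihood for a binary exposure. It is not the authors' SAS program. The working models are hypothetical choices made for illustration (a contrast equal to psi times x, a selection bias function equal to eta times x, a linear model in one covariate for t, and a logistic model for the exposure among subjects with exposure-free outcome zero), and the closed form used for q-bar follows from those choices.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def row_log_likelihood(theta, y, x, z, l):
    """Log of f(y | x, z, l) * f(x | z, l) for one subject with binary x."""
    psi, eta, omega0, omega1, alpha0, alpha1, alpha2 = theta
    p_x1 = expit(alpha0 + alpha1 * z + alpha2 * l)      # P(X = 1 | Y0 = 0, z, l)
    f0 = p_x1 if x == 1 else 1.0 - p_x1
    q = eta * x                                         # selection bias function
    q_bar = math.log((1.0 - p_x1) + p_x1 * math.exp(eta))
    t = omega0 + omega1 * l                             # logit of the exposure-free mean
    pi0 = expit(t)
    f_x = f0 * ((1.0 - pi0) + pi0 * math.exp(q - q_bar))
    p_y1 = expit(psi * x + q - q_bar + t)
    f_y = p_y1 if y == 1 else 1.0 - p_y1
    return math.log(f_y) + math.log(f_x)

def negative_log_likelihood(theta, data):
    """Objective to hand to a general-purpose minimizer; data rows are (y, x, z, l)."""
    return -sum(row_log_likelihood(theta, *row) for row in data)

# Hypothetical parameter values and three made-up observations
theta = (0.7, 0.8, -0.5, 0.3, -0.3, 1.2, 0.4)
data = [(1, 1, 1, 0.2), (0, 0, 0, -1.1), (1, 0, 1, 0.5)]
nll = negative_log_likelihood(theta, data)

# Sanity check: the joint density of (y, x) sums to one at a fixed (z, l)
total_mass = sum(
    math.exp(row_log_likelihood(theta, y, x, 1, 0.2)) for y in (0, 1) for x in (0, 1)
)
```

The density of the IV given covariates does not appear in the objective, so it need not be modeled to estimate the remaining parameters.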

Finally, the MLE of Inline graphic is uncorrelated with that of Inline graphic. In fact, we need not estimate the latter to obtain an estimate of the former. Thus, the MLE of Inline graphic cannot exploit any prior information about Inline graphic such as the known randomization probability in a randomized experiment. However, as we show in Section D of the supplementary material available at Biostatistics online, one can leverage knowledge about the law of Inline graphic given Inline graphic to construct a GOF test statistic for nuisance models of the likelihood function derived in this section, which is asymptotically normal with mean zero only if the likelihood is correctly specified. The GOF statistic is based on an influence function for Inline graphic in a model where the likelihood is otherwise unrestricted, and therefore, it naturally accounts for variability of all unknown nuisance parameters under the null of no model misspecification.

5. Simulation study

In this section, we provide a data generating process following our proposed parametrization. We sampled the baseline covariates Inline graphic from independent bivariate normal distributions such that Inline graphic and Inline graphic with correlation coefficient of 0.5. Then, we generated a binary IV Inline graphicInline graphic and defined Inline graphic and Inline graphic. In addition, we specified the distribution Inline graphic generated Inline graphic with density Inline graphic and derived Inline graphic where Inline graphic Finally, we made the simple choice Inline graphic

Overall, we generated a total of 2000 data sets of size Inline graphic and estimated Inline graphic, the empirical type I error, and the power of the GOF test statistic, that is, the proportion of simulated data sets for which Inline graphic. We ran the simulations using SAS PROC NLMIXED.

5.1. Binary exposure

Let Inline graphic and Inline graphic. We have Inline graphic and Inline graphic We generated the binary exposure Inline graphic

For each parameter, we report in Table 1 the bias, the mean square error (MSE), and the coverage probability, that is, the proportion of 95% confidence intervals that covered the true parameter. These results highlight the good performance of our approach, with small bias and small MSE. Furthermore, coverage probabilities hover around 95%, reflecting good coverage. The Monte-Carlo type I error rate of the GOF test statistic is equal to 0.012, indicating that the GOF test rejects the null hypothesis of a correctly specified model less often than the nominal level, which may partially reflect the conservative variance estimator Inline graphic used to construct the test statistic.

Table 1.

Simulation Results: Binary Exposure

Model            Parameter        Bias     MSE     Coverage   S.E.
Inline graphic   Inline graphic    0.002   0.102   0.96       0.319
Inline graphic   Inline graphic   -0.004   0.106   0.95       0.326
                 Inline graphic    0.005   0.007   0.95       0.083
                 Inline graphic   -0.002   0.001   0.95       0.038
Inline graphic   Inline graphic   -0.001   0.001   0.94       0.033
                 Inline graphic    0.001   0.003   0.95       0.054
                 Inline graphic    0.006   0.031   0.96       0.177
Inline graphic   Inline graphic    0.000   0.002   0.95       0.041
                 Inline graphic   -0.001   0.002   0.95       0.040
                 Inline graphic   -0.001   0.004   0.95       0.065
Inline graphic   Inline graphic    0.000   0.001   0.94       0.032
                 Inline graphic    0.001   0.000   0.95       0.030

Corresponding GOF test: Type I error = 0.012
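The summary measures reported in Table 1 are straightforward to compute from the replicate-level output. The Python sketch below shows one way to do so; the function names and the four made-up replicates are hypothetical, and the Monte-Carlo standard error calculation assumes a nominal test level of 0.05, which is not stated explicitly in this text.

```python
import math

def summarize(estimates, std_errors, truth, z_crit=1.96):
    """Bias, mean square error, and Wald-interval coverage across simulated data sets."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mse = sum((e - truth) ** 2 for e in estimates) / n
    covered = sum(
        1 for e, s in zip(estimates, std_errors)
        if e - z_crit * s <= truth <= e + z_crit * s
    )
    return {"bias": bias, "mse": mse, "coverage": covered / n}

def monte_carlo_se(rate, n_reps):
    """Standard error of an estimated rejection (or coverage) rate over n_reps replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_reps)

# Four made-up replicates with true parameter value 1.0
summary = summarize([0.9, 1.1, 1.0, 1.6], [0.1, 0.1, 0.1, 0.1], truth=1.0)

# Precision of a rejection rate near an assumed nominal level of 0.05 with 2000 data sets
se_nominal = monte_carlo_se(0.05, 2000)
```

With 2000 data sets the standard error of a rate near 0.05 is roughly 0.005, so an observed type I error of 0.012 lies well below what simulation noise alone would explain.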

5.2. Continuous exposure

Consider Inline graphic such that Inline graphic. We have Inline graphic. We show in Section E of the supplementary material available at Biostatistics online that Inline graphic follows a mixture of two normal distributions with density Inline graphic where Inline graphic for Inline graphic. Estimation results are summarized in Table 2. As with the binary exposure, the results show small bias and MSE as well as good coverage probability, indicating that our approach performs well. The GOF test of the nuisance models had an empirical type I error of 0.039, which, although closer to the nominal level than for the binary exposure, is still conservative.

Table 2.

Simulation Results: Continuous Exposure

Model            Parameter        Bias     MSE     Coverage   S.E.
Inline graphic   Inline graphic    0.001   0.001   0.95       0.042
Inline graphic   Inline graphic    0.000   0.003   0.95       0.052
                 Inline graphic    0.000   0.004   0.94       0.062
                 Inline graphic    0.000   0.000   0.95       0.016
Inline graphic   Inline graphic    0.000   0.000   0.95       0.014
                 Inline graphic   -0.001   0.002   0.94       0.042
                 Inline graphic   -0.001   0.000   0.95       0.010
                 Inline graphic    0.003   0.029   0.94       0.172
Inline graphic   Inline graphic    0.000   0.005   0.95       0.073
                 Inline graphic    0.002   0.003   0.95       0.054
                 Inline graphic   -0.004   0.019   0.95       0.138
Inline graphic   Inline graphic    0.001   0.001   0.95       0.041
                 Inline graphic    0.002   0.001   0.95       0.040

Corresponding GOF test: Type I error = 0.039.
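The two-component normal mixture mentioned in Section 5.2 arises naturally under one simple specification, sketched below in Python. The specification is an assumption made for illustration, since the exact simulation models are not recoverable from this text: the exposure among subjects with exposure-free outcome zero is taken to be normal with mean `mean` and standard deviation `sd`, and the selection bias function is taken to be linear in the exposure with slope `a`. Under these choices q-bar has the closed form a * mean + (a * sd)^2 / 2, and exponential tilting simply shifts the normal mean by a * sd^2.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def q_bar_closed_form(a, mean, sd):
    # log E[exp(a * X)] for X ~ Normal(mean, sd^2)
    return a * mean + 0.5 * (a * sd) ** 2

def tilted_density(x, a, mean, sd):
    # exp{q(x) - q_bar} times the baseline density, with q(x) = a * x
    return math.exp(a * x - q_bar_closed_form(a, mean, sd)) * normal_pdf(x, mean, sd)

def exposure_density(x, a, mean, sd, t):
    # Mixture of the baseline normal and the mean-shifted normal
    pi0 = 1.0 / (1.0 + math.exp(-t))
    return (1.0 - pi0) * normal_pdf(x, mean, sd) + pi0 * normal_pdf(x, mean + a * sd ** 2, sd)

# Hypothetical values
a, mean, sd, t = 0.5, 1.0, 2.0, -0.5
grid = [-3.0, 0.0, 1.0, 2.5, 6.0]
max_gap = max(
    abs(tilted_density(x, a, mean, sd) - normal_pdf(x, mean + a * sd ** 2, sd)) for x in grid
)
```

Because the tilted density is again normal, no numerical integration is needed for this specification; quadrature or Monte-Carlo approximation is only required when the selection bias function and baseline density do not combine in closed form.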

5.3. Power of the goodness-of-fit test

In addition to the type I error, we also assessed the power of the GOF test statistic to detect the presence of model misspecification under various departures from the assumed likelihood model. We considered different misspecifications of models for Inline graphic and Inline graphic. The results, presented in Table 3, show that the power of the GOF test varies depending on the type of misspecification. The greatest power was achieved when Inline graphic was omitted from the model Inline graphic. Relatively lower power was observed when the Inline graphic interaction term was left out of Inline graphic. Moderate power (Inline graphic) was observed for related misspecifications of Inline graphic.

Table 3.

Goodness-of-fit Test: Power to Detect Departures from the True Models

Misspecified Model   Missing covariates             Parameter Values*   Power
(1) Binary Exposure
Inline graphic       Inline graphicInline graphic   -0.6, 1.5           0.41
                     Inline graphic                 1.5                 0.40
Inline graphic       Inline graphic                 0.8                 0.03
                     Inline graphic                 1.5                 0.89
(2) Continuous Exposure
                     Inline graphic                 -0.5                0.95
Inline graphic       Inline graphicInline graphic   0.6, -1.5           0.62
                     Inline graphic                 0.6                 0.43
Inline graphic       Inline graphic                 0.8                 0.06
                     Inline graphic                 0.6                 0.88

*Covariates (with corresponding parameter values) used in the generated model, but omitted in the fitted model.

We observed similar patterns for both binary and continuous exposures (Table 3). The power of the proposed GOF test varies considerably with the form of model misspecification. For instance, when we specify only the main effects of the baseline covariates and ignore interaction terms in the model for Inline graphic, the power to reject the misspecified model is low. However, when the true model for Inline graphic has no interaction and a main effect term is omitted, the GOF test rejects the posited model with substantially higher power. Finally, omitting the quadratic term in the model for Inline graphic results in relatively moderate power.

6. Data application

To illustrate the proposed method, we analyze two different data sets, one with a continuous exposure and the other with a binary exposure. The first application uses data from the 1976 subset of the United States NLSYM, looking at the effect of years of education on earnings based on a sample of 3010 working men aged 24–34 (Card, 1995).

Our second application considers the impact that moving low-income families from high-poverty to low-poverty neighborhoods had on lifetime major depressive disorder among adolescents in the MTO study (see Section G of the supplementary material available at Biostatistics online).

The effect of years of education on wages

There has been a longstanding interest in the causal impact of duration of schooling on earnings (Card, 1995; Heckman and others, 2006). Following Card (1995), we use the indicator Inline graphic of whether a study participant lived in the proximity of a four-year college in 1966 as an IV to study the effect of years of education on hourly wages. For illustration purposes, we use this classical example to estimate the effect of education on Inline graphic, the indicator of earning an hourly wage greater than or equal to the median (i.e., 537.5 cents).

We consider 12 years of education as the reference point and define Inline graphic = Years of Education - 12. The vector of potential confounding variables Inline graphic includes mother’s and father’s years of education as well as indicators of whether the person is black; lived with both natural parents, with one natural parent and one step parent, or with mother only at age 14; lived in one of the nine regions of residence or in a standard metropolitan statistical area (SMSA); and had a family library card at age 14. Missing covariate values were imputed using simple imputation.

Although Card (1995) makes a compelling case for the validity of this choice of IV, one cannot rule out with certainty that there may be other factors, such as family or neighborhood characteristics, changes in the institutional structure of the education system, that are associated with the instrument Inline graphic and can affect hourly wages apart from years of education. Nevertheless, for expository purposes, we focus our illustration on this instrument.

To run the LSNMM, we consider the following parametric models: Inline graphicInline graphicInline graphicInline graphic and Inline graphic.

The goodness-of-fit test statistic p-value is equal to 0.26. This indicates that there is no evidence that the data are not consistent with the likelihood model we have used, assuming the model for the causal contrast Inline graphic is correctly specified.

Table 4 reports the results for Inline graphic and Inline graphic, where Inline graphic (95% CI: 0.23–0.94). Based on these results, for given values of Inline graphic we can infer the effects of Inline graphic years of education on the probability of earning an hourly wage greater than or equal to the median wage in 1976. For example, for a black study participant with a high-school diploma who lived with a high-school-educated single mother (i.e., 12 years of education) in a metropolitan area and whose family had a library card when he was 14 years old, the odds of earning an hourly wage greater than or equal to the 1976 median would have been 3.5 times what they are currently had he received 15 years of education instead. That is, for a study participant with Inline graphic and Inline graphic, the corresponding odds ratio is Inline graphic.

Table 4.

Effect of Years of Education on Earning Abilities

Model            Parameter                Estimate   S.E.    P-value          95% CI lower   95% CI upper
                 Inline graphic)           0.58      0.18    0.0012            0.23           0.94
                 black                     0.06      0.07    0.44             -0.09           0.20
                 library card             -0.20      0.07    0.004            -0.33          -0.06
                 lived with mom and dad    0.14      0.10    0.17             -0.06           0.34
                 mother’s education       -0.001     0.004   0.83             -0.01           0.01
                 region 2                  0.10      0.15    0.48             -0.19           0.40
                 region 3                 -0.10      0.14    0.49             -0.38           0.18
                 region 4                 -0.20      0.17    0.22             -0.53           0.13
Inline graphic   region 5                  0.11      0.14    Inline graphic   -0.17           0.40
                 region 6                  0.04      0.16    0.81             -0.27           0.34
                 region 7                  0.27      0.15    Inline graphic   -0.04           0.57
                 region 8                 -0.19      0.23    Inline graphic   -0.66           0.27
                 region 9                 -0.34      0.18    0.06             -0.69           0.01
                 lived with single mom     0.12      0.54    0.01             -0.17           0.32
                 SMSA                     -0.13      0.06    0.023            -0.24          -0.02
                 lived with step dad      -0.18      0.11    Inline graphic   -0.40           0.05

The binary outcome Inline graphic is the indicator of whether an hourly wage is greater than or equal to the median wage in 1976 of 537.5 cents.
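The odds ratio of 3.5 quoted for the example participant can be reproduced from the Estimate column of Table 4 as printed. The Python sketch below does the arithmetic under two assumptions that are inferred rather than stated in this text: that the causal contrast is the number of years beyond 12 multiplied by a linear combination of a baseline coefficient (the first row, 0.58) and the covariate coefficients, and that the participant lives in the reference region, so no region term enters. The dictionary keys are labels chosen here for readability.

```python
import math

# Estimates as printed in Table 4 (per additional year of education, log odds scale)
coefficients = {
    "baseline": 0.58,
    "black": 0.06,
    "library card": -0.20,
    "mother's education": -0.001,
    "lived with single mom": 0.12,
    "SMSA": -0.13,
}

# Covariate profile described in the text; the reference region is assumed
profile = {
    "baseline": 1,
    "black": 1,
    "library card": 1,
    "mother's education": 12,
    "lived with single mom": 1,
    "SMSA": 1,
}

log_odds_per_year = sum(coefficients[name] * profile[name] for name in coefficients)
extra_years = 15 - 12
odds_ratio = math.exp(log_odds_per_year * extra_years)
```

The per-year log odds ratio is 0.418, and three additional years give exp(1.254), approximately 3.5, matching the figure in the text.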

In addition, as shown in the full table (Section F of the supplementary material available at Biostatistics online), the selection bias function is significantly different from zero providing explicit empirical evidence of the impact of unobserved confounding. This bias does not appear to depend on the interaction between years of education and the IV, but does depend on years of education, on the interaction between years of education and the family having a library card, and on the interaction between years of education and geographic location (region 9).

7. Conclusion

In this article, we have presented a new parametrization for a logistic structural nested mean model (LSNMM) for a binary outcome and we have proposed a corresponding maximum likelihood approach for estimation. Our approach builds upon the theoretical framework of Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004). Unlike Vansteelandt and Goetghebeur (2003), and similar to Robins and Rotnitzky (2004), our approach yields a parametric model that is guaranteed to always be congenial (or compatible) with the LSNMM. However, unlike Robins and Rotnitzky (2004), we obviate the need to numerically solve integral equations, which can be computationally cumbersome and does not scale easily with the dimension of the exposure $X$. In addition, a key attraction of our approach is that it is readily implemented using standard statistical software. Our simulation results confirm the good performance of the proposed approach. To illustrate our approach, we applied it to two different data sets, one with a binary exposure (whether a single mother moved with her family out of a poor neighborhood) and the other with a continuous exposure (the number of years of education).

Our simulations showed that the proposed GOF is quite conservative in the settings we considered and its power to detect possible departures from the assumed model can be moderate to low. In some settings, the low power of the GOF statistic may also reflect the conservative estimate of variance used to standardize the statistic. The main advantage of the current GOF is its simplicity. In future work, we plan to further study the performance of the GOF statistic when standardized by a consistent estimator of its variance, which was not considered in the foregoing due to severe computational roadblocks.

As previously discussed, the MLE of Inline graphic is uncorrelated with that of Inline graphic. In fact, we need not estimate the latter to obtain an estimate of the former. This, in turn, implies that the MLE of Inline graphic cannot exploit any prior information about Inline graphic such as the known randomization probability in a randomized experiment. This is a notable limitation of the likelihood approach. To remedy this problem, Vansteelandt and Goetghebeur (2003) and Robins and Rotnitzky (2004) propose methods that are doubly robust under the sharp null hypothesis, Inline graphic, of no exposure causal effect by explicitly using any available knowledge about Inline graphic. Robins and Rotnitzky (2004), in particular, propose to use an influence function of Inline graphic for inference in the semiparametric model defined by Assumptions (1)–(4) only. This approach is endowed with the above robustness property, but suffers the same computational limitations as their likelihood approach.

An alternative approach to the likelihood can be obtained by solving the estimating equation of Robins and Rotnitzky (2004) under our proposed parametrization, which is not further pursued here. In addition, to assess the exclusion restriction Assumption (2), they suggest as a possible analytical strategy a sensitivity analysis that varies their so-called “weak exclusion assumption function” over a possible range. Such an approach can also be used with our proposed parametrization. Finally, the method we have described in this article assumes random sampling and therefore would not be directly applicable to case-control sampling or to other outcome-dependent sampling designs. A straightforward adjustment for the sampling design entails applying inverse-probability weighting (IPW) for selection into the sample. However, weighting may potentially be inefficient. A more efficient approach that makes use of the recent developments in the analysis of secondary outcomes in case-control studies (see Sofer and others, 2014; Tchetgen Tchetgen, 2014), extending the methods presented herein, will be described elsewhere.
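The IPW adjustment mentioned above can be illustrated with a toy calculation. The sketch below is not the authors' procedure. It uses made-up population counts and an assumed known case-control design (all cases sampled, 10% of controls) to show how weighting each sampled subject by the inverse of the selection probability recovers a population quantity, here the exposure prevalence, that the unweighted sample distorts. All names and numbers are hypothetical.

```python
# Hypothetical population counts by (exposure x, outcome y); not from the article.
population = {(0, 0): 4900, (0, 1): 100, (1, 0): 4700, (1, 1): 300}

# Assumed known selection probabilities by outcome: all cases, 10% of controls.
selection_prob = {0: 0.1, 1: 1.0}

# Expected case-control sample counts under that design.
sample = {cell: n * selection_prob[cell[1]] for cell, n in population.items()}


def exposure_prevalence(counts, weights=None):
    """(Weighted) proportion with x = 1; weights are indexed by outcome y."""
    if weights is None:
        weights = {0: 1.0, 1: 1.0}
    total = sum(n * weights[y] for (x, y), n in counts.items())
    exposed = sum(n * weights[y] for (x, y), n in counts.items() if x == 1)
    return exposed / total


# Inverse-probability weights for selection into the sample.
ipw = {y: 1.0 / p for y, p in selection_prob.items()}

truth = exposure_prevalence(population)      # population exposure prevalence
naive = exposure_prevalence(sample)          # unweighted, over-represents cases
weighted = exposure_prevalence(sample, ipw)  # IPW-corrected

print(truth, round(naive, 3), round(weighted, 3))
```

In a likelihood setting the same weights would multiply each subject's log-likelihood contribution. As the text notes, this is simple but may be inefficient.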

It is worth noting that direct comparison of the three approaches discussed herein would be difficult if not impossible in the context of simulation since it would be challenging to posit models for the nuisance parameters that agree under the different parametrizations. However, in the absence of covariates, or in the presence of low dimensional covariates allowing for the use of saturated models, we generally expect all methods would perform reasonably well in practice.

Acknowledgments

The authors acknowledge research support from the National Institutes of Health (NIH). We are also grateful to Nicole Schmidt for helpful comments and suggestions regarding the MTO data. Conflict of Interest: None declared.

Funding

Both authors were supported by NIH grants 1R01MD006064, 1R21HD066312 (T. Osypuk, PI) and R21ES019712 (R. A. Matsouaka) and 1R21ES019712, R01AI104459 and R01HL080644 (E. J. Tchetgen Tchetgen).

Supplementary materials

Supplementary material is available online at http://biostatistics.oxfordjournals.org.

References

  1. Burgess S. (2013). Identifying the odds ratio estimated by a two-stage IV analysis with a logistic regression model. Statistics in Medicine 32, 4726–4747.
  2. Card D. (1995). Using geographic variation in college proximity to estimate the return to schooling. In: Christofides L. N., Grant E. K. and Swidinsky R. (editors), Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp. Toronto: University of Toronto Press, pp. 201–222.
  3. Clarke P. S. and Windmeijer F. (2010). Identification of causal effects on binary outcomes using structural mean models. Biostatistics 11, 756–770.
  4. Clarke P. S., Palmer T. M. and Windmeijer F. (2015). Estimating structural mean models with multiple IVs using the generalised method of moments. Statistical Science 30(1), 96–117.
  5. Glymour M. M., Tchetgen E. J. T. and Robins J. M. (2012). Response to letters on “Credible Mendelian randomization studies: approaches for evaluating the instrumental variable assumptions”. American Journal of Epidemiology 176, 458–459.
  6. Heckman J. J., Lochner L. J. and Todd P. E. (2006). Earnings functions, rates of return and treatment effects: The Mincer equation and beyond. Handbook of the Economics of Education 1, 307–458.
  7. Liu Q. and Pierce D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika 81, 624–629.
  8. Liu L., Miao W., Sun B., Robins J. and Tchetgen Tchetgen E. J. (2015). Doubly Robust Estimation of a Marginal Average Effect of Treatment on the Treated With an Instrumental Variable. Harvard University, Biostatistics Working Paper Series Paper 191.
  9. Martens E. P., Pestman W. R., de Boer A., Belitser S. V. and Klungel O. H. (2006). Instrumental variables: application and limitations. Epidemiology 17, 260–267.
  10. Meng X. L. (1994). Multiple-imputation inferences with uncongenial sources of input. Statistical Science 9(4), 538–558.
  11. Murray M. P. (2006). Avoiding invalid instruments and coping with weak instruments. The Journal of Economic Perspectives 20, 111–132.
  12. Richardson T. S. and Robins J. M. (2010). Analysis of the binary instrumental variable model. In: Dechter R., Geffner H. and Halpern J. Y. (editors), Heuristics, Probability and Causality. A Tribute to Judea Pearl. College Publications, pp. 415–444.
  13. Richardson T. S., Evans R. J. and Robins J. M. (2011). Transparent parameterizations of models for potential outcomes. Bayesian Statistics 9, 569–610.
  14. Robins J. M. (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics - Theory and Methods 23, 2379–2412.
  15. Robins J. and Rotnitzky A. (2004). Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika 91, 763–783.
  16. Robins J. M., Rotnitzky A. and Scharfstein D. O. (2000). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran E. and Berry D. (editors), Statistical Models in Epidemiology, the Environment, and Clinical Trials. New York: Springer, pp. 1–94.
  17. Sofer T., Cornelis M. C., Kraft P. and Tchetgen E. J. T. (2014). Control Function Assisted IPW Estimation with a Secondary Outcome in Case-Control Studies. Harvard University, Biostatistics Working Paper Series Paper 174.
  18. Tchetgen Tchetgen E. J. (2014). A general regression framework for a secondary outcome in case–control studies. Biostatistics 15, 117–128.
  19. Tchetgen Tchetgen E. J. and Vansteelandt S. (2013). Alternative identification and inference for the effect of treatment on the treated with an instrumental variable. Harvard University, Biostatistics Working Paper Series Paper 166.
  20. Vansteelandt S. and Goetghebeur E. (2003). Causal inference with generalized structural mean models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 817–835.
  21. Vansteelandt S. and Goetghebeur E. (2005). Sense and sensitivity when correcting for observed exposures in randomized clinical trials. Statistics in Medicine 24, 191–210.
  22. Vansteelandt S. and Keiding N. (2011). Invited commentary: G-computation-lost in translation? American Journal of Epidemiology 173, 739–742.
  23. Vansteelandt S., Bowden J., Babanezhad M. and Goetghebeur E. (2011). On instrumental variables estimation of causal odds ratios. Statistical Science 26, 403–422.

Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press
