Abstract
The risk ratio is perhaps the effect measure most commonly assessed in epidemiologic studies with a binary outcome. In this paper, the author presents a simple and efficient two-stage approach to estimate risk ratios directly, one that does not rely for consistency on an estimate of the baseline risk. This property is a key advantage over existing methods because it obviates the need to restrict the predicted risk probabilities to fall below one in order to recover efficient inferences about risk ratios. An additional appeal of the approach is that it is easy to implement. Finally, when the primary interest is in the effect of a specific binary exposure, a simple doubly robust closed-form estimator of the multiplicative effect of the exposure is derived. Specifically, we show how one can adjust for confounding by incorporating a working regression model for the propensity score, so that correct inferences about the multiplicative effect of the exposure are recovered if either this model or a working model for the association between confounders and outcome risk is correct, but both need not hold.
Keywords: Risk ratio, prevalence ratio, semiparametric efficient, doubly robust
1 Introduction
An objective of many epidemiologic studies is to evaluate the multiplicative association between a vector of risk factors and a binary outcome. When the outcome is rare within all levels of the covariates, logistic regression is well known to deliver valid, albeit approximate, inferences about risk ratios, whether in a cohort or in a case-control study. When, as is often the case in cohort studies, the outcome is not rare within all levels of covariates, logistic regression overstates the relative risk association and should not be used to approximate it. Instead, a variety of techniques have been proposed in recent years to recover estimates of risk ratios for a common outcome (Wacholder, 1986; Lee, 1994; Skov et al, 1998; Greenland, 2004; Zou, 2004; Spiegelman and Hertzmark, 2005; Chu and Cole, 2010). A basic requirement shared by previous methods, with the exception of the method proposed by Breslow (1974) and subsequently by Lee (1994), is that the log-baseline risk, i.e. the regression intercept, must be estimated along with the regression coefficients in order to obtain a consistent estimate of those coefficients. Unfortunately, this task is often not easily achieved if one wishes to respect the essential model restriction that no predicted probability in the sample should exceed one, and it often results in lack of convergence of estimation procedures. The suboptimal performance of such methods is well documented in the literature (Deddens et al, 2003; Petersen and Deddens, 2006; Tian and Liu, 2006; Chu and Cole, 2010). Recently, such concerns prompted Chu and Cole to develop a Bayesian approach that appropriately incorporates this additional modeling restriction (Chu and Cole, 2010). Their approach, which relies on Markov chain Monte Carlo simulations, provides a promising Bayesian solution when risk prediction is of primary interest, but a satisfactory frequentist solution is still lacking even in settings where risk ratios are the primary target of inference.
In this paper, the author presents a simple approach to estimate risk ratios directly that does not rely for consistency on obtaining an estimate of the baseline risk. In this respect, the approach is similar to that of Breslow (1974) and Lee (1994); but whereas their method is inefficient, here a two-stage approach is described that delivers efficient estimates of risk ratios. The first stage of the method does not require an estimate of the baseline risk, while the second stage recovers information not used in the first stage by incorporating a weight that depends on the individual predicted risk, and therefore on the individual baseline risk. However, because the weights are not essential for consistency, a simple plug-in estimate of the baseline risk may be used without altering the large sample behavior or, more precisely, the large sample efficiency of the estimated regression coefficients. This property holds even though the plug-in estimate is generally inefficient for the baseline risk and may result in a predicted risk outside of the unit range. An important advantage of the approach is that it is easy to implement. An alternative approach is also described, which guarantees that the estimated predicted risk used for the weight remains bounded between zero and one. Finally, when the primary interest is in the effect of a specific binary exposure, we describe a simple closed-form estimator of the multiplicative effect of the exposure that is doubly robust. Specifically, we show how to incorporate a working regression model for the probability of being exposed given confounders, i.e. the propensity score, so that correct inferences about the multiplicative effect of the exposure are recovered if either this model is correct or the working model for the association between confounders and disease risk is correct, but both need not hold.
2 Proposed methods
2.1 A Simple Inefficient Initial Estimator
To motivate the approach, consider the simple case where Xi (i = 1, 2, ..., n) is a binary exposure with a value of 1 if exposed and 0 if unexposed. Let Yi (i = 1, ..., n) denote the binary response, which is randomly sampled from a log-binomial model with

$$\Pr(Y_i = 1 \mid X_i) = \exp(\alpha_0 + \beta_0 X_i). \tag{1}$$
Then, a standard application of maximum likelihood theory delivers the estimator

$$\hat\beta_{MLE} = \log\left\{ \frac{\sum_i Y_i X_i \big/ \sum_i X_i}{\sum_i Y_i (1-X_i) \big/ \sum_i (1-X_i)} \right\}.$$

Now, we note that this equation is equivalent to:

$$0 = \sum_{i:Y_i=1} \exp\{-\hat\beta_{MLE}(X_i - \bar X)\}(X_i - \bar X),$$

which states that β̂MLE solves the equation

$$0 = \sum_{i:Y_i=1} \{Z_i - \exp(\beta W_i)\}\, W_i, \tag{2}$$
where X̄ is the sample average of X, Wi = −(Xi − X̄), and Zi = 0 for all i. The main appeal of the representation given by equation (2) in the above display is two-fold:
(i) It is completely free of the intercept, and therefore does not require an actual estimate of the predicted probabilities.
(ii) It is exactly of the form of the score equation for β, under the artificial case-only model in which the pseudo-outcome Zi is assumed to follow a Poisson distribution with mean given by the intercept-free multiplicative model exp(βWi), i = 1, ..., n, in cases only.
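To verify the equivalence concretely in the saturated binary case (a worked check, not taken from the original text), write n₁ and n₀ for the numbers of exposed and unexposed subjects, and n₁₁ and n₀₁ for the numbers of exposed and unexposed cases. Equation (2) then reads

$$0 = \sum_{i:Y_i=1} e^{-\beta(X_i - \bar X)}(X_i - \bar X) = n_{11}(1 - \bar X)\, e^{-\beta(1 - \bar X)} - n_{01}\,\bar X\, e^{\beta \bar X},$$

and since X̄ = n₁/n and 1 − X̄ = n₀/n, solving for β gives exp(β̂MLE) = (n₁₁/n₁)/(n₀₁/n₀), the familiar ratio of sample risks.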
Thus, Equation (2) provides an equivalent representation of the maximum likelihood estimator in the simple setting of a saturated multiplicative model with a binary exposure; however, the representation is of no particular use in this latter setting because the maximum likelihood estimator is easy to compute. But, as we show below, the alternative representation is useful for estimation in settings where it may be considerably more difficult to compute the maximum likelihood estimator. Specifically, now suppose that Xi, and thus Wi, are vector-valued, possibly with several continuous components, and one aims to make inferences about β0 in the multiplicative model
$$\Pr(Y_i = 1 \mid X_i) = \exp(\alpha_0 + \beta_0^T X_i). \tag{3}$$
Then, one may generalize equation (2), and define an estimator β̂ as the solution to the equation:

$$0 = \sum_{i:Y_i=1} U_i(\beta), \qquad U_i(\beta) = \exp\{-\beta^T(X_i - \bar X)\}(X_i - \bar X).$$
In the appendix, we show that β̂ is consistent for β0 and we establish its large sample behavior.
Result 1
Under model (3), n^{1/2}(β̂ − β0) is approximately normal with mean zero and variance Σβ provided in the appendix. We also show that the standard sandwich estimator

$$\hat\Sigma_\beta = \hat\Gamma^{-1}\left\{ n^{-1}\sum_{i:Y_i=1} U_i(\hat\beta)^{\otimes 2} \right\}\hat\Gamma^{-T}, \qquad \hat\Gamma = n^{-1}\sum_{i:Y_i=1} \frac{\partial U_i(\hat\beta)}{\partial \beta^T}, \quad A^{\otimes 2} = AA^T,$$

is a conservative estimator of Σβ.
The estimator β̂ is particularly useful for routine application in epidemiologic practice, because properties (i) and (ii) continue to apply even though model (3) is no longer saturated; however, β̂ does not generally inherit the efficiency properties of a maximum likelihood estimator. The efficiency loss (relative to a maximum likelihood estimator) can be particularly severe when the regression model is not saturated and when, as we assume throughout, the outcome is not rare. The loss of efficiency should decrease the more flexible or richly parametrized the model is allowed to remain, and should be almost nil for nearly saturated models. Despite this limitation, the approach has the advantage that, by (i), it does not require an estimate of the intercept and therefore will generally not suffer from the same computational challenges as methods that rely on an estimate of the intercept. For inference using β̂, valid confidence intervals for, say, the first component β̂^(1) of β̂ can be obtained by the method of Wald: β̂^(1) ± 1.96 σ̂^(1), where (σ̂^(1))² is the estimate of the variance of β̂^(1). This approach is convenient, as (ii) outlines how to obtain β̂ using standard statistical software such as SAS PROC GENMOD, which also provides the empirical/sandwich variance estimator Σ̂β upon request, i.e. by specifying the REPEATED statement.
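Property (i) also makes the first-stage estimator straightforward to compute outside of Poisson regression software. The following is a minimal sketch in Python (an illustration, not the author's code; the simulated design and all names are hypothetical), solving the case-only estimating equation directly:

```python
import numpy as np
from scipy.optimize import root

def stage1_log_rr(X, Y):
    """Stage-one estimator: solve 0 = sum_{i: Y_i=1} exp{-b'(X_i - Xbar)}(X_i - Xbar).

    X: (n, p) array of exposures/covariates (no intercept column).
    Y: (n,) array of binary outcomes.
    Returns the estimated log risk-ratio coefficients; no intercept is needed.
    """
    Xc = X - X.mean(axis=0)        # center covariates at the sample mean
    Xc_cases = Xc[Y == 1]          # the estimating function involves cases only

    def estimating_eq(b):
        w = np.exp(-Xc_cases @ b)  # exp{-b'(X_i - Xbar)} for each case
        return Xc_cases.T @ w      # p-vector; zero at the solution

    return root(estimating_eq, x0=np.zeros(X.shape[1])).x

# Toy check on simulated log-binomial data (hypothetical parameter values):
rng = np.random.default_rng(0)
X = rng.binomial(1, 0.5, size=(5000, 2)).astype(float)
risk = np.exp(-1.5 + X @ np.array([0.3, -0.2]))  # all risks stay below one
Y = rng.binomial(1, risk)
print(stage1_log_rr(X, Y))  # should be near (0.3, -0.2)
```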
We performed a simulation study to illustrate the performance of the method. For this we generated 1000 samples, each of size n = 1000, under the following model: X(2) is Bernoulli(0.7); X(3) and X(4) are both uniform(0,1); X(1) is Bernoulli with success probability (1 + exp(−[0.5, −0.5, 0.5, −0.9, 0.9] × Q))^{−1}, where Q = [1, X(2)×X(3), X(3)×X(4), X(2)×X(4)²]; Y is then generated under a Bernoulli model with event probability exp([−1.4, 0.3, −0.2, 0.2, 0.3] × [1, X^T]^T), which roughly corresponds to a marginal risk Pr(Y = 1) ≈ 0.278. We then obtained estimates using the method described in this section, which we summarize in Table 1 in rows labelled “Correct Model”. The simulation study indicates that the point estimate β̂ performs well and has small bias. The simulation further shows that the simple sandwich estimator Σ̂β can be quite conservative, as it produces estimates that can be much larger than the Monte Carlo variance. Instead of using Σ̂β, alternative inferences can also be obtained by using an empirical version of Σβ, which we denote Σ̃β and which is given by

$$\tilde\Sigma_\beta = \hat\Gamma^{-1}\left( n^{-1}\sum_i \left[ Y_i\exp\{-\hat\beta^T(X_i-\bar X)\}(X_i-\bar X) - \hat c\,(X_i - \bar X)\right]^{\otimes 2} \right)\hat\Gamma^{-T}, \qquad \hat c = n^{-1}\sum_i Y_i \exp\{-\hat\beta^T(X_i - \bar X)\},$$

as derived in the appendix. However, this more precise estimator may be less convenient as it requires additional, though fairly straightforward, programming. The simulation study indicates that Σ̃β outperforms Σ̂β and performs well.
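The additional programming is indeed modest; the following sketch computes both variance estimates under the Σ̃β expression displayed above (that expression is a reconstruction from the surrounding text and should be checked against the original derivation):

```python
import numpy as np

def stage1_variances(X, Y, b):
    """Sandwich (conservative) and corrected variance estimates for the
    stage-one estimate b, using the Sigma-tilde expression given above."""
    n = len(Y)
    Xc = X - X.mean(axis=0)
    w = Y * np.exp(-Xc @ b)            # Y_i exp{-b'(X_i - Xbar)}; zero for non-cases
    U = Xc * w[:, None]                # estimating functions U_i(b), cases only
    Gamma = -(U.T @ Xc) / n            # empirical derivative of the estimating eq.
    Gi = np.linalg.inv(Gamma)
    sandwich = Gi @ (U.T @ U / n) @ Gi.T / n      # conservative in large samples
    Uc = U - w.mean() * Xc             # subtract the c-hat (X_i - Xbar) correction
    corrected = Gi @ (Uc.T @ Uc / n) @ Gi.T / n   # consistent for Sigma_beta
    return sandwich, corrected
```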
2.2 An Efficient Estimator
To address concerns about lack of efficiency, suppose that we have obtained β̂ in a first stage. One can then update β̂ in a single step, to obtain an efficient estimator of β0. Let

$$U_i(w) = \{Y_i - \exp(\hat\alpha + \hat\beta^T X_i)\}\, w_i,$$

with α̂ defined below, where wi is a vector, of the same dimension as Xi, of user-specified functions of Xi. For any choice of wi, let

$$\hat\beta(w) = \hat\beta + \left\{ n^{-1}\sum_i \exp(\hat\alpha + \hat\beta^T X_i)\, w_i X_i^T \right\}^{-1} n^{-1}\sum_i U_i(w)$$

define a new so-called one-step-update estimator. The class of one-step-update estimators is very rich and includes several well-known estimators. In fact, for any estimator β̄ of β0 that is regular and asymptotically linear, we show in the appendix, using results due to Bickel et al (1993), that there exists a corresponding weight function wi such that

$$n^{1/2}\{\hat\beta(w) - \bar\beta\} = o_p(1).$$
In other words, the two estimators share a common large sample distribution and are therefore asymptotically equivalent. For instance, one can easily verify that the particular choice wi = exp(−β̂^T Xi)(Xi − X̄) recovers β̂ exactly, whereas wi = Xi produces an estimator that is asymptotically equivalent to the Breslow–Lee estimator. Neither of these estimators is generally efficient. In the appendix, we show that β̂(wopt) is efficient, where

$$w_{opt,i} = \frac{X_i - \hat c}{1 - \hat p_i}, \qquad \hat c = \frac{\sum_i X_i\, \hat p_i/(1-\hat p_i)}{\sum_i \hat p_i/(1-\hat p_i)},$$

with

$$\hat p_i = \exp(\hat\alpha + \hat\beta^T X_i)$$

an estimator of the predicted risk for person i; where

$$\exp(\hat\alpha) = n^{-1}\sum_i Y_i \exp(-\hat\beta^T X_i)$$

is a plug-in estimator of the baseline risk Pr(Y = 1|X = 0) = exp(α0). Specifically, we establish that

$$\hat\beta_{eff} = \hat\beta(w_{opt}) = \hat\beta + \left\{ n^{-1}\sum_i \hat p_i\, w_{opt,i} X_i^T \right\}^{-1} n^{-1}\sum_{i:Y_i=1} w_{opt,i},$$

since Σi wopt,i exp(β̂^T Xi) = 0. In fact, we prove the following result:
Result 2
Under model (3), n^{1/2}(β̂eff − β0) is approximately normal with mean zero and variance Σeff. Furthermore, the estimator Σ̂eff converges (in probability) to Σeff, where

$$\Sigma_{eff} = \left( E\left[ \{X - c^*\}^{\otimes 2}\, \frac{p(X)}{1 - p(X)} \right] \right)^{-1},$$

with p(X) = exp(α0 + β0^T X) and c* the probability limit of ĉ. Finally, β̂eff achieves the semiparametric efficiency bound for the model given by (3).
As before, Σ̂eff can be used to construct Wald-type confidence intervals. The simulation results in Table 1 confirm that, as theory predicts, β̂eff significantly outperforms β̂ in terms of efficiency. We emphasize that the estimated individual risk p̂i, i = 1, ..., n, is solely used for the purpose of enhancing efficiency through the weights wopt,i. Result 2 confirms that the baseline log-risk α0 may be inefficiently estimated by the simple plug-in estimator α̂ without affecting the efficiency of β̂eff. However, although α̂ is consistent and asymptotically linear, p̂i may be greater than one for some observations in the sample. Naturally, one may wish to impose that the estimated risk used to compute the optimal weight be a genuine probability; in the next section, we describe a slight modification of the proposed approach that achieves this goal.
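A sketch of the one-step update in Python follows, assuming the plug-in baseline risk and the optimal-weight form reconstructed in this section (both reconstructions should be verified against the original article); b1 is the stage-one estimate, for example from the earlier sketch:

```python
import numpy as np

def one_step_efficient(X, Y, b1):
    """One-step efficient update of a stage-one estimate b1 (Section 2.2 sketch)."""
    eta = X @ b1
    exp_a = np.mean(Y * np.exp(-eta))       # plug-in baseline risk exp(alpha-hat)
    p = exp_a * np.exp(eta)                 # predicted risks; may exceed one
    odds = p / (1.0 - p)                    # p/(1-p) weights used for centering
    c = (X * odds[:, None]).sum(axis=0) / odds.sum()
    w_opt = (X - c) / (1.0 - p)[:, None]    # reconstructed optimal weights
    deriv = (w_opt * p[:, None]).T @ X      # empirical derivative matrix
    score = w_opt.T @ (Y - p)               # equals the sum of w_opt over cases
    return b1 + np.linalg.solve(deriv, score)
```

If some p̂i fall at or above one, the odds weights above break down; in that case the bounded estimate p̃i of the next section would be substituted for p.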
3 Additional results and an application
3.1 An alternative efficient estimator
While not strictly required by the two-stage approach, the following modification guarantees that individuals' estimated risks used to compute the weights wopt,i fall within the unit interval. To develop the approach, we observe that pi is equivalently written:

$$p_i = \xi(\beta_0^T X_i),$$

where

$$\xi(m) = \exp(\alpha_0 + m).$$

Given the first stage estimate Mi = β̂^T Xi of β0^T Xi, we propose to ignore knowledge about the precise functional form of ξ(·), and to estimate ξ(·) by fitting a nonparametric logistic regression of Yi on the scalar variable Mi, i = 1, ..., n. Let ξ̂i = ξ̂(Mi) denote such an estimator of ξ(Mi); then clearly

$$0 < \tilde p_i = \hat\xi_i < 1,$$

that is, p̃i is guaranteed to fall within the unit interval. There currently exists a vast literature on nonparametric techniques that may be used to obtain ξ̂(·), including polynomial series, local polynomial smoothing, trigonometric series, wavelet regression, spline regression and kernel smoothing; a textbook treatment of these various methods may be found in Wasserman (2003) and Hastie et al (2008). Here, we briefly illustrate polynomial series regression. Let ϕk(m) = m^k, k = 0, ..., K. Then, for fixed K, let p̃i denote the predicted probabilities obtained by standard logistic regression of Yi on {ϕk(Mi); k ≤ K} using data {(Mi, Yi) : i = 1, ..., n}. A result due to Hirano et al (2003) implies that, since ξ(·) has at least four bounded derivatives, setting K = Cn^{1/6} for some constant C is sufficient for the resulting estimator p̃i to converge to pi at rates no slower than n^{1/4}, and the resulting estimator β̃eff of β0 is semiparametric efficient.
3.2 A data illustration
We consider a data set involving 172 diabetic patients presented by Lachin (2000, p. 261) and also analyzed by Zou (2004). This is a subset of a large clinical trial known as the Diabetes Control and Complications Trial (The Diabetes Control and Complications Trial Research Group, 1993), in which it is of interest to determine the relative risk of standard therapy versus intensive treatment in terms of the prevalence of microalbuminuria at 6 years of follow-up. For estimation, we adjust for the following covariates: the percentage of total hemoglobin that has become glycosylated at baseline, the prior duration of diabetes in months, the level of systolic blood pressure (mmHg), and gender (1 if female, 0 if male). Applying the single-stage approach results in an estimated risk of microalbuminuria that is 2.5 times higher in the control group than in the treatment group (β̂ = −0.92, s.e. = 0.37). The efficient two-stage approach delivers a more precise estimated risk ratio, with a risk in the control group that is 5.4 times higher than in the treatment group (β̂eff = −1.69, s.e. = 0.28) using the simple plug-in approach for estimating individuals' predicted risk, and an estimated risk that is 3.2 times higher in the control group (β̃eff = −1.18, s.e. = 0.25) using the approach described in Section 3.1. It is of interest to compare these point estimates to those reported by Zou (2004), who estimated that the risk in the control group is 2.9 times that in the treatment group (β̂Zou = −1.08, s.e. = 0.30) using a modified Poisson approach, which closely matched the estimated risk ratio of 2.85 for the control versus the treatment group (β̂bin = −1.04, s.e. = 0.30) he obtained using the log-binomial approach. He further noted that the binomial regression procedure failed to converge until a variety of starting values were provided, when it finally converged with a starting value of −1.1 for the intercept. The two-stage estimator appears to provide more precise inference about the treatment effect than the other methods.
3.3 Double robustness
Suppose that, as is often the case in epidemiologic studies, we are particularly interested in the effect β(1) of the first component X(1) of X, which represents a binary exposure under study, and the remaining sub-vector X(−1) of X includes confounding factors with corresponding effect β(−1), so that X = (X(1), X(−1)T)T and β = (β(1), β(−1)T)T. Then, strictly speaking, β(−1) is a nuisance parameter not of direct interest, and the model

$$\Pr(Y = 1 \mid X) = \exp\{\alpha_0 + \beta^{(1)} X^{(1)} + \beta^{(-1)T} X^{(-1)}\} \tag{4}$$
is a working model used strictly for the purpose of confounding adjustment. Unless the working model in the display above is saturated, in general one cannot rule out possible model mis-specification which in turn can result in biased inferences about the exposure effect, due to inadequate confounding adjustment. Because saturated models will generally be impractical due to data sparseness, we propose to partially alleviate these concerns by modeling the probability of exposure given covariates, i.e. the propensity score, with a working regression model,
$$\Pr(X^{(1)} = 1 \mid X^{(-1)}) = \pi(X^{(-1)}; \psi_0) = \operatorname{expit}\{\psi_0^T (1, X^{(-1)T})^T\}. \tag{5}$$
Suppose ψ̂ is the maximum likelihood estimator of ψ0, and let π̂i = Pr(X(1) = 1 | X(−1)i; ψ̂). Then β̂(1)dr is doubly robust, where

$$\exp\{\hat\beta^{(1)}_{dr}\} = \frac{\sum_i Y_i X^{(1)}_i (1 - \hat\pi_i) \exp\{-\hat\beta^{(-1)T} X^{(-1)}_i\}}{\sum_i Y_i (1 - X^{(1)}_i)\, \hat\pi_i \exp\{-\hat\beta^{(-1)T} X^{(-1)}_i\}},$$

that is, β̂(1)dr solves

$$0 = \sum_i Y_i \exp\{-\beta^{(1)} X^{(1)}_i - \hat\beta^{(-1)T} X^{(-1)}_i\}\{X^{(1)}_i - \hat\pi_i\}.$$
Result 3
β̂(1)dr converges (in probability) to β(1)0 if either model (4) holds, or model (5) holds, but not necessarily both hold.
Furthermore, it can be shown that β̂(1)dr is in large samples normally distributed, with mean β(1)0 and variance that is easily estimated via the nonparametric bootstrap. The bootstrap is required here to appropriately account for the additional variability due to the first-stage estimation of β̂(−1) and ψ̂. Although doubly robust estimators of a multiplicative exposure effect have previously been proposed (Robins and Rotnitzky, 2001), the doubly robust method described here is new and has the appealing property that, unlike previous methods, it does not require an estimate of the baseline risk Pr(Y = 1|X = 0).
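For concreteness, the closed form displayed above can be computed in a few lines; note that the closed form itself is a reconstruction from the surrounding text (an assumption to check against the original article), while the propensity score is fit by ordinary logistic regression:

```python
import numpy as np
import statsmodels.api as sm

def dr_log_rr(X1, Xconf, Y, b_conf):
    """Doubly robust closed-form log risk ratio for a binary exposure X1.

    Xconf: (n, q) confounders; b_conf: their stage-one coefficients under
    working model (4). The propensity score (model (5)) is estimated by
    logistic regression of X1 on the confounders, with an intercept.
    """
    design = sm.add_constant(Xconf)
    pi = sm.GLM(X1, design, family=sm.families.Binomial()).fit().predict(design)
    r = Y * np.exp(-Xconf @ b_conf)      # outcome-model-adjusted case weights
    num = np.sum(r * X1 * (1.0 - pi))    # exposed cases, weighted by 1 - pi
    den = np.sum(r * (1.0 - X1) * pi)    # unexposed cases, weighted by pi
    return np.log(num / den)
```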
The simulation study reported in Table 1 nicely illustrates the robustness property described in Result 3, as it shows in the row labelled “Incorrect Model” that the doubly robust estimator remains unbiased when model (5) holds, even though model (4) is incorrect: in this scenario, Y is generated under a log-binomial model with event probability exp([−1.5, 0.3, −0.2, −0.7, 0.9] × Q), with corresponding marginal risk Pr(Y = 1) ≈ 0.30. This is in stark contrast with the non-doubly robust estimator β̂(1), which incurs bias when the outcome model for the confounders is mis-specified. The simulation study also indicates that when modeling error is absent, the doubly robust estimator exhibits efficiency similar to that of the non-doubly robust estimator, suggesting that, at least in this specific simulation study, little efficiency was lost in exchange for a potential gain in robustness. In the appendix, the doubly robust methods described above are extended to incorporate possible interactions between the exposure and covariates, and the approach is further developed for a continuous exposure.
4 Conclusion
In this paper, we have described a simple and efficient two-stage approach to estimate risk ratios directly, which does not rely for consistency on an estimate of the baseline risk. This property is advantageous because, unlike previous methods, the proposed approach obviates the need to restrict the predicted risk probabilities to fall below one in order to recover efficient inferences about risk ratios. For efficiency, the approach incorporates an individual weight that depends on the individual's predicted risk; nonetheless, because the primary target of inference is the risk ratio parameter, we have argued that a consistent estimate of the risk is sufficient for inference, and we have described a simple plug-in estimator of risk which we have used to construct an efficient estimator of risk ratios. Both a simulation study and a data application confirmed the good performance of the approach. We have further extended the proposed methodology by modifying it to ensure that individuals' estimated risks are genuine probabilities. Furthermore, when the primary interest is in the effect of a specific exposure, we have developed a simple doubly robust closed-form estimator for the multiplicative effect of the exposure, while adjusting for a possibly large number of confounders. In future work, we plan to further extend the methods of this paper to correlated binary outcomes, as encountered in studies with repeated outcome measurements or in studies with clustered data.
Table 1.

| | | bias (β̂) | MC Var (β̂) | Σ̂β | Σ̃β | bias (β̂eff) | MC Var (β̂eff) | bias (β̂dr) | MC Var (β̂dr) |
|---|---|---|---|---|---|---|---|---|---|
| β(1) | Correct Model | −0.0083 | 0.0140 | 0.0180 | 0.0138 | 0.0029 | 0.0123 | −0.0079 | 0.0141 |
| β(1) | Incorrect Model | 0.0183 | 0.0128 | 0.0170 | 0.0130 | −0.0226 | 0.0110 | 0.0001 | 0.0130 |
| β(2) | Correct Model | −0.0026 | 0.0138 | 0.0185 | 0.0135 | 0.0046 | 0.0115 | ** | ** |
| β(3) | Correct Model | 0.0150 | 0.0402 | 0.0514 | 0.0388 | 0.0007 | 0.0316 | ** | ** |
| β(4) | Correct Model | −0.0084 | 0.0411 | 0.0515 | 0.0390 | −0.0022 | 0.0345 | ** | ** |
APPENDIX 1 Proofs
Proof of Result 1
Let E{U*(β)} = E[Y exp{−β^T(X − E(X))}{X − E(X)}] denote the (probability) limiting value of n^{−1} Σ_{i:Y_i=1} U_i(β). To show that the result holds, it suffices to show that U*(β) is an unbiased estimating function; that is, we need to show that E{U*(β0)} = 0. Now

$$E\{U^*(\beta_0)\} = E\left[\exp(\alpha_0 + \beta_0^T X)\exp\{-\beta_0^T(X - E(X))\}\{X - E(X)\}\right] = \exp\{\alpha_0 + \beta_0^T E(X)\}\, E\{X - E(X)\} = 0.$$

To establish the large sample behaviour of β̂, we perform a standard Taylor expansion

$$0 = n^{-1/2}\sum_{i:Y_i=1} U_i(\hat\beta) = n^{-1/2}\sum_{i:Y_i=1} U_i(\beta_0) + \left\{ n^{-1}\sum_{i:Y_i=1} \frac{\partial U_i(\beta^{\ast})}{\partial \beta^T} \right\} n^{1/2}(\hat\beta - \beta_0),$$

with β* an intermediate value between β̂ and β0, together with a similar expansion in X̄ around E(X). By the law of large numbers and an application of Slutsky's theorem, we conclude that n^{1/2}(β̂ − β0) has large sample distribution equal to the distribution of

$$-\Gamma^{-1}\, n^{-1/2}\sum_i \left[ Y_i\exp\{-\beta_0^T(X_i - E(X))\}\{X_i - E(X)\} - c\,\{X_i - E(X)\}\right],$$

where Γ = E{∂U*(β0)/∂β^T} and c = exp{α0 + β0^T E(X)}, since

$$\frac{\partial}{\partial E(X)}\, E\{U^*(\beta_0)\} = -c\, I.$$

We may further conclude that the large sample variance of n^{1/2}(β̂ − β0) is given by

$$\Sigma_\beta = \Gamma^{-1}\left[ E\{U^*(\beta_0)^{\otimes 2}\} - c^2\, E\{(X - E(X))^{\otimes 2}\}\right]\Gamma^{-T},$$

because

$$E\left[ U^*(\beta_0)\{X - E(X)\}^T\right] = c\, E\{(X - E(X))^{\otimes 2}\},$$

where A^{⊗2} = AA^T. Furthermore, because covariance matrices are positive semi-definite, we may conclude that Σ̂β is conservative for the variance-covariance matrix Σβ in the positive-definite sense, in that for any non-zero constant vector t

$$t^T \hat\Sigma_\beta\, t \geq t^T \Sigma_\beta\, t + o_p(1),$$

and therefore Σ̂β is a conservative estimator of Σβ, whereas Σ̃β, given in the main text, is consistent for Σβ.
Proof of Result 2
Consider the semiparametric model given solely by restriction (3); then Bickel et al (1993) established that all regular and asymptotically linear estimators of β0 are fully characterized by the set of influence functions generated by the estimating functions

$$\Lambda = \left[\, U^{\natural}(w) = \{Y - p(X)\}\, w(X) \;:\; E\{p(X)\, w(X)\} = 0 \,\right], \qquad p(X) = \exp(\alpha_0 + \beta_0^T X).$$

It is straightforward to verify that this set is equivalently written:

$$\Lambda = \left[\, U^{\dagger}(\mu) = \frac{\{Y - p(X)\}\{\mu(X) - c^*(\mu)\}}{1 - p(X)}, \quad c^*(\mu) = \frac{E\{\mu(X)\, p(X)/(1 - p(X))\}}{E\{p(X)/(1 - p(X))\}} \,\right],$$

with μ unrestricted. Now, the score for β0 in this model is given by

$$S_\beta = \frac{\{Y - p(X)\}\, X}{1 - p(X)};$$

therefore, the efficient score of β0, i.e. the orthogonal projection of Sβ onto Λ, is U†(μopt), with μopt = μopt(X) = X; in other words,

$$U^{\dagger}(\mu_{opt}) = \frac{\{Y - p(X)\}\{X - c^*\}}{1 - p(X)},$$

since U†(μopt) ∈ Λ, and for all U♮(w) ∈ Λ

$$E\left[\{S_\beta - U^{\dagger}(\mu_{opt})\}\, U^{\natural}(w)^T\right] = c^* E\{p(X)\, w(X)^T\} = 0.$$

The proof is completed by noting that

$$E\{U^{\dagger}(\mu_{opt})^{\otimes 2}\} = E\left[\{X - c^*\}^{\otimes 2}\, \frac{p(X)}{1 - p(X)}\right],$$

where

$$c^* = c^*(\mu_{opt}) = \frac{E\{X\, p(X)/(1 - p(X))\}}{E\{p(X)/(1 - p(X))\}}.$$

Then, a theorem due to Bickel et al (1993) states that, for any initial n^{1/2}-consistent estimator of β0, an efficient estimator can be constructed by a one-step update of β̂ in the direction of the estimated efficient score by using the following formula

$$\hat\beta_{eff} = \hat\beta + \left[\hat E\left\{-\frac{\partial}{\partial\beta^T}\hat U^{\dagger}(\mu_{opt})\right\}\right]^{-1} n^{-1}\sum_i \hat U^{\dagger}_i(\mu_{opt}),$$

where Û†(μopt) is an empirical version of U†(μopt), obtained by replacing all expectations with empirical expectations, with β0 estimated by β̂ and exp(α0) estimated by the simple plug-in estimator Σ_{i′} Y_{i′} exp(−β̂^T X_{i′})/n; Ê{−∂Û†(μopt)/∂β^T} is a similarly constructed estimator of the expected derivative of the efficient score with respect to β, evaluated at β0. It is straightforward to verify that β̂eff reduces to the formula provided in the main text. Furthermore, the theorem of Bickel et al (1993) further states that, under standard regularity conditions, n^{1/2}(β̂eff − β0) is asymptotically normal with mean zero and variance

$$\Sigma_{eff} = \left[E\{U^{\dagger}(\mu_{opt})^{\otimes 2}\}\right]^{-1},$$

which is also the semiparametric efficiency bound for β0. Finally, Σ̂eff is an empirical version of Σeff which converges to the latter in probability.
In order to prove Result 3, we first establish a more general result, for which we allow X(1) to be continuous, and for the model to incorporate a possible interaction between exposure and covariates, say X(2). Specifically, we suppose that

$$\Pr(Y = 1 \mid X) = \exp\{\alpha_0 + \beta^{(1)} X^{(1)} + \beta^{(2)} X^{(1)} X^{(2)} + \beta^{(-1)T} X^{(-1)}\}, \tag{6}$$

and let π(ψ) = E(X(1) | X(−1); ψ) = g(ψ^T [1, X(−1)T]^T) denote a working model for the mean of the exposure given covariates, where g is the identity link for continuous X(1) and the inverse logit link for binary X(1). Define the estimating function

$$U(\beta^{(1)}, \beta^{(2)}, \psi) = Y \exp\{-\beta^{(1)} X^{(1)} - \beta^{(2)} X^{(1)} X^{(2)} - \hat\beta^{(-1)T} X^{(-1)}\}\, (1, X^{(2)})^T \{X^{(1)} - \pi(\psi)\}.$$
Then we have the following lemma.
Lemma 1
Under model (6),

$$E\{U(\beta^{(1)}_0, \beta^{(2)}_0, \psi)\} = 0 \tag{7}$$

if either one, but not necessarily both, of the following conditions holds:

(1) ψ = ψ0 and E(X(1)|X(−1); ψ0) = E(X(1)|X(−1)); or

(2) β(−1) = β(−1)0 and model (4) holds.
Proof of Lemma 1
Note that, because E[Y exp{−β(1)0 X(1) − β(2)0 X(1)X(2)} | X] does not depend on X(1) under model (6), iterated expectations give

$$E\{U(\beta^{(1)}_0, \beta^{(2)}_0, \psi)\} = E\left[ q(X^{(-1)})\exp\{-\hat\beta^{(-1)T} X^{(-1)}\}\,(1, X^{(2)})^T \{E(X^{(1)} \mid X^{(-1)}) - \pi(\psi)\}\right],$$

which is certainly zero if (1) holds, since then E(X(1)|X(−1)) − π(ψ) = 0. If (2) holds, we have

$$E\{U(\beta^{(1)}_0, \beta^{(2)}_0, \psi)\} = \exp(\alpha_0)\, E\left[(1, X^{(2)})^T \{X^{(1)} - \pi(\psi)\}\right] = 0,$$

since the last quantity is part of the first order condition used to estimate ψ, either by ordinary least-squares when X(1) is continuous or by logistic regression in the binary case.
Proof of Result 3
The result immediately follows from Lemma 1, since when X(1) is binary it is straightforward to verify that equation (7) is equivalent to the population analogue of the estimating equation solved by β̂(1)dr. Therefore, if either (1) holds, and thus ψ̂ converges to ψ0, or (2) holds, and thus β̂(−1) converges to β(−1)0, we have that β̂(1)dr converges to β(1)0.
References
1. Bickel P, Klassen C, Ritov Y, Wellner J. Efficient and Adaptive Estimation for Semiparametric Models. Springer; New York: 1993.
2. Breslow NE. Covariance analysis of censored survival data. Biometrics. 1974;30:89–99.
3. Chu H, Cole S. Estimation of Risk Ratios in Cohort Studies With Common Outcomes: A Bayesian Approach. Epidemiology. 2010;21(6). doi: 10.1097/EDE.0b013e3181f2012b.
4. Deddens JA, Petersen MR, Lei X. Estimation of prevalence ratios when PROC GENMOD does not converge. Proceedings of the 28th Annual SAS Users Group International Conference; Seattle, Washington. 2003. Paper 270-28. (http://www2.sas.com/proceedings/sugi28/270-28.pdf)
5. The Diabetes Control and Complications Trial Research Group. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med. 1993;329:977–86. doi: 10.1056/NEJM199309303291401.
6. Greenland S. Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case-control studies. Am J Epidemiol. 2004;160:301–305. doi: 10.1093/aje/kwh221.
7. Hirano K, Imbens GW, Ridder G. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica. 2003;71:1161–1189.
8. Lachin J. Biostatistical Methods. John Wiley and Sons; New York: 2000.
9. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. 2nd ed. Springer-Verlag; New York: 2008.
10. Lee J. Odds ratio or relative risk for cross-sectional data? Int J Epidemiol. 1994;23(1):201–203. doi: 10.1093/ije/23.1.20.
11. Petersen MR, Deddens JA. RE: “Easy SAS calculations for risk or prevalence ratios and differences”. Am J Epidemiol. 2006;163:1158–1159. doi: 10.1093/aje/kwj162.
12. Ruppert D, Wand M, Carroll R. Semiparametric Regression. Cambridge University Press; 2003.
13. Robins JM, Rotnitzky A. Comment on the Bickel and Kwon article, “Inference for semiparametric models: Some questions and an answer”. Statistica Sinica. 2001;11(4):920–936.
14. Skov T, Deddens J, Petersen MR, Endahl L. Prevalence proportion ratios: estimation and hypothesis testing. Int J Epidemiol. 1998;27:91–95. doi: 10.1093/ije/27.1.91.
15. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol. 2005;162:199–200. doi: 10.1093/aje/kwi188.
16. Tian L, Liu K. Re: “Estimating the relative risk in cohort studies and clinical trials of common outcomes” (Letter). Am J Epidemiol. 2006;163:1157–1163.
17. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol. 1986;123:174–184. doi: 10.1093/oxfordjournals.aje.a114212.
18. Wasserman L. All of Nonparametric Statistics. Springer; New York: 2003.
19. Zou GY. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159:702–706. doi: 10.1093/aje/kwh090.