Sensitivity analysis for principal ignorability violation in estimating complier and noncomplier average causal effects

Trang Quynh Nguyen; Elizabeth A Stuart; Daniel O Scharfstein; Elizabeth L Ogburn

doi:10.1002/sim.10153

Author manuscript; available in PMC: 2025 Aug 30.

Published in final edited form as: Stat Med. 2024 Jun 18;43(19):3664–3688. doi: 10.1002/sim.10153


PMCID: PMC11995412 NIHMSID: NIHMS2065332 PMID: 38890728

Abstract

An important strategy for identifying principal causal effects (popular estimands in settings with noncompliance) is to invoke the principal ignorability (PI) assumption. As PI is untestable, it is important to gauge how sensitive effect estimates are to its violation. We focus on this task for the common one-sided noncompliance setting where there are two principal strata, compliers and noncompliers. Under PI, compliers and noncompliers share the same outcome-mean-given-covariates function under the control condition. For sensitivity analysis, we allow this function to differ between compliers and noncompliers in several ways, indexed by an odds ratio, a generalized odds ratio, a mean ratio, or a standardized mean difference sensitivity parameter. We tailor sensitivity analysis techniques (with any sensitivity parameter choice) to several types of PI-based main analysis methods, including outcome regression, influence function (IF) based and weighting methods. We discuss range selection for the sensitivity parameter. We illustrate the sensitivity analyses with several outcome types from the JOBS II study. This application estimates nuisance functions parametrically – for simplicity and accessibility. In addition, we establish rate conditions on nonparametric nuisance estimation for IF-based estimators to be asymptotically normal – with a view to inform nonparametric inference.

Keywords: principal stratification, complier average causal effect, principal ignorability, sensitivity analysis

1 |. INTRODUCTION

The study of causal effects of a treatment is often complicated by noncompliance. The principal stratification framework¹ defines types (principal strata) of study participants based on their potential compliance to treatment conditions. In the one-sided noncompliance setting where individuals in the control condition do not have access to the active treatment, there are two principal strata: compliers, who would take the treatment if offered, and noncompliers, who would not. In the two-sided noncompliance setting where all individuals (assigned to either treatment or control) can access the treatment, there are four principal strata, often known as compliers, always-takers, never-takers, and defiers. Principal causal effects are effects of treatment assignment within each stratum, $E [Y_{1} - Y_{0} ∣ C]$ , where $Y_{1}$ and $Y_{0}$ are potential outcomes² under assignment of active treatment and of control, respectively, and C denotes principal stratum. Of common interest is the complier average causal effect (CACE), but other principal causal effects may also be of interest^3,4.

This paper focuses on one-sided noncompliance, which is common in studies where the treatment is designed and implemented by the study and is not otherwise available, e.g., job search training for unemployed workers⁵, volunteering program for the elderly⁶, or weight management for people with mental illness⁷. We will briefly comment on the two-sided non-compliance case in the Discussion section.

The challenge in identifying principal causal effects is that principal stratum membership $C$ is only partially observed; with one-sided noncompliance $C$ is not observed in the control condition. Effect identification thus requires untestable assumptions. One such assumption is exclusion restriction⁸ (ER), which posits that treatment assignment does not affect the outcome other than through its effect on treatment received. This means there is no effect on noncompliers, and effects on compliers explain the full effect of treatment assignment. ER is not suitable if treatment receipt is not strictly binary, i.e., noncompliers are exposed to some active ingredients in the treatment arm^9,10. This case may arise when an intervention includes several components, and only a major one is used to define compliance. It may also arise due to dichotomization, e.g., only people who attend more than a certain number of treatment sessions are classified as compliers⁶. ER may also not hold if there are compensating behaviors or psychological effects due to being assigned to one condition as opposed to the other¹¹.

Another identification strategy does not restrict the noncomplier effect to zero, but instead invokes the principal ignorability (PI) assumption^12,11,13. This assumption posits that, conditional on a set of pre-treatment-assignment covariates $X$ , the potential outcome under control $Y_{0}$ is independent (or mean-independent) of principal stratum $C$ , i.e., compliers and noncompliers share the same conditional $Y_{0}$ distribution (or mean function). PI may be appealing for studies with rich baseline covariate data. As randomized trials and cohort studies tend to collect a lot of covariate data, one might hope that the covariates account for a substantial part of the dependence between $Y_{0}$ and $C$ . On the other hand, most studies are not designed with noncompliance in mind, and thus not much attention is paid to measuring covariates that predict compliance type to render $C$ and $Y_{0}$ independent, which means PI may be violated.

1.1 |. Our contribution

In this paper we focus on the PI assumption. Specifically, we develop methods to evaluate the robustness of the estimated principal causal effects to violation of PI, in the one-sided noncompliance setting. We introduce several sensitivity parameterizations representing how (within levels of $X$ ) the mean of $Y_{0}$ differs between compliers and noncompliers. These are indexed by an odds ratio, generalized odds ratio, mean ratio, or standardized mean difference, suitable for use with different outcome types. In addition, we tailor sensitivity analysis techniques for pairing with a range of estimation methods that may be used for the PI-based main analysis, including weighting, outcome regression and influence function based estimation.

We illustrate the proposed sensitivity analysis methods using the JOBS II Intervention Study⁵, where unemployed workers were randomized to receive either a week-long training program to promote mental health and provide job search skills (treatment) or a booklet with job search tips (control). Just over half of those randomized to treatment actually attended the training, resulting in a setting with compliers and noncompliers. JOBS II has been used by authors investigating different aspects of principal stratification, e.g., identification and estimation under PI^12,11, alternative identification assumptions¹⁴, bias due to failed assumptions¹⁵, and noncompliance combined with outcome missingness¹⁶. For our purpose, JOBS II is an interesting example for two reasons: (i) the study paid attention to the issue of noncompliance and collected baseline data on workers’ motivation to participate in a hypothetical training program on job search skills, making this a prime case for invoking PI; and (ii) the study collected outcomes of several types (binary, continuous and bounded) to which the methods we propose are relevant.

1.2 |. Related work

To our knowledge, two methods have been proposed to assess sensitivity of effect estimates to PI violation. The method used in Ding and Lu (2017)¹³ is the closest to, and inspired, our work. In the one-sided noncompliance context, this method allows the mean of $Y_{0}$ given $X$ to differ between compliers and noncompliers by a ratio that serves as the sensitivity parameter, and estimates effects under each value of the sensitivity parameter by modifying a PI-based weighting estimator. The application was with a binary outcome, flu-related hospitalization. A drawback is that with a binary outcome this mean ratio parameter may yield predictions greater than 1. This motivated our expansion of the range of sensitivity parameterizations to accommodate different outcome types. Also, we consider sensitivity analysis techniques pairing with different types of PI-based estimators, not just the weighting estimator. The second sensitivity analysis method is that of Wang et al. (2023)¹⁷ for survival outcomes, which imputes unobserved $C$ and $Y_{0}$ under a parametric model containing a hazard ratio sensitivity parameter. This work differs from our approach in that it relies on this parametric model for identification, whereas we make explicit the assumption required for identification and then use modeling only for estimation. We also avoid refitting models for every value of the sensitivity parameter.

There are methods to assess sensitivity of principal causal effect estimates to violation of assumptions other than PI, such as treatment assignment ignorability^18,19 and ER²⁰. These are not our current focus.

To discuss sensitivity analysis, we will need to start with a description of PI-based estimation. While PI-based methods have been discussed in the literature, it has been in settings that are somewhat different, e.g., randomized treatment assignment^15,11,13 (which we do not require), a qualitatively different assumption¹⁴, or two-sided rather than one-sided noncompliance²¹. The PI-based estimators we list in this paper share certain features (e.g., principal score weighting, multiple robustness) with these earlier works, but are based on results for the current setting.

The paper proceeds as follows. Section 2 presents the setting, the estimands, and identification under PI. Section 3 introduces three types of PI-based estimators to be handled with different sensitivity analysis techniques. Sections 4 and 5 present sensitivity analysis using ratio-type and difference-type sensitivity parameters, respectively, and address each of the three estimator types. Section 6 covers other topics relevant to the sensitivity analyses. Section 7 analyzes JOBS II data. Section 8 closes with a discussion. Proofs are provided in the Web Appendices. Code is provided in the R-package PIsens available at https://github.com/trangnguyen74/PIsens.

2 |. SETTING, ESTIMANDS, AND PI-BASED IDENTIFICATION

2.1 |. Setting, estimands, and standard assumptions

Let $Z$ denote treatment assignment (1 for treatment, 0 for control), $Y$ denote the observed outcome, $Y_{z}$ the potential outcome had treatment $z$ been assigned $(z = 0, 1)$ , and $X$ denote baseline covariates. Let $S$ be a binary variable indicating whether the person actually receives the treatment $(S = 1)$ or not $(S = 0)$ . (More generally, $S$ can be any post-treatment variable of interest^6,22,4.) The principal stratification framework¹ defines subpopulations (aka principal strata, denoted by $C$ ) based on $S_{1}$ and $S_{0}$ , the potential values of $S$ under assignment to treatment and to control. In the one-sided noncompliance setting, $S_{0} = 0$ , so only $S_{1}$ matters. Hence $C$ coincides with $S_{1}$ and there are two principal strata: compliers $(C = 1)$ who would and noncompliers $(C = 0)$ who would not take the treatment, if offered the treatment. The “full” data for an individual are $(X, Z, C, Y_{1}, Y_{0})$ ; the observed data are $O : = (X, Z, S, Y)$ . Assume that we observe $n$ i.i.d. copies of $O$ .

Here we are interested in the complier and noncomplier average causal effects (CACE and NACE). As the PI identification strategy is symmetric with respect to these two effects (and so are the sensitivity assumptions we consider), we focus on the generic estimand

Δ_{c} : = E [Y_{1} - Y_{0} ∣ C = c] = E [Y_{1} ∣ C = c] - E [Y_{0} ∣ C = c],

where $c = 1$ gives the CACE and $c = 0$ gives the NACE.

Throughout we assume the usual causal inference assumptions:

\begin{array}{ll} \text{A0 (consistency):} & Y = Z Y_{1} + (1 - Z) Y_{0}, \quad S = Z C, \\ \text{A1 (treatment assignment ignorability):} & Z ⫫ (C, Y_{1}, Y_{0}) ∣ X, \\ \text{A2 (treatment assignment positivity):} & 0 < P (Z = 1 ∣ X) < 1 . \end{array}

Under $A 0$ , we write $O = (X, Z, Z C, Y)$ to simplify presentation.

As several expressions appear repeatedly in the paper, we will use the shorthand notation

\begin{array}{rl} τ_{z c} & : = E [Y_{z} ∣ C = c], \\ μ_{z c} (X) & : = E [Y_{z} ∣ X, C = c], \\ μ_{0} (X) & : = E [Y ∣ X, Z = 0], \\ e (X, Z) & : = P (Z ∣ X), \\ π_{c} (X) & : = P (C = c ∣ X), \end{array}

for $z = 0, 1, c = 0, 1$ . Here $Δ_{c} = τ_{1 c} - τ_{0 c}$ . Note the difference between $μ_{z c} (X)$ which is the conditional mean of a potential outcome within a principal stratum and $μ_{0} (X)$ which concerns the observed outcome in the control condition and does not condition on principal stratum. $e (X, 1)$ is the propensity score. $π_{c} (X)$ is the probability of being in stratum $c$ given covariate values, which we also refer to as the principal score, following the literature^12,15,11,13.

Proofs of all results in this section are provided in Web Appendix B.

2.2 |. The identification challenge and the PI assumption

Identification of $Δ_{c} = τ_{1 c} - τ_{0 c}$ amounts to identification of $τ_{1 c}$ and $τ_{0 c}$ . The challenge is that while A0-A2 identify $τ_{1 c}$ , they are not sufficient to identify $τ_{0 c}$ . To see this, we start with the identity below.

Lemma 1.

\overset{= : τ_{z c}}{\overset{⏞}{E [Y_{z} ∣ C = c]}} = \frac{E \{\overset{= : π_{c} (X)}{\overset{⏞}{P (C = c ∣ X)}} \overset{= : μ_{z c} (X)}{\overset{⏞}{E [Y_{z} ∣ X, C = c]}}\}}{E [P (C = c ∣ X)]} .

(1)

(To simplify presentation, it is left implicit that $μ_{z c} (X)$ is only defined where $π_{c} (X) > 0$ .)

Lemma 1 says that $τ_{z c}$ is equal to the weighted average of the stratum-specific potential outcome mean $μ_{z c} (X)$ where the weight is proportional to the principal score $π_{c} (X)$ . This means $τ_{z c}$ can be identified via identification of $π_{c} (X)$ and $μ_{z c} (X)$ , which we address next.

Proposition 1 (Results without PI).

Under assumptions A0-A2,

π_{c} (X) = P (C = c ∣ X, Z = 1),

(2)

μ_{z c} (X) = E [Y ∣ X, Z = z, C = c],

(3)

τ_{1 c} = \frac{E [π_{c} (X) μ_{1 c} (X)]}{E [π_{c} (X)]} = \frac{E [\frac{Z}{e (X, Z)} I (C = c) Y]}{E [\frac{Z}{e (X, Z)} I (C = c)]},

(4)

π_{1} (X) μ_{01} (X) + π_{0} (X) μ_{00} (X) = \overset{= : μ_{0} (X)}{\overset{⏞}{E [Y∣ X, Z = 0]}} .

(5)

Proposition 1 shows that A0-A2 identify $π_{c} (X)$ and $μ_{1 c} (X)$ , but not $μ_{0 c} (X)$ . (The RHS of (3) conditions on $C$ , which is not observed for $Z = 0$ .) Hence $τ_{1 c}$ is identified, but $τ_{0 c}$ is not, so $Δ_{c}$ is not.

The problem here is nonidentifiability of the stratum-specific conditional $Y_{0}$ mean functions $μ_{0 c} (X)$ . These two functions, $μ_{01} (X)$ for compliers (where $c = 1$ ) and $μ_{00} (X)$ for noncompliers (where $c = 0$ ), are tied together as two unknowns in one equation, (5), which we will call the mixture equation. To identify them, some additional assumption is needed.

One such assumption is PI, which we state here as a conditional mean independence:

A 3 (principal ignorability) : \overset{= : μ_{01} (X)}{\overset{⏞}{E [Y_{0} ∣ X, C = 1]}} = \overset{= : μ_{00} (X)}{\overset{⏞}{E [Y_{0} ∣ X, C = 0]}} .

PI is also sometimes stated as $C ⫫ Y_{0} ∣ X$ (which implies A3). This version is more intuitive: it is satisfied if $X$ captures all common causes of $C$ and $Y_{0}$ ¹³. Like other authors, we assume that A3 and A1 involve the same set of covariates; this can be relaxed.

A3 combined with (5) solves the identification problem.

Proposition 2 (PI based identification).

Under assumptions A0-A3,

μ_{0 c} (X) = μ_{0} (X),

(6)

τ_{0 c} = \frac{E [π_{c} (X) μ_{0} (X)]}{E [π_{c} (X)]} = \frac{E [\frac{Z}{e (X, Z)} I (C = c) μ_{0} (X)]}{E [\frac{Z}{e (X, Z)} I (C = c)]} = \frac{E [\frac{1 - Z}{e (X, Z)} π_{c} (X) Y]}{E [\frac{1 - Z}{e (X, Z)} π_{c} (X)]} .

(7)

We will refer to the observed data functionals in Proposition 2 that identify $μ_{0 c} (X)$ and $τ_{0 c}$ as $μ_{0 c}^{P I} (X)$ and $τ_{0 c}^{P I}$ , and the corresponding result for $Δ_{c}$ (i.e., $τ_{1 c} - τ_{0 c}^{P I}$ ) as $Δ_{c}^{P I}$ .
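To make the three equalities in (7) concrete, the following minimal sketch checks them by exact enumeration on a toy population with a binary covariate. All numbers and names (`p_x`, `e1`, `pi1`, `mu0`, `expect`, `e_xz`) are hypothetical and chosen only for illustration. PI is imposed by giving both strata the same control-outcome mean, and $Y$ is replaced by its conditional mean, which suffices here because each formula is linear in $Y$.

```python
from itertools import product

# Toy population (all values are illustrative assumptions, not from the paper).
p_x = {0: 0.5, 1: 0.5}   # P(X = x)
e1 = {0: 0.4, 1: 0.6}    # P(Z = 1 | X = x)
pi1 = {0: 0.3, 1: 0.7}   # P(C = 1 | X = x), the principal score for c = 1
mu0 = {0: 1.0, 1: 2.0}   # E[Y0 | X = x, C = c], equal across c under PI (A3)


def e_xz(x, z):
    """e(X, Z) = P(Z = z | X = x)."""
    return e1[x] if z == 1 else 1.0 - e1[x]


def expect(f):
    """Exact expectation of f(x, z, c), with Z independent of C given X (A1)."""
    total = 0.0
    for x, z, c in product((0, 1), repeat=3):
        p_c = pi1[x] if c == 1 else 1.0 - pi1[x]
        total += p_x[x] * e_xz(x, z) * p_c * f(x, z, c)
    return total


# Three expressions for tau_{01} under PI, following (7) with c = 1.
tau_a = expect(lambda x, z, c: pi1[x] * mu0[x]) / expect(lambda x, z, c: pi1[x])
tau_b = (expect(lambda x, z, c: z / e_xz(x, z) * (c == 1) * mu0[x])
         / expect(lambda x, z, c: z / e_xz(x, z) * (c == 1)))
tau_c = (expect(lambda x, z, c: (1 - z) / e_xz(x, z) * pi1[x] * mu0[x])
         / expect(lambda x, z, c: (1 - z) / e_xz(x, z) * pi1[x]))

print(tau_a, tau_b, tau_c)
```

In this toy setup the three expressions agree (about 1.7), reflecting that the principal-score-weighted, treatment-arm-weighted, and control-arm-weighted forms target the same quantity.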

Remark 1 (Sufficient PI version).

A3 involves $Y_{0}$ but not $Y_{1}$ . Feller et al. (2017)¹¹ call this assumption weak PI to differentiate it from a different assumption (strong PI) that involves both potential outcomes, $C ⫫ Y_{z} ∣ X$ for $z = 0, 1$ . While these labels suggest a difference in degree, these assumptions are qualitatively different. Strong PI implies that conditional on $X$ , the average causal effect is constant across principal strata, which is generally not desired¹¹. As A3 is sufficient (and strong PI is unnecessary), we simply refer to A3 as PI.

PI is untestable. The sensitivity analyses in Sections 4 and 5 will each replace PI with an alternative assumption (sensitivity assumption) indexed by a sensitivity parameter representing deviation from PI. Such an assumption obtains alternative identification results for $μ_{0 c} (X)$ and $Δ_{c}$ . The sensitivity analysis then shows, for a plausible range of the sensitivity parameter, how effect estimates depart from those obtained in a PI-based analysis.

3 |. THREE TYPES OF PI-BASED ESTIMATORS FROM THE LENS OF SENSITIVITY ANALYSIS

It is desirable to develop sensitivity analysis methods that are simple modifications of PI-based methods. With this in mind, in this section we group estimators of $Δ_{c}^{P I}$ into three types (each with a few example estimators), which we anticipate can be adapted for sensitivity analysis using different techniques (in subsequent sections). This grouping may be useful generally, say, where it is desirable to use a different sensitivity assumption not covered in this paper.

With three estimator types, the presentation from here through Section 5 is slightly complex. Readers who are mainly looking to add a sensitivity analysis to an already conducted or planned PI-based analysis only need to focus on the type of their estimator and can ignore the others.

Proofs of results in this section are provided in Web Appendix C.

3.1 |. Type A (≈ outcome regression estimators)

As PI-based analysis relies on the identification result $μ_{0 c}^{P I} (X) = μ_{0} (X)$ , an obvious sensitivity analysis technique (applicable to any PI-based method that involves estimating $μ_{0} (X)$ ) is to replace $μ_{0} (X)$ with the alternative formula for $μ_{0 c} (X)$ identified under the sensitivity assumption. We aim to use this technique with type A (roughly outcome regression) estimators.

To be precise, type A estimators involve estimating $μ_{0} (X)$ in order to first estimate the principal causal effect conditional on covariates (which under PI is $μ_{1 c} (X) - μ_{0} (X)$ ) or a proxy for it, and then aggregate these conditional effects to estimate the average principal causal effect $Δ_{c}^{P I}$ . Examples include the principal-score-weighted outcome-regression estimator (aka the plug-in estimator) (8) and the propensity-score-weighted outcome-regression estimator (9):

{\hat{Δ}}_{c, π μ}^{P I} : = \frac{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i}) [{\hat{μ}}_{1 c} (X_{i}) - {\hat{μ}}_{0} (X_{i})]}{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i})},

(8)

{\hat{Δ}}_{c, e μ}^{P I} : = \frac{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})} [Y_{i} - {\hat{μ}}_{0} (X_{i})]}{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})}},

(9)

where the hat notation indicates an estimated function. These are justified by the $τ_{1 c}$ formulae in (4) and the first two $τ_{0 c}^{P I}$ formulae in (7). Also included in type A is a multiply robust outcome regression estimator, ${\hat{Δ}}_{c, M S}^{P I}$ , which we will present after explaining type B estimators.

For each estimator here, the component to be replaced in sensitivity analysis is the estimate of $μ_{0} (X)$ (shown in red in the original typeset equations).
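As a rough illustration of (8) and (9), the sketch below computes both type A estimators from already-fitted nuisance values. The function names and argument layout are assumptions made for this example and are not taken from the PIsens package; how the nuisance functions are fit is left open.

```python
def plugin_estimator(pi_c, mu_1c, mu_0):
    """Principal-score-weighted outcome regression (plug-in) estimator, eq. (8).

    pi_c, mu_1c, mu_0: fitted values of pi_c(X_i), mu_1c(X_i), mu_0(X_i).
    """
    num = sum(p * (m1 - m0) for p, m1, m0 in zip(pi_c, mu_1c, mu_0))
    return num / sum(pi_c)


def ps_weighted_or_estimator(z, in_stratum, e_zx, y, mu_0):
    """Propensity-score-weighted outcome regression estimator, eq. (9).

    in_stratum[i] is I(C_i = c); it only matters where z[i] == 1, since C is
    unobserved in the control arm. e_zx[i] is the fitted e(X_i, Z_i).
    """
    w = [zi * ci / ei for zi, ci, ei in zip(z, in_stratum, e_zx)]
    num = sum(wi * (yi - m0) for wi, yi, m0 in zip(w, y, mu_0))
    return num / sum(w)


# Tiny hand-checkable inputs (hypothetical numbers).
d_plugin = plugin_estimator([0.5, 0.5], [2.0, 4.0], [1.0, 1.0])
d_psw = ps_weighted_or_estimator(
    [1, 1, 0], [1, 0, 0], [0.5, 0.5, 0.5], [3.0, 9.0, 9.0], [1.0, 1.0, 1.0]
)
print(d_plugin, d_psw)
```

In both functions the fitted $μ_{0} (X)$ values enter as a separate argument, which is what makes it straightforward to swap in a different control-mean function later.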

3.2 |. Type B (≈ influence function based estimators)

Type B estimators are a subset of estimators constructed based on the nonparametric influence function (IF) of $Δ_{c}^{P I}$ (hence the rough label IF-based estimators, although not all IF-based estimators belong in type B). To define this type precisely, let

ν_{z c} : = E [π_{c} (X) μ_{z c} (X)],

π_{c} : = E [π_{c} (X)],

ν_{0 c}^{P I} : = E [π_{c} (X) μ_{0} (X)] .

In this notation, $Δ_{c} = (ν_{1 c} - ν_{0 c}) / π_{c}$ and $Δ_{c}^{P I} = (ν_{1 c} - ν_{0 c}^{P I}) / π_{c}$ . A type B estimator of $Δ_{c}^{P I}$ is one that can be expressed as a combination of IF-based estimators of $ν_{1 c}, ν_{0 c}^{P I}$ and $π_{c}$ . The sensitivity analysis technique will be to replace the $ν_{0 c}^{P I}$ component with an IF-based estimator of $ν_{0 c}$ under the sensitivity assumption. To obtain these estimators, we derive the relevant IFs.

Proposition 3 (IFs for PI-based analysis).

The IFs of $π_{c}, ν_{1 c}, ν_{0 c}^{P I}$ , and $Δ_{c}^{P I}$ are

φ_{π_{c}} (O) = \frac{Z}{e (X, Z)} [I (C = c) - π_{c} (X)] + π_{c} (X) - π_{c},

(10)

φ_{ν_{1 c}} (O) = \frac{Z}{e (X, Z)} I (C = c) [Y - μ_{1 c} (X)] + \frac{Z}{e (X, Z)} μ_{1 c} (X) [I (C = c) - π_{c} (X)] + π_{c} (X) μ_{1 c} (X) - ν_{1 c},

(11)

φ_{ν_{0 c}^{P I}} (O) = \frac{1 - Z}{e (X, Z)} π_{c} (X) [Y - μ_{0} (X)] + \frac{Z}{e (X, Z)} μ_{0} (X) [I (C = c) - π_{c} (X)] + π_{c} (X) μ_{0} (X) - ν_{0 c}^{P I},

(12)

φ_{Δ_{c}^{P I}} (O) = \frac{1}{π_{c}} \{[φ_{ν_{1 c}} (O) + ν_{1 c}] - [φ_{ν_{0 c}^{P I}} (O) + ν_{0 c}^{P I}] - Δ_{c}^{P I} [φ_{π_{c}} (O) + π_{c}]\} .

(13)

The estimator that uses the IF of $Δ_{c}^{P I}$ (with estimated nuisances) as the estimating function is a type B estimator. This is because, by (13), this estimator has the form

{\hat{Δ}}_{c, I F}^{P I} = \frac{{\hat{ν}}_{1 c, I F} - {\hat{ν}}_{0 c, I F}^{P I}}{{\hat{π}}_{c, I F}},

(14)

where (with $P_{n}$ representing sample average)

{\hat{ν}}_{1 c, I F} : = P_{n} \{\frac{Z}{\hat{e} (X, Z)} I (C = c) [Y - {\hat{μ}}_{1 c} (X)] + \frac{Z}{\hat{e} (X, Z)} {\hat{μ}}_{1 c} (X) [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X) {\hat{μ}}_{1 c} (X)\},

{\hat{ν}}_{0 c, I F}^{P I} : = P_{n} \{\frac{1 - Z}{\hat{e} (X, Z)} {\hat{π}}_{c} (X) [Y - {\hat{μ}}_{0} (X)] + \frac{Z}{\hat{e} (X, Z)} {\hat{μ}}_{0} (X) [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X) {\hat{μ}}_{0} (X)\},

{\hat{π}}_{c, I F} : = P_{n} \{\frac{Z}{\hat{e} (X, Z)} [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X)\}

are IF-based estimators of $ν_{1 c}, ν_{0 c}^{P I}, π_{c}$ .

Another type B estimator is the Hájek-type ²³ estimator,

{\hat{Δ}}_{c, I F H}^{P I} = \frac{{\hat{ν}}_{1 c, I F H} - {\hat{ν}}_{0 c, I F H}^{P I}}{{\hat{π}}_{c, I F H}},

(15)

where ${\hat{ν}}_{1 c, IFH}$ , ${\hat{ν}}_{0 c, IFH}^{P I}$ , ${\hat{π}}_{c, IFH}$ are a modified version of ${\hat{ν}}_{1 c, IF}$ , ${\hat{ν}}_{0 c, IF}^{P I}$ , ${\hat{π}}_{c, IF}$ , replacing $\frac{Z}{\hat{e} (X, Z)}$ with $\frac{Z}{\hat{e} (X, Z)} / P_{n} [\frac{Z}{\hat{e} (X, Z)}]$ and $\frac{1 - Z}{\hat{e} (X, Z)}$ with $\frac{1 - Z}{\hat{e} (X, Z)} / P_{n} [\frac{1 - Z}{\hat{e} (X, Z)}]$ . (We call this modification Hájek-ization.)

${\hat{Δ}}_{c, I F}^{P I}$ and ${\hat{Δ}}_{c, I F H}^{P I}$ are multiply robust (see Proposition 4 below). ${\hat{Δ}}_{c, I F H}^{P I}$ is range-preserving.
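The non-Hájek IF-based estimator (14) can be sketched as follows, again taking fitted nuisance values as given. The function name `if_estimator` and its argument names are hypothetical; the code uses $S = Z C$ so that $Z I (C = c)$ is computed from observed data only.

```python
def if_estimator(z, s, y, e1, pi_c, mu_1c, mu_0, c=1):
    """IF-based estimator (14) of Delta_c^PI from fitted nuisance values.

    z, s, y : assignment, treatment received (S = Z * C), observed outcome
    e1      : fitted P(Z = 1 | X_i)
    pi_c    : fitted principal score P(C = c | X_i)
    mu_1c   : fitted E[Y | X_i, Z = 1, C = c]
    mu_0    : fitted E[Y | X_i, Z = 0]
    """
    nu1 = nu0 = pi_sum = 0.0
    for i in range(len(y)):
        e_zx = e1[i] if z[i] == 1 else 1.0 - e1[i]        # e(X_i, Z_i)
        in_c = 1.0 if (z[i] == 1 and s[i] == c) else 0.0   # Z_i * I(C_i = c)
        w1 = z[i] / e_zx                                   # Z / e(X, Z)
        w0 = (1 - z[i]) / e_zx                             # (1 - Z) / e(X, Z)
        aug = in_c / e_zx - w1 * pi_c[i]                   # Z/e * [I(C = c) - pi_c(X)]
        nu1 += in_c / e_zx * (y[i] - mu_1c[i]) + mu_1c[i] * aug + pi_c[i] * mu_1c[i]
        nu0 += w0 * pi_c[i] * (y[i] - mu_0[i]) + mu_0[i] * aug + pi_c[i] * mu_0[i]
        pi_sum += aug + pi_c[i]
    # The 1/n factors of the three sample averages cancel in the ratio.
    return (nu1 - nu0) / pi_sum


# Two-unit toy input (hypothetical numbers) that can be verified by hand.
d_if = if_estimator(
    z=[1, 0], s=[1, 0], y=[3.0, 2.0], e1=[0.5, 0.5],
    pi_c=[0.5, 0.5], mu_1c=[2.0, 2.0], mu_0=[1.0, 1.0], c=1,
)
print(d_if)
```

The accumulator `nu0` corresponds to the $ν_{0 c}^{P I}$ component, kept separate from `nu1` and `pi_sum` so that it can be exchanged without touching the other two pieces.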

Circling back to type A.

We now present the multiply robust outcome regression estimator ${\hat{Δ}}_{c, M S}^{P I}$ mentioned earlier. This is a multi-step estimator (the MS subscript is for “multi-step”) that is based on expressing the IF of $Δ_{c}^{P I}$ as a sum of three terms:

φ_{Δ_{c}^{P I}} (O) = \frac{1}{π_{c}} \{\overset{(*)}{\overset{⏞}{\frac{Z}{e (X, Z)} I (C = c) [Y - μ_{1 c} (X)]}} - \overset{(* *)}{\overset{⏞}{\frac{1 - Z}{e (X, Z)} π_{c} (X) [Y - μ_{0} (X)]}} + \underset{(* * *)}{\underset{⏟}{[\frac{Z}{e (X, Z)} [I (C = c) - π_{c} (X)] + π_{c} (X)] [μ_{1 c} (X) - μ_{0} (X) - Δ_{c}^{P I}]}}\},

(16)

and building steps that zero out the sample means of the terms. The resulting estimator is

{\hat{Δ}}_{c, M S}^{P I} : = \frac{\sum_{i = 1}^{n} \hat{w} (O_{i}) [{\tilde{μ}}_{1 c} (X_{i}) - {\tilde{μ}}_{0} (X_{i})]}{\sum_{i = 1}^{n} \hat{w} (O_{i})} .

(17)

Here $\hat{w} (O) : = \frac{Z}{\hat{e} (X, Z)} [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X)$ . ${\tilde{μ}}_{1 c} (X)$ and ${\tilde{μ}}_{0} (X)$ are specific estimators of $μ_{1 c} (X)$ and $μ_{0} (X)$ : ${\tilde{μ}}_{1 c} (X)$ is fit to (non)compliers in the treatment arm weighted by $1 / \hat{e} (X, 1)$ , ${\tilde{μ}}_{0} (X)$ is fit to control units weighted by ${\hat{π}}_{c} (X) / \hat{e} (X, 0)$ , and both are mean-recovering models (i.e., on the sample to which the model is fit, the mean of model predictions equals outcome mean). These models zero out the sample means of $(*)$ and $(* *)$ , and the weighted averaging in (17) zeros out the sample mean of $(* * *)$ . ( $\hat{w} (O)$ can also be Hájek-ized, for another version.)

Remark 2.

The tilde notation here refers to this specific method of estimating $μ$ functions for this estimator. The weighting targets the model to the relevant covariate space where it is used for prediction, and the mean-recovering feature ensures that predictions are on average unbiased (if the weights are correct). This targeted estimation technique can also be used (but is not required) for estimating $μ_{1 c} (X)$ and $μ_{0} (X)$ for other estimators, and for estimating $π_{c} (X)$ .

${\hat{Δ}}_{c, MS}^{P I}$ shares the multiple robustness property of ${\hat{Δ}}_{c, IF}^{P I}$ and ${\hat{Δ}}_{c, IFH}^{P I}$ (see Proposition 4).

Proposition 4 (multiply robust PI-based estimators).

${\hat{Δ}}_{c, I F}^{P I}, {\hat{Δ}}_{c, I F H}^{P I}$ and ${\hat{Δ}}_{c, M S}^{P I}$ are consistent if one of the following three conditions holds:

the propensity score $e (X, Z)$ and principal score $π_{c} (X)$ models are correctly specified; or
the principal score model $π_{c} (X)$ and both outcome models $μ_{1 c} (X)$ , $μ_{0} (X)$ are correctly specified; or
the propensity score model $e (X, Z)$ and the outcome under control $μ_{0} (X)$ model are correctly specified.

For simplicity, we presume that estimation uses parametric models. We leave IF-based inference using data-adaptive nuisance estimation to future work, except for a small first step of deriving nonparametric rate conditions (see Section 6.3).

3.3 |. Type C (≈ other/weighting estimators)

Type C estimators do not involve estimating $μ_{0} (X)$ as a step in the estimation procedure. This type includes the pure weighting estimator

{\hat{Δ}}_{c, e π}^{P I} : = \frac{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})} Y_{i}}{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})}} - \frac{\sum_{i = 1}^{n} \frac{(1 - Z_{i}) {\hat{π}}_{c} (X_{i})}{\hat{e} (X_{i}, Z_{i})} Y_{i}}{\sum_{i = 1}^{n} \frac{(1 - Z_{i}) {\hat{π}}_{c} (X_{i})}{\hat{e} (X_{i}, Z_{i})}},

(18)

justified by the second $τ_{1 c}$ formula in (4) and the third $τ_{0 c}^{P I}$ formula in (7). Also included in type C is the estimator that employs this same weighting scheme and uses the weighted sample to fit a model regressing outcome on treatment and covariates, say, to improve precision in estimating $Δ_{c}^{P I}$ (in the spirit of^24,25). For this type, we do not have a specific sensitivity analysis technique in mind, and will need to see whether the identification result under the sensitivity assumption allows a simple modification.
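A minimal sketch of the pure weighting estimator (18) is given below. The function name and arguments are assumptions for illustration; the fitted propensity score `e1` and principal score `pi_c` are taken as given.

```python
def weighting_estimator(z, s, y, e1, pi_c, c=1):
    """Pure weighting estimator (18) of Delta_c^PI.

    z, s, y : assignment, treatment received (S = Z * C), observed outcome
    e1      : fitted P(Z = 1 | X_i)
    pi_c    : fitted principal score P(C = c | X_i)
    """
    num1 = den1 = num0 = den0 = 0.0
    for zi, si, yi, ei, pi in zip(z, s, y, e1, pi_c):
        if zi == 1:
            # Treatment arm: stratum membership is observed, weight I(C = c) / e(X, 1).
            w = (1.0 if si == c else 0.0) / ei
            num1 += w * yi
            den1 += w
        else:
            # Control arm: stratum is unobserved, weight pi_c(X) / e(X, 0).
            w = pi / (1.0 - ei)
            num0 += w * yi
            den0 += w
    return num1 / den1 - num0 / den0


# Four-unit toy input (hypothetical numbers) that can be verified by hand.
d_w = weighting_estimator(
    z=[1, 1, 0, 0], s=[1, 0, 0, 0], y=[5.0, 1.0, 2.0, 4.0],
    e1=[0.5, 0.5, 0.5, 0.5], pi_c=[0.5, 0.5, 0.25, 0.75], c=1,
)
print(d_w)
```

Note that no estimate of $μ_{0} (X)$ appears anywhere in this computation, which is the defining feature of type C.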

To sum up, we have defined three types of PI-based estimators: type A, whose defining feature is involving $μ_{0} (X)$ estimation; type B, whose defining feature is having as a component an IF-based estimator of $ν_{0 c}^{P I}$ ; and type C, other estimators. We now consider sensitivity analysis.

4 |. SENSITIVITY ANALYSIS BASED ON THREE RATIO-TYPE SENSITIVITY PARAMETERS

Recall that the challenge before invoking PI was that the stratum-specific conditional $Y_{0}$ means $μ_{01} (X)$ and $μ_{00} (X)$ are not identified, as they are two unknowns in the mixture equation

μ_{01} (X) π_{1} (X) + μ_{00} (X) π_{0} (X) = μ_{0} (X) .

(5)

PI identifies $μ_{01} (X)$ and $μ_{00} (X)$ by equating them to each other. A sensitivity analysis replaces PI with a sensitivity assumption that allows $μ_{01} (X)$ and $μ_{00} (X)$ to differ from each other. The assumption is indexed by a sensitivity parameter indicating how and to what degree they differ. To accommodate different outcome types (binary, bounded, unbounded) and different conceptualizations of how $μ_{01} (X)$ and $μ_{00} (X)$ may differ, we consider different parameterizations. The following assumptions use an odds ratio (OR), a generalized odds ratio (GOR) and a mean ratio (MR) sensitivity parameter. In all of them, $ρ = 1$ recovers the PI case.

A4-OR (sensitivity odds ratio): $\frac{μ_{01} (X) / [1 - μ_{01} (X)]}{μ_{00} (X) / [1 - μ_{00} (X)]} = ρ$ ,

A4-GOR (sensitivity generalized odds ratio): $\frac{[μ_{01} (X) - l] / [h - μ_{01} (X)]}{[μ_{00} (X) - l] / [h - μ_{00} (X)]} = ρ$ , where $l, h$ are the lower and upper $Y_{0}$ bounds,

A4-MR (sensitivity mean ratio): $\frac{μ_{01} (X)}{μ_{00} (X)} = ρ$ ,

for some positive range of $ρ$ that is considered plausible.

As mentioned in Section 1.2, a challenge with A4-MR is that it may predict out of the outcome range. For an example, consider an outcome on a 0 to 7 scale. Suppose that for some covariate value $x$ , $μ_{0} (x) = 5$ and $π_{1} (x) = 0.3$ . Then a sensitivity MR value of 1.69 would imply $μ_{01} (x) > 7$ . A4-MR is thus more suitable if the outcome is single-signed and unbounded. Since most outcomes are practically bounded, if using A4-MR, the parameter range should be carefully selected to avoid predicting extreme $μ_{0 c} (X)$ values; we will discuss this in Section 6.1.
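The arithmetic behind this example can be reproduced with a few lines. The sketch solves the mixture equation (5) together with A4-MR for the complier mean; the helper name `mu01_under_mr` is hypothetical.

```python
def mu01_under_mr(mu0, pi1, rho):
    """Complier control mean mu_01 implied by A4-MR and the mixture equation (5)."""
    # A4-MR gives mu_00 = mu_01 / rho, so (5) reads
    # pi1 * mu_01 + (1 - pi1) * mu_01 / rho = mu0; solve for mu_01.
    return rho * mu0 / ((rho - 1.0) * pi1 + 1.0)


mu01 = mu01_under_mr(mu0=5.0, pi1=0.3, rho=1.69)
mu00 = mu01 / 1.69                      # noncomplier mean under A4-MR
mixture = 0.3 * mu01 + 0.7 * mu00       # should recover mu0 = 5

print(mu01, mu00, mixture)
```

The implied complier mean lands just above 7, the top of the scale, while the mixture still averages to 5.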

For binary outcomes, we propose A4-OR, the assumption that within levels of $X$ (i) the odds of the outcome for compliers is $ρ$ times that for noncompliers, or equivalently (because ORs are symmetric), (ii) the odds of being a complier for those with the outcome is $ρ$ times that for those without the outcome. A4-OR predicts $μ_{0 c} (X)$ within [0, 1].

More generally, for outcomes bounded on both ends, we propose A4-GOR, a generalization of A4-OR. (A4-OR is a special case with $l = 0$ and $h = 1$ .) Figure 1 shows the connection between $μ_{00} (X)$ and $μ_{01} (X)$ for several GOR values. If the outcome range varies with $X$ , the bounds can be made $X$ -value-specific, i.e., $l (X)$ and $h (X)$ . A4-GOR always predicts $μ_{0 c} (X)$ within the specified bounds. For a non-binary outcome, however, A4-GOR may still conflict with the observed outcome distribution in ways that are not obvious, e.g., predicting $μ_{0 c} (X)$ values far from where the outcome mass is concentrated.

Figure 1. Connection between $μ_{00} (X)$ and $μ_{01} (X)$ under A4-GOR for different GOR values

Remark 3 (Exponential tilting connection).

A4-OR can be equivalently expressed as

P (Y_{0} ∣ X, C = 1) = P (Y_{0} ∣ X, C = 0) \frac{\exp ((\ln ρ) Y_{0})}{E [\exp ((\ln ρ) Y_{0}) ∣ X, C = 0]},

(19)

which looks like exponential tilting assumptions used in the context of non-ignorable missingness and unobserved confounding^26,27,28. The difference is that in these other problems, the assumption connects an unobserved distribution (e.g., that of missing data) to an observed distribution (that of non-missing data), whereas here the assumption relates two otherwise unidentified distributions whose mixture (and mixing ratio) is identified. Here the tilting-like assumption (19) achieves identification with a binary outcome but not generally. If $Y_{0}$ is continuous, for example, (19) (combined with the mixing weights $π_{c} (X)$ ) is not sufficient to identify the component distributions $P (Y_{0} ∣ X, C = c)$ (or their means) based on the mixture distribution $P (Y_{0} ∣ X)$ .
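For a binary $Y_{0}$ , the equivalence between (19) and A4-OR can be checked directly; the short derivation here is a sketch filling in that step. Since $E [\exp ((\ln ρ) Y_{0}) ∣ X, C = 0] = ρ μ_{00} (X) + 1 - μ_{00} (X)$ , evaluating (19) at $Y_{0} = 1$ and at $Y_{0} = 0$ gives

μ_{01} (X) = \frac{ρ μ_{00} (X)}{ρ μ_{00} (X) + 1 - μ_{00} (X)}, \quad 1 - μ_{01} (X) = \frac{1 - μ_{00} (X)}{ρ μ_{00} (X) + 1 - μ_{00} (X)} .

Dividing the first expression by the second yields $\frac{μ_{01} (X)}{1 - μ_{01} (X)} = ρ \frac{μ_{00} (X)}{1 - μ_{00} (X)}$ , which is A4-OR.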

Proofs of all results in this section are provided in Web Appendix D.

4.1 |. Identification

Combining any of the A4- assumptions with (5), we can identify $μ_{0 c} (X)$ , which then identifies $τ_{0 c}$ . We present results for A4-GOR (which includes A4-OR as a special case) and A4-MR.

To maintain symmetry, let $ρ_{1} = ρ, ρ_{0} = 1 / ρ$ .

Proposition 5 (GOR- and MR-based identification).

Under assumptions A0-A2 combined with A4-GOR,

μ_{0 c} (X) = \begin{cases} \frac{α_{c} (X) - β_{c} (X)}{2 (ρ_{c} - 1) π_{c} (X)} (h - l) + l & \text{if } ρ_{c} \neq 1 \\ μ_{0} (X) & \text{if } ρ_{c} = 1 \end{cases} =: μ_{0 c}^{GOR} (X),

(20)

and under assumptions A0-A2 combined with A4-MR,

μ_{0 c} (X) = γ_{c} (X) μ_{0} (X) = : μ_{0 c}^{M R} (X),

(21)

where

α_{c} (X) := [π_{c} (X) + \frac{μ_{0} (X) - l}{h - l}] (ρ_{c} - 1) + 1,

β_{c} (X) := \sqrt{{[α_{c} (X)]}^{2} - 4 π_{c} (X) \frac{μ_{0} (X) - l}{h - l} ρ_{c} (ρ_{c} - 1)},

γ_{c} (X) := \frac{ρ_{c}}{(ρ_{c} - 1) π_{c} (X) + 1} .

Identification of $ν_{0 c}, τ_{0 c}, Δ_{c}$ follows from $μ_{0 c} (X)$ identification. We will label the results of these parameters under A4-GOR and A4-MR with superscripts ^GOR and ^MR, respectively.
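The formulas in Proposition 5 can be evaluated directly at a single covariate value. The following Python sketch transcribes (20) and (21); the function names and the default bounds `l = 0`, `h = 1` (which give the A4-OR case) are illustrative choices, not the paper's software interface.

```python
import math


def mu0c_gor(mu0, pi_c, rho_c, l=0.0, h=1.0):
    """mu_{0c}^{GOR}(X) from (20), given mu_0(X), pi_c(X) and rho_c.

    Assumes l <= mu0 <= h, 0 < pi_c < 1 and rho_c > 0.
    """
    if rho_c == 1.0:
        return mu0
    m = (mu0 - l) / (h - l)  # outcome mean rescaled to [0, 1]
    alpha = (pi_c + m) * (rho_c - 1.0) + 1.0
    beta = math.sqrt(alpha ** 2 - 4.0 * pi_c * m * rho_c * (rho_c - 1.0))
    return (alpha - beta) / (2.0 * (rho_c - 1.0) * pi_c) * (h - l) + l


def mu0c_mr(mu0, pi_c, rho_c):
    """mu_{0c}^{MR}(X) = gamma_c(X) * mu_0(X) from (21)."""
    gamma_c = rho_c / ((rho_c - 1.0) * pi_c + 1.0)
    return gamma_c * mu0
```

With $ρ_{1} = ρ$ and $ρ_{0} = 1 / ρ$, calling each function once per stratum returns a pair of stratum means whose $π_{c} (X)$-weighted mixture equals $μ_{0} (X)$ and whose odds ratio (or mean ratio) equals $ρ$.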

4.2 |. Estimation

Based on the above identification results, we now modify the PI-based estimators. We let each resulting estimator inherit the label of the originating estimator, except replacing the superscript ^PI with one indicating the sensitivity assumption.

Figure 2 provides a summary of the key techniques presented here and in the next section.

FIGURE 2. Flowchart summarizing key sensitivity analysis techniques that are applicable given PI-based estimator type and sensitivity parameterization

4.2.1 |. Type A estimators

These estimators are adapted by replacing the estimate of $μ_{0} (X)$ with estimates of $μ_{0 c}^{G O R} (X)$ or $μ_{0 c}^{M R} (X)$ . For example, this turns the principal score weighted outcome regression estimator ${\hat{Δ}}_{c, π μ}^{P I}$ (8) (aka the plug-in estimator) into

{\hat{Δ}}_{c, π μ}^{G O R} : = \frac{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i}) [{\hat{μ}}_{1 c} (X_{i}) - {\hat{μ}}_{0 c}^{G O R} (X_{i})]}{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i})}, {\hat{Δ}}_{c, π μ}^{M R} : = \frac{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i}) [{\hat{μ}}_{1 c} (X_{i}) - {\hat{μ}}_{0 c}^{M R} (X_{i})]}{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i})},

(22)

where ${\hat{μ}}_{0 c}^{G O R} (X_{i})$ and ${\hat{μ}}_{0 c}^{M R} (X_{i})$ are $μ_{0 c}^{G O R} (X_{i})$ (20) and $μ_{0 c}^{M R} (X_{i})$ (21) evaluated at ${\hat{μ}}_{0} (X_{i})$ and ${\hat{π}}_{c} (X_{i})$ . The other outcome-regression estimators ${\hat{Δ}}_{c, e μ}^{P I}$ (9) and ${\hat{Δ}}_{c, M S}^{P I}$ (17) are adapted similarly.
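As an illustration of the type A adaptation, the MR version of the plug-in estimator (22) can be sketched in a few lines of Python. The inputs are assumed to be lists of already-fitted nuisance values at each $X_{i}$; the function names are hypothetical.

```python
def gamma_c(pi_c, rho_c):
    """gamma_c(X) from Proposition 5."""
    return rho_c / ((rho_c - 1.0) * pi_c + 1.0)


def plug_in_delta_mr(pi_c_hat, mu1c_hat, mu0_hat, rho_c):
    """Principal-score-weighted plug-in estimator of Delta_c under A4-MR, as in (22).

    pi_c_hat, mu1c_hat, mu0_hat: lists of fitted values, one entry per unit.
    """
    num = 0.0
    den = 0.0
    for p, m1, m0 in zip(pi_c_hat, mu1c_hat, mu0_hat):
        mu0c = gamma_c(p, rho_c) * m0  # (21) evaluated at the fitted values
        num += p * (m1 - mu0c)
        den += p
    return num / den
```

Setting `rho_c = 1.0` gives $γ_{c} (X) = 1$ and so recovers the PI-based plug-in estimator.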

4.2.2 |. Type B estimators

Adaptation is based on the IFs of $ν_{0 c}^{G O R} : = E [π_{c} (X) μ_{0 c}^{G O R} (X)]$ and $ν_{0 c}^{M R} : = E [π_{c} (X) μ_{0 c}^{M R} (X)]$ , which are provided in Proposition 6.

Proposition 6 (GOR- and MR-based IFs).

The IFs for $ν_{0 c}^{G O R}$ and $ν_{0 c}^{M R}$ are

φ_{ν_{0 c}^{G O R}} (O) = \frac{1 - Z}{e (X, Z)} ϵ_{μ, c}^{G O R} (X) [Y - μ_{0} (X)] + \frac{Z}{e (X, Z)} ϵ_{π, c}^{G O R} (X) [I (C = c) - π_{c} (X)] + π_{c} (X) μ_{0 c}^{G O R} (X) - ν_{0 c}^{G O R},

(23)

φ_{ν_{0 c}^{M R}} (O) = \frac{1 - Z}{e (X, Z)} ϵ_{μ, c}^{M R} (X) [Y - μ_{0} (X)] + \frac{Z}{e (X, Z)} ϵ_{π, c}^{M R} (X) [I (C = c) - π_{c} (X)] + π_{c} (X) μ_{0 c}^{M R} (X) - ν_{0 c}^{M R},

(24)

where

ϵ_{μ, c}^{GOR} (X) := \begin{cases} \frac{1}{2} - \frac{α_{c} (X)}{2 β_{c} (X)} + \frac{ρ_{c} π_{c} (X)}{β_{c} (X)} & \text{if } ρ_{c} \neq 1 \\ π_{c} (X) & \text{if } ρ_{c} = 1 \end{cases}, \qquad ϵ_{μ, c}^{MR} (X) := γ_{c} (X) π_{c} (X),

ϵ_{π, c}^{GOR} (X) := \begin{cases} [\frac{1}{2} - \frac{α_{c} (X)}{2 β_{c} (X)} + \frac{ρ_{c} \frac{μ_{0} (X) - l}{h - l}}{β_{c} (X)}] (h - l) + l & \text{if } ρ_{c} \neq 1 \\ μ_{0} (X) & \text{if } ρ_{c} = 1 \end{cases}, \qquad ϵ_{π, c}^{MR} (X) := γ_{1} (X) γ_{0} (X) μ_{0} (X) .

Based on Proposition 6, under A4-GOR and A4-MR, we obtain estimators ${\hat{Δ}}_{c, IF}^{GOR}$ and ${\hat{Δ}}_{c, IF}^{MR}$ by replacing the ${\hat{ν}}_{0 c, IF}^{PI}$ component of ${\hat{Δ}}_{c, IF}^{PI} := \frac{{\hat{ν}}_{1 c, IF} - {\hat{ν}}_{0 c, IF}^{PI}}{{\hat{δ}}_{c, IF}}$ (14) with ${\hat{ν}}_{0 c, IF}^{GOR}$ and ${\hat{ν}}_{0 c, IF}^{MR}$, respectively, where

{\hat{ν}}_{0 c, I F}^{G O R} : = P_{n} \{\frac{1 - Z}{\hat{e} (X, Z)} {\hat{ϵ}}_{μ, c}^{G O R} (X) [Y - {\hat{μ}}_{0} (X)] + \frac{Z}{\hat{e} (X, Z)} {\hat{ϵ}}_{π, c}^{G O R} (X) [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X) {\hat{μ}}_{0 c}^{G O R} (X)\},

(25)

{\hat{ν}}_{0 c, I F}^{M R} : = P_{n} \{\frac{1 - Z}{\hat{e} (X, Z)} {\hat{ϵ}}_{μ, c}^{M R} (X) [Y - {\hat{μ}}_{0} (X)] + \frac{Z}{\hat{e} (X, Z)} {\hat{ϵ}}_{π, c}^{M R} (X) [I (C = c) - {\hat{π}}_{c} (X)] + {\hat{π}}_{c} (X) {\hat{μ}}_{0 c}^{M R} (X)\},

(26)

and the $ϵ$ functions are estimated by evaluating them at ${\hat{π}}_{c} (X)$ and ${\hat{μ}}_{0} (X)$ .
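To make (26) concrete, the following Python sketch computes ${\hat{ν}}_{0 c, IF}^{MR}$ from unit-level data and fitted nuisance values. It assumes that `e_hat` holds, for each unit, the estimated probability of that unit's own observed treatment (our reading of $\hat{e} (X, Z)$); all names are hypothetical.

```python
def gamma(pi_c, rho_c):
    """gamma_c(X) from Proposition 5."""
    return rho_c / ((rho_c - 1.0) * pi_c + 1.0)


def nu0c_if_mr(Z, Y, Ic, e_hat, pi_c_hat, mu0_hat, rho_c):
    """Sample mean of the three summands inside P_n{...} in (26).

    Z: 0/1 treatment; Y: outcome; Ic: indicator I(C = c), used only when
    Z = 1 (any placeholder can be passed for control units); e_hat:
    estimated probability of each unit's own observed treatment (an
    assumption about the notation); pi_c_hat, mu0_hat: fitted values.
    """
    total = 0.0
    for z, y, ic, e, p, m0 in zip(Z, Y, Ic, e_hat, pi_c_hat, mu0_hat):
        g_c = gamma(p, rho_c)
        g_other = gamma(1.0 - p, 1.0 / rho_c)  # gamma_{1-c}(X)
        eps_mu = g_c * p                       # epsilon_{mu,c}^{MR}(X)
        eps_pi = g_c * g_other * m0            # gamma_1(X) gamma_0(X) mu_0(X)
        total += (1 - z) / e * eps_mu * (y - m0)  # outcome residual term
        total += z / e * eps_pi * (ic - p)        # principal score residual term
        total += p * g_c * m0                     # plug-in term
    return total / len(Z)
```

At `rho_c = 1.0` the two $ϵ$ functions reduce to $π_{c} (X)$ and $μ_{0} (X)$, so the function returns the PI-based quantity.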

${\hat{Δ}}_{c, I F H}^{P I}$ (15) is modified similarly to obtain sensitivity estimators ${\hat{Δ}}_{c, I F H}^{G O R}$ and ${\hat{Δ}}_{c, I F H}^{M R}$ by replacing ${\hat{ν}}_{0 c, I F H}^{P I}$ with ${\hat{ν}}_{0 c, I F H}^{G O R}$ and ${\hat{ν}}_{0 c, I F H}^{M R}$, the Hájek-ized versions of ${\hat{ν}}_{0 c, IF}^{G O R}$ and ${\hat{ν}}_{0 c, I F}^{M R}$.

4.2.2.1 |. Partial loss of robustness.

Proposition 4 stated that several PI-based estimators are multiply robust, including type B estimators ${\hat{Δ}}_{c, I F}^{P I}$ (14) and ${\hat{Δ}}_{c, I F H}^{P I}$ (15), and the multi-step type A estimator ${\hat{Δ}}_{c, M S}^{P I}$ (17). The adaptation of these estimators for sensitivity analysis results in partial loss of robustness (see Proposition 7). The resulting GOR-based estimators ( ${\hat{Δ}}_{c, I F}^{G O R}, {\hat{Δ}}_{c, I F H}^{G O R}, {\hat{Δ}}_{c, M S}^{G O R}$ ) depend on correct specification of models for $π_{c} (X)$ and $μ_{0} (X)$ (i.e., they are inconsistent if either model is misspecified). The MR-based counterparts ( ${\hat{Δ}}_{c, I F}^{M R}, {\hat{Δ}}_{c, I F H}^{M R}, {\hat{Δ}}_{c, M S}^{M R}$ ) depend on correct specification of the model for $π_{c} (X)$ .

Proposition 7 (Partial loss of robustness).

${\hat{Δ}}_{c, I F}^{G O R}, {\hat{Δ}}_{c, I F H}^{G O R}$ and ${\hat{Δ}}_{c, M S}^{G O R}$ are consistent for $Δ_{c}^{G O R}$ if

(a) both the model for $π_{c} (X)$ and the model for $μ_{0} (X)$ are correctly specified, AND
(b) either the model for $e (X, Z)$ or the model for $μ_{1 c} (X)$ is correctly specified.

${\hat{Δ}}_{c, I F}^{M R}, {\hat{Δ}}_{c, I F H}^{M R}$ and ${\hat{Δ}}_{c, M S}^{M R}$ are consistent for $Δ_{c}^{M R}$ if

(a) the model for $π_{c} (X)$ is correctly specified, AND
(b) either the model for $e (X, Z)$ or both outcome models $μ_{1 c} (X)$, $μ_{0} (X)$ are correctly specified.

Remark 4 (Approximate robustness).

Among these sensitivity estimators, the type B estimators ( ${\hat{Δ}}_{c, I F}^{G O R}, {\hat{Δ}}_{c, I F H}^{G O R}, {\hat{Δ}}_{c, I F}^{M R}, {\hat{Δ}}_{c, I F H}^{M R}$ ) are in a sense more robust than the multi-step type A estimators ( ${\hat{Δ}}_{c, M S}^{G O R}, {\hat{Δ}}_{c, M S}^{M R}$ ): they have an approximate robustness property with respect to the model component(s) whose correct specification they require for consistency. Specifically, (i) while all six estimators depend on a correct model for $π_{c} (X)$, the type B estimators provide a first-order correction of the bias (that would be incurred if simply using the plug-in estimator (22)) due to the deviation of the probability limit $π_{c}^{†} (X)$ of ${\hat{π}}_{c} (X)$ from the true function $π_{c} (X)$. Also, (ii) while all three GOR-based estimators additionally depend on a correct model for $μ_{0} (X)$, the type B estimators provide a first-order correction of the bias due to the deviation of the probability limit $μ_{0}^{†} (X)$ of ${\hat{μ}}_{0} (X)$ from the true function $μ_{0} (X)$. (This first-order bias correction feature is also shared by the originating PI-based estimators ${\hat{Δ}}_{c, M S}^{P I}, {\hat{Δ}}_{c, I F}^{P I}, {\hat{Δ}}_{c, I F H}^{P I}$, and results in the robustness of those estimators.)

We give a quick explanation of (ii) to make this concrete. (For full details concerning Remark 4, see the Web Appendix.) If $\hat{e} (X, Z)$ and ${\hat{π}}_{c} (X)$ are correctly specified but ${\hat{μ}}_{0} (X)$ is not, the probability limit of both ${\hat{ν}}_{0 c, I F}^{G O R}$ and ${\hat{ν}}_{0 c, I F H}^{G O R}$ is the sum of two terms

E \{π_{c} (X) μ_{0 c}^{G O R} [μ_{0}^{†} (X), π_{c} (X)]\} + E \{ϵ_{μ, c}^{G O R} [μ_{0}^{†} (X), π_{c} (X)] [μ_{0} (X) - μ_{0}^{†} (X)]\}

(27)

(which result from the last and first terms in (25)). These are the first two terms in the Taylor expansion of the true parameter $ν_{0 c}^{GOR}$ treated as a function of $μ_{0} (\cdot)$ at the point $μ_{0}^{†} (\cdot)$. The first term coincides with the probability limit of the plug-in estimator, which is biased due to $μ_{0}^{†} (X) \neq μ_{0} (X)$. The second term provides a first-order correction of this bias. For this approximate robustness property to be beneficial, however, $μ_{0}^{†} (X)$ needs to be close to $μ_{0} (X)$.
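The two terms in (27) can be examined numerically at a single covariate value. The Python sketch below, with hypothetical function names and bounds fixed at $l = 0$, $h = 1$ by default, compares the plug-in term and the first-order-corrected value with the target $π_{c} (X) μ_{0 c}^{GOR} (X)$ when $μ_{0}^{†} (X)$ differs from $μ_{0} (X)$.

```python
import math


def gor_parts(mu0, pi_c, rho_c, l=0.0, h=1.0):
    """Return (mu_{0c}^{GOR}, epsilon_{mu,c}^{GOR}) at one X value, for rho_c != 1."""
    m = (mu0 - l) / (h - l)
    alpha = (pi_c + m) * (rho_c - 1.0) + 1.0
    beta = math.sqrt(alpha ** 2 - 4.0 * pi_c * m * rho_c * (rho_c - 1.0))
    mu0c = (alpha - beta) / (2.0 * (rho_c - 1.0) * pi_c) * (h - l) + l
    eps_mu = 0.5 - alpha / (2.0 * beta) + rho_c * pi_c / beta
    return mu0c, eps_mu


def target_plug_in_corrected(mu0_true, mu0_dagger, pi_c, rho_c):
    """Pointwise analogue of (27): target, plug-in term, first-order-corrected term."""
    target = pi_c * gor_parts(mu0_true, pi_c, rho_c)[0]
    mu0c_dag, eps_dag = gor_parts(mu0_dagger, pi_c, rho_c)
    plug_in = pi_c * mu0c_dag                             # first term of (27)
    corrected = plug_in + eps_dag * (mu0_true - mu0_dagger)  # plus second term
    return target, plug_in, corrected
```

For example, `target_plug_in_corrected(0.40, 0.45, 0.6, 2.5)` returns a corrected value that is much closer to the target than the plug-in value is, although it does not coincide with it; the remaining gap is the second-order remainder.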

4.2.3 |. Type C estimators

We consider A4-MR and A4-GOR separately. Under A4-MR, the convenient form of $μ_{0 c}^{M R} (X)$ (21) allows a simple adaptation of type C estimators: to estimate $Δ_{c}^{M R}$ , scale the outcome in control units by a factor of ${\hat{γ}}_{c} (X)$ then use the PI-based analysis method. For the pure weighting estimator specifically, this adaptation results in the estimator

{\hat{Δ}}_{c, e π}^{M R} : = \frac{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})} Y_{i}}{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})}} - \frac{\sum_{i = 1}^{n} \frac{(1 - Z_{i}) {\hat{π}}_{c} (X_{i})}{\hat{e} (X_{i}, Z_{i})} {\hat{γ}}_{c} (X_{i}) Y_{i}}{\sum_{i = 1}^{n} \frac{(1 - Z_{i}) {\hat{π}}_{c} (X_{i})}{\hat{e} (X_{i}, Z_{i})}} .

This outcome scaling technique is justified by the result below, a corollary of Proposition 5.

Corollary 1 (MR-based outcome scaling).

τ_{0 c}^{M R} = \frac{E [\frac{1 - Z}{e (X, Z)} π_{c} (X) γ_{c} (X) Y]}{E [\frac{1 - Z}{e (X, Z)} π_{c} (X)]} .

(28)
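A sample analogue of (28) is the second term of the pure weighting estimator above. The Python sketch below computes it from control units only; the function name is hypothetical and `e_hat` is assumed to hold the estimated probability of each unit's own observed treatment.

```python
def tau0c_mr_weighting(Z, Y, e_hat, pi_c_hat, rho_c):
    """Weighted mean of scaled control outcomes, the sample analogue of (28).

    Control outcomes are scaled by gamma_c(X); the weights
    pi_c(X) / e(X, Z) are the same as in the PI-based analysis.
    """
    num = 0.0
    den = 0.0
    for z, y, e, p in zip(Z, Y, e_hat, pi_c_hat):
        if z == 0:
            weight = p / e
            gamma_c = rho_c / ((rho_c - 1.0) * p + 1.0)
            num += weight * gamma_c * y
            den += weight
    return num / den
```

Because only the outcome is modified, the same scaling can be applied before any other type C analysis that weights control units by the principal score.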

Remark 5.

When specializing to the randomized treatment setting, (28) simplifies, and one expression of the specialized version of (28) is $E [\frac{π_{c} (X) γ_{c} (X) Y}{π_{c}} | Z = 0]$, which appeared in Ding and Lu (2017, proposition 3)¹³. Based on this expression, this prior work characterizes the MR-based sensitivity analysis as an under/overweighting of the principal score by a factor of $γ_{c} (X)$. Interestingly, this characterization breaks the interpretation of $τ_{0 c}$ as a weighted average (our starting point in Lemma 1, which we have maintained throughout). Our new insight here is that $γ_{c} (X)$ appears in (28) because under A4-MR the outcome mean $μ_{0 c} (X)$ is identified by $γ_{c} (X) μ_{0} (X)$. It is thus natural to characterize the analysis as scaling the outcome by a factor of $γ_{c} (X)$. Also, by leaving the principal score weights alone, this outcome scaling technique applies to type C estimators generally, not just the pure weighting estimator.

Under A4-GOR, there is no result similar to (28) that separates $Y$ from functions of $X$, and therefore no simple modification is available for type C estimators. The pure weighting estimator ${\hat{Δ}}_{c, e π}^{P I}$ (9) (but not type C generally) can be adapted by replacing $Y$ in the second term with an estimate of $μ_{0 c}^{G O R} (X)$ (which requires estimating $μ_{0} (X)$ ). For this estimator to reduce to ${\hat{Δ}}_{c, e π}^{P I}$ when $ρ = 1$, $μ_{0} (X)$ has to be estimated by a ${\tilde{μ}}_{0} (X)$ model (defined in (17)). However, with $μ_{0} (X)$ estimated, there are other options for estimating $Δ_{c}^{GOR}$ that one might prefer to such modification, e.g., replacing the whole second term of ${\hat{Δ}}_{c, e π}^{P I}$ with $\frac{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})} {\hat{μ}}_{0 c}^{G O R} (X_{i})}{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})}}$. This yields the type A estimator ${\hat{Δ}}_{c, e μ}^{G O R}$, which inconveniently does not reduce to ${\hat{Δ}}_{c, e π}^{P I}$ when $ρ = 1$. Hence this is one place where we break the convention of respecting the primacy of the main analysis and recommend that, if a GOR-based sensitivity analysis is to be conducted, a type A (or type B) estimator be used for the main analysis.

5 |. SENSITIVITY ANALYSIS BASED ON A DIFFERENCE-TYPE SENSITIVITY PARAMETER

A4-OR, A4-GOR and A4-MR all assume that the means of $Y_{0}$ differ between compliers and noncompliers in some multiplicative manner. If one believes the difference is additive, it is more appropriate to use a sensitivity parameter that involves $μ_{01} (X) - μ_{00} (X)$ . We propose using a standardized mean difference (SMD). For convenient notation, let

σ_{0 c}^{2} (X) := \mathrm{var} (Y_{0} ∣ X, C = c), \qquad σ_{0}^{2} (X) := \mathrm{var} (Y ∣ X, Z = 0) .

A simple SMD-based assumption is

A4-SMD: \frac{μ_{01} (X) - μ_{00} (X)}{\sqrt{[σ_{01}^{2} (X) + σ_{00}^{2} (X)] / 2}} = η, for a plausible range of η .

The denominator here is an “average” standard deviation: the quadratic mean of $σ_{01} (X)$ and $σ_{00} (X)$ (the within-stratum conditional standard deviations of $Y_{0}$ ). This standard deviation scale helps in selecting a range for $η$, and users can tap into intuition about SMDs from other contexts (e.g., measuring effect size²⁹ or covariate imbalance³⁰). $η = 0$ recovers the PI case; $η = \pm 1$ indicates a substantial complier-noncomplier difference in the outcome under control.

Inconveniently, A4-SMD combined with A0-A2 only partially identifies $μ_{0 c} (X)$. For a simple sensitivity analysis, we consider the stronger assumption below, which supplements A4-SMD with an equal variance assumption:

A4-SMDe (sensitivity SMD, equal variance): A4-SMD and σ_{01}^{2} (X) = σ_{00}^{2} (X) .

5.1 |. Identification

For symmetry, let $η_{1} = η$ and $η_{0} = - η$ .

Proposition 8 (SMDe-based identification).

Under A0-A2 combined with A4-SMDe,

μ_{0 c} (X) = μ_{0} (X) + η_{c} \underbrace{\frac{π_{1 - c} (X) σ_{0} (X)}{\sqrt{1 + η^{2} π_{1} (X) π_{0} (X)}}}_{=: λ_{c} (X)} =: μ_{0 c}^{SMDe} (X),

(29)

τ_{0 c} = τ_{0 c}^{PI} + η_{c} \underbrace{\frac{E [π_{c} (X) λ_{c} (X)]}{E [π_{c} (X)]}}_{=: ξ_{c}} =: τ_{0 c}^{SMDe},

(30)

Δ_{c} = Δ_{c}^{P I} - η_{c} ξ_{c} = : Δ_{c}^{S M D e} .

(31)

If equal variance is not assumed, $Δ_{c}$ is not point identified, but bounds can be obtained. The bounds can be narrowed if one additionally assumes that $σ_{01}^{2} (X)$ and $σ_{00}^{2} (X)$ differ from each other by less than a certain factor (see Proposition 8 b in the Web Appendix).
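The point identification result (29) is easy to verify at a single covariate value. The Python sketch below, with a hypothetical function name, computes $μ_{0 c}^{SMDe} (X)$ for each stratum using $η_{1} = η$ and $η_{0} = - η$.

```python
import math


def mu0c_smde(mu0, sigma0, pi_c, eta_c):
    """mu_{0c}^{SMDe}(X) from (29).

    mu0, sigma0: mean and standard deviation of Y given X in the control arm;
    pi_c: principal score of stratum c; eta_c: eta for c = 1, -eta for c = 0.
    """
    pi_other = 1.0 - pi_c
    # eta enters the square root only through eta^2, so eta_c can be used there.
    lam = pi_other * sigma0 / math.sqrt(1.0 + eta_c ** 2 * pi_c * pi_other)
    return mu0 + eta_c * lam
```

The two stratum means returned for a given $η$ mix back to $μ_{0} (X)$, and their difference divided by the common within-stratum standard deviation $σ_{0} (X) / \sqrt{1 + η^{2} π_{1} (X) π_{0} (X)}$ equals $η$.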

5.2 |. Estimation

This sensitivity analysis requires estimating $σ_{0}^{2} (X)$ . For simplicity, in the illustration we use a quasi-likelihood approach assuming the outcome’s conditional variance is proportional to a function of its mean. An alternative is to directly model ${[Y - {\hat{μ}}_{0} (X)]}^{2}$ based on $X$ in control units.

With the simple result (31), each estimator of $Δ_{c}^{S M D e}$ we obtain is an estimator of $Δ_{c}^{P I}$ minus $η_{c}$ times an estimator of $ξ_{c}$. This is the case regardless of the type of the PI-based estimator.

5.2.1 |. Simple type A estimators

Adaptation of ${\hat{Δ}}_{c, π μ}^{P I}$ (8) and ${\hat{Δ}}_{c, e μ}^{P I}$ (9) by replacing $μ_{0} (X)$ with $μ_{0 c}^{S M D e} (X)$ yields the following estimators:

{\hat{Δ}}_{c, π μ}^{SMDe} := {\hat{Δ}}_{c, π μ}^{PI} - η_{c} \underbrace{\frac{\sum_{i = 1}^{n} \frac{{\hat{π}}_{1} (X_{i}) {\hat{π}}_{0} (X_{i}) \sqrt{{\hat{σ}}_{0}^{2} (X_{i})}}{\sqrt{1 + η^{2} {\hat{π}}_{1} (X_{i}) {\hat{π}}_{0} (X_{i})}}}{\sum_{i = 1}^{n} {\hat{π}}_{c} (X_{i})}}_{=: {\hat{ξ}}_{c, π σ}},

(32)

{\hat{Δ}}_{c, e μ}^{SMDe} := {\hat{Δ}}_{c, e μ}^{PI} - η_{c} \underbrace{\frac{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})} \frac{{\hat{π}}_{1 - c} (X_{i}) \sqrt{{\hat{σ}}_{0}^{2} (X_{i})}}{\sqrt{1 + η^{2} {\hat{π}}_{1} (X_{i}) {\hat{π}}_{0} (X_{i})}}}{\sum_{i = 1}^{n} \frac{Z_{i} I (C_{i} = c)}{\hat{e} (X_{i}, Z_{i})}}}_{=: {\hat{ξ}}_{c, e π σ}} .

(33)
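The structure of (32), a PI-based estimate minus $η_{c}$ times an estimate of $ξ_{c}$, can be sketched as follows. The Python function names are hypothetical, and the inputs are assumed to be a PI-based point estimate plus lists of fitted ${\hat{π}}_{1} (X_{i})$ and $\sqrt{{\hat{σ}}_{0}^{2} (X_{i})}$ values.

```python
import math


def xi_hat_plug_in(pi1_hat, sigma0_hat, eta, c):
    """Plug-in estimator of xi_c as in (32); c is 1 (compliers) or 0 (noncompliers)."""
    num = 0.0
    den = 0.0
    for p1, s in zip(pi1_hat, sigma0_hat):
        p0 = 1.0 - p1
        num += p1 * p0 * s / math.sqrt(1.0 + eta ** 2 * p1 * p0)
        den += p1 if c == 1 else p0
    return num / den


def delta_smde(delta_pi_hat, pi1_hat, sigma0_hat, eta, c):
    """Delta_c^{SMDe} estimate: PI-based estimate minus eta_c times xi_c estimate."""
    eta_c = eta if c == 1 else -eta
    return delta_pi_hat - eta_c * xi_hat_plug_in(pi1_hat, sigma0_hat, eta, c)
```

Evaluating `delta_smde` over a grid of `eta` values traces out the sensitivity curve; at `eta = 0` it returns the PI-based estimate unchanged.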

Rather than applying the same adaptation to the multi-step estimator ${\hat{Δ}}_{c, M S}^{P I}$ (17), thanks to the special form of $Δ_{c}^{S M D e}$ , we can adapt ${\hat{Δ}}_{c, M S}^{P I}$ the way we adapt other IF-based estimators.

5.2.2 |. IF-based estimators (including type B and multi-robust type A)

We adapt these estimators using IF-based estimators of $ξ_{c}$ . Let $ϑ (X) : = \frac{π_{1} (X) π_{0} (X) σ_{0} (X)}{\sqrt{1 + η^{2} π_{1} (X) π_{0} (X)}}$ and $ϑ : = E [ϑ (X)]$ . Then $ξ_{c} = ϑ / π_{c}$ .

Proposition 9 (SMDe-based IF).

The IFs of $ϑ$ and $ξ_{c}$ are

φ_{ϑ} (O) = \frac{1 - Z}{e (X, Z)} {\dot{ϑ}}_{σ^{2}} (X) \{{[Y - μ_{0} (X)]}^{2} - σ_{0}^{2} (X)\} + \frac{Z}{e (X, Z)} {\dot{ϑ}}_{π} (X) [C - π_{1} (X)] + ϑ (X) - ϑ,

(34)

φ_{ξ_{c}} (O) = \frac{1}{π_{c}} \{[φ_{ϑ} (O) + ϑ] - ξ_{c} [φ_{π_{c}} (O) + π_{c}]\},

(35)

where

{\dot{ϑ}}_{σ^{2}} (X) : = \frac{π_{1} (X) π_{0} (X)}{2 σ_{0} (X) \sqrt{1 + η^{2} π_{1} (X) π_{0} (X)}}, {\dot{ϑ}}_{π} (X) : = \frac{[π_{0} (X) - π_{1} (X)] [2 + η^{2} π_{1} (X) π_{0} (X)] σ_{0} (X)}{2 {[1 + η^{2} π_{1} (X) π_{0} (X)]}^{3 / 2}} .
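The two functions ${\dot{ϑ}}_{σ^{2}} (X)$ and ${\dot{ϑ}}_{π} (X)$ are the partial derivatives of $ϑ (X)$ with respect to $σ_{0}^{2} (X)$ and $π_{1} (X)$ (with $π_{0} (X) = 1 - π_{1} (X)$). A short Python sketch, using hypothetical function names, allows this to be confirmed by finite differences.

```python
import math


def theta_x(pi1, sigma0_sq, eta):
    """theta(X) = pi_1 pi_0 sigma_0 / sqrt(1 + eta^2 pi_1 pi_0)."""
    u = pi1 * (1.0 - pi1)
    return u * math.sqrt(sigma0_sq) / math.sqrt(1.0 + eta ** 2 * u)


def theta_dot_sigma2(pi1, sigma0_sq, eta):
    """Derivative of theta(X) with respect to sigma_0^2(X)."""
    u = pi1 * (1.0 - pi1)
    return u / (2.0 * math.sqrt(sigma0_sq) * math.sqrt(1.0 + eta ** 2 * u))


def theta_dot_pi(pi1, sigma0_sq, eta):
    """Derivative of theta(X) with respect to pi_1(X), holding pi_0 = 1 - pi_1."""
    u = pi1 * (1.0 - pi1)
    numer = ((1.0 - pi1) - pi1) * (2.0 + eta ** 2 * u) * math.sqrt(sigma0_sq)
    return numer / (2.0 * (1.0 + eta ** 2 * u) ** 1.5)
```

Note that ${\dot{ϑ}}_{π} (X)$ changes sign at $π_{1} (X) = 1 / 2$, where $π_{1} (X) π_{0} (X)$ is maximized.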

Based on Proposition 9, we have the estimator

{\hat{ξ}}_{c, I F} : = {\hat{ϑ}}_{I F} / {\hat{π}}_{c, I F},

(36)

where

{\hat{ϑ}}_{IF} := P_{n} \{\frac{1 - Z}{\hat{e} (X, Z)} {\hat{\dot{ϑ}}}_{σ^{2}} (X) \{{[Y - {\hat{μ}}_{0} (X)]}^{2} - {\hat{σ}}_{0}^{2} (X)\} + \frac{Z}{\hat{e} (X, Z)} {\hat{\dot{ϑ}}}_{π} (X) [C - {\hat{π}}_{1} (X)] + \hat{ϑ} (X)\}

is the IF-based estimator of $ϑ$ (where $ϑ (X)$ , ${\dot{ϑ}}_{σ^{2}} (X)$ and ${\dot{ϑ}}_{π} (X)$ are estimated by plugging in ${\hat{π}}_{1} (X)$ , ${\hat{π}}_{0} (X)$ and $\sqrt{{\hat{σ}}_{0}^{2} (X)}$ ), and ${\hat{π}}_{c, IF}$ is the IF-based estimator of $π_{c}$ (defined under (14)). In addition, we have the estimator based on Hájek-ized versions of ${\hat{ϑ}}_{IF}$ and ${\hat{π}}_{c, IF}$ ,

{\hat{ξ}}_{c, I F H} : = {\hat{ϑ}}_{I F H} / {\hat{π}}_{c, I F H} .

(37)

Then the adapted IF-based estimators are

{\hat{Δ}}_{c, I F}^{S M D e} : = {\hat{Δ}}_{c, I F}^{P I} - η_{c} {\hat{ξ}}_{c, I F},

(38)

{\hat{Δ}}_{c, I F H}^{S M D e} : = {\hat{Δ}}_{c, I F H}^{P I} - η_{c} {\hat{ξ}}_{c, I F H},

(39)

{\hat{Δ}}_{c, M S}^{S M D e} : = {\hat{Δ}}_{c, M S}^{P I} - η_{c} {\hat{ξ}}_{c, I F} .

(40)

Remark 6.

${\hat{ξ}}_{c, IF}$ and ${\hat{ξ}}_{c, IFH}$ depend on consistent estimation of $π_{c} (X)$ and $σ_{0}^{2} (X)$ (they are inconsistent if either component is inconsistent), but they have an approximate robustness property: (i) if ${\hat{π}}_{1} (X)$, ${\hat{μ}}_{0} (X)$ and $\hat{e} (X, Z)$ are consistent but ${\hat{σ}}_{0}^{2} (X)$ is not, the estimator provides a first-order correction of the bias of the plug-in estimator due to the deviation of the probability limit $σ_{0}^{2 †} (X)$ of ${\hat{σ}}_{0}^{2} (X)$ from the true $σ_{0}^{2} (X)$; and (ii) if ${\hat{μ}}_{0} (X)$, ${\hat{σ}}_{0}^{2} (X)$ and $\hat{e} (X, Z)$ are consistent but ${\hat{π}}_{c} (X)$ is not, the estimator provides a first-order correction of the bias due to the deviation of the probability limit $π_{c}^{†} (X)$ of ${\hat{π}}_{c} (X)$ from the true $π_{c} (X)$. (See details in the Web Appendix.)

5.2.3 |. Other estimators

While any PI-based estimator can be paired with any $ξ_{c}$ estimator, to keep things simple it is reasonable to pair non-IF-based estimators with either ${\hat{ξ}}_{c, π σ}$ (32) or ${\hat{ξ}}_{c, e π σ}$ (33), which are not IF-based. As outcome modeling is needed to estimate $ξ_{c}$ for the sensitivity analysis, however, we recommend switching to a type A or IF-based estimator for the PI-based main analysis.

6 |. OTHER TOPICS

6.1 |. Using data in considering the range of the MR and SMD parameters

We now return to the issue that certain sensitivity parameters may predict extreme $μ_{0 c} (X)$ values. An example concerns the outcome earnings in our illustrative study. Since earnings span a large range, it may be intuitive to think about earnings as differing in a multiplicative rather than additive manner, so a researcher may choose to use A4-MR for a sensitivity analysis. But earnings are not unbounded, and there is a maximum earnings value in the dataset, so we would be right to worry that certain sensitivity MR values may predict some $μ_{0 c} (X)$ values that are too high. A4-SMDe has the same issue (to a lesser degree), where predicted $μ_{0 c} (X)$ values may be too high or too low. A4-GOR and A4-OR, on the other hand, predict within bounds.

We can use the data to gauge what values of the MR or SMD sensitivity parameter may be extreme, if we are willing to also specify bounds for the stratum-specific conditional $Y_{0}$ means, $μ_{0 c} (X)$ . With A4-MR (and a non-negative outcome), we fix an upper bound (B) for $μ_{0 c} (X)$ . With A4-SMDe, we fix a pair of upper $(B_{h})$ and lower $(B_{l})$ bounds. These can be informed by the observed outcome distribution, but are not necessarily bounds on the outcome itself. They are required to satisfy $B \geq {\hat{μ}}_{0} (X)$ or $B_{l} \leq {\hat{μ}}_{0} (X) \leq B_{h}$ for all $X$ values in the data.

For each $X$ value, we can obtain an interval for the MR/SMD sensitivity parameter that does not predict $μ_{0 c} (X)$ outside of these assumed bounds. (This interval is derived in Web Appendix F; see Propositions 10 and 11.) We estimate such intervals for all covariate values and examine the distributions of their upper and lower ends to judge which ranges of the sensitivity parameter should not be allowed; see the application in the illustrative example in Section 7.
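For the MR parameter, such an interval can be obtained by solving $γ_{c} (X) μ_{0} (X) \leq B$ for $ρ$ in each stratum. The Python sketch below is our own direct derivation from (21) and may be presented differently from Proposition 10 in Web Appendix F; the function names are hypothetical.

```python
import math


def mu0c_mr(mu0, pi_c, rho_c):
    """mu_{0c}^{MR}(X) = gamma_c(X) * mu_0(X) from (21)."""
    return rho_c / ((rho_c - 1.0) * pi_c + 1.0) * mu0


def mr_interval(mu0, pi1, B):
    """Range of rho for which both mu_{01}^{MR} and mu_{00}^{MR} stay at or below B.

    Assumes 0 <= mu0 <= B, B > 0 and 0 < pi1 < 1. A returned lower end of 0.0
    is a limit only, since rho itself must be positive.
    """
    pi0 = 1.0 - pi1
    # Noncomplier constraint: mu0 / ((rho - 1) * pi1 + 1) <= B.
    lower = max(0.0, (mu0 - B * pi0) / (B * pi1))
    # Complier constraint: rho * mu0 / ((rho - 1) * pi1 + 1) <= B.
    slack = mu0 - B * pi1
    upper = B * pi0 / slack if slack > 0 else math.inf
    return lower, upper
```

When $B$ is large relative to $μ_{0} (X) / π_{c} (X)$ the corresponding constraint never binds, which is why the upper end can be infinite and the lower end can be zero.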

Note that while this helps guard against mathematically implausible values, it does not replace careful consideration based on substantive knowledge, which is important for deciding which range is practically plausible and relevant to the specific application.

6.2 |. Confidence interval estimation

The application in this paper estimates nuisance functions (e.g., propensity score, principal score and outcome mean) parametrically, for simplicity. All the estimators in sections 3, 4 and 5 are M-estimators. With parametric nuisance estimation, they are asymptotically normal and analytic standard errors can be derived using M-estimation calculus³¹, and the bootstrap is also valid. In our illustration below, we bootstrap and construct BCa confidence intervals³².
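For readers unfamiliar with BCa intervals, the following Python sketch shows the generic construction using only the standard library. It is a textbook version, not the implementation used for the illustration, and the function name and arguments are hypothetical. In the present setting, `statistic` would need to refit all nuisance models and recompute the estimator on each resampled dataset.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Return (estimate, lower, upper) for a BCa bootstrap confidence interval.

    data: list of units (resampled with replacement); statistic: function
    mapping a list of units to a number.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)
    boot = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    nd = NormalDist()
    # Bias-correction constant: normal quantile of the share of bootstrap
    # estimates below the point estimate (fails if that share is 0 or 1).
    z0 = nd.inv_cdf(sum(b < theta_hat for b in boot) / n_boot)
    # Acceleration constant from leave-one-out (jackknife) estimates.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = sum((jbar - j) ** 3 for j in jack) / den if den > 0 else 0.0

    def adjusted_quantile(alpha):
        z = nd.inv_cdf(alpha)
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    tail = (1.0 - level) / 2.0
    lo_idx = min(n_boot - 1, max(0, int(adjusted_quantile(tail) * n_boot)))
    hi_idx = min(n_boot - 1, max(0, int(adjusted_quantile(1.0 - tail) * n_boot)))
    return theta_hat, boot[lo_idx], boot[hi_idx]
```

The interval endpoints are percentiles of the bootstrap distribution, with the percentile levels shifted by the bias-correction and acceleration constants.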

6.3 |. Rate conditions for nonparametric estimation

With a view to inform nonparametric inference (not the focus of this paper), we derive rate conditions on nonparametric nuisance estimation for IF-based estimators (using sample splitting or cross fitting) to be $\sqrt{n}$-consistent and asymptotically normal. See Propositions 12 and 13 in Web Appendix G for these results under PI and under the sensitivity assumptions, respectively. To our knowledge, our results are the first on rate conditions for sensitivity analyses for PI violation. They show that while PI-based analysis only requires typical rate conditions on several error products of nuisance functions (e.g., ${‖{\hat{e}}_{1} (X) - e_{1} (X)‖}_{2} {‖{\hat{π}}_{c} (X) - π_{c} (X)‖}_{2} = o_{p} (n^{- 1 / 2})$ ), the sensitivity analyses require rate conditions on single nuisance functions (due to the presence of squared errors in the remainder bias term). Specifically, we require ${‖{\hat{π}}_{c} (X) - π_{c} (X)‖}_{2} = o_{p} (n^{- 1 / 4})$ with all the sensitivity analyses, and additionally ${‖{\hat{μ}}_{0} (X) - μ_{0} (X)‖}_{2} = o_{p} (n^{- 1 / 4})$ with the GOR- and SMDe-based sensitivity analyses, and ${‖ {\hat{σ}}_{0}^{2} (X) - σ_{0}^{2} (X) ‖}_{2} = o_{p} (n^{- 1 / 4})$ with the SMDe-based sensitivity analysis. These results immediately connect to the earlier results on the robustness under PI, and (partial) loss of robustness under sensitivity assumptions, of IF-based estimation.

6.4 |. Finite-sample bias

There is no ideal placement for this topic. It is easier to follow after reading the illustrative analysis in the next section, but we place it here because it is a minor topic.

Many consistent estimators are biased in finite samples. Methods to reduce such bias^33,34 are not often used, perhaps because the bias tends to be small and the correction is complicated. The data example, however, reveals an interesting pattern of bias specific to sensitivity analysis that is worth noting. It is seen with the different outcomes and different estimators. An instance of this pattern is shown in Figure 3; all instances are shown in Web Appendix H.

FIGURE 3. Point estimate and iterated bootstrap mean estimates. Plots are shown for the outcome *work for pay*.

In Figure 3 the solid black curve is the point estimate (which we refer to generically as $\hat{θ}$ ), the dashed red curve is the mean of bootstrap estimates $({\bar{\hat{θ}}}^{*})$, and the dashed orange curve is the mean of estimates from the double bootstrap (bootstrap of bootstrap samples) $({\bar{\hat{θ}}}^{* *})$. The shared pattern in all sensitivity analyses is that the slope of the ${\bar{\hat{θ}}}^{*}$ curve is less steep than that of the $\hat{θ}$ curve, and the slope of the ${\bar{\hat{θ}}}^{* *}$ curve is even less steep. (Note that the steepness of the curve indicates the degree to which sensitivity analysis estimates depart from the main analysis estimate.) For the two outcomes work and depressive symptoms, where the differences between $\hat{θ}, {\bar{\hat{θ}}}^{*}$ and ${\bar{\hat{θ}}}^{* *}$ are minimal in the main analysis, this means that in the sensitivity analysis ${\bar{\hat{θ}}}^{*}$ tends to be less extreme than $\hat{θ}$, and ${\bar{\hat{θ}}}^{* *}$ tends to be even less extreme; and this gets more pronounced the farther the sensitivity parameter is from its null value.

Finite-sample bias deserves dedicated investigation, which is outside the scope of this paper. This specific pattern, however, raises the question of why it occurs. Our intuition is that it may be due to the fact that $τ_{0 c}$ is a weighted average of $μ_{0 c} (X)$ where the weights are $π_{c} (X)$, and under sensitivity assumptions the quantity being averaged, $μ_{0 c} (X)$, depends on the weight $π_{c} (X)$. Specifically, with a fixed $μ_{0} (X)$, $μ_{0 c} (X)$ is (i) monotone decreasing in $π_{c} (X)$ for $ρ_{c} > 1$ or $η_{c} > 0$, and (ii) monotone increasing in $π_{c} (X)$ for $ρ_{c} < 1$ or $η_{c} < 0$ (see Proposition 14 in Web Appendix H). This results in a coupling of (a) any deviation (of the finite sample from the population) in the weight with (b) a deviation in the quantity being averaged, in the opposite direction for case (i) and the same direction for case (ii). The resulting finite-sample bias is an attenuation of the difference between the sensitivity analysis and main analysis estimates.

For the data example we use a bootstrap-based bias correction after conducting a focused simulation study (see Web Appendix H). This bias correction is also implemented in our R package.
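The text does not spell out the form of the correction, so the following Python sketch shows only the standard single-level bootstrap bias correction, $2 \hat{θ} - {\bar{\hat{θ}}}^{*}$, as a point of reference. It is an assumption that this resembles what the R package does; the function name is hypothetical.

```python
import random
from statistics import mean


def bootstrap_bias_corrected(data, statistic, n_boot=2000, seed=0):
    """Standard bootstrap bias correction: theta_hat - (mean of boot - theta_hat).

    data: list of units; statistic: function mapping a list of units to a number.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)
    boot = [
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    estimated_bias = mean(boot) - theta_hat
    return theta_hat - estimated_bias
```

Applied to an estimator that is attenuated in finite samples, the bootstrap mean tends to be more attenuated than the point estimate, so the correction moves the estimate away from the null.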

7 |. JOBS II ILLUSTRATION

De-identified JOBS II data were accessed from the Inter-University Consortium for Political and Social Research data archive (www.icpsr.umich.edu). Our analysis focuses on the set of participants who were identified at initial screening as being at high risk for developing depression⁵. For illustrative purposes, we further subset to participants with complete data (n=465) and treat the resulting dataset as if it were an observational study. (Due to this restriction of the sample, analysis results should be seen as merely illustrative and not taken as substantive findings.) We consider three outcomes: working for pay (binary), monthly earnings (non-negative), and depressive symptoms (a score ranging from 1 to 5) at six months post-treatment. The study has a rich set of baseline covariates including demographics, household characteristics, employment history, motivation, and depressive symptoms. Given these covariates, we assume treatment assignment ignorability. We also assume PI in the main analysis.

Table 1 summarizes the covariate distribution (i) in the full analysis sample; (ii) stratified by compliance type in the treatment group (to give a sense of $X - C$ associations); and (iii) stratified by the binary work-for-pay outcome in the control group (to give a sense of $X - Y_{0}$ associations). Compared to noncompliers, compliers were more likely to be male, White, older and have a college degree. They were more likely to have ever married and have fewer cohabiting children, and less likely to have low household income. They were more likely to have had a professional job as their last steady job, to have been unemployed for a shorter time, and to report slightly higher job-seeking and program-participation motivation. In the control condition, participants who were younger, White, more highly educated, unemployed for a shorter period, a manager at their last steady job, or reported higher motivation were more likely to be employed at six months.

TABLE 1.

Baseline covariates in (1) full analysis sample; (2) propensity-score-weighted treatment group, stratified by compliance type; and (3) propensity-score-weighted control group, stratified by outcome work for pay

	Full analysis sample (n=465)		Treatment group propensity-score-weighted				Control group propensity-score-weighted
	Full analysis sample (n=465)		compliers (n=172) (n.wt=256.6)		noncompliers (n=139) (n.wt=208.0)		work (n=96) (n.wt=303.3)		not work (n=58) (n.wt=152.1)

	mean or %	(SD) (count)	mean or %	(SD) (count)	mean or %	(SD) (count)	mean or %	(SD) (count)	mean or %	(SD) (count)
Age	36.5	(9.9)	39.0	(9.7)	33.5	(9.8)	35.2	(9.4)	38.6	(11.3)
Sex (female)	57.6%	(268)	53.4%	(137)	62.9%	(130.9)	59.4%	(180.3)	56.7%	(86.3)
Race (white)	81.7%	(380)	85.1%	(218.5)	78.6%	(163.5)	87.1%	(264.3)	73.7%	(112.1)
Education
less than high school	10.5%	(49)	7.3%	(18.8)	16.5%	(34.3)	8.5%	(25.9)	16.3%	(24.8)
high school	29.7%	(138)	26.3%	(67.5)	31.7%	(66.0)	25.6%	(77.5)	35.6%	(54.2)
some college	38.9%	(181)	37.2%	(95.5)	41.7%	(86.7)	42.6%	(129.0)	34.7%	(52.8)
Bachelor’s degree	13.1%	(61)	19.4%	(49.9)	5.7%	(11.8)	14.6%	(44.3)	9.5%	(11.4)
graduate studies	7.7%	(36)	9.8%	(25.0)	4.4%	(9.3)	8.7%	(26.4)	3.9%	(5.9)
Marital status
never married	34.4%	(160)	31.8%	(81.6)	38.0%	(79.0)	35.2%	(106.8)	35.5%	(54.0)
married	38.7%	(180)	37.3%	(95.7)	38.5%	(80.2)	35.4%	(107.3)	39.8%	(60.6)
divorced/separated/widowed	26.9%	(125)	30.9%	(79.2)	23.5%	(48.8)	29.4%	(89.2)	24.6%	(37.5)
Kids in household	0.93	(1.13)	0.85	(1.12)	0.95	(1.17)	0.80	(1.05)	0.98	(1.03)
Household income
under 15K	22.8%	(106)	19.3%	(49.4)	26.7%	(55.6)	19.0%	(57.5)	31.3%	(47.6)
15K to under 25K	24.9%	(116)	22.1%	(56.6)	29.6%	(61.6)	34.3%	(104.1)	15.1%	(23.0)
25K to under 40K	25.8%	(120)	28.6%	(73.3)	23.3%	(48.6)	25.6%	(77.7)	24.0%	(36.5)
40K to under 50K	10.8%	(50)	12.6%	(32.4)	7.7%	(16.0)	6.7%	(20.2)	15.3%	(23.2)
50K or more	15.7%	(73)	17.5%	(44.9)	12.6%	(26.2)	14.4%	(43.8)	14.3%	(21.8)
Economic hardship	3.62	(0.92)	3.52	(0.92)	3.78	(0.92)	3.73	(0.91)	3.52	(1.00)
Occupation (last steady job)
professional	18.5%	(86)	26.7%	(68.5)	9.1%	(18.9)	17.1%	(51.8)	18.8%	(28.6)
managerial	17.2%	(80)	14.7%	(37.6)	19.5%	(40.6)	18.4%	(55.9)	10.8%	(16.4)
clerical	23.4%	(109)	23.9%	(61.3)	23.2%	(48.2)	22.9%	(69.6)	26.3%	(40.1)
sales	6.5%	(30)	5.2%	(13.3)	7.7%	(16.0)	7.9%	(23.8)	3.0%	(4.6)
crafts/foremen	12.9%	(60)	13.6%	(34.8)	12.2%	(25.4)	10.9%	(33.0)	18.8%	(28.6)
operative	9.5%	(44)	5.2%	(13.3)	14.6%	(30.4)	9.5%	(28.9)	7.3%	(11.1)
labor/service	12.0%	(56)	10.8%	(27.7)	13.7%	(28.5)	13.3%	(40.3)	15.0%	(22.7)
Weeks unemployed	9.3	(11.0)	8.1	(10.3)	10.4	(11.1)	8.0	(9.4)	10.5	(12.5)
Motivation to participate	5.34	(0.80)	5.50	(0.79)	5.19	(0.78)	5.41	(0.73)	5.37	(0.83)
Job-seeking motivation	82	(17)	84	(15)	81	(19)	85	(16)	76	(17)
Job-seeking self-efficacy	3.59	(0.83)	3.48	(0.84)	3.70	(0.82)	3.66	(0.76)	3.44	(0.84)
Assertiveness	2.99	(0.82)	2.90	(0.82)	3.07	(0.82)	2.97	(0.81)	2.94	(0.79)
Depressive symptoms	2.34	(0.68)	2.34	(0.69)	2.36	(0.68)	2.42	(0.70)	2.25	(0.60)


n.wt = weighted subsample size. Ranges of continuous/interval variables: age 17 to 77; kids in household 0 to 5 (one observation >5 truncated to 5); economic hardship 1 to 5; weeks unemployed 1 to 52 (12 observations >52 truncated to 52); motivation to participate 1 to 6.5; job-seeking motivation 0 to 100; job-seeking self-efficacy 1 to 5; assertiveness 1 to 5; depressive symptoms 1 to 5.

We aim to illustrate the use of the sensitivity assumptions introduced above with the different outcomes, and show how sensitivity analysis effect estimates depart from PI-based estimates. For this purpose, any type A or type B estimator suffices. We suppose that a researcher has chosen to use the Hájek-type IF-based estimator (15) for the PI-based analysis. We will briefly describe an implementation of this estimator, and then will focus on the sensitivity analyses.

We report bias-corrected point estimates (see Section 6.4) and BCa confidence intervals.

7.1 |. PI-based main analysis

The estimator ${\hat{Δ}}_{c, 1 F H}^{P I}$ requires estimating several nuisance functions. We make relatively simple choices, keeping in mind what applied researchers may use in practice. We use logistic regression to fit the propensity score ( $e (Z, X)$ ) and principal score ( $π_{c} (X)$ ) models. These models include all baseline covariates, plus squares and square roots of continuous covariates; these additional terms are meant to improve the covariate balance obtained from principal score and inverse propensity score weighting. We check balance as suggested in ¹³. Figure 17 (in Appendix I) shows that covariate balance is improved (i) between the treated and control groups after propensity score weighting, and (ii) between treated (non)compliers and controls after principal score weighting combined with propensity score weighting.

Next, we estimate the conditional outcome mean functions for treated compliers ( $μ_{11} (X)$ ), treated noncompliers ( $μ_{10} (X)$ ) and controls ( $μ_{0} (X)$ ). With the binary outcome work for pay, we use logistic regression. For the outcome earnings, the means are estimated conditional on working using gamma regression with log link, and then multiplied by the probability of working predicted by the work for pay model. (Small detail: since we use a noncanonical link with the gamma model, the predictions are slightly mean-biased; we calibrate them by a multiplicative constant to remove this bias.) For the depressive symptoms outcome, we use a simple transformation to the [0, 1] interval (by subtracting $l = 1$ and dividing by $h - l = 4$ ), fit a quasi-logistic model to the transformed outcome to estimate the conditional means, and then transform the means back to the original scale. These models include all baseline covariates.
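The rescaling of the depressive symptoms outcome and the multiplicative calibration of the earnings predictions can be sketched as follows. This is a minimal Python illustration, not the implementation in the PIsens package; the function names and toy numbers are hypothetical, and the calibration constant is assumed to be the ratio of the (weighted) observed total to the (weighted) predicted total.

```python
def to_unit(y, l=1.0, h=5.0):
    """Shift and rescale an outcome from [l, h] to [0, 1]."""
    return (y - l) / (h - l)


def from_unit(m, l=1.0, h=5.0):
    """Map a mean on the [0, 1] scale back to the original [l, h] scale."""
    return l + (h - l) * m


def calibrate(preds, observed, weights=None):
    """Multiply predictions by one constant so that their (weighted) mean
    matches the (weighted) mean of the observed outcomes.
    Assumption: this is one plausible form of the calibration step."""
    if weights is None:
        weights = [1.0] * len(preds)
    obs_total = sum(w * y for w, y in zip(weights, observed))
    pred_total = sum(w * p for w, p in zip(weights, preds))
    k = obs_total / pred_total
    return [k * p for p in preds]


# Toy depressive symptom scores on the 1-to-5 scale (hypothetical values).
scores = [1.0, 2.5, 5.0]
unit_scores = [to_unit(y) for y in scores]
back = [from_unit(u) for u in unit_scores]

# Toy earnings and mean-biased model predictions (hypothetical values).
earnings = [900.0, 1200.0, 1500.0]
raw_preds = [950.0, 1250.0, 1400.0]
cal_preds = calibrate(raw_preds, earnings)
```

After calibration the predictions keep their relative ordering and ratios; only their overall level changes so that the average prediction agrees with the average observed outcome.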

We use targeted nuisance estimation (see Remark 2). The $π_{c} (X)$ , $μ_{11} (X)$ and $μ_{10} (X)$ models are fit to data (treated group, treated compliers and treated noncompliers, respectively) weighted by $1 / \hat{e} (Z, X)$ . The $μ_{0} (X)$ model is fit twice, to the control group weighted by ${\hat{π}}_{1} (X) / \hat{e} (Z, X)$ and weighted by ${\hat{π}}_{0} (X) / \hat{e} (Z, X)$ , for CACE and NACE estimation, respectively.
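As a rough illustration of the weighting scheme just described, the sketch below computes, for a single unit, the weight with which it would enter each nuisance model. The function and key names are hypothetical, `e_hat` is assumed to be the estimated probability of the arm actually assigned, and ${\hat{π}}_{0} (X)$ is taken to be $1 - {\hat{π}}_{1} (X)$ as in the two-stratum one-sided noncompliance setting.

```python
def targeted_weights(z, e_hat, pi1_hat):
    """Weights for fitting the nuisance models, for one unit.

    z       : assigned arm (1 = intervention, 0 = control)
    e_hat   : estimated probability of the arm actually assigned
    pi1_hat : estimated principal score, P(complier | X)
    """
    ipw = 1.0 / e_hat
    if z == 1:
        # Treated units: the principal score, mu_11 and mu_10 models
        # all use the inverse propensity weight.
        return {"treated_models": ipw}
    # Control units: mu_0 is fit twice, once per target estimand.
    return {
        "mu0_for_CACE": pi1_hat * ipw,
        "mu0_for_NACE": (1.0 - pi1_hat) * ipw,
    }


# Hypothetical inputs for one treated and one control unit.
w_treated = targeted_weights(z=1, e_hat=0.5, pi1_hat=0.6)
w_control = targeted_weights(z=0, e_hat=0.5, pi1_hat=0.6)
```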

Results (see Table 2) suggest that assignment to the intervention resulted in increased employment and earnings and decreased depressive symptoms for compliers. For noncompliers, effect estimates are close to null.

TABLE 2.

PI-based analysis results: point estimates (and 95% BCa confidence intervals)

outcome	compliers			noncompliers
outcome	mean $Y_{1} (τ_{11})$	mean $Y_{0} (τ_{01}^{P I})$	CACE ( $Δ_{1}^{P I}$ )	mean $Y_{1} (τ_{10})$	mean $Y_{0} (τ_{00}^{P I})$	NACE ( $Δ_{0}^{P I}$ )
work	75.4% (69.4, 81.4)	61.1% (53.1, 68.4)	14.3 percentage points (4.7, 23.1)	68.5% (60.7, 75.2)	64.2% (55.7, 72.2)	4.3 percentage points (−6.2, 14.2)
earnings	$1,279 (1,107, 1,452)	$1,014 (802, 1,221)	$266 (18, 530)	$928 (776, 1,115)	$835 (666, 972)	$92 (−90, 318)
depressive symptoms	1.90 (1.80, 1.99)	2.07 (1.96, 2.20)	−0.18 (−0.32, −0.04)	2.05 (1.94, 2.16)	2.02 (1.88, 2.14)	0.02 (−0.12, 0.18)


Variable work is binary. Actual earnings range is $0–5,667. Scale range of depressive symptoms is 1 to 5.

7.2 |. Sensitivity analysis

We now demonstrate sensitivity analyses that are OR-based for work for pay, MR-based for earnings, and GOR- and SMDe-based for depressive symptoms.

7.2.1 |. OR-based sensitivity analysis: work for pay

We noted above that some baseline factors such as socio-economic advantage and motivation are positively associated both with being a complier $(C)$ and with the work for pay outcome under control $(Y_{0})$ . One might be concerned that, within subpopulations homogeneous in the observed covariates, there are unobserved advantage-type factors that relate to $C$ and $Y_{0}$ in a similar way; in that case the PI-based analysis might have overestimated the CACE and underestimated the NACE. On the other hand, one might be concerned that among people with the same $X$ , some may not have needed to participate in the training because they had good prospects of finding a job; in that case the PI-based analysis might have been biased in the opposite direction. We thus consider a range of sensitivity OR values spanning both sides of 1. The results of this sensitivity analysis (Figure 4, top left) suggest that, even if (within levels of $X$ ) compliers had double the odds (relative to noncompliers) of getting work without the intervention, the intervention’s effect on having work for compliers would still be positive.
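To make the mechanics concrete, the sketch below splits a control-arm outcome probability into complier and noncomplier components for a given sensitivity OR. It assumes the OR-based sensitivity assumption has the form odds of $μ_{01} (X)$ over odds of $μ_{00} (X)$ equals $ρ$ , combined with the mixture identity $π_{1} (X) μ_{01} (X) + π_{0} (X) μ_{00} (X) = μ_{0} (X)$ ; the function name and the input values are hypothetical.

```python
import math


def split_by_or(mu0, pi_c, rho):
    """Split a control-arm mean mu0 into complier and noncomplier means.

    Solves  pi_c * mu01 + (1 - pi_c) * mu00 = mu0
    subject to  odds(mu01) / odds(mu00) = rho,  with 0 < pi_c < 1, rho > 0.
    """
    if rho == 1.0:
        return mu0, mu0  # no PI violation: both strata share mu0
    a = pi_c * (rho - 1.0)
    b = 1.0 + (pi_c + mu0) * (rho - 1.0)
    c = mu0 * rho
    # Root of a*x^2 - b*x + c = 0 that lies in [0, 1].
    mu01 = (b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    mu00 = (mu0 - pi_c * mu01) / (1.0 - pi_c)
    return mu01, mu00


# Hypothetical inputs: 60% would work under control, half are compliers,
# and compliers have double the odds of working relative to noncompliers.
mu01, mu00 = split_by_or(mu0=0.6, pi_c=0.5, rho=2.0)  # roughly 0.68 and 0.52
```

With $ρ > 1$ the complier mean under control moves above $μ_{0} (X)$ , which pulls the CACE estimate down and pushes the NACE estimate up relative to the PI-based analysis.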

FIGURE 4.

Sensitivity analysis results: point estimates and 95% point-wise CIs for CACE, NACE and stratum-specific potential outcome means, for the range of the sensitivity parameter.

7.2.2 |. GOR-based sensitivity analysis: depressive symptoms

A concern may be that even among people with the same baseline covariate values (including baseline depressive symptoms score), compliers may be those who were more robust in some way (e.g., better at getting out of bed in the morning), and therefore may have a better outcome (i.e., lower depressive symptoms at six months) under control than noncompliers. Therefore we consider sensitivity GOR values smaller than 1 (Figure 4, bottom left). The CACE estimate is quite sensitive to PI violation. It is negative (indicating a reduction in depressive symptoms) under PI, but as the sensitivity GOR deviates only slightly from 1, it quickly approaches zero.
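A GOR-based split can be sketched by rescaling the mean to the unit interval, solving the odds ratio problem there, and mapping back, which is why the implied stratum means stay within the outcome bounds. The sketch assumes the GOR has the form $\frac{[μ_{01} (X) - l] / [h - μ_{01} (X)]}{[μ_{00} (X) - l] / [h - μ_{00} (X)]} = ρ$ ; the function name, the numerical root-finding by bisection, and the input values are illustrative choices, not taken from the paper.

```python
def split_by_gor(mu0, pi_c, rho, l, h, tol=1e-12):
    """Split mu0 (on the [l, h] scale) under a generalized odds ratio rho."""
    m = (mu0 - l) / (h - l)  # rescale the mean to [0, 1]

    def gap(b):
        # Complier mean implied by noncomplier mean b and odds ratio rho,
        # then the distance of the implied mixture mean from m.
        a = rho * b / (1.0 - b + rho * b)
        return pi_c * a + (1.0 - pi_c) * b - m

    lo, hi = 0.0, 1.0  # gap is increasing in b, gap(0) <= 0 <= gap(1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    a = rho * b / (1.0 - b + rho * b)
    # Map both stratum means back to the original scale: (mu01, mu00).
    return l + (h - l) * a, l + (h - l) * b


# Hypothetical inputs on the 1-to-5 depressive symptoms scale.
g01, g00 = split_by_gor(mu0=2.07, pi_c=0.55, rho=0.5, l=1.0, h=5.0)
```

With a GOR below 1, the implied complier mean under control is lower than the noncomplier mean, matching the direction of PI violation considered in this subsection.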

7.2.3 |. MR-based sensitivity analysis: earnings

With this outcome, we use the MR sensitivity parameter. To illustrate the method as it would typically be used, we treat earnings as a stand-alone outcome, using ${\hat{μ}}_{0} (X)$ as the only input, putting aside its connection with the work for pay outcome.

We start with a tentative MR range from 1/3 to 3, which is covered in Figure 4 (top right). As mentioned earlier, it is challenging to choose what range to consider for the sensitivity parameter. Most important to this decision is substantive knowledge, including opinions of experts and study staff (who might know participants better than what is captured in the covariates). Such knowledge should be used, whenever it is available, to help rule in which range of the sensitivity parameter is practical and relevant.

As discussed in Section 6.1, the data can help rule out some implausible ranges. Here we simply use the maximum reported earnings under control ($5,667) as the upper bound B for $μ_{0 c} (X)$ . After computing covariate-specific “legal” intervals for the sensitivity parameter, we use their end points to make the plot on the left in Figure 5, which shows the proportion of the sample with either ${\hat{μ}}_{01} (X)$ or ${\hat{μ}}_{00} (X)$ exceeding B under each MR value. We do not restrict the MR range based on this plot, as it suggests limited bound contradiction. (Alternatives include (i) restricting the MR range, or (ii) modifying the assumption to let the MR be $ρ$ for $X$ values where $ρ$ is in the legal interval, and otherwise be the legal value closest to $ρ$ .)
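The bound check described here can be sketched as follows, assuming the MR is defined as $μ_{01} (X) / μ_{00} (X) = ρ$ and that only an upper bound B is imposed (with $0 ≤ μ_{0} (X) ≤ B$ ). The function names and the three toy units are hypothetical; only the value 5,667 is taken from the text.

```python
def split_by_mr(mu0, pi_c, rho):
    """Solve pi_c*mu01 + (1 - pi_c)*mu00 = mu0 with mu01 / mu00 = rho."""
    mu00 = mu0 / (pi_c * rho + 1.0 - pi_c)
    return rho * mu00, mu00


def legal_interval(mu0, pi_c, upper):
    """Range of rho for which neither stratum mean exceeds `upper`."""
    # mu00 <= upper  <=>  rho >= (mu0 - upper*(1 - pi_c)) / (upper * pi_c)
    low = max(0.0, (mu0 - upper * (1.0 - pi_c)) / (upper * pi_c))
    # mu01 <= upper  <=>  rho * (mu0 - upper*pi_c) <= upper * (1 - pi_c)
    slack = mu0 - upper * pi_c
    high = upper * (1.0 - pi_c) / slack if slack > 0 else float("inf")
    return low, high


def share_violating(units, rho, upper):
    """Proportion of units whose implied mu01 or mu00 exceeds `upper`."""
    bad = sum(1 for mu0, pi_c in units
              if max(split_by_mr(mu0, pi_c, rho)) > upper)
    return bad / len(units)


B = 5667.0  # maximum reported earnings under control
# Toy (mu0_hat, pi_c_hat) pairs, hypothetical values.
units = [(1000.0, 0.6), (4000.0, 0.5), (5000.0, 0.7)]
share_at_3 = share_violating(units, rho=3.0, upper=B)
low, high = legal_interval(4000.0, 0.5, B)
```

Evaluating `share_violating` over a grid of MR values gives the kind of curve the diagnostic plot displays; the end points returned by `legal_interval` mark where each unit starts to contribute to it.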

FIGURE 5.

Bounds violation diagnostic: proportion contradicting bounds as a function of the sensitivity parameter

Another way to rely on the data is to examine what the MR values imply about the distributions of $μ_{01} (X)$ values among compliers and of $μ_{00} (X)$ values among noncompliers. Figure 18 (in Appendix I) plots these implied distributions for several MR values on [1/3, 3]. To judge whether such distributions are plausible, again, one should rely on substantive knowledge if possible. Also, very large $μ_{01} (X)$ or $μ_{00} (X)$ values (especially those substantially larger than the maximum reported earnings) are suspect. Based on this, one might consider excluding MR values at the low end (1/3) and at the high end (≥ 2).

Another possibility is to supplement the A4-MR with other assumptions based on substantive knowledge. Suppose, for example, that substantive experts think it is unlikely that being assigned to the intervention is harmful to noncompliers (a relaxation of the ER assumption). Based on the results plot in Figure 4, this would narrow attention to the MR range above 1/2.

7.2.4 |. SMDe-based sensitivity analysis: depressive symptoms

Suppose that for the depressive symptoms outcome, an investigator prefers a sensitivity analysis based on A4-SMDe, being more comfortable communicating about mean differences. Here also, we consider a sensitivity SMD range to the left of the null value, where within $X$ levels, complier and noncomplier outcome means under control may differ by up to one standard deviation. Results (Figure 4, bottom right) look similar to those from the GOR-based sensitivity analysis, although using a different sensitivity parameter.

We note two details. First, this sensitivity analysis requires estimating the conditional variance $σ_{0}^{2} (X)$ . Using the quasi-likelihood approach, we assume that $σ_{0}^{2} (X)$ is proportional to $[μ_{0} (X) - l] [h - μ_{0} (X)]$ . This is equivalent to assuming that the outcome, after being shifted and rescaled to the [0, 1] interval, follows a quasibinomial model conditional on covariates. Recall that in the PI-based analysis, we transformed this outcome to the [0, 1] interval and fit a model with logit link. We now manually extract the dispersion parameter $\hat{ϕ}$ from this model and use it to compute the variance estimate ${\hat{σ}}_{0}^{2} (X) = \hat{ϕ} [{\hat{μ}}_{0} (X) - l] [h - {\hat{μ}}_{0} (X)]$ . Second, the plot on the right of Figure 5 shows that for the SMD range considered there is minimal contradiction with the $μ_{0 c} (X)$ bounds, which here are simply set to the minimum and maximum depressive symptom scores. This is expected, as we consider a modest SMD range.
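The variance formula above and an SMD-based split can be sketched together. The split assumes that A4-SMDe sets $[μ_{01} (X) - μ_{00} (X)] / s (X) = η$ with a common within-stratum standard deviation $s (X)$ , so that the mixture variance decomposition $σ_{0}^{2} (X) = s^{2} (X) + π_{1} (X) π_{0} (X) [μ_{01} (X) - μ_{00} (X)]^{2}$ applies; the function names, the dispersion value and the other inputs are hypothetical.

```python
import math


def control_variance(mu0, phi, l=1.0, h=5.0):
    """Quasi-binomial-type variance: phi * (mu0 - l) * (h - mu0)."""
    return phi * (mu0 - l) * (h - mu0)


def split_by_smd(mu0, pi_c, eta, sigma0_sq):
    """Split mu0 given an SMD eta and the total conditional variance.

    Assumes both strata share one within-stratum variance s^2, so that
    sigma0_sq = s^2 + pi_c*(1 - pi_c)*(mu01 - mu00)^2 and mu01 - mu00 = eta*s.
    """
    diff = eta * math.sqrt(sigma0_sq / (1.0 + eta * eta * pi_c * (1.0 - pi_c)))
    return mu0 + (1.0 - pi_c) * diff, mu0 - pi_c * diff


# Hypothetical inputs: dispersion 0.15, control mean 2.0, SMD of -0.5.
var0 = control_variance(mu0=2.0, phi=0.15)
s01, s00 = split_by_smd(mu0=2.0, pi_c=0.55, eta=-0.5, sigma0_sq=var0)
```

Because the difference in means is bounded by a multiple of the estimated standard deviation, a modest SMD range moves the stratum means only slightly, which is consistent with the minimal bound contradiction noted above.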

We do not conduct an SMDe-based sensitivity analysis for the outcome earnings, because the equal variance part of A4-SMDe is likely grossly incorrect for that outcome.

8 |. DISCUSSION

This paper substantially expands options for sensitivity analysis for PI violation in the estimation of complier and noncomplier average causal effects in two ways. First, we consider several sensitivity models with different sensitivity parameters (OR, GOR, MR, SMD) suitable for different outcome types and reflecting different ways compliers and noncompliers may differ with respect to outcome under control. Second, rather than proposing one estimator under the sensitivity model, we tailor sensitivity analysis techniques to different types of estimators (outcome regression, IF-based and weighting) that may be used for the PI-based main analysis.

There are several future directions for this line of sensitivity analysis. One is to incorporate data-adaptive nuisance estimation. As noted, the robustness available for PI-based analysis via IF-based estimation is partially lost for sensitivity analysis, making it more important that we estimate nuisance functions well. We provide rate conditions, but otherwise leave this to future work. Also important is how to handle missing data. Missing-at-random cases can be handled by standard techniques, but given the difference in compliance type observability between treatment arms, one may wish to allow certain not-at-random missingness, e.g., outcome missingness that depends on compliance type³⁵. Another extension is to adapt the methods to accommodate two-sided noncompliance and non-binary $S$ , which are also common settings.

For the two-sided noncompliance case, extension is conceptually straightforward: wherever a PI assumption is used to disentangle a mixture it can be replaced with a sensitivity assumption. With binary $Z$ and $S$ there are four mixtures, so if PI assumptions are invoked to disentangle all four, then replacing those assumptions requires four sensitivity parameters. If one assumes away one principal stratum (say, defiers) to identify stratum prevalences and covariate distributions, then two mixtures remain, which means a PI-based analysis requires two PI assumptions and the sensitivity analysis involves two sensitivity parameters – see ²¹ for a sensitivity analysis using two MR parameters. While the idea is simple, work needs to be done to consider different (types of) PI-based estimators and pair them with sensitivity analysis techniques.

The methods in this paper belong to a mean-centric approach to sensitivity analysis. Each assumes a connection between the two conditional outcome mean functions of compliers and noncompliers. For a binary outcome, the sensitivity analysis based on A4-OR fully respects the observed outcome distribution. For continuous outcomes, however, the sensitivity analyses based on A4-GOR, A4-MR and A4-SMDe alone may conflict with the observed outcome distribution. The MR-based model may predict out of range because it treats the outcome as unbounded. The other two methods use some additional information: the GOR-based model takes in user-specified outcome bounds and respects those bounds; the SMDe-based model is informed about conditional outcome variability and with that information offers a scale-free sensitivity parameter. To mitigate the out-of-range prediction problem that affects the MR-based method and, to a lesser degree, the SMDe-based method, we propose a simple technique that requires an additional assumption of bounds on stratum-specific conditional outcome means. There remains, however, the risk of more subtle conflict (e.g., predicting mean outcome in the tail of the distribution). A different approach is to avoid conflicting with the observed data distribution²⁶⁻²⁸ altogether by anchoring on the conditional distribution of the outcome under control rather than just its mean plus bounds/variance. Such sensitivity analysis (described briefly in the preprint³⁶) will be presented in a separate manuscript.

One last comment:

This paper provides technical solutions for doing sensitivity analysis, but does not address how to choose a relevant range for the sensitivity parameter and how to elicit and use expert opinion for this purpose. This is a topic that should receive more attention.

Supplementary Material

Appendix

NIHMS2065332-supplement-Appendix.pdf (2MB, pdf)

ACKNOWLEDGMENTS

This work is partially supported by grants R03MH128634, R01MH115487 and U24OD023382 from the National Institutes of Health, and N00014-21-1-2820 from the Office of Naval Research. Its depth and clarity benefited from helpful feedback from several anonymous reviewers. TQN thanks Drs. Ilya Shpitser, Bonnie Smith and Razieh Nabi for helpful discussions about influence functions, and Drs. Constantine Frangakis and Scott Zeger for thought-provoking comments at an early presentation of this work. The authors appreciate the participants, staff and investigators of the JOBS II study, and the ICPSR data archive.

Abbreviations:

CACE: complier average causal effect
NACE: noncomplier average causal effect
ER: exclusion restriction
PI: principal ignorability
MR: mean ratio
OR: odds ratio
GOR: generalized odds ratio
SMD: standardized mean difference
IF: influence function

Footnotes

CONFLICT OF INTEREST STATEMENT

The authors declare that they have no conflict of interests.

DATA AND CODE AVAILABILITY STATEMENT

The de-identified JOBS II data used in this paper can be requested from the Inter-University Consortium for Political and Social Research data archive at www.icpsr.umich.edu. All code to produce the results in this paper and to implement the proposed methods is provided in the R-package PIsens available at https://github.com/trangnguyen74/PIsens.

SUPPORTING INFORMATION

The Web Appendix may be found in the online version of the article at the publisher’s website, and also included on the next pages.

References

1. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58(1):21–29. doi: 10.1111/j.0006-341X.2002.00021.x
2. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66(5):688–701. doi: 10.1037/h0037350
3. Rubin DB. Causal inference through potential outcomes and principal stratification: Application to studies with “censoring” due to death. Statistical Science. 2006;21(3):299–309. doi: 10.1214/088342306000000114
4. Griffin BA, McCaffrey DF, Morral AR. An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs. The Annals of Applied Statistics. 2008;2(3):1034–1055. doi: 10.1214/08-AOAS179
5. Vinokur AD, Price RH, Schul Y. Impact of the JOBS intervention on unemployed workers varying in risk for depression. American Journal of Community Psychology. 1995;23(1):39–74. doi: 10.1007/BF02506922
6. Gruenewald TL, Tanner EK, Fried LP, et al. The Baltimore Experience Corps Trial: Enhancing generativity via intergenerational activity engagement in later life. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences. 2016;71(4):661–670. doi: 10.1093/geronb/gbv005
7. Daumit GL, Dickerson FB, Wang NY, et al. A behavioral weight-loss intervention in persons with serious mental illness. New England Journal of Medicine. 2013;368(17):1594–1602. doi: 10.1056/NEJMoa1214530
8. Angrist JD, Imbens GW. Two-stage least squares estimation of average causal effects in models with variable treatment intensity. Journal of the American Statistical Association. 1995;90(430):431–442. doi: 10.1080/01621459.1995.10476535
9. Marshall J. Coarsening bias: How coarse treatment measurement upwardly biases instrumental variable estimates. Political Analysis. 2016;24(2):157–171. doi: 10.1093/pan/mpw007
10. Andresen ME, Huber M. Instrument-based estimation with binarised treatments: issues and tests for the exclusion restriction. The Econometrics Journal. 2021;24(3):536–558. doi: 10.1093/ectj/utab002
11. Feller A, Mealli F, Miratrix L. Principal score methods: Assumptions, extensions, and practical considerations. Journal of Educational and Behavioral Statistics. 2017;42(6):726–758. doi: 10.3102/1076998617719726
12. Jo B, Stuart EA. On the use of propensity scores in principal causal effect estimation. Statistics in Medicine. 2009;28(23):2857–2875. doi: 10.1002/sim.3669
13. Ding P, Lu J. Principal stratification analysis using principal scores. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 2017;79(3):757–777. doi: 10.1111/rssb.12191
14. Jiang Z, Ding P. Identification of causal effects within principal strata using auxiliary variables. Statistical Science. 2021;36(4):1–49. doi: 10.1214/20-STS810
15. Stuart EA, Jo B. Assessing the sensitivity of methods for estimating principal causal effects. Statistical Methods in Medical Research. 2015;24(6):657–674. doi: 10.1177/0962280211421840
16. Jo B, Vinokur AD. Sensitivity analysis and bounding of causal effects with alternative identifying assumptions. Journal of Educational and Behavioral Statistics. 2011;36(4):415–440. doi: 10.3102/1076998610383985
17. Wang C, Zhang Y, Mealli F, Bornkamp B. Sensitivity analyses for the principal ignorability assumption using multiple imputation. Pharmaceutical Statistics. 2023;22(1):64–78. doi: 10.1002/pst.2260
18. Schwartz S, Li F, Reiter JP. Sensitivity analysis for unmeasured confounding in principal stratification settings with binary variables. Statistics in Medicine. 2012;31(10):949–962. doi: 10.1002/sim.4472
19. Mercatanti A, Li F. Do debit cards decrease cash demand?: Causal inference and sensitivity analysis using principal stratification. Journal of the Royal Statistical Society. Series C: Applied Statistics. 2017;66(4):759–776. doi: 10.1111/rssc.12193
20. Baiocchi M, Cheng J, Small DS. Instrumental variable methods for causal inference. Statistics in Medicine. 2014;33(13):2297–2340. doi: 10.1002/sim.6128
21. Jiang Z, Yang S, Ding P. Multiply robust estimation of causal effects under principal ignorability. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 2022;84(4):1423–1445. doi: 10.1111/rssb.12538
22. McConnell S, Stuart EA, Devaney B. The truncation-by-death problem: What to do in an experimental evaluation when the outcome is not always defined. Evaluation Review. 2008;32(2):157–186. doi: 10.1177/0193841X07309115
23. Hájek J. Comment on “An essay on the logical foundations of survey sampling, part one” by Basu, D. In: Toronto: Holt, Rinehart, and Winston, 1971:236.
24. Wang B, Ogburn EL, Rosenblum M. Analysis of covariance in randomized trials: More precision and valid confidence intervals, without model assumptions. Biometrics. 2019;75(4):1391–1400. doi: 10.1111/biom.13062
25. Steingrimsson JA, Hanley DF, Rosenblum M. Improving precision by adjusting for prognostic baseline variables in randomized trials with binary outcomes, without regression model assumptions. Contemporary Clinical Trials. 2017;54:18–24. doi: 10.1016/j.cct.2016.12.026
26. Franks AM, D’Amour A, Feller A. Flexible sensitivity analysis for observational studies without observable implications. Journal of the American Statistical Association. 2020;115(532):1730–1746. doi: 10.1080/01621459.2019.1604369
27. Scharfstein DO, Nabi R, Kennedy EH, Huang MY, Bonvini M, Smid M. Semiparametric sensitivity analysis: Unmeasured confounding in observational studies. 2021. arXiv:2104.08300.
28. Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: New York, NY: Springer New York, 2000:1–94.
29. Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York: Routledge. 2nd ed., 1988.
30. Stuart EA. Matching methods for causal inference: A review and a look forward. Statistical Science. 2010;25(1). doi: 10.1214/09-STS313
31. Stefanski LA, Boos DD. The calculus of M-estimation. The American Statistician. 2002;56(1):29–38. doi: 10.1198/000313002753631330
32. Efron B. Better bootstrap confidence intervals. Journal of the American Statistical Association. 1987;82(397):171–185. doi: 10.1080/01621459.1987.10478410
33. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. CRC Press, 1994. Google-Books-ID: gLlpIUxRntoC.
34. Chang J, Hall P. Double-bootstrap methods that use a single double-bootstrap simulation. Biometrika. 2015;102(1):203–214. doi: 10.1093/biomet/asu060
35. Frangakis C, Rubin DB. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika. 1999;86(2):365–379. doi: 10.1093/biomet/86.2.365
36. Nguyen TQ, Stuart EA, Scharfstein DO, Ogburn EL. Sensitivity analysis for principal ignorability violation in estimating complier and noncomplier average causal effects. 2023. arXiv:2303.05052v1 (preprint version 1).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Appendix

NIHMS2065332-supplement-Appendix.pdf^{(2MB, pdf)}

Data Availability Statement

SUPPORTING INFORMATION

The Web Appendix may be found in the online version of the article at the publisher’s website, and also included on the next pages.

[R1] 1.Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58(1):21–29. doi: 10.1111/j.0006-341X.2002.00021.x [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] 2.Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies.. Journal of Educational Psychology. 1974;66(5):688–701. doi: 10.1037/h0037350 [DOI] [Google Scholar]

[R3] 3.Rubin DB. Causal inference through potential outcomes and principal stratification: Application to studies with “censoring” due to death. Statistical Science. 2006;21(3):299–309. doi: 10.1214/088342306000000114 [DOI] [Google Scholar]

[R4] 4.Griffin BA, McCaffrey DF, Morral AR. An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs. The Annals of Applied Statistics. 2008;2(3):1034–1055. doi: 10.1214/08-AOAS179 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] 5.Vinokur AD, Price RH, Schul Y. Impact of the JOBS intervention on unemployed workers varying in risk for depression. American Journal of Community Psychology. 1995;23(1):39–74. doi: 10.1007/BF02506922 [DOI] [PubMed] [Google Scholar]

[R6] 6.Gruenewald TL, Tanner EK, Fried LP, et al. The Baltimore Experience Corps Trial: Enhancing generativity via intergenerational activity engagement in later life. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences. 2016;71(4):661–670. doi: 10.1093/geronb/gbv005 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Daumit GL, Dickerson FB, Wang NY, et al. A behavioral weight-loss intervention in persons with serious mental illness. New England Journal of Medicine. 2013;368(17):1594–1602. doi: 10.1056/NEJMoa1214530 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R8] 8.Angrist JD, Imbens GW. Two-stage least squares estimation of average causal effects in models with variable treatment intensity. Journal of the American Statistical Association. 1995;90(430):431–442. doi: 10.1080/01621459.1995.10476535 [DOI] [Google Scholar]

[R9] 9.Marshall J Coarsening bias: How coarse treatment measurement upwardly biases instrumental variable estimates. Political Analysis. 2016;24(2):157–171. doi: 10.1093/pan/mpw007 [DOI] [Google Scholar]

[R10] 10.Andresen ME, Huber M. Instrument-based estimation with binarised treatments: issues and tests for the exclusion restriction. The Econometrics Journal. 2021;24(3):536–558. doi: 10.1093/ectj/utab002 [DOI] [Google Scholar]

[R11] 11.Feller A, Mealli F, Miratrix L. Principal score methods: Assumptions, extensions, and practical considerations. Journal of Educational and Behavioral Statistics. 2017;42(6):726–758. doi: 10.3102/1076998617719726 [DOI] [Google Scholar]

[R12] 12.Jo B, Stuart EA. On the use of propensity scores in principal causal effect estimation. Statistics in Medicine. 2009;28(23):2857–2875. doi: 10.1002/sim.3669 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] 13.Ding P, Lu J. Principal stratification analysis using principal scores. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 2017;79(3):757–777. doi: 10.1111/rssb.12191 [DOI] [Google Scholar]

[R14] 14.Jiang Z, Ding P. Identification of causal effects within principal strata using auxiliary variables. Statistical Science. 2021;36(4):1–49. doi: 10.1214/20-STS810 [DOI] [Google Scholar]

[R15] 15.Stuart EA, Jo B. Assessing the sensitivity of methods for estimating principal causal effects. Statistical Methods in Medical Research. 2015;24(6):657–674. doi: 10.1177/0962280211421840 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] 16.Jo B, Vinokur AD. Sensitivity analysis and bounding of causal effects with alternative identifying assumptions. Journal of Educational and Behavioral Statistics. 2011;36(4):415–440. doi: 10.3102/1076998610383985 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R17] 17.Wang C, Zhang Y, Mealli F, Bornkamp B. Sensitivity analyses for the principal ignorability assumption using multiple imputation. Pharmaceutical Statistics. 2023;22(1):64–78. doi: 10.1002/pst.2260 [DOI] [PubMed] [Google Scholar]

[R18] 18.Schwartz S, Li F, Reiter JP. Sensitivity analysis for unmeasured confounding in principal stratification settings with binary variables. Statistics in Medicine. 2012;31(10):949–962. doi: 10.1002/sim.4472 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] 19. Mercatanti A, Li F. Do debit cards decrease cash demand?: Causal inference and sensitivity analysis using principal stratification. Journal of the Royal Statistical Society. Series C: Applied Statistics. 2017;66(4):759–776. doi: 10.1111/rssc.12193

[R20] 20. Baiocchi M, Cheng J, Small DS. Instrumental variable methods for causal inference. Statistics in Medicine. 2014;33(13):2297–2340. doi: 10.1002/sim.6128

[R21] 21. Jiang Z, Yang S, Ding P. Multiply robust estimation of causal effects under principal ignorability. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 2022;84(4):1423–1445. doi: 10.1111/rssb.12538

[R22] 22. McConnell S, Stuart EA, Devaney B. The truncation-by-death problem: What to do in an experimental evaluation when the outcome is not always defined. Evaluation Review. 2008;32(2):157–186. doi: 10.1177/0193841X07309115

[R23] 23. Hájek J. Comment on “An essay on the logical foundations of survey sampling, part one” by Basu, D. In: Toronto: Holt, Rinehart, and Winston, 1971:236.

[R24] 24. Wang B, Ogburn EL, Rosenblum M. Analysis of covariance in randomized trials: More precision and valid confidence intervals, without model assumptions. Biometrics. 2019;75(4):1391–1400. doi: 10.1111/biom.13062

[R25] 25. Steingrimsson JA, Hanley DF, Rosenblum M. Improving precision by adjusting for prognostic baseline variables in randomized trials with binary outcomes, without regression model assumptions. Contemporary Clinical Trials. 2017;54:18–24. doi: 10.1016/j.cct.2016.12.026

[R26] 26. Franks AM, D’Amour A, Feller A. Flexible sensitivity analysis for observational studies without observable implications. Journal of the American Statistical Association. 2020;115(532):1730–1746. doi: 10.1080/01621459.2019.1604369

[R27] 27. Scharfstein DO, Nabi R, Kennedy EH, Huang MY, Bonvini M, Smid M. Semiparametric sensitivity analysis: Unmeasured confounding in observational studies. 2021. arXiv:2104.08300.

[R28] 28. Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: New York, NY: Springer New York, 2000:1–94.

[R29] 29. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. New York: Routledge, 1988.

[R30] 30. Stuart EA. Matching methods for causal inference: A review and a look forward. Statistical Science. 2010;25(1). doi: 10.1214/09-STS313

[R31] 31. Stefanski LA, Boos DD. The calculus of M-estimation. The American Statistician. 2002;56(1):29–38. doi: 10.1198/000313002753631330

[R32] 32. Efron B. Better bootstrap confidence intervals. Journal of the American Statistical Association. 1987;82(397):171–185. doi: 10.1080/01621459.1987.10478410

[R33] 33. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. CRC Press, 1994.

[R34] 34. Chang J, Hall P. Double-bootstrap methods that use a single double-bootstrap simulation. Biometrika. 2015;102(1):203–214. doi: 10.1093/biomet/asu060

[R35] 35. Frangakis C, Rubin DB. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika. 1999;86(2):365–379. doi: 10.1093/biomet/86.2.365

[R36] 36. Nguyen TQ, Stuart EA, Scharfstein DO, Ogburn EL. Sensitivity analysis for principal ignorability violation in estimating complier and noncomplier average causal effects. 2023. arXiv:2303.05052v1 (preprint version 1).
