American Journal of Epidemiology. 2023 Nov 3;193(3):536–547. doi: 10.1093/aje/kwad209

The Importance of Equity Value Judgments and Estimator-Estimand Alignment in Measuring Disparity and Identifying Targets to Reduce Disparity

Ting-Hsuan Chang, Trang Quynh Nguyen, John W Jackson
PMCID: PMC10911841  PMID: 37939055

Abstract

The choice of which covariates to adjust for (so-called allowability designation (AD)) in health disparity measurements reflects value judgments about inequitable versus equitable sources of health differences, which is paramount for making inferences about disparity. Yet, many off-the-shelf estimators used in health disparity research are not designed with equity considerations in mind, and they imply different ADs. We demonstrated the practical importance of incorporating equity concerns in disparity measurements through simulations, motivated by the example of reducing racial disparities in hypertension control via interventions on disparities in treatment intensification. Seven causal decomposition estimators, each with a particular AD (with respect to disparities in hypertension control and treatment intensification), were considered to estimate the observed outcome disparity and the reduced/residual disparity under the intervention. We explored the implications for bias of the mismatch between equity concerns and the AD in the estimator under various causal structures (through altering racial differences in covariates or the confounding mechanism). The estimator that correctly reflects equity concerns performed well under all scenarios considered, whereas the other estimators were shown to have the risk of yielding large biases in certain scenarios, depending on the interaction between their ADs and the specific causal structure.

Keywords: causal inference, decomposition, disparities, equity

Abbreviations

AD: allowability designation
CI: confidence interval
COVID-19: coronavirus disease 2019
RMPW: ratio of mediator probability weighting
RMSE: root mean squared error
SES: socioeconomic status

Health disparities are often defined as “systematic, plausibly avoidable health differences adversely affecting socially disadvantaged groups” (1, p. S151). The measurement of disparity involves value judgments about inequitable versus equitable sources of health differences. It is well recognized in the epidemiology literature that several choices about disparity measurement (for example, scale, group weighting, or choice of reference group) reflect certain perspectives and may yield different conclusions about disparity. The choice of what variables to adjust for is an important dimension, because it designates the sources of difference that a disparity measure captures (2–6). For example, media coverage of racial inequities in the coronavirus disease 2019 (COVID-19) pandemic often discussed the racial differences in preexisting health conditions as a reasonable explanation for the disparate COVID-19 rates across racial groups, without attending to the sources of those differences or their implications for reducing disparities in COVID-19 rates (7). Another example concerns the Centers for Disease Control and Prevention’s release of COVID-19 surveillance data with weights that, when applied, adjust for geography and thus remove the contribution of racial residential segregation from racial disparities in COVID-19 outcomes at the national level (8, 9). Because disparity serves as a measure of injustice in society, being explicit about what a disparity measure captures, and about the equity value judgments behind that choice, is paramount for making inferences about disparity.

Many off-the-shelf estimators used in health disparity research are not designed with equity value judgments in mind. Yet, they differ in what they adjust for when measuring disparity and identifying potential targets for intervention (3). If the investigator or stakeholders have equity considerations that are not reflected in the estimator, one can view this discrepancy as a form of bias. Although this may be appreciated in theory, as has been discussed extensively by Jackson (3), it is unclear how large this bias may be in practice where the causal structure varies from setting to setting.

Here, we demonstrate the practical importance of aligning estimators in health disparity research to underlying value judgments about health equity. We will illustrate this issue through causal decomposition analysis, an analytical approach to identifying intervention targets for reducing disparity. We begin with an introduction of causal decomposition analysis and how our equity value judgments can be incorporated into the estimators. To make these ideas more concrete, we will consider the specific application to the potential for interventions on treatment intensification (sometimes defined as the initiation, increase in dose, or added class of an antihypertensive medication (10)) to reduce racial disparities in hypertension control. We then examine the consequences when an analyst uses an estimator that does not reflect a set of underlying equity value judgments, using a simulation analysis based on real data from the National Health and Nutrition Examination Survey. We also explore how relationships in the causal structure affect the consequences of this discrepancy between the estimator and the underlying equity value judgments. We close with practical recommendations for incorporating equity value judgments into health disparities research.

REVIEW OF CAUSAL DECOMPOSITION ANALYSIS

Causal decomposition analysis is an approach to identifying interventional targets for reducing disparity. To illustrate its use, we consider the motivating example of reducing racial disparities in hypertension control through intervening upon racial disparities in treatment intensification. Here causal decomposition analysis asks the question: How much of the racial disparity in hypertension control could be reduced if the Black population were intervened upon to receive treatment intensification at the same rates as the White population? This question reflects our estimands of interest, which include the reduced and residual disparities in the outcome under the hypothetical intervention done on the disadvantaged group, and the observed disparity prior to the intervention.

This perspective, where one considers disparity as observed versus postintervention, has its roots in the population impact literature (e.g., population attributable risk (11)). When viewed from this angle, causal decomposition analysis is not an attempt to pick up a mediated effect of race. In fact, it remains meaningful for informing interventions even when the potential target of interest is a variable that is merely associated with race (through a common cause as in Figure 1) but not caused by race (12). Such common causes may involve parental and ancestral (i.e., intergenerational) experiences of structural racism that serve to reify racial classification and determine early life conditions (13). It also remains meaningful when, as may often be the case, the target’s effect on the outcome is racialized (i.e., racially heterogeneous) due to experiences of structural racism.

Figure 1.


Causal diagram depicting relationships between historical and contemporary processes (Inline graphic), race (Inline graphic), demographic covariates (Inline graphic includes sex Inline graphic and age Inline graphic), socioeconomic status (SES) covariates (Inline graphic includes educational attainment Inline graphic and private health insurance Inline graphic), diabetes (Inline graphic), systolic blood pressure at baseline (Inline graphic) and at 6 months of follow-up (Inline graphic), hypertensive status at baseline (Inline graphic) and at 6 months of follow-up (Inline graphic), and treatment intensification (Inline graphic). Note that clinical measures (Inline graphic; not shown on the graph) include Inline graphic and Inline graphic. The subscripts index when the covariate is realized (0 = before baseline, 1 = at baseline, 2 = at 6 months of follow-up).

Disparity measurements

As mentioned above, it is important that disparity measurements—in particular, the choice of the covariates to adjust for—reflect the investigators’ equity concerns. In this subsection we demonstrate how this can be done in the context of our motivating example: We consider 3 sets of covariates (demographic characteristics, socioeconomic status (SES), and clinical measures) and discuss the meaning of disparity in terms of these covariates.

The bioethics literature offers some principles for deciding which sources of difference in health outcomes are unfair and thus contribute to disparity; 2 commonly raised ones are manipulability and amenability to intervention (3). According to the principle of manipulability, racial differences in covariates that are “nonmanipulable” from the point of follow-up (i.e., covariates that an actor, e.g., a provider, has no control over) are considered fair explanations for differences in health outcomes. This underlies assessments of disparity for the particular purposes of performance or accountability (14). According to the counterprinciple of amenability to intervention, differences from nonmanipulable covariates should be considered unfair if, in the broadest sense possible, they have addressable effects. This underlies assessments of disparity for the overarching purpose of tracking social progress and intervention. It is best applied when the socially marginalized group is disadvantaged on these nonmanipulable factors. Otherwise, their distribution does not, statistically speaking, contribute to disparate outcomes for the marginalized group, and failing to adjust for such differences could mask disparity from other unjust sources (3). Because disparity is a “moral red flag” that motivates social action (15), we want to avoid masking disparity when possible. In our example, we will refer to fair (or not necessarily unjust) sources of difference in hypertension control (outcome of interest) as “outcome-allowable.”

Our hypothetical intervention is to remove disparities in treatment intensification, so defining the intervention requires defining what we mean by disparities in treatment intensification. We consider the concept of a social contract (a set of criteria that society has agreed, or could agree, should ideally govern how a good is distributed) to guide our choice of unfair sources of difference in treatment intensification (3). In our use, a social contract may generally be an aspirational proposal that reflects equity values that one hopes all of society could agree upon, but here we simply rely on widely accepted clinical guidelines regarding the management of hypertension (16). We will refer to fair (or not necessarily unjust) sources of difference in treatment intensification (interventional target) as “target-allowable.”

Based on the principles described above, the underlying allowability choice in our example is laid out in Table 1. Note that outcome-allowable covariates (demographic characteristics) are automatically included among the target-allowable covariates (demographic characteristics and clinical measures), but there may be exclusively target-allowable covariates (clinical measures). “Nonallowable” covariates (SES) are covariates that are not “allowable” and are specified only for identification purposes (e.g., to help identify our estimands from observed data by accounting for confounding by nonallowables). We thus define the observed disparity in hypertension control as the difference in the proportion of uncontrolled hypertension between the Black and White populations, after adjusting for distributional differences in demographic characteristics by standardizing to the combined population (for a discussion of standard population choices, see Jackson et al. (17)). The hypothetical intervention should be defined as removing the disparity in treatment intensification conditional on demographic characteristics and clinical measures, but not on SES.

Table 1.

Covariates That Are Sources of Racial Differences in Hypertension Control/Treatment Intensification and Reasoning for Allowability Choice

Covariates | Allowability | Reasoning
Demographic characteristics (sex, age) | Outcome- and target-allowable | Even though sex and age are amenable to intervention (e.g., treatment may be adjusted to accommodate sex and age), the Black population is not disadvantaged on sex and age. Demographic differences are thus not responsible for worse outcomes in hypertension control; not adjusting for them could mask disparity (outcome-allowable). Sex and age, as clinically prognostic modifiers of treatment intensification effects, are medically relevant for guiding decisions about treatment intensification (target-allowable).
Clinical measures (diabetes, baseline blood pressure) | Target-allowable but not outcome-allowable | Clinical measures are nonmanipulable at the point of care, but can be managed for better health outcomes (e.g., clinicians can connect patients with appropriate specialists); the Black population on average has worse clinical measures. Thus, disparities in clinical measures that result in health differences are unjust (not outcome-allowable). Clinical measures are medically relevant factors for determining whether to intensify treatment (target-allowable).
SES (educational attainment, private health insurance) | Neither outcome-allowable nor target-allowable (i.e., nonallowable) | SES is nonmanipulable at the point of care, but can be managed for better health outcomes (e.g., clinicians can connect patients with additional resources); the Black population on average has lower SES. Thus, health differences due to SES differences are unjust (not outcome-allowable). SES is not medically relevant for determining whether to intensify treatment. Thus, differences in treatment intensification that depend on SES are unjust (not target-allowable).

Abbreviation: SES, socioeconomic status.

We clarify that our purpose here is to encourage the explicit consideration of allowability choices, which may vary by context and investigator judgment. With a clear meaning of the disparities in hypertension control and treatment intensification, we can then define the reduced disparity as the difference in the proportion of uncontrolled hypertension, adjusted for the outcome-allowables, between Black populations with and without the hypothetical intervention defined above, while the residual disparity compares the proportion of uncontrolled hypertension between the Black population under the hypothetical intervention and the White population.
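The three estimands defined above fit together additively. The sketch below uses notation that is ours, not the article's: \mu_B, \mu_W, and \mu_B^{int} are assumed labels for the proportions of uncontrolled hypertension, adjusted for the outcome-allowables, in the Black population, the White population, and the Black population under the hypothetical intervention. They correspond to the quantities labeled (1), (2), and (3) in Table 4.

```latex
\begin{align*}
\text{observed disparity} &= \mu_B - \mu_W, \\
\text{reduced disparity}  &= \mu_B - \mu_B^{\mathrm{int}}, \\
\text{residual disparity} &= \mu_B^{\mathrm{int}} - \mu_W, \\
\text{observed disparity} &= \text{reduced disparity} + \text{residual disparity}.
\end{align*}
```

The last line follows by adding the second and third lines, so the reduced and residual disparities always partition the observed disparity when all three are standardized to the same outcome-allowable distribution.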

Causal decomposition estimators

The estimands are counterfactual, and to link them to the observed data, several assumptions are needed: 1) common support of the outcome-allowable covariates across race; 2) conditional exchangeability for treatment intensification in the Black population; 3) the distribution of treatment intensification in the Black population falls within the distribution in the White population, conditional on the target-allowable covariates; and 4) in the Black population, we would attain the same outcome under a) the intervention to have their treatment intensified versus b) the observation that their treatment is intensified. Details are provided in Web Appendix 1 (available at https://doi.org/10.1093/aje/kwad209).

We will use a weighting-based estimation method adapted from ratio of mediator probability weighting (RMPW), introduced by Jackson (3). A key distinction of this version of RMPW is its incorporation of allowability choices in the estimators. Specifically, RMPW allows the outcome disparity and the hypothetical intervention to be explicitly defined based on the user-designated outcome-allowables and target-allowables, respectively, while still incorporating nonallowables in the models when they are needed to address confounding (see Web Appendix 2 for details). Sample software code for implementing RMPW is provided in Web Appendix 3.

In our simulations we will consider the allowability designation (AD) defined in the previous subsection plus 6 other ADs, which will be explained below—5 of these correspond to existing estimators (Table 2), and 1 of these is added to provide additional comparison. In other words, these existing estimators implicitly assume a specific AD without necessarily giving thought to the meanings of disparity (3). A meaningful estimator would be the one that maps to the equity value judgment the investigators have in mind.

Table 2.

Seven Covariate Adjustment Choices (Allowability Designations) and Corresponding Estimators

Allowability Designation | Covariates Treated as Outcome-Allowable | Covariates Treated as Target-Allowable | Corresponding (Existing) Estimator
1 | None | None | Oaxaca-Blinder decomposition via linear models (29, 30)
2 | None | Demographic characteristics (Wdem); clinical measures (Wcln) |
3 | None | Demographic characteristics (Wdem); clinical measures (Wcln); SES (WSES) | Oaxaca-Blinder decomposition estimator via reweighting functions (21, 22)
4 | Demographic characteristics (Wdem) | Demographic characteristics (Wdem) | Path-specific effect analog estimator (somewhat similar to that of VanderWeele et al. (31))
5 | Demographic characteristics (Wdem) | Demographic characteristics (Wdem); clinical measures (Wcln) | Our proposed equity-aligned estimator
6 | Demographic characteristics (Wdem) | Demographic characteristics (Wdem); clinical measures (Wcln); SES (WSES) | Path-specific effect analog estimator (32)
7 | Demographic characteristics (Wdem); clinical measures (Wcln); SES (WSES) | Demographic characteristics (Wdem); clinical measures (Wcln); SES (WSES) | Natural direct/indirect effect analog estimators (difference method under stronger assumptions; e.g., see Valeri and VanderWeele (33) and Breen et al. (34))

Abbreviation: SES, socioeconomic status.
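The designations in Table 2 can be written down as data, which makes the role of each covariate set explicit. The following Python sketch is illustrative only: the dictionary layout and the function name `roles` are our assumptions, and the article's own software (in R) is not reproduced here. It derives, for each AD, which covariate sets are outcome- and target-allowable, exclusively target-allowable, or nonallowable.

```python
# Covariate set labels as printed in Table 2.
ALL_SETS = ("Wdem", "Wcln", "WSES")

# AD number -> (outcome-allowable sets, target-allowable sets), read off Table 2.
DESIGNATIONS = {
    1: ((), ()),
    2: ((), ("Wdem", "Wcln")),
    3: ((), ("Wdem", "Wcln", "WSES")),
    4: (("Wdem",), ("Wdem",)),
    5: (("Wdem",), ("Wdem", "Wcln")),
    6: (("Wdem",), ("Wdem", "Wcln", "WSES")),
    7: (("Wdem", "Wcln", "WSES"), ("Wdem", "Wcln", "WSES")),
}


def roles(ad):
    """Split the covariate sets into the three roles implied by one AD."""
    outcome, target = DESIGNATIONS[ad]
    # Outcome-allowables are always included among the target-allowables.
    if not set(outcome) <= set(target):
        raise ValueError("outcome-allowables must also be target-allowable")
    return {
        "outcome_and_target": [s for s in ALL_SETS if s in outcome],
        "target_only": [s for s in ALL_SETS if s in target and s not in outcome],
        "nonallowable": [s for s in ALL_SETS if s not in target],
    }


if __name__ == "__main__":
    for ad in sorted(DESIGNATIONS):
        print(ad, roles(ad))
```

Under this encoding, AD 5 places the demographic covariates in the first role, the clinical measures in the second, and SES in the third, which is the allowability choice of Table 1.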

As discussed by Jackson (3), when all covariates are treated as both outcome-allowable and target-allowable, which corresponds to AD 7 in Table 2, the RMPW estimator described here reduces to the RMPW of Hong et al. (18, 19) and is mathematically equivalent to the inverse odds ratio weighting of Tchetgen Tchetgen (20) from the causal mediation analysis literature. When all covariates are treated as target-allowable but none are treated as outcome-allowable, which corresponds to AD 3 in Table 2, RMPW reduces to the density ratio estimator of Barsky et al. (21) and is mathematically equivalent to Dinardo et al. (22) from the econometric decomposition literature. Under other choices, the estimator aligns with those used to estimate path-specific effects, as in ADs 4 and 6 of Table 2. Despite these connections with estimators used for mediation/path analysis (i.e., assessment of mechanisms that constitute effects of manipulating race) and econometric decomposition analysis (i.e., assessment of discrimination), the counterfactual query is unique and should not be conflated with these endeavors. As described by Miles (23), a disparity can be reduced by intervening on a potential target even when that target does not mediate an effect of race for any person in the population.

SIMULATION STUDY

Scenario 1 (baseline)

Our example concerns a target population with a history of hypertension diagnosis, who may or may not have uncontrolled hypertension (defined as systolic blood pressure Inline graphic 140 mm Hg) at baseline. Using a subset of persons with a previous hypertension diagnosis from the National Health and Nutrition Examination Survey 2015–2016 Questionnaire Data (24) as a guide, we simulated a cohort of size 3,000 with covariates generated in the following order: race (Inline graphic: 1 = Black, 0 = White), sex (Inline graphic: 1 = male, 0 = female), age (Inline graphic), educational attainment (Inline graphic: 1 = undergraduate degree, 0 = no undergraduate degree), private health insurance (Inline graphic: 1 = yes, 0 = no), diabetes (Inline graphic: 1 = yes, 0 = no), systolic blood pressure at baseline (Inline graphic), and hypertensive status at baseline (Inline graphic 1 (i.e., uncontrolled hypertension) if Inline graphic 140 mm Hg, and Inline graphic 0 otherwise). The proportion of Black persons in our hypothetical cohort was roughly 0.4 (Inline graphic 0.4). Treatment intensification (Inline graphic) was simulated as a Bernoulli draw with probability defined as a function of Inline graphic, demographic covariates (Inline graphic), SES covariates (Inline graphic), and clinical measures (Inline graphic); persons with controlled hypertension at baseline (Inline graphic 140) had a 0 probability of treatment intensification. Systolic blood pressure at 6 months (Inline graphic) was generated from a normal distribution with mean defined as a function of Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic. The outcome of interest was hypertensive status at 6 months (Inline graphic 1 if Inline graphic 140 mm Hg, and 0 otherwise). The relationships between the covariates are depicted in Figure 1. 
We assumed that historical and contemporary processes Inline graphic, such as Jim Crow segregation laws and structural forms of racism, link Inline graphic with Inline graphic and Inline graphic. Inline graphic represents unmeasured factors that may correlate Inline graphic and Inline graphic. Details on the data-generating models are presented in Web Appendix 4.

In this and all other scenarios, the true values of the observed, reduced, and residual disparities (defined by the correct AD shown in Table 1) were computed numerically by large-scale simulation of actual and counterfactual situations. Details on obtaining the true values of the estimands are presented in Web Appendix 5.
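The generation order described for scenario 1 can be sketched in code. The Python example below is a toy illustration, not the authors' data-generating model: every coefficient, the function name `simulate_cohort`, and the variable names are hypothetical, and the actual models are in Web Appendix 4. It also assumes that uncontrolled hypertension means systolic blood pressure of at least 140 mm Hg, since the comparison symbol is missing from the text above.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_cohort(n=3000, seed=2023):
    """Toy cohort following the stated generation order (made-up coefficients)."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        race = int(rng.random() < 0.4)  # 1 = Black, 0 = White
        sex = int(rng.random() < 0.45 + 0.05 * race)  # 1 = male
        age = rng.gauss(62 - 3 * race, 12)
        educ = int(rng.random() < expit(-0.4 - 0.7 * race))
        insured = int(rng.random() < expit(0.2 - 0.6 * race + 0.9 * educ))
        diabetes = int(rng.random() < expit(-1.3 + 0.4 * race + 0.02 * (age - 60)))
        sbp1 = rng.gauss(136 + 4 * race + 3 * diabetes + 0.1 * (age - 60), 15)
        htn1 = int(sbp1 >= 140)  # assumed threshold direction
        # Persons with controlled hypertension at baseline are never intensified.
        p_treat = htn1 * expit(-0.3 - 0.3 * race + 0.5 * insured + 0.4 * diabetes)
        treated = int(rng.random() < p_treat)
        # Which variables enter the 6-month mean is an assumption here.
        sbp2 = rng.gauss(40 + 0.7 * sbp1 - 8 * treated + 2 * race - 2 * insured, 10)
        cohort.append({
            "race": race, "sex": sex, "age": age, "educ": educ,
            "insured": insured, "diabetes": diabetes, "sbp1": sbp1,
            "htn1": htn1, "treated": treated, "sbp2": sbp2,
            "htn2": int(sbp2 >= 140),  # outcome: uncontrolled at 6 months
        })
    return cohort


if __name__ == "__main__":
    cohort = simulate_cohort()
    print(sum(p["race"] for p in cohort) / len(cohort))
```

Multiplying the treatment probability by the baseline hypertensive indicator is one simple way to enforce the structural zero for persons whose hypertension is controlled at baseline.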

Scenario 1 explored the consequences of ADs under a particular causal structure. Note that one cannot generalize the results to a new scenario, as causal structures vary across substantive settings. One may naively assume that settings with large racial differences in covariates would be the ones where ADs matter most, but this intuition is incomplete. It is entirely possible that covariates with strong effects (on either the treatment or outcome) could remain salient despite having smaller racial differences. To explore the trade-off between differences in the racial distribution versus differences in the strength of effects, we considered scenarios 2a–2f (“racial difference scenarios”) and 3a–3f (“confounding scenarios”).

Additional scenarios

Six scenarios of varying associations between race and covariates, as well as 6 scenarios with strong confounding (through increasing the covariates’ associations with either the treatment or outcome), are outlined in Table 3 (details are provided in Web Appendices 4 and 5).

Table 3.

Additional Scenarios With Stronger or Weaker Associations (Relative to Scenario 1) Among Covariates

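Covariates | Association Strength With Race (Relative to Scenario 1) | Association Strength With Treatment (Relative to Scenario 1) | Association Strength With Outcome (Relative to Scenario 1) | Scenario
Wdem: demographic characteristics (sex, age) | Weak | | | 2a
Wdem: demographic characteristics (sex, age) | Strong | | | 2b
Wcln: clinical measures (diabetes, baseline blood pressure) | Weak | | | 2c
Wcln: clinical measures (diabetes, baseline blood pressure) | Strong | | | 2d
WSES: SES (educational attainment, private health insurance) | Weak | | | 2e
WSES: SES (educational attainment, private health insurance) | Strong | | | 2f
Wdem: demographic characteristics (sex, age) | | Strong | | 3a
Wdem: demographic characteristics (sex, age) | | | Strong | 3b
Wcln: clinical measures (diabetes, baseline blood pressure) | | Strong | | 3c
Wcln: clinical measures (diabetes, baseline blood pressure) | | | Strong | 3d
WSES: SES (educational attainment, private health insurance) | | Strong | | 3e
WSES: SES (educational attainment, private health insurance) | | | Strong | 3f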

Abbreviation: SES, socioeconomic status.

Graphically, under the racial difference scenarios, the relationship between race (Inline graphic) and treatment intensification (Inline graphic) or outcome (Inline graphic) was modified through changing the association strength represented by the paths connecting Inline graphic with different sets of covariates (Figure 2). We expected that when there were large racial differences in nonallowable covariates, failing to capture their contributions to disparity would lead to large biases in the relevant estimates. For example, measuring disparity by adjusting for the SES covariates would not capture mechanisms involving SES, such as the resources to pay for preventive medical care, which could underestimate disparity in both treatment intensification and the outcome (25). On the other hand, if the racial differences in nonallowable covariates were small, failing to capture their contributions to disparity would likely not matter much. We had similar expectations for failing to adjust for racial differences in allowable covariates. Figure 3 depicts the strengthened association(s) under the confounding scenarios.

Figure 2.


Causal diagrams depicting relationships between historical and contemporary processes (Inline graphic), race (Inline graphic), demographic covariates (Inline graphic include sex Inline graphic and age Inline graphic), socioeconomic status (SES) covariates (Inline graphic include educational attainment Inline graphic and private health insurance Inline graphic), diabetes (Inline graphic), systolic blood pressure at baseline (Inline graphic) and 6 months of follow-up (Inline graphic), hypertensive status at baseline (Inline graphic) and at 6 months of follow-up (Inline graphic), and treatment intensification (Inline graphic). Clinical measures (Inline graphic) include Inline graphic and Inline graphic. The subscripts index when the covariate is realized (0 = before baseline, 1 = at baseline, 2 = at 6 months of follow-up). Dark arrows indicate stronger or weaker (relative to scenario 1) associations connecting Inline graphic and Inline graphic (panel A; scenarios 2a and 2b), Inline graphic and Inline graphic (panel B; scenarios 2c and 2d), and Inline graphic and Inline graphic (panel C; scenarios 2e and 2f).

Figure 3.


Causal diagrams depicting relationships between historical and contemporary processes (Inline graphic), race (Inline graphic), demographic covariates (Inline graphic include sex Inline graphic and age Inline graphic), socioeconomic status (SES) covariates (Inline graphic include educational attainment Inline graphic and private health insurance Inline graphic), diabetes (Inline graphic), systolic blood pressure at baseline (Inline graphic) and 6 months of follow-up (Inline graphic), hypertensive status at baseline (Inline graphic) and at 6 months of follow-up (Inline graphic), and treatment intensification (Inline graphic). Clinical measures (Inline graphic) include Inline graphic and Inline graphic. The subscripts index when the covariate is realized (0 = before baseline, 1 = at baseline, 2 = at 6 months of follow-up). Dark arrows indicate stronger (relative to scenario 1) associations connecting Inline graphic and Inline graphic (panel A; scenario 3a), Inline graphic and Inline graphic (panel B; scenario 3b), Inline graphic and Inline graphic (panel C; scenario 3c), Inline graphic and Inline graphic (panel D; scenario 3d), Inline graphic and Inline graphic (panel E; scenario 3e), and Inline graphic and Inline graphic (panel F; scenario 3f).

Allowability designations

In each scenario (the baseline scenario, racial difference scenarios 2a–2f, and confounding scenarios 3a–3f), we considered 7 ADs in the estimator (Table 2; note that AD5 corresponds to the equity-aligned estimator).

Estimation

We estimated the observed, reduced, and residual disparities via RMPW based on each of the 7 ADs. The estimation method is outlined in Table 4, with the covariates included in Inline graphic (denotes the set of covariates treated as outcome-allowable), Inline graphic (exclusively target-allowable), and Inline graphic (nonallowable) varying by AD (e.g., AD5 corresponds to the estimator with Inline graphic, Inline graphic, and Inline graphic). Note that at the individual level, the intervention depends on the observed values of Inline graphic and Inline graphic, while the observed, reduced, and residual disparities are simply contrasts between mean outcomes, standardized with respect to the distribution of Inline graphic.

Table 4.

Estimation of the Observed, Reduced, and Residual Disparities Via Ratio of Mediator Probability Weighting^a

Estimand | Estimate
The proportion of uncontrolled hypertension in the Black population, adjusted for covariates treated as outcome-allowable (1) | Weighted mean of Inline graphic^b in the Black population (Inline graphic) with weights^c Inline graphic
The proportion of uncontrolled hypertension in the White population, adjusted for covariates treated as outcome-allowable (2) | Weighted mean of Inline graphic in the White population (Inline graphic) with weights Inline graphic
The proportion of uncontrolled hypertension in the Black population under the hypothetical intervention, adjusted for covariates treated as outcome-allowable (3) | Weighted mean of Inline graphic in the Black population (Inline graphic) with weights^d Inline graphic
Observed disparity (4) | Estimate of (1) − (2)
Reduced disparity (5) | Estimate of (1) − (3)
Residual disparity (6) | Estimate of (3) − (2)

Abbreviation: RMPW, ratio of mediator probability weighting.

a Parenthetical numbers are used to label the estimands.

b Hypertensive status at 6 months (Inline graphic 1 if systolic blood pressure Inline graphic 140 mm Hg, and 0 otherwise).

c Inline graphic was estimated by the proportion of the Black population in the simulated data; Inline graphic was estimated by a logistic regression of race on covariates treated as outcome-allowable (Inline graphic). Natural cubic splines (3 degrees of freedom) for continuous variables and 2-way interactions among binary variables were included.

d Inline graphic was estimated by a logistic regression of treatment intensification (Inline graphic) on covariates treated as allowable (Inline graphic: outcome-allowable; Inline graphic: exclusively target-allowable) fitted in the White population; Inline graphic was estimated by a logistic regression of treatment intensification on all covariates (Inline graphic; Inline graphic; Inline graphic: nonallowable) fitted in the Black population. Natural cubic splines (3 degrees of freedom) for continuous variables and 2-way interactions among binary variables were included.
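The weighting steps in Table 4 can be sketched for binary covariates. The Python example below is a simplified illustration under stated assumptions, not the authors' implementation (their R code is in Web Appendix 3). It swaps the logistic regressions with splines for saturated stratum frequencies. The weight formulas are our reading of footnotes c and d, since the printed expressions did not survive extraction. All names (`stratum_prob`, `simulate_toy`, `rmpw`, and the variables `r`, `a`, `c`, `s`, `m`, `y`) and all coefficients are hypothetical.

```python
import random
from collections import Counter


def stratum_prob(rows, event, given):
    """Return a function giving the observed P(event = 1 | given) per stratum."""
    events, totals = Counter(), Counter()
    for row in rows:
        key = tuple(row[g] for g in given)
        totals[key] += 1
        events[key] += row[event]

    def prob(row):
        key = tuple(row[g] for g in given)
        # An unseen stratum (a positivity violation) raises ZeroDivisionError.
        return events[key] / totals[key]

    return prob


def simulate_toy(n, seed=0):
    """Toy binary data: r race, a demographic, c clinical, s SES, m treatment, y outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        r = int(rng.random() < 0.4)  # 1 = Black, 0 = White
        a = int(rng.random() < (0.40 if r else 0.50))
        c = int(rng.random() < (0.45 if r else 0.30))
        s = int(rng.random() < (0.35 if r else 0.60))
        m = int(rng.random() < 0.20 + 0.05 * a + 0.15 * c + 0.20 * s)
        y = int(rng.random() < 0.60 - 0.25 * m - 0.05 * a + 0.10 * c - 0.10 * s)
        rows.append({"r": r, "a": a, "c": c, "s": s, "m": m, "y": y})
    return rows


def rmpw(rows, outcome_allow, target_only, nonallow):
    """Observed, reduced, and residual disparity under one allowability designation."""
    black = [x for x in rows if x["r"] == 1]
    white = [x for x in rows if x["r"] == 0]
    p_black = len(black) / len(rows)
    p_black_given_a = stratum_prob(rows, "r", outcome_allow)
    # Treatment model among White persons uses allowables only;
    # the model among Black persons uses all covariates.
    pm_white = stratum_prob(white, "m", outcome_allow + target_only)
    pm_black = stratum_prob(black, "m", outcome_allow + target_only + nonallow)

    def w_std(x):
        # Standardize outcome-allowables to the combined population.
        p = p_black_given_a(x)
        return p_black / p if x["r"] == 1 else (1 - p_black) / (1 - p)

    def w_int(x):
        # Ratio of treatment probabilities at the person's observed treatment.
        num = pm_white(x) if x["m"] else 1 - pm_white(x)
        den = pm_black(x) if x["m"] else 1 - pm_black(x)
        return w_std(x) * num / den

    def wmean(group, weight):
        total = sum(weight(x) for x in group)
        return sum(weight(x) * x["y"] for x in group) / total

    mu1, mu2, mu3 = wmean(black, w_std), wmean(white, w_std), wmean(black, w_int)
    return {"observed": mu1 - mu2, "reduced": mu1 - mu3, "residual": mu3 - mu2}


if __name__ == "__main__":
    data = simulate_toy(20000, seed=1)
    print(rmpw(data, ["a"], ["c"], ["s"]))  # analog of AD 5
```

Because the observed disparity depends only on the outcome-allowable set, calls that share that set return the same observed disparity whatever the target-allowable set is.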

Within each simulation, bootstrapping with 1,000 resamples was done to obtain a 95% confidence interval (CI) for each estimand. All analyses were conducted using R, version 4.0.2 (26). The performance of the estimates under the 7 ADs was evaluated over 1,000 simulations using the following metrics: bias (the average estimated value minus the true value), percent bias (bias divided by the true value, expressed as a percentage), empirical standard error, root mean squared error (RMSE), and 95% CI coverage (the percentage of the 1,000 bootstrap 95% CIs that cover the true value).
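The performance metrics listed above can be computed with a few lines of code. This Python sketch is illustrative only (the authors worked in R). The function name `performance` and its inputs are hypothetical, and taking the empirical standard error as the sample standard deviation of the estimates is an assumption.

```python
import math
import statistics


def performance(estimates, intervals, truth):
    """Summarize one estimand under one AD across simulated data sets.

    estimates: point estimates, one per simulated data set
    intervals: (lower, upper) 95% CI bounds, one pair per simulated data set
    truth:     true value of the estimand
    """
    bias = statistics.fmean(estimates) - truth
    squared_errors = [(e - truth) ** 2 for e in estimates]
    covered = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return {
        "bias": bias,
        "pct_bias": 100 * bias / truth,
        # Assumption: sample standard deviation (n - 1 denominator).
        "empirical_se": statistics.stdev(estimates),
        "rmse": math.sqrt(statistics.fmean(squared_errors)),
        "coverage_pct": 100 * covered / len(intervals),
    }


if __name__ == "__main__":
    toy_estimates = [0.10, 0.12, 0.14]
    toy_intervals = [(0.05, 0.15), (0.11, 0.13), (0.09, 0.20)]
    print(performance(toy_estimates, toy_intervals, truth=0.10))
```

With many simulations, the squared RMSE is approximately the squared bias plus the squared empirical standard error, which is why a biased estimator can have a large RMSE even when its standard error is small.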

RESULTS

Scenario 1 (baseline)

Table 5 shows the performances of the estimators in scenario 1. Among the 7 ADs, AD4, AD5, and AD6 have the correct outcome AD; hence, they yield the same observed disparity estimates with the smallest bias and RMSE, and a 95% CI coverage of 95.3%. AD1, AD2, and AD3 fail to treat demographic characteristics (which the Black population is advantaged on) as outcome-allowable and thus underestimate the observed disparity (average estimated observed disparity = 0.1136; true value = 0.1442), with a 95% CI coverage of 52.1%. AD7 treats all covariates as outcome-allowable, which leads to severe underestimation of the observed disparity (average estimated observed disparity = 0.0594), with the largest bias and RMSE and a 95% CI coverage of only 3.1%.

Table 5.

Performance Metrics for the Estimation of Observed, Reduced, and Residual Disparity in 1,000 Simulated Data Sets Under Different Allowability Designations in the Estimator

Metric and Allowability Designation a   True Value   Average Estimate   Bias      % Bias   Empirical SE   RMSE     95% CI Coverage, %
Observed disparity                      0.1442
  1                                                  0.1136             −0.0306   −21.2    0.0153         0.0342   52.1
  2                                                  0.1136             −0.0306   −21.2    0.0153         0.0342   52.1
  3                                                  0.1136             −0.0306   −21.2    0.0153         0.0342   52.1
  4                                                  0.1451             0.0009    0.6      0.0202         0.0202   95.3
  5b                                                 0.1451             0.0009    0.6      0.0202         0.0202   95.3
  6                                                  0.1451             0.0009    0.6      0.0202         0.0202   95.3
  7                                                  0.0594             −0.0849   −58.9    0.0184         0.0868   3.1
Reduced disparity                       0.0384
  1                                                  −0.0405            −0.0789   −205.5   0.0062         0.0792   0
  2                                                  0.0370             −0.0014   −3.6     0.0081         0.0083   95.8
  3                                                  0.0355             −0.0029   −7.6     0.0082         0.0088   95.4
  4                                                  −0.0497            −0.0881   −229.4   0.0110         0.0888   0.2
  5b                                                 0.0385             0.0001    0.3      0.0093         0.0093   96.0
  6                                                  0.0372             −0.0012   −3.1     0.0093         0.0094   95.8
  7                                                  0.0321             −0.0063   −16.4    0.0086         0.0107   88.5
Residual disparity                      0.1058
  1                                                  0.1542             0.0483    45.7     0.0161         0.051    16.1
  2                                                  0.0766             −0.0292   −27.6    0.0160         0.0333   56.7
  3                                                  0.0782             −0.0277   −26.2    0.0160         0.0319   61.4
  4                                                  0.1949             0.0890    84.1     0.0217         0.0916   2.2
  5b                                                 0.1066             0.0008    0.8      0.0209         0.0209   94.6
  6                                                  0.1079             0.0021    2.0      0.0209         0.0210   95.1
  7                                                  0.0273             −0.0785   −74.2    0.0184         0.0806   6.6

Abbreviations: CI, confidence interval; RMSE, root mean squared error; SE, standard error.

a Allowability designation in the estimator: 1, all covariates are treated as nonallowable; 2, socioeconomic covariates are treated as nonallowable, and all other covariates are treated as exclusively target-allowable; 3, all covariates are treated as exclusively target-allowable; 4, demographic covariates are treated as outcome- and target-allowable, and all other covariates are treated as nonallowable; 5, demographic covariates are treated as outcome- and target-allowable, clinical measures are treated as exclusively target-allowable, and socioeconomic covariates are treated as nonallowable; 6, demographic covariates are treated as outcome- and target-allowable, and all other covariates are treated as exclusively target-allowable; 7, all covariates are treated as outcome- and target-allowable.

b Allowability designation 5 is the equity-aligned estimator.
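The bias, empirical SE, and RMSE columns of Table 5 can be cross-checked against one another, because the mean squared error is approximately the squared bias plus the variance of the estimates. The short Python sketch below applies that standard identity to a few rows copied from Table 5. The helper name `implied_rmse` and the tolerance are hypothetical choices; small discrepancies are expected because the table entries are rounded to 4 decimal places.

```python
import math

# (bias, empirical SE, reported RMSE) for selected rows of Table 5.
table5_rows = {
    "observed, AD1": (-0.0306, 0.0153, 0.0342),
    "observed, AD5": (0.0009, 0.0202, 0.0202),
    "observed, AD7": (-0.0849, 0.0184, 0.0868),
    "reduced, AD1": (-0.0789, 0.0062, 0.0792),
    "reduced, AD5": (0.0001, 0.0093, 0.0093),
    "residual, AD4": (0.0890, 0.0217, 0.0916),
}


def implied_rmse(bias, empirical_se):
    """RMSE implied by the identity RMSE^2 = bias^2 + SE^2 (approximate)."""
    return math.hypot(bias, empirical_se)


# Compare the implied RMSE with the value reported in the table.
for label, (bias, se, reported) in table5_rows.items():
    implied = implied_rmse(bias, se)
    print(f"{label}: implied RMSE = {implied:.4f}, reported = {reported:.4f}")
```

The check makes the pattern in the table explicit: where the bias is near zero (e.g., AD5), the RMSE is essentially the empirical SE, whereas for the misaligned designations the RMSE is dominated by the bias.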

In terms of the reduced disparity, estimates under AD2 and AD5 have the smallest bias and RMSE and good 95% CI coverage (AD2: 95.8%; AD5: 96.0%), mainly due to the correct target AD. Although AD3, AD6, and AD7 incorrectly treat SES as target-allowable, they treat clinical measures as target-allowable and yield estimates with small bias and RMSE and decent 95% CI coverage (AD3: 95.4%; AD6: 95.8%; AD7: 88.5%); this is a result of clinical measures’ having a much stronger relationship with the target than SES does in our example, which makes the incorrect adjustment of SES negligible as long as clinical measures are correctly adjusted for. On the other hand, AD1 and AD4 do not treat clinical measures as target-allowable, resulting in estimates with the largest bias and RMSE and 95% CI coverage of less than 1%. Moreover, the estimated reduced disparities under AD1 and AD4 are negative (AD1: −0.0405; AD4: −0.0497), implying that the hypothetical intervention would increase the racial disparity in hypertension control rather than reduce it. This is because AD1 and AD4 imply interventions that assign treatment intensification without regard to clinical status, effectively reducing the chance of treatment intensification for Black individuals who need it.

The performance of the residual disparity estimate can be evaluated as a function of the performances with respect to the observed and reduced disparities under each AD. As expected, residual disparity estimates under AD5 exhibit minimal bias and RMSE and good CI coverage. This is also the case for AD6, which has the correct outcome AD and which includes clinical measures as target-allowable (which is important to the reduced disparity). AD4 has the correct outcome AD but does not include clinical measures as target-allowable, leading to large bias in its residual disparity estimates despite its being unbiased for the observed disparity. The biases in the residual disparity estimates under AD2, AD3, and AD7 are mostly driven by their incorrect outcome ADs.
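To make the dependence of the residual disparity on the other two quantities concrete, the Python sketch below recomputes it from the Table 5 averages. It assumes that the residual disparity equals the observed disparity minus the reduced disparity; this relation is inferred from the reported values, which agree with it up to rounding, and the variable and function names are hypothetical.

```python
# True values and average estimates copied from Table 5 (scenario 1).
TRUE_OBSERVED, TRUE_REDUCED, TRUE_RESIDUAL = 0.1442, 0.0384, 0.1058

observed = {"AD1": 0.1136, "AD2": 0.1136, "AD3": 0.1136, "AD4": 0.1451,
            "AD5": 0.1451, "AD6": 0.1451, "AD7": 0.0594}
reduced = {"AD1": -0.0405, "AD2": 0.0370, "AD3": 0.0355, "AD4": -0.0497,
           "AD5": 0.0385, "AD6": 0.0372, "AD7": 0.0321}
residual_reported = {"AD1": 0.1542, "AD2": 0.0766, "AD3": 0.0782, "AD4": 0.1949,
                     "AD5": 0.1066, "AD6": 0.1079, "AD7": 0.0273}


def residual_from(observed_value, reduced_value):
    """Residual disparity under the assumed relation: observed minus reduced."""
    return observed_value - reduced_value


for ad in observed:
    implied = residual_from(observed[ad], reduced[ad])
    # The residual bias is the observed bias minus the reduced bias,
    # so errors in either component carry over to the residual estimate.
    bias_observed = observed[ad] - TRUE_OBSERVED
    bias_reduced = reduced[ad] - TRUE_REDUCED
    print(f"{ad}: implied residual = {implied:.4f} "
          f"(reported {residual_reported[ad]:.4f}); "
          f"bias = {bias_observed:+.4f} - ({bias_reduced:+.4f}) "
          f"= {bias_observed - bias_reduced:+.4f}")
```

Under this relation, a negative bias in the reduced disparity (as under AD4) translates directly into a positive bias in the residual disparity even when the observed disparity is estimated without bias.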

Scenarios 2a–2f (racial/ethnic difference scenarios)

Altering the racial differences in covariates results in certain ADs performing better or worse than they do in scenario 1. AD1, AD2, and AD3 do not treat demographic characteristics as outcome-allowable, which does not matter when there are no racial differences in demographic characteristics (scenario 2a). Thus, in this scenario their observed disparity estimates show minimal bias (Web Figure 1), and the performance measures are comparable to those under ADs that correctly treat demographic characteristics as the only outcome-allowables (Web Table 1). The observed disparity estimates under AD7 do not exhibit large bias when the racial differences in clinical measures are small (scenario 2c), in which case the consequence of incorrectly treating clinical measures as outcome-allowable is minimized. In addition, because clinical measures have a stronger relationship with the outcome than SES does, the impact of incorrectly treating SES as outcome-allowable under AD7 is negligible in scenario 2c.

As in scenario 1, reduced disparity estimates under AD1 and AD4 show the largest bias and RMSE (Web Figure 1; Web Table 2), with poor 95% CI coverage (AD1: 0%–0.1% across scenarios 2a–2f; AD4: 0%–3.8% across scenarios 2a–2f). Given that the residual disparity estimate is a combination of the observed and reduced disparity estimates, residual disparity estimates under AD2 and AD3 show minimal bias and adequate performance overall in scenario 2a (Web Figure 1; Web Table 3), for the reason stated above (that not treating demographic characteristics as outcome-allowable does not matter when there are no racial differences in demographic characteristics). Likewise, residual disparity estimates under AD7 show small bias in scenario 2c, because of the reduced impact of treating clinical measures as outcome-allowable in this scenario.

Scenarios 3a–3f (confounding scenarios)

Altering the associations between the covariates and treatment intensification or outcome does not substantially change the performance of the observed disparity estimates under any AD (Web Figure 2; Web Table 4). Reduced disparity estimates under AD1 and AD4 have the largest bias in all scenarios (mainly a result of not treating clinical measures as target-allowable), except in scenario 3c, where there is a strong association between clinical measures and the target (Web Figure 2; Web Table 5). In this scenario, there is a reduction in the treatment disparity between Blacks and Whites, given that Blacks on average have worse clinical measures (which predict higher treatment probability) than Whites. Thus, it matters less in scenario 3c whether clinical measures are correctly treated as target-allowable. AD4 also demonstrates good performance with respect to the residual disparity in scenario 3c (Figure 3; Web Table 6), for the reason stated above, as well as having the correct outcome AD.

In addition to the results presented here, we also obtained the estimates on a relative scale by taking the ratio of the uncontrolled hypertension proportions instead of the difference. These analyses showed similar findings, and results are presented in Web Table 7 and Web Figures 3 and 4.

DISCUSSION

We have shown through simulations that the causal decomposition estimator with the correct AD (i.e., one that matches the investigator’s equity value judgments) performed consistently well under all scenarios considered, whereas the performances of the other estimators may vary from scenario to scenario.

The relationship between race and the treatment decision or the outcome can arise in many ways—either through the associations between race and a set of pretreatment covariates or the associations between the covariates and the treatment decision or the outcome. We have chosen a particular set of causal structures with different association strengths and demonstrated their implications for allowability choices. The performances of the “unmeaningful” estimators (i.e., estimators that imply an AD other than AD5) in a certain scenario hinged upon how important their incorrect ADs were given the causal structure in that scenario. For example, the estimates based on AD2, which has the correct target AD but does not treat demographic characteristics as outcome-allowable, were acceptable only under the causal structure where there were no racial differences in demographic characteristics.

The main limitation of our simulation, as with most simulation studies, is that it does not cover all possible substantive scenarios. We have chosen a specific example to demonstrate the importance of ADs, but our results should be interpreted more broadly with the specific context in mind. In our simulation setting, clinical measures had a stronger impact on the treatment decision and the outcome than SES did. Hence, we have observed that once clinical measures, along with demographic characteristics, were correctly treated as target-allowable, the reduced disparity estimates were generally acceptable even though SES was incorrectly treated as target-allowable (e.g., AD3, AD6, and AD7 in scenario 1). Note that even in our scenario with strong SES-treatment confounding, the association between clinical measures and treatment still dominates the bias. However, there may be other cases where SES has a much stronger impact on the treatment decision—for example, in settings where treatment is primarily driven by someone’s ability to pay out of pocket. In this case one would likely find opposite results, where the bias in reduced disparity estimates is determined largely by the correct handling of SES (and thus the reduced disparity estimates under AD3, AD6, and AD7 would be expected to have larger biases). Last, our simulation assumes that there is no unmeasured confounding. In certain situations (such as our example, where the target is a medical treatment and its indications—systolic and diastolic blood pressure—are measured), we believe this assumption is likely to hold, at least approximately. Otherwise, incorporating approaches based on sensitivity analysis (27) or proxy measures (28) for unmeasured confounding variables may be fruitful, but much of this is left for future development.

Among all of the estimators we considered, the natural direct/indirect effect analogs and the difference methods, which both correspond to AD7 (all covariates are treated as both outcome- and target-allowable), may be among the most widely adopted methods for causal decomposition analysis. The AD underlying estimators derived from these methods is driven by identification assumptions, which call for the adjustment of all confounders of the treatment effect. However, it would be problematic to define the causal decomposition estimands based on all confounders, because the definition of disparity would change depending on the chosen interventional target, making it impossible to compare the relative importance of different interventional targets in reducing disparity. For example, suppose investigators wanted to compare the importance of treatment intensification versus physical activity in reducing the disparity in hypertension control. Because there is a different set of confounders for the effect of physical activity on hypertension control (e.g., body mass index, neighborhood factors) than there is for treatment intensification, if the natural direct/indirect effect analogs or the difference methods were used, the set of covariates included as outcome-allowable and thus the definition of disparity would be different between the contexts of the two interventional targets. The only case where the natural direct/indirect effect analogs and the difference methods are appropriate is when all confounders can be justified as both outcome- and target-allowable, which we suspect is very rare in practice. In our simulations, AD7 is acceptable only when there are small racial differences in clinical measures, which are incorrectly treated as outcome-allowable under AD7. Therefore, an automatic selection of the natural direct/indirect effect analogs and the difference methods is not appropriate, since in many cases these methods may result in severe bias.

In conclusion, we have demonstrated the importance of considering allowability when measuring disparity and identifying interventional targets, which does have implications for bias, as shown in our simulations motivated by a real-world example. Therefore, investigators should approach off-the-shelf estimators with caution, recognizing the allowability choices they implicitly assume and whether these choices align with equity considerations within the substantive setting of interest.

Supplementary Material

Web_Material_kwad209

ACKNOWLEDGMENTS

Author affiliations: Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, United States (Ting-Hsuan Chang); Department of Mental Health, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States (Trang Quynh Nguyen, John W. Jackson); Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States (John W. Jackson); Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States (John W. Jackson); Center for Health Equity, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States (John W. Jackson); and Center for Health Disparities Solutions, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States (John W. Jackson).

This work was funded by grants from the National Heart, Lung, and Blood Institute (K01HL145320) and the National Institute of Mental Health (R01MH115487 and R03MH128634).

The data set is available from the corresponding author.

This work was presented at the 2022 American Causal Inference Conference, Berkeley, California, May 23–25, 2022.

The views expressed in this article are those of the authors and do not reflect those of the National Institutes of Health.

Conflict of interest: none declared.

REFERENCES

  • 1. Braveman PA, Kumanyika S, Fielding J, et al. Health disparities and health equity: the issue is justice. Am J Public Health. 2011;101(suppl 1):S149–S155.
  • 2. Lê Cook B, McGuire TG, Zaslavsky AM. Measuring racial/ethnic disparities in health care: methods and practical issues. Health Serv Res. 2012;47(3):1232–1254.
  • 3. Jackson JW. Meaningful causal decompositions in health equity research: definition, identification, and estimation through a weighting framework. Epidemiology. 2021;32(2):282–290.
  • 4. Asada Y. A framework for measuring health inequity. J Epidemiol Community Health. 2005;59(8):700–705.
  • 5. Messer LC. Invited commentary: measuring social disparities in health—what was the question again? Am J Epidemiol. 2008;167(8):900–904.
  • 6. Harper S, King NB, Meersman SC, et al. Implicit value judgments in the measurement of health inequalities. Milbank Q. 2010;88(1):4–29.
  • 7. Lyons M, Shanning K, Stroud A. A surprising racial twist: racialized discourse in media coverage of Covid-19. TRIO McNair Schol Res J. 2020;XXI:74–82.
  • 8. Cowger TL, Davis BA, Etkins OS, et al. Comparison of weighted and unweighted population data to assess inequities in coronavirus disease 2019 deaths by race/ethnicity reported by the US Centers for Disease Control and Prevention. JAMA Netw Open. 2020;3(7):e2016933.
  • 9. Zalla LC, Martin CL, Edwards JK, et al. A geography of risk: structural racism and coronavirus disease 2019 mortality in the United States. Am J Epidemiol. 2021;190(8):1439–1446.
  • 10. Fontil V, Pacca L, Bellows BK, et al. Association of differences in treatment intensification, missed visits, and scheduled follow-up interval with racial or ethnic disparities in blood pressure control. JAMA Cardiol. 2022;7(2):204–212.
  • 11. Morgenstern H, Bursic ES. A method for using epidemiologic data to estimate the potential impact of an intervention on the health status of a target population. J Community Health. 1982;7(4):292–309.
  • 12. Jackson JW, VanderWeele TJ. Decomposition analysis to identify intervention targets for reducing disparities. Epidemiology. 2018;29(6):825–835.
  • 13. Howe CJ, Bailey ZD, Raifman JR, et al. Recommendations for using causal diagrams to study racial health disparities. Am J Epidemiol. 2022;191(12):1981–1989.
  • 14. Duan N, Meng XL, Lin JY, et al. Disparities in defining disparities: statistical conceptual frameworks. Stat Med. 2008;27(20):3941–3956.
  • 15. Hutler B. Causation and injustice: locating the injustice of racial and ethnic health disparities. Bioethics. 2022;36(3):260–266.
  • 16. Whelton PK, Carey RM, Aronow WS, et al. 2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Hypertension. 2018;71(6):e13–e115.
  • 17. Jackson JW, Hsu YJ, Greer RC, et al. The observational target trial: a conceptual model for measuring disparity [preprint]. arXiv. 2022. 10.48550/arXiv.2207.00530. Accessed September 8, 2022.
  • 18. Hong G. Ratio of mediator probability weighting for estimating natural direct and indirect effects. Presented at the Joint Statistical Meeting of the American Statistical Association, Vancouver, British Columbia, July 31–August 5, 2010.
  • 19. Hong G, Deutsch J, Hill HD. Ratio-of-mediator-probability weighting for causal mediation in the presence of treatment-by-mediator interaction. J Educ Behav Stat. 2015;40(3):307–340.
  • 20. Tchetgen Tchetgen EJ. Inverse odds ratio-weighted estimation for causal mediation analysis. Stat Med. 2013;32(26):4567–4580.
  • 21. Barsky R, Bound J, Charles KK, et al. Accounting for the black-white wealth gap: a nonparametric approach. J Am Stat Assoc. 2002;97(459):663–673.
  • 22. DiNardo J, Fortin NM, Lemieux T. Labor market institutions and the distribution of wages, 1973–1992: a semiparametric approach. Econometrica. 1996;64(5):1001–1044.
  • 23. Miles CH. On the causal interpretation of randomized interventional indirect effects [preprint]. arXiv. 2022. 10.48550/arXiv.2203.00245. Accessed January 19, 2023.
  • 24. National Center for Health Statistics. National Health and Nutrition Examination Survey. 2015-2016 Questionnaire Data. https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Questionnaire&CycleBeginYear=2015. Published September 2017. Updated February 2022. Accessed December 4, 2023.
  • 25. Mueller M, Purnell TS, Mensah GA, et al. Reducing racial and ethnic disparities in hypertension prevention and control: what will it take to translate research into practice and policy? Am J Hypertens. 2015;28(6):699–716.
  • 26. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2020.
  • 27. Ding P, VanderWeele TJ. Sensitivity analysis without assumptions. Epidemiology. 2016;27(3):368–377.
  • 28. Tchetgen Tchetgen EJ, Ying A, Cui Y, et al. An introduction to proximal causal learning [preprint]. arXiv. 2020. 10.48550/arXiv.2009.10982. Accessed January 19, 2023.
  • 29. Blinder AS. Wage discrimination: reduced form and structural estimates. J Hum Resour. 1973;8(4):436–455.
  • 30. Oaxaca R. Male-female wage differentials in urban labor markets. Int Econ Rev. 1973;14(3):693–709.
  • 31. VanderWeele TJ, Vansteelandt S, Robins JM. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology. 2014;25(2):300–306.
  • 32. Zheng W, van der Laan MJ. Targeted maximum likelihood estimation of natural direct effects. Int J Biostat. 2012;8(1):1–40.
  • 33. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods. 2013;18(2):137–150.
  • 34. Breen R, Karlson KB, Holm A. Total, direct, and indirect effects in logit and probit models. Sociol Methods Res. 2013;42(2):164–191.

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
