American Journal of Epidemiology. 2019 May 20;188(9):1674–1681. doi: 10.1093/aje/kwz121

Methodological Challenges When Studying Distance to Care as an Exposure in Health Research

Ellen C Caniglia 1,2, Rebecca Zash 3, Sonja A Swanson 2,4, Kathleen E Wirth 5, Modiegi Diseko 6, Gloria Mayondi 6, Shahin Lockman 6,7,8, Mompati Mmalane 6, Joseph Makhema 6, Scott Dryden-Peterson 6,7,8, Kalé Z Kponee-Shovein 2, Oaitse John 6, Eleanor J Murray 2, Roger L Shapiro 3,6,8
PMCID: PMC6735874  PMID: 31107529

Abstract

Distance to care is a common exposure and proposed instrumental variable in health research, but it is vulnerable to violations of fundamental identifiability conditions for causal inference. We used data collected from the Botswana Birth Outcomes Surveillance study between 2014 and 2016 to outline 4 challenges and potential biases when using distance to care as an exposure and as a proposed instrument: selection bias, unmeasured confounding, lack of sufficiently well-defined interventions, and measurement error. We describe how these issues can arise, and we propose sensitivity analyses for estimating the degree of bias.

Keywords: causal diagrams, causal inference, distance to care, identifiability conditions, instrumental variables, selection bias, unmeasured confounding, well-defined interventions


Editor’s note: An invited commentary on this article appears on page 000.

The distance an individual travels from their home to receive medical care, referred to as distance to care, is an important and frequently used construct in health services research. In particular, distance to care is often evaluated in relationship to infant and child mortality in low- and middle-income countries (1–5). Distance to care is also proposed as an instrument for estimating effects of other exposures, such as the receipt or intensity of a medical treatment or procedure (6–11), on a health outcome. In many of these studies, the explicit or implicit goal is to inform public health decision-making, a goal that relies on the study results providing valid estimates of causal effects (12).

However, when distance to care is used as an exposure or proposed as an instrument, a number of possible biases exist that might violate fundamental identifiability conditions for causal inference, namely, exchangeability and consistency (13). First, nonexchangeability due to selection bias can occur when distance to care is associated with selection into the study, as is often the case when participant recruitment occurs at sites offering care. Second, nonexchangeability due to unmeasured or residual confounding by variables such as socioeconomic status is likely. Third, distance to care might not correspond to a sufficiently well-defined intervention, an arguably critical component of consistency (14–17), affecting interpretability and potentially inducing bias. Fourth, measurement error for distance can compound these problems, complicating exchangeability and undermining attempts at specifying a well-defined intervention. In this paper, we use an example from a birth outcomes surveillance study in Botswana to illustrate these potential biases when using distance to care as an exposure or a proposed instrument. We introduce the case study, describe potential issues when using distance to care as an exposure or proposed instrument, and propose sensitivity analyses.

CASE STUDY: BOTSWANA BIRTH OUTCOMES SURVEILLANCE STUDY

The Botswana Birth Outcomes Surveillance study is a surveillance study of >75,000 infant livebirths and stillbirths at 8 geographically representative major delivery centers in Botswana offering high levels of care, representing approximately 45% of all births in the country between 2014 and 2016. Data are collected from obstetrical cards at the time of discharge from the postnatal ward (18, 19), including adverse birth outcomes and the first antenatal care clinic visited during pregnancy, a proxy for a woman’s home village.

The high incidence of adverse birth outcomes in Botswana has been previously reported in this cohort using data from 2009 to 2011 (18). The overall risk of stillbirth (defined as fetal death at ≥24 weeks’ gestation with an Apgar score of 0,0,0) was 3.3%, the overall risk of preterm delivery (delivery at <37 weeks’ gestation) was 19.6%, and the overall risk of the infant being small for gestational age (<10th percentile according to World Health Organization norms (20, 21)) was 13.5% (18). Greater access to care during pregnancy could reduce the risk of adverse birth outcomes by facilitating receipt of treatment for infections, diabetes, hypertension, anemia, and other pregnancy complications, which are known risk factors for adverse birth outcomes (18, 22–25). Greater access to care also increases access to HIV medication, reducing HIV transmission from mother to child (26). In addition, shorter distances to care decrease the likelihood that a woman will deliver on the way to the delivery center or at home, which in turn could decrease the risk of stillbirth, neonatal death, and maternal mortality (22, 27).

Our goal was to estimate the effect of distance from the first antenatal-care clinic visited during pregnancy to the nearest major delivery center on the risk of stillbirth. We used driving distance, calculated using the Google Distance Matrix API, as our primary exposure (dichotomized as ≤100 km vs. >100 km to facilitate a clear presentation of key issues). In Botswana, areas >100 km from a major delivery center can be considered very remote. Our analysis was restricted to 51,558 pregnant women who attended 1 or more antenatal-care clinic visits within the first 28 weeks of pregnancy.

Web Figure 1 (available at https://academic.oup.com/aje) shows a heat map of the risk of stillbirth in Botswana. Table 1 shows the observed association between distance from any major delivery center and stillbirth on the risk ratio scale (28) (Web Table 1 shows results using other categorizations and definitions of distance). We present unadjusted results to facilitate discussion of key biases. Below, we describe 4 key issues that prevent endowing the associational risk ratio with a causal interpretation.

Table 1.

Unadjusted Association Between Distance to Delivery Center and Risk of Stillbirth in Botswana, 2014–2016

| Distance | No. of Individuals | Stillbirths, No. | Stillbirths, % | Unadjusted RR | 95% CI |
|---|---|---|---|---|---|
| ≤100 km | 47,258 | 1,050 | 2.22 | 1.00 | Referent |
| >100 km | 4,300 | 119 | 2.77 | 1.25 | 1.03, 1.50 |

Abbreviations: CI, confidence interval; RR, risk ratio.
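The unadjusted estimate in Table 1 can be reproduced from the counts shown. The sketch below assumes a Wald-type confidence interval on the log risk ratio scale, which is one standard choice; the text does not state which interval method was actually used, and the function name `risk_ratio_ci` is illustrative only.

```python
import math


def risk_ratio_ci(events_exposed, n_exposed, events_ref, n_ref, z=1.96):
    """Risk ratio with a Wald-type CI computed on the log scale.

    Assumption: a large-sample standard error for log(RR); this may
    differ from the interval method used to produce Table 1.
    """
    rr = (events_exposed / n_exposed) / (events_ref / n_ref)
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_ref - 1 / n_ref
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts from Table 1: >100 km (exposed) vs. <=100 km (referent).
rr, lower, upper = risk_ratio_ci(119, 4300, 1050, 47258)
print(f"RR = {rr:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
# prints "RR = 1.25, 95% CI: 1.03, 1.50"
```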

POTENTIAL ISSUES WHEN USING DISTANCE TO CARE AS AN EXPOSURE

Issue 1: selection bias

An analysis of the association between distance to care and a health outcome will necessarily be restricted to individuals selected into the study, but this can introduce selection bias. Selection bias can occur when distance to care is associated with selection into the study and when there are shared causes of selection and the outcome (29). Selection could occur by design if study recruitment occurs at health clinics, hospitals, or other sites offering health services. For example, individuals living closer to study sites are more likely to be recruited into the study. Even if study recruitment does not occur at sites offering health services, selection could be implicit, because individuals living in rural areas are less likely to be captured in health research studies. In the extreme case where no individuals in certain remote areas are recruited into the study, an assessment of distance to care could violate the positivity condition (13).

The structure of this bias is depicted in the causal diagram in Figure 1. Estimating the causal effect of distance on the health outcome by adjusting for shared causes (if information on these variables is available) of selection and the outcome via standard methods such as stratification, matching, restriction, or regression requires the strong assumption that distance does not affect any of those shared causes. Inverse-probability-of-censoring weighting requires collecting data on or making assumptions about individuals who were not enrolled in the study. Selection bias violates the exchangeability condition for causal inference because distance to care and the counterfactual outcome are no longer independent (13).

Figure 1.

Causal diagram including distance, outcome (in our case study, stillbirth), selection, and a shared cause U of selection and outcome. Selection bias might occur via conditioning on selection into the study, a collider on the path following distance, selection, U, and outcome. In our case study, U was measured among those included in the study but not among those not included.

In the Botswana birth outcomes surveillance study, recruitment into the surveillance study occurred exclusively at the 8 major delivery hospitals at the time of delivery. We hypothesized that women who live close to a major delivery hospital or have complicated pregnancies will be more likely to deliver at one of the hospitals included in the surveillance study. Women in rural areas experiencing complicated pregnancies might be referred to deliver at one of the 8 hospitals because they are large maternity facilities with high levels of care. Because women experiencing complicated pregnancies have a higher risk of stillbirth, the selection process will make it appear as if women who live in rural areas have a higher risk of stillbirth. In other words, our study could oversample women from rural areas who have complicated pregnancies and are more likely to have a stillbirth. Adjustment for pregnancy complications via standard methods would be inappropriate in this example because distance to care could affect pregnancy complications through its impact on access to medications or antenatal care services.

Sensitivity analyses can be used to quantify the range of plausible magnitudes of selection bias. If all shared causes of selection and the outcome were measured and the conditional probabilities of selection given these shared causes were known, inverse-probability weighting could be used to obtain an unbiased estimate of the effect of distance on the outcome. Because these probabilities are not known, we can vary a series of selection probabilities to estimate a range of plausible estimates. Suppose we assume the only shared cause of selection and the outcome is pregnancy complications (U), which was measured among women included in the study. We can assign values to the 4 conditional probabilities Pr[Selection=1|Distance=distance,U=u], assign weights to each individual included in the study based on their values of distance and pregnancy complications (yes/no), estimate the effect of distance on the outcome in the weighted pseudopopulation, and repeat this process to obtain a range of estimates. Table 2 shows results from 2 of these sensitivity analyses: an extreme case where the probabilities are selected to yield a risk ratio close to 1 and a realistic case that reflects what we believe to be reasonable values for each probability. Other sensitivity analyses for selection bias are available that vary the probability of selection based on the exposure and outcome (30) rather than the exposure and shared causes of the exposure and outcome.

Table 2.

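Adjusted Association Between Distance to Delivery Center and Risk of Stillbirth Under Different Selection Probabilities in Botswana, 2014–2016 [a]

| Estimate | Distance | RR | 95% CI |
|---|---|---|---|
| Unadjusted estimate | ≤100 km | 1.00 | Referent |
| Unadjusted estimate | >100 km | 1.25 | 1.03, 1.50 |
| Adjusted estimate, extreme case [b] | ≤100 km | 1.00 | Referent |
| Adjusted estimate, extreme case [b] | >100 km | 0.98 | 0.80, 1.21 |
| Adjusted estimate, realistic case [c] | ≤100 km | 1.00 | Referent |
| Adjusted estimate, realistic case [c] | >100 km | 1.12 | 0.92, 1.37 |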

Abbreviations: CI, confidence interval; RR, risk ratio.

a We used robust variance estimators that take into account the procedure of weight estimation to compute 95% CIs (46). A calculator and directions to compute these adjusted effect estimates can be found online (44, 45) and in Web Table 2. Web Table 3 shows results from sensitivity analyses based on a wider range of selection probabilities.

b Proposed probabilities of selection: ≤100 km, no complications: 0.8; ≤100 km, complications: 0.4; >100 km, no complications: 0.1; >100 km, complications: 0.7.

c Proposed probabilities of selection: ≤100 km, no complications: 0.55; ≤100 km, complications: 0.65; >100 km, no complications: 0.15; >100 km, complications: 0.6.
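The weighting step of this sensitivity analysis can be sketched as follows. The selection probabilities are the "realistic case" values from footnote c. The stratum counts are hypothetical: the marginal totals match Table 1, but the split by pregnancy complications is invented for illustration, so the printed risk ratio is not expected to match Table 2. The function name `weighted_risk_ratio` is also illustrative.

```python
def weighted_risk_ratio(counts, selection_prob):
    """Risk ratio (>100 km vs. <=100 km) in the weighted pseudopopulation.

    counts[(far, complicated)] = (n_selected, n_stillbirths)
    selection_prob[(far, complicated)] = assumed Pr[Selection=1 | Distance, U]
    Each selected woman is weighted by 1 / Pr[Selection=1 | Distance, U].
    """
    risks = {}
    for far in (False, True):
        weighted_n = 0.0
        weighted_events = 0.0
        for complicated in (False, True):
            n, events = counts[(far, complicated)]
            weight = 1.0 / selection_prob[(far, complicated)]
            weighted_n += weight * n
            weighted_events += weight * events
        risks[far] = weighted_events / weighted_n
    return risks[True] / risks[False]


# HYPOTHETICAL stratum counts; only the marginal totals come from Table 1.
hypothetical_counts = {
    (False, False): (40000, 600),  # <=100 km, no complications
    (False, True): (7258, 450),    # <=100 km, complications
    (True, False): (2800, 42),     # >100 km, no complications
    (True, True): (1500, 77),      # >100 km, complications
}

# "Realistic case" selection probabilities from footnote c of Table 2.
realistic_prob = {
    (False, False): 0.55,
    (False, True): 0.65,
    (True, False): 0.15,
    (True, True): 0.60,
}

print(round(weighted_risk_ratio(hypothetical_counts, realistic_prob), 2))
```

Repeating the call over a grid of selection probabilities gives the range of estimates described above. With equal selection probabilities in all 4 strata, the weights cancel and the function returns the unadjusted risk ratio.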

Issue 2: unmeasured confounding for distance to care

No unmeasured confounding (conditional exchangeability for the exposure) is a fundamental identifiability assumption for causal inference (13). Given the set of measured covariates, we assume individuals living within some radius of care would have the same outcome distribution as individuals living outside that same radius of care, had those inside actually lived outside that radius, and vice versa. An unmeasured shared cause of distance and the outcome violates the assumption of conditional exchangeability, as shown in Figure 2. For example, distance to care is often confounded by socioeconomic status (SES), a variable for which adjustment is very difficult. Both individual-level factors (such as education, income, and work status) and neighborhood-level factors (such as average income in the neighborhood) should be considered when measuring SES. In addition, where an individual lives is intertwined with family, financial, social, political, community, and structural conditions that are often too complex and poorly understood to measure accurately.

Figure 2.

Causal diagram including distance, outcome (in our case study, stillbirth), and an unmeasured shared cause U of distance and outcome. Confounding might occur via the path following distance, U, and outcome.

In Botswana, as in most settings, SES is linked to geographic location, access to care, and the risk of adverse birth outcomes such as stillbirth (18). Educational attainment and working status are measured but might not be accurate proxies of SES in this setting (in fact, adjustment for these variables does not meaningfully change our estimates).

Sensitivity analyses for unmeasured and unknown confounding are available to calculate a range of effect estimates in the presence of an unmeasured confounder. Table 3 shows estimates adjusted for unmeasured SES under an extreme and a realistic scenario, hypothesizing that lower SES is positively correlated with greater distances and higher stillbirth risk. The extreme case selects values that yield a risk ratio close to 1, and the realistic case reflects what we believe are reasonable values for each parameter. We use the bias formula $B = RR_{UD} \times RR_{EU} / (RR_{UD} + RR_{EU} - 1)$, where $RR_{UD}$ is the risk ratio for the outcome comparing the 2 levels of the unmeasured confounder SES within either treatment group, and $RR_{EU}$ is the risk ratio for the unmeasured confounder SES comparing those with and without treatment (31).

Table 3.

Adjusted Association Between Distance to Delivery Center and Risk of Stillbirth Under Different Assumptions About Unmeasured Socioeconomic Status in Botswana, 2014–2016 [a]

| Estimate | Distance | RR | 95% CI |
|---|---|---|---|
| Unadjusted estimate | ≤100 km | 1.00 | Referent |
| Unadjusted estimate | >100 km | 1.25 | 1.03, 1.50 |
| Adjusted estimate, extreme case [b] | ≤100 km | 1.00 | Referent |
| Adjusted estimate, extreme case [b] | >100 km | 0.94 | 0.77, 1.13 |
| Adjusted estimate, realistic case [c] | ≤100 km | 1.00 | Referent |
| Adjusted estimate, realistic case [c] | >100 km | 1.17 | 0.96, 1.40 |

Abbreviations: CI, confidence interval; RR, risk ratio; SES, socioeconomic status.

a The calculator and directions for use can be found online (44, 45). Web Table 4 shows results from sensitivity analyses based on a wider range of scenarios.

b Proposed ratio by which low SES increases stillbirth: $RR_{UD} = 2$. Proposed ratio by which low SES differs by distance (>100 km vs. ≤100 km): $RR_{EU} = 2$. Bias = 1.33.

c Proposed ratio by which low SES increases stillbirth: $RR_{UD} = 1.25$. Proposed ratio by which low SES differs by distance (>100 km vs. ≤100 km): $RR_{EU} = 1.5$. Bias = 1.07.
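The bias values in footnotes b and c, and the adjusted point estimates in Table 3, follow from dividing the observed risk ratio by the bias factor. A minimal sketch of that arithmetic is shown below; the function names are illustrative, and confidence limits are omitted because the text does not describe how they were rounded.

```python
def bias_factor(rr_ud, rr_eu):
    """Bias factor B = RR_UD * RR_EU / (RR_UD + RR_EU - 1)."""
    return rr_ud * rr_eu / (rr_ud + rr_eu - 1)


def adjusted_rr(observed_rr, rr_ud, rr_eu):
    """Observed risk ratio divided by the bias factor."""
    return observed_rr / bias_factor(rr_ud, rr_eu)


observed = 1.25  # unadjusted RR from Table 1

# Extreme case (footnote b): RR_UD = 2, RR_EU = 2
print(f"{bias_factor(2, 2):.2f}", f"{adjusted_rr(observed, 2, 2):.2f}")
# prints "1.33 0.94"

# Realistic case (footnote c): RR_UD = 1.25, RR_EU = 1.5
print(f"{bias_factor(1.25, 1.5):.2f}", f"{adjusted_rr(observed, 1.25, 1.5):.2f}")
# prints "1.07 1.17"
```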

The minimum strength of association between any unmeasured confounder and both the treatment and outcome that would fully explain away the treatment-outcome association, RR, is estimated using the formula $E\text{-value} = RR + \sqrt{RR \times (RR - 1)}$ (31). In our example, the E-value is 1.81, meaning the observed risk ratio of 1.25 could be explained away by an unmeasured confounder, such as SES, that was associated with both distance to care and stillbirth by a risk ratio of at least 1.81 each, above and beyond any measured confounders.
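The E-value for the observed risk ratio can be checked directly. In the sketch below, the handling of risk ratios below 1 (taking the reciprocal first) is the usual convention for E-values and is an addition here; the text only applies the formula to a risk ratio above 1.

```python
import math


def e_value(rr):
    """E-value = RR + sqrt(RR * (RR - 1)) for a risk ratio RR >= 1.

    Assumption: a risk ratio below 1 is inverted before the formula
    is applied, following the usual E-value convention.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


print(f"{e_value(1.25):.2f}")  # prints "1.81"
```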

Issue 3: lack of sufficiently well-defined interventions

A sufficiently well-defined intervention, in which no meaningful vagueness remains in the intervention’s definition (all components of the intervention that could affect the counterfactual outcome have been specified), is a critical component of consistency, another fundamental identifiability assumption for causal inference (13, 32). An ill-defined intervention corresponds to multiple versions of treatment (14) and is problematic for 2 reasons. First, when an intervention has multiple versions of treatment, we cannot identify and measure the covariates necessary to achieve conditional exchangeability. Second, an intervention with multiple versions of treatment usually will not correspond to one target randomized trial. This poses a problem for decision-makers, because if an association is found between the exposure and outcome, even if we could somehow be convinced that conditional exchangeability was plausible, it will be unclear what intervention would have a similar effect (33). In fact, it is possible that some versions of treatment cause harm while others are protective.

For many research questions, distance to care corresponds to an ill-defined intervention. Intervening on distance to care could mean moving an individual (e.g., providing transportation or incentives) or could mean moving the care (e.g., building more hospitals). An intervention on distance to care should also specify the timing of the intervention (e.g., before pregnancy, during a specific trimester, during labor). These interventions could all have different confounding structures and different impacts on adverse birth outcomes or other health outcomes. This means that any attempt to estimate an effect based on a single measure of distance to care from a real data set (e.g., the main result in Table 1) is at best capturing a peculiar weighted average of all these different interventions’ possible impact and at worst biased due to improper control of confounding for each of these possible interventions (34).

Imagine 3 individuals who live within a 100-km travel distance from the nearest major delivery center. Individual 1 recently moved from a small village to a large city and lives 5 km from a major hospital. Individual 2 lives in a village 25 km from a new hospital that was built 1 year ago. Individual 3 lives in a rural area, but a new road was just built so that a hospital is now 90 km from her house. The data from the 3 individuals correspond to 3 versions of the treatment “live within a 100-km travel distance from the nearest major delivery center.” We could map the data from these individuals to 3 different interventions: 1) move individuals so they live within (or exactly) 5 km from a major hospital; 2) build hospitals so that each individual lives within (or exactly) 25 km from a hospital; and 3) build more roads so that each individual lives within (or exactly) 90 km from a hospital. All 3 individuals live within 100 km from the nearest major delivery center, but the shared causes of distance and the outcome could be very different for each of these versions of treatment. Neighborhood economic growth could be a strong predictor of building more roads, whereas increased educational attainment could be a strong predictor of moving to an urban area—that is, economic growth and educational attainment could either be causes of or share common causes with building roads and/or moving to an urban area.

The causal diagram in Figure 3A depicts multiple versions of the treatment “travel distance to care”, where Distance(v1), Distance(v2), and Distance(v3) represent the 3 potential versions of treatment and L(v1), L(v2), and L(v3) denote the 3 sets of (potentially overlapping) confounders for each version of treatment. The diagram is simplified in Figure 3B by collapsing the 3 versions of distance into a single variable Distance(v) and collapsing the confounders into a single variable L(v). We assume that Distance(v) causes Distance rather than Distance causing Distance(v). By estimating the association between distance (“live within a 100-km travel distance from the nearest major delivery center”) and the outcome, we are measuring the impact of a weighted average of many versions of treatment Distance(v) for which we have not measured the potential confounders L(v) and for which the weights are unknown. Note that even if we treated distance as a continuous variable with ≥100 instead of 2 categories of treatment, the problem described here would only be partially ameliorated, because the versions of treatment for a continuous distance are still unknown and might have different confounders.

Figure 3.

Causal diagram including distance, outcome (in our case study, stillbirth), versions of distance Distance(v), and shared causes L(v) of Distance(v) and outcome. Confounding might occur via the path following Distance(v), L(v), and outcome. A) Three versions of distance and 3 shared causes for each version of distance and the outcome. B) A simplified causal diagram including 1 node for versions of distance and 1 node for shared causes of versions of distance and the outcome.

Many versions of the comparator treatment “live a >100-km travel distance from the nearest major delivery center” also exist. In Botswana, some women live more than 100 km from the nearest major delivery hospital but stay with a relative near the delivery center later in pregnancy, a practice that also exists globally. Some women might have attended their first antenatal care visit at a different clinic from where they usually receive antenatal care. Finally, some women who live >100 km from the nearest major delivery center might live close to a smaller delivery center or other location where a safe birth could occur. These versions of the treatment “live a >100-km travel distance from the nearest major delivery center” could also have different confounding structures and different impacts on adverse health outcomes.

Issue 4: measurement error and misclassification

Separate from the question of whether we could define an exposure that corresponded to a sufficiently well-defined intervention for distance to care (issue 3), the validity of our results could still be affected by measurement error and misclassification. Suppose, for example, that we defined an intervention of interest based on reducing travel time. Accurate calculation of travel time requires high-quality information on the location, quality, and congestion of roads and on available modes of transportation, and such high-quality information might not be available in one’s data set. Measurement error for travel time could be a greater issue in more rural areas; likewise, dichotomizing distance can lead to misclassification when the underlying variable is measured with error (35).

POTENTIAL ISSUES WHEN PROPOSING DISTANCE TO CARE AS AN INSTRUMENT IN AN INSTRUMENTAL VARIABLE ANALYSIS

Distance to care is also often proposed as an instrument in instrumental variable analyses aimed at estimating effects of another exposure, such as receipt or intensity of certain treatments or medical procedures, on a health outcome. The use of distance to care as a proposed instrument dates back to the first published instrumental variable analysis paper in epidemiology (8). McClellan et al. (8) proposed differential distance to alternative types of hospitals as an instrument for use of intensive treatments for acute myocardial infarction because it was associated with intensive treatment use and hypothesized not to have a direct effect on or share any causes with survival. The primary advantage of an instrumental variable analysis is that it does not rely on the strong assumption of no unmeasured confounding for the treatment and the outcome. Briefly, for an instrument to be valid it must meet 3 conditions: 1) the instrument and exposure must be associated; 2) the instrument must affect the outcome only through its potential effect on the treatment; and 3) the instrument and the outcome must not share any causes. A fourth condition, effect homogeneity, is required in order to estimate an average causal effect (36). There are several resources describing the strengths and limitations of instrumental variable methods in observational studies in general (e.g., Swanson and Hernán (37)) and specifically for distance to care as a proposed instrument (6–11). Here, we do not attempt to provide an exhaustive discussion of issues related to distance to care as a proposed instrument. Rather, we restrict our discussion to the 4 issues raised above, now revisited in the context of instrumental variable analyses.

Distance to the nearest delivery hospital (e.g., travel distance in kilometers) is used as a proxy for the causal instrument access to care. A causal instrument is one that has an effect on (rather than sharing causes with) the treatment or exposure (38). Figure 4 shows the standard instrumental variable causal diagram with a noncausal instrument Distance and arbitrary treatment or exposure. We make one important change to the standard diagram: We use Distance(v) rather than “access to care” as our proposed causal instrument. As described previously, distance to care is often ill-defined and might correspond to several versions of an intervention on distance. Access to care is similarly ill-defined. We conceptualize Distance(v) as one potential version of Distance and assume it directly affects exposure through access to care.

Figure 4.

Causal diagram including a proposed proxy instrument distance, proposed causal instrument Distance(v), exposure, outcome (in our case study, stillbirth), and unmeasured shared cause U of exposure and outcome. (Note that this U might differ from the U in Figures 1 and 2.)

Selection bias in an instrumental variable analysis can occur when distance to care is associated with selection into the study and when there are shared causes of selection and the outcome. In an instrumental variable analysis, this implies a lack of exchangeability in the selected sample for the proposed instrument with respect to the outcome. The structure of this bias is depicted in Figure 5.

Figure 5.

Causal diagram including proposed proxy instrument distance, proposed causal instrument Distance(v), exposure, outcome (in our case study, stillbirth), selection, and unmeasured shared cause U of exposure, outcome, and selection. Selection bias might occur via conditioning on selection into the study, a collider on the path following distance, Distance(v), selection, U, and outcome.

Even though we are not interested in estimating the causal effect of distance to care, an instrumental variable analysis still relies on the strong assumption of no unmeasured confounding for the instrument, distance to care, and the outcome. If we are concerned that an estimate of the effect of exposure on the outcome will be confounded by SES, we might choose to conduct an instrumental variable analysis. However, if SES is also a confounder for the instrument-outcome association, the results of the instrumental variable analysis will also be biased, as shown in Figure 6. In fact, because the magnitude of this bias can be amplified by a weak instrument, the instrumental variable analysis could be more biased than the noninstrumental variable analysis (39).
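The amplification by a weak instrument can be seen from the standard ratio (Wald) form of the instrumental variable estimand. The sketch below uses notation introduced here for illustration only: $Z$ is a dichotomized distance, $A$ the exposure, $Y$ the outcome, $\beta$ a constant additive effect of $A$ on $Y$, $\Delta_A = E[A \mid Z=1] - E[A \mid Z=0]$ the strength of the instrument, and $\delta$ the part of the instrument-outcome association that is due to confounding rather than to the exposure. It assumes an additive scale and a homogeneous effect.

```latex
\[
\frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[A \mid Z=1] - E[A \mid Z=0]}
  = \frac{\beta\,\Delta_A + \delta}{\Delta_A}
  = \beta + \frac{\delta}{\Delta_A}
\]
```

Under these assumptions the bias is $\delta / \Delta_A$. The same confounded association of $\delta = 0.01$ yields a bias of 0.02 when $\Delta_A = 0.5$ but a bias of 0.2 when $\Delta_A = 0.05$, so a weaker instrument magnifies a fixed amount of instrument-outcome confounding.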

Figure 6.

Causal diagram including proposed proxy instrument distance, proposed causal instrument Distance(v), exposure, outcome (in our case study, stillbirth), unmeasured shared cause U1 of Distance(v) and outcome, and unmeasured shared cause U2 of exposure and outcome. Confounding might occur via the path following distance, Distance(v), U1, and outcome. Note that U1 and U2 might not be distinct.

Instruments are not required per se to be well-defined interventions. However, an ill-defined instrument can affect the interpretation of our estimates, especially when estimating a local average treatment effect (38, 40, 41). In our example, under an additional monotonicity condition, such a local average treatment effect would correspond to a weighted average of the effect of exposure across all study participants where the weights are unknown to us and related to how the underlying unmeasured causal instrument affects exposure (36). The same complexities of interpretation can arise when the proposed instrument is measured with error. Moreover, if the measurement error is dependent or differential, the proposed instrument might no longer satisfy the instrumental conditions.

CONCLUSION

We have focused on challenges with estimating causal effects of distance to care on health outcomes and with using distance to care as a proposed instrument. These challenges apply to many other epidemiologic exposures, especially those for which no gold-standard measurement or clear intervention exists, such as environmental exposures. When the goal is estimating causal effects of these exposures, care should be taken to evaluate potential sources of bias and identify sufficiently well-defined interventions corresponding to the exposure.

Knowledge of the setting is critical to determine what interventions are feasible and appropriate and what hypothetical intervention most closely corresponds to the exposure being assessed. In the example of distance to care, realistic interventions might include building more hospitals, building more roads, or helping individuals to more easily access medical care. As distance to care relates to pregnancy outcomes, several public health interventions to increase access to care for pregnant women in the rural United States have received recent media attention. In rural Alaska, women are invited to stay at a prematernal home near a delivery hospital starting 30 days prior to their delivery date in an attempt to reduce a markedly high risk of maternal death, especially among native women (42). In Missouri, the shrinking number of hospitals offering obstetrical care disproportionately affects patients on Medicare, Medicaid, or without insurance, such that one woman had to travel nearly 100 miles and 4 hours to deliver her premature twins after the only hospital in her county closed (43). While not explicit, these articles propose building prematernal homes and preventing hospitals from closing as context-specific interventions to reduce maternal and infant mortality.

Unmeasured confounding concerns receive much attention in epidemiology (31), but selection bias can often be more insidious because the direction and magnitude of bias are often difficult to assess. In general, inverse-probability weighting can be used to adjust for selection bias but relies on estimating the conditional probability of selection given exposure and covariates. If these probabilities are not known, investigators should consider sensitivity analyses to quantify the magnitude of selection bias. Our relatively simple sensitivity analysis has key limitations, including lack of certainty about the true selection probabilities and evaluating only one measured shared cause of selection and the outcome. However, the exercise can be useful to gain intuition about the uncertainty around the estimates and circumstances under which the observed association could be completely explained by bias. A tool to calculate these simple sensitivity analyses is available online (44, 45).

Finally, estimating causal effects is not the only goal of public health research (12). Distance to care, and other similar exposures, can and should be used as a prediction tool for identifying high-risk populations and as a hypothesis-generating tool for understanding local and regional differences in health outcomes. When investigators are interested in estimating the population health impact of building new hospitals or other geospatial-related interventions, distance to care might serve as a useful exposure, but selection bias, unmeasured confounding, measurement error, and specification of the intervention of interest should be considered in the data collection, study design, and analysis stages.

Supplementary Material

Web Material

ACKNOWLEDGMENTS

Author affiliations: Department of Population Health, New York University School of Medicine, New York, New York (Ellen C. Caniglia); Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts (Ellen C. Caniglia, Sonja A. Swanson, Kalé Z. Kponee-Shovein, Eleanor J. Murray); Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts (Rebecca Zash, Roger L. Shapiro); Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands (Sonja A. Swanson); Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts (Kathleen E. Wirth); Botswana-Harvard AIDS Institute Partnership, Gaborone, Botswana (Modiegi Diseko, Gloria Mayondi, Shahin Lockman, Mompati Mmalane, Joseph Makhema, Scott Dryden-Peterson, Oaitse John, Roger L. Shapiro); Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts (Shahin Lockman, Scott Dryden-Peterson); and Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts (Shahin Lockman, Scott Dryden-Peterson, Roger L. Shapiro).

This work was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (grants R01 HD0804701, R01 HD095766, and K24 AI131924) and the National Institute of Allergy and Infectious Diseases (grant T32 AI007433).

Conflict of interest: none declared.

Abbreviations

SES, socioeconomic status

REFERENCES

  • 1. Okwaraji YB, Edmond KM. Proximity to health services and child survival in low- and middle-income countries: a systematic review and meta-analysis. BMJ Open. 2012;2(4):e001196.
  • 2. Lohela TJ, Campbell OM, Gabrysch S. Distance to care, facility delivery and early neonatal mortality in Malawi and Zambia. PLoS One. 2012;7(12):e52110.
  • 3. Kadobera D, Sartorius B, Masanja H, et al. The effect of distance to formal health facility on childhood mortality in rural Tanzania, 2005–2007. Glob Health Action. 2012;5:1–9.
  • 4. Hanson C, Cox J, Mbaruku G, et al. Maternal mortality and distance to facility-based obstetric care in rural southern Tanzania: a secondary analysis of cross-sectional census data in 226 000 households. Lancet Glob Health. 2015;3(7):e387–e395.
  • 5. Scott S, Chowdhury ME, Pambudi ES, et al. Maternal mortality, birth with a health professional and distance to obstetric care in Indonesia and Bangladesh. Trop Med Int Health. 2013;18(10):1193–1201.
  • 6. Rassen JA, Brookhart MA, Glynn RJ, et al. Instrumental variables I: instrumental variables exploit natural variation in nonexperimental data to estimate causal relationships. J Clin Epidemiol. 2009;62(12):1226–1232.
  • 7. Brookhart MA, Rassen JA, Schneeweiss S. Instrumental variable methods in comparative safety and effectiveness research. Pharmacoepidemiol Drug Saf. 2010;19(6):537–554.
  • 8. McClellan M, McNeil BJ, Newhouse JP. Does more intensive treatment of acute myocardial infarction in the elderly reduce mortality? Analysis using instrumental variables. JAMA. 1994;272(11):859–866.
  • 9. Brooks JM, Irwin CP, Hunsicker LG, et al. Effect of dialysis center profit-status on patient survival: a comparison of risk-adjustment and instrumental variable approaches. Health Serv Res. 2006;41(6):2267–2289.
  • 10. McConnell KJ, Newgard CD, Mullins RJ, et al. Mortality benefit of transfer to level I versus level II trauma centers for head-injured patients. Health Serv Res. 2005;40(2):435–457.
  • 11. Pracht EE, Tepas JJ 3rd, Langland-Orban B, et al. Do pediatric patients with trauma in Florida have reduced mortality rates when treated in designated trauma centers? J Pediatr Surg. 2008;43(1):212–221.
  • 12. Hernán MA. The C-word: scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108(5):616–619.
  • 13. Hernán MA, Robins JM. Causal Inference. Forthcoming ed. Boca Raton, FL: Chapman & Hall/CRC; 2019.
  • 14. Hernán MA, VanderWeele TJ. Compound treatments and transportability of causal inference. Epidemiology. 2011;22(3):368–377.
  • 15. VanderWeele TJ. Concerning the consistency assumption in causal inference. Epidemiology. 2009;20(6):880–883.
  • 16. Cole SR, Frangakis CE. The consistency statement in causal inference: a definition or an assumption? Epidemiology. 2009;20(1):3–5.
  • 17. Pearl J. On the consistency rule in causal inference: axiom, definition, assumption, or theorem? Epidemiology. 2010;21(6):872–875.
  • 18. Chen JY, Ribaudo HJ, Souda S, et al. Highly active antiretroviral therapy and adverse birth outcomes among HIV-infected women in Botswana. J Infect Dis. 2012;206(11):1695–1705.
  • 19. Zash R, Souda S, Chen JY, et al. Reassuring birth outcomes with tenofovir/emtricitabine/efavirenz used for prevention of mother-to-child transmission of HIV in Botswana. J Acquir Immune Defic Syndr. 2016;71(4):428–436.
  • 20. Villar J, Cheikh Ismail L, Victora CG, et al. International standards for newborn weight, length, and head circumference by gestational age and sex: the Newborn Cross-Sectional Study of the INTERGROWTH-21st Project. Lancet. 2014;384(9946):857–868.
  • 21. Villar J, Giuliani F, Fenton TR, et al. INTERGROWTH-21st very preterm size at birth reference charts. Lancet. 2016;387(10021):844–845.
  • 22. Lawn JE, Blencowe H, Waiswa P, et al. Stillbirths: rates, risk factors, and acceleration towards 2030. Lancet. 2016;387(10018):587–603.
  • 23. Goldenberg RL, Culhane JF, Iams JD, et al. Epidemiology and causes of preterm birth. Lancet. 2008;371(9606):75–84.
  • 24. Vogel JP, Chawanpaiboon S, Moller AB, et al. The global epidemiology of preterm birth. Best Pract Res Clin Obstet Gynaecol. 2018;52:3–12.
  • 25. Muhihi A, Sudfeld CR, Smith ER, et al. Risk factors for small-for-gestational-age and preterm births among 19,269 Tanzanian newborns. BMC Pregnancy Childbirth. 2016;16:110.
  • 26. Padian NS, McCoy SI, Karim SS, et al. HIV prevention transformed: the new prevention research agenda. Lancet. 2011;378(9787):269–278.
  • 27. de Bernis L, Kinney MV, Stones W, et al. Stillbirths: ending preventable deaths by 2030. Lancet. 2016;387(10019):703–716.
  • 28. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol. 2005;162(3):199–200.
  • 29. Hernán MA. Invited commentary: selection bias without colliders. Am J Epidemiol. 2017;185(11):1048–1050.
  • 30. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. New York, NY: Springer New York; 2009.
  • 31. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann Intern Med. 2017;167(4):268–274.
  • 32. Hernán MA. Does water kill? A call for less casual causal inferences. Ann Epidemiol. 2016;26(10):674–680.
  • 33. Hernán MA, Taubman SL. Does obesity shorten life? The importance of well-defined interventions to answer causal questions. Int J Obes (Lond). 2008;32(suppl 3):S8–S14.
  • 34. VanderWeele TJ. Commentary: on causes, causal inference, and potential outcomes. Int J Epidemiol. 2016;45(6):1809–1816.
  • 35. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332(7549):1080.
  • 36. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006;17(4):360–372.
  • 37. Swanson SA, Hernán MA. Commentary: how to report instrumental variable analyses (suggestions welcome). Epidemiology. 2013;24(3):370–374.
  • 38. Swanson SA, Hernán MA. The challenging interpretation of instrumental variable estimates under monotonicity. Int J Epidemiol. 2018;47(4):1289–1297.
  • 39. Jackson JW, Swanson SA. Toward a clearer portrayal of confounding bias in instrumental variable applications. Epidemiology. 2015;26(4):498–504.
  • 40. Swanson SA, Miller M, Robins JM, et al. Definition and evaluation of the monotonicity condition for preference-based instruments. Epidemiology. 2015;26(3):414–420.
  • 41. Swanson SA, Hernán MA. Think globally, act globally: an epidemiologist’s perspective on instrumental variable estimation. Stat Sci. 2014;29(3):371–374.
  • 42. Pregnant and far from home, a sisterhood of the expecting [editorial]. New York Times. August 25, 2017:A10.
  • 43. It’s 4 A.M. The baby’s coming. But the hospital is 100 miles away [editorial]. New York Times. December 30, 2018:F15.
  • 44. Murray EJ, Caniglia EC. Calculator for selection bias and unmeasured confounding. https://emurray.shinyapps.io/distanceApp/. Accessed May 21, 2019.
  • 45. Murray EJ, Caniglia EC. eleanormurray/distanceApp (Version v1.0.0). March 6, 2019. doi:10.5281/zenodo.2586078. Accessed May 21, 2019.
  • 46. Robins J. Marginal structural models. 1997 Proceedings of the American Statistical Association. Alexandria, VA: American Statistical Association; 1998:1–10.


Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
