Abstract
We propose a novel definition of selection bias in analytic epidemiology using potential outcomes. This definition captures selection bias under both the structural approach (where conditioning on selection into the study opens a noncausal path from exposure to disease in a directed acyclic graph) and the traditional definition (where a given measure of association differs between the study sample and the population eligible for inclusion). This approach is nonparametric, and selection bias under the approach can be analyzed using single-world intervention graphs both under and away from the null hypothesis. It allows the simultaneous analysis of confounding and selection bias, it explicitly links the selection of study participants to the estimation of causal effects using study data, and it can be adapted to handle selection bias in descriptive epidemiology. Through examples, we show that this approach provides a novel perspective on the variety of mechanisms that can generate selection bias and simplifies the analysis of selection bias in matched studies and case–cohort studies.
Keywords: Causal inference, Epidemiologic methods, Measures of association, Potential outcomes, Selection bias, Single-world intervention graphs
Along with confounding, selection bias is one of the fundamental threats to the validity of epidemiologic research. Traditionally, selection bias has been defined as a systematic difference (i.e., a difference beyond random variation) between measures of an exposure–disease association in a study sample and the underlying eligible population, where the study sample is included in the study and the eligible population is eligible for inclusion.1 For example, Berkson2 showed that two diseases can be positively correlated among hospitalized patients even when they are independent in the source population. This definition can depend on the parameterization of the exposure–disease association,3 and it provides little or no guidance about identifying and controlling selection bias in the design and analysis of a study. These problems resemble those that arise when confounding is defined as a change in an exposure–disease association upon adjustment for a covariate.4–7
The structural approach of Hernán et al.8 showed that selection bias occurs when conditioning on selection into the study opens a noncausal path from an exposure to a disease in a causal directed acyclic graph (DAG)9,10 for the eligible population. This approach is nonparametric, and its use of DAGs to incorporate background knowledge in the identification and control of selection bias is a practical advantage over the traditional definition.
However, the structural approach only captures selection bias that can occur under the null hypothesis (i.e., no causal path from the exposure $X$ to the disease $D$) and no confounding (i.e., no open backdoor path from $X$ to $D$).3 An example of selection bias that escapes this approach was given by Greenland11: In a hypothetical cohort study where right censoring was not associated with exposure, there was no selection bias under the null. Away from the null, the risk ratio in the full cohort (censored and uncensored) differed from the risk ratio in the observed cohort (uncensored only).
Here, we use potential outcomes12 to propose a novel definition of selection bias in analytic epidemiology, where the goal is to infer the causal effect of a treatment or exposure on the risk of disease. The proposed definition allows the simultaneous analysis of confounding and selection bias, and it captures all selection bias under the structural approach and the traditional definition. It is nonparametric, it can be analyzed using single-world intervention graphs (SWIGs) of Richardson and Robins13,14 both under and away from the null, and it explicitly links the selection of study participants to the measures of association that can be estimated using study data. We show how it can be adapted to handle selection bias in descriptive epidemiology, where the goal is to estimate the joint distribution of disease and covariates (e.g., demographics or exposures). Through examples, we show how selection bias can be generated by colliders at the exposure $X$ and the disease $D$, how it can arise in randomized clinical trials, and how the potential outcomes approach simplifies the analysis of selection bias in matched studies and case–cohort studies.
Confounding, Exchangeability, and Backdoor Paths
Unlike selection bias, confounding has a standard definition in terms of potential outcomes. Let $X$ be an exposure or treatment, $D$ be a disease outcome, and $D^{x}$ denote the outcome that occurs if we intervene to set $X = x$. In the notation of Dawid,15 there is no unmeasured confounding when
$$D^{x} \perp\!\!\!\perp X \mid C \tag{1}$$
(i.e., $D^{x}$ is conditionally independent of $X$ given $C$), where $C$ is a set of measured nondescendants of $X$.16 The conditional independence in equation (1) is called exchangeability.5 This definition of confounding is difficult to use directly as a guide to study design and analysis.
Confounding can also be defined as an open backdoor path from $X$ to $D$ in a causal DAG,10 which provides an intuitive way to use background knowledge to identify and control confounding. SWIGs are transformations of causal DAGs that explicitly represent potential outcomes.13 A SWIG represents the intervention of setting $X = x$ by splitting the node $X$ into two new nodes: One node represents the realized value of $X$ and inherits all incoming edges from the node $X$. The other node represents the intervention $X = x$ and inherits all outgoing edges from the node $X$. The two new nodes are not connected by an edge, and all paths through the node representing the intervention are blocked. Any node that is a descendant of $X$ in the DAG is written with a superscript $x$ (as in $D^{x}$) to show that it is a potential outcome that can be observed only in individuals with $X = x$. A node that is a nondescendant of $X$ is unchanged by the intervention, so it keeps its original label and can be observed in all individuals.
In a SWIG, the rules of d-separation9 can be used to evaluate conditional independence. This can be used to show that the backdoor path criterion and exchangeability are equivalent.14 Figure 1 shows how exchangeability is guaranteed by no open backdoor path from $X$ to $D$ in a simple example: Conditioning on $C$ blocks the backdoor path from $X$ to $D$ in the DAG, and it d-separates $D^{x}$ and $X$ in the SWIG. Therefore, confounding has a nonparametric definition in terms of potential outcomes that can be analyzed using causal graphs both under and away from the null hypothesis. The potential outcomes approach to selection bias achieves something similar.
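As a numeric illustration of exchangeability given a measured covariate, the following Python sketch enumerates a small hypothetical population with a binary confounder C, exposure X, and outcome D. All probabilities and function names are assumptions made up for this illustration; they are not taken from the article or its eAppendix. Under the assumed model, standardizing the stratum-specific risks to the distribution of C recovers the causal risk difference, while the crude comparison does not.

```python
from itertools import product

# Hypothetical model (illustrative values only): C -> X, C -> D, X -> D, all binary.
P_C1 = 0.4                        # Pr(C = 1)
P_X1_GIVEN_C = {0: 0.2, 1: 0.7}   # Pr(X = 1 | C = c)

def risk(x, c):
    """Pr(D^x = 1 | C = c); under exchangeability given C this is Pr(D = 1 | X = x, C = c)."""
    return 0.1 + 0.2 * x + 0.3 * c

def joint(c, x, d):
    """Pr(C = c, X = x, D = d) in the eligible population."""
    pc = P_C1 if c else 1 - P_C1
    px = P_X1_GIVEN_C[c] if x else 1 - P_X1_GIVEN_C[c]
    pd = risk(x, c) if d else 1 - risk(x, c)
    return pc * px * pd

def crude_risk(x):
    """Pr(D = 1 | X = x), ignoring C."""
    num = sum(joint(c, x, 1) for c in (0, 1))
    den = sum(joint(c, x, d) for c, d in product((0, 1), repeat=2))
    return num / den

def standardized_risk(x):
    """Sum over c of Pr(D = 1 | X = x, C = c) Pr(C = c), which equals Pr(D^x = 1)."""
    return sum(risk(x, c) * (P_C1 if c else 1 - P_C1) for c in (0, 1))

crude_rd = crude_risk(1) - crude_risk(0)
causal_rd = standardized_risk(1) - standardized_risk(0)
print(f"crude RD = {crude_rd:.3f}, standardized RD = {causal_rd:.3f}")
```

In this toy population the backdoor path through C inflates the crude risk difference relative to the standardized one.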
SELECTION BIAS VIA POTENTIAL OUTCOMES
Let $X$ be an exposure or treatment, $D$ be a disease outcome, and $S$ indicate selection into the study out of a specified eligible population. The study sample is the subset of the eligible population with $S = 1$. Assume that $X$ is not a descendant of $D$ in a causal DAG for the eligible population. We propose the following definition of selection bias in analytic epidemiology:
Definition 1 (Analytic Selection Bias)
There is no unmeasured selection bias for the causal effect of $X$ on $D$ if and only if at least one of the following conditions holds:
1. Analytic cohort condition. If we intervene to set exposure $X = x$, selection into the study is conditionally independent of the disease outcome given $X$ and $C$:
$$S^{x} \perp\!\!\!\perp D^{x} \mid (X, C) \tag{2}$$
where $C$ is a (possibly empty) set of measured nondescendants of $X$ such that exchangeability holds in equation (1).
2. Analytic case–control condition. If we intervene to set disease outcome $D = d$, selection into the study is conditionally independent of exposure given $D$, $C$, and $V$:
$$S^{d} \perp\!\!\!\perp X \mid (D, C, V) \tag{3}$$
where $C$ is defined above and $V$ is a (possibly empty) set of measured nondescendants of $D$ such that
$$V^{x} \perp\!\!\!\perp D^{x} \mid (X, C) \tag{4}$$
This conditional independence implies that $V$ cannot contain variables on a causal path from $X$ to $D$.
The analytic cohort condition in equation (2) is relevant to studies where exposure is a cause of selection. It is similar to the conditions given by Daniel et al.17 for the identifiability of a causal effect with missing data, which are based on the do-operator9 rather than potential outcomes. In the DAG at the top of Figure 2, we must condition on to close the backdoor path from to , so must contain . In SWIG (a), we must condition on to d-separate and , so must contain . The cohort condition holds because and are a nondescendants of , exchangeability holds given , and d-separates and . The case–control condition fails because of the arrow from to in SWIG (b).
The analytic case–control condition in equation (3) is relevant to studies where disease outcome is a cause of selection. It is based on the principle that controls should be individuals who could become cases if they had a disease onset while under observation.18,19 In Figure 3, the cohort condition fails because of the arrow from to in SWIG (a). Because conditioning on is necessary to block the backdoor path from to , must contain . In SWIG (b), conditioning on and is needed to d-separate and . The variable cannot be in , but it can be in because Vx ╨ DX|(X,A. The variable can be in because it is a nondescendant of and exchangeability holds given . It can also be in because Bx = B ╨ DX|(X,A. Thus, the case–control condition holds given in two ways: and or and .
Evaluation of the analytic cohort and case–control conditions should consider all stable conditional independencies20 implied by the SWIGs derived from a causal DAG for the eligible population that includes $S$. The examples above show that neither condition implies the other. To estimate a conditional causal effect of $X$ on $D$ given a set of nondescendants of $X$, we must find a $C$ that contains every variable in that set. The conditions for no unmeasured analytic selection bias are similar to the identifiability conditions for the causal odds ratio in Bareinboim and Pearl,21 but they guarantee the identifiability of a greater variety of causal effect measures.
When there is no unmeasured analytic selection bias, a study has external validity in the sense that adjustment for measured variables is sufficient to generalize a causal effect estimate from the study sample to the eligible population or a subset of it defined by measured variables.1 As in Greenland and Pearl,22 adjustment for $C$ means a measure of association standardized to a specified joint distribution of the variables in $C$, a set of conditional measures of association within strata of $C$, or a common conditional measure of association given $C$.
Selection Bias Due to a Collider at $S$
The structural approach of Hernán et al.8 identifies selection bias when conditioning on $S$ opens a noncausal path from $X$ to $D$. This occurs when $S$ is a collider on a path from $X$ to $D$ in a causal DAG or a descendant of such a collider. The DAG in Figure 4 represents a path from $X$ to $D$ on which $S$ is a collider. As in Greenland et al.,10 the undirected dashed edges represent open paths whose structure is not specified. Thus, there is an open path from $X$ to $S$ that ends with an arrow pointing into $S$ and an open path from $S$ to $D$ that starts with an arrow pointing into $S$.
Theorem 1. Selection bias under the structural approach implies analytic selection bias under the potential outcomes approach.
Proof. There is selection bias under the structural approach if and only if at least one path matching the pattern in Figure 4 exists such that the paths from $X$ to $S$ and from $S$ to $D$ cannot be blocked. If we condition on $S$ or a descendant of $S$, then:
1. If we set $X = x$, the open path from $S$ to $D$ implies that the analytic cohort condition fails in the SWIG for setting $X = x$. This holds whether or not $S$ is a descendant of $X$.
2. If we set $D = d$, the open path from $X$ to $S$ implies that the analytic case–control condition fails in the SWIG for setting $D = d$. This holds whether or not $S$ is a descendant of $D$.
Because the analytic cohort and case–control conditions both fail, we have analytic selection bias under the potential outcomes approach. □
In each example from Hernán et al.,8 the potential outcomes approach and the structural approach reach identical conclusions about the presence and control of selection bias. As noted by Hernán,3 all of these examples have no causal path and no open backdoor path from $X$ to $D$. Selection bias under the structural approach compromises both the internal and external validity of a study: The causal effect estimate within the study sample is biased, so it cannot generalize to the eligible population.
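The collider mechanism can be checked with a few lines of arithmetic. The Python sketch below assumes a hypothetical eligible population in which exposure X and disease D are independent (the null, with no confounding) and both raise the probability of selection S. The probabilities and function names are illustrative assumptions, not values from the article.

```python
# Hypothetical eligible population: X and D are independent (null, no confounding).
P_X1, P_D1 = 0.5, 0.2

def p_select(x, d):
    """Pr(S = 1 | X = x, D = d): selection is a common effect (collider) of X and D."""
    return 0.1 + 0.3 * x + 0.4 * d

def cell(x, d, selected_only):
    """Probability mass of the (x, d) cell, optionally restricted to S = 1."""
    px = P_X1 if x else 1 - P_X1
    pd = P_D1 if d else 1 - P_D1
    weight = p_select(x, d) if selected_only else 1.0
    return px * pd * weight

def odds_ratio(selected_only):
    """Cross-product odds ratio for X and D."""
    n = {(x, d): cell(x, d, selected_only) for x in (0, 1) for d in (0, 1)}
    return (n[1, 1] * n[0, 0]) / (n[1, 0] * n[0, 1])

or_eligible = odds_ratio(False)   # association in the eligible population
or_sample = odds_ratio(True)      # association after conditioning on S = 1
print(f"OR eligible = {or_eligible:.2f}, OR in study sample = {or_sample:.2f}")
```

Under these assumed values the odds ratio is 1 in the eligible population but differs from 1 in the study sample, even though X has no effect on D.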
Selection Bias in Descriptive Epidemiology
The traditional definition of selection bias applies to a measure of association between $X$ and $D$ whether or not it represents a causal effect. To define selection bias for descriptive epidemiology, we can drop the requirement that exchangeability holds given $C$ and remove all restrictions on descendants of $X$ and $D$, so there is no need to distinguish between $C$ and $V$:
Definition 2 (Descriptive Selection Bias)
There is no unmeasured selection bias for the association between $X$ and $D$ if and only if at least one of the following conditions holds:
1. Descriptive cohort condition. Selection into the study is conditionally independent of disease outcome given $X$ and $C$:
$$S \perp\!\!\!\perp D \mid (X, C) \tag{5}$$
where $C$ is a (possibly empty) set of measured covariates.
2. Descriptive case–control condition. Selection into the study is conditionally independent of exposure given $D$ and $C$:
$$S \perp\!\!\!\perp X \mid (D, C) \tag{6}$$
where $C$ is defined above.
When at least one of these conditions holds, the conditional association between and given in the study sample matches the same conditional association in the eligible population (up to random variation). In the cohort study from Figure 2, conditioning on controls descriptive selection bias but not analytic selection bias. The same is true of conditioning on in the case–control study from Figure 3. There are two primary differences between the control of descriptive and analytic selection bias: Analytic selection bias must be controlled using a set of variables sufficient to control confounding, and conditioning on causal descendants of and is constrained.
The descriptive cohort and case–control conditions can be assessed on any DAG (causal or not) that represents the joint distribution of $X$, $D$, $S$, and $C$ in the eligible population. This definition of no unmeasured descriptive selection bias matches the necessary and sufficient conditions given in Didelez et al.23 for the $X$-$D$ odds ratio given $C$ to be collapsible over $S$.
Selection and Estimation
The selection bias example of Greenland11 was analyzed by Hernán3 using a DAG similar to that in Figure 5, where is not a collider or a descendant of a collider. Howe et al.24 considered a similar causal structure for selection bias caused by loss to follow-up. In the SWIG for setting , the open path -- must be blocked by conditioning on . In the SWIG for setting , the path --- is opened by conditioning on the collider at , so it must be closed by conditioning on . In both SWIGs, the potential outcomes approach correctly identifies selection bias and shows that it can be controlled by adjusting for .
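A numeric sketch in the spirit of this kind of example (not a reproduction of Greenland's data) is given below in Python. It assumes a hypothetical binary risk factor U that affects both disease D and remaining uncensored (S = 1), while censoring does not depend on exposure X and U is independent of X. The risk model, the probabilities, and the function names are all illustrative assumptions.

```python
P_U1 = 0.5  # Pr(U = 1); U is assumed independent of exposure X

def risk(x, u, effect):
    """Pr(D = 1 | X = x, U = u) with an additive exposure effect (hypothetical)."""
    return 0.1 + effect * x + 0.3 * u

def p_uncensored(u):
    """Pr(S = 1 | U = u); censoring does not depend on X."""
    return 0.9 - 0.6 * u

def risk_ratio(effect, observed_only):
    """Risk ratio for X in the full cohort or among the uncensored only."""
    # Distribution of U in the full cohort, or reweighted to the uncensored.
    w = {u: (P_U1 if u else 1 - P_U1) * (p_uncensored(u) if observed_only else 1.0)
         for u in (0, 1)}
    total = w[0] + w[1]
    r = {x: sum(risk(x, u, effect) * w[u] for u in (0, 1)) / total for x in (0, 1)}
    return r[1] / r[0]

null_full, null_obs = risk_ratio(0.0, False), risk_ratio(0.0, True)
alt_full, alt_obs = risk_ratio(0.2, False), risk_ratio(0.2, True)
print(f"null: full = {null_full:.3f}, observed = {null_obs:.3f}")
print(f"away from null: full = {alt_full:.3f}, observed = {alt_obs:.3f}")
```

Under the null the two risk ratios agree, but away from the null they differ because the uncensored cohort has a different distribution of U. Within strata of U the risks are the same in both cohorts by construction, which is why adjustment for the shared cause controls the bias in this toy setting.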
Theorem 2. Selection bias under the traditional definition implies selection bias under the potential outcomes approach.
Proof. We will prove the equivalent statement that no selection bias under the potential outcomes approach implies no selection bias under the traditional definition. If there is no selection bias under the potential outcomes approach, then at least one of the cohort and case–control conditions holds. In each case, we will consider both analytic and descriptive selection bias.
When the analytic cohort condition holds and we intervene to set $X = x$, the conditional risk of disease given $C$ in the eligible population equals the conditional risk given $X = x$ and $C$ in the study sample:
$$\Pr(D^{x} = 1 \mid C) = \Pr(D^{x} = 1 \mid X = x, C) = \Pr(D^{x} = 1 \mid S^{x} = 1, X = x, C) = \Pr(D = 1 \mid S = 1, X = x, C) \tag{7}$$
by exchangeability, the cohort condition, and consistency. Any measure of causal effect based on conditional risks of disease given $C$ in the study sample equals the same measure based on potential outcomes in the eligible population (up to random variation).
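The identity above can be verified by brute-force enumeration in a toy model. The Python sketch below assumes a hypothetical eligible population in which a binary covariate C affects exposure X, disease D, and selection S, and X also affects D and S, so that selection depends on disease only through X and C. The numeric values and function names are assumptions for illustration.

```python
from itertools import product

P_C1 = 0.4  # Pr(C = 1)

def p_x1(c):
    return 0.2 + 0.5 * c               # Pr(X = 1 | C = c)

def risk(x, c):
    return 0.1 + 0.2 * x + 0.3 * c     # Pr(D^x = 1 | C = c)

def p_s1(x, c):
    return 0.2 + 0.3 * x + 0.4 * c     # Pr(S = 1 | X = x, C = c)

def bern(p, v):
    """Probability that a Bernoulli(p) variable equals v."""
    return p if v else 1 - p

def joint(c, x, d, s):
    """Pr(C = c, X = x, D = d, S = s) in the eligible population."""
    return bern(P_C1, c) * bern(p_x1(c), x) * bern(risk(x, c), d) * bern(p_s1(x, c), s)

def sample_risk(x, c):
    """Pr(D = 1 | S = 1, X = x, C = c), computed from the joint distribution."""
    return joint(c, x, 1, 1) / (joint(c, x, 0, 1) + joint(c, x, 1, 1))

def standardized_sample_risk(x):
    """Stratum-specific sample risks standardized to Pr(C) in the eligible population."""
    return sum(sample_risk(x, c) * bern(P_C1, c) for c in (0, 1))

def crude_sample_risk(x):
    """Pr(D = 1 | S = 1, X = x), ignoring C."""
    num = sum(joint(c, x, 1, 1) for c in (0, 1))
    den = sum(joint(c, x, d, 1) for c, d in product((0, 1), repeat=2))
    return num / den

for x in (0, 1):
    print(x, round(standardized_sample_risk(x), 3), round(crude_sample_risk(x), 3))
```

In this sketch the stratum-specific risks in the study sample match the potential-outcome risks, so standardization recovers the marginal risks, while the crude sample risks do not.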
Under the descriptive cohort condition, we have
$$\Pr(D = 1 \mid X = x, C) = \Pr(D = 1 \mid S = 1, X = x, C) \tag{8}$$
Any measure of association based on conditional risks of disease given $C$ in the study sample equals the same measure in the eligible population (up to random variation).
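For the case–control conditions, assume $D$ is binary and that we are comparing two levels of exposure that we call $x = 1$ and $x = 0$ without loss of generality. If the analytic case–control condition holds and we intervene to set $X = x$, the conditional odds of disease given $C$ is
$$\frac{\Pr(D^{x} = 1 \mid C)}{\Pr(D^{x} = 0 \mid C)} = \frac{\Pr(D^{x} = 1 \mid X = x, C)}{\Pr(D^{x} = 0 \mid X = x, C)} = \frac{\Pr(D^{x} = 1 \mid X = x, C, V^{x})}{\Pr(D^{x} = 0 \mid X = x, C, V^{x})} = \frac{\Pr(D = 1 \mid X = x, C, V)}{\Pr(D = 0 \mid X = x, C, V)} \tag{9}$$
by exchangeability, the conditional independence in equation (4), and consistency. By Bayes’s rule, the final odds equals
$$\frac{\Pr(X = x \mid D = 1, C, V)}{\Pr(X = x \mid D = 0, C, V)} \times \frac{\Pr(D = 1 \mid C, V)}{\Pr(D = 0 \mid C, V)} \tag{10}$$
Because the second term does not depend on $x$, it cancels out of the conditional causal odds ratio for disease given $C$. This causal odds ratio equals the conditional odds ratio for exposure given $C$ and $V$ in the eligible population:
$$\frac{\Pr(D^{1} = 1 \mid C) / \Pr(D^{1} = 0 \mid C)}{\Pr(D^{0} = 1 \mid C) / \Pr(D^{0} = 0 \mid C)} = \frac{\Pr(X = 1 \mid D = 1, C, V) / \Pr(X = 0 \mid D = 1, C, V)}{\Pr(X = 1 \mid D = 0, C, V) / \Pr(X = 0 \mid D = 0, C, V)} \tag{11}$$
Each component of the conditional odds ratio for exposure in the eligible population can be estimated using data from the study sample because
$$\Pr(X = x \mid D = d, C, V) = \Pr(X = x \mid S^{d} = 1, D = d, C, V) = \Pr(X = x \mid S = 1, D = d, C, V) \tag{12}$$
by the case–control condition and consistency. Thus, the causal odds ratio for disease given $C$ in the eligible population equals the conditional odds ratio for exposure given $C$ and $V$ in the study sample (up to random variation).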
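Under the descriptive case–control condition, the symmetry of the odds ratio gives
$$\frac{\Pr(D = 1 \mid X = 1, C) / \Pr(D = 0 \mid X = 1, C)}{\Pr(D = 1 \mid X = 0, C) / \Pr(D = 0 \mid X = 0, C)} = \frac{\Pr(X = 1 \mid D = 1, C) / \Pr(X = 0 \mid D = 1, C)}{\Pr(X = 1 \mid D = 0, C) / \Pr(X = 0 \mid D = 0, C)} \tag{13}$$
The latter odds ratio can be estimated using the study sample because
$$\Pr(X = x \mid D = d, C) = \Pr(X = x \mid S = 1, D = d, C) \tag{14}$$
by the descriptive case–control condition. Thus, the conditional odds ratio for disease given $C$ in the eligible population equals the conditional odds ratio for exposure given $C$ in the study sample (up to random variation). □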
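The case–control part of the argument can also be checked numerically. The Python sketch below assumes a hypothetical eligible population with a binary covariate C, a common conditional odds ratio of 2.5 for exposure X and disease D, and selection S that depends only on D and C (all cases, a small fraction of noncases). Every number and name is an assumption chosen for illustration.

```python
P_C1 = 0.3                       # Pr(C = 1)
BASE_ODDS = {0: 0.05, 1: 0.15}   # odds of D = 1 when X = 0, by stratum of C
OR_TRUE = 2.5                    # assumed common conditional odds ratio

def bern(p, v):
    return p if v else 1 - p

def p_x1(c):
    return 0.2 + 0.3 * c                      # Pr(X = 1 | C = c)

def risk(x, c):
    odds = BASE_ODDS[c] * OR_TRUE ** x
    return odds / (1 + odds)                  # Pr(D = 1 | X = x, C = c)

def p_s1(d, c):
    return 1.0 if d else 0.02 + 0.03 * c      # Pr(S = 1 | D = d, C = c)

def cell(c, x, d, sample):
    """Mass of the (c, x, d) cell in the eligible population or in the study sample."""
    p = bern(P_C1, c) * bern(p_x1(c), x) * bern(risk(x, c), d)
    return p * p_s1(d, c) if sample else p

def odds_ratio(c, sample):
    """Cross-product odds ratio for X and D within the stratum C = c."""
    n = {(x, d): cell(c, x, d, sample) for x in (0, 1) for d in (0, 1)}
    return (n[1, 1] * n[0, 0]) / (n[1, 0] * n[0, 1])

def sample_risk(x, c):
    """Pr(D = 1 | S = 1, X = x, C = c), which is distorted by outcome-dependent sampling."""
    return cell(c, x, 1, True) / (cell(c, x, 1, True) + cell(c, x, 0, True))

for c in (0, 1):
    print(c, round(odds_ratio(c, False), 3), round(odds_ratio(c, True), 3))
print(round(risk(1, 0), 3), round(sample_risk(1, 0), 3))
```

The conditional odds ratios agree between the eligible population and the study sample because the selection probabilities cancel in the cross-product, whereas the conditional risks in the sample are far from the population risks.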
The cohort and case–control conditions have different implications for the estimation of causal effects or associations. The cohort condition allows any measure based on conditional risks of disease to be calculated, including marginal measures based on standardization. The case–control condition allows only conditional odds ratios to be estimated. With case–control or case–cohort data, calculating conditional risks of disease requires external information about the eligible population.25
Selection Bias and Adjustment
Selection bias under the potential outcomes approach does not imply selection bias under the traditional definition. Under certain conditions, adjustment can leave a measure of association unchanged.22 Adjustment for $C$ will alter a measure of association if there is effect modification by $C$ or noncollapsibility. When we must condition on $C$ to avoid selection bias, it is important to measure the covariates in $C$ to check these conditions even if adjustment proves unnecessary.
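Noncollapsibility can be seen in a small calculation. The Python sketch below assumes a hypothetical binary covariate C that is independent of exposure X (so there is no confounding) and a conditional odds ratio of 3 in both strata. The values and names are illustrative assumptions.

```python
P_C1 = 0.5                      # Pr(C = 1); C is assumed independent of X
BASE_ODDS = {0: 0.25, 1: 4.0}   # odds of D = 1 when X = 0, by stratum of C
OR_COND = 3.0                   # same conditional odds ratio in both strata

def risk(x, c):
    o = BASE_ODDS[c] * OR_COND ** x
    return o / (1 + o)

def odds(p):
    return p / (1 - p)

def marginal_risk(x):
    """Risk standardized to the distribution of C."""
    return sum(risk(x, c) * (P_C1 if c else 1 - P_C1) for c in (0, 1))

cond_or = {c: odds(risk(1, c)) / odds(risk(0, c)) for c in (0, 1)}
marg_or = odds(marginal_risk(1)) / odds(marginal_risk(0))
cond_rd = {c: risk(1, c) - risk(0, c) for c in (0, 1)}
marg_rd = marginal_risk(1) - marginal_risk(0)
print(f"conditional ORs = {cond_or}, marginal OR = {marg_or:.3f}")
print(f"conditional RDs = {cond_rd}, marginal RD = {marg_rd:.3f}")
```

Here the marginal odds ratio is closer to 1 than the common conditional odds ratio even without confounding, while the marginal risk difference is simply the weighted average of the stratum-specific risk differences.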
Lu et al.26 proposed a classification of analytic selection bias for the risk difference and risk ratio into forms that occur due to conditioning on a collider (or a descendant of a collider) and forms that occur due to conditioning on an effect measure modifier. Because both measures are collapsible, these categories probably account for almost all selection bias where adjustment of the risk difference or risk ratio is necessary. The potential outcomes definition is nonparametric, so it identifies selection bias whenever it is not precluded by the structure of the causal DAG in the eligible population.
Analytic Versus Descriptive Selection Bias
The conditions given in Didelez et al.23 for the collapsibility of an $X$-$D$ odds ratio over $S$ do not ensure that an $X$-$D$ odds ratio that collapses over $S$ is causal.21 This is a special case of the fact that analytic selection bias can occur when there is no descriptive selection bias. However, no analytic selection bias implies no descriptive selection bias.
Theorem 3. No analytic selection bias implies no descriptive selection bias, but analytic selection bias can occur without descriptive selection bias.
Proof. Assume the analytic cohort condition holds given a set of variables $C$. By consistency and equation (7), we have:
$$\Pr(D = 1 \mid X = x, C) = \Pr(D = 1 \mid S = 1, X = x, C) \tag{15}$$
If the descriptive cohort condition uses the same set $C$, this is equivalent to equation (8), which guarantees no descriptive selection bias.
Now assume the analytic case–control condition holds given $C$ and $V$. If we let the covariate set in the descriptive case–control condition be $C \cup V$ (i.e., the union of $C$ and $V$), then equation (12) is equivalent to equation (14), which guarantees no descriptive selection bias.
In the DAG at the top of Figure 2, the descriptive cohort condition holds given . However, the analytic cohort condition fails because conditioning on is required to block the backdoor path from to . The analytic case–control condition also fails, so there is analytic selection bias given . In the DAG at the top of Figure 3, the descriptive case–control condition holds given . However, the analytic case–control condition fails because conditioning on is required to block the backdoor path from to . The analytic cohort condition also fails, so there is analytic selection bias given . □
APPLICATIONS TO STUDY DESIGN AND ANALYSIS
The potential outcomes approach to selection bias provides a novel perspective on the variety of mechanisms that can generate selection bias, and it correctly handles cases where adjustment is necessary for generalization to the eligible population even though the unadjusted causal effect or association is valid within the study sample. It simplifies the analysis of matched studies and case–cohort studies by eliminating the need to consider the cancellation of associations along different paths in a DAG.27 The eAppendix (http://links.lww.com/EDE/C59) contains examples implemented in R.28
Selection Bias Due to a Collider at $X$
Measures of association based on the risk of disease condition on when calculating risks within exposure groups. When there is an open backdoor path from to , this conditioning can cause descriptive selection bias. Figure 6 shows an example. The descriptive case–control condition fails because of the arrow from to in the underlying causal DAG. In the descriptive cohort condition, conditioning on the collider at opens the path ----. However, Sx ╨ DX|(X,C, where , , or . The descriptive cohort condition holds given any of these sets.
The analytic cohort condition holds only given or because conditioning on is necessary to close the backdoor path --. Conditioning on controls descriptive selection bias but not confounding, so the conditional association between and given is the same in the study sample and the eligible population but differs systematically from the conditional causal effect of on given in the eligible population. This form of selection bias requires a backdoor path from to , so it falls outside the range of selection bias considered by Hernán et al.8
Selection Bias Due to a Collider at $D$
Measures of association in case–control studies condition on when calculating exposure prevalences among cases and controls. Figure 7 shows selection bias caused by a collider at in a case–control study. The analytic and descriptive cohort conditions fail because of the arrow from to in the underlying causal DAG. In the analytic and descriptive case–control conditions, conditioning on the collider at opens the path ---. This path can be blocked by conditioning on , so the descriptive case–control condition holds given . The variable can be in because it is not a descendant of . It cannot be in because Lx = L ╨ DX|X. Thus, the analytic case–control condition holds given and (the empty set). This form of selection bias cannot occur if there is no causal path and no open backdoor path from to , so it also falls outside the range of selection bias considered by Hernán et al.8
Randomized Clinical Trials
Figure 8 shows a randomized clinical trial in which there is a common cause $C$ of disease and selection (e.g., selection into the study is based on risk factors for disease). The arrow from $S$ to $X$ exists because selection into the trial affects the probability of treatment, which might be near zero outside the study sample. Although randomization of $X$ prevents a backdoor path from $X$ to $D$ in the causal DAG for the study sample, there is a backdoor path $X$-$S$-$C$-$D$ in the causal DAG for the eligible population.
The analytic and descriptive cohort conditions both hold given $C$. In the eligible population, randomization of $X$ ensures that:
1. All backdoor paths from $X$ to $D$ are blocked by conditioning on $S$, so the effect of $X$ on $D$ is unconfounded within the study sample.
2. Because $C \perp\!\!\!\perp X \mid S$, all treatment groups have the same distribution of $C$. Thus, a crude measure of association between $X$ and $D$ is implicitly standardized to the distribution of $C$ in the study sample.
Therefore, a randomized trial provides a valid estimate of the causal effect of $X$ on $D$ in the study sample without adjustment for $C$. However, generalization to the eligible population can require adjustment for $C$ if there is effect modification or noncollapsibility.22
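The gap between validity within the trial and generalization to the eligible population can be illustrated with a short calculation. The Python sketch below assumes a hypothetical binary risk factor C that increases the chance of enrollment and modifies the treatment effect on the risk difference scale, with treatment X randomized inside the trial. All numbers and function names are illustrative assumptions.

```python
P_C1 = 0.5  # Pr(C = 1) in the eligible population

def p_select(c):
    return 0.1 + 0.6 * c        # Pr(S = 1 | C = c): high-risk people enroll more often

def risk(x, c):
    # Pr(D^x = 1 | C = c) with effect modification by C on the risk difference scale
    return 0.1 + 0.1 * x + 0.2 * c + 0.2 * x * c

def c_dist(sample):
    """Distribution of C in the eligible population or in the study sample."""
    w = {c: (P_C1 if c else 1 - P_C1) * (p_select(c) if sample else 1.0) for c in (0, 1)}
    total = w[0] + w[1]
    return {c: w[c] / total for c in (0, 1)}

def risk_difference(sample):
    """Stratum-specific risk differences standardized to the chosen distribution of C."""
    dist = c_dist(sample)
    return sum((risk(1, c) - risk(0, c)) * dist[c] for c in (0, 1))

rd_sample = risk_difference(True)     # what the crude trial comparison estimates
rd_eligible = risk_difference(False)  # effect in the eligible population
print(f"RD in study sample = {rd_sample:.3f}, RD in eligible population = {rd_eligible:.3f}")
```

Because randomization balances C across arms, the crude trial comparison corresponds to the first quantity. Recovering the second requires measuring C and standardizing to its distribution in the eligible population.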
Matched Cohort Studies
Figure 9 shows a cohort study matched on a confounder $C$. The analytic cohort condition holds given $C$ because $S^{x} \perp\!\!\!\perp D^{x} \mid (X, C)$, so the descriptive cohort condition also holds. If matching ensures that the distribution of $C$ is the same in all exposure groups in the study sample, then $C \perp\!\!\!\perp X \mid S$ even though $C$ and $X$ are d-connected given $S$. In this case, a crude measure of association based on disease risks is implicitly standardized to a common distribution of $C$, so no adjustment for $C$ is needed to estimate the marginal causal effect of $X$ within the study sample. If matching ensures that the distribution of $C$ in each exposure group in the study sample matches the distribution of $C$ among the exposed in the eligible population, then this unadjusted estimate corresponds to the average treatment effect among the treated in the eligible population. Generalization requires adjustment for $C$ if there is effect modification or noncollapsibility and a marginal causal effect is being estimated for a different distribution of $C$ than in the study sample.22
Matched Case–Control Studies
Figure 10 shows a case–control study matched on a confounder $C$. Matching ensures that the distribution of $C$ is equal in the case and control groups, not in the exposure groups, so the crude odds ratio in the study sample does not have a causal interpretation. The conditional odds ratios given $C$ have a causal interpretation, and they are identical (up to random variation) in the study sample and the eligible population.
Case–Cohort Studies
Figure 11 shows a causal DAG and corresponding SWIGs for a case–cohort study. indicates selection into the underlying cohort, and indicates selection into the subcohort or becoming a case (or both). For selection into the cohort, we have Cx1 ╨ Dx|X so the analytic cohort condition holds. For selection into the study sample, the cohort condition fails because of the edge from to , but the analytic case–control condition holds with and . Because the cohort condition does not hold for , the exposure odds ratio comparing cases and controls must be used to estimate the causal effect of exposure on disease. The need to condition on to control selection bias for implies that cases from outside the cohort must be excluded if the cohort was selected based on exposure.
DISCUSSION
The potential outcomes approach to selection bias is nonparametric, captures all selection bias under the structural approach of Hernán et al.8 and the traditional definition, and can be adapted to both analytic and descriptive epidemiology. It is an important practical application of SWIGs, and it provides a unified analysis of confounding and selection bias in analytic epidemiology. We hope to extend this approach to studies with time-dependent confounding and complex censoring patterns.29–31
We have assumed throughout that our DAGs completely represent causal relationships in the eligible population. Causal relationships in a different population might not be represented by the same DAGs. Thus, no selection bias does not guarantee that an estimated causal effect or association can be generalized to a population containing individuals who were not eligible for the study. It guarantees generalizability but not transportability,1 which is a strong argument for the inclusive recruitment of study participants in epidemiology.
EAPPENDIX
Implementations in R28 of examples from Figures 2 to 10 (text file).
ACKNOWLEDGMENTS
I would like to thank Miguel Hernán, Forrest Crawford, Sander Greenland, and Patrick Schnell for their useful comments.
Footnotes
Supported by National Institute of Allergy and Infectious Diseases (NIAID) grants R01 AI116770 and U01 AI169375 and National Institute of General Medical Sciences (NIGMS) grant U54 GM111274. The content is solely the responsibility of the author and does not represent the official views or policies of NIAID, NIGMS, or the National Institutes of Health.
The author reports no conflicts of interest.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).
The previously submitted version of this article was posted on arXiv (https://arxiv.org/abs/2008.03786).
The data and code required to replicate the results are included as an R file in the Supplemental Digital Content http://links.lww.com/EDE/C59.
REFERENCES
1. Dahabreh IJ, Hernán MA. Extending inferences from a randomized trial to a target population. Eur J Epidemiol. 2019;34:719–722.
2. Berkson J. Limitations of the application of fourfold table analysis to hospital data. Biometrics. 1946;2:47–53.
3. Hernán MA. Invited commentary: selection bias without colliders. Am J Epidemiol. 2017;185:1048–1050.
4. Miettinen OS, Cook EF. Confounding: essence and detection. Am J Epidemiol. 1981;114:593–603.
5. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419.
6. Wickramaratne PJ, Holford TR. Confounding in epidemiologic studies: the adequacy of the control group as a measure of confounding. Biometrics. 1987;43:751–765.
7. Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci. 1999;14:29–46.
8. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15:615–625.
9. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82:702–710.
10. Greenland S, Pearl J, Robins J. Causal diagrams for epidemiologic research. Epidemiology. 1999;10:37–48.
11. Greenland S. Response and follow-up bias in cohort studies. Am J Epidemiol. 1977;106:184–187.
12. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66:688–701.
13. Richardson TS, Robins JM. Single world intervention graphs: A primer. In: Second UAI Workshop on Causal Structure Learning, Bellevue, Washington. Association for Uncertainty in Artificial Intelligence; 2013.
14. Richardson TS, Robins JM. Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality. Center for the Statistics and the Social Sciences, University of Washington. Working Paper 128; 2013.
15. Dawid AP. Conditional independence in statistical theory. J R Stat Soc Series B. 1979;41:1–15.
16. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
17. Daniel RM, Kenward MG, Cousens SN, De Stavola BL. Using causal diagrams to guide analysis in missing data problems. Stat Methods Med Res. 2012;21:243–256.
18. Miettinen OS. The “case-control” study: valid selection of subjects. J Chronic Dis. 1985;38:543–548.
19. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case-control studies: I. Principles. Am J Epidemiol. 1992;135:1019–1028.
20. Mansournia MA, Greenland S. The relation of collapsibility and confounding to faithfulness and stability. Epidemiology. 2015;26:466–472.
21. Bareinboim E, Pearl J. Controlling selection bias in causal inference. In: Artificial Intelligence and Statistics. PMLR; 2012:100–108.
22. Greenland S, Pearl J. Adjustments and their consequences—collapsibility analysis using graphical models. Int Stat Rev. 2011;79:401–426.
23. Didelez V, Kreiner S, Keiding N. Graphical models for inference under outcome-dependent sampling. Stat Sci. 2010;25:368–387.
24. Howe CJ, Cole SR, Lau B, Napravnik S, Eron JJ Jr. Selection bias due to loss to follow up in cohort studies. Epidemiology. 2016;27:91–97.
25. Miettinen O. Estimability and estimation in case-referent studies. Am J Epidemiol. 1976;103:226–235.
26. Lu H, Cole SR, Howe CJ, Westreich D. Toward a clearer definition of selection bias when estimating causal effects. Epidemiology. 2022;33:699–706.
27. Mansournia MA, Hernán MA, Greenland S. Matched designs and causal diagrams. Int J Epidemiol. 2013;42:860–869.
28. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2023. Available at: https://www.R-project.org/.
29. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Math Model. 1986;7:1393–1512.
30. Robins JM, Blevins D, Ritter G, Wulfsohn M. G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology. 1992;3:319–336.
31. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560.