Abstract
Three types of variables commonly arise in orthopedic research: confounders, colliders, and mediators. All three are associated with both the exposure (e.g., surgery type, implant type, body mass index) and the outcome (e.g., complications, revision surgery), but they differ in their temporal ordering relative to the exposure and outcome. To reduce systematic bias, the decision to include or exclude a variable in an analysis should be based on the variable’s relationship with the exposure and outcome for each research question. In this paper, we define these three types of variables with case examples from orthopedic research.
Keywords: total joint arthroplasty, databases, mediation analysis, confounding, selection bias, directed acyclic graph
Introduction
Observational studies in orthopedic research are subject to several sources of error that can reduce the reliability and validity of the findings. Examples of observational studies are cohort and case-control studies, either of which can be retrospective or prospective in nature.1 The types of errors that occur in observational studies are classified as either random or systematic (Figure 1). Random errors are unavoidable, but they can be expressed quantitatively and reduced by increasing the sample size and/or choosing a more efficient study design.1,2 Systematic errors, on the other hand, are typically avoidable and should be reduced or eliminated through appropriate study design and data collection strategies. Understanding potential sources of systematic bias can help orthopedic researchers improve the validity of their studies as well as critically interpret results in the orthopedic literature.
Figure 1. Sources of error in clinical research.
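The difference between random and systematic error can be illustrated with a small simulation. The sketch below is not part of the original paper: the function names, the 10% true revision risk, and the 30% of revisions that go unrecorded are hypothetical values chosen only for illustration. It repeats a simulated study many times at two sample sizes and reports the spread of the estimates (random error) and their average deviation from the truth (systematic error).

```python
import random
import statistics

TRUE_RISK = 0.10  # hypothetical true revision risk (assumption for illustration)

def estimated_risk(n, missed_fraction, seed):
    """Estimate revision risk in one simulated study of n patients.

    missed_fraction is the share of true revisions that are never recorded,
    a stand-in for a systematic error in data collection.
    """
    rng = random.Random(seed)
    recorded = 0
    for _ in range(n):
        revised = rng.random() < TRUE_RISK
        if revised and rng.random() >= missed_fraction:
            recorded += 1
    return recorded / n

def spread_and_bias(n, missed_fraction, reps=100):
    """Repeat the study `reps` times; return (SD of estimates, mean error)."""
    estimates = [estimated_risk(n, missed_fraction, seed) for seed in range(reps)]
    return statistics.stdev(estimates), statistics.mean(estimates) - TRUE_RISK

for n in (200, 3200):
    sd, bias = spread_and_bias(n, missed_fraction=0.3)
    print(f"n={n}: spread={sd:.4f}, bias={bias:+.4f}")
```

Under these assumptions, the spread of the estimates shrinks as the sample size grows, while the average error stays near the same negative value, which is why systematic error has to be addressed through design and data collection rather than sample size.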

Systematic error or bias can be introduced into a study for many reasons. Sources of systematic bias broadly fall into three categories: confounding bias, selection bias, and measurement bias. In this paper, we focus on confounding and selection biases. Measurement and misclassification bias are discussed in another manuscript in this methodology series.3 When confounding or selection bias is present, the observed relationship between the exposure and outcome is distorted from the true relationship.
Before conducting a research study, a clear testable hypothesis should be formulated. This hypothesis is often “is exposure X associated with outcome Y controlling for other factors Z?” The “exposure of interest” or “exposure” is the potential risk factor or variable that is being evaluated. Typically, we “control” or “adjust” for other factors or variables by including them in a statistical model. Common exposure variables of interest in orthopedic research include surgery type (e.g., primary or revision surgery), surgical indications (e.g., osteoarthritis versus other indications for total joint arthroplasty [TJA]), implant type (e.g., constrained versus unconstrained acetabular liners), or medical interventions (e.g., prophylactic antibiotics and anticoagulants). Common outcome variables include surgery complications, revisions or other reoperations, readmissions, and patient-reported outcomes. Other factors or variables that may need to be evaluated and accounted for include baseline variables such as age, sex, and comorbidities; post-exposure variables such as manipulation under anesthesia (in an analysis looking at surgical risk factors for arthrofibrosis); and factors that influence inclusion into or exclusion from a study, such as limiting the study population to revision surgeries.
Prior to data collection, careful consideration of the structural relationship between the exposure of interest, outcome, and other variables or factors will help to avoid or reduce systematic bias through proper study design and/or collecting the correct set of variables to use in the analysis. One practical approach to doing this is by drawing directed acyclic graphs (DAGs).4 DAGs display the relationship between variables graphically with lines and arrowheads. When planning a study, drawing a DAG can be helpful to identify what variables should be collected. When conducting an analysis on previously collected data, drawing a DAG is a tool to help decide what variables should be included or excluded in the analysis.
The most common types of additional variables controlled for in analyses are confounders, intermediate variables (also known as mediators), and colliders. All of these variables are associated with both the exposure and the outcome of interest; however, they differ in their temporality. Figure 2 contains four DAGs that graphically depict the relationship of an exposure, outcome, confounder, mediator, and collider. The temporal ordering of the additional variable with the exposure and outcome determines whether it needs to be included or excluded from the analysis. We provide orthopedic examples and more details on how to identify confounders, mediators, and colliders in the sections below. Adjusting for “the usual” variables in a statistical analysis may not be the correct approach for all research questions. Thus, to avoid systematic bias and perform high-quality studies, critical thought should be given to the structure of all variables before study initiation.
Figure 2: Directed acyclic graphs (DAGs) displaying the relationship between an exposure X, an outcome Y, a confounder C, a collider S, and an intermediate or mediator variable M.
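A DAG can also be written down as a list of arrows, which makes the three structures easy to compare side by side. The following is a minimal sketch, not a method from this paper: the `classify` function and the variable names are hypothetical, and the rule looks only at direct arrows between the variable, the exposure, and the outcome. A full DAG analysis would examine entire paths rather than single arrows.

```python
def classify(edges, exposure, outcome, variable):
    """Label `variable` from its direct arrows to/from exposure and outcome.

    `edges` is a list of (cause, effect) pairs. This simplified rule ignores
    longer paths, so it is only an illustration of the three basic structures.
    """
    arrows = set(edges)
    into_exposure = (variable, exposure) in arrows
    into_outcome = (variable, outcome) in arrows
    from_exposure = (exposure, variable) in arrows
    from_outcome = (outcome, variable) in arrows
    if into_exposure and into_outcome:
        return "confounder"   # common cause of exposure and outcome
    if from_exposure and into_outcome:
        return "mediator"     # on the pathway from exposure to outcome
    if from_exposure and from_outcome:
        return "collider"     # common effect of exposure and outcome
    return "unclassified"

# Arrow lists loosely based on the orthopedic examples in this paper.
confounder_dag = [("age", "implant_type"), ("age", "revision"),
                  ("implant_type", "revision")]
mediator_dag = [("frailty", "complications"), ("complications", "readmission"),
                ("frailty", "readmission")]
collider_dag = [("education", "participation"), ("revision", "participation")]

print(classify(confounder_dag, "implant_type", "revision", "age"))
print(classify(mediator_dag, "frailty", "readmission", "complications"))
print(classify(collider_dag, "education", "revision", "participation"))
```

The same variable can receive a different label when the exposure or outcome changes, which mirrors the point that variable selection depends on the research question.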

Confounders
A confounding variable, also known as a confounder, distorts the true relationship between the exposure and outcome if not adjusted for in an analysis. Common confounders in TJA research include age at surgery, sex, body mass index (BMI), surgical indications, arthritis severity, and comorbidities. A confounder is a variable that 1) is associated with the exposure/treatment, 2) is associated with the outcome, and 3) is present before the exposure/treatment or is measured at baseline. A confounder can also be thought of as a common cause of both the exposure and outcome. For example, age at surgery is a confounder of the relationship between implant type (exposure) and revision risk (outcome) since age at surgery is associated with implant selection, is associated with revision risk, and is measured at baseline. The definition of a confounding variable is graphically shown in Figure 2B. Since randomization of an exposure hypothetically removes the association between the confounder and exposure, adjustment for confounding variables is not usually necessary in randomized controlled trials.
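To make the definition concrete, the sketch below simulates a binary baseline confounder that affects both the exposure and the outcome, while the exposure itself has no effect on the outcome. All names and probabilities are hypothetical and chosen for illustration; the adjustment shown (stratifying on the confounder and averaging the stratum-specific risk differences) is one simple option among the methods discussed later in this section.

```python
import random

def simulate(n=200_000, seed=0):
    """Simulate (confounder, exposure, outcome) triples with NO true exposure effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = rng.random() < 0.5                      # baseline confounder (e.g., older age)
        x = rng.random() < (0.7 if c else 0.3)      # confounder influences exposure
        y = rng.random() < (0.20 if c else 0.05)    # confounder influences outcome
        rows.append((c, x, y))
    return rows

def risk(rows, exposed, stratum=None):
    """Outcome risk among exposed/unexposed, optionally within one confounder level."""
    ys = [y for c, x, y in rows if x == exposed and (stratum is None or c == stratum)]
    return sum(ys) / len(ys)

def crude_rd(rows):
    """Unadjusted risk difference between exposed and unexposed."""
    return risk(rows, True) - risk(rows, False)

def adjusted_rd(rows):
    """Stratum-specific risk differences, weighted by the confounder distribution."""
    total = 0.0
    for level in (False, True):
        weight = sum(1 for c, _, _ in rows if c == level) / len(rows)
        total += weight * (risk(rows, True, level) - risk(rows, False, level))
    return total

rows = simulate()
print(f"crude RD:    {crude_rd(rows):+.3f}")
print(f"adjusted RD: {adjusted_rd(rows):+.3f}")
```

With these assumed probabilities, the crude risk difference is clearly positive even though the exposure does nothing, and the adjusted estimate is close to zero.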
The third criterion for defining a confounder is the most important as it distinguishes a confounder from a mediator or collider. It is not adequate to define a confounder solely based on whether its addition to a statistical model changes the association of interest by more than 10% or by some statistical variable selection procedure (e.g., forward or backwards selection or a machine learning approach). These methods do not differentiate a confounder from a mediator or collider because they do not consider the temporal ordering of the confounder variable in relation to the exposure and the outcome. If you incorrectly adjust for a mediator or collider variable in a statistical analysis, the estimated association can be biased. Therefore, critical assessment of the relationship between all potential variables related to a research question is important in order to collect and include information on the correct set of variables in an analysis. As we will discuss in sections below, common confounders can be mediators in some situations, and thus, careful consideration is required for each additional variable to include in a statistical analysis for each research question of interest.
If data on a confounding variable are collected and available to include in an analysis, it is called a measured confounder. If it is not available to include in an analysis because it was not collected, it is called an unmeasured confounder. If a variable meets the definition of a confounder, then it needs to be included (or adjusted for) in a statistical analysis. If a variable meets this definition and is not included in a model for any reason, the estimated association between the exposure and outcome will be biased. This type of bias is called residual confounding since confounding is still present even after adjustment for the other confounding variables.
Registry databases are foundational to orthopedic research. However, registries collect information on a limited number of variables, and hence, registry-based studies can be limited by residual confounding. If this is the case, the possible unmeasured confounding variables should be listed and discussed as a limitation in the discussion section of a paper. For example, if the exposure of interest is fixation type (cemented versus cementless fixation) and the outcome is revision, obesity is a confounder of this relationship.5 However, BMI is not usually collected or is mostly missing in many registry databases. If we do not have BMI measurements, then we cannot adjust for it in a statistical analysis. The measure of association we estimate between fixation type and revision is limited by possible residual confounding by BMI. On the other hand, if the exposure of interest is use of highly-crosslinked polyethylene liners and the outcome is revision, obesity is not a confounder of this relationship because it is not associated with liner choice. Therefore, adjustment for BMI is not necessary.
The amount of residual confounding bias will depend on the strength and direction of the association between the exposure and the unmeasured confounder(s) as well as the association between the outcome and the unmeasured confounder(s). Not adjusting for a confounder could mask the association between the exposure and outcome and incorrectly make it appear to be null (i.e., it appears there is no effect of the exposure on the outcome when the confounder is not adjusted for, but an effect is observed when the confounder is adjusted for). It is also possible that if a confounder is not adjusted for, the observed association between the exposure and outcome will be biased high. In this setting, the unadjusted association appears larger than the true association. There is no statistical test that can be used to test if unmeasured confounding exists. However, sensitivity analyses can be performed to determine how much an association could change under different degrees of unmeasured confounding.6
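One sensitivity analysis of this kind is the E-value described in reference 6. As a hedged illustration, the sketch below applies the point-estimate formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)) for RR ≥ 1, with protective estimates inverted first. The function name `e_value` is hypothetical, and the example covers only the point estimate, not the confidence limit.

```python
import math

def e_value(risk_ratio):
    """E-value for an observed risk ratio (point estimate only).

    Interpreted as the minimum strength of association, on the risk ratio
    scale, that an unmeasured confounder would need with both the exposure
    and the outcome to fully explain away the observed association.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective associations (RR < 1) are inverted before applying the formula.
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(2.0), 2))  # 3.41
print(round(e_value(0.5), 2))  # 3.41 (same as RR = 2 after inversion)
```

An observed risk ratio of 2 therefore could be explained away only by an unmeasured confounder associated with both exposure and outcome by a risk ratio of about 3.4 each, under the assumptions of that method.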
Multivariable adjustment is the most common method to adjust for measured confounders, but there are other statistical methods that can be used, including inverse probability weighting7, restriction, matching, stratification, propensity scores8, or instrumental variable9 analysis methods. Some of these methods will be covered in future methodology paper series by our group. Restriction (e.g. restriction of the study cohort to patients with osteoarthritis as the underlying operative indication) explicitly excludes patients to increase the similarity of the exposed and unexposed patients on one or more potential confounding factors. Matching and stratification can also result in exclusions of some patients if it is difficult to find suitable matches for some patients or if some strata contain patients from only one exposure group. We do not recommend using propensity scores for matching.10 Importantly, any exclusions should be based on information available at baseline. Excluding patients increases the internal validity of the results, but the analysis sample may not represent the original study population of interest (i.e., loss of generalizability) and the sample size may be too small to allow for adequate precision of the derived estimates (loss of power). Once a set of confounding variables is selected, the way they are included in a model still needs to be addressed. It is possible that nonlinear terms as well as interactions are needed for proper model fitting.
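As an illustration of one of these alternatives, the sketch below applies inverse probability weighting to simulated data with a single binary confounder. Everything in it is an assumption made for the example: the variable names, the probabilities, and the true risk difference of 0.05. The propensity score is estimated simply as the observed proportion exposed within each confounder level; with several or continuous confounders it would usually come from a fitted model.

```python
import random

def simulate(n=200_000, seed=1):
    """Simulate (confounder, exposure, outcome); true exposure effect is RD = 0.05."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = rng.random() < 0.5
        x = rng.random() < (0.7 if c else 0.3)
        p_y = (0.20 if c else 0.05) + (0.05 if x else 0.0)
        y = rng.random() < p_y
        rows.append((c, x, y))
    return rows

def crude_rd(rows):
    """Unadjusted risk difference."""
    exposed = [y for _, x, y in rows if x]
    unexposed = [y for _, x, y in rows if not x]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def ipw_rd(rows):
    """Risk difference after weighting each patient by 1 / P(own exposure | C)."""
    # Propensity score P(X=1 | C), estimated within each confounder level.
    ps = {}
    for level in (False, True):
        stratum = [x for c, x, _ in rows if c == level]
        ps[level] = sum(stratum) / len(stratum)
    num = {True: 0.0, False: 0.0}
    den = {True: 0.0, False: 0.0}
    for c, x, y in rows:
        w = 1 / ps[c] if x else 1 / (1 - ps[c])
        num[x] += w * y
        den[x] += w
    return num[True] / den[True] - num[False] / den[False]

rows = simulate()
print(f"crude RD: {crude_rd(rows):+.3f}")
print(f"IPW RD:   {ipw_rd(rows):+.3f}")
```

Weighting creates a pseudo-population in which the confounder is balanced across exposure groups, so under these assumptions the weighted estimate lands near the simulated effect while the crude estimate overstates it.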
Confounder selection is nuanced, and we present one approach here sometimes referred to as the “common cause” approach. This is a simplified version of the causal backdoor path criterion of Pearl.11,12 For a more technical discussion of various methods for confounder selection we refer the reader to the following reference.13
Mediators
An intermediate variable, also known as a mediator, is a variable that is affected by the exposure and then in turn affects the outcome. In orthopedic research, common mediators include post-surgery or perioperative variables such as a dislocation or surgical complications. For example, when examining frailty as the exposure and readmission as the outcome, an example of a mediator variable is postoperative complications.14 This is because frailty increases the risk of complications, and complications increase the risk of readmission. Similarly, when examining the use of tranexamic acid as the exposure and complications as the outcome, an example of a mediator variable is blood transfusions during the perioperative period.
Mediators can be mistaken for confounders and incorrectly adjusted for in a statistical analysis. A mediator variable is a variable that 1) is associated with the exposure/treatment, 2) is associated with the outcome, and 3) occurs after the exposure/treatment but before the outcome. This relationship is shown graphically in Figure 2D. Mediators differ from confounders in the temporality of the variable: they occur after the exposure and before the outcome. It is incorrect to control for intermediate variables in a statistical analysis unless mediation analysis or estimating path-specific effects is the study goal and this is properly considered in the analysis. A variable that is a mediator in one study may not be a mediator in another study. This underscores that “the usual” variables should not be included in every analysis and that each research question should be carefully considered when determining whether a variable is included or excluded for a particular analysis.
If a mediator is included in a statistical analysis and the exposure-outcome, exposure-mediator, and mediator-outcome associations are in the same direction (i.e., all positively associated or all negatively associated), including the mediator variable in the statistical model will attenuate the estimated effect towards the null. In other words, if a researcher incorrectly includes a mediator variable in a statistical analysis, the observed association will be smaller than the true association. It is possible that the association is estimated to be null (i.e., no association) when a mediator is adjusted for when an association truly exists. Put differently, by adjusting for a mediator variable, a researcher is isolating the association from the exposure to the outcome that does not operate through the mediation pathway (X-M-Y), that is, the association between X and Y independent of M. This answers a different research question than is commonly of interest.
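The attenuation described above can be reproduced with a simulated linear example. The sketch below is illustrative only: the coefficients (exposure to mediator 1.0, mediator to outcome 0.5, direct exposure to outcome 0.3) and all names are hypothetical, and the outcome is treated as a continuous score so that ordinary least squares can be computed in closed form. The total effect of the exposure is then 0.3 + 1.0 × 0.5 = 0.8.

```python
import random

def simulate(n=50_000, seed=2):
    """Simulate exposure X, mediator M (affected by X), and outcome Y."""
    rng = random.Random(seed)
    xs, ms, ys = [], [], []
    for _ in range(n):
        x = 1.0 if rng.random() < 0.5 else 0.0
        m = 1.0 * x + rng.gauss(0, 1)             # X raises M
        y = 0.3 * x + 0.5 * m + rng.gauss(0, 1)   # direct path plus path through M
        xs.append(x)
        ms.append(m)
        ys.append(y)
    return xs, ms, ys

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

def total_effect(xs, ys):
    """Slope of Y on X alone (mediator not in the model)."""
    return cov(xs, ys) / cov(xs, xs)

def effect_adjusted_for_mediator(xs, ms, ys):
    """Coefficient of X in a least-squares fit of Y on X and M."""
    sxx, smm, sxm = cov(xs, xs), cov(ms, ms), cov(xs, ms)
    sxy, smy = cov(xs, ys), cov(ms, ys)
    return (sxy * smm - smy * sxm) / (sxx * smm - sxm ** 2)

xs, ms, ys = simulate()
print(f"without mediator: {total_effect(xs, ys):.2f}")
print(f"with mediator:    {effect_adjusted_for_mediator(xs, ms, ys):.2f}")
```

Because all three associations are positive in this setup, adding the mediator to the model moves the exposure coefficient toward the null, from roughly the total effect to roughly the direct effect alone.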
Colliders
A collider variable is a common effect of both the exposure and outcome. It is defined as a variable that 1) is associated with the exposure/treatment, 2) is associated with the outcome, and 3) occurs after both the exposure/treatment and outcome. This relationship is graphically depicted in Figure 2C. Adjusting for a collider will introduce bias in an analysis. This bias is called collider bias. The most common type of collider bias is selection bias. Selection bias occurs when the exposure and the outcome considered in a study are associated with a selection criterion.
Selection Bias
Selection bias is the bias that results from selection of a study sample such that some individuals are more likely to be included in the sample than others. This produces a sample that is not representative of the population of interest. When selection bias is present, the association between the exposure and outcome among those selected into a study (the observed association) is different from the association between the exposure and outcome among all people who are eligible for the study (the true association). The difference between the observed association and the true association is due to the selection bias. Types of selection bias that may arise in orthopedic research include informative dropout15 (e.g., individuals who are exposed are more likely to drop out of a study), self-selection into a study16 (e.g., people who previously experienced a complication after surgery are more likely to volunteer to be in a study that studies the complication), and incorrect selection of controls (e.g., selecting hospital-based controls instead of population-based controls). Selection bias can be unavoidable in some analyses and proper adjustment is required to correct for it. In order to properly account for selection bias, one needs to take into account the probability of being selected into the study sample. This can be done with the help of a statistician using standardization, inverse probability weighting, or stratification.
Studies in TJA cohorts that involve voluntary participation (e.g., surveys, scheduled prospective visits to clinic for research data collection) are prone to selection bias because participants are typically healthy, highly educated and non-smokers.16 For example, consider the exposure of education level and the outcome of revision surgery. A possible collider is voluntary participation (i.e. self-selection) since education level is positively associated with participation, revision surgery is inversely associated with participation, and selection into the study occurs after possible revision surgery. Thus, a spurious association between education level and revision surgery would be observed even if there is none as a result of selection bias.
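The education and revision example can be sketched as a simulation. In the code below, education and revision are generated independently, so the true risk difference is zero, and participation depends on both. The participation probabilities and all names are hypothetical. The weighted estimate uses the true participation probabilities, which are known here only because the data are simulated; in a real study they would have to be estimated.

```python
import random

def simulate(n=200_000, seed=3):
    """Simulate (educated, revised, participates, P(participate)) per patient."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        educated = rng.random() < 0.5
        revised = rng.random() < 0.10  # independent of education: true RD is 0
        p = 0.3 + (0.4 if educated else 0.0) - (0.15 if revised else 0.0)
        participates = rng.random() < p
        rows.append((educated, revised, participates, p))
    return rows

def risk_difference(rows, weighted=False):
    """Revision risk in educated minus non-educated patients.

    With weighted=True, each row is weighted by 1 / P(participate), which is
    inverse probability weighting for selection.
    """
    num = {True: 0.0, False: 0.0}
    den = {True: 0.0, False: 0.0}
    for educated, revised, _, p in rows:
        w = 1 / p if weighted else 1.0
        num[educated] += w * revised
        den[educated] += w
    return num[True] / den[True] - num[False] / den[False]

population = simulate()
participants = [row for row in population if row[2]]

print(f"everyone eligible:      {risk_difference(population):+.3f}")
print(f"participants only:      {risk_difference(participants):+.3f}")
print(f"participants, weighted: {risk_difference(participants, weighted=True):+.3f}")
```

Under these assumptions, restricting to participants produces a spurious positive association between education and revision, and weighting participants by the inverse of their selection probability brings the estimate back toward zero.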
Another example of selection bias in TJA research is the obesity paradox.17,18 Even though obesity is associated with an increased mortality risk in the general population, even severe obesity is not associated with a higher risk of death in TJA patients.17 In this example, the exposure is obesity, the outcome is mortality, and the collider is having undergone a TJA procedure. Since obesity is associated with numerous adverse health outcomes and a higher mortality rate in the general population, obese patients who survive long enough to undergo TJA are healthier than obese individuals in general, paradoxically making obesity seem protective among TJA patients. In studies that restrict the patient population to individuals with TJA, this protective association of obesity is not causal, even though overweight and obese TJA patients have a lower average risk of death.
Selection bias can also arise in studies of recurrent events after TJA. For example, older age and female sex are established risk factors for first-time dislocation after total hip arthroplasty (THA), whereas use of large heads and of constrained or dual-mobility liners is strongly associated with a reduced risk of dislocation. However, in a study19 examining risk factors for recurrent dislocation among those who had already experienced a first-time dislocation (collider), younger patients had the highest risk of recurrent dislocation, and sex, head size, and use of constrained or dual-mobility liners were not associated with the risk of recurrent dislocation. The difference in the risk factors could be due to the selection bias that arises by only considering patients who had already experienced a first-time dislocation.
Guidelines for Researchers and Reviewers:
Data source: Is the data source for the study suitable in terms of sample size, coverage and representativeness?
Study variables: Are all exposure, outcome and other potential confounder variables available in sufficient detail, and accessible for the study? What variables are included in the analyses? Are the same variables adjusted for in all models? How are these variables chosen?
Exposure definition: Inclusion of subjects into the study should be based on exposure information available at the time of study entry and not based on future information.
Follow-up: Is the duration of follow-up and the time between the exposure and outcome sufficiently long for the type of outcome? Did the investigators define the timing of the outcome relative to the exposure?
Unmeasured confounding: Are there any unmeasured confounders? If so, is this acknowledged? Is any reasoning given to how this unmeasured confounding affects the results? If the data source is missing potentially relevant confounders, researchers can conduct sensitivity analyses to assess the impact of that confounder on the study results.
Post-treatment variables: Are any post-treatment variables adjusted for in the analyses? If so, is mediation analysis the goal?
Selection bias: Is the study population similar to the population about which conclusions are made? Is selection bias likely? Is discussion given to the limitations of this potential bias?
Conclusions
There are three common variable types that are associated with both the exposure and outcome in research: confounders, colliders, and mediators. For most analyses, only confounders should be included in statistical models. Bias can result if confounders are incorrectly left unadjusted and/or if colliders or mediators are incorrectly adjusted for in an analysis. Careful thought should be given to which variables to include in a model before analyzing the data. Furthermore, apart from assessing whether confounding or selection bias exists in a study, it is also necessary to assess the strength and direction of the potential bias.
Supplementary Material
Funding:
This work was funded by National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) grant P30AR76312 and the American Joint Replacement Research-Collaborative (AJRR-C). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
References
1. Zaniletti I, Devick KL, Larson DL, Lewallen DG, Berry DJ, Maradit Kremers H. Study Types in Orthopedics Research: Is my study design appropriate for the research question? Journal of Arthroplasty. 2022;under review.
2. Zaniletti I, Devick KL, Larson DL, Lewallen DG, Berry DJ, Maradit Kremers H. P-values and Power in Orthopedic Research: Myths and Reality. Journal of Arthroplasty. 2022;under review.
3. Zaniletti I, Devick KL, Larson DL, Lewallen DG, Berry DJ, Maradit Kremers H. Measurement Error and Misclassification in Orthopedics: when study subjects are categorized in the wrong exposure or outcome groups. Journal of Arthroplasty. 2022;under review.
4. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.
5. Roder C, Bach B, Berry DJ, Eggli S, Langenhahn R, Busato A. Obesity, age, sex, diagnosis, and fixation mode differently affect early cup failure in total hip arthroplasty: a matched case-control study of 4420 patients. J Bone Joint Surg Am. 2010;92(10):1954–1963.
6. Haneuse S, VanderWeele TJ, Arterburn D. Using the E-Value to Assess the Potential Effect of Unmeasured Confounding in Observational Studies. JAMA. 2019;321(6):602–603.
7. Mansournia MA, Altman DG. Inverse probability weighting. BMJ. 2016;352:i189.
8. Sturmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. Journal of Clinical Epidemiology. 2006;59(5):437–447.
9. Greenland S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol. 2018;47(1):358.
10. King G, Nielsen R. Why Propensity Scores Should Not Be Used for Matching. Political Analysis. 2019;27(4):435–454.
11. Pearl J. Causal Diagrams for Empirical Research. Biometrika. 1995;82(4):669–688.
12. Pearl J. Causality: models, reasoning, and inference. Cambridge, U.K.; New York: Cambridge University Press; 2000.
13. VanderWeele TJ. Principles of confounder selection. Eur J Epidemiol. 2019;34(3):211–219.
14. McIsaac DI, Aucoin SD, Bryson GL, Hamilton GM, Lalu MM. Complications as a Mediator of the Perioperative Frailty-Mortality Association. Anesthesiology. 2021;134(4):577–587.
15. Maradit Kremers H, Devick KL, Larson DR, Lewallen DG, Berry DJ, Crowson CS. Competing Risk Analysis: What Does It Mean and When Do We Need It in Orthopedics Research? J Arthroplasty. 2021;36(10):3362–3366.
16. Galea S, Tracy M. Participation rates in epidemiologic studies. Ann Epidemiol. 2007;17(9):643–653.
17. Dowsey MM, Choong PFM, Paxton EW, Spelman T, Namba RS, Inacio MCS. Body Mass Index Is Associated With All-cause Mortality After THA and TKA. Clin Orthop Relat Res. 2018;476(6):1139–1148.
18. Kunutsor SK, Whitehouse MR, Blom AW. Obesity paradox in joint replacement for osteoarthritis - truth or paradox? Geroscience. 2021.
19. Hermansen LL, Viberg B, Overgaard S. Risk Factors for Dislocation and Re-revision After First-Time Revision Total Hip Arthroplasty due to Recurrent Dislocation - A Study From the Danish Hip Arthroplasty Register. J Arthroplasty. 2021;36(4):1407–1412.
