2021 May 4;11:9457. doi: 10.1038/s41598-021-89020-x

Figure 1.

Collider bias in polygenic gene-environment models.

Panel A. Schematic diagram of the collider bias that arises between the polygenic score, the environment, and the outcome in cases of gene-environment interdependence. Dark purple circles represent variables, grey circles represent unobserved confounders, and squares indicate collider variables. Adding E to the model alongside the polygenic score G makes E a collider. A collider that is not conditioned on blocks the path between its sources (G and U); once the collider is controlled for, the path is opened, as indicated by the green nodes.

Panel B (top). Spurious regression estimates for the polygenic score and the environment from a series of OLS simulations covering a range of gene-environment interdependence and a modest, moderate, or strong confounder, U. When the gene-environment correlation is positive and an uncontrolled confounder is positively correlated with both the covariate and the outcome, collider bias deflates the polygenic score estimates. The degree of bias depends on the strength of the unobserved confounder, U, and on the gene-covariate interdependence. Estimates of the environmental effect are biased upward but are not affected by the gene-environment correlation.

Panel B (bottom). R-squared inflation plot from the same series of OLS simulations; collider bias inflates the explained-variance statistics. The R-squared statistic for the model with the endogenous covariate and the polygenic score includes not only the true share of the variance in Y explained by G and E (baseline estimate indicated by 0) but also the components of variance that are due to the gene-environment correlation and the confounder(s), U.
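The pattern described for Panel B (top) can be reproduced with a small simulation. The sketch below is not the authors' simulation design; it assumes a simple linear data-generating process, E = a*G + b*U + noise and Y = beta_g*G + beta_e*E + c*U + noise, with G and U independent and all variances set to 1. The function names and parameter values (`simulate_ols`, `expected_bias`, `a`, `b`, `c`, `beta_g`, `beta_e`) are hypothetical and chosen only for illustration. Under these assumptions, omitting U shifts the G coefficient by -c*a*b/(b^2 + 1) and the E coefficient by c*b/(b^2 + 1), so the bias in E does not depend on a.

```python
import numpy as np


def simulate_ols(a, b, c, beta_g=0.3, beta_e=0.3, n=200_000, seed=0):
    """Fit Y ~ G + E by OLS with the confounder U left out of the model.

    a: effect of G on E (gene-environment correlation, assumed linear)
    b: effect of U on E; c: effect of U on Y (confounder strength)
    Returns the estimated coefficients on G and E.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal(n)            # polygenic score
    U = rng.standard_normal(n)            # unobserved confounder
    E = a * G + b * U + rng.standard_normal(n)
    Y = beta_g * G + beta_e * E + c * U + rng.standard_normal(n)

    # Design matrix with intercept; U is deliberately excluded.
    X = np.column_stack([np.ones(n), G, E])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef[1], coef[2]


def expected_bias(a, b, c):
    """Large-sample bias in the G and E coefficients under the assumed model."""
    delta = b / (b**2 + 1.0)              # slope of U on the part of E not due to G
    return -c * a * delta, c * delta


if __name__ == "__main__":
    # Vary the gene-environment correlation for a fixed confounder strength.
    for a in (0.0, 0.25, 0.5):
        g_hat, e_hat = simulate_ols(a, b=0.5, c=0.5)
        print(f"a={a:.2f}  G estimate={g_hat:.3f}  E estimate={e_hat:.3f}")
```

In this setup the G estimate falls below its true value as `a` increases, while the E estimate stays above its true value by roughly the same amount for every `a`, which matches the qualitative behaviour the caption reports.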