Abstract
Variable importance is a key statistical issue in exposure mixtures, as it allows a ranking of exposures as potential targets for intervention, and helps to identify bad actors within a mixture. In settings where mixtures have many constituents or high between-constituent correlations, estimators of importance can be subject to bias or high variance. Current approaches to assessing variable importance have major limitations, including reliance on overly strong or incorrect constraints or assumptions, excessive model extrapolation, or poor interpretability, especially regarding practical significance. We sought to overcome these limitations by applying an established doubly-robust, machine learning-based approach to estimating variable importance in a mixtures context. This method reduces model extrapolation, appropriately controls confounding, and provides both interpretability and model flexibility. We illustrate its use with an evaluation of the relationship between telomere length, a measure of biologic aging, and exposure to a mixture of polychlorinated biphenyls (PCBs), dioxins, and furans among 979 US adults from the National Health and Nutrition Examination Survey (NHANES). In contrast with standard approaches for mixtures, our approach selected PCB 180 and PCB 194 as important contributors to telomere length. We hypothesize that this difference could be due to residual confounding in standard methods that rely on variable selection. Further empirical evaluation of this method is needed, but it is a promising tool in the search for bad actors within a mixture.
Keywords: Mixtures, Causal inference, Variable importance, Persistent organic pollutants
1. Introduction
Mixtures are increasingly being considered as research targets within many areas of public health and medical sciences. Mixtures are simply collections of multiple exposures, and are the subject of myriad study questions, ranging from descriptions of the exposures making up the mixture to inquiries about causal effects of the mixture on health-related outcomes. When considering causal effects, both joint effects and independent effects can be estimated. Studies in which joint effects of a mixture are harmful tend to raise questions about which components of the mixture are the “bad actors,” and the best approach for identifying them remains unclear. “Bad actor” is an informal term often used to describe both: 1) an exposure that “drives” the overall joint effect of a mixture, such as when the association between a specific job task and a health outcome can be wholly explained by the fact that workers doing the task are always exposed to a certain chemical, or 2) an exposure that has stronger associations with the health outcome than other components of the mixture and therefore stands out as a potentially harmful agent within the mixture, even if it does not wholly explain the association. The framework of causal inference (also called “causal effect estimation” by Greenland (2017)) can be useful in clarifying the goals and methods of bad actor searches in mixtures. Existing demonstrations of methods to identify bad actors within a mixture (e.g. those by Czarnota et al (2015) and Gibson et al (2019)) focus on statistical aspects of these methods. Existing approaches to identifying bad actors implicitly work under a set of desiderata that prioritize statistical considerations, such as reducing the variance inflation that arises when analyzing an independent variable in relation to a correlated set of independent variables. However, these statistical considerations can be at odds with causal considerations when, for example, variance reduction comes at the expense of introducing confounding bias. In the current manuscript, we describe a set of desiderata for choosing methods to assess bad actors within a mixture. Under these statistical and causal considerations, we suggest that bad actor searches could benefit from the methods and ideas of stochastic intervention effects from Díaz Muñoz and Van Der Laan (2012), Díaz Muñoz et al (2015), and Díaz Muñoz and van der Laan (2018). To illustrate the application of the stochastic intervention effect framework and compare it to existing methods, we include a re-examination of associations between leukocyte telomere length (non-coding regions of DNA at the ends of chromosomes) and a set of correlated, persistent organic pollutant biomarkers using publicly available data from the cross-sectional, representative sample collected through the National Health and Nutrition Examination Survey (NHANES) (Mitro et al, 2016; Gibson et al, 2019; Zipf et al, 2013).
1.1. Bad actor quantification as causal estimation
Causal inference is often an implicit, if not explicit, goal of bad actor searches. The term “bad actor” implies, by virtue of the word “act,” a consideration of causal relationships. That is, if we are examining the relative importance of specific PCBs within a mixture with respect to telomere length, it is of little direct interest if the crude association between, say, a one-unit change in PCB 180 and telomere length is large. Rather, it needs to be interpreted within the context of the mixture itself, where effect estimates are likely influenced by co-exposure confounding. Co-exposure confounding is the term often used to describe scenarios where estimated associations between an outcome and one component of a mixture may be due to a causal effect of a different, but correlated, component of the mixture. Co-exposure confounding complicates regulation, intervention, and understanding of mechanism by introducing bias. If we use biased estimates to decide on regulating a single chemical or to point toward mechanisms, the regulations may be ineffective and we may be misled about potential mechanisms. However, even if a bad actor search aspires to be causal, there is no single method that can be used to distinguish association from causation. Causal inference principles can guide design and analysis of data, but it is essential to have sound statistical approaches. Statistical approaches to “variable importance” are useful for quantifying or ranking the strength of an association of exposure with an outcome relative to other exposures under consideration (Van der Laan, 2006). In the remainder of the manuscript, we discuss the statistical problem of variable importance in an aspirationally causal fashion. We first describe conditions under which causal inferences about variable importance may be possible, but also explicitly consider whether these conditions can be met for most research questions. We also consider whether the approaches necessary to overcome statistical issues in mixtures (many exposures, high correlation) might be at odds with causal inferential issues.
1.2. Notation
We denote a mixture of p exposures as a vector given by M ≡ (M1,M2,…,Mp), an (unmeasured) common source of the exposures as U, a vector of covariates (potential confounders) as W, and a continuous or binary outcome as Y. We define O ≡ (Y,M,W,U) as a random variable with distribution P0. Let Mi refer to the ith exposure in the mixture and Mi/ refer to the remainder of the mixture (M1,…,Mi−1,Mi+1,…,Mp). We assume that the probability density of O can be decomposed as
Under this decomposition, Díaz Muñoz and van der Laan (2018) link the observed data with a causal parameter via a non-parametric structural equation model. For now, we suppose that the causal structure underlying the data follows the causal diagram given in Figure 1. We denote an exposure regime as g(v), which is simply a function that assigns values to each component of the mixture given some vector of variables V. The notion of a regime is useful in counterfactuals, where we distinguish the observed exposures M from a counterfactual exposure that would be assigned according to covariates by the function m = g(v). Finally, we denote a potential outcome as Y(g(v)), which is the value of Y we would observe under the (possibly) counterfactual regime g(v). For notational simplicity, potential outcomes are hereafter written as Y(g). Causal consistency links the observed data with the potential outcomes by the identity Y(g) = Y if M = g(v) (Pearl, 2010).
Fig. 1.

Causal diagram (directed acyclic graph) representation of the data O ≡ (U,W,M,Y) for a 2-dimensional mixture with an unmeasured predictor of each exposure in the mixture (U). Here, the diagram demonstrates that control of confounding for the effect of M1 on Y requires adjustment for the known confounder W and the other component of the mixture M2, due to a common source (U) shared with M1.
For our purposes here, it is also useful to denote by M+(g) a (possibly) counterfactual set of exposures that depend on the “natural” value of M (Young et al, 2014). For example, Young et al (2014) give a hypothetical example in which researchers first observe individuals exercising, and if it appears that an individual is stopping exercise before 30 minutes have passed (the “natural” value of exposure M), then a researcher intervenes to compel the individual to continue to exercise until 30 minutes have passed (M+(g)). In contrast, a more standard exposure contrast would be to randomize interventions to two groups with set durations of exercise (e.g. 30 minutes vs 15 minutes). As is demonstrated below, counterfactuals like M+(g) are useful to consider in mixtures because they offer a way to reduce a common concern of mixtures: extrapolation outside the data.
1.3. Variable importance for bad actors in mixtures
Variable importance measures roughly break into two classes, based on whether or not the rankings are derived from effect sizes; we discuss each in turn.
1.3.1. Effect size-based rankings
Effect-size-based rankings utilize comparisons of how much an outcome would be expected to change given an incremental change in some exposure. This approach is advantageous in that “effect-size” importance, when causally interpreted, directly ties variable importance measures to decisions about whether and where to target an intervention on exposures within a mixture.
Example: linear-regression variable importance
A basic and useful reference method for variable importance with a continuous outcome is to use the coefficients of a linear regression model.
For example, a model given by

$$E(Y \mid M_1, M_2, W) = \beta_0 + \beta_1 M_1 + \beta_2 M_2 + \beta_3 W$$
would allow us to establish which exposure (M1 or M2) is more important to the outcome by ranking the absolute values of β1 and β2. This basic approach can be used for various elaborations on the linear model, such as the LASSO (Tibshirani, 1996) or elastic-net algorithms (Zou and Hastie, 2005).
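As a minimal illustration of this reference approach (a sketch using simulated data with hypothetical variable names, not the NHANES example), exposures can be ranked by the absolute values of their fitted coefficients:

```r
# Sketch: effect-size-based importance from ordinary least squares.
# Data and coefficient values are simulated/hypothetical, for illustration only.
set.seed(1)
n  <- 500
w  <- rnorm(n)                        # measured confounder
m1 <- 0.6 * w + rnorm(n)              # exposure 1
m2 <- 0.7 * m1 + 0.3 * w + rnorm(n)   # exposure 2, correlated with exposure 1
y  <- 0.25 * m1 - 0.10 * m2 + 0.5 * w + rnorm(n)

fit   <- lm(y ~ m1 + m2 + w)
betas <- coef(fit)[c("m1", "m2")]     # exposure coefficients only
sort(abs(betas), decreasing = TRUE)   # rank exposures by |beta|
```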
Issues of effect-size-based rankings
Several complications arise with this approach when considering mixtures, each of which needs to be addressed, namely:
Different distributions: the magnitudes of the β coefficients in a linear model depend on the scale of each exposure. Therefore, we could arbitrarily alter relative importance by changing the units or scale of an exposure. On the other hand, common scaling can result in different levels of statistical certainty across a mixture. This problem is compounded when mixtures consist of both continuous and binary variables.
Non-linearity: If, say, the true model underlying the data included a quadratic (non-linear) term for M1, then importance would vary depending on which value of M1 were assessed, so that we could arbitrarily establish importance by evaluating it at specific values of M1 (e.g. if the dose-response for M1 were quadratic and monotonically increasing, the importance measure for M1 would be larger at higher values of M1; see the worked illustration following this list).
Non-additivity: If the true model underlying the data included a product term between M1 and M2, then importance for both of these exposures would depend not only on their own values but also on the value of the other exposure, so that, analogous to non-linearity, we could arbitrarily establish importance at specific values of M1 and M2.
Extrapolation and non-positivity: If, as is common in mixtures, M1 and M2 are highly correlated, then a one-unit change in one exposure, holding the other exposure constant, may not be well supported by the data and may depend heavily on model extrapolation (Snowden et al, 2015). Further, a one-unit reduction in exposure for exposures that take only non-negative values may yield implausible exposure values and invalid causal inference, a problem known as “non-positivity” (Westreich and Cole, 2010).
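To make the non-linearity and non-additivity concerns concrete, consider a purely hypothetical outcome model (not drawn from the example data):

$$E(Y \mid M_1, M_2) = \beta_0 + \beta_1 M_1 + \beta_2 M_1^2 + \beta_3 M_1 M_2.$$

The expected change in Y per unit increase in M1 is then

$$\frac{\partial E(Y \mid M_1, M_2)}{\partial M_1} = \beta_1 + 2\beta_2 M_1 + \beta_3 M_2,$$

so any single-number “importance” for M1 depends on the value of M1 at which it is evaluated and on the level of the co-exposure M2.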
1.3.2. Non-effect-size-based rankings
A second reference method for variable importance derives from the literature on black box prediction models, where variable importance is used to rank variables according to how important they are for predictions.
Example: random forest variable importance
Random forest is an ensemble machine learning algorithm that combines information from multiple learners, which are typically recursive partitioning trees trained on different subsets of the data and predictors (Breiman, 2001a). One common example of variable importance for black box prediction models is the permutation algorithm used by the authors of the randomForest software package in R (Liaw and Wiener, 2002) and described by Strobl et al (2007). Here, the reduction in mean-squared prediction error for a given variable across all trees is compared to the reduction in mean-squared prediction error that occurs under a random permutation of that variable, and it is scaled by the standard deviation of that reduction across trees. Consider the example of a random forest with a single tree, where the tree is created by recursively partitioning the data into smaller and smaller covariate-defined groups based on how similar the outcome is across members of those groups. Trees in a random forest use “in-bag” and “out-of-bag” samples, where training is done with the in-bag sample, and estimates of prediction error (to assess overfit) are made in the out-of-bag sample. The mean-squared prediction error prior to any partitioning is given as

$$mspe_0 = \frac{1}{n_{oob}} \sum_{i=1}^{n_{oob}} (Y_i - \mu_{ib})^2,$$

where n_oob is the number of observations in the out-of-bag sample and μ_ib is the sample mean of Y in the in-bag sample (the sample used to fit the tree). If, for example, the first partitioning happens for PCB 180 at a 38 ng/g threshold, then the mean-squared prediction error at this step is calculated as

$$mspe_1 = \frac{1}{n_{oob}} \sum_{i=1}^{n_{oob}} \{Y_i - \mu_{ib}(\text{PCB 180}_i)\}^2,$$

where μ_ib(PCB 180_i) is the sample mean of Y among the subset of the data (partition) containing the ith individual (<38 ng/g vs ≥38 ng/g). For this two-partition tree, the importance of PCB 180 is then calculated as mspe_0 − mspe_1. This basic calculation generalizes to multiple splits and multiple trees as used in random forest, and the importance for each exposure across the forest is divided by the standard deviation of importance measures across all trees in the forest. Random forest was initially motivated by the goal of prediction (Breiman, 2001b). A motivating goal of prediction algorithms was to identify small sets of variables that could be measured in new data to make accurate predictions. The random forest importance measure directly assesses the impact of a given variable on predictions in data that have not been used to train the algorithm, and it can be used as a decision tool for selecting the variables on which predictions should be based in future data.
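A minimal sketch of this permutation-based importance measure, using the randomForest package with simulated data (variable names and data-generating values are hypothetical), is:

```r
# Sketch: permutation importance from the randomForest package.
# Simulated data; only m1 and w affect y, while m2 is a correlated co-exposure.
library(randomForest)
set.seed(1)
n  <- 500
w  <- rnorm(n)
m1 <- 0.6 * w + rnorm(n)
m2 <- 0.7 * m1 + 0.3 * w + rnorm(n)
y  <- 0.25 * m1 + 0.5 * w + rnorm(n)

rf <- randomForest(x = data.frame(m1, m2, w), y = y,
                   ntree = 1000, importance = TRUE)
# type = 1: increase in out-of-bag MSE when a variable is permuted,
# scaled by the standard deviation of that increase across trees
importance(rf, type = 1, scale = TRUE)
```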
1.3.3. Issues of non-effect-size-based rankings
Non-effect-size-based rankings, like those used in random forest algorithms, have several drawbacks with respect to mixtures:
Lack of interpretability: they provide little to no indication of practical importance for clinical or public health decisions.
Sensitivity to correlation: if two highly correlated variables are included in an analysis, and only one of them is causal, model fit may change very little upon removing or permuting the causal exposure if the correlated exposure is retained. This sensitivity relates to the fact that non-effect-size-based rankings often derive from settings in which the goal is prediction, rather than causal inference.
1.4. Desiderata of variable importance in mixtures
Given some common problems in mixtures, namely: multiple, correlated exposures; potentially multiple causal agents; and non-linearity and non-additivity; it is helpful to define what an ideal measure of variable importance in mixtures might look like, especially when we desire to answer causal questions. We propose that a variable importance measure for mixtures have the following desiderata:
Limited extrapolation
Interpretability as effect sizes to quantify practical importance
Appropriate adjustment for confounders
Flexible model form
Additionally, Van der Laan (2006) notes that variable importance measures are often considered as side-effects of other estimators, rather than targets of inference themselves. Thus, bias-variance tradeoffs may not be optimal for assessing variable importance because importance measures are often not the main target of the analysis. For example, even though ordinary least squares yields the lowest overall mean-squared error among unbiased linear estimators (via the Gauss-Markov theorem), its coefficient estimates can have inferior mean-squared error compared to some shrinkage estimators of the coefficients (Greenland, 2000). Thus, if the target of investigation is variable importance, then we might also add:
Allow for estimators of variable importance that address the bias-variance tradeoff for variable importance itself, rather than just the predictions.
2. Methods
2.1. Stochastic intervention effects
One class of estimands that fulfills the desiderata described above comprises effect estimates that involve stochastic regimes, specifically the types of regimes described by Díaz Muñoz and Van Der Laan (2012), Díaz Muñoz et al (2015), and Díaz Muñoz and van der Laan (2018). A stochastic regime is a function that probabilistically assigns exposure values, possibly based on measurements of other covariates, such that an individual exposure is drawn from a conditional distribution. Specific to our purposes for mixtures, the stochastic regimes of interest here depend on the “natural” value of an exposure, or the level of exposure that would have been observed absent an intervention. For time-fixed exposures, the natural value of exposure is simply the observed exposure. We denote a stochastic regime that depends on the natural value of exposure mi as g(mi,mi/,w), where mi/ denotes all exposures in the mixture other than mi. Let the distribution of mi, given Mi/ = mi/ and W = w, be supported on the interval (l(mi/,w), u(mi/,w)). For example, if m is a univariate exposure “receive a hysterectomy” and W denotes whether an individual has an intact uterus (yes or no), then the probability ranges from 0 to 1 for individuals with an intact uterus, but the probability of m is 0 among individuals without a uterus. Given this support, Díaz Muñoz and Van Der Laan (2012) defined the stochastic regime under consideration here as

$$g(m_i, m_{i/}, w) = \begin{cases} m_i + \delta & \text{if } m_i + \delta < u(m_{i/}, w) \\ m_i & \text{otherwise.} \end{cases}$$
This regime is equivalent to an intervention that adds a small amount (δ) to the exposure of every individual, but only if the shifted value remains below the support bound u(mi/,w) (thus meeting the first desideratum of minimal extrapolation). The upper bound u can be found via the (generalized) propensity score: if the probability (density) of exposure at mi+δ, given mi/ and w, is zero, then g(mi,mi/,w) = mi. A stochastic intervention effect is given as the expected contrast between the potential outcome under the stochastic regime and the observed outcome:

$$\Psi \equiv E\{Y(g)\} - E(Y).$$
This parameter is interpretable as an effect size for a small change in exposure, thus meeting the second desideratum. The focus on effects of small changes in exposure, δ, reflects the consideration that, within a correlated mixture, estimating effects of modest changes in exposure is a useful way to address limitations of mixtures data for learning about independent effects (Snowden et al, 2015).
2.1.1. Identification
We first describe causal identification conditions that are often used for (non-stochastic) fixed exposure regimes. For an intervention on the ith component of a mixture that sets Mi to a constant value g, standard identification conditions for such problems include causal consistency (as described above), as well as:
$$Y(g) \perp\!\!\!\perp M_i \mid M_{i/}, W \qquad \text{(conditional exchangeability)} \tag{1}$$

$$\Pr(M_i = g \mid M_{i/} = m_{i/}, W = w) > 0 \text{ for all observable } (m_{i/}, w) \qquad \text{(positivity)} \tag{2}$$
In words, conditional exchangeability means that, for an outcome and a single exposure Mi, confounding and selection bias can be controlled by conditioning on the other exposures and the covariate vector W. Positivity means that, for each observable value of the other exposures and the covariate vector, the intervention value of the exposure of interest must be potentially observable in the data. Exchangeability can be assessed via the directed acyclic graph (DAG) shown in Figure 1. A straightforward application of the rules of this diagram demonstrates that conditional exchangeability holds for a single exposure M1 and that we can control confounding using measured variables (Pearl, 1995). Adjusting for a set of variables sufficient to ensure exchangeability meets the third desideratum. Conditional exchangeability conditions for stochastic regimes are slightly stronger than those for a fixed regime. Notably, Richardson and Robins (2013) and Young et al (2014) show that the natural value of exposure should be considered as a potential confounder. In the straightforward causal diagram given here, there are no practical consequences, but both of those manuscripts demonstrate scenarios with time-varying exposures where effects of stochastic regimes are not identified, even though effects of fixed regimes are. For the stochastic regimes described above, the positivity assumption is guaranteed by the definition of the intervention. In mixtures problems, a non-parametric estimate is typically not possible (e.g. if M or W have large dimension or contain continuous components). Thus, parametric or semi-parametric models must be used, which require some additional assumptions about model specification.
2.1.2. Estimation
Estimation of stochastic intervention effects can be carried out by targeted maximum likelihood/minimum loss estimation (TMLE) (Van der Laan et al, 2011). Estimation of this stochastic intervention effect using TMLE, along with a set of sufficient conditions for its validity, is described in detail by Díaz Muñoz and van der Laan (2018). A brief sketch is given here to build some intuition for the approach. First, we denote the generalized propensity score for the ith component of the mixture (the conditional distribution function of Mi, given Mi/ and W) as g0(mi|mi/,w). Estimation of Ψ is based on the identity (given identification conditions):
$$E\{Y(g)\} = \int E(Y \mid M_i = g(m_i, m_{i/}, w), M_{i/} = m_{i/}, W = w)\, g_0(m_i \mid m_{i/}, w)\, q_0(m_{i/}, w)\, d\mu(m_i, m_{i/}, w) \tag{3}$$
This identity is a form of the g-computation algorithm formula, which is an integration of the conditional expectation of an outcome over the distribution of the exposure (g) and covariates (q) (Robins, 1986). TMLE uses estimators of the g-computation algorithm formula as an “initial estimator” of the target parameter. This estimate is updated by maximizing a weighted likelihood function, with weights given by an auxiliary covariate defined based on an estimate of the generalized propensity score, gn(mi|mi/,w). Setting aside the adjustment required at the upper support bound u(mi/,w) (detailed by Díaz Muñoz and van der Laan (2018)), this auxiliary covariate takes the form

$$H_n(m_i, m_{i/}, w) = \frac{g_n(m_i - \delta \mid m_{i/}, w)}{g_n(m_i \mid m_{i/}, w)}.$$
Conceptually, this covariate is a stabilized inverse probability weight that weights the proportion of individuals with Mi = mi so that they represent the group of individuals with Mi = mi−δ under the counterfactual stochastic intervention in which δ units were added to their exposures. Further nuance is given by Díaz Muñoz and van der Laan (2018). Under causal assumptions implied by the directed acyclic graph in Figure 1, this approach estimates a total, average, independent exposure effect, or the average effect in the population of a small change in a single exposure without directly changing other exposures. The applicability to mixtures in this setting is immediate: using stochastic intervention effect sizes to find bad actors means that “importance” of an exposure within a mixture directly refers to how much we could change the outcome if we made a slight change to the exposure of the population. This definition makes the population, and the actual exposure levels experienced by that population, central to how a bad actor is defined. In some cases, some exposures may mediate the effects of other exposures in a mixture (for example, ozone can be created when nitrogen oxides interact with volatile organic compounds in the air, so ozone may be a mediator of some effects of nitrogen oxides on health). When that is the case, provided the necessary conditions hold for mediation analysis, the stochastic treatment effects of the mixture will include some direct effects and some total effects of exposure. In that case, ranking by effect size gives us a way to compare interventions in which we modify a single exposure and keep other exposures fixed at a given level. Finally, Van der Laan (2006) notes that, absent causal identification, variable importance estimates are nonetheless useful associational measures themselves.
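To build intuition for the estimand (not the TMLE estimator itself), the following simplified sketch computes a plug-in (g-computation) estimate of the effect of adding δ to one exposure; it uses simulated data with hypothetical names, ignores the support restriction at u(mi/, w), and omits the targeting step and influence-curve-based variance that TMLE adds:

```r
# Sketch: plug-in (g-computation) estimate of a delta-shift effect for one
# exposure, ignoring the support truncation and the TMLE update. Illustration only.
set.seed(1)
n     <- 1000
delta <- 0.1
w  <- rnorm(n)
m1 <- 0.6 * w + rnorm(n)
m2 <- 0.7 * m1 + 0.3 * w + rnorm(n)
y  <- 0.25 * m1 - 0.10 * m2 + 0.5 * w + rnorm(n)
dat <- data.frame(y, m1, m2, w)

# Outcome regression (a flexible learner could be substituted here)
qfit <- lm(y ~ m1 + m2 + w, data = dat)

# Predict under the shifted exposure m1 + delta, leaving m2 and w at their
# observed ("natural") values, and contrast with the observed mean outcome
dat_shift <- transform(dat, m1 = m1 + delta)
psi_hat   <- mean(predict(qfit, newdata = dat_shift)) - mean(dat$y)
psi_hat
```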
2.1.3. Machine learning and double-robustness
Díaz Muñoz and van der Laan (2018) show that the TMLE estimator is doubly-robust, meaning that it is an (asymptotically) consistent estimator of Ψ if either a consistent g-computation estimator (outcome regression) or a consistent estimator of the generalized propensity score (exposure regression) is used (Bang and Robins, 2005). Further, these estimators can include data-adaptive estimators (e.g. machine learning algorithms), which overcome the strict assumptions of parametric models and support the plausibility of obtaining a consistent estimator, thus meeting the fourth desideratum. The TMLE estimator also targets a bias-variance tradeoff for the importance measure, rather than the overall model fit, which is the fifth desideratum.
To fit outcome and exposure regression models in the example below, we used super learner (Van der Laan et al, 2007), a machine learning approach based on “stacked generalization” (Wolpert, 1992). Super learner is described in detail both for general purposes by Van der Laan et al (2011) and for purposes of stochastic interventions by Díaz Muñoz and van der Laan (2018). Briefly, however, super learner is an ensemble machine learning approach that uses cross-validation to select a convex combination of a “library” of machine learning algorithms (learners) that minimizes some loss function. That is, super learner predictions comprise a weighted average of predictions from the library. Super learner has an oracle property and will asymptotically perform as well as a correctly specified model if one is included in the library (Van der Laan et al, 2007). For predicting telomere length, the loss function was mean-squared error, and for predicting exposures, the loss function was given as the negative log-likelihood of exposure, given covariates.
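The analyses reported here used the sl3 implementation of super learner; purely as an illustration of the stacking idea, a minimal sketch with the CRAN SuperLearner package (simulated data, small library) is:

```r
# Sketch: super learner (stacking) with the CRAN SuperLearner package.
# The paper's analyses used sl3; this is an illustrative stand-in.
library(SuperLearner)
set.seed(1)
n <- 500
X <- data.frame(m1 = rnorm(n), m2 = rnorm(n), w = rnorm(n))
y <- 0.25 * X$m1 + 0.5 * X$w + rnorm(n)

sl <- SuperLearner(Y = y, X = X, family = gaussian(),
                   SL.library = c("SL.mean", "SL.glm", "SL.glmnet"),
                   cvControl = list(V = 10))   # 10-fold cross-validation
sl$coef   # convex weights assigned to each learner in the library
```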
2.2. Example: Persistent organic pollutant biomarkers and leukocytic telomere length
We demonstrate the usage of several variable importance measures using an example from a cross-sectional, population-based, observational study of the association between leukocyte telomere length and a mixture of persistent organic pollutant biomarkers. Telomeres are non-coding regions of DNA located on the ends of chromosomes. Telomere length is considered a measure of biologic aging because chromosomes lose a section of DNA each time they go through mitosis. Although the presence of telomeres helps ensure that this DNA degradation is limited to non-coding regions, over-shortening (and eventual cell death) is counteracted by telomerase, an enzyme that adds DNA bases onto the ends of telomeres to prevent or slow their loss. Because excessive telomerase activity could effectively immortalize a cell, telomere length has been proposed as a marker of carcinogenic processes. Altogether, both over- and under-regulation of telomerase can generate telomeres that are either longer or shorter, and both states can signal disease processes. Therefore, toxicity of a chemical within a mixture could mean that a given chemical pollutant results in longer or shorter telomere length. The International Agency for Research on Cancer (IARC) has classified a number of PCBs, Dioxins, and Furans as carcinogenic to humans (International Agency for Research on Cancer, 2012, 2015). Part of the basis for classifying dioxins and dioxin-like PCBs was evidence for a mechanism involving binding of the aryl-hydrocarbon receptor (AhR), which can lead to apoptosis or changes in gene expression and cell replication. In vitro evidence from Sarkar et al (2006) suggested that telomerase was up-regulated in cells treated with dioxins, and those authors hypothesized that up-regulation could be mediated by AhR activation. In contrast, Ziegler et al (2017) found that telomere lengths were lower among individuals with higher PCB exposures (mainly non-dioxin-like PCBs), and, when plasma from PCB-exposed individuals was added to T-cells in experiments, telomerase activity was suppressed. Thus, a mixture of PCBs, Dioxins, and Furans represents a set of exposures, members of which may have opposing relationships with telomerase and telomere length. Mitro et al (2016) originally categorized the exposures of interest according to their mechanistic properties, creating indices grouped as “non-dioxin-like PCBs” versus “non-ortho PCBs”, as well as a catch-all group sorting exposures by their “toxic equivalence”. The latter group quantified a prior characterization of all the pollutants of interest according to their toxic equivalency factors relative to 2,3,7,8-tetrachlorodibenzo-p-dioxin. The toxic equivalency factors were created by the World Health Organization in 2005 to characterize in vivo evidence of toxicity and can be used to generate a single weighted index exposure, with weights determined by the toxic equivalency factors (Van den Berg et al, 2006). We note that this type of weighted index exposure approach assumes that all exposures have linear effects in the same direction, an assumption referred to as “directional homogeneity” (Keil et al, 2020). In our re-analysis, we characterize variable importance using methods that relax this assumption.
2.2.1. Participants
Participant data for the current manuscript are based on 979 participants from the 2001–2002 cycle of the National Health and Nutrition Examination Survey (NHANES), a cross-sectional national survey conducted by the U.S. Centers for Disease Control and Prevention. The study population, and exposure and outcome assessment methods are similar to those previously reported in detail by Mitro et al (2016) and Gibson et al (2019). Briefly, 11,039 individuals were interviewed by NHANES in the 2001–2002 cycle. We excluded individuals who a) were less than 20 years of age (N=5,628), b) did not provide a blood sample or consent to use of DNA (N=1,142), c) did not have sufficient sample to estimate telomere length (N=9), d) did not participate in the environmental chemical analysis subset (N=2,850), e) were missing data on BMI, education, or serum cotinine (N=80), or f) were missing data on one of 24 PCBs, Dioxins or Furans that formed part of a toxic equivalence metric created by Mitro et al (2016) (N=327). Following these exclusions, the study sample comprised 1,003 individuals (as reported by Mitro et al (2016) and Gibson et al (2019)), from which an additional 24 individuals were excluded if they were missing data on any of the 18 PCBs, dioxins or furans considered here (final N=979).
2.2.2. Exposure and telomere length quantification
Serum measurements of PCB, dioxin and furan congeners were measured by high-resolution gas chromatography/isotope-dilution high-resolution mass spectrometry. Following Gibson et al (2019), we excluded any congener from analysis if fewer than 60% of sample concentrations were above the limit of detection, resulting in a final selection of 18 persistent organic pollutant biomarkers for further analysis. Consistent with prior analyses and recommended correction methods (O’Brien et al, 2016), all analyses utilized congener concentrations that were divided by serum lipid concentration. Telomere length was estimated using the quantitative polymerase chain reaction method, and telomere (T) length relative to a standard (S) DNA referent was calculated (Cawthon, 2002; Lan et al, 2009). The outcome for this analysis is the mean T/S ratio across up to 6 assays per sample, but we label this ratio “telomere length” for simplicity. All assays were subject to quality control evaluations, as previously described.
2.2.3. Weights and covariates
Recruitment into NHANES involves oversampling from segments of the US population. Following Mitro et al (2016), analyses were weighted to the US Census population using the 2-year dioxin subsample weights. These sampling weights are used to generalize the results to the 2000 US Census population. Following Mitro et al (2016) and Gibson et al (2019), all analyses were adjusted for age (continuous), body mass index (continuous), NHANES racial/ethnic category (non-Hispanic white [referent], non-Hispanic black, Mexican-American, and other), education category (less than high school, high school graduate, some college, and college graduate or more [referent]), lipid cotinine levels (continuous), and white blood cell count (continuous, % lymphocytes/monocytes/neutrophils/eosinophils; % basophils were a linear combination of the other values and were omitted). For linear parametric approaches, a continuous term for age² was also entered, following prior analyses. Univariate exposure and covariate distributions are identical to those reported by Gibson et al (2019) and thus are omitted here. Continuous exposure and covariate values were each divided by 2 times their standard deviations prior to analysis (Gelman, 2008). To retain interpretation on the absolute scale, we did not natural-log transform telomere lengths or exposures as was done in prior analyses. No standard errors are given for importance measures whose standard software does not report them. Non-parametric bootstrapping (10,000 samples) was used to estimate standard errors for other variable importance measures, except the stochastic intervention-based approach. The NHANES weights impact both point estimates and variance, which is accommodated by non-parametric bootstrapping. For the stochastic intervention-based approach, the survey weights were used as a multiplier of the weights that already underlie TMLE, which utilizes a robust variance estimator based on the influence curve (Díaz Muñoz et al, 2015).
2.2.4. Variable importance measures
Variable importance for the 18 PCBs, Dioxins, and Furans in the study sample was estimated using the following approaches:
Effect size based importance
Ordinary, LASSO, and Elastic net linear regression
Exposures were entered linearly into an adjusted linear regression model, and coefficient magnitudes were used to assess variable importance. LASSO and elastic-net regression both provide shrinkage and variable selection. The penalty parameters for LASSO and elastic-net were selected via 10-fold cross-validation. Non-exposure covariates were forced into the model with unpenalized coefficients.
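A minimal sketch of this penalization scheme with the glmnet package (simulated data; the exposure and covariate names are hypothetical) is shown below; setting penalty.factor to zero for the covariate columns leaves them unpenalized, so they are forced into the model:

```r
# Sketch: LASSO for exposures with unpenalized (forced-in) covariates, via glmnet.
library(glmnet)
set.seed(1)
n <- 500
x_exp <- matrix(rnorm(n * 5), n, 5,
                dimnames = list(NULL, paste0("pcb", 1:5)))           # stand-in exposures
x_cov <- matrix(rnorm(n * 3), n, 3,
                dimnames = list(NULL, c("age", "bmi", "cotinine")))  # stand-in covariates
y <- 0.2 * x_exp[, 1] + 0.3 * x_cov[, 1] + rnorm(n)

pf <- c(rep(1, ncol(x_exp)), rep(0, ncol(x_cov)))  # 0 = no penalty (covariates)
cvfit <- cv.glmnet(x = cbind(x_exp, x_cov), y = y,
                   alpha = 1, nfolds = 10, penalty.factor = pf)
coef(cvfit, s = "lambda.min")   # exposure coefficients used for ranking
```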
Quantile-based g-computation
A linear quantile-based g-computation (qgcomp) approach was utilized (Keil et al, 2020). Exposures were first scored according to their quantile category (e.g. all exposure values between the 25th and 50th percentiles are scored identically). Next, a multiple linear regression was fit with all scored exposures and untransformed covariates, and a summary effect estimate was generated using g-computation. Variable importance for each exposure is given by the magnitude of its coefficient in the underlying linear model.
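A minimal sketch of this approach, assuming the interface of the qgcomp R package (argument names are given as we understand them and data are simulated, so treat this as illustrative rather than a record of the analysis), is:

```r
# Sketch: linear quantile-based g-computation with the qgcomp package.
# Simulated data with hypothetical exposure (pcb1, pcb2) and covariate (age) names.
library(qgcomp)
set.seed(1)
n <- 500
dat <- data.frame(pcb1 = rnorm(n), pcb2 = rnorm(n), age = rnorm(n))
dat$y <- 0.2 * dat$pcb1 + 0.3 * dat$age + rnorm(n)

fit <- qgcomp.noboot(y ~ pcb1 + pcb2 + age,
                     expnms = c("pcb1", "pcb2"),  # which columns form the mixture
                     data = dat, q = 4, family = gaussian())
fit   # overall effect per quantile increase; per-exposure weights/coefficients
      # provide the variable importance measures
```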
Weighted quantile sum regression
A linear weighted quantile sum regression (WQSR) model was fit (Carrico et al, 2015). Briefly, this approach utilizes sample splitting where, in the first sample, a constrained linear regression is fit to estimate a convex set of coefficients (weights) for a linear combination of scored exposures (similar to quantile-based g-computation), adjusted for covariates, across a number of bootstrap samples (here, 1000). In the second sample, the weights are used to create a single index exposure that is then used in an adjusted linear regression model. Variable importance for a given congener was defined using the product of the congener’s weight and the linear model coefficient for the index. This method is typically fit with non-negative and/or non-positive constraints, such that two sets of variable importance measures are generated. This yields interpretations much like those of the toxic equivalency-based approach, where variable importance is with respect to one effect direction only. We report results for both sets of constraints.
TMLE of stochastic intervention effects
We estimated variable importance for each exposure by comparing population average leukocyte telomere length under no intervention versus that under a stochastic intervention to increase exposure for each participant by (at most) δ = 0.01 standard deviations. This contrast was chosen so that effect directions would be consistently interpreted across methods. We utilized the TMLE approach of Díaz Muñoz and van der Laan (2018), including the asymptotic variance estimators based on the efficient influence function to estimate 95% confidence intervals (CI). Super learner was used for both outcome and exposure models. It is common practice to choose a number of candidate learners for the super learner library that capture a variety of different shapes. The super learner library for the model for mean telomere length included 1) the sample mean (Mean) 2) ordinary least squares regression (OLS) 3) a generalized additive model (Wood et al, 2016, GAM) 4) multivariate adaptive regression splines (Friedman, 1991, MARS) and 5–7) LASSO, ridge and elastic net regression, utilizing 10-fold cross-validation to select the penalty parameter λ (CV-LASSO, CV-ridge, CV-elastic net) 8) a neural net (Neural net) 9–11) OLS with prior variable selection/reduction based on principal component analysis (PCA+OLS), OLS coefficient screening + OLS with selected variables (OLS+OLS), random forest importance based selection followed by OLS (RF+OLS) and 12) stepwise regression (Stepwise). The super learner library for the generalized propensity score for each of the PCBs, Dioxins, and Furans included 1) Linear regression model for exposure with homoscedastic error under a normal distribution assumption (Normal GLM) 2) Linear regression model for ln(exposure) with homoscedastic error under a normal distribution assumption (Log-normal GLM) 3–4) histogram (discrete probability) density estimator using a multinomial model with 10 equally sized categories based on quantiles (Multinomial GLM), or a multinomial elastic-net (Multinomial elastic net) 5–8) a density estimator with homoscedastic error under a Gaussian kernel density with a mean model fit by a neural net (Neural net), LASSO regression (LASSO), ridge regression (Ridge) or elastic net (Elastic net) 9) a chained learner in which predictors for the exposure of interest are first run through a principal components analysis, and linear regression with homoscedastic normal errors is then used to estimate density based on the first two principal components (PCA + Normal GLM). Super learner was implemented using the sl3 package v1.4.4 in R v4.2.2 (https://github.com/tlverse/sl3), and the TMLE algorithm was implemented in the vibr package v1.0.3 in R (https://github.com/alexpkeil1/vibr).
Non-effect size based importance
Random Forest
Random forest regression was implemented using the original algorithm described by Breiman (2001a), with 10,000 trees. Variable importance was determined using the out-of-bag estimates of the reduction in mean-squared error, as described above and by Strobl et al (2007).
3. Results
Weighted, bivariate spearman correlation coefficients ranged from 0.16 (1,2,3,4,6,7,8-HpCDF and PCB 180) to 0.97 (PCB 153 and PCB 138), and, considered as a group, correlations were highest among the non-dioxin-like PCBs (Figure 2).
Fig. 2.

Weighted Spearman correlation matrix, organized by non-dioxin-like PCBs (PCB 74–PCB 194), non-ortho PCBs (PCB 126, PCB 169), and mono-ortho PCBs, furans, and dioxins (PCB 118–1,2,3,4,6,7,8–HpCDF). HpCDD, heptachlorodibenzo-p-dioxin; HpCDF, heptachlorodibenzofuran; HxCDD, hexachlorodibenzo-p-dioxin; HxCDF, hexachlorodibenzofuran; OCDD, octachlorodibenzo-p-dioxin; PCB, polychlorinated biphenyl; PeCDF, pentachlorodibenzofuran.
3.1. Super learner and predictions underlying the stochastic variable importance estimates
The underlying super learner prediction for telomere length (as part of the TMLE procedure) was heavily dependent on linear shrinkage models and variable selection. CV-Elastic-net predictions were given 59% of the weight for the overall prediction, while RF + OLS was given 23%. PCA + OLS and ridge regression were given 10% and 7% of the weight (Table 1). This result indicates that the best predictions of telomere length involved some dimensionality reduction, and there was not strong evidence of non-linearity. When predicting each exposure (for the purpose of estimating a generalized propensity score that is used in TMLE), most exposures had multiple learners that contributed substantially to the prediction (Figure 5). Notably, Multinomial GLM featured prominently, suggesting that regularization was less important to exposure prediction than it was to predicting telomere length.
Table 1.
Super learner coefficients for the prediction of telomere length, given the mixture of PCBs, dioxins, and furans as well as potential confounders: age, body mass index, racial/ethnic category, education category, lipid cotinine levels, and white blood cell count. Learners are defined in the main text.
| Learner | Coefficient |
|---|---|
| Mean | 0.00 |
| OLS | 0.00 |
| GAM | 0.00 |
| MARS | 0.00 |
| CV-LASSO | 0.00 |
| CV-ridge | 0.07 |
| CV-elastic net | 0.59 |
| Neural net | 0.00 |
| PCA + OLS | 0.10 |
| OLS + OLS | 0.00 |
| RF + OLS | 0.23 |
| Stepwise | 0.00 |
Fig. 5.

Super learner coefficients for the prediction of each congener of PCBs, dioxins, and furans, given other congeners and potential confounders: age, body mass index, racial/ethnic category, education category, lipid cotinine levels, and white blood cell count. Learners are defined in the main text. Note that Normal GLM, PCA + Normal GLM, and Log-normal GLM were omitted because they did not contribute substantially to prediction of any congener.
3.2. Variable importance method results
In ordinary least squares, the congener with the largest magnitude coefficient (scaled to a per-standard-deviation change) was PCB 180 (Figure 3). In spite of standardization, PCB 138–PCB 194 generally had more uncertain estimates, reflecting the high pairwise correlations among these variables. The coefficient for 2,3,4,7,8–PeCDF was statistically significantly above zero, though it was of slightly smaller magnitude than the coefficient for PCB 170. Scaled to a common effect size, a one standard deviation increase in PCB 180 was associated with a −0.25, 95% CI = (−0.44, −0.05) standard unit change in mean telomere length, and a one standard deviation increase in 2,3,4,7,8–PeCDF was associated with a 0.07, 95% CI = (0.0086, 0.14) standard unit change in mean telomere length. For the penalized linear regression, the cross-validated value of the penalty parameter for the LASSO was 0.005. Only three coefficients were given non-zero values by the LASSO fit: PCB 99, PCB 126, and 2,3,4,7,8–PeCDF. The cross-validated values of α and λ for the elastic-net were 0.0 and 0.09, indicating that the elastic-net penalty converged to a ridge regression penalty with no selection but heavy shrinkage. Consequently, coefficient values for the elastic net were all very close to zero relative to the other linear regression models. No exposures were selected out of the model, and no standard errors are given by the method, precluding any judgements about statistical evidence for importance. For qgcomp, the overall effect estimate (the change in mean telomere length per quartile increase in all exposures) was 0.031, with asymptotic 95% CI = (0.0010, 0.060), indicating that the joint effect of all exposures was positive. Regarding variable importance, results from qgcomp largely mirrored OLS, with the exception of PCB 126, PCB 153, and 1,2,3,6,7,8-HxCDD, which were of higher importance in the qgcomp fit relative to the OLS fit. The differences between these approaches are the exposure basis and what a “one unit” change means for each exposure, emphasizing the importance of model form and the difficulty of establishing comparability across exposures when deciding variable importance. For WQSR, the fit with a positive constraint yielded an overall effect estimate (change in mean telomere length per unit change in the index summary exposure) of 0.042, 95% CI = (0.013, 0.072). The WQSR fit with a negative constraint yielded an overall effect that was not consistent with a negative constraint [0.015, 95% CI = (−0.020, 0.052)], which the authors of the method interpret as an indication of no important predictors with negative effects. The most important variables were PCB 169, 2,3,4,7,8–PeCDF, and PCB 126. WQSR does not yield standard errors by which to estimate statistical evidence for importance. The stochastic intervention estimates were re-scaled to correspond to a standard deviation increase in exposure. These results indicated that PCB 194 and PCB 180 were the most important congeners (when considering only statistically significant results), though this method broadly yielded less precise estimates than the linear model-based methods (Figure 4). A one standard deviation increase in PCB 180 was associated with a −0.25, 95% CI = (−0.40, −0.10) change in mean telomere length, and a one standard deviation increase in PCB 194 was associated with a 0.54, 95% CI = (0.04, 1.0) change in mean telomere length.
PCBs 74, 187, and 118 were important in terms of effect size, but had wide confidence intervals that complicated interpretation in terms of importance. The random forest approach indicated that dioxin-like PCBs had the highest variable importance measures, and PCB 194 resulted in a 0.015 reduction in the %mean-squared out-of-bag estimate, indicating it was of highest importance for predictions.
Fig. 3.


Variable importance measures for linear model methods (left) and quantile-based linear model methods (right) for 18 PCBs, dioxins, and furans in relation to telomere length in 979 US adults, 2001–2002. WQSR(+) means WQSR with a positive constraint; WQSR(−) did not yield usable importance measures and is not included. Non-dioxin-like PCBs are represented by triangles, non-ortho PCBs by squares, and mono-ortho PCBs, furans, and dioxins by circles.
Fig. 4.


Variable importance measures for stochastic intervention approach estimated by TMLE (left) and random forest (right) for 18 PCBs, dioxins, and furans in relation to telomere length in 979 US adults, 2001–2002. Non-dioxin-like PCBs are represented by triangles, non-ortho PCBs by squares, and mono-ortho PCBs, furans, and dioxins by circles.
4. Discussion
Variable importance is a key aspect of mixtures research. A number of different approaches to variable importance have been proposed, but an approach based on stochastic interventions seems promising for addressing key issues in bad actor searches. These issues include statistical concerns such as model form (non-linearity, non-additivity, double-robustness for misspecification) as well as causal inference concerns (confounding control, positivity). Variable importance as determined by the stochastic intervention approach yielded results distinct from those of prior approaches: the stochastic intervention approach broadly picked non-dioxin-like PCBs as important variables, whereas other linear modeling approaches, and prior analyses of these data, tended to pick non-ortho PCBs (PCB 126, PCB 169) and/or furans like 2,3,4,7,8–PeCDF (Gibson et al, 2019). Interestingly, the selection of PCB 194 was mirrored by random forest and, to a small extent, OLS, but not by any other approach.
These different results across methods do not necessarily imply disagreement: the different types of methods address fundamentally different questions about importance. The crucial choice when conducting a bad actor search is often not between different methods that ask the same question, but rather among different questions about importance. The stochastic intervention approach addresses importance in a region of the data that is very close to the exposures of individuals within the population. That is, importance is assessed for exposures that are highly relevant to the individuals in the data, whereas the relevance of linear model-based importance, as well as of non-effect-size-based importance, is less clear. If effects are linear and additive, each of the effect-size-based approaches addressed here will converge to a similar answer, in spite of the fact that they imply different causal contrasts. In that case, the choice among methods becomes purely statistical. However, when the effects of exposures within a mixture are non-linear or have interactions within or outside the mixture, apparent importance can vary between these approaches. Thus, in a search for bad actors within a mixture, it becomes crucial to define the causal contrast of interest.
Even though the super learner fit for telomere length was based mostly on linear models, the use of outcome and exposure models within TMLE allows this approach to accommodate non-linearity. A number of methods have arisen recently with the specific purpose of addressing statistical issues that commonly arise in mixtures. Variable importance in mixtures has specifically been targeted by WQSR, which performed favorably to LASSO regression regarding variable selection in a simulation study (Czarnota et al, 2015). Another approach, Bayesian kernel machine regression (BKMR), was not used here because it cannot accommodate sampling weights (Bobb et al, 2015). However, both of these approaches use penalization in some form to address the potential for high-variance importance measures. Gibson et al (2019) applied both approaches to the data used here (without sampling weights) and found no associations between telomere length and any non-dioxin-like PCB, similar to our findings with WQSR. Importantly, both approaches use variable selection to some extent. Variable selection addresses the statistical issue that model parameter estimates can be highly unstable when co-adjusting for a group of highly correlated variables, but it comes at the expense of controlling confounding by those same variables. For example, PCB 180 and PCB 194 have effect estimates in opposing directions and have a correlation coefficient of 0.88. Notably, none of the parametric approaches with selection (LASSO, Elastic-net, WQSR) assigned importance to either PCB 180 or PCB 194. Models that exclude either one of these risk having strong unmeasured confounding for the exposure that remains in the model. If these exposures truly have opposing effects, this would result in bias toward the null for both. While high exposure correlations can lead to estimates for non-causal exposures that are opposing and high variance, it is notable that PCBs 138, 153, 170, and 187 all have correlation coefficients greater than 0.85 with PCB 180, and none of them displayed this pattern in the stochastic intervention-based approach. Thus, approaches that relied on selection may have been subject to this bias. Notably, bias from selection may be reduced by some doubly-robust approaches like TMLE, which reduce the “regularization bias” of approaches like LASSO (Chernozhukov et al, 2018).
While stochastic intervention effects seem useful for assessing importance in mixtures, more work is necessary to develop and explore estimators. TMLE relies on estimates of the generalized propensity score, which requires estimation of a conditional exposure density. While much progress has been made in developing super learner for density estimation (Díaz Muñoz and van der Laan, 2018), it is not yet understood how successful this approach can be when confronted with many correlated exposures. As Kang and Schafer (2007) exemplify, asymptotic guarantees of doubly-robust methods do not guarantee improvement of results over single-model approaches in finite samples. For causal inference, the utility of any approach relies on solid background knowledge about causal relationships among variables, rather than algorithmic cleverness (Rubin, 2008). The TMLE/super learner estimator can leverage certain forms of background knowledge, and future implementations could include hierarchical shrinkage models or mechanistic models when applicable. Nonetheless, methods that target variable importance in the face of uncertain background knowledge are needed, given the difficulty of the problem in mixtures and the challenge of organizing knowledge about mechanism for 20–50 exposures simultaneously.
4.1. Extensions to longitudinal data
The stochastic intervention approach to variable importance has previously been considered for longitudinal exposures by Díaz Muñoz et al (2015). Those authors considered importance of time-varying predictors of death in trauma patients admitted to the emergency department. In their approach, the authors considered death during discrete intervals and examined variable importance of predictors within those intervals, essentially analyzed as a series of cross-sectional studies. However, they also considered temporality by adjusting only for prior predictor levels at each time (e.g. in the second time-interval, predictors from the first time-interval were used as adjustment variables). One could follow this general approach by utilizing longitudinal measurements of persistent organic pollutants and telomere length. The mixtures framework is flexible, however, and we could similarly consider a pooled approach in which exposures at all times are contrasted in terms of their importance for an outcome measured after all exposures (such as telomere length), so that variable importance can be compared across exposures and across time. Care would be needed, though, as the size of the model in such a framework could grow quickly with multiple measurements of a mixture over time.
4.2. Conclusions
Traditional methods for identifying bad actors within a mixture can mask statistical uncertainty and practical importance. Stochastic intervention approaches to mixtures are promising for a number of statistical and causal reasons. The example here with telomere length and persistent organic pollutants suggests that this approach can give unique insights while keeping appropriately modest goals in the face of the hard statistical problems posed by mixtures.
References
- Bang H, Robins JM (2005) Doubly robust estimation in missing data and causal inference models. Biometrics 61(4):962–973 [DOI] [PubMed] [Google Scholar]
- Van den Berg M, Birnbaum LS, Denison M, et al. (2006) The 2005 world health organization reevaluation of human and mammalian toxic equivalency factors for dioxins and dioxin-like compounds. Toxicological sciences 93(2):223–241 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bobb JF, Valeri L, Claus Henn B, et al. (2015) Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics 16(3):493–508 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Breiman L (2001a) Random forests. Machine learning 45:5–32 [Google Scholar]
- Breiman L (2001b) Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical science 16(3):199–231 [Google Scholar]
- Carrico C, Gennings C, Wheeler DC, et al. (2015) Characterization of weighted quantile sum regression for highly correlated data in a risk analysis setting. Journal of agricultural, biological, and environmental statistics 20:100–120 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cawthon RM (2002) Telomere measurement by quantitative pcr. Nucleic acids research 30(10):e47–e47 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chernozhukov V, Chetverikov D, Demirer M, et al. (2018) Double/debiased machine learning for treatment and structural parameters
- Czarnota J, Gennings C, Wheeler DC (2015) Assessment of weighted quantile sum regression for modeling chemical mixtures and cancer risk. Cancer Informatics 14:CIN.S17295
- Díaz Muñoz I, van der Laan MJ (2018) Stochastic treatment regimes. In: van der Laan MJ, Rose S (eds) Targeted learning in data science: causal inference for complex longitudinal studies. Springer International Publishing, Switzerland, pp 219–232
- Díaz Muñoz I, van der Laan M (2012) Population intervention causal effects based on stochastic interventions. Biometrics 68(2):541–549
- Díaz Muñoz I, Hubbard A, Decker A, et al. (2015) Variable importance and prediction methods for longitudinal problems with missing variables. PLoS ONE 10(3):e0120031
- Friedman JH (1991) Multivariate adaptive regression splines. The Annals of Statistics 19(1):1–67
- Gelman A (2008) Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine 27(15):2865–2873
- Gibson EA, Nunez Y, Abuawad A, et al. (2019) An overview of methods to address distinct research questions on environmental mixtures: an application to persistent organic pollutants and leukocyte telomere length. Environmental Health 18:1–16
- Greenland S (2000) Principles of multilevel modelling. International Journal of Epidemiology 29(1):158–167
- Greenland S (2017) For and against methodologies: some perspectives on recent causal and statistical inference debates. European Journal of Epidemiology 32:3–20
- International Agency for Research on Cancer (2012) Chemical agents and related occupations. IARC Monogr Eval Carcinog Risk Hum 100
- International Agency for Research on Cancer (2015) Polychlorinated and polybrominated biphenyls. IARC Monogr Eval Carcinog Risk Hum 107
- Kang JD, Schafer JL (2007) Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science 22(4):523–539
- Keil AP, Buckley JP, O'Brien KM, et al. (2020) A quantile-based g-computation approach to addressing the effects of exposure mixtures. Environmental Health Perspectives 128(4):047004
- Van der Laan MJ (2006) Statistical inference for variable importance. The International Journal of Biostatistics 2(1)
- Van der Laan MJ, Polley EC, Hubbard AE (2007) Super learner. Statistical Applications in Genetics and Molecular Biology 6(1)
- Van der Laan MJ, Rose S (2011) Targeted learning: causal inference for observational and experimental data. Springer
- Lan Q, Cawthon R, Shen M, et al. (2009) A prospective study of telomere length measured by monochrome multiplex quantitative PCR and risk of non-Hodgkin lymphoma. Clinical Cancer Research 15(23):7429–7433
- Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22. URL https://CRAN.R-project.org/doc/Rnews/
- Mitro SD, Birnbaum LS, Needham BL, et al. (2016) Cross-sectional associations between exposure to persistent organic pollutants and leukocyte telomere length among US adults in NHANES, 2001–2002. Environmental Health Perspectives 124(5):651–658
- O'Brien KM, Upson K, Cook NR, et al. (2016) Environmental chemicals in urine and blood: improving methods for creatinine and lipid adjustment. Environmental Health Perspectives 124(2):220–227
- Pearl J (1995) Causal diagrams for empirical research. Biometrika 82(4):669–688
- Pearl J (2010) Brief report: on the consistency rule in causal inference: axiom, definition, assumption, or theorem? Epidemiology pp 872–875
- Richardson TS, Robins JM (2013) Single world intervention graphs (SWIGs): a unification of the counterfactual and graphical approaches to causality. Center for Statistics and the Social Sciences, University of Washington, Working Paper 128
- Robins J (1986) A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling 7(9–12):1393–1512
- Rubin DB (2008) For objective causal inference, design trumps analysis. The Annals of Applied Statistics 2(3):808–840
- Sarkar P, Shiizaki K, Yonemoto J, et al. (2006) Activation of telomerase in BeWo cells by estrogen and 2,3,7,8-tetrachlorodibenzo-p-dioxin in co-operation with c-myc. International Journal of Oncology 28(1):43–51
- Snowden JM, Reid CE, Tager IB (2015) Framing air pollution epidemiology in terms of population interventions, with applications to multi-pollutant modeling. Epidemiology 26(2):271
- Strobl C, Boulesteix AL, Zeileis A, et al. (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8(1):1–21
- Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
- Westreich D, Cole SR (2010) Invited commentary: positivity in practice. American Journal of Epidemiology 171(6):674–677
- Wolpert DH (1992) Stacked generalization. Neural Networks 5(2):241–259
- Wood SN, Pya N, Säfken B (2016) Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association 111(516):1548–1563
- Young JG, Hernán MA, Robins JM (2014) Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data. Epidemiologic Methods 3(1):1–19
- Ziegler S, Schettgen T, Beier F, et al. (2017) Accelerated telomere shortening in peripheral blood lymphocytes after occupational polychlorinated biphenyls exposure. Archives of Toxicology 91:289–300
- Zipf G, Chiappa M, Porter KS, et al. (2013) National Health and Nutrition Examination Survey: plan and operations, 1999–2010. Vital Health Stat 1(56):1–37
- Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67(2):301–320
