Author manuscript; available in PMC: 2009 Oct 29.
Published in final edited form as: Methods Mol Biol. 2009;580:371–381. doi: 10.1007/978-1-60761-325-1_20

The Effect of Lipid Adjustment on the Analysis of Environmental Contaminants and the Outcome of Human Health Risks

Audrey J Gaskins, Enrique F Schisterman
PMCID: PMC2770167  NIHMSID: NIHMS153058  PMID: 19784610

Summary

Past literature on exposure to lipophilic agents such as organochlorines (OCs) is conflicting, posing challenges for the interpretation of their potential human health risks. Since blood is often used as a proxy for adipose tissue, it is necessary to model serum lipids when assessing health risks of OCs. Using a simulation study, we evaluated four statistical models (unadjusted, standardized, adjusted, and two-stage) for the analysis of polychlorinated biphenyl (PCB) exposure, serum lipids, and health outcome risk. Eight candidate true causal scenarios, depicted by directed acyclic graphs, were used to illustrate the ramifications of misspecifying the underlying assumptions when interpreting results. Biased results were produced when the statistical model deviated from the underlying causal assumptions, and the lipid standardization method was found to be particularly prone to bias. We concluded that investigators must consider biology, biological medium, laboratory measurement, and other underlying modeling assumptions when devising a statistical model for assessing health outcomes in relation to environmental exposures.

Keywords: Causal modeling, Directed acyclic graphs, Risk estimation, Serum lipids, Organochlorines, Polychlorinated biphenyls

1. Introduction

When assessing potential human health risks, persistent lipophilic xenobiotics pose particular methodological challenges. Current literature on exposure to lipophilic agents such as organochlorines (OCs) is ambiguous, which further impairs the ability to quantify these risks (1–4). Serum OC concentrations depend on serum lipid concentrations (5, 6), but only under certain circumstances, when equilibrium is reached, can information on serum OC and serum lipid levels be predictive of the overall OC body burden (7). Higher serum lipid levels should correspond to higher serum OC concentrations, but both serum OC concentrations and lipids change postprandially, so both must be considered in relation to the quantity and timing of food consumption (8). Serum (or plasma) samples are frequently used because adipose tissue is difficult to collect, but these samples can introduce methodological challenges to estimating health risks, particularly when nonfasting samples are used (9). The alternative, fasting samples, has potentially larger drawbacks, hampering the feasibility of epidemiological research and adversely affecting study participation. Thus, to avoid the drawbacks of fasting samples while addressing the methodological challenges posed by nonfasting samples, further attention should be focused on the relation of nonfasting serum samples to serum lipids (5, 7, 10).

Our limited understanding of how serum and adipose tissue concentrations of lipophilic xenobiotics relate to serum lipids and health outcomes makes model specification difficult (11, 12). To overcome this, investigators typically make assumptions about the relation between serum lipids and serum OCs, expressing OC measurements as a wet-weight, lipid-weight, or lipid-standardized value. Lipid standardization (OC concentration per gram of fat) is particularly useful for comparing exposure concentrations across tissue specimens or study populations (13). Lipid weight (OC per unit of serum lipids) is advocated as superior to wet weight (OC per unit of serum) in the measurement of persistent lipophilic chemicals (7), especially when assuming body burden equilibrium. Other approaches include a log-linear model with serum lipids included as a separate term in the regression equation (14) and a two-stage analysis where serum lipids are regressed on serum OCs with the residuals entered as individual risk factors (2). The best way to model the relationship among serum OCs, lipids, and health outcomes remains an understudied area that is critical for the assessment of health effects. In this analysis, we demonstrate the impact of model (mis)specification and its potential effects on the interpretation of study findings.

2. Materials

2.1. Computer Programs

The SAS software package (SAS Institute, Cary, NC, USA) was used for all simulations and analyses.

3. Methods

3.1. Directed Acyclic Graphs

Optimal modeling of the statistical relations among serum OCs, serum lipids, and health outcomes requires conceiving a causal model that reflects the following considerations: (a) biologic plausibility, (b) laboratory capability for quantifying compounds and lipids, (c) underlying statistical assumptions (e.g., error structure), and (d) other relevant study covariates (e.g., known and potential confounders). A single-headed arrow represents a causal relation, a dashed line represents a noncausal association, and the absence of an arrow signifies no relation between two variables. In our scenarios, the hypothetical “causal truths” are based on the literature and their relation to frequently used models. For bias calculations, we assume perfect laboratory measurements of OCs and the absence of unmeasured confounders. The eight directed acyclic graphs (DAGs) are illustrated in Fig. 1, with polychlorinated biphenyls (PCBs) chosen to exemplify the role of OCs. These eight DAGs are also described in detail below.

Fig. 1.

Causal scenarios for relations among polychlorinated biphenyl (PCB), serum lipids (SL), and outcome (Y)

  1. PCB and SL are marginally dependent conditional on Y; serum PCB (S-PCB) causes Y, and SL causes Y.

  2. PCB is a cause of Y; S-PCB causes Y, independent of SL.

  3. PCB and Y are marginally dependent on and blocked by SL; S-PCB causes SL, which causes Y.

  4. Y and SL are marginally dependent and blocked by serum PCB; S-PCB causes Y and SL.

  5. PCB and SL are marginally dependent conditional on both the shared ancestor variable, A, and Y. An unmeasured variable, A, causes both S-PCB and SL, each of which independently causes the outcome. This is the traditional situation of confounding, with SL acting as a confounder of the relation between serum PCBs and Y.

  6. PCB and SL are marginally dependent on the ancestor, A; SL and Y are marginally dependent on A and, thus effectively, on PCB. S-PCB and SL are caused by A, but only PCB is causally related to Y.

  7. PCB per unit SL and Y are marginally dependent conditional on adipose tissue PCB. Adipose tissue PCB (A-PCB) causes serum PCB per unit serum lipid and causes Y; PCB and outcome are correlated rather than directly causally related.

  8. Blocked and unblocked path. Y is both directly caused by PCB and marginally dependent conditional on SL; S-PCB causes Y, as well as SL, which causes Y.

3.2. Statistical Models

We investigated four statistical models for the analysis of hypothesized PCB exposure, serum lipids, and a health outcome along with eight plausible DAGs for each model. All models assume that there are no unmeasured confounders. The basis of all the models is

$$P = \Pr(Y = 1 \mid X, SL)$$

where Y is the dichotomous dependent variable representing the presence/absence of the disease, X the PCB, and SL the serum lipids.

3.3. Simulations

A simulation study was conducted to evaluate the utility of the various models for the different scenarios depicted by the DAGs. Using the causal structures, lognormal distributions were assigned for PCB and serum lipid levels, and a binomial outcome variable, Y, was assumed with Pr(Y = 1 | PCB, serum lipids). Using Fig. 1h as an example, the given associations motivate the model

$$
\begin{aligned}
\ln(SL) &= \alpha_0 + \gamma \ln(X) \\
\operatorname{logit}(P) &= \alpha_1 + \beta_1 \ln(X) + \beta_2\, E[\ln(SL) \mid X] \\
&= \alpha_0 + \alpha_1 + (\beta_1 + \beta_2 \gamma) \ln(X)
\end{aligned}
\tag{1}
$$

The log odds [logit P(X, SL)] equals an intercept (α0), the prevalence among the unexposed, plus the factor, β1 + β2γ, by which PCB affects the probability of the event. Since there is no serum lipids term, there is no linear influence from the serum lipid levels. The four models used for the simulations are listed and described below.

1. Unadjusted Model

This model is equivalent to the use of wet-weight values when estimating the effect of an exposure such as PCBs on health outcomes without consideration of serum lipids.

$$\operatorname{logit}(P) = \alpha_1 + \beta_1 \ln(X) \tag{2}$$

This model is only suitable with the assumption that serum lipids are not a confounder regardless of the relation between lipids and the outcome. The inclusion or exclusion of lipids as an adjustor may affect model fit but it will not impact PCB exposure/response estimates.

2. Standardized model

This model is one way to account for the effect of serum lipids on serum OC levels by dividing the serum concentration of PCBs by serum lipids. The basis of this model is

$$\operatorname{logit}(P) = \alpha_2 + \beta_2 \ln\!\left(\frac{X}{SL^{m}}\right) = \alpha_2 + \beta_2 \left[\ln(X) - m \ln(SL)\right] \tag{3}$$

where the power, m, is a factor that generalizes the relation of PCBs and serum lipids.
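As a quick numerical check of the identity in Eq. 3, the sketch below computes the standardized exposure both ways. The function names and the example concentrations are illustrative assumptions, not values from the study.

```python
import math

def standardized_log_exposure(x, sl, m=1.0):
    """ln(X / SL^m): log of PCB concentration divided by serum lipids raised to m."""
    return math.log(x / sl ** m)

def expanded_log_exposure(x, sl, m=1.0):
    """ln(X) - m * ln(SL): the expanded right-hand form of Eq. 3."""
    return math.log(x) - m * math.log(sl)

# Hypothetical values: x = serum PCB, sl = serum lipids (arbitrary units)
x, sl = 2.5, 6.0
for m in (0.5, 1.0, 2.0):
    ratio_form = standardized_log_exposure(x, sl, m)
    expanded_form = expanded_log_exposure(x, sl, m)
    print(f"m={m}: {ratio_form:.6f} vs {expanded_form:.6f}")
```

With m = 1 this is the usual lipid-standardized value on the log scale; other values of m rescale how strongly lipids discount the measured concentration.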

3. Adjusted Model

In this model, there is an assumption that PCBs are not standardized for serum lipids. This is reflected in the absence of an association between lipids and the study outcome. The basis of this model is

$$\operatorname{logit}(P) = \alpha_3 + \beta_3 \ln(X) + \beta_4 \ln(SL) \tag{4}$$

The standardized model is a member of the family of adjusted models and, in general, is applicable under the same set of assumptions. Comparing the lipid component in the standardized model [ln(X) − m × ln(SL)] with that in the adjusted model [β4 ln(SL)] shows that equivalent results are produced when β4 is set equal to −m times the coefficient for ln(X). Because of the added β-coefficient, the adjusted model is generally more flexible than the standardized model.
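Written out term by term, the nesting can be seen by expanding Eq. 3 and matching it against Eq. 4. This is a worked restatement of the two equations, not an additional result:

```latex
\[
\begin{aligned}
\operatorname{logit}(P) &= \alpha_2 + \beta_2\left[\ln(X) - m\ln(SL)\right] \\
                        &= \alpha_2 + \beta_2\ln(X) + (-m\beta_2)\ln(SL)
\end{aligned}
\]
```

This has the form of Eq. 4 with α3 = α2, β3 = β2, and β4 = −mβ2. The standardized model therefore fixes the ratio β4/β3 at −m, whereas the adjusted model estimates β3 and β4 freely.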

4. Two-stage model

This model includes the effects of both PCBs and serum lipids on the outcome:

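$$
\begin{aligned}
\ln(SL) &= \alpha + \beta_5 \ln(X) + R \\
\operatorname{logit}(P) &= \alpha_4 + \beta_6 \ln(X) + \beta_7 R
\end{aligned}
\tag{5}
$$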

Both the intercept and the β-coefficient are simple functions of the parameters from the adjusted model and the regression of serum lipids on log PCBs. The coefficient for the residual term, R, is also precisely that of the adjusted model’s lipids term:

$$\alpha_4 = \alpha_3 + \beta_4 \alpha, \qquad \beta_6 = \beta_3 + \beta_5 \beta_4, \qquad \beta_7 = \beta_4$$
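These relations follow from substituting ln(SL) = α + β5 ln(X) + R from the first stage into the adjusted model (Eq. 4), which gives α4 = α3 + β4α, β6 = β3 + β5β4, and β7 = β4. The sketch below checks numerically that the two parameterizations give the same linear predictor. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def two_stage_from_adjusted(alpha3, beta3, beta4, alpha, beta5):
    """Map adjusted-model (Eq. 4) and first-stage (Eq. 5) parameters to two-stage ones."""
    alpha4 = alpha3 + beta4 * alpha
    beta6 = beta3 + beta5 * beta4
    beta7 = beta4
    return alpha4, beta6, beta7

def adjusted_logit(ln_x, ln_sl, alpha3, beta3, beta4):
    """Linear predictor of the adjusted model (Eq. 4)."""
    return alpha3 + beta3 * ln_x + beta4 * ln_sl

def two_stage_logit(ln_x, ln_sl, alpha, beta5, alpha4, beta6, beta7):
    """Linear predictor of the two-stage model (Eq. 5)."""
    residual = ln_sl - (alpha + beta5 * ln_x)  # R from the first-stage regression
    return alpha4 + beta6 * ln_x + beta7 * residual

# Hypothetical parameter values, not estimates from the study
alpha3, beta3, beta4 = -1.0, 0.6, 0.4   # adjusted model
alpha, beta5 = 0.2, 0.3                 # first-stage regression of ln(SL) on ln(X)
alpha4, beta6, beta7 = two_stage_from_adjusted(alpha3, beta3, beta4, alpha, beta5)

points = [(0.0, 0.0), (1.0, -0.5), (-2.0, 3.0)]
for ln_x, ln_sl in points:
    lhs = adjusted_logit(ln_x, ln_sl, alpha3, beta3, beta4)
    rhs = two_stage_logit(ln_x, ln_sl, alpha, beta5, alpha4, beta6, beta7)
    print(round(lhs, 10), round(rhs, 10))
```

Because the residual R is uncorrelated with ln(X) by construction, β6 absorbs the part of the lipid coefficient that travels with PCB, which is why the two-stage PCB coefficient differs from β3 whenever β4 and β5 are both nonzero.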

To further evaluate the efficiency of the models, we assessed how serum lipid measurement error and the strength of the linear relation between PCB and serum lipids affect the results. Measurement error was specified as ε ~ N(0, σ_e²), with differing values of σ_e². The relation between PCB and serum lipids was analyzed by varying α in the linear regression equation SL = α0 + αX. In these quantitative representations of DAGs, the magnitude of effects, error, and bias are functions of the values chosen for the parameters. The independent effect of PCB was set as a constant (β for ln PCB = 0.6 in the logistic regression model), with approximate values taken from the literature (15). In unpublished data, there was a significant linear relation between total serum PCBs and serum lipids, with a regression coefficient of approximately 0.3. The values of α, which represent the strength of the linear relation between PCB and serum lipids, ranged from a very weak association (α = 0.01) to a strong association (α = 2.0).
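The simulation design can be sketched in a few lines. The example below is a minimal illustration in Python rather than the SAS code used for the study, and it is not intended to reproduce Table 1. It assumes the Fig. 1d structure (PCB causes both the outcome and serum lipids), standard normal ln(PCB), a linear relation on the log scale with slope α = 0.3, lipid error variance 1 as the only source of lipid variation, an intercept of −1, and a single data set of 3,000 subjects. All function names and these settings are assumptions made for illustration.

```python
import math
import random

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(rows, y, iters=12):
    """Newton-Raphson logistic regression; returns [intercept, coef_1, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = inv_logit(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for b in range(k):
                    hess[a][b] += w * xi[a] * xi[b]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def simulate_fig1d(n=3000, beta_pcb=0.6, alpha=0.3, sigma_e=1.0, seed=1):
    """Generate ln(PCB), ln(SL), and Y under the assumed Fig. 1d structure."""
    rng = random.Random(seed)
    ln_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ln_sl = [alpha * v + rng.gauss(0.0, sigma_e) for v in ln_x]  # lipids depend on PCB
    y = [1 if rng.random() < inv_logit(-1.0 + beta_pcb * v) else 0 for v in ln_x]
    return ln_x, ln_sl, y

def pcb_coefficients(ln_x, ln_sl, y, m=1.0):
    """Estimated PCB coefficient under each of the four models."""
    n = len(y)
    mx, ms = sum(ln_x) / n, sum(ln_sl) / n
    slope = (sum((a - mx) * (b - ms) for a, b in zip(ln_x, ln_sl))
             / sum((a - mx) ** 2 for a in ln_x))
    resid = [b - ms - slope * (a - mx) for a, b in zip(ln_x, ln_sl)]  # first-stage R
    return {
        "unadjusted": fit_logistic([[a] for a in ln_x], y)[1],
        "standardized": fit_logistic([[a - m * b] for a, b in zip(ln_x, ln_sl)], y)[1],
        "adjusted": fit_logistic(list(zip(ln_x, ln_sl)), y)[1],
        "two_stage": fit_logistic(list(zip(ln_x, resid)), y)[1],
    }

estimates = pcb_coefficients(*simulate_fig1d())
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Repeating the data generation with different seeds and averaging the estimates would give bias figures comparable in spirit to those reported for the study, although the numbers depend on the assumed distributions.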

3.4. Results

Table 1 displays the bias and mean square error when using σ_e² = 1 and α = 0.3 as the underlying causal truths for the four statistical models in each DAG scenario.

Table 1.

Percent bias of estimates of effect of PCBs on outcome for evaluated statistical models. Entries are percent bias (MSE) (a).

| DAG (b) | Unadjusted | Standardized | Adjusted | Two-stage |
|---|---|---|---|---|
| A | 1.2 (1.26) | −51.3 (10.3) | 1.8 (1.28) | 1.8 (1.28) |
| B | −0.8 (1.34) | −75.9 (21.1) | −0.07 (1.35) | −0.7 (1.33) |
| C | −15.4 (2.78) | −351.3 (161.1) | −99.4 (1.59) | 1.1 (2.78) |
| D | 0.4 (1.14) | −79.8 (23.3) | 0.8 (1.17) | 0.5 (1.14) |
| E | 24.0 (3.37) | −128.8 (60.3) | 0.1 (1.39) | 27.2 (3.37) |
| F | −0.4 (1.29) | −85.0 (26.4) | −0.1 (1.41) | −0.3 (1.29) |
| G | −86.3 (27.0) | −1.0 (1.51) | −1.0 (1.51) | −85.9 (27.0) |
| H | −11.2 (1.75) | −128.3 (59.7) | −25.4 (3.65) | −8.7 (1.75) |

(a) Mean square error multiplied by 100 for illustration (shown in parentheses)

(b) See Fig. 1

Serum lipid measurement error distributed normally with mean 0; variance 1; α (strength of linear relation between log PCB and log serum lipids) = 0.3; 500 repetitions; n = 1,000
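The two summary measures in Table 1 can be computed from the repeated estimates as sketched below. The text does not spell out the formulas, so the conventional definitions are assumed here: percent bias is the mean deviation from the true coefficient relative to that coefficient, and MSE is the mean squared deviation. The function names are illustrative.

```python
def percent_bias(estimates, true_value):
    """100 * (mean estimate - truth) / truth; negative values are underestimates."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value

def mean_square_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

# Hypothetical estimates from three repetitions, with the true coefficient 0.6
reps = [0.55, 0.62, 0.60]
print(percent_bias(reps, 0.6))
print(100 * mean_square_error(reps, 0.6))  # scaled by 100, as in the table footnote
```

Under these definitions, an estimator that averages 0 when the true coefficient is 0.6 has an absolute bias of −0.60 and a percent bias of −100%.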

For Fig. 1a, which represents PCB and SL as independent causes of the outcome, all models except the standardized model produce minimally biased estimates. The standardized model instead substantially underestimates the PCB effect on the outcome. In Fig. 1b, SL is completely extraneous, and the pattern of bias is similar to that in Fig. 1a. Figure 1c depicts a scenario where the effect of PCB acts strictly through SL. It was best represented by the two-stage approach. The unadjusted model produced comparatively small bias (−15.4%), but the adjusted and particularly the standardized model resulted in large underestimates (percent bias of −99% and −351%, respectively). This scenario was the one instance where the adjusted model produced extremely large bias; in the other seven scenarios its absolute bias was at most about a fourth as large.

When SLs are affected by PCBs but do not directly influence the outcome, as depicted in Fig. 1d, standardization is the only modeling approach with substantial bias, underestimating the true effect by nearly 80%, whereas the other models are within 1% of the true effect. In Fig. 1e, the confounded case, only the adjusted model performed with minimal bias. The unadjusted model failed to address the SL confounder, and the results for the standardized model indicated that standardization was not a sufficient method to account for this confounder. The two-stage model fails because, in adjusting for serum lipids via the residuals, it misattributes the association between PCB and SL as a causal link. This ultimately results in biased estimates of the effect of interest, the total effect of PCB on risk.

Similar to Fig. 1a, b, and d, using the standardized model for Fig. 1f produced underestimates much larger than those from the other three models. This can be attributed to the noncausal correlation between PCB and SL that is represented in the DAG. Figure 1g depicts serum levels of PCB as being dependent on the adipose levels of PCB, which are in turn causally related to the outcome. In this situation, the standardized model, which had produced substantial bias in all preceding scenarios, functioned optimally. The adjusted model resulted in similar unbiased estimates, while neither the unadjusted nor the two-stage model worked well. Figure 1h represents both a direct and an indirect causal link of PCB with the outcome. PCB not only affects the outcome directly, it also affects SL, which in turn affects the outcome. This relationship was not represented well by any of the models, with the least biased estimates resulting from the two-stage model (which separates the total effect into estimated direct and indirect effects) and the unadjusted model (which estimates the total effect).

Across the eight causal scenarios, the adjusted model consistently produced smaller bias than the standardized model, with the exception of Fig. 1g. Even under ideal conditions for the standardized model (as depicted in Fig. 1g), the adjusted model produced a nearly identical, unbiased estimate. The two-stage and unadjusted models produced similar results, except in the case of Fig. 1c, where the two-stage model yielded substantially less bias.

1. Measurement Error

The potential measurement error accompanying the quantification of serum lipids was addressed through the use of an error term with mean 0 and variance σ_e², which was added to the simulated distribution of serum lipids. Figures 2–4 display bias as a function of error at four values of α for each of the models. The bias as a function of σ_e² followed three distinct patterns among the eight DAGs.
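A minimal sketch of this error mechanism is shown below. It adds independent N(0, σ_e²) noise to a list of simulated lipid values. The function name, the seed, the example values, and the grid of variances are illustrative assumptions, and the sketch does not specify whether the study applied the error on the raw or the log scale.

```python
import random

def add_measurement_error(values, sigma2, rng):
    """Return a new list with independent N(0, sigma2) noise added to each value."""
    sd = sigma2 ** 0.5
    return [v + rng.gauss(0.0, sd) for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
lipids = [0.3 * rng.gauss(0.0, 1.0) for _ in range(5)]  # hypothetical error-free values
for sigma2 in (0.0, 0.5, 1.0, 3.0):
    noisy = add_measurement_error(lipids, sigma2, rng)
    print(sigma2, [round(v, 2) for v in noisy])
```

Rerunning a model fit on the noisy values for each variance is one way to trace bias as a function of σ_e².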

Fig. 2.

Comparison of bias for standardization versus all other models as a function of measurement error of serum lipids and strength of linear association of polychlorinated biphenyl (PCB) with serum lipids for Fig. 1a, b, d, and f. Bias for the standardized model was systematically centered on −0.60 (100% underestimation). The vertical line at σ_e² = 1 signifies the level used for Table 1.

Fig. 4.

Bias as a function of measurement error of serum lipids and strength of linear association of polychlorinated biphenyl (PCB) with serum lipids for Fig. 1g. The vertical line at σ_e² = 1 signifies the level used for Table 1.

The first pattern was displayed by Fig. 1a, b, d, and f and is shown in Fig. 2. In this pattern, bias was stable for the unadjusted, adjusted, and two-stage models, consistently staying close to zero. Only for the standardized model was the relation between bias and σ_e² more complicated. In this model, bias increased with measurement error when the relation between PCB and lipids was weak (low α) but decreased with measurement error when the relation between the two variables was strong (higher α). The transition between this increase and decrease (the inflection point) occurred at a value of σ_e² that ranged from 0.5 for Fig. 1f to 3.0 for Fig. 1a.

The second pattern was displayed by Fig. 1c, e, and h and is shown in Fig. 3. Similar to the first pattern described above, the bias for the standardized model varied in a nonlinear manner, increasing for all values of α except for the highest (α = 2). The adjusted and two-stage models were essentially robust to measurement error; however, they did not always produce unbiased estimates of parameters for all underlying DAGs. This was particularly apparent at different levels of α. In the adjusted model, a stronger relation between PCB and lipids (higher α) resulted in greater bias. The bias of estimates produced by the unadjusted model varied slightly with σ_e² depending on the DAG being modeled. For Fig. 1c and h, the bias increased slightly with increasing measurement error (from 0 to 0.1 for σ_e² = 0.8 and from 0 to 0.2 for σ_e²). For Fig. 1e, the bias decreased with increasing measurement error as the strength of the noncausal relation between PCBs and serum lipids was altered by the variance in lipids.

Fig. 3.

Bias as a function of measurement error of serum lipids and strength of linear association of polychlorinated biphenyl (PCB) with serum lipids for Fig. 1c, e, and h. The vertical line at σ_e² = 1 signifies the level used for Table 1.

The third pattern was displayed by Fig. 1g and is shown in Fig. 4. In this pattern, both the standardized and adjusted models produced unbiased estimates that were robust to measurement error. The unadjusted and two-stage models produced biased estimates that were equally prone to measurement error. Regardless of the strength of the linear relation between PCB and lipids (α), the bias for all four of the models remained unchanged in this scenario.

2. Application

In this analysis, we described and evaluated four statistical models – unadjusted, adjusted, standardized, and two-stage – commonly used to assess the effects of lipophilic environmental contaminants on human health. Each statistical model showed minimal bias for at least the causal truth for which it was ideally suited. Every model except for the standardized performed well in all but one scenario. The standardized model, on the other hand, produced large biases for most of the DAGs evaluated and even produced similar biases to the adjusted model for the DAG in which standardization is optimal.

The basic causal scenarios, depicted in the eight DAGs, included only two to four factors which impact levels of both PCB and serum lipids. When additional factors are considered, the evaluation becomes much more complex and the trade-off between efficiency and robustness becomes more important. Even though in our simulation, the adjusted model produced consistently unbiased estimates, there are circumstances where adjustment is inappropriate and should be avoided. Examples of these situations include adjusting for a collider (an effect of two or more other variables in the graph), which has been demonstrated to bias estimators of effect (16, 17). Factors that share a common cause will also give large bias appearing correlated in strata of that common cause. Given an alleged relation between PCB and serum lipids, adjusting for these factors might generate spurious associations if an unmeasured factor is related to both serum lipid levels and the outcome.

Our simulations demonstrated that statistical models that fail to uphold the underlying assumptions about causality lead to biased results. This bias can have negative implications on the interpretation of effects of exposures on human health end points. Equivocal findings may arise in part from the varying laboratory and analytic approaches for specifying serum lipids when using nonfasting blood specimens to estimate risk. Investigators should remember to consider biology, biological medium, and laboratory methodology when specifying a statistical model. They should also take caution to make sure the model’s underlying assumptions are appropriate for the study.

Footnotes

1

Bias was stable for the unadjusted, adjusted, and two-stage models consistently staying close to zero. Only for the standardized model was the relation between bias and variance more complicated.

2

The adjusted model consistently produced smaller bias than the standardized model. Even under ideal conditions for the standardized model (DAG G), the adjusted model produced a nearly identical, unbiased estimate.

3

The two-stage and unadjusted models produced similar results except in the case of DAG C where the two-stage yielded substantially less bias.

References

1. Calle EE, Frumkin H, Henley SJ, Savitz DA, Thun MJ. Organochlorines and breast cancer risk. CA Cancer J Clin. 2002 Sep;52(5):301–9. doi: 10.3322/canjclin.52.5.301.
2. Hunter DJ, Hankinson SE, Laden F, Colditz GA, Manson JE, Willett WC, et al. Plasma organochlorine levels and the risk of breast cancer. N Engl J Med. 1997 Oct 30;337(18):1253–8. doi: 10.1056/NEJM199710303371801.
3. Laden F, Collman G, Iwamoto K, Alberg AJ, Berkowitz GS, Freudenheim JL, et al. 1,1-Dichloro-2,2-bis(p-chlorophenyl)ethylene and polychlorinated biphenyls and breast cancer: combined analysis of five U.S. studies. J Natl Cancer Inst. 2001 May 16;93(10):768–76. doi: 10.1093/jnci/93.10.768.
4. Laden F, Hankinson SE, Wolff MS, Colditz GA, Willett WC, Speizer FE, et al. Plasma organochlorine levels and the risk of breast cancer: an extended follow-up in the Nurses’ Health Study. Int J Cancer. 2001 Feb 15;91(4):568–74. doi: 10.1002/1097-0215(200002)9999:9999<::aid-ijc1081>3.0.co;2-w.
5. Eyster JT, Humphrey HE, Kimbrough RD. Partitioning of polybrominated biphenyls (PBBs) in serum, adipose tissue, breast milk, placenta, cord blood, biliary fluid, and feces. Arch Environ Health. 1983 Jan;38(1):47–53. doi: 10.1080/00039896.1983.10543978.
6. Guo YL, Emmett EA, Pellizzari ED, Rohde CA. Influence of serum cholesterol and albumin on partitioning of PCB congeners between human serum and adipose tissue. Toxicol Appl Pharmacol. 1987 Jan;87(1):48–56. doi: 10.1016/0041-008x(87)90083-4.
7. Brown JF Jr, Lawton RW. Polychlorinated biphenyl (PCB) partitioning between adipose tissue and serum. Bull Environ Contam Toxicol. 1984 Sep;33(3):277–80. doi: 10.1007/BF01625543.
8. Phillips DL, Smith AB, Burse VW, Steele GK, Needham LL, Hannon WH. Half-life of polychlorinated biphenyls in occupationally exposed workers. Arch Environ Health. 1989 Nov;44(6):351–4. doi: 10.1080/00039896.1989.9935905.
9. Whitcomb BW, Schisterman EF, Buck GM, Weiner JM, Greizerstein H, Kostyniak PJ. Relative concentrations of organochlorides in adipose tissue and serum among reproductive age women. Environ Toxicol Pharmacol. 2005;19:203–13. doi: 10.1016/j.etap.2004.04.009.
10. Brown JF Jr, Lawton RW, Morgan CB. PCB metabolism, persistence, and health effects after occupational exposure: implications for risk assessment. Chemosphere. 1994 Nov;29(9–11):2287–94. doi: 10.1016/0045-6535(94)90396-4.
11. Calvert GM, Willie KK, Sweeney MH, Fingerhut MA, Halperin WE. Evaluation of serum lipid concentrations among U.S. workers exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin. Arch Environ Health. 1996 Mar;51(2):100–7. doi: 10.1080/00039896.1996.9936001.
12. Mussalo-Rauhamaa H. Partitioning and levels of neutral organochlorine compounds in human serum, blood cells, and adipose and liver tissue. Sci Total Environ. 1991 Apr 15;103(2–3):159–75. doi: 10.1016/0048-9697(91)90142-2.
13. Morgan DP, Roan CC. Chlorinated hydrocarbon pesticide residue in human tissues. Arch Environ Health. 1970 Apr;20(4):452–7. doi: 10.1080/00039896.1970.10665621.
14. Moysich KB, Ambrosone CB, Vena JE, Shields PG, Mendola P, Kostyniak P, et al. Environmental organochlorine exposure and postmenopausal breast cancer risk. Cancer Epidemiol Biomarkers Prev. 1998 Mar;7(3):181–8.
15. Wolff MS, Toniolo PG. Environmental organochlorine exposure as a potential etiologic factor in breast cancer. Environ Health Perspect. 1995 Oct;103(Suppl 7):141–5. doi: 10.1289/ehp.95103s7141.
16. Greenland S, Brumback B. An overview of relations among causal modelling methods. Int J Epidemiol. 2002 Oct;31(5):1030–7. doi: 10.1093/ije/31.5.1030.
17. Hernan MA, Hernandez-Diaz S, Werler MM, Mitchell AA. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002 Jan 15;155(2):176–84. doi: 10.1093/aje/155.2.176.
