PLOS ONE. 2011 May 12;6(5):e19416. doi: 10.1371/journal.pone.0019416

Behavior of QQ-Plots and Genomic Control in Studies of Gene-Environment Interaction

Arend Voorman 1, Thomas Lumley 1, Barbara McKnight 1, Kenneth Rice 1,*
Editor: Stacey Cherny2
PMCID: PMC3093379  PMID: 21589913

Abstract

Genome-wide association studies of gene-environment interaction (G×E GWAS) are becoming popular. As with main-effects GWAS, quantile-quantile plots (QQ-plots) and Genomic Control are being used to assess and correct for population substructure. However, in G×E work these approaches can be seriously misleading, as we illustrate; QQ-plots may give strong indications of substructure when absolutely none is present. Using simulation and theory, we show how and why spurious QQ-plot inflation occurs in G×E GWAS, and how this differs from main-effects analyses. We also explain how simple adjustments to standard regression-based methods used in G×E GWAS can alleviate this problem.

Introduction

Genome-wide association studies of gene-environment interaction (G×E GWAS) are now being undertaken to search for modification of environmental effects by genotypes [1], [2]. As in main-effects GWAS that search for the effects of genotype alone, differences in recent ancestry, termed population substructure, can be mistaken for true genetic effects, and are therefore a serious concern [1], [3].

In main-effects GWAS, the extent of the substructure problem is typically addressed using Genomic Control [4]. Here, under the assumption that processes of local mating and genetic drift inflate measures of association in the same way genome-wide, the degree of inflation of the median test statistic (known as $\lambda$) is a useful assessment of the degree of test statistic inflation at all levels. Dividing test statistics by $\lambda$ is a widely-used approach to correct for minor substructure problems; for examples, see [5], [6]. Adjusting for principal components, which we will use in this paper, is another popular correction method [7], [8].
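As a minimal sketch of the Genomic Control calculation described above (not the authors' code), the following Python computes $\lambda$ as the median 1-df chi-squared statistic divided by 0.4549, the median of the $\chi^2_1$ distribution, and then rescales the statistics. The function names `genomic_lambda` and `genomic_control` and the simulated statistics are illustrative assumptions.

```python
import random
import statistics

CHI2_1_MEDIAN = 0.4549  # median of the chi-squared distribution with 1 df


def genomic_lambda(chi2_stats):
    """Inflation factor: observed median statistic over the null median."""
    return statistics.median(chi2_stats) / CHI2_1_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by lambda (the Genomic Control correction)."""
    lam = genomic_lambda(chi2_stats)
    return [s / lam for s in chi2_stats]


# Illustrative data (assumption): null statistics are squared standard
# Normal draws; "inflated" multiplies them by a constant factor of 1.3.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20001)]
inflated = [1.3 * s for s in null_stats]

print("lambda, null statistics:    ", round(genomic_lambda(null_stats), 3))
print("lambda, inflated statistics:", round(genomic_lambda(inflated), 3))
print("lambda after correction:    ",
      round(genomic_lambda(genomic_control(inflated)), 3))
```

Because the correction is a single multiplicative rescaling, it can only remove inflation that acts in the same way at every SNP, which is the assumption stated above.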

In G×E GWAS, one can also argue that substructure leads to inflation of test statistics by a multiplicative factor. However, in G×E GWAS the same inflation can also be caused by an entirely different mechanism: systematic underestimation of variability of effect estimates across the genome. This is not confounding, but it gives the appearance of confounding; hence naïve use of Genomic Control can be misleading.

In this paper, we show how the separate effects of population substructure and underestimation of variability affect interpretation of G×E GWAS results, and we show how this problem can be solved. In the Results section, using simulation and theory, we describe how spurious QQ-plot inflation can occur. We also illustrate how model-robust estimates of standard errors (also known as “sandwich” standard errors) rectify the problem, while retaining $\lambda$'s ability to identify true substructure.

Assumptions in G×E GWAS: classical approaches

In general, regression methods incorporate assessments of variability by estimating standard errors; for a given estimated effect (i.e. $\hat{\beta}$), larger standard errors reflect greater variability from sample to sample, and produce less significant results. However, the precise assumptions reflected in these statements of variability differ between methods.

Under “classical” or “model-based” regression approaches, standard errors only account for random variation in the phenotype (denoted $Y$). Furthermore, for their validity these classical variability estimates require that the mean value of $Y$ is truly linear in the coefficients of the independent variables, such as environmental variables (denoted $E$) or genotypes (denoted $G$) [9].

To illustrate these classical assumptions, we consider linear regression, with $G$ coded as 0/1/2 copies of the minor allele. For classical main-effects analysis one might assume that the mean value of $Y$ truly is

$$\mathbb{E}[Y \mid G] = \beta_0 + \beta_G G$$

Association would be assessed using the least squares regression estimator $\hat{\beta}_G$ and its estimated standard error, which is based on estimated random variation in the phenotype $Y$ with the values of the observed predictor $G$ fixed. (Formally, the analysis is ‘conditioned’ on the independent variable $G$ [10].)

Using the classical approach for interaction analyses, one might instead assume that

$$\mathbb{E}[Y \mid G, E] = \beta_0 + \beta_G G + \beta_E E + \beta_{GE}\, G E \qquad (1)$$

Inference would use $\hat{\beta}_{GE}$ and its estimated standard error, where again the variability accounted for by model-based standard errors is that of the phenotype, $Y$, in replicate experiments where $G$ and $E$ are fixed at the values observed in the original data.

How does the mean model assumption affect GWAS work? In main-effects analyses, the validity of the mean model is not a major concern. Under the ‘strong null hypothesis’ of no association between $Y$ and $G$, the true mean value of $Y$ is simply

$$\mathbb{E}[Y \mid G] = \beta_0$$

This means that the model assumptions hold under the null hypothesis, which is sufficient for valid p-values. But in G×E work, even under the null hypothesis of no statistical interaction ($\beta_{GE} = 0$ in (1)), model-based standard errors assume that the mean of $Y$ is truly linear in $E$ and the residual variance is constant with respect to $E$. When this assumption fails, model-based standard errors may be too small.

How does accounting for different sources of variability impact GWAS work? In main-effects analyses, we typically have the same, well-specified model for each gene we test, under the null hypothesis. In this case, the variability in our estimates is the same whether or not $G$ is truly fixed. As a result, model-based standard errors can be used to produce valid QQ-plots, even though each point on the plot represents a different $G$. But when there is mean-model mis-specification in G×E GWAS, variability in interaction term coefficient estimates from $G$ to $G$ becomes important. QQ-plots using model-based standard errors provide results based on viewing $Y$ as random, and $G$ and $E$ as fixed. This contrasts with the observed variation in p-values entering the computation of $\lambda$, where Inline graphic is fixed, but $G$ varies, all along the genome. In particular, this means that Inline graphic varies in a way not accounted for by model-based analysis.

We will see that in G×E GWAS using model-based standard errors, the behavior of QQ-plots and $\lambda$ may not be as straightforward as in main-effects work. In Results, we show how violation of the assumptions both about mean-model validity and what is considered random can lead to misbehaved QQ-plots in G×E studies.

Assumptions in G×E GWAS: robust approaches

‘Model-robust’ standard errors are an alternative to model-based standard errors. Here, instead of assuming a particular form for the mean of $Y$ given $G$ and $E$, standard error estimation views regression estimates as simple summaries of the observed association between $Y$ and $G$, or $Y$ and $E$. For example, interaction terms summarize how a measure of the $Y$-$E$ association differs between values of $G$. While the summary is expressed linearly, no underlying assumption of true linearity, in either the $Y$-$E$ relationship or how it differs between levels of $G$, is required for accurate standard error estimates [11]. Thus, concerns about mis-specification of the mean model in G×E GWAS disappear. This form of standard error estimation should give inherently better-behaved QQ-plots than the model-based approach.

Model-robust standard error estimates are known as “heteroscedasticity-consistent”, “model-agnostic”, “Huber-White”, or “sandwich” standard errors, and are available in standard statistical software [12]–[14]. Unlike model-based standard errors, they summarize uncertainty in estimates where $Y$ and all independent variables are considered random. In G×E GWAS work, this means that repeated sampling variability in $Y$, $G$ and $E$ is accounted for. However, when we examine QQ-plots we have $Y$ and $E$ fixed while only $G$ varies. As will be discussed in the theoretical portion of the Results, this produces about the same amount of variability as when all variables are considered random, and more than when only $Y$ is considered random. As a result, robust standard errors should give a better assessment of variability than model-based standard errors when we vary $G$ due to genome-wide comparison, as we do on a QQ-plot.
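For reference, the two variance estimators contrasted in this section can be written side by side for ordinary least squares. This is the standard textbook form, with the simplest ("HC0") version of the sandwich estimator from [11]; the notation is ours, and the text does not state which small-sample variant of the sandwich estimator was used. Here $X$ is the $n \times p$ design matrix with rows $x_i^\top$ (intercept, $G$, $E$ and their product), and $r_i$ are the least squares residuals.

```latex
% Model-based: one residual variance shared by every observation
\widehat{\operatorname{Var}}_{\text{model}}(\hat{\beta})
  = \hat{\sigma}^2 \,(X^\top X)^{-1},
\qquad
\hat{\sigma}^2 = \frac{1}{n-p}\sum_{i=1}^{n} r_i^2

% Model-robust ("sandwich", HC0): squared residuals weight each observation
\widehat{\operatorname{Var}}_{\text{robust}}(\hat{\beta})
  = (X^\top X)^{-1}
    \left(\sum_{i=1}^{n} r_i^2 \, x_i x_i^\top\right)
    (X^\top X)^{-1}
```

The two agree in large samples when the squared residuals are unrelated to the covariates; they diverge when the residual spread or the lack of fit changes with $E$, which is the situation of interest here.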

Results

Simulation results

Before deriving theoretical results, we illustrate the scope of the difference between model-based and model-robust inference in G×E GWAS, and the extent of QQ-plot inflation that may be produced in the absence of population substructure.

In Figures 1 and 2, we show the QQ-plots for linear regression results in G×E GWAS, based on simulations of well-specified and mis-specified modeled relationships between $Y$ and $E$. All simulations use Wald tests, independent Normal phenotypes $Y$, biallelic genotypes $G$ in Hardy-Weinberg equilibrium with MAF varying between 0.02 and 0.5 and coded as 0/1/2 copies of the minor allele; for details see Methods. Importantly, the null hypothesis of no G×E interaction holds throughout, and no population substructure is present. Using model-based standard errors, in Figure 1 we see no inflation beyond that expected by chance alone. In Figure 2, in the presence of either of two types of slight model mis-specification, substantial inflation of model-based statistics is observed (Inline graphic and Inline graphic), well beyond chance, despite the absence of real interactions or of population substructure. Using the model-robust approach, we see no inflation in the correctly specified model (Figure 1), or for either of the mis-specified models (Figure 2).
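The qualitative pattern described above can be reproduced with a small, self-contained Python sketch. This is not the authors' R code, and every setting in it is an assumption chosen for illustration: 300 subjects, 300 SNPs, MAF uniform on [0.1, 0.5], a phenotype with mean $E^2$ fitted by a model linear in $E$, and the HC0 form of the sandwich variance. No interaction and no substructure are simulated.

```python
import random
import statistics

CHI2_1_MEDIAN = 0.4549  # median of the chi-squared distribution with 1 df


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * z for x, z in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def interaction_wald(y, e, g):
    """Wald chi-squared statistics (model-based, HC0 robust) for the G x E term."""
    design = [[1.0, gi, ei, gi * ei] for gi, ei in zip(g, e)]
    k = 4
    xtx = [[sum(x[a] * x[b] for x in design) for b in range(k)] for a in range(k)]
    xty = [sum(x[a] * yi for x, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(beta, x))
             for x, yi in zip(design, y)]
    c = solve(xtx, [0.0, 0.0, 0.0, 1.0])  # last column of (X'X)^-1
    # Model-based variance: residual variance times the diagonal entry.
    var_model = c[3] * sum(r * r for r in resid) / (len(y) - k)
    # Sandwich variance of the last coefficient: sum_i r_i^2 (c . x_i)^2.
    var_robust = sum((r * sum(cj * xj for cj, xj in zip(c, x))) ** 2
                     for x, r in zip(design, resid))
    return beta[3] ** 2 / var_model, beta[3] ** 2 / var_robust


def simulate(n=300, n_snps=300, seed=1):
    """Return (lambda model-based, lambda robust) for one simulated data set."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [ei * ei + rng.gauss(0.0, 1.0) for ei in e]  # quadratic mean in E
    model_stats, robust_stats = [], []
    for _ in range(n_snps):
        maf = rng.uniform(0.1, 0.5)
        # Genotype = number of minor alleles out of two independent draws.
        g = [float((rng.random() < maf) + (rng.random() < maf)) for _ in range(n)]
        w_model, w_robust = interaction_wald(y, e, g)
        model_stats.append(w_model)
        robust_stats.append(w_robust)
    return (statistics.median(model_stats) / CHI2_1_MEDIAN,
            statistics.median(robust_stats) / CHI2_1_MEDIAN)


lam_model, lam_robust = simulate()
print(f"lambda (model-based) = {lam_model:.2f}, lambda (robust) = {lam_robust:.2f}")
```

With only 300 SNPs the estimate of $\lambda$ is itself noisy, so the sketch is meant to show the direction of the effect rather than to match the values reported for the figures.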

Figure 1. Correctly Specified Model.

In this scenario the data is generated according to Inline graphic, independent of Inline graphic. Both the model-based and robust standard errors are valid estimates of variability, as demonstrated by the QQ-plot.

Figure 2. Mis-specified model.

Panels A and C show scatterplots of Inline graphic vs. Inline graphic generated according to Inline graphic and Inline graphic respectively, independent of Inline graphic. Panels B and D demonstrate the corresponding effect of this mis-specified mean model and non-constant variance.

In Figure 3, we show that similar behavior can occur when substructure is present in an interaction analysis with model mis-specification. Here, structure was incorporated by assigning MAFs to two sub-populations, choosing Wright's $F_{ST}$ to be 0.01, and the mis-specification exactly that displayed in panels A and B of Figure 2. Using a model-based analysis that accounts for the substructure by including one principal component of the SNP data as a covariate in the regression, we see that inflation persists, spuriously. However, the principal component-adjusted model-robust inference removes the substructure problem, and again gives correctly-calibrated p-values.

Figure 3. QQ-plots with added population structure.

In the left panel, nothing is done to account for the structure. On the right, the results are adjusted for principal components, leaving about the same amount of inflation as the case with no population stratification.

Finally, in Figure 4, we show that similar behavior holds for non-linear regression analysis. In these, model-based errors assume linearity on a modified scale: logitInline graphic for logistic regression, and the log hazard for Cox proportional hazards regression. Here, in the top row we show results for binary Inline graphic, a Inline graphic relationship that is non-linear on a logit scale, and no true interaction. In the bottom row, we show similar results for a mis-specified Cox proportional hazards regression, with uniform censoring at the median [15]. Similar results hold when using likelihood ratio tests and joint tests of Inline graphic.

Figure 4. Example of behavior in logistic and proportional hazards regression.

The top row displays the results for logistic regression, and the bottom for proportional hazards. The data was simulated according to Inline graphic and Inline graphic with half of the data censored at the median survival time. The top left shows the log odds of an event, which demonstrates non-linearity that was not specified in the model. The plot on the lower left displays a loess curve through the Schoenfeld residuals from the regression of Y on E. A non-zero slope is indicative of violation of the proportional hazards assumption.

Theoretical results

We now develop theoretical results governing the behavior of $\lambda$ under model-based and model-robust analyses of G×E GWAS.

In the absence of population structure, the population parameter consistently estimated by $\lambda$ for interaction terms can be viewed as a ratio of conditional and unconditional variances, as follows:

graphic file with name pone.0019416.e103.jpg (2)

where .4549 is the median of the Inline graphic distribution and Inline graphic is the variance estimate, either model-based or robust, used in the analysis. For simplicity we first consider the situation where 1) Inline graphic for all Inline graphic, where 2) Inline graphic is independent of Inline graphic, and where 3) the minor allele frequency is the same for all SNPs Inline graphic. We note that, in the absence of population stratification, the first two conditions are approximately true for nearly all SNPs. The third condition will later be relaxed. Under these three conditions, Inline graphic is approximately constant and can be factored out of the computation of the median in equation (2).

Since Inline graphic and Inline graphic is asymptotically Normal,

graphic file with name pone.0019416.e114.jpg

is consistent for the variance of Inline graphic taken over the distribution of Inline graphic but conditioning on Inline graphic and Inline graphic. The genomic control Inline graphic can then be written as

graphic file with name pone.0019416.e120.jpg

The numerator of Inline graphic is the empirical variance of the regression coefficients and is always a good estimate of Inline graphic, the true variance over genotypes fixing the outcome and exposure variable. The denominator of Inline graphic is the estimated variance of Inline graphic from the regression analysis. If model-based inference is used, this estimates Inline graphic, the variance taken over the distribution of the outcome, conditional on the predictor variables. If a model-robust variance estimator is used, the denominator estimates Inline graphic, the unconditional variance of Inline graphic taken over the distribution of all variables.

To see that Inline graphic should be approximately 1 when there is no population structure, despite the conditioning on Inline graphic and Inline graphic that is implicit in the computation of its numerator, we can examine the variance decomposition:

graphic file with name pone.0019416.e131.jpg (3)

The numerator of Inline graphic accurately estimates the second term in this decomposition. We show in Appendix S1 that the first term is approximately zero for the case of linear regression, so

graphic file with name pone.0019416.e133.jpg

as required. Our simulations confirm that this result also holds for logistic regression and Cox regression.
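The argument above has the shape of the law of total variance applied to the interaction estimate, conditioning on the phenotype and exposure. As a sketch in our own notation, which may differ from that of equation (3), write $\hat{\beta}$ for the estimated interaction coefficient:

```latex
\operatorname{Var}(\hat{\beta})
  = \underbrace{\operatorname{Var}_{Y,E}\!\left(
      \mathbb{E}_{G}\!\left[\hat{\beta} \mid Y, E\right]\right)}_{\text{first term, approximately } 0}
  + \underbrace{\mathbb{E}_{Y,E}\!\left[
      \operatorname{Var}_{G}\!\left(\hat{\beta} \mid Y, E\right)\right]}_{\text{second term, estimated by the numerator of } \lambda}

% If the first term is negligible and the denominator of lambda
% estimates the unconditional variance (the model-robust case), then
\lambda \;\approx\;
  \frac{\mathbb{E}_{Y,E}\!\left[\operatorname{Var}_{G}(\hat{\beta} \mid Y, E)\right]}
       {\operatorname{Var}(\hat{\beta})}
  \;\approx\; 1
```

The identity itself holds for any random variable; what is specific to this setting is the claim, shown in Appendix S1 for linear regression, that the first term is approximately zero.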

So far we have assumed constant MAF, but the arguments do not depend on the value of the MAF, nor does the conclusion that $\lambda \approx 1$. Since $\lambda$ is defined from the median of the chi-squared statistic, if $\lambda \approx 1$ for the SNPs with each fixed MAF we must also have $\lambda \approx 1$ pooling over a range of MAF. For this reason, the results should hold with the typical range of MAFs seen in GWAS so long as the sample size and MAF are large enough to allow accurate estimation of the sandwich variances. This is further supported by the simulation results, which used a wide range of MAFs.

The analog of equation 3 for the model-based estimator is

graphic file with name pone.0019416.e138.jpg (4)

The first term in this decomposition is not negligible unless the Inline graphic model is correctly specified, so under model mis-specification

graphic file with name pone.0019416.e140.jpg

and $\lambda$ will tend to be greater than 1 even when there is no confounding by population substructure. Figure 5 shows an example of this.

Figure 5. Illustrating the variance decomposition.

The panels show estimates of Inline graphic over replications with different variables held constant. At left, the Inline graphic relationship is truly linear. Because Inline graphic is the same regardless of which variables are held constant, then according to the variance decomposition, so is the variability. In the right panel the Inline graphic relationship is exponential. With Inline graphic and Inline graphic fixed, a certain amount of within-sample correlation remains fixed, making Inline graphic different for each instance of Inline graphic. Both the Inline graphic setting where Inline graphic and Inline graphic are fixed and Inline graphic is random, and the setting when all variables are random incorporate this extra variability.

As a further complication, the model-based variance estimator Inline graphic need not be close to the true variance, the second term in equation 4, if the model is misspecified [16].

Discussion

We have seen in the above that standard errors that rely on model assumptions can be underestimates of Inline graphic when those model assumptions are not met, while model-robust estimates of variance provide well-calibrated standard errors and p-values. This distinction can be seen in all types of regression examined. The problem is not merely theoretical; our research was motivated by seeing apparent population substructure similar to that in Figure 2 in initial analyses of a G×E GWAS of echocardiographic traits [17] and noticing that the inflation was absent in cohorts that had used model-robust standard error estimates. The simulation results from linear regression show that even mild heteroskedasticity or mean-model mis-specification can inflate model-based test statistics.

Intuitively explaining sources of variability

The impact of different sources of variability and its relation to model mis-specification is not well recognized. We illustrate the situation for G×E GWAS in Figure 5. Here, for a continuous phenotype $Y$, continuous exposure $E$, and binary genotype $G$, we show the spread of $\hat{\beta}_{GE}$ estimates holding different variables constant when there is no true interaction or population structure present. Within the blue boxes, $G$ and $E$ are held fixed while $Y$ is varied to produce different estimates of $\beta_{GE}$. From boxplot to boxplot Inline graphic is varied. Each blue boxplot illustrates the variability in $\hat{\beta}_{GE}$ using what model-based errors assume is fixed; it can be compared to the variability with $Y$, $G$, and $E$ all random, and with $Y$ and $E$ fixed. Under model mis-specification, it is clear that the distribution of $\hat{\beta}_{GE}$ varies from Inline graphic to Inline graphic, and that the variability in $\hat{\beta}_{GE}$ is larger when $Y$, $G$ and $E$ are all random or when only $Y$ and $E$ are fixed.

When the linear model is true, as in the data summarized in the left panel, then the linear trend is the same for any level of $E$. When this is true, the variability in $\hat{\beta}_{GE}$ is the same whether or not $G$ and $E$ are taken to be random. However, when the linear model is not true, then the linear trend need not be the same at different levels of $E$. In the right panel of Figure 5, the data were generated according to an exponential relationship between $Y$ and $E$. Under this model the linear trend will be steeper in samples where the values of $E$ are larger. Now for any single instance of $G$ and $E$ there is always some small degree of correlation between them within the data. As each of these small, fixed associations between $G$ and $E$ varies over Inline graphic, there is truly effect modification: subjects with different genotypes will tend to have slightly different levels of $E$, and hence a slightly different relationship with $Y$. So in addition to the usual sampling variability in estimating $\beta_{GE}$, we have this ‘bias’ that varies from each pair of $G$ and $E$ to the next. If we add these two sources of variability, we obtain the full variability that we observe when $G$ and $E$ are also random.

Conclusions

In G×E GWAS, naïve use of QQ-plots and genomic control with model-based standard errors may lead to false conclusions about substructure. The extent of this problem depends on the degree of mis-specification of the mean-model, the form of regression used, and the distribution of the environmental exposure. Use of model-robust inference offers a simple alternative that avoids these difficulties, and retains genomic control as a useful tool for the assessment of substructure.

Methods

Simulation studies in R [18] were used to assess the performance of model-based standard errors and sandwich standard errors in a variety of scenarios, with the genomic-control $\lambda$ used to assess the degree of inflation in the test statistics. Visually, this inflation can be seen in QQ-plots.
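As an illustration of how a QQ-plot of this kind is built (a sketch in Python, not the R code used for the paper), the following converts 1-df chi-squared statistics to p-values and pairs the sorted observed values with their expected quantiles under the null. The plotting position $(i + 0.5)/n$ is one common convention and is an assumption here, as are the function names.

```python
import math


def chi2_1_pvalue(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # For Z standard Normal, P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def qq_points(chi2_stats):
    """Return (expected, observed) -log10 p-value pairs, smallest p-value first."""
    # Floor tiny p-values so that log10 is always defined.
    pvals = sorted(max(chi2_1_pvalue(s), 1e-300) for s in chi2_stats)
    n = len(pvals)
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    observed = [-math.log10(p) for p in pvals]
    return list(zip(expected, observed))


# Two statistics only, to show the shape of the output.
for exp_q, obs_q in qq_points([0.4549, 3.841458820694124]):
    print(f"expected {exp_q:.3f}  observed {obs_q:.3f}")
```

Points lying along the diagonal indicate well-calibrated p-values; a line that is steeper than the diagonal across the whole range corresponds to $\lambda > 1$.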

We simulated a normally distributed environmental exposure, and a response generated from this either under a correctly specified linear model, or under a quadratic mean-model. Genotypes at 10,000 loci were simulated according to a binomial distribution, with minor allele frequency (MAF) drawn from a beta(0.5, 0.5) distribution truncated at 1/2, and with frequencies filtered to be above 0.02. We found that the behavior of the simulations was not affected in a substantial way when the MAF was fixed at any particular value for all loci. Because genotype is entirely unrelated to phenotype in these simulations, tests for gene-environment interaction should yield uniformly distributed p-values, as expected under the null hypothesis.
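The genotype-generation step can be sketched as follows in Python (the original simulations were in R). Two points are assumptions on our part: "truncated at 1/2" is implemented by rejecting draws above 0.5, and the function names are hypothetical.

```python
import random


def draw_maf(rng, low=0.02, high=0.5):
    """Draw a MAF from Beta(0.5, 0.5), keeping only values in (low, high]."""
    while True:
        maf = rng.betavariate(0.5, 0.5)
        if low < maf <= high:
            return maf


def draw_genotypes(rng, maf, n):
    """Draw n genotypes as Binomial(2, maf) counts of the minor allele."""
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]


rng = random.Random(42)
mafs = [draw_maf(rng) for _ in range(5)]
genotypes = draw_genotypes(rng, mafs[0], 10)
print([round(m, 3) for m in mafs])
print(genotypes)
```

Drawing the two alleles independently gives genotypes in Hardy-Weinberg equilibrium, matching the description in the Results.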

Population stratification was simulated by drawing an MAF for each of two sub-populations at each locus, centered around some MAF drawn from the distribution described above. These sub-population MAFs were distributed according to a beta distribution parametrized by the central MAF and Wright's $F_{ST}$, in this case chosen to be 0.01 [4]. In order to allow for confounding, we created a slight difference in the relationship between phenotype and environmental exposure: the linear component of the relationship was Inline graphic 20% of the population and Inline graphic in the other population, while the quadratic component was Inline graphic in both groups.
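One common way to parametrize a beta distribution by a central MAF $p$ and $F_{ST}$ is the Balding-Nichols form, $\text{Beta}\big(p(1-F_{ST})/F_{ST},\ (1-p)(1-F_{ST})/F_{ST}\big)$, which has mean $p$ and variance $F_{ST}\,p(1-p)$. The text does not state the exact parametrization used, so the Python sketch below should be read as an assumption, with a hypothetical function name.

```python
import random
import statistics


def subpopulation_mafs(rng, central_maf, fst, n_pops=2):
    """Draw sub-population MAFs around central_maf (Balding-Nichols form)."""
    a = central_maf * (1.0 - fst) / fst
    b = (1.0 - central_maf) * (1.0 - fst) / fst
    return [rng.betavariate(a, b) for _ in range(n_pops)]


rng = random.Random(7)
print(subpopulation_mafs(rng, central_maf=0.3, fst=0.01))

# Check the stated moments empirically over many loci.
draws = [subpopulation_mafs(rng, 0.3, 0.01)[0] for _ in range(20000)]
print(round(statistics.mean(draws), 4), round(statistics.pvariance(draws), 5))
```

With $F_{ST} = 0.01$ the sub-population frequencies stay close to the central value, which corresponds to the mild degree of structure described above.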

In addition to linear regression, performance of model-based and sandwich standard errors was assessed in logistic and proportional hazards regression. In these situations we generated simulations in which departures from linearity were on the appropriate transformed scale. In logistic regression, this meant that the linearity was judged on the scale of the logit of the probability of ‘success’. In proportional hazards, the scale was on the log hazard scale. To achieve this, we generated exponentially distributed event times where the exponentiated ‘rate’ parameter was related quadratically to exposure.

Supporting Information

Appendix S1

(PDF)

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: This research was funded in part by NIH/NHLBI training grant T32 HL07183-34 and by research grant R01 HL074745. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Hunter DJ. Gene-environment interactions in human diseases. Nat Rev Genet. 2005;6:287–298. doi: 10.1038/nrg1578.
  • 2. Thomas D. Gene-environment-wide association studies: emerging approaches. Nat Rev Genet. 2010;11:259–272. doi: 10.1038/nrg2764.
  • 3. Pearson T, Manolio T. How to interpret a genome-wide association study. JAMA. 2008;299:1335. doi: 10.1001/jama.299.11.1335.
  • 4. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55:997–1004. doi: 10.1111/j.0006-341x.1999.00997.x.
  • 5. Ganesh S, Zakai N, van Rooij F, Soranzo N, Smith A, et al. Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nature Genetics. 2009;41:1191–1198. doi: 10.1038/ng.466.
  • 6. Nolte I, Wallace C, Newhouse S, Waggott D, Fu J, et al. Common genetic variation near the phospholamban gene is associated with cardiac repolarisation: meta-analysis of three genome-wide association studies. PLoS One. 2009;4:e6138. doi: 10.1371/journal.pone.0006138.
  • 7. Price A, Patterson N, Plenge R, Weinblatt M, Shadick N, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics. 2006;38:904–909. doi: 10.1038/ng1847.
  • 8. Zhang F, Wang Y, Deng H. Comparison of population-based association study methods correcting for population stratification. PLoS One. 2008;3:e3392. doi: 10.1371/journal.pone.0003392.
  • 9. Draper N, Smith H. Applied regression analysis. 1998;706 John Wiley and Sons. New York.
  • 10. Cox D. Principles of statistical inference. Cambridge Univ Pr; 2006.
  • 11. White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817–838.
  • 12. StataCorp. Stata statistical software: Release. 2009;11
  • 13. Zeileis A. Econometric computing with HC and HAC covariance matrix estimators. Journal of Statistical Software. 2004;11:1–17.
  • 14. Zeileis A. Object-oriented computation of sandwich estimators. Journal of Statistical Software. 2006;16:1–16.
  • 15. Therneau T; original R port by Thomas Lumley. survival: Survival analysis, including penalised likelihood. 2009. URL http://CRAN.R-project.org/package=survival. Accessed 2011 April 7. R package version 2.35-4.
  • 16. Royall R. Model robust confidence intervals using maximum likelihood estimators. International Statistical Review/Revue Internationale de Statistique. 1986;54:221–226.
  • 17. Glazer NL, Felix JF, Dörr M, Chen MH, Schmidt R, et al. Genome-wide meta-analyses of SNP by environmental factor interactions on echocardiographic traits: a CHARGE-EchoGen study. 2010. In Press.
  • 18. R Development Core Team. R: A Language and Environment for Statistical Computing. 2009. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org. ISBN 3-900051-07-0.
