Abstract
There has been extensive literature on modeling gene-gene interaction (GGI) and gene-environment interaction (GEI) in case-control studies with limited literature on statistical methods for GGI and GEI in longitudinal cohort studies. We borrow ideas from the classical two-way analysis of variance (ANOVA) literature to address the issue of robust modeling of interactions in repeated-measures studies. While classical interaction models proposed by Tukey and Mandel have interaction structures as a function of main effects, a newer class of models, additive main effects and multiplicative interaction (AMMI) models, do not have similar restrictive assumptions on the interaction structure. AMMI entails a singular value decomposition of the cell residual matrix after fitting the additive main effects and has been shown to perform well across various interaction structures. We consider these models for testing GGI and GEI from two perspectives: likelihood ratio test based on cell means and a regression based approach using individual observations. Simulation results indicate that both approaches for AMMI models lead to valid tests in terms of maintaining the type I error rate, with the regression approach having better power properties. The performance of these models was evaluated across different interaction structures and 12 common epistasis patterns. In summary, AMMI model is robust with respect to misspecified interaction structure and is a useful screening tool for interaction even in the absence of main effects. We use the proposed methods to examine the interplay between the hemochromatosis gene and cumulative lead exposure on pulse pressure in the Normative Aging Study.
Keywords: likelihood ratio test, longitudinal data, mixed model, non-additivity, parametric bootstrap, singular value decomposition, two-step regression
INTRODUCTION
Prospective cohort studies examining gene-gene or gene-environment interaction (GGI or GEI) effects on disease-related quantitative traits have received considerable attention in recent years [Bookman et al. 2011, Fan et al. 2012]. The detection of GEI plays a critical role in identifying a sub-population of the genetically susceptible individuals that are strongly affected by an adverse exposure. A better understanding of GEI may lead to the development of more effective disease prevention and intervention strategies. Studies of GEI in relation to disease development are facilitated by life-time characterization of exposure data, which are often available in prospective cohort studies. Repeated measures design in a prospective cohort study may increase power to detect interaction effects [Wong et al. 2003] and provide better ways to handle exposure measurement error. In addition, repeated measures data provide valuable information for delineating potentially time-dependent form of GGI or GEI, thereby permitting a much more detailed assessment of the dynamically evolving interplay between genes and environment.
Cohort studies for GGI or GEI are typically characterized by unequal sample size in each genotype-genotype or genotype-exposure configuration as a result of unbalanced allele frequencies and heterogeneous environmental exposure distributions in a population. A common analysis strategy for such unbalanced data involves modeling GGI or GEI by a product term in a regression setting, implying that the effect of two factors may not be purely additive in their contribution to the quantitative trait. Alternatively, one can try to model the interaction term in the generalized additive mixed model framework with nonlinear exposure and time effects [Lin and Zhang 1999], but tests for such non-parametric, smoothed interaction terms may yield reduced power for moderate sample size. Therefore, flexible yet parsimonious modeling of GGI or GEI is of interest in the longitudinal setting. In this paper, we propose likelihood ratio tests (LRT) for GGI and GEI using a sparse representation of interaction borrowing ideas from the classical ANOVA literature.
Genetic factors (G) and environmental exposures (E) are frequently treated as binary or ordered categorical variables. Consequently, GGI and GEI are often analyzed in the form of a two-way table. Considering G as a row variable with I categories and E as a column variable with J categories, the mean structure of a general two-way classification model for analyzing row × column interactions is given by
(1) |
where μij is the the expected (mean) value of a quantitative trait corresponding to the ith row and the jth column, μ is the grand mean, Ri is the additive main effect of the ith row, Cj is the additive main effect of the jth column, and γij is the non-additive effect of the ith row and the jth column. The sum-to-zero conditions,
(2) |
ensure identifiability of the parameters in (1), so the degrees of freedom (df) for testing γij in a fully saturated model is (I − 1)(J − 1). While a saturated model (1) is flexible for estimation of γij, the df for interaction tests can increase considerably for finely cross-classified tables, which is inefficient and may result in low power for detecting GGI or GEI.
To improve the power of the test for GGI and GEI in longitudinal cohort studies, we explore alternative parsimonious interaction structures that were proposed in the classical ANOVA literature for testing interaction with only one observation per cell. Several models are summarized in the following:
[Tukey 1949] |
[Mandel 1961] |
[Mandel 1961] |
[Tukey 1962] |
with constraints for models (b) and (c), respectively, and additional constraints for model (d). The multiplicative interaction term is proportional to the main effects of one or both factors. The null hypotheses of no interaction for models (a)–(d) are θ = 0, λi = 0 (i = 1, …, I − 1), ηj = 0 (j = 1, …, J − 1), and θ = λi = ηj = 0 (i = 1, …, I − 2; j = 1, …, J − 2), corresponding to 1, I −1, J − 1, and I + J − 3 df for the tests of interaction effects, respectively. A more flexible potential alternative is the additive main effects and multiplicative interactions (AMMI) model [Gollob 1968, Mandel 1971]
where M represents the number of interaction factors being extracted, M ≤ min(I − 1, J − 1), and a residual remains if not all interaction factors are used. The terms {αimβjm} can the considered as the weights corresponding to a multiplicative contrast among {γij} with Σi αim = Σj βjm = 0 and Σi αimαim′ = Σj βjmβjm′ = 0 for m ≠ m′. For normalized contrasts ( , {dm, αim, βjm} can be obtained by applying singular value decomposition (SVD) to {γij}. Since the motivation for using an AMMI model is to extract a low rank approximation to the interaction matrix to save df and thus to enhance efficiency for the test, we focus on AMMI models with M = 1 (AMMI1). For all subsequent discussions, model (e) refers to AMMI1 model. The null hypothesis of no interaction for AMMI1 model is H0 : d1 = 0.
Models (a)–(e) were conceived from a statistical objective of reducing df and enhancing power of tests for interaction. They have been used in designed genotype-by-environment yield trials in agricultural studies [Freeman 1973, Zobel et al. 1988, Crossa et al. 1990]. These models were not conceived from a mechanistic or human biological perspective. Model (a) has recently been used to test for genetic effects in case-control studies [Chatterjee et al. 2006] and repeated measures data of complex traits [Maity et al. 2009]. Models (a), (b), (c), and (e) have also been applied for GGI effects on quantitative traits in cross-sectional studies [Barhdadi and Dubé 2010]. In unbalanced designs, the sums of squares associated with the two factors and their interaction are not orthogonal to one another. Consequently, the difficulties that arise in applying these nonlinear interaction models to unbalanced data involve obtaining unbiased parameter estimates, partitioning the sums of squares, deriving the appropriate test statistics and their null distributions. Mukherjee et al. [2012] proposed a screening tool for GGI and GEI using cell means from an unbalanced repeated measures array. This approach is appealing due to a closed-form analytical expression of the test statistic. However, violations in the homoscedasticity assumption of cell mean error distributions result in inflated type I error. While their proposed resampling-based method recognizes unbalanced, repeated measures data structure, the test implemented for AMMI models lacks power because it was not based on a theoretically derived pivot but an ad hoc extension of the balanced, cross-sectional case.
To overcome some of the limitations of the previous methods, we propose alternate approaches to explore GGI and GEI using models (a)–(e). We first describe our improved cell-mean approach that properly handles unbalanced data. Specifically, we adapt and modify the test proposed by Boik [1989] under a reduced-rank model for application to GGI/GEI using AMMI models. Next, we extend models (a)–(e) to the repeated measures setting using a mixed-effects modeling framework. We then develop a parametric bootstrap resampling approach by replacing the ad hoc pivot in Mukherjee et al. [2012] with a LRT-based pivot derived from the maximum likelihood under a nonlinear mixed-effects model. The power and type I error of our proposed tests are examined through a series of simulation studies. Lastly, we apply the proposed methods to a GEI study concerning the modifying effects of polymorphisms in the hemochromatosis gene (HFE) on the association between cumulative lead exposure and pulse pressure [Zhang et al. 2010]. Subject- and time-specific contributions to GEI are investigated using outputs from the AMMI model.
METHODS
Likelihood Ratio Test based on Cell Means
Following the notations in (1), let yijkh be the hth measurement corresponding to the kth individual in the (i, j)th cell (or equivalently, row i and column j) in a longitudinal cohort study, i = 1, …, I, j = 1, …, J, k = 1, …, Nij, h = 1, …, nijk. Let N denote the total number of individuals, N = ΣiΣj Nij. Let Ȳ = {Ȳij} be the I × J matrix of sample means with . Let L be the matrix of main effects, parameterized as , where 1ν is a length-ν vector of ones, and R = (R1, …, RI)′, and C = (C1, …, CJ)′ are the parameter vectors representing row and column effects, respectively. Let Γ be the I × J matrix of interaction effects, so the mean structures of models (a)–(e) can be expressed as E(Ȳ) = L+Γ. Throughout our treatment of the problem, we consider the drop-outs in longitudinal studies to be missing at random, leading to the unbalanced data structure.
We propose to use an empirical variance estimate for the variance of Ȳij (denoted as ) that accounts for within-subject correlation. Let σ2P(ρ) be a symmetric nijk × nijk within-subject covariance matrix, where ρ is a s × 1 parameter vector that fully characterizes the correlation matrix P(ρ), and σ2 is a scale parameter. Both ρ and σ2 can be estimated by Pearson residuals, namely, r̂ijkh = yijkh − ȳijk [Liang and Zeger 1986]. The pooled estimate of σ2 is
The estimation of ρ is conditional on the correlation structure. For a compound symmetric correlation structure, s = 1, corr(yijkh, yijkh′) = ρ for h ≠ h′. The pooled estimate for ρ is
Finally, the empirical variance estimate for Ȳij is given by
(3) |
Given δ̂ij, we maximize the likelihood of Ȳ under the normality assumption on the cell means, namely, . Maximizing the log-likelihood is equivalent to least squares fitting of μij subject to weights . For classical interaction models (a)–(d) involving nonlinearity in the parameters, the maximum likelihood (ML) estimates for L and Γ are obtained using a quasi-Newton method in R [R Core Team 2012] with function ‘optim’ and L-BFGS-B algorithm [Nocedal and Wright 1999]. Quasi-Newton methods are sequential line search algorithms, and generally require only the gradient of the objective to be computed at each iterate. When convergence is reached, we calculate the log-likelihood under the null (ℓ̂0) and under the alternative (ℓ̂1) to construct the LRT statistic: −2(ℓ̂0 − ℓ̂1). Under H0, the LRT statistic approximately follows a central chi-square distribution with df = 1, I − 1, J − 1, and I + J − 3 for models (a), (b), (c), and (d), respectively. The comparison of empirical quantiles of the LRT statistics with chi-square quantiles is presented in Supporting Information (Figure S.1).
Boik [1989] proposed the likelihood ratio criterion to test the rank of Γ for unbalanced data without repeated measures. For AMMI1 models, the test for non-additivity is H0 : d1 = 0 vs. Ha : d1 = 1, which is equivalent to H0: rank(Γ) = 0 vs. Ha: rank(Γ) = 1. Let Hν be the row-space or column-space projection operator, , and let be a full-rank factorization of Hν with dimension ν × (ν − 1) satisfying . We have
with rank(Γ) = rank(Φ) = r ≤ p = min(I − 1, J − 1). The elements of Φ form a basis for the set of interaction contrasts. Define K = (KJ ⊗ KI) so that K′vec(Ȳ) is a linear function of vec(Ȳ) without containing the main effects (because K′vec(L) = 0). Hence, E[K′vec(Ȳ)] = K′vec(Γ) = vec(Φ) and .
The goal is to maximize the likelihood function of Ȳ subject to the constraint rank(Γ) = r, which is the same as computing
(4) |
where , and is replaced by in (3). The constrained ML estimate Φ̂ is the solution to S(r). Due to the weight matrix W, a direct SVD solution does not exist. Instead, Φ̂ can be obtained by criss-cross regression [Gabriel and Zamir 1979]. Write Φ = AB′, where A and B are (I − 1) × r and (J − 1) × r, respectively. Now (4) becomes a standard weighted least squares problem. Given A(n), B is updated as
(5) |
In turn, given B(n+1), A is updated as
(6) |
We alternate (5) and (6) until convergence of (4) is reached, and Φ̂ = ÂB̂′. The LRT statistic is S(0) − S(1), where S(0) = K′vec(Ȳ)W−1K′vec(Ȳ). The asymptotic null distribution of this LRT statistic converges in distribution to the maximum root of a p–variate Wishart matrix with df = max(I −1, J −1) in balanced designs [Boik 1989]. The corresponding 95th and 99th percentiles of this distribution can be found in Hanumara and Thompson Jr [1968]. With unbalanced data, under the assumption that Nij = Σj Nij Σi Nij/Σij Nij, the null distribution of the LRT is known to be identical to that in balanced designs. Due to correlated nature of the outcome data, these approximations are not directly applicable to our context. However, our numerical work illustrates that using this reference distribution provides a conservative approximation to the test.
Parameter Estimation based on Individual Observations
The cell-means approach provides a quick way of summarizing interaction effects for repeated measures data. In presence of confounders and other covariates, a mixed-effects regression model uses all individual observations and provides a general framework for handling repeated measurements. Let yijk denote the length-nijk observation vector for subject (i, j, k),
(7) |
where μij is the mean response value for the (i, j)th cell, Zijk is a nijk × q design matrix for the random effects, eijk ~ (0, Σijk), not depending on i, j, or k except that its size is nijk × nijk, and bijk is a length-q vector of subject-specific random effects, independent of eijk. The random effects are distributed as (0, Ψ), where Ψ is the q × q covariance matrix for the random effects. It follows that the variance-covariance matrix for yijk is .
Classical Interaction Models
To avoid computationally intensive iterations associated with ML estimation for models (a)–(d), we propose a two-step regression procedure to approximate the interaction parameters. The idea is similar to Milliken and Johnson [1989], who applied a two-step regression procedure in two-way tables to estimate nonlinear interaction effects. Let Xijk be the design matrix with dimension nijk × IJ that allows estimation of all plausible effects from the row and column factors. In the first step, we fit a saturated interaction model to the data using a linear mixed-effects model:
(8) |
where ξ = (μ, R1, …, RI−1, C1, …, CJ−1, γ11, …, (I−1)(J−1))′. The log-likelihood function is
(9) |
The variance components are estimated by restricted maximum likelihood (REML) [Patterson and Thompson 1971], and the I × J fixed effect estimates are
(10) |
In the second step, we extract the main effect estimates from ξ̂ and compute the residuals
(11) |
Since the interaction term of Tukey’s and Mandel’s models involves main effects, we perform a second regression (without intercept) where the residuals rijk are treated as the response variable and the respective specific forms of main effect estimates are treated as the regressors to obtain the corresponding slope estimates. The second-step regression equations for models (a)–(c) are:
(12) |
(13) |
(14) |
with εijk ~ (0, Ωε(nijk×nijk)). One can select a covariance structure Ωε depending on the criterion of model fitting. Note that parameter constraints in (2) are handled in the regressors, and there are I −1 and J −1 regression equations in (13) and (14), respectively. For model (d), we first obtain the estimates of λi and ηj then compute the second-step residuals using {R̂i, Ĉj, λ̂i, η̂j}, and finally estimate θ (see Supporting Information for details).
AMMI1 Model
Given that the interaction structure of model (e) is derived from a SVD of the matrix of residuals after removing additive effects, we propose to perform SVD to the saturated Γ̂ matrix as obtained from (10). The resulting largest singular value of Γ̂ is an approximation of d̂1. The corresponding left and right singular vectors are approximations of α̂i and β̂j, for i = 1, …, I, j = 1, …, J.
Remark 1
We evaluated the bias and mean squared error (MSE) properties of the two-step regression estimators through simulation. The empirical results indicate that the two-step regression estimators appear to be unbiased, even under misspecified correlation structures (Table S.2 in Supporting Information). The estimator of d1 for AMMI1 models (obtained by SVD of the estimated saturated interaction matrix), however, slightly over-estimates d1.
Parametric Bootstrap using a LRT Pivot
We construct a LRT statistic based on the non-iterative two-step regression estimates. First, the log-likelihood under the null hypothesis is obtained by fitting an additive mixed model (denoted as ℓ̂0), and the log-likelihood under the alternative hypothesis is obtained by the previous two-step regression procedure (denoted as ℓ̂1). Specifically for calculating ℓ̂1, we extract μ̂, R̂i, and Ĉj from (10) and obtain the interaction effect estimates from the second-step regression of residuals on a pre-specified structure of main effect estimates for models (a)–(d). Subsequently, an approximate LRT pivot is created, Λ̂ = −2(ℓ̂0 − ℓ̂1). Because the parameter estimates used in Λ̂ are not proper ML estimates, the resulting test statistic does not have a standard asymptotic distribution. We use parametric bootstrap to elicit the null distribution of this LRT-based pivot. Since permuting Y or subjects across the configurations of G and E factors can remove both interaction and main effects, we generate pseudo data under the null hypothesis of no interaction while preserving the main effects using the model: , where . Ψ̂ and Σ̂ are REML estimates from the saturated mixed-effects model in (8). For each simulated null sample, a Λ̂ is computed. Repeating the procedure for a large number of times (e.g., we use 1000) provides an approximate distribution of Λ̂ under H0. Finally, an empirical p-value is obtained by calculating the proportion of all Λ̂ that exceeds the observed Λ̂.
Simulation Settings
We carried out a series of simulation studies to examine the following properties of the proposed tests: [1] type I error for the LRT using cell means (LRT-CM) and for the parametric bootstrap approach with the LRT-based statistic Λ̂ (LRT-PB); [2] power comparison of AMMI1 to saturated interaction model if AMMI1 model holds; [3] power comparison of LRT-PB to LRT-CM; and [4] performance of models (a)–(e) across classical interaction structures. In addition, we compared relative performances of our proposed tests to existing strategies for testing GGI or GEI, including the standard saturated interaction model and the naive cell mean approach of Barhdadi and Dubé [2010] (not accounting for correlated data). Furthermore, we evaluated the performance of each model in detecting GGI with repeated measures on a quantitative trait under 12 epistasis patterns.
Individual-level outcome Y with nijk repeated measures on subject (i, j, k) were generated for a total of N subjects. The general description of the model is given by
(15) |
where , and {e, b} are mutually independent. Cell means were first generated according to models (a)–(e), and the data vector for each individual with a given mean and a covariance structure was generated from a multivariate normal distribution. The interaction terms in all models were scaled in such a way that they contributed to 15% of the total variation explained by the model, and the remainder was attributed to row and column main effects. While simulating data under an AMMI1 model, we assigned the entire contribution due to interaction effect to the first interaction factor.
To evaluate model performance in terms of detecting common patterns of GGI, data were simulated according to 12 epistasis models [Barhdadi and Dubé 2010]: (1) dominant or dominant; (2) dominant or recessive; (3) modified model; (4) dominant and dominant; (5) recessive or recessive; (6) dominant and recessive; (7) recessive and recessive [(1)–(7) from [Jung et al. 2009]]; (8) checkerboard; (9) diagonal [(8) and (9) from [Culverhouse et al. 2004]]; (10) threshold; (11) additive and additive; and (12) a general model. The additive and additive model and the general model are purely epistatic models, that is, the quantitative trait depends on genotype from two loci in the absence of any marginal effects. Figure 1 gives a visual representation of the true cell means for the 12 epistasis patterns in our simulation.
We considered 3 × 3 table settings for all simulations to mimic studies of GGI. In addition, we evaluated the power of AMMI1 models in 9 × 5 table settings as described in [2]. For 3 × 3 tables, minor allele frequencies for the two loci were set at 0.3 and 0.4, respectively. For 9 × 5 tables, combinations of the two loci with allele frequencies 0.3 and 0.4 (resulting in nine categories) along with an environmental exposure with five levels (each with probability 0.2) were considered. Hardy-Weinberg equilibrium was assumed to hold for both loci. We set (i) σ2 = 8, ρ = 0.5 (or ) and (ii) σ2 = 16, ρ = 0.5 (or ). We also considered ρ = {0.2, 0.5, 0.8} for the power evaluation of AMMI1 models. Under each simulation setting, 1000 datasets were generated with 1800 and 3600 subjects for 3 × 3 and 9 × 5 tables, respectively. The number of repeated measurements per subject was generated from a multinomial distribution similar to the analysis dataset: nijk ∈ {2, 3, 4, 5, 6}, n = {nijk : 1 ≤ k ≤ Nij, 1 ≤ i ≤ I, 1 ≤ j ≤ J} ~ mult(N, p), p = (0.15, 0.2, 0.3, 0.2, 0.15). This is equivalent to generating outcome data missing completely at random.
RESULTS
Simulation Findings
Type I error
We generated data under an additive model, H0 :γij = 0 (while Ri, Cj ≠ 0) as well as under a completely null model, H0 : γij = Ri = Cj = 0 for all i, j. Figure 2 shows the percentage of false rejections for the five interaction models from 1000 simulations at 5% significance level. Under the additive model, the type I error rates for all models using LRT-CM and LRT-PB are maintained at the nominal 5%. Under the null model, type I error rates for models (a)–(e) using LRT-PB as well as for model (e) using LRT-CM are still maintained at 5%. LRT-CM for classical models (a)–(d), however, are either too liberal or too conservative (>12% for model (a), >8% for models (b) and (c), and <3% for model (d)).
Power
The gain in power using an AMMI1 model compared to a saturated interaction model increases as the table dimension increases. In the 3 × 3 array setting, saturated models (4 df for the interaction effects) appear to have similar power to AMMI1 models (data not shown). In the 9 × 5 array setting, AMMI1 models using LRT-CM and LRT-PB clearly have greater power than saturated models when the true interaction only has one interaction factor (Figure 3). The highest observed gain in power for AMMI1 using LRT-PB compared to the saturated model (32 df for the interaction effects) is 11% under three correlation settings. As ρ increases from 0.2 to 0.8, AMMI1 begins to show power gain across a wider range of d1.
Figure 4 shows the percentage of interactions detected by each test across a set of true simulation models. Overall, the power of LRT-PB is increased by 2–5% compared to LRT-CM. When Tukey’s model (a) is the true model, all other models are able to capture some interactions (70–82% when , 38–53% when ). Under Mandel’s column model (b), Tukey’s row-column (d) and AMMI1 (e) are able to detect the interaction (both power >99% when and >90% when ); whereas Tukey’s 1-df model (a) and Mandel’s row model (c) have very low power (both <50% with LRT-CM and <6% with LRT-PB). Similar properties are observed for simulations under Mandel’s row model (c). With Tukey’s row-column model (d) being the simulation model, all alternatives, except model (a), are able to detect the interaction with power greater than 60%. When the true model is an AMMI1 model (e), models (a)–(c) have relatively low power to detect interaction (<50% when and <32% when ). Saturated model has lower power than AMMI1 in most cases.
Figure 5 shows the percentages of interaction detected by six interaction models using LRT-PB under 12 common epistasis models. Given the robust performance of model (e), AMMI1 model appears to be a desirable approach for evaluating common epistasis structures, especially when main effects do not exist (e.g., epistasis models (10) and (12)).
We also compared our proposed methods with those in Barhdadi and Dubé [2010] in terms of type I error and power. As expected, the tests in Barhdadi and Dubé [2010] assuming balanced data structure and not accounting for within-subject correlation yield inflated type I error (especially for Tukey’s and Mandel’s models) and lower power (see Figure S.2 in the Supporting Information).
Application to the Normative Aging Study (NAS)
The NAS is a multidisciplinary longitudinal study initiated by the U.S. Veterans Administration in 1963 [Bell et al. 1966]. We analyzed 671 participants from a subset of the NAS data who were successfully genotyped for the HFE gene and had baseline measurements of tibia bone lead (a measure of cumulative lead exposure) [Zhang et al. 2010]. The analysis goal was to investigate effect modification by the different HFE alleles on the association between lead exposure and pulse pressure (PP), which is a strong predictor of heart problems for older adults. Since 1991, data had been collected every 3–5 years until 2011 with a median follow-up time of 12 years, including physical examination, blood pressure and laboratory measurements, and questionnaire data. The majority (97%) of the participants were Caucasian. The average age was 66.29 ± 7.14 (range 48–93) at the time of tibia bone lead measurement. More than 96% of subjects had repeated measurements on blood pressure, and over 65% of them had at least four measurements during the study period contributing to a total of 2914 observations.
Two major mutations in the HFE gene (C282Y and H63D mutations) were considered for analysis following Zhang et al. [2010]. Let (AA, Aa, aa) and (BB, Bb, bb) denote wild type, having one variant allele, and having two variant alleles for C282Y and H63D, respectively. As a result of small sample sizes in certain homozygote genotypes (N=5 for aaBB, N=17 for AAbb) and compound heterozygotes (N=14 for AaBb and N=0 for Aabb, aaBb, aabb), we were unable to test for interaction between the two loci. Since the research interest was to compare three mutually exclusive groups (wild type, H63D, C282Y), 14 subjects with compound heterozygotes (AaBb) were excluded from analysis. Consequently, the HFE genotypes were classified into three categories for analysis: AABB, AaBB or aaBB, and AABb or AAbb. The environmental exposure (cumulative lead) was a continuous variable, but to illustrate the proposed methods, we categorized bone lead levels into three groups (Low: ≤15, Medium: (15, 25], and High: >25 μg/g). Table 1 lists the observed cell means of PP and the number of participants for each G × E configuration.
Table 1.
HFE gene | Tibia lead levels (μg/g)
|
||
---|---|---|---|
Low: ≤ 15 | Medium: > 15 and ≤ 25 | High: > 25 | |
Wild-type (AABB) | 52.94 (161) | 56.16 (149) | 56.61 (131) |
C282Y (AaBB or aaBB) | 51.89 (23) | 56.65 (39) | 59.10 (23) |
H63D (AABb or AAbb) | 52.58 (54) | 57.72 (53) | 64.49 (38) |
We applied LRT-CM and LRT-PB to test this GEI effect. According to the Akaike information criterion (AIC) for model fit, we chose a random-intercept mixed-effects model for analysis:
(16) |
where is the random-effect coefficient for subject (i, j, k), is the random error term, and { } are assumed to be constant across individuals. We also considered the model adjusting for baseline age, time since baseline in years, and squared time.
(17) |
For LRT-CM adjusting for covariates, cell means were formed by the residuals from a regression of the outcome on covariates other than G and E. This is an ad hoc approach for covariate adjustment since correlations of covariates with G and E are ignored. In general, LRT-PB based on a full regression model with G, E, and covariates will yield more power.
Results for GEI
Table 2 shows the p-values for testing HFE × Lead Exposure interaction using LRT-CM and LRT-PB and the saturated interaction model. Using LRT-CM without adjustment for any covariate, the interaction was significant in all four classical models (p < 0.05), whereas (e) gave a p-value between 0.05 and 0.10. After adjusting for the covariates, model (e) detected the interaction using LRT-CM (p < 0.01). Regardless of covariate adjustment, the interaction was significant for models (a)–(e) using LRT-PB (p < 0.01), and also for the saturated interaction model (p < 0.02). P-values for the GEI effect decreased further for all tests with adjustment for baseline age, time since baseline, and squared time. Given the significant GEI on all models, this interaction may be real and not model dependent.
Table 2.
Model | Hypothesis |
p-value
|
|||
---|---|---|---|---|---|
LRT-CMa | LRT-CMb | LRT-PBa | LRT-PBb | ||
Model (a) | H0 : θ = 0 | 0.008 | 0.002 | 0.002 | 0.003 |
Model (b) | H0 : λi = 0 (Lead) | 0.029 | 0.007 | 0.009 | 0.008 |
Model (c) | H0 : ηj = 0 (HFE) | 0.015 | 0.002 | 0.002 | 0.001 |
Model (d) | H0 : θ = λi = ηj = 0 | 0.035 | 0.005 | 0.007 | 0.002 |
Model (e) | H0 : d1 = 0 | <0.10 | <0.01 | 0.009 | 0.002 |
Saturated Model | HFE × Lead | 0.015 | 0.006 |
No covariate adjustment.
Adjusting for baseline age, time since baseline, and squared time. For LRT-CM, residuals from a regression of pulse pressure on all other covariates except lead levels and genotype were used to form the cell means.
According to the SVD of Γ̂G×E under a saturated interaction model with covariate adjustment, the first and second characteristic roots of Γ̂G×E were d̂1 = 5.65 and d̂2 = 1.24, respectively (Table S.3). The first interaction factor contributed to over 80% of the total contribution to the interaction term. The association between PP and bone lead levels was strongest among H63D variant (AABb or AAbb) carriers, compared to C282Y variant (AaBB or aaBB) carriers and wild-type (AABB) participants. Based on the saturated interaction model estimates, the estimated difference in mean PP for H63D variant carriers with High versus Low lead levels was 9.6 mmHg [95% confidence interval (CI), 0.43–14.83 mmHg]. The same estimated differences were 3.52 mmHg [95% CI, −3.39–10.43 mmHg] and −0.33 mmHg [95% CI, −2.95–2.29 mmHg] for C282 variant carriers and wild-type participants, respectively. Although the underlying mechanisms are poorly understood, the common prooxidant and proinflammatory effects shared by lead exposure and H63D HFE, especially through increased production of monocyte chemo-attractant protein-1 [Zhang et al. 2010], may explain the observed H63D HFE-lead interaction.
Subject-specific and Age-specific Contributions to GEI
Using the estimates of singular vectors (α̂im, β̂jm), we investigated subject-specific and age-specific contributions to GEI in the first and the second interaction factors (m = 1, 2) via the sum of squared deviations [Mukherjee et al. 2012]. Briefly, the variation due to subject (i, j, k) can be calculated by d̂ijkm = α̂imβ̂jmr̂ijk., where r̂ijk. is the mean of nijk subject-level residuals from (11) after removing main effects. The variation in the contribution of subject (i, j, k) is then (d̂ijkm − d̂.m)2, where d̂.m = ΣiΣjΣk d̂ijkm/N. For the age-specific contribution, we constructed eight three-year age intervals. The first and last intervals contained observations from those who were younger than 65 and who were 83 or older, respectively. The cell means of PP and numbers of participants for genotypes and lead levels based on different age categories are presented in Figure S.3 in the Supporting Information. The contribution due to the tth age interval is calculated as d̂tm = ΣiΣj α̂imβ̂jmr̂tij., where r̂tij. is the average of residuals (11) in the tth age interval among individuals in the (i, j)th cell, t = 1, …, 8. The variation in the contribution of the tth interval is (d̂tm − d̂.m)2, where .
Figure 6 displays subject-specific contributions from the 671 individuals (left panel) and contributions of eight age intervals to the first interaction factor of GEI (right panel). The plot indicates that the modifying effect of the HFE gene on the effect of cumulative lead exposure on PP spiked around age 75. This was due to the fact that the mean difference in PP between the Low and the High bone lead groups became largest in that age interval with H63D (AABb or AAbb), whereas the difference in PP among those with wild-type of HFE (AABB) was the smallest. A stratified analysis by baseline age also indicates time-dependent evidence of interaction effects (see Supporting Information). Figure S.4 shows patterns corresponding to the second interaction factor with substantially less subject-specific and age-specific variability for the GEI. Although these graphical diagnostics do not establish a statistical significance of a G × E × Time term, they provide important insight into longitudinal features of the interaction factors.
DISCUSSION
We have proposed new likelihood ratio tests for GGI and GEI effects in longitudinal cohort studies using a sparse representation of interaction structure via Tukey’s and Mandel’s models as well as AMMI1 models. AMMI1 appears to be a robust and flexible model in detecting interaction effects across a spectrum of interaction structures. Moreover, it is relatively powerful in detecting certain epistasis structures with no appreciable main effects but potential interaction. In contrast, Tukey’s and Mandel’s models fail to detect interactions if the interaction structure is misspecified.
Both of our approaches require prior assumptions of the mean structure under the null hypothesis of no interaction and an underlying correlation structure for within-subject measurements. When either part of the model is misspecified, the power and the false rejection rate might be affected. Although this is a generic limitation of parametric modeling, we performed additional simulations to evaluate the influence of misspecification of covariance structure on the proposed tests. We generated data under several common correlation structures (e.g., compound symmetry, autoregressive, unstructured) and analyzed interactions assuming a different correlation structure. The results show that under a misspecification of covariance structure, type I error rates are maintained for LRT-PB but can be slightly inflated for LRT-CM.
In our simulation studies, we did not see a vast difference in the power between LRT-CM and LRT-PB. The correlation across repeated measurements in the LRT-CM approach is accounted for by the weight matrix W. Therefore, the test is not based on naive subject-level averages as in Mukherjee et al. [2012]. The main advantage of LRT-PB is the flexible regression structure that allows all readily available mixed model estimation tools to be used.
We have focused on developing valid tests for the five interaction models, yet there are some limitations. First, a caveat of the ML estimation for classical interaction models (a)–(d) based on cell means is that when the underlying main effects are relatively small, the estimation for interaction parameters would become numerically unstable. Depending on initial values, the final converged estimates might be local ML estimates instead of global ML estimates. Second, SVD of the estimated saturated interaction matrix yields approximate estimates rather than proper ML estimates for AMMI model parameters. Our simulation results indicate that this estimator leads to slight over-estimation of d1. Nevertheless, LRT-PB for AMMI1 still maintains the nominal type I error rate and in general possesses greater power than a saturated interaction model. In addition, how to connect the parameters of the AMMI model to directly interpretable quantities for biological interactions are not clear. At this stage, AMMI model remains a screening strategy for testing non-additivity. Third, covariate adjustment was not considered in our simulation studies. In practice, one can incorporate time effect and other (time-varying) covariates with LRT-CM and LRT-PB, as we did in our data example. Lastly, to our knowledge, no replication study has examined the interaction effect between the HFE gene and cumulative lead exposure on pulse pressure. We randomly split the data in half and analyzed the two halves for GEI as an assessment of internal consistency, and the results were consistent with our findings. As discussed in Zhang et al. [2010], the conclusion from the NAS data analysis may not be generalizable to other populations given that the study population was exclusively white men. Besides, unmeasured confounding factors and interactions with other genetic polymorphisms or environmental factors were not considered.
The proposed analysis strategies are useful for detecting GGI and GEI effects in longitudinal data. A full likelihood-based approach using a general nonlinear mixed-effects model set-up would be more appealing if the appropriate test statistics and their closed form null distributions can be obtained. Further work is required to investigate specialized nonlinear optimization algorithms in the ML framework to replace the two-step estimation and to construct a valid and more efficient test. It is also important to develop a formal test for individual- and time-specific contributions to interactions, which will ultimately lead to better understanding of GGI and GEI.
Supplementary Material
Acknowledgments
The research of BM was supported by the NSF grant DMS-1007494 and the NIH grants CA156608 and ES020811. SKP was supported by the NIEHS K01-ES016587. The VA Normative Aging Study is supported by the Cooperative Studies Program/Epidemiology Research and Information Center of the U.S. Department of Veterans Affairs and is a component of the Massachusetts Veterans Epidemiology Research and Information Center, Boston, Massachusetts. The authors will like to thank Dr. Joel Schwartz, Dr. Howard Hu, and NAS participants for sharing the data resources.
Footnotes
We declare that we have no conflict of interest.
Contributor Information
Yi-An Ko, Department of Biostatistics, University of Michigan.
Paramita Saha-Chaudhuri, Department of Biostatistics and Bioinformatics, Duke University.
Sung Kyun Park, Department of Epidemiology, University of Michigan.
Pantel Steve Vokonas, VA Normative Aging Study, Veterans Affairs Boston Healthcare System. Department of Epidemiology, Boston University School of Public Health.
Bhramar Mukherjee, Department of Biostatistics, University of Michigan.
References
- Barhdadi A, Dubé M. Testing for gene-gene interaction with AMMI models. Statistical Applications in Genetics and Molecular Biology. 2010;9:1–27. doi: 10.2202/1544-6115.1410. [DOI] [PubMed] [Google Scholar]
- Bell B, Rose C, Damon A. The veterans administration longitudinal study of healthy aging. The Gerontologist. 1966;6:179–184. doi: 10.1093/geront/6.4.179. [DOI] [PubMed] [Google Scholar]
- Boik R. Reduced-rank models for interaction in unequally replicated two-way classifications. Journal of Multivariate Analysis. 1989;28:69–87. [Google Scholar]
- Bookman E, McAllister K, Gillanders E, Wanke K, Balshaw D, Rutter J, Reedy J, Shaughnessy D, Agurs-Collins T, Paltoo D, et al. Gene-environment interplay in common complex diseases: forging an integrative model–recommendations from an NIH workshop. Genetic Epidemiology. 2011;35:217–225. doi: 10.1002/gepi.20571. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chatterjee N, Kalaylioglu Z, Moslehi R, Peters U, Wacholder S. Powerful multilocus tests of genetic association in the presence of gene-gene and gene-environment interactions. The American Journal of Human Genetics. 2006;79:1002–1016. doi: 10.1086/509704. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Crossa J, Gauch H, Zobel R. Additive main effects and multiplicative interaction analysis of two international maize cultivar trials. Crop Science. 1990;30:493–500. [Google Scholar]
- Culverhouse R, Klein T, Shannon W. Detecting epistatic interactions contributing to quantitative traits. Genetic Epidemiology. 2004;27:141–152. doi: 10.1002/gepi.20006. [DOI] [PubMed] [Google Scholar]
- Fan R, Albert P, Schisterman E. A discussion of gene–gene and gene–environment interactions and longitudinal genetic analysis of complex traits. Statistics in Medicine. 2012;31:2565–2568. doi: 10.1002/sim.5495. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Freeman G. Statistical methods for the analysis of genotype-environment interactions. Heredity. 1973;31:339–354. doi: 10.1038/hdy.1973.90. [DOI] [PubMed] [Google Scholar]
- Gabriel K, Zamir S. Lower rank approximation of matrices by least squares with any choice of weights. Technometrics. 1979;21:489–498. [Google Scholar]
- Gollob H. A statistical model which combines features of factor analytic and analysis of variance techniques. Psychometrika. 1968;33:73–115. doi: 10.1007/BF02289676. [DOI] [PubMed] [Google Scholar]
- Hanumara R, Thompson W., Jr Percentage points of the extreme roots of a wishart matrix. Biometrika. 1968;55:505–512. [Google Scholar]
- Jung J, Sun B, Kwon D, Koller D, Foroud T. Allelic-based gene-gene interaction associated with quantitative traits. Genetic Epidemiology. 2009;33:332–343. doi: 10.1002/gepi.20385. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liang K, Zeger S. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22. [Google Scholar]
- Lin X, Zhang D. Inference in generalized additive mixed modelsby using smoothing splines. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 1999;61:381–400. [Google Scholar]
- Maity A, Carroll R, Mammen E, Chatterjee N. Testing in semiparametric models with interaction, with applications to gene–environment interactions. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2009;71:75–96. doi: 10.1111/j.1467-9868.2008.00671.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mandel J. Non-additivity in two-way analysis of variance. Journal of the American Statistical Association. 1961;56:878–888. [Google Scholar]
- Mandel J. A new analysis of variance model for non-additive data. Technometrics. 1971;13:1–18. [Google Scholar]
- Milliken G, Johnson D. Analysis of Messy Data, Volume II: Nonreplicated Experiments. Van Nostrand Reinhold; New York: 1989. [Google Scholar]
- Mukherjee B, Ko Y, VanderWeele T, Roy A, Park S, Chen J. Principal interactions analysis for repeated measures data: application to gene–gene and gene–environment interactions. Statistics in Medicine. 2012;31:2531–2551. doi: 10.1002/sim.5315. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nocedal J, Wright S. Numerical Optimization. Springer Verlag; New York: 1999. [Google Scholar]
- Patterson H, Thompson R. Recovery of inter-block information when block sizes are unequal. Biometrika. 1971;58:545–554. [Google Scholar]
- R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2012. [Google Scholar]
- Tukey J. One degree of freedom for non-additivity. Biometrics. 1949;5:232–242. [Google Scholar]
- Tukey J. The future of data analysis. The Annals of Mathematical Statistics. 1962;33:1–67. [Google Scholar]
- Wong M, Day N, Luan J, Chan K, Wareham N. The detection of gene–environment interaction for continuous traits: should we deal with measurement error by bigger studies or better measurement? International Journal of Epidemiology. 2003;32:51–57. doi: 10.1093/ije/dyg002. [DOI] [PubMed] [Google Scholar]
- Zhang A, Park S, Wright R, Weisskopf M, Mukherjee B, Nie H, Sparrow D, Hu H. HFE H63D polymorphism as a modifier of the effect of cumulative lead exposure on pulse pressure: the Normative Aging Study. Environmental Health Perspectives. 2010;118:1261–1266. doi: 10.1289/ehp.1002251. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zobel R, Wright M, Gauch H. Statistical analysis of a yield trial. Agronomy Journal. 1988;80:388–393. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.