Abstract
In genetic association analysis, a joint test of multiple distinct phenotypes can increase power to identify sets of trait-associated variants within genes or regions of interest. Existing multi-phenotype tests for rare variants make specific assumptions about the patterns of association with underlying causal variants and the violation of these assumptions can reduce power to detect association. Here we develop a general framework for testing pleiotropic effects of rare variants on multiple continuous phenotypes using multivariate kernel regression (Multi-SKAT). Multi-SKAT models effect sizes of variants on the phenotypes through a kernel matrix and performs a variance component test of association. We show that many existing tests are equivalent to specific choices of kernel matrices with the Multi-SKAT framework. To increase power of detecting association across tests with different kernel matrices, we developed a fast and accurate approximation of the significance of the minimum observed p-value across tests. To account for related individuals, our framework uses random effects for the kinship matrix. Using simulated data and amino acid and exome-array data from the METSIM study, we show that Multi-SKAT can improve power over single-phenotype SKAT-O test and existing multiple phenotype tests, while maintaining type I error rate.
Keywords: Multiple phenotypes, Gene-based test, Rare-variants, Pleiotropy, Phenotype kernel, SKAT, Copula, Related individuals, METSIM study
Introduction
Since the advent of array genotyping technologies, genome-wide association studies (GWAS) have identified numerous genetic variants associated with complex traits. Despite these many discoveries, GWAS loci explain only a modest proportion of heritability for most traits. This may be due, in part, to the fact that these association studies are underpowered to identify associations with rare variants(Korte and Farlow, 2013). To identify such rare variant associations, gene- or region-based multiple variant tests have been developed(Lee et al., 2014). By jointly testing rare variants in a target gene or region, these methods can increase power over a single variant test and are now used as a standard approach in rare variant analysis.
Recent GWAS results have shown that many GWAS loci are associated with multiple traits (Solovieff et al., 2013). Nearly 17% of variants in National Heart Lung and Blood Institute (NHLBI) GWAS categories are associated with multiple traits (Sivakumaran et al., 2011). For example, 44% of autoimmune risk single nucleotide polymorphisms (SNPs) have been estimated to be associated with two or more autoimmune diseases (Cotsapas et al., 2011). Detecting such pleiotropic effects is important to understand the underlying biological structure of complex traits. In addition, by leveraging cross-phenotype associations, the power to detect trait-associated variants can be increased.
Identifying the cross-phenotype effects requires a suitable joint or multivariate analysis framework that can leverage the dependence of the phenotypes. Various methods have been proposed for multiple phenotype analysis in GWAS (Ferreira and Purcell, 2009; Huang et al., 2011; Zhou and Stephens, 2014; Ried et al., 2012; Ray et al., 2016). Extending them, several groups have developed multiple phenotype tests for rare variants (Wang et al., 2015; Broadaway et al., 2016; Wu and Pankow, 2016; Lee et al., 2016; Sun et al., 2016; Maity et al., 2012; Yan et al., 2015; Zhan et al., 2017). For example, Wang et al. (2015) proposed a multivariate functional linear model (MFLM); Broadaway et al. (2016) used a dual-kernel based distance-covariance approach to test for cross phenotype effects of rare variants by comparing similarity in multivariate phenotypes to similarity in genetic variants (GAMuT)(Chiu et al., 2017); Wu et al. (Wu and Pankow, 2016) developed a score based sequence kernel association test for multiple traits, MSKAT, which has been shown to be similar in performance to GAMuT(Broadaway et al., 2016); and Zhan et al. (2017) proposed DKAT, which uses the dual kernel approach as in GAMuT but provides more robust performance when the dimension of phenotypes is high compared to the sample size.
Despite these developments, existing methods have important limitations. Most methods were developed under specific assumptions regarding the effects of the variants on multiple phenotypes, and hence lose power if the assumptions are violated (Ray et al., 2016). For example, if genetic effects are heterogeneous across multiple phenotypes, methods assuming homogeneous genetic effects can lose a substantial amount of power. Although there has been a recent attempt to combine analysis results from different models (Zhan et al., 2017), no scalable methods have been developed to evaluate the significance of the combined results in genome-wide scale analysis. In addition, most existing methods and software cannot adjust for relatedness between individuals; thus, to apply these methods, related individuals must be removed from the analysis to maintain type I error rate. For example, in the METabolic Syndrome In Men (METSIM) study ~ 15% of individuals are estimated to be related up to the second degree.
Here, we develop Multi-SKAT, a general framework that extends the mixed effect model-based kernel association tests to a multivariate regression framework while accounting for family relatedness. Mixed effect models have been widely used for rare-variant association tests. Popular rare variant tests such as SKAT(Wu et al., 2011) and SKAT-O(Lee et al., 2012b) are based on mixed effect models. By using kernels to relate genetic variants to multiple continuous phenotypes, Multi-SKAT allows for flexible modeling of the genetic effects on the phenotypes. The idea of using kernels for genotypes and phenotypes were previously used by the dual kernel approaches such as GAMuT and DKAT. However, in contrast to these two similarity-based methods, Multi-SKAT is multivariate regression based and hence provides a natural way to adjust for covariates and also can account for sample relatedness by incorporating random effects for the kinship matrix. Many of the existing methods for multiple phenotype rare variant tests can be viewed as special cases of Multi-SKAT with particular choices of kernels. Furthermore, to avoid loss of power due to model misspecification, we develop computationally efficient omnibus tests, which allow for aggregation of tests over several kernels and provide fast p-value calculation (Demarta and McNeil, 2005).
The article is organized as follows: in the first section, we present the multivariate mixed effect model and kernel matrices. We particularly focus on the phenotype-kernel and describe omnibus procedures that can aggregate results across different choices of kernels and kinship adjustment. In the next section we describe the simulation experiments that clearly demonstrate that Multi-SKAT tests have increased power to detect associations compared to existing methods like GAMuT, MSKAT and others in most of the scenarios. Further we applied Multi-SKAT to detect the cross-phenotype effects of rare nonsynonymous and protein-truncating variants on a set of nine amino acids measured on 8,545 Finnish men from the METSIM study.
Material and Methods
Single-phenotype region-based tests
To describe the Multi-SKAT tests, we first present the existing model of the single-phenotype gene or region-based tests. Let yk = (y1k, y2k, ⋯ , ynk)T be an n × 1 vector of the kth phenotype over n individuals; X an n × q matrix of the q non-genetic covariates including the intercept; Gj = (Gij, ⋯, Gnj)T is an n × 1 vector of the minor allele counts (0, 1, or 2) for a binary genetic variant j; and G = [G1, ⋯, Gm] is an n × m genotype matrix for m genetic variants in a target region. The regression model shown in equation (1) can relate m genetic variants to phenotype k,
(1) |
where αk is a q × 1 vector of regression coefficients of q non-genetic covariates, β.k = (β1k, ⋯, βmk)T is an m × 1 vector of regression coefficients of the m genetic variants, and ϵk is an n × 1 vector of non-systematic error term with each element following . To test for H0: β.k = 0, a variance component test under the mixed effects model have been proposed to increase power over the usual F-test(Wu et al., 2011). The variance component test assumes that the regression coefficients, β.k, are random variables and follow a centered distribution with variance τ2ΣG (see below). Under these assumptions, the test for β.k = 0 is equivalent to testing τ = 0. The score statistic for this test is
(2) |
where is the estimated mean of yk under the null hypothesis of no association. The test statistic Q asymptotically follows a mixture of chi-squared distributions under the null hypothesis and p-values can be computed by inverting the characteristic function (Davies, 1980).
The kernel matrix ΣG plays a critical role; it models the relationship among the effect sizes of the variants on the phenotypes. Any positive semidefinite matrix can be used for ΣG providing a unified framework for the region-based tests. A frequent choice of ΣG is a sandwich type matrix ΣG = WRGW, where W = diag(w1, .., wm) is a diagonal weighting matrix for each variant, and RG is a correlation matrix between the effect sizes of the variants. RG = Im×m implies uncorrelated effect sizes and corresponds to SKAT, and RG = 1m1mT corresponds to the burden test, where Im×m is an m × m diagonal matrix and 1m = (1, ⋯ 1)T is an m × 1 vector with all elements being unity. Furthermore, a linear combination of these two matrices corresponds to RG = ρ1m1mT + (1 – ρ)Im×m, which is used for SKAT-O (Lee et al., 2012b).
Multiple-phenotype region-based tests
Extending the idea of using kernels, we build a model for multiple phenotypes. The multivariate linear model shown in equation (3) can relate genetic variants to K correlated phenotypes,
(3) |
where Y = (y1, ⋯ , yK) is an n × K phenotype matrix; A is a q × K matrix of coefficients of X; B = (βij) is an m × K matrix of coefficients where βij denotes the effect of the ith variant on the jth phenotype and E is an n × K matrix of non-systematic errors. Let vec(·) denote the matrix vectorization function, and then vec(E) follows N(0, In ⊗ V), where V is a K × K covariance matrix and ⊗ represents the Kronecker product.
In addition to assuming that β.k follows a centered distribution with covariance τ2ΣG, we further assume that βi. = (βi1, ⋯ , βiK)T, which is the vector of regression coefficients of variant i for K multiple phenotypes, follows a centered distribution with covariance τ2ΣP, which implies that vec(B) follows a centered distribution with covariance τ2ΣG ⊗ ΣP. As before, the null hypothesis H0 : vec(B) = 0 is equivalent to τ = 0. The corresponding score test statistic is
(4) |
where and are the estimated mean and covariance of Y under the null hypothesis.
ΣP plays a similar role as ΣG but with respect to phenotypes. ΣP represents a kernel in the phenotypes space and models the relationship among the effect sizes of a variant on each of the phenotypes. Any positive semidefinite matrix can be used as ΣP.
The proposed approach provides a double-flexibility in modeling. Through the choice of structures for ΣG and ΣP, we can control the dependencies of genetic effects. Additionally, similar to SKAT, the use of a sandwich type matrix WRGW for ΣG allows us to upweight rare variants by using Beta(1, 25) weights as in Wu et al (Wu et al., 2011). Most of our hypotheses about the underlying genetic structure of a set of phenotypes can be modeled through varying structures of these two matrices.
Phenotype kernel structure ΣP
The use of ΣG has been extensively studied previously in literature (Wu et al., 2011; Lee et al., 2012b,a). Here we propose several choices for ΣP and study their effect from a modeling perspective.
Homogeneous (Hom) Kernel
It is possible that effect sizes of a variant on different phenotypes are homogeneous, in which case βj1 = ⋯ = βjK. Under this assumption,
(5) |
Under ΣP,Hom, the effect sizes βjk, (k = 1, ⋯ K) for a variant j are the same for all the phenotypes.
Heterogeneous (Het) Kernel
Effect sizes of a variant on different phenotypes can be heterogeneous in which βj1 ≠ ⋯ ≠ βjK. Under this assumption, we can construct
(6) |
The ΣP,Het implies that the effect sizes () are uncorrelated among themselves. This also indicates that the correlation among the phenotypes is not affected by this particular region or gene.
Phenotype Covariance (PhC) Kernel
We may model ΣP as proportional to the estimated residual covariance across the phenotypes as,
(7) |
where is the estimated covariance matrix among the phenotypes. This model assumes that the correlation between the effect sizes is proportional to that between the residual phenotypes after adjusting for the non-genetic covariates.
Principal Component (PC) Kernel
Principal component analysis (PCA) is a popular tool for multivariate analysis. In multiple phenotype tests, PC-based approaches have been used to reduce the dimension in phenotypes (Aschard et al., 2014). Here we show that PC-based approach can be included in our framework. Let L = (L1, ⋯ , LK) be the loading matrix with each column Li produces the ith PC score. In Appendix A, we show that using is equivalent to assuming heterogeneous effects with all PCs as phenotypes. Instead of using all the PC’s, we can use selected PC’s that represent the majority of cumulative variation in phenotypes. For example, we can jointly test the PC’s that have cumulative variance of 90%. If the top t PC’s have been chosen for analysis using ν% cumulative variance as cutoff, we can use
where Lsel = [L1, ⋯ , Lt, 0, ⋯ , 0] and 0 represents a vector of 0’s of appropriate length.
Relationship with other Multiple-Phenotype rare variant tests
We have proposed a uniform framework of Multi-SKAT tests that depend on ΣG and ΣP. There are certain specific choices of kernels that correspond to other published methods.
Using ΣP,PhC and ΣG = WImWT is identical to the GAMuT(Broadaway et al., 2016) with the projection phenotype kernel and the MSKAT with the Q statistic (Wu and Pankow, 2016).
Using and ΣG = WImWT is identical to GAMuT(Broadaway et al., 2016) with the linear phenotype kernel and the MSKAT with the Q′ statistic (Wu and Pankow, 2016).
Using ΣP,Hom and ΣG = WImWT is identical to hom-MAAUSS (Lee et al., 2016).
Using ΣP,Het and ΣG = WImW T is identical to het-MAAUSS (Lee et al., 2016) and MF-KM (Yan et al., 2015).
For the detailed proof, please see Appendix B.
Minimum p-value based omnibus tests (minP & minPcom)
The model and the corresponding test of association that we propose has two parameters, ΣG and ΣP, which are absent in the null model of no association. Since our test is a score test, ΣG and ΣP cannot be estimated from the data. One possible approach is to select ΣG and ΣP based on prior knowledge; however, if the selected ΣG and ΣP do not reflect underlying biology, the test may have substantially reduced power (Ray et al., 2016; Lee et al., 2016). In an attempt to achieve robust power, we aggregate results across different ΣG and ΣP using the minimum of p-values from different kernels.
Although this omnibus test approach has been used in rare variant tests and multiple phenotype analysis for combining multiple kernels from genotypes and phenotypes (Zhan et al., 2017; Wu et al., 2013; Urrutia et al., 2015; He et al., 2017), it is challenging to calculate the p-value, since the minimum p-value does not follow the uniform distribution. One possible approach is using permutation or perturbation to calculate the monte-carlo p-value (Urrutia et al., 2015; Zhan et al., 2017); however, this approach is computationally too expensive to be used in genome-wide analysis. To address it, here we propose a fast copula based p-value calculation for Multi-SKAT, which needs only a small number of resampling steps to calculate the p-value.
Suppose ph is the p-value for Qh with given hth ΣG and ΣP , h = 1, ⋯ , b, and TP = (p1, ⋯ , pb)T is an b × 1 vector of p-values of b such Multi-SKAT tests. The minimum p-value test statistic after the Bonferroni adjustment is b × pmin, where pmin is the minimum of the b p-values. In the presence of positive correlation among the tests, this approach is conservative and hence might lack power of detection. Rather than using Bonferroni corrected pmin, more accurate results can be obtained if the joint null distribution or more specifically the correlation structure of TP can be estimated. Here we adopt a resampling based approach to estimate this correlation structure. Note that our test statistic is equivalent to
(8) |
where . Under the null hypothesis S approximately follows an uncorrelated normal distribution N(0, InK). Using this, we propose the following resampling algorithm
Step 1. Generate nK samples from an N(0, 1) distribution, say SR.
Step 2. Calculate b different test statistics as for all the choices of ΣP and calculate p-values.
Step 3. Repeat the previous steps independently for R(= 1000) iterations, and calculate the correlation between the p-values of the tests from the R resampling p-values.
With the estimated null correlation structure, we use a Copula to approximate the joint distribution of TP (Demarta and McNeil, 2005; He et al., 2017). Copula is a statistical approach to construct joint multivariate distribution using marginal distribution of each variable and correlation structure. Since marginally each test statistic Q follows a mixture of chi-square distributions, which has a heavier tail than normal distribution, we propose to use a t-Copula to approximate the joint distribution, i.e, we assume the joint distribution of TP to be multivariate t with the estimated correlation structure. The final p-value for association is then calculated from the distribution function of the assumed t-Copula.
When calculating the correlation across the p-values, Pearson’s correlation coefficient can be unreliable since it depends on normality and homoscedasticity assumptions. To avoid such assumptions we recommend estimating the null-correlation matrix of the p-values through Kendall’s tau (τ), which is a non-parametric approach based on concordance of ranks.
The minimum p-value approach can be used to combine different ΣP given ΣG, or combine both ΣP and ΣG. For example two ΣG’s corresponding to SKAT (WW) and Burden kernels () and four ΣP’s (ΣP,Hom, ΣP,Het, ΣP,PhC, ΣP,PC–09) can be comnibed, which results in the omnibus test of these eight different tests. To differentiate the latter, we will call it minPcom which combines SKAT and Burden type kernels of ΣG.
Adjusting for relatedness
We formulated equation (3) and corresponding tests under the assumption of independent individuals. If individuals are related, this assumption is no longer valid, and the tests may have inflated type I error rate. Since our method is regression-based, we can relax the independence assumption by introducing a random effect term to account for the relatedness among individuals.
Let Φ be the kinship matrix of the individuals and Vg is a co-heritability matrix, denoting the shared heritability between the phenotypes. Extending the model presented in equation(3), we incorporate Φ and Vg as
(9) |
where Z is an n × K matrix with vec(Z) following N(0, Φ ⊗ Vg). Z represents a matrix of random effects arising from shared genetic effects between individuals due to the relatedness. The remaining terms are the same as in equation (3). The corresponding score test statistic is
(10) |
where and is the estimated covariance matrix of vec(Y) under the null hypothesis. Similar to the previous versions for unrelated individuals, QKin asymptotically follows a mixture of chi-square under the assumption of no association.
This approach depends on the estimation of the matrices Φ, Vg and V. The kinship matrix Φ can be estimated using the genome-wide genotype data (Manichaikul et al., 2010). Several of the published methods like LD-Score(Bulik-Sullivan et al., 2015), PHENIX(Dahl et al., 2016) and GEMMA(Zhou and Stephens, 2014; Zhou et al., 2013) can jointly estimate Vg and V. In our numerical analysis, we have used PHENIX. This is an efficient method to fit local maximum likelihood variance components in a multiple phenotype mixed model through an E-M algorithm.
Once the matrices Φ, Vg and V are estimated, we compute the asymptotic p-values for QKin by using a mixture of chi-square distribution. The computation of QKin requires large matrix multiplications, which can be time and memory consuming. To reduce computational burden, we employ several transformations. We perform an eigen-decomposition on the kinship matrix Φ as Φ = UΛUT, where U is an orthogonal matrix of eigenvectors and Λ is a diagonal matrix of corresponding eigenvalues. We obtain the transformed phenotype matrix as , the transformed covariate matrix as , the transformed random effects matrix and transformed residual error matrix . Rquation (9) can be transformed into
(11) |
All the properties of the tests developed from equation (3) are directly applicable to those from equation(11). QKin can be computed from this transformed equation as,
(12) |
where , is the estimated mean of under the null hypothesis and . Asymptotic p-values can be obtained from the corresponding mixture of chi-squares distribution. Further, omnibus strategies for the tests developed from equation (3) are applicable in this case with similar modifications. For example, the resampling algorithm for minimum p-value based omnibus test can be implemented here as well by noting that approximately follows an uncorrelated normal distribution.
Simulations
We carried out extensive simulation studies to evaluate the type I error control and power of Multi-SKAT tests. For type I error simulations without related individuals and all power simulations, we generated 10,000 chromosomes over 1Mbp regions using a coalescent simulator with European demographic model (Schaffner et al., 2005). The MAF spectrum of the simulated variants is shown in Supplementary Figure S6, showing that most of the variants are rare variants. Since the average length of the gene is 3 kbps we randomly selected a 3 kbps region for each simulated dataset to test for associations. For the type I error simulations with related individuals, to have a realistic kinship structure, we used the METSIM study genotype data.
Phenotypes were generated from the multivariate normal distribution as
(13) |
where yi = (yi1, ⋯ , yiK)T is the outcome vector, Gj is the genotype of the jth variant, and βj is the corresponding effect size, and V is a covariance of the non-systematic error term. We use V to define level of covariance between the traits. I is a k × 1 indicator vector, which has 1 when the corresponding phenotype is associated with the region and 0 otherwise. For example, if there are 5 phenotypes and the last three are associated with the region, I = (0, 0, 1, 1, 1)T.
To evaluate whether Multi-SKAT can control type I error under realistic scenarios, we simulated a dataset with 9 phenotypes with a correlation structure identical to that of 9 amino acid phenotypes in the METSIM data (See Supplementary Figure S1). Phenotypes were generated using equation (13) with β = 0. Total 5, 000, 000 datasets with 5, 000 individuals were generated to obtain the empirical type-I error rates at α = 10−4, 10−5 and 2.5 × 10−6, which are corresponding to candidate gene studies to test for 500 and 5000 genes and exome-wide studies to test for all 20, 000 protein coding genes, respectively.
Next, we evaluated type I error controls in the presence of related individuals. To have a realistic related structure we used the METSIM study genotype data. We generated a random subsample of 5000 individuals from the METSIM study individuals and generated null values for the 9 phenotypes from MVN(0, Ve), where is the estimated kinship matrix of the 5000 selected individuals, and are estimated co-heritability and residual variance matrices respectively for these individuals as estimated using the MPMM function in the PHENIX R-package (version 1.0). For each set of 9 phenotypes, we performed the Multi-SKAT tests for a randomly selected 5000 genes in the METSIM data. For the details about the data, see next section. We carried out this procedure 1000 times and obtained 5, 000, 000 p-values, and estimated type I error rate as proportions of p-values smaller than the given level α.
Our simulation studies focus on evaluating the power of the proposed tests when the number of phenotypes are 5 or 6. Power simulations were performed both in situations when there was no pleiotropy (i.e., only one of the phenotypes was associated with the causal variants) and also when there was pleiotropy. Under pleiotropy, since it is unlikely that all the phenotypes are associated with genotypes in the region, we varied the number of phenotypes associated. For each associated phenotype, 30% or 50% of the rare variants (MAF < 1%) were randomly selected to be causal variants. We modeled the rarer variants to have stronger effect, as ∣βj∣ = c∣log10(MAFj)∣. We used c = 0.3 which yields ∣βj∣ = 0.9 for variants with MAF= 10−3. Our choice of β yielded the average heritability of associated phenotypes between 1% to 4%. We also considered situations that all causal variants were trait-increasing variants (i.e. positive β) or 20 % of causal variants were trait-decreasing variants (i.e. negative β). Empirical power was estimated from 1000 independent datasets at exome-wide α = 2.5 × 10−6.
In type I error and power simulations, we compared the following tests:
Bonferroni adjusted minimum p-values from gene-based test (SKAT, Burden or SKAT-O) on each phenotype (minPhen)
Multi-SKAT with ΣP,Hom (Hom)
Multi-SKAT with ΣP,Het (Het)
Multi-SKAT with ΣP,PhC (PhC)
Multi-SKAT with ΣP,PC–0.9 (PC-Sel)
Minimum P-value of Hom, Het, PhC and PC-Sel using Copula (minP)
Minimum P-value of Hom, Het, PhC and PC-Sel with ΣG being SKAT and Burden, using Copula (minPcom)
For the Multi-SKAT tests, we used two different ΣG’s corresponding to SKAT (i.e. ΣG = WW) and Burden tests (i.e. ). For the variant weighting matrix W = diag(w1, ⋯ , wm), we used wj = Beta(MAFj, 1, 25) function to upweight rarer variants, as recommended by Wu et al. (Wu et al., 2011).
Computation Time
We estimated the computation time of Multi-SKAT tests and the existing methods. Using simulated datasets of 5000 related and unrelated individuals with 10 phenotypes and 20 genetic variants, we estimated the computation time of Multi-SKAT tests with and without kinship adjustments. To compare the computation performance of Multi-SKAT tests with the existing methods, we generated datasets of unrelated individuals with five different sample sizes (n = 1000, 2000, 5000, 10000, 15000 and 20000) and four different number of variants (m = 10, 20, 50, 100). For each simulation setup, we generated 100 datasets and obtained the average value of the computation time.
Analysis of the METSIM study exomechip data
To investigate the cross-phenotype roles of low frequency and rare variants on amino acids, we analyzed data on 8545 participants of the METSIM study on whom all 9 amino acids (Alanine, Leucine, Isoleucine, Glycine,Valine, Tyrosine, Phenylalanine, Glutamine, Histidine) were measured by proton nuclear magnetic resonance spectroscopy(Teslovich et al., 2018). Individuals were genotyped on the Illumina ExomeChip and OmniExpress arrays and we included individuals that passed sample QC filters (Huyghe et al., 2013). The kinship between the individuals was estimated via KING (version 2.0) (Manichaikul et al., 2010). We adjusted the amino acid levels for age, age2 and BMI and inverse-normalized the residuals. The phenotype correlation matrix after covariate adjustment is shown in Figure 3 and Supplementary Figure S1. Subsequently, we estimated the genetic heritability matrix and the residual covariance matrix using the MPMM function from PHENIX (Dahl et al., 2016) R package.
We included rare (MAF < 1%) nonsynonymous and protein-truncating variants with a total rare minor allele count of at least 5 for genes that had at least 3 rare variants leaving 5207 genes for analysis. We set a stringent significance threshold at 9.6 × 10−6 corresponding to the Bonferroni adjustment for 5207 genes. Further, we also considered a less stringent threshold of 10−4, corresponding to a candidate gene study of 500 genes, as suggestive to study the associations which were not significant but close to the threshold.
Results
Type I Error simulations
We estimated empirical type I error rates of the Multi-SKAT tests with and without related individuals. For unrelated individuals, we simulated 5, 000 individuals and 9 phenotypes based on the correlation structure for the amino acids phenotypes in the METSIM study data. For related individuals, we simulated 5, 000 individuals using the kinship matrix for randomly chosen METSIM individuals (see the Method section). We performed association tests and estimated type I error rate as the proportion of p-values less than the specified α levels. Type I error rates of the Multi-SKAT tests were well maintained at α = 10−4, 10−5 and 2.5 × 10−6 for both unrelated and related individuals (Table 1), which correspond to candidate gene studies of 500 and 5000 genes and exome-wide studies to test for all 20, 000 protein coding genes, respectively. For example, at level α = 2.5 × 10−6, the largest empirical type I error rate from any of the Multi-SKAT tests was 3.4 × 10−6, which was within the 95% confidence interval (CI = (1.6 × 10−6, 4 × 10−6)).
Table 1:
Level | Σ G | minPhen | Hom | Het | PhC | PC-Sel | minP | minPcom | |
---|---|---|---|---|---|---|---|---|---|
2.5 × 10−06 | SKAT | 2.4 × 10−06 | 2.6 × 10−06 | 2.6 × 10−06 | 2.8 × 10−06 | 2.6 × 10−06 | 3.4 × 10−06 | 2.8 × 10−06 | |
Burden | 2.4 × 10−06 | 2.8 × 10−06 | 2.6 × 10−06 | 2.6 × 10−06 | 2.6 × 10−06 | 3.0 × 10−06 | |||
Independent samples (without kinship adjustment) |
10−05 | SKAT | 9.6 × 10−06 | 9.6 × 10−06 | 9.8 × 10−06 | 9.8 × 10−06 | 9.8 × 10−06 | 9.4 × 10−06 | 1.2 × 10−05 |
Burden | 9.4 × 10−06 | 9.6 × 10−06 | 9.8 × 10−06 | 9.6 × 10−06 | 9.6 × 10−06 | 9.6 × 10−06 | |||
10−04 | SKAT | 9.6 × 10−05 | 9.4 × 10−05 | 9.6 × 10−05 | 9.8 × 10−05 | 9.8 × 10−05 | 9.8 × 10−05 | 1.1 × 10−04 | |
Burden | 9.8 × 10−05 | 9.6 × 10−05 | 9.4 × 10−05 | 9.7 × 10−05 | 9.6 × 10−05 | 9.7 × 10−05 | |||
2.5 × 10−06 | SKAT | 2.2 × 10−06 | 2.4 × 10−06 | 2.4 × 10−06 | 2.8 × 10−06 | 2.6 × 10−06 | 3.0 × 10−06 | 2.6 × 10−06 | |
Burden | 2.2 × 10−06 | 2.6 × 10−06 | 2.4 × 10−06 | 2.6 × 10−06 | 2.4 × 10−06 | 3.2 × 10−06 | |||
Related samples (with kinship adjustment) |
10−05 | SKAT | 9.6 × 10−06 | 9.8 × 10−06 | 9.8 × 10−06 | 9.8 × 10−06 | 9.8 × 10−06 | 9.4 × 10−06 | 1.4 × 10−05 |
Burden | 9.7 × 10−06 | 9.6 × 10−06 | 9.4 × 10−06 | 9.4 × 10−06 | 9.4 × 10−06 | 9.4 × 10−06 | |||
10−04 | SKAT | 9.4 × 10−05 | 9.6 × 10−05 | 9.7 × 10−05 | 9.8 × 10−05 | 9.4 × 10−05 | 9.8 × 10−05 | 1.2 × 10−04 | |
Burden | 9.5 × 10−05 | 9.7 × 10−05 | 9.6 × 10−05 | 9.8 × 10−05 | 9.8 × 10−05 | 9.7 × 10−05 |
Power simulations
We compared the empirical power of the minPhen (Bonferroni adjusted minimum p-value for the phenotypes) and Multi-SKAT tests. For each simulation setting, we generated 1, 000 sequence datasets of 5, 000 unrelated individuals and for each test estimated empirical power as the proportion of p-values less than α = 2.5×10−6, reflecting Bonferroni correction for testing 20, 000 independent genes. Since the Hom and Het tests are identical to hom-MAAUSS and het-MAAUSS, respectively, and using PhC is identical to both GAMuT (with projection phenotype kernel) and MSKAT, our power simulation studies effectively compare the existing multiple phenotype tests.
In Figure 1, we show the results for 5 phenotypes with compound symmetric correlation structure with the correlation 0.3 or 0.7, where 30% of rare variants (MAF < 0.01) were positively associated with 1, 2 or 3 phenotypes. Since it is unlikely that all the phenotypes are associated with the region, we restricted the number of associated phenotypes to at most 3. In most scenarios, PhC, PC-Sel and Het had greater power among the Multi-SKAT tests with fixed phenotype kernels (i.e. Hom, Het, PhC and PC-Sel) while minP, maintained high power as well. For example, when the correlation between the phenotypes was 0.3 (i.e. ρ = 0.3) and SKAT kernel was used for the genotype kernel ΣG, if 3 phenotypes were associated with the region, minP and PhC were more powerful than the other tests. If the correlation between the phenotypes was ρ = 0.7 and Burden kernel was used for genotype kernel ΣG, Het, PC-Sel and minP had higher power than the rest of the tests when 2 phenotypes were associated. It is noteworthy that Hom had the lowest power in all the scenarios of Figure 1.
Figure 2 demonstrates scenarios involving 6 phenotypes and clustered correlation structures where PhC was outperformed by other choices of the phenotype kernel ΣP. When all three phenotype clusters had associated phenotypes and the correlation within the clusters was low (ρ = 0.3) (Figure 2, upper panel), Hom and minP tests outperformed PhC when the SKAT kernel was used. This may be because that the phenotype correlation structure did not reflect the genetic association pattern. When 2 small clusters had high within-cluster correlation (ρ = 0.7) and one large cluster had low within-cluster correlation (ρ = 0.3) (Figure 2, lower panel), Het and minP had higher power than PhC.
When 20% of causal variants were trait-decreasing variants (80% trait-increasing), the power of Multi-SKAT tests with Burden ΣG was reduced (Supplementary Figure S2 and S3). This is because the association signals were attenuated due to the mix of trait-increasing and trait-decreasing variants. Since SKAT is robust regardless of the association direction, power with SKAT ΣG was largely maintained. The relative performance of methods with different ΣP given ΣG was quantitatively similar to the results without trait-decreasing variants.
Further, we estimated power of minPcom, which combines tests across phenotype (ΣP) and genotype ΣG kernels. The power of minPcom was evaluated for the compound symmetric phenotype correlation structure presented in Figure 1 and was compared with the two minP tests of SKAT (minP-SKAT) and Burden (minP-Burden) ΣG kernels. Figure 3 shows empirical power with and without trait-decreasing variants. When all genetic effect coefficients were positive (Figure 3, left panel) the performances of minP-SKAT and minP-Burden were similar for both the situations where the correlation between the phenotypes were low (i.e. ρ = 0.3) and high (i.e. ρ = 0.7). When 20% of genetic effect coefficients were negative (Figure 3, right panel), as expected, the power of minP-Burden was substantially decreased. Across all the situations, the power of minPcomwas similar to the most powerful minP with fixed genotype kernel ΣG. When 50% of variants were causal variants and all genetic effect coefficients were positive (Supplementary Figure S4, left panel), minP-Burden was more powerful than minP-SKAT, and minPcom had similar power than minP-Burden.
Overall, our simulation results show that the omnibus tests, especially minPcom, had robust power throughout all the simulation scenarios considered. When ΣG and ΣP were fixed, power depended on the model of association and the correlation structure of the phenotypes. Overall, the proposed Multi-SKAT tests generally outperformed the single phenotype test (minPhen), even when only one phenotype was associated with genetic variants.
Application to the METSIM study exomechip data
Inborn errors of amino acid metabolism cause mild to severe symptoms including type 2 diabetes(Stančáková et al., 2012; Würtz et al., 2012, 2013) and liver diseases(Tajiri and Shimizu, 2013) among others. Amino acid levels are perturbed in certain disease states, e.g., glutamic and aspartic acid levels are reduced in Alzheimer disease brains(Allan Butterfield and Pocernich, 2003); Isoleucine, glycine, alanine, phenylalanine, and threonine levels are increased in cerebo-spinal fluid (CSF) of individuals with motor neuron disease(de Belleroche et al., 2003). To find rare variants associated with the 9 measured amino acid levels, we applied the Multi-SKAT tests to the METSIM study data(Teslovich et al., 2018). The MAF spectrum of the genotyped variants is shown in Supplementary Figure S6, showing that most of the variants are rare variants. We estimated the relatedness between individuals by KING (Manichaikul et al., 2010), and coheritability of the amino acid phenotypes and the corresponding residual variance using PHENIX(Dahl et al., 2016) (Supplementary Figure S1). Among the 8,545 METSIM participants with non-missing phenotypes and covariates, 1,332 individuals had a second degree or closer relationship with one or more of the METSIM participants. A total of 5, 207 genes with at least three rare variants were included in our analysis. The Bonferroni corrected significance threshold was α = 0.05/5207 = 9.6 × 10−6. Further we used a less significant cutoff of α = 10−4 for a gene to be suggestive. After identifying associated genes, we carried out backward elimination procedure (Appendix C) to investigate which phenotypes are associated with the gene. This procedure iteratively removes phenotypes based on minPcom p-values.
QQ plots for the p-values obtained by minPhen and Multi-SKAT omnibus tests (minP and minPcom) are displayed in Figure 4. Due to the presence of several strong associations, for the ease of viewing, any p-value < 10−12 was collapsed to 10−12. The QQ plots are well calibrated with slight inflation in tail areas. The genomic-control lambda (λGC) varied between 0.97 and 1.04, which indicates no inflation of test statistics. Table 2 shows genes with p-values less than 10−4 for minPhen or minPcom. Table 5 shows SKAT-O p-values for each of the gene - amino acid pairs. Among the eight significant or suggestive genes displayed in Table 2, minPcom provides more significant p-values than minPhen for six genes: Glycine decarboxylase (GLDC [MIM: 238300]), Histidine ammonia-lyase (HAL [MIM: 609457]), Phenylalanine hydroxylase (PAH [MIM: 612349]), Dihydroorotate dehydrogenase (DHODH [MIM: 126064]), Mediator of RNA polymerase II transcription subunit 1 (MED1 [MIM: 604311]), Serine/Threonine Kinase 33 (STK33 [MIM: 607670]). Interestingly, PAH and MED1 are significant by minPcom, but not significant by minPhen. PAH encodes an Phenylalanine hydroxylase, which catalyzes the hydroxylation of the aromatic side-chain of phenylalanine to generate tyrosine. MED1 is involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. This gene does not show any single phenotype association, but cross-phenotype analysis produced evidence of association. Using backward elimination we find that Phenylalanine and Tyrosine are the last two phenotypes to be eliminated (Supplementary Table S2). We have provided a detailed description of the function and clinical implications of the significant and suggestive genes in Supplementary Table S4.
Table 2:
Gene | Chromosome | Rare SNPs | MAC | minPhen | minP-SKAT | minP-Burden | minPcom |
---|---|---|---|---|---|---|---|
GLDC | 9 | 4 | 183 | 2.2 × 10−63 | 1.1 − 10−72 | 9.7 × 10−64 | 2.3 × 10−72 |
HAL | 3 | 6 | 42 | 2.9 × 10−08 | 1.1 × 10−05 | 4.2 × 10−11 | 9.5 × 10−11 |
DHODH | 16 | 5 | 90 | 1.3 × 10−06 | 9.7 × 10−08 | 7.7 × 10−06 | 1.1 × 10−07 |
PAH | 12 | 6 | 27 | 6.1 × 10−04 | 1.4 × 10−06 | 1.3 × 10−06 | 1.9 × 10−06 |
MED1 | 17 | 3 | 147 | 6.2 × 10−01 | 3.9 × 10−06 | 3.4 × 10−04 | 4.9 × 10−06 |
STK33 | 11 | 6 | 180 | 8.9 × 10−01 | 3.2 × 10−05 | 3.3 × 10−02 | 4.2 × 10−05 |
ALDH1L1 | 3 | 8 | 103 | 8.4 × 10−08 | 3.8 × 10−04 | 4.2 × 10−05 | 5.4 × 10−05 |
BCAT2 | 19 | 3 | 133 | 1.1 × 10−05 | 9.3 × 10−03 | 3.4 × 10−05 | 6.2 × 10−05 |
Table 5:
Gene | Ala | Gln | Gly | His | Ile | Leu | Phe | Tyr | Val |
---|---|---|---|---|---|---|---|---|---|
GLDC | 2.9 × 10−01 | 2.9 × 10−03 | 2.5 × 10−64 | 1.0 × 10−02 | 1.5 × 10−01 | 3.4 × 10−02 | 8.5 × 10−01 | 6.6 × 10−01 | 4.7 × 10−03 |
HAL | 9.9 × 10−01 | 1.1 × 10−01 | 3.1 × 10−01 | 3.2 × 10−09 | 2.6 × 10−01 | 2.5 × 10−01 | 6.6 × 10−01 | 1.2 × 10−01 | 3.5 × 10−01 |
DHODH | 1.4 × 10−07 | 9.9 × 10−01 | 9.0 × 10−02 | 3.6 × 10−01 | 7.7 × 10−01 | 1.7 × 10−01 | 1.9 × 10−01 | 3.0 × 10−01 | 1.0 × 10−02 |
PAH | 7.9 × 10−01 | 3.0 × 10−01 | 2.8 × 10−01 | 8.6 × 10−01 | 4.5 × 10−01 | 8.1 × 10−01 | 6.8 × 10−05 | 4.0 × 10−01 | 9.9 × 10−01 |
MED1 | 1.0 × 10−01 | 5.1 × 10−01 | 7.9 × 10−01 | 5.9 × 10−01 | 3.3 × 10−01 | 1.4 × 10−01 | 6.9 × 10−01 | 6.8 × 10−02 | 6.7 × 10−01 |
STK33 | 8.4 × 10−01 | 8.2 × 10−01 | 3.5 × 10−01 | 5.7 × 10−01 | 6.6 × 10−01 | 1.0 × 10−01 | 9.9 × 10−01 | 8.1 × 10−01 | 4.1 × 10−01 |
ALDH1L1 | 6.7 × 10−01 | 7.6 × 10−01 | 9.3 × 10−09 | 7.3 × 10−01 | 3.4 × 10−01 | 3.2 × 10−01 | 9.9 × 10−01 | 5.9 × 10−01 | 2.5 × 10−01 |
BCAT2 | 1.4 × 10−01 | 2.6 × 10−01 | 8.2 × 10−01 | 9.9 × 10−01 | 1.0 × 10−01 | 2.1 × 10−02 | 5.4 × 10−01 | 5.0 × 10−01 | 1.2 × 10−06 |
Among other genes, GLDC has the smallest p-value. Variants in GLDC are known to cause glycine encephalopathy (MIM: 605899) (Hughes, 2009). To investigate whether our results were supported by single phenotype associations, we applied SKAT-O to each of the 9 amino acid phenotypes. Univariate SKAT-O test with each of these phenotype reveals that this gene has a strong association with Glycine (p-value = 2.5 × 10−64, Table 5). Among the variants genotyped in this gene, rs138640017 (MAF = 0.009) appears to drive the association (single variant p-value = 1.0 × 10−64). Variants in HAL cause histidinemia (MIM: 235800) in human and mouse. This gene shows significant univariate association with Histidine (SKAT-O p-value = 3.2 × 10−8, Table 5) which in turn is influenced by the association of rs141635447 (MAF = 0.005) with Histidine (single variant p-value = 3.7 × 10−13). Similarly, variants in DHODH, which have been previously found to be associated with postaxial acrofacial dysostosis (MIM: 263750), have significant cross-phenotype association although the result us mostly driven by the association with Alanine (SKAT-O p-value =1.4 × 10−07, Table 5). ALDH1L1 catalyzes conversion of 10-formyltetrahydrofolate to tetrahydrofolate. Published results show that common variant rs1107366, 5kb upstream of ALDH1L1, is associated with Glycine-Seratinine ratio (Xie et al., 2013). Down-regulation of BCAT2 in mice causes elevated serum branched chain amino acid levels and features of maple syrup urine disease.
Table 3 shows p-values of Multi-SKAT kernel and minP with two genotype kernels (SKAT and Burden). Among phenotype kernels, PhC and Het generally produced the smallest p-values. We further applied Multi-SKAT tests without kinship adjustment on the whole METSIM study individuals. As expected, this produced inflation in QQ plots (Supplementary Figure S5).
Table 3:
Gene | Multi-SKAT |
|||||||||
---|---|---|---|---|---|---|---|---|---|---|
Σ G | Name | Chromosome | Rare SNPs | MAC | minPhen | Hom | Het | PhC | PC-Sel | minP |
GLDC | 9 | 4 | 183 | 6.8 × 10−63 | 1.3 × 10−03 | 1.3 × 10−17 | 2.9 × 10−73 | 5.3 × 10−59 | 1.1 × 10−72 | |
DHODH | 16 | 5 | 90 | 5.1 × 10−08 | 1.4 × 10−01 | 3.2 × 10−03 | 3.1 × 10−08 | 7.1 × 10−06 | 9.7 × 10−08 | |
PAH | 12 | 6 | 27 | 4.3 × 10−05 | 7.9 × 10−02 | 1.8 × 10−05 | 3.9 × 10−07 | 6.3 × 10−04 | 1.4 × 10−06 | |
SKAT | MED1 | 17 | 3 | 147 | 8.7 × 10−02 | 3.4 × 10−01 | 9.3 × 10−07 | 2.8 × 10−04 | 1.9 × 10−04 | 3.9 × 10−06 |
HAL | 12 | 6 | 42 | 3.8 × 10−05 | 5.5 × 10−01 | 3.5 × 10−03 | 2.9 × 10−06 | 1.5 × 10−04 | 1.1 × 10−05 | |
STK33 | 11 | 6 | 180 | 4.0 × 10−01 | 2.3 × 10−01 | 8.9 × 10−06 | 1.1 × 10−03 | 5.7 × 10−03 | 3.2 × 10−05 | |
ALDH1L1 | 3 | 8 | 103 | 4.4 × 10−02 | 1.1 × 10−01 | 2.2 × 10−02 | 1.6 × 10−04 | 4.2 × 10−04 | 3.8 × 10−04 | |
BCAT2 | 19 | 3 | 133 | 9.7 × 10−04 | 5.0 × 10−01 | 8.1 × 10−02 | 2.4 × 10−03 | 3.5 × 10−03 | 9.3 × 10−03 | |
GLDC | 9 | 4 | 183 | 1.4 × 10−59 | 7.5 × 10−04 | 1.5 × 10−15 | 1.3 × 10−64 | 2.1 × 10−54 | 9.7 × 10−64 | |
HAL | 12 | 6 | 42 | 2.4 × 10−08 | 2.8 × 10−05 | 4.6 × 10−01 | 4.2 × 10−12 | 7.9 × 10−03 | 4.2 × 10−11 | |
PAH | 12 | 6 | 27 | 4.3 × 10−05 | 2.0 × 10−01 | 5.3 × 10−04 | 7.2 × 10−06 | 9.3 × 10−03 | 1.3 × 10−06 | |
Burden | DHODH | 16 | 5 | 90 | 1.3 × 10−07 | 5.3 × 10−02 | 1.3 × 10−02 | 2.9 × 10−06 | 7.6 × 10−04 | 7.7 × 10−06 |
BCAT2 | 19 | 3 | 133 | 8.4 × 10−06 | 4.0 × 10−01 | 1.7 × 10−02 | 1.1 × 10−0B | 6.1 × 10−03 | 3.7 × 10−05 | |
ALDH1L1 | 3 | 8 | 103 | 8.3 × 10−08 | 1.7 × 10−02 | 5.1 × 10−03 | 2.1 × 10−06 | 1.7 × 10−05 | 4.2 × 10−05 | |
MED1 | 17 | 3 | 147 | 9.4 × 10−01 | 1.8 × 10−01 | 8.7 × 10−05 | 2.1 × 10−03 | 7.1 × 10−01 | 3.4 × 10−04 | |
STK33 | 11 | 6 | 180 | 4.8 × 10−01 | 7.1 × 10−01 | 7.0 × 10−04 | 3.8 × 10−02 | 6.4 × 10−01 | 6.3 × 10−03 |
To directly compare our results with existing methods we applied GAMuT, DKAT and MSKAT to the METSIM dataset. Since these methods cannot be applied to related individuals, we eliminated 1332 individuals that were related up to second degree, leaving us 7213 individuals. Table 4 shows p-values of different methods on the eight significant or suggestive genes displayed in Table 2. Since DKAT and GAMuT had nearly identical p-values when the same kernels were used, DKAT p-values were not shown in Table 4. For unrelated individuals, as expected, p-values produced by MSKAT with Q statistic, GAMuT with projection phenotype kernel and PhC (with SKAT ΣG) were very similar, and minPcom provided similar or more significant p-values than PhC. Interestingly MSKAT with Q’ statistic and GAMuT with linear phenotype kernel have less significant p-values than the other tests. We found that in 5 of the 8 genes in Table 4, using all individuals with kinship correction produced more significant PhC and minPcom p-values than using only unrelated individuals. Further, we have listed the top 10 genes for each of PhC, GAMuT and MSKAT with unrelated individuals (Supplementary Table S4). Except for the genes in Table 4, no other genes were found to be significant or suggestive.
Table 4:
Unrelated individuals (n = 7213) |
All individuals (n = 8545) |
|||||||
---|---|---|---|---|---|---|---|---|
Gene | MSKAT (Q) |
MSKAT (Q′) |
GAMuT (Projection) |
GAMuT (Linear) |
PhC (ΣG = SKAT) |
minPcom | PhC (ΣG = SKAT) |
minPcom |
GLDC | 8.9 × 10−54 | 6.1 × 10−15 | 0 * | 6.2 × 10−15 | 8.1 × 10−54 | 1.3 × 10−53 | 2.9 × 10−73 | 2.3 × 10−72 |
HAL | 9.5 × 10−05 | 4.2 × 10−02 | 9.6 × 10−05 | 4.3 × 10−02 | 9.5 × 10−05 | 2.9 × 10−07 | 2.9 × 10−06 | 9.5 × 10−11 |
DHODH | 2.1 × 10−06 | 2.2 × 10−03 | 2.4 × 10−06 | 2.2 × 10−03 | 1.9 × 10−06 | 8.3 × 10−06 | 3.1 × 10−08 | 1.1 × 10−07 |
PAH | 9.9 × 10−06 | 1.3 × 10−02 | 1.0 × 10−05 | 1.3 × 10−02 | 9.9 × 10−06 | 5.1 × 10−05 | 3.9 × 10−07 | 1.9 × 10−06 |
MED1 | 1.7 × 10−03 | 2.7 × 10−01 | 1.7 × 10−03 | 2.7 × 10−01 | 1.7 × 10−03 | 2.8 × 10−05 | 2.8 × 10−04 | 4.9 × 10−06 |
STK33 | 6.8 × 10−04 | 6.9 × 10−01 | 6.7 × 10−04 | 6.9 × 10−01 | 6.7 × 10−04 | 1.9 × 10−06 | 1.1 × 10−03 | 4.2 × 10−05 |
ALDH1L1 | 5.9 × 10−05 | 3.1 × 10−02 | 6.1 × 10−05 | 3.0 × 10−02 | 6.0 × 10−05 | 9.5 × 10−06 | 1.6 × 10−04 | 5.4 × 10−05 |
BCAT2 | 6.1 × 10−04 | 1.5 × 10−02 | 6.3 × 10−04 | 1.5 × 10−02 | 6.2 × 10−04 | 3.7 × 10−05 | 2.4 × 10−03 | 6.2 × 10−05 |
: GAMuT software collapses any p-value lower than 10−50 to 0
Overall, our METSIM amino acid data analysis suggests that the proposed method can be more powerful than the single phenotype tests as well as existing tests, while maintaining type I error rate even in the presence of the relatedness. It also shows that the omnibus tests (minP and minPcom) provides robust performance by effectively aggregating results of various kernels.
Computation Time
When ΣP and ΣG are given, p-values of Multi-SKAT are computed by the Davies method (Davies, 1980), which inverts the characteristic function of the mixture of chi-squares. On average, Multi-SKAT tests for a given ΣP and ΣG required less than 1 CPU sec (Intel Xeon 2.80 GHz) when applied to a dataset with 5000 independent individuals, 20 variants and 10 phenotypes (Supplementary Table S1). With the kinship adjustment for 5000 related individuals, computation time was increased to 3 CPU sec. Since minPcom requires only a small number of resampling steps to estimate the correlation among tests, it is still scalable for genome-wide analysis. In the same dataset, minPcom required 4 and 10 CPU sec on average without and with the kinship adjustment, respectively. Further, Multi-SKAT given ΣP and ΣG, is computationally equivalent to MSKAT and takes less than 1 CPU-sec for up to 20,000 samples, with 20 variants (Supplementary Figure S7 A), while GAMuT takes considerably more time than these two. The performance of minPcom is similar to GAMuT for small and moderate sample sizes (7.5 and 7.1 CPU-secs respectively for 10,000 samples) and performs better than GAMuT for larger sample sizes (14.9 and 34.6 CPU-secs respectively for 20,000 samples). Computation time of all the methods were slightly increased when the number of variants were 100 (Supplementary Figure S7 B). Analyzing the METSIM dataset with minPcom required 10 hours when parallelized into 5 processes.
Discussion
In this article, we have introduced a general framework for rare variant tests for multiple phenotypes. As demonstrated, Multi-SKAT gains flexibility with regard to modeling the relationship between phenotypes and genotypes through the use of the kernels ΣP and ΣG. Many published methods, including GAMuT, MSKAT and MAAUSS, can be viewed as special cases of the Multi-SKAT test with corresponding values of ΣP and ΣG, which illuminates the underlying assumptions of these methods and their relationships. In addition, by unifying existing methods to the common framework, our approach provides a way to combine different methods through the minimum p-value based omnibus test. Our method can also adjust for sample relatedness. From simulation studies we have found that the proposed method is scalable to genome-wide analysis and can outperform the single phenotype test and existing multiple phenotype tests. The METSIM data analysis demonstrated that the proposed methods perform well in practice.
It is natural to assume that different genes follow different models of association. For some genes, the effect of the variants on the phenotypes might be independent of each other, thus best detected by the Het phenotype kernel for ΣP, while for others, the effects might be nearly the same and best detected by the Hom phenotype kernel. If the kernel structures are chosen based on prior knowledge and the selected ΣG and ΣP do not reflect underlying biology, the test may have substantially reduced power. The omnibus test, which uses the minimum p-value from the various choices of kernels, has been a useful approach under such situations in genetic association analysis (Lee et al., 2012b; Urrutia et al., 2015; Zhan et al., 2017). We applied this ominibus test to Multi-SKAT and used a Copula to obtain p-values. As seen in simulation studies and real data analysis, our omnibus approaches (minP and minPcom) are scalable to genome-wide analysis and provide robust power regardless of underlying genetic models.
Multi-SKAT retains most of the desirable properties of SKAT. The asymptotic p-values of all the Multi-SKAT tests, other than minP and minPcom, can be analytically obtained via Davies’ method. The p-value calculations for minP and minPcom depend on a resampling based approach but a reliable estimate can be obtained using a small number of resampling steps. Thus, computationally all the Multi-SKAT tests are scalable at the genome-wide level. This method also allows the inclusion of prior information through weighting of variants.
Additionally, Multi-SKAT can adjust for the relatedness among study individuals by accounting for their kinship matrix. As shown in Supplementary Figure S5, in the presence of related individuals, lack of adjustment for relatedness can produce inflated type I error rate. Since Multi-SKAT is a regression based approach, it effectively incorporates the relatedness by including a random effect term for kinship. Type I error simulation and METSIM data analysis show that our approach produced more significant p-values than alternative methods, like GAMuT and MSKAT, while controlling type I error rates.
Although Multi-SKAT provides a general framework for gene-based multiple phenotype tests, the current approach is limited to continuous phenotypes. In the future, using a generalized mixed effect model framework, we aim to extend Multi-SKAT to binary phenotypes.
In summary, we have developed a powerful multiple phenotype test for rare variants. The proposed method has robust power regardless of the underlying biology and can adjust for family relatedness. Our method can be a scalable and practical solution to test for multiple phenotypes and will contribute to detecting rare variants with pleiotropic effects. All our methods are implemented in the R package MultiSKAT (see Web Resource).
Supplementary Material
Acknowledgments
This work was supported by grants R01 HG008773 and R01 LM012535 (D.D. and S.L.) and R01 HG000376 and U01 DK062370 (L.S. and M.B.) from the National Institutes of Health. We would like to thank investigators of the METSIM study for access to the genotype and amino-acid phenotype data.
Grant Numbers
This research was conducted with support from the following grants from National Institute of Health:
R01 HG008773 (D.D. and S.L.)
R01 LM012535 (D.D. and S.L.)
R01 HG000376 (L.S. and M.B.)
U01 DK062370 (L.S. and M.B.)
Appendices
A. Principal Component (PC) Kernel
Let Li be the loading vector for the ith PC, which produces the ith PC score Pi = YLi. In PCA-based analysis, PC scores are used as outcomes instead of original Y. Since the genetic information regarding the phenotypes may not be confined to the top few PCs (Aschard et al., 2014), we first consider using all PCs. Let P = (P1, ⋯ PK). Since PCs are orthogonal, we assume genetic effects to multiple PCs are heterogeneous, which resulted in
(14) |
where is the mean of P under the null hypothesis and is the estimated covariance matrix between the PC’s. will be a diagonal matrix since PCs are orthogonal. Equation (14) can be written as
(15) |
where L = (L1, ⋯ , LK) is a K × K PC loading matrix. Equation (15) shows that by using , we can carry out PC-based tests. It is to be noted that the genetic effects of the PC’s do not need to be assumed to be heterogeneous. Any kernel structure that is applicable to the test statistic in equation 4 can be applied here as well.
B. Relationship between Multi-SKAT and existing methods
For the ease of algebraic expressions, we will consider that all the K phenotypes have residual variance 1. For the general case of different residual variances, ΣP should be replaced by where Tw = diag(σ1, ⋯ , σK), σk being the residual standard error of kth phenotype.
B.1. MSKAT
The Q statistic of MSKAT(Wu and Pankow, 2016) is given by
(16) |
where is a matrix of score statistics (Wu and Pankow, 2016). Using row-vectorization properties
Then QMSKAT can be written as
which is the Multi-SKAT test statistics with ΣG = WW and .
Further, the Q′ of MSKAT is given by
(17) |
Using the similar algebra as above, this can be written as
which is the Multi-SKAT test statistics with ΣG = WW and .
B.2. GAMuT
Suppose and Gadj = HG are covariate adjusted phenotype and genotype matrices where H = I – X(XT X)−1 XT. With the intercept in X, Yadj and Gadj are mean centered. The covariate adjusted GAMuT test statistics is
where
and . Using the fact that is the estimate of variance after adjusting covariates and (since H is a symmetric idempotent matrix), we show, for the projection kernel
which is the same as the Multi-SKAT test statistic with ΣG = WW and .
Similarly for the linear kernel,
which is the Multi-SKAT test statistic with ΣG = WW and .
B.3. MAAUSS and MF-KM
There exists two different version of the MAAUSS tests. The homogeneous version of MAAUSS assumes that the effects of a variant on multiple phenotypes are identical and uses the following test statistic
(18) |
which is identical to the Multi-SKAT test statistic with ΣG = WW and . The heterogeneous version of MAAUSS assumes that the effects of a variant on multiple phenotypes are independent, and uses the following test statistic
(19) |
which is identical to the Multi-SKAT test statistic with ΣG = WW and ΣP = I. Note that the test statistic of MF-KM is exactly the same as QMAAUSS–HET.
C. Backward elimination procedure to identify associated phenotypes
After identifying the gene or region associated with multiple phenotypes, next question would be identifying truly associated phenotypes. Here we present a simple backward elimination algorithm to iteratively remove relatively less important phenotypes. A similar method has previously been applied to identify rare causal variants in an associated gene (Ionita-Laza et al., 2014).
Step 1. Start with a set of k phenotypes PhenCurrent = {y1, y2, ⋯ yk and compute a Multi-SKAT test association p-value for the set PhenCurrent denoted by pCurrent.
Step 2. Remove each of the phenotypes one at a time from the set PhenCurrent. The resulting set is Phen−i = y1, y2, ⋯ yi−1, yi+1 ⋯ yk} for i = 1, 2, ⋯ , k and compute the corresponding p-values p−i for that same Multi-SKAT test.
Step 3. Remove the phenotype j that leads to the smallest p-value, i.e. j = argmin{p−1, p−2, ⋯ , p−k}. Update PhenCurrent to Phen−j.
Step 4. Continue removing phenotypes till only 1 phenotype is left.
Supplementary Table S2 shows the backward elimination results of 5 most significant and suggestive genes in the METSIM study data analysis as per the p-values reported by minPcom. Although this procedure does not provide a set of phenotypes truly associated, it provides the relative importance of the phenotypes in driving association signals. For example, the minPcom p-value for GLDC was 2.3 × 10−72. When each of the phenotypes were removed one at a time and the minPcom p-values were calculated on the remaining 8 phenotypes, we found that eliminating Isoleucine (Ile) actually improved the signal. The minPcom p-value of the set of 8 amino acids leaving out Ile was 2.8 × 10−73. This indicates that Isoleucine has very minimal contribution to the association between the amino acids and GLDC. Subsequently, Valine was the next phenotype to be eliminated indicating that it has the next lowest contribution after Isoleucine. Carrying out this procedure further, we find that Glycine is the last phenotype to remain indicating that it is the strongest driver of the signal. This is in agreement to the single phenotype SKAT-O results (Table 5). Similary for genes HAL, DHODH, PAH and MED1, Histine, Alanine, Phenylalanine and Tyrosine were the most associated phenotypes, respectively. Interestingly for PAH and MED1, single phenotype p-values are not significant, which suggests that multiple phenotypes are associated with these genes.
Footnotes
Web Resources
MultiSKAT R-package: https://github.com/diptavo/MultiSKAT
GAMuT R-package: https://epstein-software.github.io/GAMuT
MSKAT R-package: https://github.com/baolinwu/MSKAT
PHENIX R-package: https://mathgen.stats.ox.ac.uk/genetics_software/phenix/phenix.html
Online Mendelian Inheritance in Man (OMIM): http://www.omim.org
References
- Allan Butterfield D and Pocernich CB (2003). The glutamatergic system and alzheimer’s disease. CNS Drugs, 17(9):641–652. [DOI] [PubMed] [Google Scholar]
- Aschard H, Vilhjálmsson BJ, Greliche N, Morange P-E, Trégouët D-A, and Kraft P (2014). Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. The American Journal of Human Genetics, 94(5):662–676. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broadaway KA, Cutler DJ, Duncan R, Moore JL, Ware EB, Min A Jhun LFB, Zhao W, Smith JA, Peyser PA, Kardia SL, Ghosh D, and Epstein MP (2016). A statistical approach for testing cross-phenotype effects of rare variants. American Journal of Human Genetics, 98(3):525–540. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bulik-Sullivan BK, Loh P-R, Finucane HK, Ripke S, Yang J, of the Psychiatric Genomics Consortium, S. W. G., Patterson N, Daly MJ, Price AL, and Neale BM (2015). Ld score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics, 47:291–295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chiu C, Jung J, Wang Y, Weeks D, Wilson A, Bailey-Wilson J, Amos C, Mills J, Boehnke M, Xiong M, and Fan R (2017). A comparison study of multivariate fixed models and gene association with multiple traits (gamut) for next generation sequencing. Genetic Epidemiology, 41(1):18–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cotsapas C, Voight BF, Rossin E, Lage K, Neale BM, Wallace C, Abecasis GR, Barrett JC, Behrens T, Cho J, et al. (2011). Pervasive sharing of genetic effects in autoimmune disease. PLoS genetics, 7(8):e1002254. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dahl A, Iotchkova V, Baud A, Johansson A, Gyllensten U, Soranzo N, Mott R, Kranis A, and Marchini J (2016). A multiple-phenotype imputation method for genetic studies. Nature Genetics, 48:466–472. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Davies RB (1980). The distribution of a linear combination of x2 random variables. Applied Statistics, 29(3):323–333. [Google Scholar]
- de Belleroche J, Recordati A, and Clifford Rose F (2003). Elevated levels of amino acids in the csf of motor neuron disease patients. CNS Drugs, 17(9):641–652. [DOI] [PubMed] [Google Scholar]
- Demarta S and McNeil AJ (2005). The t copula and related copulas. International Statistical Review/Revue Internationale de Statistique, pages 111–129. [Google Scholar]
- Ferreira M and Purcell S (2009). A multivariate test of association. Bioinformatics, 25:132–133. [DOI] [PubMed] [Google Scholar]
- He Z, Xu B, Lee S, and Ionita-Laza I (2017). Unified sequence-based association tests allowing for multiple functional annotations and meta-analysis of noncoding variation in metabochip data. The American Journal of Human Genetics, 101(3):340–352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huang J, Johnson A, and O’Donnell C (2011). Prime: a method for characterization and evaluation of pleiotropic regions from multiple genome-wide association studies. Bioinformatics, 27:1201–1206. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hughes I (2009). Pediatric endocrinology and inborn errors of metabolism. Journal of Paediatrics and Child Health, 45(11):668–669. [Google Scholar]
- Huyghe JR, Jackson AU, Fogarty MP, Buchkivoch ML, Stancakova A, Stringham HM, Sim X, Yang L, Fuchsberger C, Cederberg H, and et al. (2013). Exome array analysis identifies new loci and low-frequency variants influencing insulin processing and secretion. Nature Genetics, 45:197–201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ionita-Laza I, Capanu M, De Rubeis S, McCallum K, and Buxbaum JD (2014). Identification of rare causal variants in sequence-based studies: methods and applications to vps13b, a gene involved in cohen syndrome and autism. PLoS Genet, 10(12):e1004729. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Korte A and Farlow A (2013). The advantages and limitations of trait analysis with gwas: a review. Plant Methods, 9:9–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee S, Abecasis GR, Boehnke M, and Lin X (2014). Rare-variant association analysis: study designs and statistical tests. The American Journal of Human Genetics, 95(1):5–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, Team ELP, Christiani DC, Wurfel MM, Lin X, et al. (2012a). Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. The American Journal of Human Genetics, 91(2):224–237. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee S, Won S, Kim YJ, Kim Y, Kim B-J, and Park T (2016). Rare variant association test with multiple phenotypes. Genetic Epidemiology. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee S, Wu MC, and Lin X (2012b). Optimal tests for rare variant effects in sequencing association studies. Biostatistics, 13(4):762–775. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Maity A, Sullivan P, and Tzeng J (2012). Multivariate phenotype association analysis by marker-set kernel machine regression. Genetic Epidemiology, 36:686–695. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, and Chen WM (2010). Robust relationship inference in genome-wide association studies. Bioinformatics, 26(22):2867–2873. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ray D, Pankow JS, and Basu S (2016). Usat: A unified score-based association test for multiple phenotype-genotype analysis. Genetic epidemiology, 40(1):20–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ried JS, Doring A, Oexle K, Meisinger C, Winkelmann J, Klopp N, Meitinger T, Peters A, Suhre K, Wichmann H, and Gieger C (2012). PSEA: Phenotype set enrichment analysis–a new method for analysis of multiple phenotypes. Genetic Epidemiology, 36:244–252. [DOI] [PubMed] [Google Scholar]
- Schaffner SF, Foo C, Gabriel S, Reich D, Daly MJ, and Altshuler D (2005). Calibrating a coalescent simulation of human genome sequence variation. Genome research, 15(11):1576–1583. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sivakumaran S, Agakov F, Theodoratou E, Prendergast JG, Zgaga L, Manolio T, Rudan I, McKeigue P, Wilson JF, and Campbell H (2011). Abundant pleiotropy in human complex diseases and traits. The American Journal of Human Genetics, 89(5):607–618. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Solovie N, Cotsapas C, Lee PH, Purcell SM, and Smoller JW (2013). Pleiotropy in complex traits: challenges and strategies. Nature reviews. Genetics, 14(7):483. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stančáková A, Civelek M, Saleem N, Soininen P, Kangas A, Cederberg H and Paananen J, Pihlajamäki J, Bonnycastle L, Morken M, et al. (2012). Hyperglycemia and a common variant of gckr are associated with the levels of eight amino acids in 9,369 finnish men. Diabetes, 61:1895–1902. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sun J, Oualkacha K, Forgetta V, Zheng H-F, Richards JB, Ciampi A, and Greenwood CM (2016). A method for analyzing multiple continuous phenotypes in rare variant association studies allowing for flexible correlations in variant effects. European Journal of Human Genetics, 24(9):1344–1351. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tajiri K and Shimizu Y (2013). Branched-chain amino acids in liver diseases. World Journal of Gastroentology, 19:7620–7629. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Teslovich TM, , Kim DS, Yin X, Stancakova A, Jackson AU, Weilscher M, Naj A, and et al. (2018). Identification of seven novel loci associated with amino acid levels using single-variant and gene-based tests in 8545 finnish men from the metsim study. Human Molecular Genetics, 27:1664–1674. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Urrutia E, Lee S, Maity A, Zhao N, Shen J, Li Y, and Wu M (2015). Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT). Stat Interface, 8(4):495–505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Y, Liu A, Mills J, Boehnke M, Wilson A, Bailey-Wilson J, Xiong M, Wu C, and Fan R (2015). Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models. Genetic Epidemiology, 39:259–275. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Würtz P, Mäkinen V-P, Soininen P, Kangas A, Tukiainen T, Kettunen J, Savolainen M, Tammelin T, Viikari J, Rönnemaa T, et al. (2012). Metabolic signatures of insulin resistance in 7,098 young adults. Diabetes, 61:1372–1380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Würtz P, Soininen P, Kangas A, Rönnemaa T, Lehtimäki T, Kähönen M, Viikari J, Raitakari O, and Ala-Korpela M (2013). Branched-chain and aromatic amino acids are predictors of insulin resistance in young adults. Diabetes Care, 36:648–655. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu B and Pankow J (2016). Sequence kernel association test of multiple continuous phenotypes. Genetic Epidemiology, 40(2):91–100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu MC, Lee S, Cai T, Li Y, Boehnke M, and Lin X (2011). Rare-variant association testing for sequencing data with the sequence kernel association test. Americal Journal of Human Genetics, 89:82–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu MC, Maity A, Lee S, Simmons EM, et al. (2013). Kernel machine snp-set testing under multiple candidate kernels. Genetic Epidemiology, 37(3):267–275. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xie W, Wood AR, Lyssenko V, Weedon MN, Knowles JW, Alkayyali S, Assimes TL, Qaurtermous T, Abbasi F, Paananen J, and et al. (2013). Genetic variants associated with glycine metabolism and their role in insulin sensitivity and type 2 diabetes. Diabetes, 62:2141–2150. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yan Q, Weeks DE, Celedón JC, Tiwari HK, Li B, Wang X, Lin W-Y, Lou X-Y, Gao G, Chen W, et al. (2015). Associating multivariate quantitative phenotypes with genetic variants in family samples with a novel kernel machine regression method. Genetics, 201(4):1329–1339. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhan X, Zhao N, Plantinga A, Thornton T, Conneely K, Epstein M, and Wu M (2017). Powerful genetic association analysis for common or rare variants with high-dimensional structured traits. Genetics, 206(4):1779–1790. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou X, Carbonetto P, and Matthew S (2013). Polygenic modeling with bayesian sparse linear mixed models. PLoS Genetics, 9(2):e1003264. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou X and Stephens M (2014). Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature Methods, 11:407–409. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.