Abstract
The joint analysis of multiple traits has recently become popular because it can increase the statistical power to detect genetic variants, and there is increasing evidence that pleiotropy is a widespread phenomenon in complex diseases. Currently, most existing methods use all of the traits to test the association between multiple traits and a single variant. However, those methods may lose power in the presence of a large number of noise traits. In this paper, we propose an “optimal” maximum heritability test (MHT-O) to test the association between multiple traits and a single variant. MHT-O includes a procedure for deleting traits that have weak or no association with the variant. Using extensive simulation studies, we compare the performance of MHT-O with that of MHT, the Trait-based Association Test that uses Extended Simes procedure (TATES), SUM_SCORE, and MANOVA. Our results show that, in all of the simulation scenarios, MHT-O is either the most powerful test or comparable to the most powerful test among the five tests we compared.
Introduction
Increasing evidence shows that pleiotropy, the effect of one variant on multiple traits, is a widespread phenomenon in complex diseases [1]. Furthermore, in genetic association studies of complex diseases, multiple related traits are usually measured. For example, hyperuricemia is usually present in patients with gout [2]; coronary heart disease is predicted by cytokine interleukin-6, C-reactive protein, interleukin-1, tumor necrosis factor-α and fibrinogen [3, 4]; and neuropsychiatric disorders depend on a range of overlapping clinical characteristics [5]. Although most published genome-wide association studies (GWASs) analyze each of the related traits separately, joint analysis of multiple traits may increase statistical power to detect genetic variants [6–9]. Thus, joint analysis of multiple traits has recently become popular.
Several statistical methods have been developed for joint analysis of multiple traits. These methods can be roughly divided into three groups: methods that combine univariate analysis results, regression methods, and dimension reduction methods. To combine univariate analysis results, one first performs an association test for each trait individually and then combines either the univariate test statistics or the p-values of the univariate tests [2, 10–12]. Regression methods include mixed effect models [9, 13, 14], generalized estimating equation (GEE) methods [15, 16], and reverse regression methods [5, 17]. Mixed effect models can account for relatedness, population structure, and polygenic background effects, but they are computationally challenging. The GEE methods, based on a marginal regression model, allow the variant to have different effect sizes and effect directions on different traits. These methods can also accommodate covariates and different types of traits. Reverse regression methods take genotypes as the response variable and multiple traits as independent predictors. Therefore, reverse regression models do not need to know the complex distributions of traits and can be applied to a large number of mixed types of traits. Dimension reduction methods include canonical correlation analysis (CCA) [18], principal components of traits (PCT) [19], and principal components of heritability (PCH) [20–23]. CCA seeks a linear combination of multiple variants and a linear combination of multiple traits such that the correlation between the two linear combinations reaches its maximum. The PCT methods are usually based on the first PC or the first few PCs of the traits [22, 24]. However, Aschard et al. [19] showed that testing only the first few PCs often has low power, whereas combining signals across all PCs can have greater power. Nevertheless, it is not clear how many PCs are needed, or how robust these methods are when noise traits are present. PCH finds a linear combination of multiple traits such that this linear combination has the maximum heritability.
In this article, we first propose a maximum heritability test (MHT). Based on MHT, we develop an “optimal” maximum heritability test (MHT-O) to test the association between multiple traits and a single variant. In each step of MHT-O, we delete the one trait that has the weakest association with the variant. Then, we find the optimal number of traits and use MHT to test the association between the optimal set of traits and the variant. Using extensive simulation studies, we compare the performance of MHT-O with that of MHT, the Trait-based Association Test that uses Extended Simes procedure (TATES) [11], SUM_SCORE, and MANOVA [8]. Our results show that, in all of the simulation scenarios, MHT-O is either the most powerful test or comparable to the most powerful test among the five tests we compared.
Method
We consider a sample with n unrelated individuals. Each individual has K (potentially correlated) traits and has been genotyped at one variant. Let Y = (Y1,…,YK)T denote the random vector of K traits and X denote the random variable of the genotype score at a variant. Let yi = (yi1,…,yiK)T denote the values of K traits and xi denote the genotype score of the ith individual, where xi is the number of minor alleles that the ith individual has at the variant. We can consider that y1,…,yn is a random sample from Y and x1,…,xn is a random sample from X.
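Now, let us consider the linear models
$$Y_k = \beta_{0k} + \beta_k X + \varepsilon_k, \qquad k = 1, \dots, K,$$
where β0k is the intercept, βk is the effect of the variant on the kth trait, and εk is the residual.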
We partition the total phenotypic covariance of Y as VP = VG + VR [25], where VG = var[β1X,…,βKX] = var(X)ββT is the genetic variance due to the genotype score X, with β = (β1,…,βK)T, and VR = var[ε1,…,εK] is the residual covariance after removing the genetic effect. var(X) can be estimated by the sample variance of x1,…,xn. β and VR can be estimated by fitting the linear models to the observed data, that is, by regressing yik on xi separately for each trait k.
βk is estimated by the least squares estimator. Let rik denote the estimated residual of the ith individual for the kth trait. Then, the (j, k)th element of VR is estimated by the sample covariance of the residuals rij and rik.
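The estimation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' R implementation: the function name `fit_trait_models` is hypothetical, each trait is assumed to be regressed on the genotype score with an intercept, and the 1/n normalisation of the variance and covariance estimates is an assumption.

```python
from statistics import mean


def fit_trait_models(x, Y):
    """Regress each trait on the genotype score by ordinary least squares.

    x : list of n genotype scores (number of minor alleles)
    Y : list of n rows, each holding the K trait values of one individual
    Returns (beta, var_x, V_R): the K slope estimates, the sample variance
    of x, and the K x K sample covariance matrix of the residuals.
    """
    n, K = len(x), len(Y[0])
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    var_x = sxx / n  # assumption: 1/n rather than 1/(n - 1)

    beta, resid = [], []
    for k in range(K):
        yk = [row[k] for row in Y]
        y_bar = mean(yk)
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, yk)) / sxx
        intercept = y_bar - slope * x_bar
        beta.append(slope)
        resid.append([yi - intercept - slope * xi for xi, yi in zip(x, yk)])

    # Least-squares residuals from a model with an intercept sum to zero,
    # so the cross-products need no further centring.
    V_R = [[sum(rj * rk for rj, rk in zip(resid[j], resid[k])) / n
            for k in range(K)] for j in range(K)]
    return beta, var_x, V_R


# Toy data: the first trait is an exact linear function of the genotype.
x = [0, 1, 2, 1, 0, 2]
Y = [[1 + 2 * xi, (-1) ** i] for i, xi in enumerate(x)]
beta, var_x, V_R = fit_trait_models(x, Y)
```

With these estimates, VG is obtained as var(X)ββT and VP as VG + VR.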
Let us consider a linear combination of Y, wTY, where w = (w1,…,wK)T. The heritability of wTY can be written as
If we define , we can write as
where . The heritability of wTY depends on w and we can find a linear combination of wTY that has the largest heritability among all linear combinations of Y. We define the maximum heritability as the test statistic to test the association between these K traits and the variant. We denote this test as maximum heritability test (MHT). The MHT statistic can be written as
where λmax(A) denotes the largest eigenvalue of matrix A.
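For reference, one derivation that is consistent with the definitions above is sketched below. The notation h²(w), the substitution u, and the exact form of A are assumptions made for illustration; the derivation also assumes that VP is positive definite.

```latex
% Heritability of the linear combination w^T Y, with V_P = V_G + V_R
h^2(w) = \frac{w^{T} V_G\, w}{w^{T} V_P\, w}

% Assumed substitution: u = V_P^{1/2} w and A = V_P^{-1/2} V_G V_P^{-1/2}
h^2(w) = \frac{u^{T} A\, u}{u^{T} u}

% A Rayleigh quotient is maximised by the largest eigenvalue
\max_{w \neq 0} h^2(w) = \lambda_{\max}(A)

% Because V_G = \mathrm{var}(X)\,\beta\beta^{T} has rank one
\lambda_{\max}(A) = \mathrm{var}(X)\, \beta^{T} V_P^{-1} \beta
```

If the substitution is written with VR instead of VP, the largest eigenvalue becomes λ' = var(X)βTVR⁻¹β and the maximum heritability equals λ'/(1 + λ'). The two forms are increasing functions of each other, so they order data sets identically in a permutation test.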
However, the test statistic TMHT may lose power in the presence of a large number of noise traits. Therefore, we propose an “optimal” maximum heritability test (MHT-O) to test the association between multiple traits and the variant. MHT-O includes a procedure of deleting traits that have weak or no association with the variant. It has the following steps:
Step 1. Given traits Y = (Y1,…,YK), initialize r = K and Y(r) = Y. Denote TMHT, r as TMHT based on Y(r).
Step 2. Denote as TMHT based on Y(r) with the ith trait deleted for i = 1,…,r; denote and . Let Y(r−1) denote Y(r) with the Ith trait deleted and update r = r − 1.
Step 3. Repeat step 2 until r = 1.
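Steps 1 to 3 amount to a backward elimination over traits. The sketch below assumes one reading of Step 2: at each step, the trait whose deletion leaves the largest TMHT is removed, and that largest value becomes the statistic for the reduced set. The function `backward_elimination` and the toy statistic are hypothetical; in practice `stat` would compute TMHT on the selected traits.

```python
import math


def backward_elimination(traits, stat):
    """Delete one trait at a time, keeping the subset with the largest statistic.

    traits : list of trait identifiers
    stat   : callable mapping a list of traits to a test statistic
    Returns a list of (subset, statistic) pairs for r = K, K - 1, ..., 1.
    """
    current = list(traits)
    path = [(tuple(current), stat(current))]
    while len(current) > 1:
        # Statistic of the current set with the ith trait deleted.
        reduced = [stat(current[:i] + current[i + 1:])
                   for i in range(len(current))]
        # Index of the trait whose removal leaves the largest statistic,
        # taken here as the trait most weakly associated with the variant.
        drop = max(range(len(current)), key=lambda i: reduced[i])
        del current[drop]
        path.append((tuple(current), reduced[drop]))
    return path


# Toy statistic (purely illustrative, not T_MHT): summed signal, scaled by
# the square root of the number of traits in the subset.
signal = {"y1": 3.0, "y2": 0.1, "y3": 2.0, "y4": 0.0}


def toy_stat(subset):
    return sum(signal[t] for t in subset) / math.sqrt(len(subset))


path = backward_elimination(list(signal), toy_stat)
```

In the toy example the two noise traits (`y4`, then `y2`) are removed first, which is the behaviour the deletion procedure is meant to produce.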
Denote pr as the p-value of TMHT, r. The test statistic of MHT-O is defined as
We use a permutation test to evaluate the p-value of TMHT−O. Intuitively, two layers of permutations are needed to estimate pr and the overall p-value for the test statistic TMHT−O. Ge et al. [26] proposed that one layer of permutations can be used to estimate these p-values. We use the permutation procedure of Ge et al. to estimate pr and the overall p-value for the test statistic TMHT−O. In detail, we randomly shuffle the genotypes in each permutation. Suppose we perform B permutations. Let T(b)MHT, r denote the value of TMHT, r based on the bth permuted data set, where b = 0 represents the original data. Then, we transform T(b)MHT, r to p(b)r by
Let , then, the p-value of TMHT−O is given by
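The single-layer permutation scheme can be sketched as follows. The sketch assumes that the MHT-O statistic is the minimum of the pr over r and that each permutation p-value is the proportion of data sets (including the original one) with a statistic at least as large; both the (B + 1) denominator and the function name `min_p_permutation` are assumptions, not details taken from the text.

```python
def min_p_permutation(T):
    """Single-layer permutation p-value for a minimum-p statistic.

    T[b][r] is the statistic for subset size index r computed on data set b,
    where b = 0 is the original data and b = 1..B are permuted data sets.
    Returns (p_r for the original data, their minimum, overall p-value).
    """
    n_sets, n_sizes = len(T), len(T[0])
    # p[b][r]: share of all data sets whose statistic is at least T[b][r].
    # The same permutations give a p-value for every data set b, which is
    # what removes the need for a second layer of permutations.
    p = [[sum(T[d][r] >= T[b][r] for d in range(n_sets)) / n_sets
          for r in range(n_sizes)] for b in range(n_sets)]
    min_p = [min(row) for row in p]
    # Small minimum p-values are extreme, so count those at or below
    # the value observed on the original data.
    overall = sum(m <= min_p[0] for m in min_p) / n_sets
    return p[0], min_p[0], overall


# Toy input: the original data plus B = 3 permutations, two subset sizes.
T = [[5.0, 1.0],
     [1.0, 2.0],
     [2.0, 3.0],
     [3.0, 0.5]]
p_observed, t_observed, p_overall = min_p_permutation(T)
```

The double loop costs on the order of B² comparisons per subset size; for large B, ranking each column once by sorting gives the same p-values more cheaply.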
The R code of MHT-O is available at Shuanglin Zhang’s homepage http://www.math.mtu.edu/~shuzhang/software.html.
Comparisons of Methods
We compare our proposed method with MHT, TATES [11], MANOVA [8], and SUM_SCORE. TATES combines the p-values obtained in standard univariate GWAS analyses to acquire one trait-based p-value, while correcting for correlations between components. SUM_SCORE performs an association test for each trait individually to obtain the univariate score test statistic for each trait. Then, the test statistic of SUM_SCORE is the sum of the univariate score test statistics. We use asymptotic distributions to evaluate the p-values of SUM_SCORE, TATES, and MANOVA.
Simulation
To evaluate the type I error rates and powers of MHT and MHT-O, we generate genotypes according to the minor allele frequency (MAF) and assume Hardy-Weinberg equilibrium. Then, we generate K traits by the factor model [11, 19]
$$y = \lambda x + c\gamma f + \sqrt{1 - c^{2}}\,\varepsilon, \tag{1}$$
where y = (y1,…,yK)T; x is the genotype score at the variant of interest; λ = (λ1,…,λK) is the vector of effect sizes of the genetic variant on the K traits; f = (f1,…,fR)T ∼ MVN(0, Σ), Σ = (1 − ρ)I + ρA, A is a matrix with elements of 1, I is the identity matrix, and ρ is the correlation between factors; γ is a K by R matrix; c is a constant number; and ε = (ε1,…, εK)T is a vector of residuals, and ε1,…, εK are independent, and εk ∼ N(0, 1) for k = 1,…, K.
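A minimal simulation sketch for one individual is given below. It assumes that the factor model has the form y = λx + cγf + √(1 − c²)ε, that ρ ≥ 0, and that genotypes are drawn as two independent alleles under Hardy-Weinberg equilibrium. The function name `simulate_individual` and the value c = 0.5 in the usage lines are hypothetical, since the value of c is not stated in this section.

```python
import math
import random


def simulate_individual(rng, maf, lam, gamma, c, rho):
    """Draw one genotype score and one vector of K traits.

    rng   : random.Random instance
    maf   : minor allele frequency
    lam   : list of K effect sizes
    gamma : K x R loading matrix (list of K rows)
    c     : weight of the factor component (assumed form of the model)
    rho   : correlation between factors, assumed to be non-negative
    """
    K, R = len(gamma), len(gamma[0])
    # Hardy-Weinberg equilibrium: two independent allele draws.
    x = sum(rng.random() < maf for _ in range(2))
    # Factors with unit variance and pairwise correlation rho, matching
    # Sigma = (1 - rho) I + rho A, built from one shared normal variate.
    shared = rng.gauss(0.0, 1.0)
    f = [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
         for _ in range(R)]
    y = []
    for k in range(K):
        factor_part = sum(gamma[k][r] * f[r] for r in range(R))
        eps = rng.gauss(0.0, 1.0)
        y.append(lam[k] * x + c * factor_part + math.sqrt(1.0 - c * c) * eps)
    return x, y


# Model 1 under the null hypothesis: one factor, equal effect sizes (beta = 0).
rng = random.Random(2016)
K = 20
gamma = [[1.0] for _ in range(K)]
sample = [simulate_individual(rng, 0.3, [0.0] * K, gamma, 0.5, 0.2)
          for _ in range(1000)]
```

Setting every entry of `lam` to a positive β gives the power setting of Model 1, while a single non-zero entry in the last position with the same `gamma` corresponds to Model 5.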
Based on Eq (1), we consider five models:
Model 1: There is only one factor and genotypes impact on all traits with the same effect size. That is, R = 1, λ = (β,…,β)T, and γ = (1,…,1)T.
Model 2: There are five factors and genotypes impact on one factor. That is, , and γ = diag(D1, D2, D3, D4, D5), where for i = 1,…,5.
Model 3: There are two factors and genotypes impact on one factor. That is, , and γ = diag(D1, D2), where for i = 1, 2.
Model 4: There are five factors and genotypes impact on one trait. That is, R = 5, λ = (0,…,0, β)T, and γ = diag(D1, D2, D3, D4, D5), where for i = 1,…,5.
Model 5: There is only one factor and genotypes impact on one trait. That is, R = 1, λ = (0,…,0, β)T, and γ = (1,…,1)T.
To evaluate type I error rates of MHT and MHT-O, we let β = 0. To evaluate powers, we let β > 0. In the simulation studies for evaluation of type I error rates and powers, we set MAF = 0.3 and ρ = 0.2.
Results
To evaluate the type I error rates of the two proposed tests (MHT and MHT-O), we consider 20 quantitative traits. We also consider different sample sizes, different significance levels, and different models. In each simulation scenario, the p-values of MHT and MHT-O are estimated by 1,000 permutations and the type I error rates of the two tests are evaluated using 10,000 replicated samples. For 10,000 replicated samples, the 95% confidence intervals (CIs) for estimated type I error rates at nominal levels 0.05 and 0.01 are (0.046, 0.054) and (0.008, 0.012), respectively (see Appendix for details). The estimated type I error rates of the two tests are summarized in Table 1. From this table, we can see that 58 out of 60 (greater than 95%) estimated type I error rates are within the 95% CIs, and the two estimated type I error rates that are not within the 95% CIs (0.05415 and 0.0126) are very close to the bounds of the corresponding 95% CIs, which indicates that both tests are valid.
Table 1. The estimated type I error rates of MHT and MHT-O.
Model | Significance level | Test | Sample size 500 | Sample size 1000 | Sample size 2000 |
---|---|---|---|---|---|
Model 1 | α = 0.05 | MHT-O | 0.05415 | 0.0494 | 0.04875 |
Model 1 | α = 0.05 | MHT | 0.05235 | 0.05005 | 0.0501 |
Model 1 | α = 0.01 | MHT-O | 0.01035 | 0.012 | 0.0091 |
Model 1 | α = 0.01 | MHT | 0.00985 | 0.01195 | 0.01105 |
Model 2 | α = 0.05 | MHT-O | 0.0499 | 0.0515 | 0.0526 |
Model 2 | α = 0.05 | MHT | 0.04815 | 0.05175 | 0.05285 |
Model 2 | α = 0.01 | MHT-O | 0.01045 | 0.01175 | 0.01135 |
Model 2 | α = 0.01 | MHT | 0.0117 | 0.0118 | 0.0126 |
Model 3 | α = 0.05 | MHT-O | 0.05015 | 0.0517 | 0.05315 |
Model 3 | α = 0.05 | MHT | 0.04875 | 0.0507 | 0.0529 |
Model 3 | α = 0.01 | MHT-O | 0.00995 | 0.0109 | 0.012 |
Model 3 | α = 0.01 | MHT | 0.0104 | 0.01035 | 0.012 |
Model 4 | α = 0.05 | MHT-O | 0.04815 | 0.0516 | 0.05255 |
Model 4 | α = 0.05 | MHT | 0.04875 | 0.05275 | 0.0507 |
Model 4 | α = 0.01 | MHT-O | 0.00975 | 0.0118 | 0.00975 |
Model 4 | α = 0.01 | MHT | 0.00855 | 0.012 | 0.01 |
Model 5 | α = 0.05 | MHT-O | 0.04865 | 0.0499 | 0.04975 |
Model 5 | α = 0.05 | MHT | 0.05095 | 0.05195 | 0.04755 |
Model 5 | α = 0.01 | MHT-O | 0.012 | 0.0119 | 0.00915 |
Model 5 | α = 0.01 | MHT | 0.01075 | 0.01115 | 0.0096 |
For power comparisons, we consider different values of the effect size, different models, and different numbers of traits. Sample size is 1,000 for all the cases. In each of the simulation scenarios, the p-values of MHT and MHT-O are estimated using 1,000 permutations and the p-values of SUM_SCORE, TATES and MANOVA are estimated using their asymptotic distributions. The powers of all of the five tests are evaluated using 500 replicated samples at a significance level of 0.05.
Fig 1 gives the power comparisons of the five tests (SUM_SCORE, TATES, MHT, MHT-O and MANOVA) for the power as a function of the effect size based on the five models for 20 traits. This figure shows that (1) MHT-O is either the most powerful one (genotypes directly impact on a single trait: models 4–5) or comparable to the most powerful one (genotypes directly impact on all or a portion of the traits: models 1–3) among the five tests; (2) MHT and MANOVA have very similar powers; (3) MHT and MANOVA are much less powerful than other methods when genotypes directly impact on only a portion of the traits (models 2–3); (4) TATES is much less powerful than other methods when genotypes directly impact on all the traits (model 1); and (5) SUM_SCORE is much less powerful than other methods when genotypes directly impact on a single trait (models 4–5).
Power comparisons of the five tests for 30 and 40 traits are given in Figs 2 and 3, respectively. The patterns of the power comparisons for 30 and 40 traits (Figs 2 and 3) are similar to those for 20 traits (Fig 1). We also give power comparisons of the five tests using a significance level of 5×10^−8 with 10^8 permutations and 500 replicates for 20 traits under model 1 (S1 Fig). S1 Fig shows that the patterns of the power comparisons using a significance level of 5×10^−8 are similar to those using a significance level of 0.05 in Fig 1 (model 1). In summary, MHT-O is either the most powerful test or comparable to the most powerful test among all the tests we compared. Therefore, MHT-O is robust across a variety of models.
Discussion
We propose MHT-O to perform joint analysis of multiple traits in association studies for the following reasons: (1) multiple related traits are usually measured in genetic association studies of complex diseases; (2) there is increasing evidence that pleiotropy is a widespread phenomenon in complex diseases; and (3) the power of existing methods decreases in the presence of non-associated traits. The proposed MHT-O includes a procedure for deleting traits that have weak or no association with the variant. Therefore, it can be robust to the existence and the number of non-associated traits. By deleting the one trait that has the weakest association with the variant in each step, MHT-O can maintain high power in the presence of a large number of non-associated traits. This feature is especially important when there are a large number of correlated traits but no guidelines for selecting the relevant ones. Our results show that MHT-O has correct type I error rates and is either the most powerful test or comparable to the most powerful test among the five tests we compared. No other method in the simulation studies shows consistently good performance.
Due to the allelic heterogeneity and the extreme rarity of individual variants in rare variant association studies, the variant-by-variant methods used for common variant association studies may not be optimal [27]. Recent studies have shown that complex diseases are caused by both common and rare variants [28–34]. Statistical methods including burden tests [27, 35–38], quadratic tests [39–41], and combined tests [42–44] have been developed for rare variant association studies with a single trait. Currently, there is limited research on rare variant association studies for joint analysis of multiple traits [14, 45]. MHT-O can be extended to rare variant association studies by extending Eq (1) to include multiple variants. MHT-O can also be extended to family-based studies by extending Eq (1) to a mixed linear model. However, the performance of MHT-O in rare variant association studies and in family-based association studies needs further investigation.
The fact that population stratification can seriously confound association results has long been recognized in association studies based on unrelated individuals [46, 47]. Several methods to control for population stratification have been developed for such studies. These methods include the principal component (PC) approach [48–52], the genomic control (GC) approach [53–55], and the mixed linear model (MLM) approach [29, 56]. Like most association tests based on unrelated individuals, MHT-O is subject to bias due to population stratification. To make MHT-O robust to population stratification, we can use the PC approach. Let Pi = (pi1,…,piL)T denote the first L PCs of the genotypes at a set of genomic markers for the ith individual. Let ỹik and x̃i denote the residuals of the regressions yik = α0k + αkTPi + εik and the residuals of the regression xi = α0 + αTPi + εi, respectively. Using ỹik and x̃i to replace yik and xi, we can make MHT-O robust to population stratification. However, the performance of the PC approach in controlling for population stratification in MHT-O needs further investigation.
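The PC adjustment described above replaces each trait and the genotype score by its residual from a least-squares regression on the first L PCs. The sketch below is one way to compute such residuals with the standard library only; `solve` and `residualize` are hypothetical helper names, an intercept is assumed in every regression, and the normal equations are assumed to be non-singular.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def residualize(v, P):
    """Residuals of the least-squares regression of v on the columns of P.

    v : list of n values (one trait, or the genotype scores)
    P : list of n rows, each holding the first L PCs of one individual
    """
    D = [[1.0] + list(row) for row in P]  # design matrix with intercept
    q = len(D[0])
    # Normal equations (D^T D) coef = D^T v.
    DtD = [[sum(d[a] * d[b] for d in D) for b in range(q)] for a in range(q)]
    Dtv = [sum(d[a] * vi for d, vi in zip(D, v)) for a in range(q)]
    coef = solve(DtD, Dtv)
    return [vi - sum(c * da for c, da in zip(coef, d)) for d, vi in zip(D, v)]


# Toy check with a single PC: a variable that is exactly linear in the PC
# has residuals of zero, so none of its variation survives the adjustment.
P = [[0.0], [1.0], [2.0], [3.0]]
adjusted_trait = residualize([1.0, 3.0, 5.0, 7.0], P)
adjusted_geno = residualize([1.0, 0.0, 0.0, 1.0], P)
```

The adjusted traits and genotype scores are then passed to MHT-O in place of the raw values.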
Appendix
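Let p denote the p-value of the test and let ξ denote the random variable
$$\xi = \begin{cases} 1, & p \le \alpha, \\ 0, & \text{otherwise}, \end{cases}$$
where α is the significance level. Then, Pr(ξ = 1) = α and Pr(ξ = 0) = 1 − α because p follows a uniform distribution between 0 and 1 under the null hypothesis. Suppose there are R replicates. Let ξi denote the value of ξ for the ith replicate, where i = 1,…,R. Then, the estimated type I error rate is given by $\hat{\alpha} = \frac{1}{R}\sum_{i=1}^{R} \xi_i$, which asymptotically follows the normal distribution $N\left(\alpha, \alpha(1-\alpha)/R\right)$. Thus, $\Pr\left(|\hat{\alpha} - \alpha| \le 1.96\sqrt{\alpha(1-\alpha)/R}\right) \approx 0.95$.
We define $\left(\alpha - 1.96\sqrt{\alpha(1-\alpha)/R},\ \alpha + 1.96\sqrt{\alpha(1-\alpha)/R}\right)$ as the 95% confidence interval for the estimated type I error rate for the nominal level α.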
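As a numerical check, the interval can be computed under the normal approximation α ± 1.96√(α(1 − α)/R). The function name `type1_error_ci` is hypothetical, and the interval form is inferred from the values (0.046, 0.054) and (0.008, 0.012) quoted in the Results.

```python
import math


def type1_error_ci(alpha, n_replicates, z=1.96):
    """95% interval for an estimated type I error rate at nominal level alpha.

    Uses the normal approximation to the mean of n_replicates Bernoulli(alpha)
    indicators, whose standard error is sqrt(alpha * (1 - alpha) / n).
    """
    half_width = z * math.sqrt(alpha * (1.0 - alpha) / n_replicates)
    return alpha - half_width, alpha + half_width


# With 10,000 replicates the bounds round to the intervals quoted in the
# Results: (0.046, 0.054) at level 0.05 and (0.008, 0.012) at level 0.01.
ci_05 = type1_error_ci(0.05, 10000)
ci_01 = type1_error_ci(0.01, 10000)
```

Because the half-width shrinks with the square root of the number of replicates, halving the width of the interval requires four times as many replicates.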
Supporting Information
Data Availability
All relevant data are within the paper and its Supporting Information file.
Funding Statement
The authors have no support or funding to report.
References
- 1. Sivakumaran S, Agakov F, Theodoratou E, Prendergast JG, Zgaga L, Manolio T, et al. Abundant pleiotropy in human complex diseases and traits. American journal of human genetics. 2011;89(5):607–18. doi: 10.1016/j.ajhg.2011.10.004
- 2. Yang Q, Wu H, Guo CY, Fox CS. Analyze multivariate phenotypes in genetic association studies by combining univariate association tests. Genetic epidemiology. 2010;34(5):444–54. doi: 10.1002/gepi.20497
- 3. Rifai N, Ridker PM. Inflammatory markers and coronary heart disease. Current opinion in lipidology. 2002;13(4):383–9.
- 4. Yudkin JS, Kumari M, Humphries SE, Mohamed-Ali V. Inflammation, obesity, stress and coronary heart disease: is interleukin-6 the link? Atherosclerosis. 2000;148(2):209–14.
- 5. O'Reilly PF, Hoggart CJ, Pomyen Y, Calboli FC, Elliott P, Jarvelin MR, et al. MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PloS one. 2012;7(5):e34861. doi: 10.1371/journal.pone.0034861
- 6. Solovieff N, Cotsapas C, Lee PH, Purcell SM, Smoller JW. Pleiotropy in complex traits: challenges and strategies. Nature reviews Genetics. 2013;14(7):483–95. doi: 10.1038/nrg3461
- 7. Stephens M. A unified framework for association analysis with multiple related phenotypes. PloS one. 2013;8(7):e65245. doi: 10.1371/journal.pone.0065245
- 8. Yang Q, Wang Y. Methods for analyzing multivariate phenotypes in genetic association studies. J Probab Stat. 2012;2012:652569. doi: 10.1155/2012/652569
- 9. Zhou X, Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature methods. 2014;11(4):407–9. doi: 10.1038/nmeth.2848
- 10. O'Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40(4):1079–87.
- 11. van der Sluis S, Posthuma D, Dolan CV. TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies. PLoS genetics. 2013;9(1):e1003235. doi: 10.1371/journal.pgen.1003235
- 12. Kim J, Bai Y, Pan W. An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics. Genetic epidemiology. 2015;39(8):651–63. doi: 10.1002/gepi.21931
- 13. Korte A, Vilhjalmsson BJ, Segura V, Platt A, Long Q, Nordborg M. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nature genetics. 2012;44(9):1066–71. doi: 10.1038/ng.2376
- 14. Casale FP, Rakitsch B, Lippert C, Stegle O. Efficient set tests for the genetic analysis of correlated traits. Nature methods. 2015;12(8):755–8. doi: 10.1038/nmeth.3439
- 15. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42(1):121–30.
- 16. Zhang Y, Xu Z, Shen X, Pan W, Alzheimer's Disease Neuroimaging Initiative. Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. Neuroimage. 2014;96:309–25. doi: 10.1016/j.neuroimage.2014.03.061
- 17. Yan T, Li Q, Li Y, Li Z, Zheng G. Genetic association with multiple traits in the presence of population stratification. Genetic epidemiology. 2013;37(6):571–80. doi: 10.1002/gepi.21738
- 18. Tang CS, Ferreira MA. A gene-based test of association using canonical correlation analysis. Bioinformatics. 2012;28(6):845–50. doi: 10.1093/bioinformatics/bts051
- 19. Aschard H, Vilhjalmsson BJ, Greliche N, Morange PE, Tregouet DA, Kraft P. Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. American journal of human genetics. 2014;94(5):662–76. doi: 10.1016/j.ajhg.2014.03.016
- 20. Ott J, Rabinowitz D. A principal-components approach based on heritability for combining phenotype information. Human heredity. 1999;49(2):106–11.
- 21. Lange C, van Steen K, Andrew T, Lyon H, DeMeo DL, Raby B, et al. A family-based association test for repeatedly measured quantitative traits adjusting for unknown environmental and/or polygenic effects. Statistical applications in genetics and molecular biology. 2004;3:Article17. doi: 10.2202/1544-6115.1067
- 22. Klei L, Luca D, Devlin B, Roeder K. Pleiotropy and principal components of heritability combine to increase power for association analysis. Genetic epidemiology. 2008;32(1):9–19. doi: 10.1002/gepi.20257
- 23. Zhou JJ, Cho MH, Lange C, Lutz S, Silverman EK, Laird NM. Integrating multiple correlated phenotypes for genetic association analysis by maximizing heritability. Hum Hered. 2015;79(2):93–104. doi: 10.1159/000381641
- 24. Feng T, Zhang S, Sha Q. A method dealing with a large number of correlated traits in a linkage genome scan. BMC proceedings. 2007;1 Suppl 1:S84.
- 25. Falconer DS, Mackay TFC. Introduction to quantitative genetics. 4th ed. Essex, England: Longman; 1996. xiii, 464 p.
- 26. Ge Y, Dudoit S, Speed TP. Resampling-based multiple testing for microarray data analysis. Test. 2003;12(1):1–77.
- 27. Li B, Leal SM. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008;83(3):311–21. doi: 10.1016/j.ajhg.2008.06.024
- 28. Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008;40(6):695–701. doi: 10.1038/ng.f.136
- 29. Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nature genetics. 2010;42(4):348–54. doi: 10.1038/ng.548
- 30. Pritchard JK. Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet. 2001;69(1):124–37. doi: 10.1086/321272
- 31. Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease-common variant…or not? Hum Mol Genet. 2002;11(20):2417–23.
- 32. Stratton MR, Rahman N. The emerging landscape of breast cancer susceptibility. Nat Genet. 2008;40(1):17–22. doi: 10.1038/ng.2007.53
- 33. Teer JK, Mullikin JC. Exome sequencing: the sweet spot before whole genomes. Hum Mol Genet. 2010;19(R2):R145–51. doi: 10.1093/hmg/ddq333
- 34. Walsh T, King MC. Ten genes for inherited breast cancer. Cancer Cell. 2007;11(2):103–5. doi: 10.1016/j.ccr.2007.01.010
- 35. Morgenthaler S, Thilly WG. A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: a cohort allelic sums test (CAST). Mutat Res. 2007;615(1–2):28–56. doi: 10.1016/j.mrfmmm.2006.09.003
- 36. Madsen BE, Browning SR. A groupwise association test for rare mutations using a weighted sum statistic. PLoS genetics. 2009;5(2):e1000384. doi: 10.1371/journal.pgen.1000384
- 37. Price AL, Kryukov GV, de Bakker PI, Purcell SM, Staples J, Wei LJ, et al. Pooled association tests for rare variants in exon-resequencing studies. American journal of human genetics. 2010;86(6):832–8. doi: 10.1016/j.ajhg.2010.04.005
- 38. Zawistowski M, Gopalakrishnan S, Ding J, Li Y, Grimm S, Zollner S. Extending rare-variant testing strategies: analysis of noncoding sequence and imputed genotypes. American journal of human genetics. 2010;87(5):604–17. doi: 10.1016/j.ajhg.2010.10.012
- 39. Neale BM, Rivas MA, Voight BF, Altshuler D, Devlin B, Orho-Melander M, et al. Testing for an unusual distribution of rare variants. PLoS genetics. 2011;7(3):e1001322. doi: 10.1371/journal.pgen.1001322
- 40. Sha Q, Wang X, Wang X, Zhang S. Detecting association of rare and common variants by testing an optimally weighted combination of variants. Genetic epidemiology. 2012;36(6):561–71. doi: 10.1002/gepi.21649
- 41. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. American journal of human genetics. 2011;89(1):82–93. doi: 10.1016/j.ajhg.2011.05.029
- 42. Sha Q, Zhang S. A rare variant association test based on combinations of single-variant tests. Genetic epidemiology. 2014;38(6):494–501. doi: 10.1002/gepi.21834
- 43. Derkach A, Lawless JF, Sun L. Robust and powerful tests for rare variants using Fisher's method to combine evidence of association from two or more complementary tests. Genetic epidemiology. 2013;37(1):110–21. doi: 10.1002/gepi.21689
- 44. Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. American journal of human genetics. 2012;91(2):224–37. doi: 10.1016/j.ajhg.2012.06.007
- 45. Wang Y, Liu A, Mills JL, Boehnke M, Wilson AF, Bailey-Wilson JE, et al. Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models. Genetic epidemiology. 2015;39(4):259–75. doi: 10.1002/gepi.21895
- 46. Knowler WC, Williams RC, Pettitt DJ, Steinberg AG. Gm3;5,13,14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture. American journal of human genetics. 1988;43(4):520–6.
- 47. Lander ES, Schork NJ. Genetic dissection of complex traits. Science. 1994;265(5181):2037–48.
- 48. Bauchet M, McEvoy B, Pearson LN, Quillen EE, Sarkisian T, Hovhannesyan K, et al. Measuring European population stratification with microarray genotype data. American journal of human genetics. 2007;80(5):948–56. doi: 10.1086/513477
- 49. Chen HS, Zhu X, Zhao H, Zhang S. Qualitative semi-parametric test for genetic associations in case-control designs under structured populations. Annals of human genetics. 2003;67(Pt 3):250–64.
- 50. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nature genetics. 2006;38(8):904–9. doi: 10.1038/ng1847
- 51. Zhang S, Zhu X, Zhao H. On a semiparametric test to detect associations between quantitative traits and candidate genes using unrelated individuals. Genetic epidemiology. 2003;24(1):44–56. doi: 10.1002/gepi.10196
- 52. Zhu X, Zhang S, Zhao H, Cooper RS. Association mapping, using a mixture model for complex traits. Genetic epidemiology. 2002;23(2):181–96. doi: 10.1002/gepi.210
- 53. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55(4):997–1004.
- 54. Devlin B, Roeder K, Wasserman L. Genomic control, a new approach to genetic-based association studies. Theoretical population biology. 2001;60(3):155–66. doi: 10.1006/tpbi.2001.1542
- 55. Reich DE, Goldstein DB. Detecting association in a case-control study while correcting for population stratification. Genetic epidemiology. 2001;20(1):4–16.
- 56. Zhang Z, Ersoz E, Lai CQ, Todhunter RJ, Tiwari HK, Gore MA, et al. Mixed linear model approach adapted for genome-wide association studies. Nature genetics. 2010;42(4):355–60. doi: 10.1038/ng.546