Associating Multivariate Traits with Genetic Variants Using Collapsing and Kernel Methods with Pedigree- or Population-Based Studies

Li-Chu Chien

doi:10.1155/2021/8812282

. 2021 Feb 9;2021:8812282. doi: 10.1155/2021/8812282

Associating Multivariate Traits with Genetic Variants Using Collapsing and Kernel Methods with Pedigree- or Population-Based Studies

Li-Chu Chien ^1,^✉

PMCID: PMC7889379 PMID: 33628328

Abstract

In genetic association analysis, several relevant phenotypes or multivariate traits with different types of components are usually collected to study complex or multifactorial diseases. Over the past few years, jointly testing for association between multivariate traits and multiple genetic variants has become more popular because it can increase statistical power to identify causal genes in pedigree- or population-based studies. However, most of the existing methods mainly focus on testing genetic variants associated with multiple continuous phenotypes. In this investigation, we develop a framework for identifying the pleiotropic effects of genetic variants on multivariate traits by using collapsing and kernel methods with pedigree- or population-structured data. The proposed framework is applicable to the burden test, the kernel test, and the omnibus test for autosomes and the X chromosome. The proposed multivariate trait association methods can accommodate continuous phenotypes or binary phenotypes and further can adjust for covariates. Simulation studies show that the performance of our methods is satisfactory with respect to the empirical type I error rates and power rates in comparison with the existing methods.

1. Introduction

Genome-wide association studies (GWAS) intend to find genetic variants such as single nucleotide polymorphisms (SNPs) associated with common traits or with complex diseases [1, 2]. Association studies, where the correlation relationship between a genetic variant and a trait is evaluated, are helpful for mapping genes influencing complex diseases [3]. In the study of complex diseases, data on several correlated phenotypes or a multivariate phenotype with several components are often collected to get a better understanding of the disease [1, 3, 4]. Multivariate correlated traits are influenced through multiple variants simultaneously. Therefore, by a suitable joint or multivariate analysis framework of multivariate traits, we can not only gain more statistical power to identify pleiotropic effects of genetic variants on multivariate traits [3, 5–12] but also can further understand the genetic architecture of the disease of interest [5, 13]. Thus, recently, the joint analysis of multivariate traits has become popular because it can increase statistical power over analyzing only one trait at a time [1, 4].

Several statistical methods have been developed to identify the association between multivariate traits and a genetic variant [1, 5]. Current multivariate methods can be classified into three groups [1, 2, 5]: regression methods [14–16], variable reduction methods [11, 13, 17, 18], and combining analysis [9, 19–23]. However, many of the existing methods for multivariate association analysis cannot be straightaway extended to rare variant analyses, due to their enormous numbers causing the problems of multiple comparison or multiple testing and their low minor allele frequencies [2, 5, 24]. Moreover, sparsity of data could lead to problems on estimating regression parameters and fitting regression models [2]. Hence, it is necessary for proposing statistical methods for identifying the association between multivariate traits and multiple genetic variants (common and/or rare variants) [5]. In recent years, various statistical techniques have been proposed for this purpose in GWAS [8, 17, 25–27]. Furthermore, several approaches have been extendedly developed for the investigation of rare variants associated with multivariate traits [2, 28–38].

Although these new developments keep many benefits, existing methods have some potential limitations [39]. Most current methods are constructed under some specific assumptions about the effects of genetic variants on multivariate traits [39]. These current approaches suffer a severe loss in power once the model assumptions are violated [26, 39].

In this investigation, we develop the statistical methods for identifying pleiotropic effects of genetic variants on multivariate traits using collapsing and kernel methods with pedigree- or population-structured data. The proposed multivariate trait association method is able to handle binary phenotypes or continuous phenotypes and further can adjust for covariates. Moreover, the proposed multivariate trait association method not only can leverage the dependence on the phenotypes but also can account for the sample relatedness in the pedigree-based or population-based structured data.

The rest of the article is organized as follows. In the materials and methods section, we construct the multivariate effect model using the joint GEE model formulation (JGEE) [40]. We apply the JGEE to pedigree- or population-structured data and introduce a retrospective framework for analyzing multivariate traits in genetic association studies. The proposed framework is applicable to the burden test, the kernel test, and the omnibus test for autosomes and the X chromosome. In the simulation studies, we examine the finite sample size performance of the proposed multivariate association methods and evaluate the comparison results with the existing method, Multi-SKAT [39]. Concluding remarks and future possibilities for continuity are given in the conclusion section and the limitation section.

2. Materials and Methods

2.1. Notations

To describe the proposed multivariate trait association method based on the pedigree- or population-based structured data, we suppose that there are N independent pedigrees and each pedigree has n_i subjects. We assume that the n_i subjects have been sequenced in a genetic region of interest (e.g., a gene) that contains p variants. Let y_ik = (y_i1k, y_i2k,⋯,y_{in_ik})^T be the n_i × 1 phenotype vector for the k^th phenotype of the i^th pedigree. Let y_i = (y_i1, y_i2, ⋯, y_iK) be the (n_i × K) × 1 response vector for the K phenotypes that we are interested in. Let x_im = (x_i1m, x_i2m,⋯,x_{in_im})^T be the n_i × 1 vector for the m^th covariate of the i^th pedigree. Let x_i = (x_i0, x_i1, ⋯, x_iq) be the n_i × (q + 1) covariate matrix for the (q + 1) nongenetic covariates that we want to adjust for. Let α_k = (α_0k, α_1k,⋯,α_qk)^T be the (q + 1) × 1 vector of regression coefficients of the (q + 1) nongenetic covariates with the element α_mk being the effect of the m^th covariate on the k^th trait. Let g_i = (g_i1, g_i2, ⋯, g_ip) be the n_i × p genetic matrix for p genetic variants in a target region of interest where g_il = (g_i1l, g_i2l,⋯,g_{in_il})^T is the n_i × 1 vector for a genetic variant l (g_ijl = 0, 1, or 2 for 0, 1, or 2 copies of the minor allele, respectively). Let β_k = (β_1k, β_2k,⋯,β_pk)^T be the p × 1 vector of regression coefficients of the p genetic variants with the element β_lk being the effect of the l^th genetic variant on the k^th trait.

2.2. Multitrait Regression-Based Tests for Pedigree Data

We let X_i = I_K ⊗ x_i be the (n_i × K) × ((q + 1) × K) covariate matrix and G_i = I_K ⊗ g_i be the (n_i × K) × (p × K) genotype matrix for the i^th pedigree where I_K is an identity matrix of dimension K × K and ⊗ stands for the Kronecker product. According to the generalized linear model [41], we assume that the marginal density of y_ijk is f(y_ijk) = exp[{y_ijkθ_ijk − a(θ_ijk) + b(y_ijk)}/ϕ] with two moments, μ_ijk = E(y_ijk) = ∂a(θ_ijk)/∂θ_ijk and Var(y_ijk) = (∂μ_ijk/∂θ_ijk)ϕ, where ϕ is a scale parameter. Let θ_i = (θ_i1^T, θ_i2^T,⋯,θ_iK^T)^T be the (n_i × K) × 1 vector with the components θ_ik = (θ_i1k, θ_i2k,⋯,θ_{in_ik})^T, k = 1, 2, ⋯, K and η_i = (η_i1^T, η_i2^T,⋯,η_iK^T)^T be the (n_i × K) × 1 vector with the components η_ik = (η_i1k, η_i2k,⋯,η_{in_ik})^T = x_iα_k + g_iβ_k, k = 1, 2, ⋯, K for the k^th trait of the i^th pedigree.

Based on the joint GEE model formulation [40], we construct the multivariate linear model for describing the association relationship between K correlated traits and genetic variants, which is given as follows:

\begin{matrix} μ_{i} = g (η_{i}) and η_{i} = X_{i} α + G_{i} β, \end{matrix}

(1)

where g⁻¹(•) is the inverse function of g(•) and is a response-specific link function [40], μ_i = (μ_i1^T, μ_i2^T,⋯,μ_iK^T)^T is the (n_i × K) × 1 vector of the expected mean of the multivariate traits y_i = (y_i1^T, y_i2^T,⋯,y_iK^T)^T, α = (α₁^T, α₂^T,⋯,α_K^T)^T is the ((q + 1) × K) × 1 vector of regression coefficients of the (q + 1) nongenetic covariates for the K correlated traits, and β = (β₁^T, β₂^T,⋯,β_K^T)^T is the (p × K) × 1 vector of regression coefficients of the p genetic variants for the K correlated traits.

Let R_{n_i}(φ) and R_K(γ) be the n_i × n_i within-in cluster correlation matrix and the K × K multivariate-response cluster correlation matrix, which depend on a vector of parameters φ and γ, respectively. The (n_i × K) × (n_i × K) working (or approximate) covariance matrix of y_i is given by [40].

\begin{matrix} V_{i} = A_{i}^{1 / 2} (R_{K} (γ) \otimes R_{n_{i}} (φ)) A_{i}^{1 / 2} ϕ, \end{matrix}

(2)

where A_i = diag(A_i1, A_i2, ⋯, A_iK) is a (n_i × K) × (n_i × K) block diagonal matrix with the components A_ik = diag(∂μ_i1k/∂θ_i1k, ∂μ_i2k/∂θ_i2k, ⋯, ∂μ_{in_ik}/∂θ_{in_ik}), k = 1, 2, ⋯, K being the n_i × n_i diagonal matrices. According to equation (1), under the null hypothesis of no association between genotypes and phenotypes, we propose the multivariate association methods including the homogeneous kernel statistic (HoK), the heterogeneous kernel statistic (HeK), and burden test (BT). Moreover, we propose the homogeneous omnibus test (HoO) and heterogeneous omnibus test (HeO) by combining the HoK with the BT and by combining the HeK with the BT, respectively.

2.2.1. Kernel Statistic

We let H be a p × p correlation matrix of genotype scores with element H_ll′ for markers l and l′. Let m_l denote the minor allele frequency (MAF) of the l^th marker. Let $S_{i} = {\hat{V}}_{i}^{- 1} (y_{i} - {\hat{μ}}_{i}) = {(S_{i 1}^{T}, S_{i 2}^{T}, \dots, S_{i K}^{T})}^{T}$ be the ((n_i × K) × 1) vector of the standard residuals with components S_ik = (S_i1k, S_i2k,⋯,S_{in_ik})^T, k = 1, 2, ⋯, K, where ${\hat{V}}_{i}^{- 1}$ is the inverse matrix of ${\hat{V}}_{i}$ . Here, ${\hat{V}}_{i}$ and ${\hat{μ}}_{i}$ are the estimates of V_i and μ_i. Here and henceforth, all estimates are calculated based on the null hypothesis of the genetic effects β equal to zero. All unknown parameters and the working within-in and multivariate-response cluster correlation matrices are estimated by the R package JGEE [42].

(1) Homogeneous Kernel Statistic. We suppose that w_l is a marker-specific weight of the l^th variant and assume that the genetic effects on the K different phenotypes are homogeneous (i.e., β₁ = β₂ = ⋯ = β_K). Based on the JGEE model with the genotype as random variables considered, we propose the homogeneous quadratic (kernel) association statistic (HoK) as follows:

\begin{matrix} κ_{Ho} = \sum_{l = 1}^{p} {[w_{l} \sum_{k = 1}^{K} \sum_{i = 1}^{N} g_{i l}^{T} {\overset{\land}{Δ}}_{i k} {\overset{\land}{A}}_{i k} S_{i k}]}^{2} = \sum_{l = 1}^{p} {[w_{l} \sum_{k = 1}^{K} Z_{l k}]}^{2} = \sum_{l = 1}^{p} {[w_{l} Z_{l}]}^{2} = \sum_{l = 1}^{p} {\tilde{Z}}_{l}^{2}, \end{matrix}

(3)

where ${\tilde{Z}}_{l} = w_{l} Z_{l}$ , Z_l = ∑_k=1^KZ_lk, $Z_{l k} = \sum_{i = 1}^{N} g_{i l}^{T} {\hat{Δ}}_{i k} {\hat{A}}_{i k} S_{i k}$ _, ${\hat{A}}_{i k}$ is the estimate of A_ik, and ${\hat{Δ}}_{i k}$ is the estimate of Δ_ik = diag(∂θ_i1k/∂η_i1k, ∂θ_i2k/∂η_i2k, ⋯, ∂θ_{in_ik}/∂η_{in_ik}) that is a n_i × n_i diagonal matrix for the k^th phenotype of the i^th pedigree. The null distribution of κ_Ho asymptotically follows a mixture chi-square distribution ∑_l=1^pλ_lχ_l,1², where χ_l,1²s are independent random variables following a chi-square distribution with one degree of freedom and (λ₁, λ₂, ⋯, λ_p) are nonzero eigenvalues of the null covariate matrix of ${Cov}_{0} ({\tilde{Z}}_{l}, {\tilde{Z}}_{l^{'}}) = 2 C_{Ho} w_{l} w_{l^{'}} H_{l l^{'}} \sqrt{m_{l} (1 - m_{l}) m_{l^{'}} (1 - m_{l^{'}})}$ where $C_{Ho} = \sum_{i = 1}^{N} [(\sum_{k = 1}^{K} S_{i k}^{T} {\hat{A}}_{i k} {\hat{Δ}}_{i k}) Ω_{i} (\sum_{k = 1}^{K} {\hat{Δ}}_{i k} {\hat{A}}_{i k} S_{i k})]$ and Ω_i is a n_i × n_i matrix of genetic correlations for all n_i individuals in the i^th pedigree, which has the same definition given by Schaid et al. [43] and can be calculated by the R package kinship2 [44]. When the genetic relationship between subjects j and j′ in the i^th pedigree is unknown, the elements of the genetic correlation Ω_i can be estimated through genomic data [43, 45], and its estimate is given by [43]

\begin{matrix} {\hat{Ω}}_{i} = \frac{1}{p} \sum_{l = 1}^{p} \frac{(g_{i j l} - 2 m_{l}) (g_{i j^{'} l} - 2 m_{l})}{2 m_{l} (1 - m_{l})} . \end{matrix}

(4)

(2) Heterogeneous Kernel Statistic. We assume that the genetic effects on the K different phenotypes are heterogeneous (i.e., β₁ ≠ β₂≠⋯≠β_K). The heterogeneous quadratic (kernel) association statistic (HeK) is defined by

\begin{matrix} κ_{He} = \sum_{l = 1}^{p} \sum_{k = 1}^{K} {[w_{l k} \sum_{i = 1}^{N} g_{i l}^{T} {\overset{\land}{Δ}}_{i k} {\overset{\land}{A}}_{i k} S_{i k}]}^{2} = \sum_{l = 1}^{p} \sum_{k = 1}^{K} {[w_{l k} Z_{l k}]}^{2} = \sum_{l = 1}^{p} \sum_{k = 1}^{K} {\tilde{Z}}_{l k}^{2}, \end{matrix}

(5)

where ${\tilde{Z}}_{l k} = w_{l k} Z_{l k}$ and w_lk is a marker-specific weight of the l^th variant of the k^th trait. The null distribution of κ_He asymptotically follows a mixture chi-square distribution ∑_l=1^{(p × K)}λ_lχ_l,1², where χ_l,1²s are independent random variables following a chi-square distribution with one degree of freedom, and (λ₁, λ₂, ⋯, λ_{(p × K)}) are nonzero eigenvalues of the null covariate matrix of ${Cov}_{0} ({\tilde{Z}}_{l k}, {\tilde{Z}}_{l^{'} k^{'}}) = 2 C_{He} w_{l k} w_{l^{'} k^{'}} H_{l l^{'}} \sqrt{m_{l} (1 - m_{l}) m_{l^{'}} (1 - m_{l^{'}})}$ , where $C_{He} = \sum_{i = 1}^{N} [S_{i k}^{T} {\hat{A}}_{i k} {\hat{Δ}}_{i k} Ω_{i} {\hat{Δ}}_{i k^{'}} {\hat{A}}_{i k^{'}} S_{i k^{'}}]$ .

Theoretical p values of κ_Ho and κ_He are approximately calculated by Kuonen's saddlepoint method [46] and obtained by the R package pchisqsum. A theory for the derivation of the HoK test (κ_Ho) and the HeK test (κ_He) is shown in Appendix S1.

2.2.2. Burden Test

We let ${\tilde{g}}_{i}^{T} = \sum_{l = 1}^{p} w_{l} g_{i l}^{T}$ be a weighted average of genotype scores for the i^th pedigree. On the basis of the HoK test (κ_Ho) and the HeK test (κ_He) in equations (3) and (5) with the same marker-specific weight of the l^th variant for each trait k (i.e., w_l = w_lk, k = 1, 2, ⋯, K), we propose the burden test (BT) as follows:

\begin{matrix} BT = \frac{{[\sum_{i = 1}^{N} (\sum_{k = 1}^{K} S_{i k}^{T} {\overset{\land}{A}}_{i k} {\overset{\land}{Δ}}_{i k}) {\tilde{g}}_{i}]}^{2}}{\sum_{i = 1}^{N} [(\sum_{k = 1}^{K} S_{i k}^{T} {\hat{A}}_{i k} {\hat{Δ}}_{i k}) {Cov}_{0} ({\tilde{g}}_{i}) (\sum_{k = 1}^{K} {\hat{Δ}}_{i k} {\hat{A}}_{i k} S_{i k})]}, \end{matrix}

(6)

where the null covariance matrix of ${\tilde{g}}_{i}$ is given by

\begin{matrix} {Cov}_{0} ({\tilde{g}}_{i}) = {Cov}_{0} (\sum_{l = 1}^{p} w_{l} g_{i l}) = \sum_{l = 1}^{p} w_{l}^{2} {Cov}_{0} (g_{i l}, g_{i l}) + 2 \sum_{l = 1}^{p} \sum_{l^{'} = l + 1}^{p} w_{l} w_{l^{'}} {Cov}_{0} (g_{i l}, g_{i l^{'}}) = Ω_{i} \sum_{l = 1}^{p} \sum_{l^{'} = 1}^{p} 2 w_{l} w_{l^{'}} H_{l l^{'}} \sqrt{m_{l} (1 - m_{l}) m_{l^{'}} (1 - m_{l^{'}})} . \end{matrix}

(7)

Then,

\begin{matrix} BT = \frac{{[\sum_{i = 1}^{N} (\sum_{k = 1}^{K} S_{i k}^{T} {\overset{\land}{A}}_{i k} {\overset{\land}{Δ}}_{i k}) {\tilde{g}}_{i}]}^{2}}{2 \sum_{l = 1}^{p} \sum_{l^{'} = 1}^{p} w_{l} w_{l^{'}} H_{l l^{'}} \sqrt{m_{l} (1 - m_{l}) m_{l^{'}} (1 - m_{l^{'}})} C_{Ho}} . \end{matrix}

(8)

The null distribution of BT asymptotically follows a chi-square distribution with one degree of freedom.

2.2.3. Omnibus Test

Let p_Ho, p_He, and p_BT denote the p values obtained by the HoK, HeK, and BT statistics. Based on the idea of the p value combination method through the Cauchy distribution [47–49], we propose the homogeneous omnibus test (HoO) and heterogeneous omnibus test (HeO).

(1) Homogeneous Omnibus Test. Combining the p_Ho with the p_BT, we construct the homogeneous omnibus test (HoO) as follows:

\begin{matrix} O_{Ho} = - \frac{1}{2} [F_{C}^{- 1} (p_{Ho}) + F_{C}^{- 1} (p_{BT})], \end{matrix}

(9)

where F_C⁻¹ stands for the inverse cumulative distribution function of the standard Cauchy distribution.

(2) Heterogeneous Omnibus Test. Combining the p_He with the p_BT, we construct the heterogeneous omnibus test (HeO) as follows:

\begin{matrix} O_{He} = - \frac{1}{2} [F_{C}^{- 1} (p_{He}) + F_{C}^{- 1} (p_{BT})] . \end{matrix}

(10)

The null distributions of the O_Ho test and the O_He test asymptotically follow a standard Cauchy distribution [47–49]. The p values of the O_Ho test and the O_He test are calculated by the R package RNOmni [50].

The kernel statistic, the burden test, and the omnibus test are also applicable to the X chromosome. Additional technical information for extensions to the X chromosome is shown in Appendix S2.

3. Simulation Studies

We conduct the numerical simulation studies to assess the finite sample performance of the proposed methods and evaluate the comparison results with two existing methods, the minimum p value SKAT statistic (mPK), and the minimum p value burden statistic (mPB) [39]. The two existing methods are implemented by the R package Multi-SKAT [39]. Based on the similar simulation set-up as those usually considered from existing genetic association tests [39, 43, 51], we investigate the effect of the proposed methods, HoK, HeK, BT, HoO, and HeO, for identifying genetic variants that are associated with multiple traits. We simultaneously generate 10,000 European-like (EUR) and 10,000 admixed African American-like (AA) haplotypes of length 200 kb using a calibrated human demographic model through the COSI software [51, 52]. A 3 kb region is randomly selected in our numerical simulations. We generate a total of 10,000 databases for each simulation scenario in our studies.

3.1. Type I Error Rate and Power Simulations

In the heterogeneous population with nuclear family data considered, continuous and binary phenotypes for trait k for individual j in the i^th family are generated from the multivariate linear model in equation (1) with K = 2 and n_i = 3. More precisely, continuous and binary phenotypes are generated by the following linear and logit models, respectively:

\begin{matrix} y_{i} = X_{i} α + G_{i} β + ε_{i}, \end{matrix}

(11)

\begin{matrix} logit (P (y_{i} = 1)) = X_{i} α + G_{i} β, \end{matrix}

(12)

where y_i = (y_i1^T, y_i2^T)^T, X_i = I₂ ⊗ x_i, G_i = I₂ ⊗ g_i, α = (α₁^T, α₂^T)^T, β = (β₁^T, β₂^T)^T, and ε_i = (ε_i11, ε_i21, ε_i31, ε_i12, ε_i22, ε_i32)^T. Here, the elements x_i0 = (x_i10, x_i20, x_i30)^T of the covariance matrix x_i = (x_i0, x_i1, x_i2) is a 3 × 1 vector of all ones. The elements x_i1 = (x_i11, x_i21, x_i31)^T of x_i are independently generated with an equal probability of being 0 or 1. The elements x_i2 = (x_i12, x_i22, x_i32)^T of x_i are generated from a multivariate normal distribution with a mean of 0.5 and a covariance matrix with diagonal entries of 1 and all off-diagonal entries of 0.1. The regression coefficients of the covariate matrix x_i for the k^th correlated trait are given by α_k = (α_0k, α_1k, α_2k)^T = (0.01, 0.1, 0.1)^T and α_k = (α_0k, α_1k, α_2k)^T = (−1.4, 0.1, 0.1)^T, respectively, for continuous traits and binary traits for k = 1, 2.

For continuous traits, the error terms ε_i = (ε_i11, ε_i21, ε_i31, ε_i12, ε_i22, ε_i32)^T in equation (11) follow a multivariate normal distribution having a mean of zero, a within-in cluster correlation matrix (i.e., Cor(ε_ijk, ε_ij′k)) with diagonal entries of 1 and all off-diagonal entries of 0.2 and a subject-across-response correlation matrix (i.e., Cor(ε_ijk, ε_ij′k′)) with diagonal entries of 0.3 and all off-diagonal entries of 0.1. Similarly, binary traits y_i in equation (12) are generated with the same within-in cluster correlation matrix (i.e., Cor(y_ijk, y_ij′k)) and the same subject-across-response correlation matrix (i.e., Cor(y_ijk, y_ij′k′)) as the continuous traits y_i in equation (11). These correlated phenotypes are generated by the R package BinNor [53].

For type I error simulations, the regression coefficients of genetic variants, β = (β₁^T, β₂^T)^T, in equations (11) and (12) are equal to zero under the null hypothesis. For power simulations, under the alternative hypothesis, we simulate that 35% of low variants with the MAF < 0.03 are causal. For each setting, either all causal SNPs have a positive effect, or 80% of causal SNPs are positive, and 20% of causal SNPs are negative. The regression coefficients of genetic variants, β = (β₁^T, β₂^T)^T, are set by 0.095 × ∣log₁₀(m_l)∣ or 0.095 × log₁₀(m_l) corresponding to the risk or protective variant l, l = 1, 2, ⋯, p [51]. Under the assumption that the genetic effects on the two different phenotypes are heterogeneous (i.e., β₁ ≠ β₂), the genetic effects β₁ for the first traits y_i1 are set as described above, while the genetic effects β₂ for the second traits y_i2 are set by zero. On the other hand, under the assumption that the genetic effects on the two different phenotypes are homogeneous (i.e., β₁ = β₂), the genetic effects β₁ and β₂ for the first and second traits have the same settings as described above.

We simulate 1,400 nuclear families with 800 nuclear families from the European samples and 600 nuclear families from African-American samples. The marker-specific weight w_l for variant l is considered as the beta density function w_l = Beta(m_l, λ₁, λ₂) with shape parameters λ₁ > 0 and λ₂ > 0 [51]. To study the effect of the marker-specific weight w_l of variant l on the phenotypes, we consider the unweighted marker-specific weight with w_l = Beta(m_l, 1, 1) = 1 and the weighted marker-specific weight with w_l = Beta(m_l, 1, 25) [51]. The empirical type I error rates based on fifty thousand replicates and the empirical power rates based on two thousand replicates are reported for all simulation results. The “exchangeable” and “unstructured” structures are considered for the working within-cluster and multivariate-response correlation matrices for the proposed methods, HoK, HeK, and BT, respectively.

4. Results

4.1. Empirical Type I Error Rates

Table 1 reports the results of a simulation comparison on empirical type I error rates when the phenotypes are considered to be continuous. Table 1 displays that the proposed methods, HoK, HoO, HeK, HeO, and BT, well control the empirical type I error rates regardless of the weight of the marker-specific weight. Similarly, the existing methods, mPK and mPB, have good performance on controlling the empirical type I error rates. Our simulation results show that the seven competing methods, HoK, HoO, HeK, HeO, BT, mPK, and mPB, reasonably control the empirical type I error rates for autosome analyses with continuous traits. The seven competing approaches display similar performance in terms of the empirical type I error rates for the X chromosome analyses with continuous traits (Appendix S3: Table S1).

Table 1.

Empirical type I errors of the seven competing methods with continuous traits.

Marker-specific weight (w_l)	Nominal level	Working correlation	Method
Marker-specific weight (w_l)	Nominal level	Working correlation	HoK³	HoO	HeK	HeO	BT	mPK⁴	mPB
Unweighted marker-specific weight¹	0.05	U/U²	0.04876	0.04960	0.05036	0.05228	0.04914	0.04352	0.04692
		E/E	0.04866	0.04994	0.05016	0.05216	0.04914	0.04352	0.04692
	0.01	U/U	0.00918	0.01012	0.01016	0.01030	0.01034	0.00854	0.01036
		E/E	0.00924	0.00994	0.01008	0.01022	0.01028	0.00854	0.01036
	0.001	U/U	0.00078	0.00082	0.00086	0.00070	0.00084	0.00084	0.00088
		E/E	0.00080	0.00078	0.00084	0.00070	0.00082	0.00084	0.00088
	0.0001	U/U	0.00008	0.00002	0.00006	0.00008	0.00008	0.00006	0.00014
		E/E	0.00006	0.00002	0.00006	0.00008	0.00008	0.00006	0.00014
Weighted marker-specific weight	0.05	U/U	0.05030	0.04998	0.05158	0.05134	0.04696	0.04604	0.04536
		E/E	0.05054	0.05010	0.05176	0.05122	0.04714	0.04604	0.04536
	0.01	U/U	0.00992	0.00942	0.01080	0.00972	0.00888	0.00978	0.01008
		E/E	0.00992	0.00944	0.01088	0.00978	0.00886	0.00978	0.01008
	0.001	U/U	0.00078	0.00086	0.00126	0.00098	0.00082	0.00124	0.00134
		E/E	0.00076	0.00088	0.00122	0.00102	0.00080	0.00124	0.00134
	0.0001	U/U	0.00006	0.00008	0.00006	0.00006	0.00010	0.00002	0.00010
		E/E	0.00006	0.00008	0.00008	0.00006	0.00010	0.00002	0.00010

Open in a new tab

¹The unweighted marker-specific weight is given by w_l = Beta(m_l, 1, 1) = 1; the weighted marker-specific weight is given by w_l = Beta(m_l, 1, 25). ²U/U represents the structures of the working within-cluster and multivariate-response correlation matrices considered by the unstructured structures; E/E represents the structures of the working within-cluster and multivariate-response correlation matrices considered by the exchangeable structures. ³HoK, HoO, HeK, HeO, and BT are our proposed methods. ⁴mPK and mPB are executed by the R package Multi-SKAT [39].

Table 2 reports the empirical type I error rates based on the proposed methods, HoK, HeK, BT, HoO, and HeO, for the binary data. The two existing methods, mPK and mPB, aren't included for comparison. This reason is that implementing the two existing methods, mPK and mPB, via the R package Multi-SKAT [39], the MPMM (multiple phenotype mixed model) function in the R package PHENIX [54–56] is a necessary tool for this process. However, the MPMM function is suitable for the continuous phenotypes [56] or is suitable for the binary phenotypes with the condition that the number of cases is sufficiently large [39]. In other words, in some sense, the two existing methods, mPK and mPB, are limited to continuous phenotypes [39].

Table 2.

Empirical type I errors of the five competing methods with binary traits.

Marker-specific weight (w_l)	Nominal level	Working correlation	Method
Marker-specific weight (w_l)	Nominal level	Working correlation	HoK³	HoO	HeK	HeO	BT
Unweighted marker-specific weight¹	0.05	U/U²	0.04944	0.05154	0.05086	0.05280	0.04952
		E/E	0.04930	0.05144	0.05068	0.05318	0.04946
	0.01	U/U	0.00974	0.00994	0.00982	0.01026	0.01000
		E/E	0.00974	0.00998	0.00984	0.01028	0.00998
	0.001	U/U	0.00068	0.00084	0.00100	0.00098	0.00106
		E/E	0.00066	0.00084	0.00102	0.00094	0.00104
	0.0001	U/U	0.00008	0.00002	0.00012	0.00010	0.00000
		E/E	0.00008	0.00002	0.00012	0.00010	0.00002
Weighted marker-specific weight	0.05	U/U	0.05170	0.04900	0.05256	0.04922	0.04576
		E/E	0.05168	0.04920	0.05232	0.04930	0.04556
	0.01	U/U	0.01028	0.00976	0.00996	0.00972	0.00886
		E/E	0.01024	0.00982	0.00986	0.00976	0.00884
	0.001	U/U	0.00110	0.00080	0.00096	0.00090	0.00088
		E/E	0.00112	0.00076	0.00096	0.00088	0.00090
	0.0001	U/U	0.00004	0.00008	0.00010	0.00012	0.00006
		E/E	0.00006	0.00008	0.00010	0.00012	0.00008

Open in a new tab

Table 2 shows that the proposed methods appropriately control the type I error rates when the marker-specific weight is considered for w_l = Beta(m_l, 1, 1) or w_l = Beta(m_l, 1, 25) for variant l for binary traits. On the other hand, the empirical type I error rates of the proposed methods for X chromosome analyses with binary traits are depicted in Table S2 in Appendix S3. These empirical type I error rates show similar results as that for autosome analyses.

In summary, our simulation results show that the proposed multivariate trait association methods, HoK, HoO, HeK, HeO, and BT, have reasonable control of type I error rates for continuous traits or binary traits whether the marker is X chromosomal or autosomal. On the other hand, the existing methods, mPK and mPB, yield well-controlled type I error rates for the autosome analyses or the X chromosome analyses with continuous traits (Table 1 or Table S1), regardless of the weight of the marker-specific weight.

4.2. Empirical Power

Figure 1 exhibits the comparison results of the empirical power rates for the autosome analyses with continuous traits, when the working within-cluster and multivariate-response correlation matrices of the proposed methods, HoK, HeK, and BT, are considered to be exchangeable. As expected, the empirical power rates of the seven competing methods with a weighted marker-specific weight of w_l = Beta(m_l, 1, 25) are higher than that with an unweighted marker-specific weight of w_l = Beta(m_l, 1, 1) = 1. The heterogeneous kernel statistic (HeK) has slightly greater empirical power rates than other methods, when the genetic effects on the different phenotypes are heterogeneous (i.e., β₁ ≠ β₂), and causal SNPs have positive effects or negative effects on phenotypes. On the other hand, the existing method, mPB, has bigger empirical power rates, when the genetic effects on the different phenotypes are heterogeneous (i.e., β₁ ≠ β₂), and all causal SNPs have a positive association on phenotypes. Moreover, the empirical power rates of the homogeneous omnibus test (HoO) are larger than that of the other six competing methods, when the genetic effects on the different phenotypes are homogeneous (i.e., β₁ = β₂). Evidently, the seven competing methods have their respective advantages in identifying the association between genetic effects and multiple continuous traits for autosome analyses.

Power comparisons of the seven competing methods with continuous traits for each scenario at the nominal level of 0.001. (a) Unweighted marker-specific weight: w_l = Beta(m_l, 1, 1) = 1. (b) Weighted marker-specific weight: w_l = Beta(m_l, 1, 25).

Similar empirical power rates are obtained from the working within-cluster and multivariate-response correlation matrices of the proposed methods, HoK, HeK, and BT, considered to be unstructured. Hence, these empirical power rates are not shown in order to save space. On the other hand, the seven competing approaches display a similar performance in testing for the X chromosome analyses with continuous traits (Appendix S3: Figure S1).

Figure 2 exhibits the comparison results of empirical power rates for the autosome analyses with binary traits when the working within-cluster and multivariate-response correlation matrices of the proposed methods, HoK, HeK, and BT, are considered to be exchangeable. As a similar reason for investigating the empirical type I error rates with binary traits, the two existing methods, mPK and mPB, aren't included for power comparison.

Power comparisons of the five competing methods with binary traits for each scenario at the nominal level of 0.001. (a) Unweighted marker-specific weight: w_l = Beta(m_l, 1, 1) = 1. (b) Weighted marker-specific weight: w_l = Beta(m_l, 1, 25).

Figure 2 shows that the heterogeneous kernel statistic (HeK) and the heterogeneous omnibus test (HeO) outperform over other methods in terms of the empirical power rates, when the genetic effects on the different phenotypes are heterogeneous (i.e., β₁ ≠ β₂). On the other hand, the empirical power rates of the homogeneous omnibus test (HoO) are bigger than that of the other competing methods, when the genetic effects on the different phenotypes are homogeneous (i.e., β₁ = β₂). As expected, in general, the heterogeneous kernel statistic (HeK) is more powerful than the homogeneous kernel statistic (HoK), when the genetic effects on the different phenotypes are heterogeneous (i.e., β₁ ≠ β₂). On the other hand, the homogeneous kernel statistic (HoK) is more powerful than the heterogeneous kernel statistic (HeK), when the genetic effects on the different phenotypes are homogeneous (i.e., β₁ = β₂). In a word, the proposed methods, HoK, HoO, HeK, HeO, and BT, have their respective merits in examining the association between genetic effects and multiple binary traits for autosome analyses.

Similarly, when the working within-cluster and multivariate-response correlation matrices of the proposed methods, HoK, HeK, and BT, are considered to be unstructured, the empirical power rates have similar results and thus they are omitted. On the other hand, the empirical power rates of the proposed methods for X chromosome analyses with binary traits are presented in Figure S2 in Appendix S3. These empirical power rates show similar results as that discussed in Figure 2.

In summary, the seven competing methods, HoK, HoO, HeK, HeO, BT, mPK, and mPB, have their respective merits in diagnosing whether genetic effects are associated with multiple continuous traits for autosome analyses or the X chromosome analyses. Similarly, the proposed methods, HoK, HoO, HeK, HeO, and BT, have their respective advantages in examining whether there are associations between genetic effects and multiple binary traits for autosome analyses or the X chromosome analyses.

To furthermore examine the performance of the proposed methods, additional simulation studies for continuous traits and binary traits are presented in Appendix S4 and Appendix S5 with higher correlations of phenotypes and higher dimensions of phenotypes considered, respectively. In general, these competing methods based on higher correlations of phenotypes or higher dimensions of phenotypes can provide a bigger empirical power rate for the analysis of continuous traits or binary traits. However, we note that these competing methods based on higher correlations of phenotypes or higher dimensions of phenotypes more easily have empirical type I error rate inflation at a smaller nominal level, especially for binary data analysis (Appendix S5: Tables S5-S6 and Appendix S6: Table S7), in comparison with these methods based on lower correlations of phenotypes or lower dimensions of phenotypes. A detailed discussion of these additional simulation results is given in Appendixes S4 and S5.

However, we note that the proposed methods have a high computational cost, especially for binary data. Under our simulation setting and framework, we carry out a single simulated data set by using a computer based on one CPU core at 2.1 GHz. The average computational times of the homogeneous and heterogeneous tests with a weighted marker-specific weight w_l = Beta(m_l, 1, 25) under the alternative hypothesis for continuous data are 0.83 and 0.91 minutes, respectively, while that for binary data are 4.77 and 4.80 minutes, respectively. Therefore, in the current version, such a framework algorithm implementation is unsatisfactory for analyzing a large-scale high-dimensional data set in practice.

5. Conclusion

In this investigation, we develop a retrospective framework for identifying the pleiotropic effects of genetic variants on multivariate traits by using collapsing and kernel methods with pedigree- or population-structured data. The proposed framework, corresponding to the burden test, the kernel test, and the omnibus test, provides a sound basis for genetic association analyses for autosomes and the X chromosome. The proposed multivariate trait association methods based on the JGEE model can flexibly accommodate continuous phenotypes or binary phenotypes and further can adjust for covariates.

One critical advantage of the proposed methods is that the homogeneous kernel statistic (HoK), the heterogeneous kernel statistic (HeK), and the burden test (BT) retain all of the benefits of the retrospective tests proposed by Schaid et al. [43] who treated the genotype data as random variables by conditioning the phenotypes as constants. On the other hand, the homogeneous omnibus test (HoO) and the heterogeneous kernel statistic (HeO) keep the advantages of the Cauchy combination tests proposed by Liu and Xie [48] who showed that the Cauchy combination tests are robust to model misspecification and robustly protect the type I error rates [49].

Another important benefit of the proposed method is that the HoK test, the HeK test, and the BT test keep the benefits of the JGEE model that validly account for complex correlations between subjects within the cluster (within-cluster correlations) and between different phenotypes from the same subjects (multivariate-response correlations). Moreover, the proposed test statistics, HoK, HeK, and BT, based on the JGEE model can efficaciously account for covariate adjustment whether the phenotypes are continuous or binary.

Our simulation studies show that an unweighted marker-specific weight w_l = Beta(m_l, 1, 1) = 1 and an exchangeable structure of the working within-cluster and multivariate-response correlations are recommended for the practical data analysis if the data cannot sufficiently provide valid information for estimating the structures of the working within-cluster and multivariate-response correlations before the start of the data analysis. Moreover, the homogeneous kernel statistic (HoK) is more robust than the heterogeneous statistic (HeK) in controlling the empirical type I errors, because the null distribution of the HeK statistic asymptotically follows a mixture chi-square distribution with a larger degree of freedom, in comparison with the null distribution of the HoK statistic. However, the HeK statistic is more powerful than the HoK statistic when the genetic effects on the different phenotypes are heterogeneous.

On the other hand, our simulation results show that for the autosome analyses or the X chromosome analyses with continuous traits, the seven competing methods, HoK, HoO, HeK, HeO, BT, mPK, and mPB, show good performance with well-controlled type I errors, while the seven competing methods have their respective merits for identifying the association between the genetic effects and multiple continuous traits. In addition, our simulation results show that for the autosome analyses or the X chromosome analyses with binary traits, the proposed methods, HoK, HoO, HeK, HeO, and BT, can control empirical type I errors with lower correlations of phenotypes or with lower dimensions of phenotypes (Table 2 and Table S2), while these proposed methods have their respective advantages for identifying the genetic variants associated with multiple binary traits. However, we observe that the proposed methods, HoK, HoO, HeK, HeO, and BT, with higher correlations of phenotypes or with higher dimensions of phenotypes, more easily have the infection of empirical type I errors at a smaller nominal level (Appendix S5: Tables S5-S6 and Appendix S6: Table S7), although these method under such situations have higher empirical power rates.

6. Limitation

The proposed multivariate trait association methods have their limitations. First, these proposed methods cannot simultaneously include the continuous traits and binary traits in analysis. Thus, future studies are needed to extend the idea of the proposed multivariate trait association methods for simultaneously considering continuous traits and binary traits in analysis. Second, the multivariate trait association methods, based on higher correlations of phenotypes or higher dimensions of phenotypes, easily suffer from the problem of the inflated type I errors, especially when the binary traits are considered (Appendix S5: Tables S5-S6 and Appendix S6: Table S7). Although the JGEE model provides an efficient algorithm for estimating the structure of the working within-cluster and multivariate-response correlations, a large-scale pedigree study always suffers from a more complex and high-dimensional structure of the within-cluster and multivariate-response correlations in pedigree database analysis. Hence, in the future, a more effective algorithm for estimating the complicated and high-dimensional (or higher correlational) structure of the working within-cluster and multivariate-response correlations is necessary to be proposed, especially when the analysis focuses on the binary traits. Third, in comparison with the null distribution of the homogeneous kernel statistic, the null distribution of the heterogeneous kernel statistic follows a larger degree of freedom test, which easily causes such a heterogeneous test to suffer from the problem of the type I error inflation. Therefore, overcoming the problem of the type I error inflation from the heterogeneous test is an essential part of the future work. Fourth, the proposed methods, which have a high computational cost especially for binary data, are inappropriate for analyzing large-scale high-dimensional data in practice. Thus, a more effective algorithm for reducing computational cost is needed to be proposed in further research. Moreover, the software of the proposed methods is computationally inconvenient and particularly inadequate for the mass GWAS data in practice. Therefore, the software of the proposed methods, which is convenient to be used, is a further work in the future. Fifth, our current work focuses mainly on the low- and common-frequency variants. Extension of the proposed methods to the rare variants deserves further works.

Acknowledgments

The authors would like to thank the editor and the referees for their constructive comments, which significantly improve the presentation of the article. This work is supported by grant MOST 108-2118-M-037-001-MY2 of Ministry of Science and Technology, Taiwan, R.O.C.

Data Availability

The data supporting the findings of this study are available within the article and its supplementary materials.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Supplementary Materials

Appendix S1: the null distribution of the kernel statistic. Appendix S2: extension to the X chromosome. Appendix S3: simulation results based on the X chromosome. Appendix S4: additional simulation studies for continuous traits. Appendix S5: additional simulation studies for binary traits. Appendix S6: limitation

Click here for additional data file.^{(1.7MB, pdf)}

References

1.Zhu H., Zhang S., Sha Q. A novel method to test associations between a weighted combination of phenotypes and genetic variants. PLoS One. 2018;13(1, article e0190788) doi: 10.1371/journal.pone.0190788. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Lee S., Won S., Kim Y. J., et al. Rare variant association test with multiple phenotypes. Genetic Epidemiology. 2017;41(3):198–209. doi: 10.1002/gepi.22021. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Yang Q., Wang Y. Methods for analyzing multivariate phenotypes in genetic association studies. Journal of probability and statistics. 2012;2012:13. doi: 10.1155/2012/652569.652569 [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Liang X., Sha Q., Zhang S. Joint analysis of multiple phenotypes in association studies using allele-based clustering approach for non-normal distributions. Annals of Human Genetics. 2018;82(6):389–395. doi: 10.1111/ahg.12260. [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Wang Z., Sha Q., Fang S., Zhang K., Zhang S. Testing an optimally weighted combination of common and/or rare variants with multiple traits. PLoS One. 2018;13(7, article e0201186) doi: 10.1371/journal.pone.0201186. [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Solovieff N., Cotsapas C., Lee P. H., Purcell S. M., Smoller J. W. Pleiotropy in complex traits: challenges and strategies. Nature Reviews Genetics. 2013;14(7):483–495. doi: 10.1038/nrg3461. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Stephens M. A unified framework for association analysis with multiple related phenotypes. PLoS One. 2013;8(7, article e65245) doi: 10.1371/journal.pone.0065245. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Zhou X., Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature Methods. 2014;11(4):407–409. doi: 10.1038/nmeth.2848. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Liang X., Wang Z., Sha Q., Zhang S. An adaptive Fisher’s combination method for joint analysis of multiple phenotypes in association studies. Scientific Reports. 2016;6(1, article 34323) doi: 10.1038/srep34323. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Wang Z., Wang X., Sha Q., Zhang S. Joint analysis of multiple traits in rare variant association studies. Annals of Human Genetics. 2016;80(3):162–171. doi: 10.1111/ahg.12149. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Wang Z., Sha Q., Zhang Z. Joint analysis of multiple traits using "optimal" maximum heritability test. PLoS One. 2016;11(3, article e0150975) doi: 10.1371/journal.pone.0150975. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Zhu H., Zhang S., Sha Q. Power comparisons of methods for joint association analysis of multiple phenotypes. Human heredity. 2015;80(3):144–152. doi: 10.1159/000446239. [DOI] [PubMed] [Google Scholar]
13.Aschard H., Vilhjálmsson B., Greliche N., Morange P., Trégouët D., Kraft P. Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. The American Journal of Human Genetics. 2014;94(5):662–676. doi: 10.1016/j.ajhg.2014.03.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Korte A., Vilhjálmsson B. J., Segura V., Platt A., Long Q., Nordborg M. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nature Genetics. 2012;44(9):1066–1071. doi: 10.1038/ng.2376. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.O’Reilly P. F., Hoggart C. J., Pomyen Y., et al. MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PLoS One. 2012;7(5, article e34861) doi: 10.1371/journal.pone.0034861. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Zhang Y., Xu Z., Shen X., Pan W., Alzheimer's Disease Neuroimaging Initiative Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. NeuroImage. 2014;96:309–325. doi: 10.1016/j.neuroimage.2014.03.061. [DOI] [PMC free article] [PubMed] [Google Scholar]
17.Ferreira M. A. R., Purcell S. M. A multivariate test of association. Bioinformatics. 2009;25(1):132–133. doi: 10.1093/bioinformatics/btn563. [DOI] [PubMed] [Google Scholar]
18.Klei L., Luca D., Devlin B., Roeder K. Pleiotropy and principal components of heritability combine to increase power for association analysis. Genetic Epidemiology. 2008;32(1):9–19. doi: 10.1002/gepi.20257. [DOI] [PubMed] [Google Scholar]
19.O’Brien P. C. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40(4):1079–1087. doi: 10.2307/2531158. [DOI] [PubMed] [Google Scholar]
20.Yang Q., Wu H., Guo C.-Y., Fox C. S. Analyze multivariate phenotypes in genetic association studies by combining univariate association tests. Genetic Epidemiology. 2010;34(5):444–454. doi: 10.1002/gepi.20497. [DOI] [PMC free article] [PubMed] [Google Scholar]
21.van der Sluis S., Posthuma D., Dolan C. V. TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies. PLoS Genet. 2013;9(1, article e1003235) doi: 10.1371/journal.pgen.1003235. [DOI] [PMC free article] [PubMed] [Google Scholar]
22.Kim J., Bai Y., Pan W. An adaptive association test for multiple phenotypes with GWAS summary statistics. Genetic Epidemiology. 2015;39(8):651–663. doi: 10.1002/gepi.21931. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Zhu X., Feng T., Tayo B. O., et al. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension. The American Journal of Human Genetics. 2015;96(1):21–36. doi: 10.1016/j.ajhg.2014.11.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
24.Li B., Leal S. M. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. The American Journal of Human Genetics. 2008;83(3):311–321. doi: 10.1016/j.ajhg.2008.06.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Huang J., Johnson A., O'Donnell C. PRIMe: a method for characterization and evaluation of pleiotropic regions from multiple genome-wide association studies. Bioinformatics. 2011;27(9):1201–1206. doi: 10.1093/bioinformatics/btr116. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Ray D., Pankow J., Basu S. USAT: a unified score-based association test for multiple phenotype-genotype analysis. Genetic Epidemiology. 2016;40(1):20–34. doi: 10.1002/gepi.21937. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Ried J., Döring A., Oexle K., et al. PSEA: phenotype set enrichment analysis-a new method for analysis of multiple phenotypes. Genetic Epidemiology. 2012;36(3):244–252. doi: 10.1002/gepi.21617. [DOI] [PubMed] [Google Scholar]
28.Yan Q., Weeks D., Celedón J., et al. Associating multivariate quantitative phenotypes with genetic variants in family samples with a novel kernel machine regression method. Genetics. 2015;201(4):1329–1339. doi: 10.1534/genetics.115.178590. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Zhan X., Zhao N., Plantinga A., et al. Powerful genetic association analysis for common or rare variants with high-dimensional structured traits. Genetics. 2017;206(4):1779–1790. doi: 10.1534/genetics.116.199646. [DOI] [PMC free article] [PubMed] [Google Scholar]
30.Maity A., Sullivan P., Tzeng J. Multivariate phenotype association analysis by marker-set kernel machine regression. Genetic Epidemiology. 2012;36(7):686–695. doi: 10.1002/gepi.21663. [DOI] [PMC free article] [PubMed] [Google Scholar]
31.UK10K Consortium, Sun J., Oualkacha K., et al. A method for analyzing multiple continuous phenotypes in rare variant association studies allowing for flexible correlations in variant effects. European Journal of Human Genetics. 2016;24(9):1344–1351. doi: 10.1038/ejhg.2016.8. [DOI] [PMC free article] [PubMed] [Google Scholar]
32.Wang Y., Liu A., Mills J. L., et al. Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models. Genetic Epidemiology. 2015;39(4):259–275. doi: 10.1002/gepi.21895. [DOI] [PMC free article] [PubMed] [Google Scholar]
33.Casale F. P., Rakitsch B., Lippert C., Stegle O. Efficient set tests for the genetic analysis of correlated traits. Nature Methods. 2015;12(8):755–758. doi: 10.1038/nmeth.3439. [DOI] [PubMed] [Google Scholar]
34.Broadaway K. A., Cutler D. J., Duncan R., et al. A statistical approach for testing cross-phenotype effects of rare variants. The American Journal of Human Genetics. 2016;98(3):525–540. doi: 10.1016/j.ajhg.2016.01.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
35.Cichonska A., Rousu J., Marttinen P., et al. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics. 2016;32(13):1981–1989. doi: 10.1093/bioinformatics/btw052. [DOI] [PMC free article] [PubMed] [Google Scholar]
36.Lin J., Tabassum R., Ripatti S., Pirinen M. MetaPhat: detecting and decomposing multivariate associations from univariate genome-wide association statistics. Frontiers in Genetics. 2020;11 doi: 10.3389/fgene.2020.00431. [DOI] [PMC free article] [PubMed] [Google Scholar]
37.Bhattacharjee S., Rajaraman P., Jacobs K. B., et al. A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. The American Journal of Human Genetics. 2012;90(5):821–835. doi: 10.1016/j.ajhg.2012.03.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
38.23andMe Research Team, Social Science Genetic Association Consortium, Turley P., et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nature Genetics. 2018;50(2):229–237. doi: 10.1038/s41588-017-0009-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
39.Dutta D., Scott L., Boehnke M., Lee S. Multi-SKAT: general framework to test for rare-variant association with multiple phenotypes. Genetic Epidemiology. 2019;43(1):4–23. doi: 10.1002/gepi.22156. [DOI] [PMC free article] [PubMed] [Google Scholar]
40.Inan G., Yucel R. Joint GEEs for multivariate correlated data with incomplete binary outcomes. Journal of Applied Statistics. 2017;44(11):1920–1937. doi: 10.1080/02664763.2016.1238049. [DOI] [Google Scholar]
41.Liang K. Y., Zeger S. L. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22. doi: 10.1093/biomet/73.1.13. [DOI] [Google Scholar]
42.Inan G. JGEE: joint generalized estimating equation solver. 2015. R package version 1.1.
43.Schaid D. J., McDonnell S. K., Sinnwell J. P., Thibodeau S. N. Multiple genetic variant association testing by collapsing and kernel methods with pedigree or population structured data. Genetic Epidemiology. 2013;37(5):409–418. doi: 10.1002/gepi.21727. [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Sinnwell J., Therneau T., Schaid D., Atkinson E., Mester C. kinship2. 2020. R package version 1.8.5.
45.Thornton T., McPeek M. S. ROADTRIPS: case-control association testing with partially or completely unknown population and pedigree structure. The American Journal of Human Genetics. 2010;86(2):172–184. doi: 10.1016/j.ajhg.2010.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
46.Chen H., Meigs J. B., Dupuis J. Sequence kernel association test for quantitative traits in family samples. Genetic Epidemiology. 2013;37(2):196–204. doi: 10.1002/gepi.21703. [DOI] [PMC free article] [PubMed] [Google Scholar]
47.Liu Y., Chen S., Li Z., Morrison A. C., Boerwinkle E., Lin X. ACAT: a fast and powerful p value combination method for rare-variant analysis in sequencing studies. The American Journal of Human Genetics. 2019;104(3):410–421. doi: 10.1016/j.ajhg.2019.01.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
48.Liu Y., Xie J. Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. Journal of the American Statistical Association. 2020;115(529):393–402. doi: 10.1080/01621459.2018.1554485. [DOI] [PMC free article] [PubMed] [Google Scholar]
49.McCaw Z. R., Lane J. M., Saxena R., Redline S., Lin X. Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics. 2020;76(4):1262–1272. doi: 10.1111/biom.13214. [DOI] [PMC free article] [PubMed] [Google Scholar]
50.McCaw Z. R. Rank normal transformation omnibus test. 2019. R package version 0.7.1.
51.Lee S., Teslovich T. M., Boehnke M., Lin X. General framework for meta-analysis of rare variants in sequencing association studies. The American Journal of Human Genetics. 2013;93(1):42–53. doi: 10.1016/j.ajhg.2013.05.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
52.Schaffner S. F., Foo C., Gabriel S., Reich D., Daly M. J., Altshuler D. Calibrating a coalescent simulation of human genome sequence variation. Genome Research. 2005;15(11):1576–1583. doi: 10.1101/gr.3709305. [DOI] [PMC free article] [PubMed] [Google Scholar]
53.Amatya A., Demirtas H., Gao R. Simultaneous generation of multivariate binary and normal variates. 2020. R package version 2.3.2.
54.Torices R., Muñoz-Pajares A. J. Phenotypic integration index. 2017. R package version 1.3.1. [DOI] [PMC free article] [PubMed]
55.Torices R., Muñoz-Pajares A. J. PHENIX: an R package to estimate a size-controlled phenotypic integration index. Applications in Plant Sciences. 2015;3(5, article 1400104) doi: 10.3732/apps.1400104. [DOI] [PMC free article] [PubMed] [Google Scholar]
56.Dah A., Hore V., Iotchkova V., Marchini J. Network inference in matrix-variate Gaussian models with non-independent noise. 2013. https://arxiv.org/abs/1312.1622.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Click here for additional data file.^{(1.7MB, pdf)}

Data Availability Statement

The data supporting the findings of this study are available within the article and its supplementary materials.

[B1] 1.Zhu H., Zhang S., Sha Q. A novel method to test associations between a weighted combination of phenotypes and genetic variants. PLoS One. 2018;13(1, article e0190788) doi: 10.1371/journal.pone.0190788. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B2] 2.Lee S., Won S., Kim Y. J., et al. Rare variant association test with multiple phenotypes. Genetic Epidemiology. 2017;41(3):198–209. doi: 10.1002/gepi.22021. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B3] 3.Yang Q., Wang Y. Methods for analyzing multivariate phenotypes in genetic association studies. Journal of probability and statistics. 2012;2012:13. doi: 10.1155/2012/652569.652569 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B4] 4.Liang X., Sha Q., Zhang S. Joint analysis of multiple phenotypes in association studies using allele-based clustering approach for non-normal distributions. Annals of Human Genetics. 2018;82(6):389–395. doi: 10.1111/ahg.12260. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B5] 5.Wang Z., Sha Q., Fang S., Zhang K., Zhang S. Testing an optimally weighted combination of common and/or rare variants with multiple traits. PLoS One. 2018;13(7, article e0201186) doi: 10.1371/journal.pone.0201186. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B6] 6.Solovieff N., Cotsapas C., Lee P. H., Purcell S. M., Smoller J. W. Pleiotropy in complex traits: challenges and strategies. Nature Reviews Genetics. 2013;14(7):483–495. doi: 10.1038/nrg3461. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B7] 7.Stephens M. A unified framework for association analysis with multiple related phenotypes. PLoS One. 2013;8(7, article e65245) doi: 10.1371/journal.pone.0065245. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B8] 8.Zhou X., Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature Methods. 2014;11(4):407–409. doi: 10.1038/nmeth.2848. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B9] 9.Liang X., Wang Z., Sha Q., Zhang S. An adaptive Fisher’s combination method for joint analysis of multiple phenotypes in association studies. Scientific Reports. 2016;6(1, article 34323) doi: 10.1038/srep34323. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10] 10.Wang Z., Wang X., Sha Q., Zhang S. Joint analysis of multiple traits in rare variant association studies. Annals of Human Genetics. 2016;80(3):162–171. doi: 10.1111/ahg.12149. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B11] 11.Wang Z., Sha Q., Zhang Z. Joint analysis of multiple traits using "optimal" maximum heritability test. PLoS One. 2016;11(3, article e0150975) doi: 10.1371/journal.pone.0150975. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B12] 12.Zhu H., Zhang S., Sha Q. Power comparisons of methods for joint association analysis of multiple phenotypes. Human heredity. 2015;80(3):144–152. doi: 10.1159/000446239. [DOI] [PubMed] [Google Scholar]

[B13] 13.Aschard H., Vilhjálmsson B., Greliche N., Morange P., Trégouët D., Kraft P. Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. The American Journal of Human Genetics. 2014;94(5):662–676. doi: 10.1016/j.ajhg.2014.03.016. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B14] 14.Korte A., Vilhjálmsson B. J., Segura V., Platt A., Long Q., Nordborg M. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nature Genetics. 2012;44(9):1066–1071. doi: 10.1038/ng.2376. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B15] 15.O’Reilly P. F., Hoggart C. J., Pomyen Y., et al. MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PLoS One. 2012;7(5, article e34861) doi: 10.1371/journal.pone.0034861. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B16] 16.Zhang Y., Xu Z., Shen X., Pan W., Alzheimer's Disease Neuroimaging Initiative Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. NeuroImage. 2014;96:309–325. doi: 10.1016/j.neuroimage.2014.03.061. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B17] 17.Ferreira M. A. R., Purcell S. M. A multivariate test of association. Bioinformatics. 2009;25(1):132–133. doi: 10.1093/bioinformatics/btn563. [DOI] [PubMed] [Google Scholar]

[B18] 18.Klei L., Luca D., Devlin B., Roeder K. Pleiotropy and principal components of heritability combine to increase power for association analysis. Genetic Epidemiology. 2008;32(1):9–19. doi: 10.1002/gepi.20257. [DOI] [PubMed] [Google Scholar]

[B19] 19.O’Brien P. C. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40(4):1079–1087. doi: 10.2307/2531158. [DOI] [PubMed] [Google Scholar]

[B20] 20.Yang Q., Wu H., Guo C.-Y., Fox C. S. Analyze multivariate phenotypes in genetic association studies by combining univariate association tests. Genetic Epidemiology. 2010;34(5):444–454. doi: 10.1002/gepi.20497. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B21] 21.van der Sluis S., Posthuma D., Dolan C. V. TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies. PLoS Genet. 2013;9(1, article e1003235) doi: 10.1371/journal.pgen.1003235. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B22] 22.Kim J., Bai Y., Pan W. An adaptive association test for multiple phenotypes with GWAS summary statistics. Genetic Epidemiology. 2015;39(8):651–663. doi: 10.1002/gepi.21931. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B23] 23.Zhu X., Feng T., Tayo B. O., et al. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension. The American Journal of Human Genetics. 2015;96(1):21–36. doi: 10.1016/j.ajhg.2014.11.011. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B24] 24.Li B., Leal S. M. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. The American Journal of Human Genetics. 2008;83(3):311–321. doi: 10.1016/j.ajhg.2008.06.024. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B25] 25.Huang J., Johnson A., O'Donnell C. PRIMe: a method for characterization and evaluation of pleiotropic regions from multiple genome-wide association studies. Bioinformatics. 2011;27(9):1201–1206. doi: 10.1093/bioinformatics/btr116. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B26] 26.Ray D., Pankow J., Basu S. USAT: a unified score-based association test for multiple phenotype-genotype analysis. Genetic Epidemiology. 2016;40(1):20–34. doi: 10.1002/gepi.21937. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B27] 27.Ried J., Döring A., Oexle K., et al. PSEA: phenotype set enrichment analysis-a new method for analysis of multiple phenotypes. Genetic Epidemiology. 2012;36(3):244–252. doi: 10.1002/gepi.21617. [DOI] [PubMed] [Google Scholar]

[B28] 28.Yan Q., Weeks D., Celedón J., et al. Associating multivariate quantitative phenotypes with genetic variants in family samples with a novel kernel machine regression method. Genetics. 2015;201(4):1329–1339. doi: 10.1534/genetics.115.178590. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B29] 29.Zhan X., Zhao N., Plantinga A., et al. Powerful genetic association analysis for common or rare variants with high-dimensional structured traits. Genetics. 2017;206(4):1779–1790. doi: 10.1534/genetics.116.199646. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B30] 30.Maity A., Sullivan P., Tzeng J. Multivariate phenotype association analysis by marker-set kernel machine regression. Genetic Epidemiology. 2012;36(7):686–695. doi: 10.1002/gepi.21663. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B31] 31.UK10K Consortium, Sun J., Oualkacha K., et al. A method for analyzing multiple continuous phenotypes in rare variant association studies allowing for flexible correlations in variant effects. European Journal of Human Genetics. 2016;24(9):1344–1351. doi: 10.1038/ejhg.2016.8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B32] 32.Wang Y., Liu A., Mills J. L., et al. Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models. Genetic Epidemiology. 2015;39(4):259–275. doi: 10.1002/gepi.21895. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B33] 33.Casale F. P., Rakitsch B., Lippert C., Stegle O. Efficient set tests for the genetic analysis of correlated traits. Nature Methods. 2015;12(8):755–758. doi: 10.1038/nmeth.3439. [DOI] [PubMed] [Google Scholar]

[B34] 34.Broadaway K. A., Cutler D. J., Duncan R., et al. A statistical approach for testing cross-phenotype effects of rare variants. The American Journal of Human Genetics. 2016;98(3):525–540. doi: 10.1016/j.ajhg.2016.01.017. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B35] 35.Cichonska A., Rousu J., Marttinen P., et al. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics. 2016;32(13):1981–1989. doi: 10.1093/bioinformatics/btw052. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B36] 36.Lin J., Tabassum R., Ripatti S., Pirinen M. MetaPhat: detecting and decomposing multivariate associations from univariate genome-wide association statistics. Frontiers in Genetics. 2020;11 doi: 10.3389/fgene.2020.00431. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B37] 37.Bhattacharjee S., Rajaraman P., Jacobs K. B., et al. A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. The American Journal of Human Genetics. 2012;90(5):821–835. doi: 10.1016/j.ajhg.2012.03.015. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B38] 38.23andMe Research Team, Social Science Genetic Association Consortium, Turley P., et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nature Genetics. 2018;50(2):229–237. doi: 10.1038/s41588-017-0009-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B39] 39.Dutta D., Scott L., Boehnke M., Lee S. Multi-SKAT: general framework to test for rare-variant association with multiple phenotypes. Genetic Epidemiology. 2019;43(1):4–23. doi: 10.1002/gepi.22156. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B40] 40.Inan G., Yucel R. Joint GEEs for multivariate correlated data with incomplete binary outcomes. Journal of Applied Statistics. 2017;44(11):1920–1937. doi: 10.1080/02664763.2016.1238049. [DOI] [Google Scholar]

[B41] 41.Liang K. Y., Zeger S. L. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22. doi: 10.1093/biomet/73.1.13. [DOI] [Google Scholar]

[B42] 42.Inan G. JGEE: joint generalized estimating equation solver. 2015. R package version 1.1.

[B43] 43.Schaid D. J., McDonnell S. K., Sinnwell J. P., Thibodeau S. N. Multiple genetic variant association testing by collapsing and kernel methods with pedigree or population structured data. Genetic Epidemiology. 2013;37(5):409–418. doi: 10.1002/gepi.21727. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B44] 44.Sinnwell J., Therneau T., Schaid D., Atkinson E., Mester C. kinship2. 2020. R package version 1.8.5.

[B45] 45.Thornton T., McPeek M. S. ROADTRIPS: case-control association testing with partially or completely unknown population and pedigree structure. The American Journal of Human Genetics. 2010;86(2):172–184. doi: 10.1016/j.ajhg.2010.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B46] 46.Chen H., Meigs J. B., Dupuis J. Sequence kernel association test for quantitative traits in family samples. Genetic Epidemiology. 2013;37(2):196–204. doi: 10.1002/gepi.21703. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B47] 47.Liu Y., Chen S., Li Z., Morrison A. C., Boerwinkle E., Lin X. ACAT: a fast and powerful p value combination method for rare-variant analysis in sequencing studies. The American Journal of Human Genetics. 2019;104(3):410–421. doi: 10.1016/j.ajhg.2019.01.002. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B48] 48.Liu Y., Xie J. Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. Journal of the American Statistical Association. 2020;115(529):393–402. doi: 10.1080/01621459.2018.1554485. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B49] 49.McCaw Z. R., Lane J. M., Saxena R., Redline S., Lin X. Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics. 2020;76(4):1262–1272. doi: 10.1111/biom.13214. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B50] 50.McCaw Z. R. Rank normal transformation omnibus test. 2019. R package version 0.7.1.

[B51] 51.Lee S., Teslovich T. M., Boehnke M., Lin X. General framework for meta-analysis of rare variants in sequencing association studies. The American Journal of Human Genetics. 2013;93(1):42–53. doi: 10.1016/j.ajhg.2013.05.010. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B52] 52.Schaffner S. F., Foo C., Gabriel S., Reich D., Daly M. J., Altshuler D. Calibrating a coalescent simulation of human genome sequence variation. Genome Research. 2005;15(11):1576–1583. doi: 10.1101/gr.3709305. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B53] 53.Amatya A., Demirtas H., Gao R. Simultaneous generation of multivariate binary and normal variates. 2020. R package version 2.3.2.

[B54] 54.Torices R., Muñoz-Pajares A. J. Phenotypic integration index. 2017. R package version 1.3.1. [DOI] [PMC free article] [PubMed]

[B55] 55.Torices R., Muñoz-Pajares A. J. PHENIX: an R package to estimate a size-controlled phenotypic integration index. Applications in Plant Sciences. 2015;3(5, article 1400104) doi: 10.3732/apps.1400104. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B56] 56.Dah A., Hore V., Iotchkova V., Marchini J. Network inference in matrix-variate Gaussian models with non-independent noise. 2013. https://arxiv.org/abs/1312.1622.

PERMALINK

Associating Multivariate Traits with Genetic Variants Using Collapsing and Kernel Methods with Pedigree- or Population-Based Studies

Li-Chu Chien

Abstract

1. Introduction

2. Materials and Methods

2.1. Notations

2.2. Multitrait Regression-Based Tests for Pedigree Data

2.2.1. Kernel Statistic

2.2.2. Burden Test

2.2.3. Omnibus Test

3. Simulation Studies

3.1. Type I Error Rate and Power Simulations

4. Results

4.1. Empirical Type I Error Rates

Table 1.

Table 2.

4.2. Empirical Power

Figure 1.

Figure 2.

5. Conclusion

6. Limitation

Acknowledgments

Data Availability

Conflicts of Interest

Supplementary Materials

References

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Associating Multivariate Traits with Genetic Variants Using Collapsing and Kernel Methods with Pedigree- or Population-Based Studies

Li-Chu Chien

Abstract

1. Introduction

2. Materials and Methods

2.1. Notations

2.2. Multitrait Regression-Based Tests for Pedigree Data

2.2.1. Kernel Statistic

2.2.2. Burden Test

2.2.3. Omnibus Test

3. Simulation Studies

3.1. Type I Error Rate and Power Simulations

4. Results

4.1. Empirical Type I Error Rates

Table 1.

Table 2.

4.2. Empirical Power

Figure 1.

Figure 2.

5. Conclusion

6. Limitation

Acknowledgments

Data Availability

Conflicts of Interest

Supplementary Materials

References

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases