Abstract
Estimating the genetic relatedness between two traits based on genome-wide association data is an important problem in genetics research. In the framework of high-dimensional linear models, we introduce two measures of genetic relatedness and develop optimal estimators for them. The first is the genetic covariance, defined as the inner product of the two regression vectors; the second is the genetic correlation, the inner product normalized by the lengths of the two vectors. We propose functional de-biased estimators (FDEs), which consist of an initial estimation step based on the plug-in scaled Lasso estimator, followed by a bias-correction step. We also develop estimators of the quadratic functionals of the regression vectors, which can be used to estimate the heritability of each trait. The estimators are shown to be minimax rate-optimal and can be efficiently implemented. Simulation results show that FDEs provide better estimates of the genetic relatedness than simple plug-in estimates. FDE is also applied to an analysis of a yeast segregant data set with multiple traits to estimate the genetic relatedness among these traits.
Keywords: Genetic correlations, genome-wide association studies, inner product, quadratic functional, minimax rate of convergence
1. Introduction
1.1. Motivation and Background
Genome-wide association studies (GWAS) have led to the identification of thousands of genetic variants or single-nucleotide polymorphisms (SNPs) that are associated with various complex phenotypes (Manolio, 2010). Results from these GWAS have shown that many complex phenotypes share common genetic variants, including various autoimmune diseases (Zhernakova et al., 2009) and psychiatric disorders (Lee et al., 2013). This empirical evidence of shared genetic etiology across phenotypes provides important insights into common pathophysiologies of related disorders that can be exploited for drug repositioning and for studying disease etiology. Such knowledge of genetic sharing can potentially be used to increase the accuracy of genetic risk prediction (Maier et al., 2015; Wray et al., 2007; Purcell et al., 2009). The concept of genetic relatedness, or genetic correlation, has been proposed to describe the shared genetic associations within pairs of quantitative traits based on GWAS data. This is in contrast to the traditional approaches of estimating co-heritability based on twin or family studies, where measurements of both traits are required on the same set of individuals. Because GWAS data sets are now available for many important traits, there has been significant recent interest in methods for quantifying and estimating the genetic relatedness between two traits based on large-scale genetic association data.
Several measures of genetic relatedness have been proposed using GWAS data. Lee et al. (2012) and Yang et al. (2013) extended the mixed-effects model framework to estimate the genetic covariance and genetic correlation between two traits. In their models, each individual's trait value is associated with a random genetic effect, which is correlated across individuals by virtue of sharing some of the genetic variants affecting the traits, and an environmental random effect. Co-heritability is then defined as the square root of the ratio of the covariance of the genetic random effects to the product of the total variances. The mixed-effects model approach requires knowledge of the identities of the causal variants, and hence of their covariance matrix, which is not available in practice. Lee et al. (2012) and Yang et al. (2013) therefore approximated the genetic relationship between every pair of individuals across the set of causal variants by the genetic relationship across the set of all genotyped variants. However, the very large number of variants used for estimating the genetic correlations, most of them likely not causative, might mask the correlations on the set of causal variants, leading to inaccurate and suboptimal estimation of heritability (Golan and Rosset, 2011). Bulik-Sullivan et al. (2015) studied genetic relatedness based on another random-effects model for the two traits and developed a cross-trait linkage disequilibrium (LD) score regression to estimate the genetic covariance and genetic correlation. This approach is similar to the mixed-effects model approach of Yang et al. (2013) but has the advantage of using only the GWAS summary statistics. Lee and van der Werf (2016) developed an algorithm for multivariate linear mixed model analysis and demonstrated its use in estimating co-heritability.
To alleviate the difficulty of estimating the covariance matrix in the commonly used mixed-effects model framework for estimating heritability or co-heritability, we take a regression approach with fixed genetic effects in high-dimensional settings. High-dimensional linear regression provides a natural framework for GWAS aimed at identifying trait-associated genetic variants, and its advantages over simple univariate analysis have been demonstrated (Wu et al., 2009). Heritability estimation in high-dimensional regression has been studied in Bonnet et al. (2015); Verzelen and Gassiat (2016); Janson et al. (2016). However, high-dimensional regression analysis has not been used to study the genetic relatedness between two traits based on genetic association data. The goal of this paper is to define two quantities that measure the genetic relatedness between a pair of traits based on GWAS data in the framework of high-dimensional linear models. Our definitions of genetic relatedness reflect the covariance or correlation of the trait-associated genetic variants. This is different from the mixed-effects model-based approaches, where the genetic relatedness is defined through the variance/covariance matrix of the individual-specific random effects and the data from all the genetic variants are used to approximate the true covariance matrix.
1.2. Definition and Problem Formulation
A pair of trait values $(y, z)$ is modeled as a linear combination of genetic variants and an error term that includes environmental and unmeasured genetic effects,

$$y = X\beta + \epsilon_1, \qquad z = W\gamma + \epsilon_2, \qquad (1)$$

where the rows $X_{i\cdot}$ of $X \in \mathbb{R}^{n_1 \times p}$ are i.i.d. $p$-dimensional sub-Gaussian random vectors with covariance matrix $\Sigma^X$, the rows $W_{i\cdot}$ of $W \in \mathbb{R}^{n_2 \times p}$ are i.i.d. $p$-dimensional sub-Gaussian random vectors with covariance matrix $\Sigma^W$, and the errors $\epsilon_1 \sim N(0, \sigma_1^2 \mathrm{I}_{n_1})$ and $\epsilon_2 \sim N(0, \sigma_2^2 \mathrm{I}_{n_2})$ follow multivariate normal distributions with mean zero and are assumed to be independent of $X$ and $W$.

In the study of genetic relatedness, the pair of traits $y$ and $z$ are assumed to have mean zero, and the $j$th column of $X$, $X_{\cdot j}$, and the $j$th column of $W$, $W_{\cdot j}$, are the numerically coded genetic markers at the $j$th genetic variant, assumed to have mean zero and variance 1. Under this model, if the columns of $X$ and of $W$ are independent, then for the $i$-th observation,

$$\mathrm{Var}(y_i) = \|\beta\|_2^2 + \sigma_1^2, \qquad \mathrm{Var}(z_i) = \|\gamma\|_2^2 + \sigma_2^2;$$

therefore $\|\beta\|_2^2/(\|\beta\|_2^2 + \sigma_1^2)$ and $\|\gamma\|_2^2/(\|\gamma\|_2^2 + \sigma_2^2)$ can be interpreted as the narrow-sense heritabilities of the two traits (Bulik-Sullivan et al., 2015).
Based on this model, one measure of genetic relatedness is the inner product of the regression coefficients,

$$I(\beta, \gamma) = \langle \beta, \gamma \rangle = \sum_{j=1}^{p} \beta_j \gamma_j, \qquad (2)$$

which measures the shared genetic effects between the two traits. Bulik-Sullivan et al. (2015) defined this quantity as the genetic covariance due to the genetic variants. Alternatively, a normalized inner product, called the genetic correlation,

$$R(\beta, \gamma) = \frac{\langle \beta, \gamma \rangle}{\|\beta\|_2 \|\gamma\|_2}, \qquad (3)$$

can also be used. In the case where one of $\|\beta\|_2$ and $\|\gamma\|_2$ vanishes, the ratio is defined to be zero, indicating no correlation between the two traits when one of the regression vectors is zero. With this normalization, $R(\beta, \gamma)$ is always between −1 and 1 and can be used to compare the genetic relatedness among multiple pairs of traits. Note that to exhibit genetic correlation, the directions of effect must also be consistently aligned.

Although Bulik-Sullivan et al. (2015) defined (2) and (3) as the genetic covariance and genetic correlation, they treated $\beta$ and $\gamma$ as random vectors with a particular covariance form and then proposed to apply LD score regression to estimate the expectation of $\langle \beta, \gamma \rangle$. The focus of this paper is to develop estimators of $I(\beta, \gamma)$ and $R(\beta, \gamma)$ based on two GWAS data sets with genotype data measured on the same set of genetic markers, denoted by $(y, X)$ and $(z, W)$.
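To make the definitions (2) and (3) concrete, the following minimal Python sketch evaluates the genetic covariance and genetic correlation for given coefficient vectors, including the zero-norm convention in (3); the function names are ours and not part of any published software.

```python
import numpy as np

def genetic_covariance(beta, gamma):
    # I(beta, gamma) = <beta, gamma>, the inner product in (2)
    return float(np.dot(beta, gamma))

def genetic_correlation(beta, gamma):
    # R(beta, gamma) = <beta, gamma> / (||beta||_2 * ||gamma||_2), as in (3);
    # the ratio is defined to be 0 when either vector vanishes
    nb, ng = np.linalg.norm(beta), np.linalg.norm(gamma)
    if nb == 0.0 or ng == 0.0:
        return 0.0
    return float(np.dot(beta, gamma) / (nb * ng))
```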
1.3. Methods and Main Results
A naive approach is to estimate $\beta$ and $\gamma$ first and then plug the estimators $\hat\beta$ and $\hat\gamma$ into the expressions (2) and (3). For the problem of interest, there are usually more genetic markers than samples, that is, $p > n$. However, for any trait, one expects that only a few of these markers have nonzero effects. One can apply any high-dimensional sparse regression method, such as the Lasso (Tibshirani, 1996), the scaled Lasso (Sun and Zhang, 2012), or marginal regression with screening (McCarthy et al., 2008; Fan et al., 2012), to estimate these sparse regression coefficients. The resulting plug-in estimators, however, have several drawbacks for estimating the genetic relatedness. The Lasso shrinks the estimates towards 0; in particular, some weak effects might be shrunken to 0, yet the accumulation of these weak effects may contribute significantly to the trait variability. Some genetic variants may have strong effects on one trait and weak effects on the other; due to shrinkage, plugging in Lasso-type estimators fails to capture the contribution of such variants to the genetic relatedness. Marginal regression calculates a regression score between the trait and each single marker (e.g., $X_{\cdot j}^{\top} y$ and $W_{\cdot j}^{\top} z$) and screens for large scores. This approach also suffers in the presence of weak effects, as the marginal scores must be large enough to survive the screening step.
We propose a two-step procedure to estimate the genetic relatedness measure defined in (2): step 1 estimates the inner product by the plug-in scaled Lasso estimator, and step 2 corrects the bias of this plug-in estimator. Similar two-step procedures are proposed to estimate the quadratic functionals $Q(\beta) = \|\beta\|_2^2$ and $Q(\gamma) = \|\gamma\|_2^2$. To estimate the normalized inner product defined in (3), we plug the estimators of the inner product and the quadratic functionals into the definition (3). Because of the correction step, we call our estimators Functional De-biased Estimators (FDEs).
FDEs are shown to achieve the minimax optimal convergence rates for estimating $I(\beta, \gamma)$ and $R(\beta, \gamma)$. The optimality of FDEs results from the unique way of balancing the bias and variance in estimating these functionals. To illustrate this, we focus on the estimation of $\langle \beta, \gamma \rangle$, take the plug-in of the scaled Lasso estimators (Sun and Zhang, 2012) and the plug-in of the de-biased Lasso estimators (Javanmard and Montanari, 2014; van de Geer et al., 2014; Zhang and Zhang, 2014) as examples, and compare them with the FDE estimator of $\langle \beta, \gamma \rangle$. Note that the scaled Lasso estimator achieves the optimal convergence rate for estimating the whole vector $\beta$, and the de-biased estimator achieves the optimal convergence rate for estimating a single coordinate $\beta_j$. However, simply plugging in the scaled Lasso estimators or the de-biased Lasso estimators does not lead to a good estimator of $\langle \beta, \gamma \rangle$: the plug-in of the scaled Lasso estimators suffers from a large bias, and the plug-in of the de-biased Lasso estimators suffers from an inflated variance.
In contrast, the FDE estimator of $\langle \beta, \gamma \rangle$ balances the bias and variance in an optimal way. Specifically, in the correction step of the FDE, the bias caused by plugging in the scaled Lasso estimator is corrected by adding the minimum amount of variance. As demonstrated in the simulation studies, FDE consistently outperforms the plug-in of the scaled Lasso estimators and the plug-in of the de-biased Lasso estimators. In addition, FDEs are not sensitive to dependence among genetic markers and work for a broad class of dependence structures.
The theoretical analysis given in Section 3 establishes the optimal convergence rates for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$, $Q(\gamma)$ and $R(\beta, \gamma)$. To facilitate the discussion, we control the norms of the regression coefficients as $\|\beta\|_2 \le M$ and $\|\gamma\|_2 \le M$. Here, we present the most interesting regime where the signals are strong in the sense of $M \ge c\, k \log p/\sqrt{n}$, where $p$ is the dimension, $n$ is the sample size, $k$ is the maximum sparsity of $\beta$ and $\gamma$, and $c$ is a positive constant independent of $n$, $p$, $k$. We have shown that the optimal rate of convergence for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$ and $Q(\gamma)$ is

$$M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right).$$

The optimal rate depends not only on $k$, $p$ and $n$, but also on the upper bound $M$ for the signal strength. In addition, we have shown that the optimal convergence rate for estimating $R(\beta, \gamma)$ is

$$\frac{1}{l_0}\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right).$$

In contrast to the rates for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$ and $Q(\gamma)$, this optimal rate scales with the inverse of the lower bound $l_0$ for the signal strength. The estimators $\hat{I}$, $\hat{Q}(\beta)$, $\hat{Q}(\gamma)$ and $\hat{R}$ proposed in Section 2 are shown to adaptively achieve the optimal rates for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$, $Q(\gamma)$ and $R(\beta, \gamma)$, respectively.
1.4. Notation and Definitions
Basic notation and definitions used in the rest of the paper are collected here. For a matrix $A$, $A_{i\cdot}$, $A_{\cdot j}$, and $A_{ij}$ denote respectively the $i$-th row, $j$-th column, and $(i,j)$-th entry of $A$; $A_{i,-j}$ denotes the $i$-th row of $A$ excluding the $j$-th coordinate, and $A_{-j}$ denotes the submatrix of $A$ excluding the $j$-th column. Let $[p] = \{1, 2, \ldots, p\}$. For a subset $J \subset [p]$, $A_J$ denotes the submatrix of $A$ consisting of the columns $A_{\cdot j}$ with $j \in J$, and for a vector $x \in \mathbb{R}^p$, $x_J$ is the subvector of $x$ with indices in $J$ and $x_{J^c}$ is the subvector with indices in $J^c$. For a vector $x \in \mathbb{R}^p$, the $\ell_q$ norm of $x$ is defined as $\|x\|_q = (\sum_{j=1}^{p} |x_j|^q)^{1/q}$ for $1 \le q < \infty$, with $\|x\|_0$ denoting the number of non-zero elements of $x$ and $\|x\|_\infty = \max_{1 \le j \le p} |x_j|$. For a matrix $A$ and $1 \le q \le \infty$, $\|A\|_q = \sup_{\|x\|_q = 1} \|Ax\|_q$ is the matrix operator norm; in particular, $\|A\|_2$ is the spectral norm. For a symmetric matrix $A$, $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote respectively the smallest and largest eigenvalues of $A$. For a set $S$, $|S|$ denotes its cardinality. For $a \in \mathbb{R}$, $\mathrm{sign}(a)$ is the sign of $a$, i.e., $\mathrm{sign}(a) = 1$ if $a > 0$, $\mathrm{sign}(a) = -1$ if $a < 0$, and $\mathrm{sign}(0) = 0$. The sub-Gaussian norm of a random vector $x \in \mathbb{R}^p$ is defined as $\|x\|_{\psi_2} = \sup_{v \in S^{p-1}} \sup_{q \ge 1} q^{-1/2} (\mathbb{E}|v^{\top} x|^q)^{1/q}$, where $S^{p-1}$ is the unit sphere in $\mathbb{R}^p$; the random vector $x$ is sub-Gaussian if its sub-Gaussian norm is bounded. See Vershynin (2012) for more on sub-Gaussian random variables. For the design matrices $X$ and $W$, we define the corresponding sample covariance matrices as $\hat\Sigma^X = X^{\top}X/n_1$ and $\hat\Sigma^W = W^{\top}W/n_2$. Let $z_{\alpha}$ denote the upper $\alpha$ quantile of the standard normal distribution. For two positive sequences $a_n$ and $b_n$, $a_n \lesssim b_n$ means $a_n \le C b_n$ for all $n$, $a_n \gtrsim b_n$ if $b_n \lesssim a_n$, and $a_n \asymp b_n$ if $a_n \lesssim b_n$ and $b_n \lesssim a_n$. $c$ and $C$ are used to denote generic positive constants that may vary from place to place. For any two sequences of numbers $a_n$ and $b_n$, we write $a_n \gg b_n$ if $\lim_{n \to \infty} b_n/a_n = 0$.
1.5. Organization of the Paper
The rest of the paper is organized as follows. Section 2 presents the procedures for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$, $Q(\gamma)$, and $R(\beta, \gamma)$ in detail. In Section 3, minimax convergence rates for these estimation problems are established and the proposed estimators are shown to attain the optimal rates. In Section 4, simulation studies are conducted to evaluate the empirical performance of FDEs. A yeast cross data set is used to illustrate the estimators in Section 5. Discussion is provided in Section 6. The proofs of the main theorems are presented in Section 7. The remaining proofs and extended simulation studies are given in the supplementary materials.
2. Estimation Methods
2.1. Estimation of $\langle \beta, \gamma \rangle$
Since the inner product $\langle \beta, \gamma \rangle$ is of significant interest in its own right, we first consider its estimation. The scaled Lasso estimators for the high-dimensional linear model (1) are defined through the following optimization algorithms (Sun and Zhang, 2012),

$$\{\hat\beta, \hat\sigma_1\} = \arg\min_{\beta \in \mathbb{R}^p,\, \sigma > 0} \left\{ \frac{\|y - X\beta\|_2^2}{2 n_1 \sigma} + \frac{\sigma}{2} + \lambda_0 \sum_{j=1}^{p} \frac{\|X_{\cdot j}\|_2}{\sqrt{n_1}} |\beta_j| \right\} \qquad (4)$$

and

$$\{\hat\gamma, \hat\sigma_2\} = \arg\min_{\gamma \in \mathbb{R}^p,\, \sigma > 0} \left\{ \frac{\|z - W\gamma\|_2^2}{2 n_2 \sigma} + \frac{\sigma}{2} + \lambda_0 \sum_{j=1}^{p} \frac{\|W_{\cdot j}\|_2}{\sqrt{n_2}} |\gamma_j| \right\} \qquad (5)$$

where $\lambda_0 \asymp \sqrt{\log p/n}$ is a tuning parameter. To construct an optimal estimator of $\langle \beta, \gamma \rangle$, it is helpful to analyze the error of the plug-in estimator $\langle \hat\beta, \hat\gamma \rangle$,

$$\langle \hat\beta, \hat\gamma \rangle - \langle \beta, \gamma \rangle = \hat\gamma^{\top}(\hat\beta - \beta) + \hat\beta^{\top}(\hat\gamma - \gamma) - (\hat\beta - \beta)^{\top}(\hat\gamma - \gamma). \qquad (6)$$

The last term on the right-hand side, $(\hat\beta - \beta)^{\top}(\hat\gamma - \gamma)$, is "small", but the first two terms, $\hat\gamma^{\top}(\hat\beta - \beta)$ and $\hat\beta^{\top}(\hat\gamma - \gamma)$, can be large. This provides the motivation for the proposed estimator: we first estimate these two terms and then subtract the estimates from $\langle \hat\beta, \hat\gamma \rangle$ to obtain the final estimator of $\langle \beta, \gamma \rangle$.
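The scaled Lasso in (4) and (5) can be computed by alternating between a Lasso step with the current noise-level estimate and a noise-level update, since the joint objective is convex (Sun and Zhang, 2012). Below is a minimal sketch of this alternating scheme built on scikit-learn's Lasso; the initialization and stopping rule are simplifications of ours, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, lam0, max_iter=50, tol=1e-6):
    """Alternating minimization for the scaled Lasso objective
    ||y - X b||^2/(2 n sigma) + sigma/2 + lam0 * ||b||_1
    (assuming the columns of X are standardized)."""
    n = X.shape[0]
    sigma = np.std(y)  # initial noise-level estimate
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        # Lasso step: sklearn minimizes ||y - Xb||^2/(2n) + alpha*||b||_1,
        # so alpha = lam0 * sigma matches the scaled Lasso penalty
        fit = Lasso(alpha=lam0 * sigma, fit_intercept=False).fit(X, y)
        beta = fit.coef_
        sigma_new = np.linalg.norm(y - X @ beta) / np.sqrt(n)
        if abs(sigma_new - sigma) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return beta, sigma
```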
The intuition for estimating $\hat\gamma^{\top}(\beta - \hat\beta)$ is given first. Since

$$\frac{1}{n_1} X^{\top}(y - X\hat\beta) = \hat\Sigma^X (\beta - \hat\beta) + \frac{1}{n_1} X^{\top}\epsilon_1, \qquad (7)$$

multiplying both sides of (7) by a vector $\hat{u}_1^{\top}$ with $\hat{u}_1 \in \mathbb{R}^p$ yields

$$\hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta) = \hat{u}_1^{\top} \hat\Sigma^X (\beta - \hat\beta) + \frac{1}{n_1} \hat{u}_1^{\top} X^{\top}\epsilon_1, \qquad (8)$$

which can be written as

$$\hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta) - \hat\gamma^{\top}(\beta - \hat\beta) = \left(\hat\Sigma^X \hat{u}_1 - \hat\gamma\right)^{\top}(\beta - \hat\beta) + \frac{1}{n_1} \hat{u}_1^{\top} X^{\top}\epsilon_1. \qquad (9)$$

If the vector $\hat{u}_1$ can be chosen such that the right-hand side of (9) is "small", then $\hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta)$ is a good estimator of $\hat\gamma^{\top}(\beta - \hat\beta)$. Since the first term on the right-hand side of (9) is upper bounded by $\|\hat\Sigma^X \hat{u}_1 - \hat\gamma\|_\infty \|\beta - \hat\beta\|_1$, we control the right-hand side of (9) by constructing a projection vector $\hat{u}_1$ such that $\|\hat\Sigma^X \hat{u}_1 - \hat\gamma\|_\infty$ is constrained, and the second term of (9) is controlled by minimizing its variance $\hat{u}_1^{\top} \hat\Sigma^X \hat{u}_1 \, \sigma_1^2/n_1$. This leads to the following convex optimization algorithm for identifying the projection vector $\hat{u}_1$ for estimating $\hat\gamma^{\top}(\beta - \hat\beta)$,

$$\hat{u}_1 = \arg\min_{u \in \mathbb{R}^p} \left\{ u^{\top} \hat\Sigma^X u \;:\; \|\hat\Sigma^X u - \hat\gamma\|_\infty \le \lambda \|\hat\gamma\|_2 \right\} \qquad (10)$$

where $\lambda \asymp \sqrt{\log p/n_1}$.
Remark 1.
The solution of the above optimization problem might not be unique, and $\hat{u}_1$ is defined as any minimizer of the optimization problem. The theory established in Section 3 holds for any minimizer of (10). The optimization problem (10) is solved through its equivalent Lagrange dual problem, which is computationally efficient and scales well to high-dimensional problems. See Step 2 in Table 1 for more details.
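For illustration, the constrained form of (10) can also be handed directly to a generic convex solver instead of the Lagrange dual used by the authors. The sketch below uses cvxpy and expresses the objective $u^{\top}\hat\Sigma u$ as $\|Xu\|_2^2/n$, which keeps the problem in a form the solver verifies as convex; the feasibility fallback (enlarging $\lambda$) mirrors the retry rule in Table 1, with a growth factor of our choosing.

```python
import numpy as np
import cvxpy as cp

def projection_direction(X, g, lam, growth=1.25, max_tries=10):
    """Solve  min_u u' Sigma_hat u  s.t.  ||Sigma_hat u - g||_inf <= lam * ||g||_2,
    with Sigma_hat = X'X/n; enlarge lam and retry if the program is infeasible."""
    n, p = X.shape
    u = cp.Variable(p)
    sigma_u = X.T @ (X @ u) / n                 # Sigma_hat u as an expression
    for _ in range(max_tries):
        prob = cp.Problem(
            cp.Minimize(cp.sum_squares(X @ u) / n),   # equals u' Sigma_hat u
            [cp.norm(sigma_u - g, "inf") <= lam * np.linalg.norm(g)],
        )
        prob.solve()
        if u.value is not None:
            return u.value, lam
        lam *= growth                           # infeasible: relax the constraint
    raise RuntimeError("projection program could not be solved")
```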
Table 1:

FDE algorithm without sample splitting for estimating the inner product, the quadratic functionals and the normalized inner product.

Input: design matrices $X$, $W$; response vectors $y$, $z$; tuning parameters $\lambda_0$, $\lambda$.
Output: $\hat{I}$, $\hat{Q}(\beta)$, $\hat{Q}(\gamma)$ and $\hat{R}$.

Initial Lasso estimators:
1. Scaled Lasso: calculate $\hat\beta$, $\hat\sigma_1$ and $\hat\gamma$, $\hat\sigma_2$ from (4) and (5) with the tuning parameter $\lambda_0$.

Inner product calculation:
2. Projection vector $\hat{u}_1$: calculate $\hat{u}_1 = \arg\min_u \{ u^{\top}\hat\Sigma^X u : \|\hat\Sigma^X u - \hat\gamma\|_\infty \le \lambda \|\hat\gamma\|_2 \}$. If the program cannot be solved, replace $\lambda$ with a larger value and repeat.
3. Projection vector $\hat{u}_2$: calculate $\hat{u}_2 = \arg\min_u \{ u^{\top}\hat\Sigma^W u : \|\hat\Sigma^W u - \hat\beta\|_\infty \le \lambda \|\hat\beta\|_2 \}$. If the program cannot be solved, replace $\lambda$ with a larger value and repeat.
4. Correction: $\hat{I} = \langle \hat\beta, \hat\gamma \rangle + \hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta) + \hat{u}_2^{\top} \frac{1}{n_2} W^{\top}(z - W\hat\gamma)$.

Quadratic functional calculation:
5. Projection vector $\hat{v}_1$: calculate $\hat{v}_1 = \arg\min_u \{ u^{\top}\hat\Sigma^X u : \|\hat\Sigma^X u - \hat\beta\|_\infty \le \lambda \|\hat\beta\|_2 \}$. If the program cannot be solved, replace $\lambda$ with a larger value and repeat.
6. Projection vector $\hat{v}_2$: calculate $\hat{v}_2 = \arg\min_u \{ u^{\top}\hat\Sigma^W u : \|\hat\Sigma^W u - \hat\gamma\|_\infty \le \lambda \|\hat\gamma\|_2 \}$. If the program cannot be solved, replace $\lambda$ with a larger value and repeat.
7. Correction: $\hat{Q}(\beta) = \max\{\|\hat\beta\|_2^2 + 2\hat{v}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta),\, 0\}$ and $\hat{Q}(\gamma) = \max\{\|\hat\gamma\|_2^2 + 2\hat{v}_2^{\top} \frac{1}{n_2} W^{\top}(z - W\hat\gamma),\, 0\}$.

Ratio calculation:
8. $\hat{R} = \hat{I}/\sqrt{\hat{Q}(\beta)\hat{Q}(\gamma)}$ if $\hat{Q}(\beta)\hat{Q}(\gamma) > 0$, and $\hat{R} = 0$ otherwise, truncated to lie in $[-1, 1]$.
Once the projection vector $\hat{u}_1$ is obtained, $\hat\gamma^{\top}(\beta - \hat\beta)$ is estimated by $\hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta)$. Similarly, the projection vector $\hat{u}_2$ for estimating $\hat\beta^{\top}(\gamma - \hat\gamma)$ can be obtained via the convex algorithm

$$\hat{u}_2 = \arg\min_{u \in \mathbb{R}^p} \left\{ u^{\top} \hat\Sigma^W u \;:\; \|\hat\Sigma^W u - \hat\beta\|_\infty \le \lambda \|\hat\beta\|_2 \right\} \qquad (11)$$

where $\lambda \asymp \sqrt{\log p/n_2}$. Then $\hat\beta^{\top}(\gamma - \hat\gamma)$ is estimated by $\hat{u}_2^{\top} \frac{1}{n_2} W^{\top}(z - W\hat\gamma)$.

The final estimator of $\langle \beta, \gamma \rangle$ is given by

$$\hat{I} = \langle \hat\beta, \hat\gamma \rangle + \hat{u}_1^{\top} \frac{1}{n_1} X^{\top}(y - X\hat\beta) + \hat{u}_2^{\top} \frac{1}{n_2} W^{\top}(z - W\hat\gamma). \qquad (12)$$

It is clear from the above discussion that the key idea in the construction of the final estimator is to identify the projection vectors $\hat{u}_1$ and $\hat{u}_2$ such that $\hat\gamma^{\top}(\beta - \hat\beta)$ and $\hat\beta^{\top}(\gamma - \hat\gamma)$ are well approximated. It will be shown in Section 3 that the estimator $\hat{I}$ is adaptively minimax rate-optimal.
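Putting the pieces together, a compact sketch of the inner-product FDE in (12) reads as follows, reusing `scaled_lasso` and `projection_direction` from the sketches above. The defaults $\lambda_0 = .5\sqrt{2\log p/n}$ and $\lambda = \sqrt{2\log p/n}$ are plausible choices consistent with $\lambda_0, \lambda \asymp \sqrt{\log p/n}$ and the tuning discussion in Section 4, not a prescription from the paper.

```python
import numpy as np

def fde_inner_product(X, y, W, z, lam0=None, lam=None):
    """Functional de-biased estimator (12) of <beta, gamma>."""
    (n1, p), n2 = X.shape, W.shape[0]
    if lam0 is None:
        lam0 = 0.5 * np.sqrt(2 * np.log(p) / min(n1, n2))  # assumed default
    if lam is None:
        lam = np.sqrt(2 * np.log(p) / min(n1, n2))         # assumed default
    beta_hat, _ = scaled_lasso(X, y, lam0)
    gamma_hat, _ = scaled_lasso(W, z, lam0)
    u1, _ = projection_direction(X, gamma_hat, lam)
    u2, _ = projection_direction(W, beta_hat, lam)
    plug_in = beta_hat @ gamma_hat
    corr1 = u1 @ X.T @ (y - X @ beta_hat) / n1   # estimates gamma_hat'(beta - beta_hat)
    corr2 = u2 @ W.T @ (z - W @ gamma_hat) / n2  # estimates beta_hat'(gamma - gamma_hat)
    return plug_in + corr1 + corr2
```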
Remark 2.
As mentioned, simply plugging in the Lasso, scaled Lasso, or de-biased estimator does not lead to a good estimator of $\langle \beta, \gamma \rangle$. Another natural approach is to first threshold the de-biased estimator to obtain a sparse estimator of the coefficient vectors (see Zhang and Zhang (2014, Section 3.3) and Guo et al. (2016, equation (10)) for details) and then plug in this thresholded estimator. This estimator is referred to as the thresholded estimator. Simulations in Section 4 demonstrate that the proposed estimator $\hat{I}$ defined in (12) outperforms the three plug-in estimators based on the scaled Lasso, de-biased, and thresholded estimators.
2.2. Estimation of $Q(\beta)$ and $Q(\gamma)$
In order to estimate the normalized inner product $R(\beta, \gamma)$, it is necessary to estimate the quadratic functionals $Q(\beta) = \|\beta\|_2^2$ and $Q(\gamma) = \|\gamma\|_2^2$. To this end, we randomly split the data $(y, X)$ into two subsamples, $(y^{(1)}, X^{(1)})$ with sample size $n_{1,1}$ and $(y^{(2)}, X^{(2)})$ with sample size $n_{1,2}$, and the data $(z, W)$ into two subsamples, $(z^{(1)}, W^{(1)})$ with sample size $n_{2,1}$ and $(z^{(2)}, W^{(2)})$ with sample size $n_{2,2}$.

With a slight abuse of notation, let $\hat\beta$ and $\hat\gamma$ denote the optimizers of the scaled Lasso algorithm (4) applied to $(y^{(1)}, X^{(1)})$ and (5) applied to $(z^{(1)}, W^{(1)})$, respectively; in the scaled Lasso algorithms, the sample sizes $n_1$ and $n_2$ are replaced by $n_{1,1}$ and $n_{2,1}$. Again, the simple plug-in estimator $\|\hat\beta\|_2^2$ is not a good estimator of $Q(\beta)$ because of the following error decomposition,

$$\|\hat\beta\|_2^2 - \|\beta\|_2^2 = -2\hat\beta^{\top}(\beta - \hat\beta) - \|\beta - \hat\beta\|_2^2, \qquad (13)$$

where the second term on the right-hand side of (13) is "small", but the first can be large. Accordingly, the term $\hat\beta^{\top}(\beta - \hat\beta)$ is estimated first and then added to $\|\hat\beta\|_2^2$ to obtain the final estimator of $Q(\beta)$. To estimate $\hat\beta^{\top}(\beta - \hat\beta)$, a projection vector $\hat{v}_1$ is identified such that the following difference is controlled,

$$\hat{v}_1^{\top} \frac{1}{n_{1,2}} X^{(2)\top}\big(y^{(2)} - X^{(2)}\hat\beta\big) - \hat\beta^{\top}(\beta - \hat\beta) = \big(\hat\Sigma^{(2)} \hat{v}_1 - \hat\beta\big)^{\top}(\beta - \hat\beta) + \frac{1}{n_{1,2}} \hat{v}_1^{\top} X^{(2)\top}\epsilon_1^{(2)}, \qquad (14)$$

with $\hat\Sigma^{(2)} = X^{(2)\top}X^{(2)}/n_{1,2}$. Define the projection vector $\hat{v}_1$ as the solution to the following optimization algorithm

$$\hat{v}_1 = \arg\min_{u \in \mathbb{R}^p} \left\{ u^{\top} \hat\Sigma^{(2)} u \;:\; \|\hat\Sigma^{(2)} u - \hat\beta\|_\infty \le \lambda \|\hat\beta\|_2 \right\} \qquad (15)$$

where $\lambda \asymp \sqrt{\log p/n_{1,2}}$. We then estimate $\hat\beta^{\top}(\beta - \hat\beta)$ by $\hat{v}_1^{\top} \frac{1}{n_{1,2}} X^{(2)\top}(y^{(2)} - X^{(2)}\hat\beta)$ and propose the final estimator of $Q(\beta)$ as

$$\hat{Q}(\beta) = \max\left\{ \|\hat\beta\|_2^2 + 2\hat{v}_1^{\top} \frac{1}{n_{1,2}} X^{(2)\top}\big(y^{(2)} - X^{(2)}\hat\beta\big),\; 0 \right\}. \qquad (16)$$

Similarly, the estimator of $Q(\gamma)$ is given by

$$\hat{Q}(\gamma) = \max\left\{ \|\hat\gamma\|_2^2 + 2\hat{v}_2^{\top} \frac{1}{n_{2,2}} W^{(2)\top}\big(z^{(2)} - W^{(2)}\hat\gamma\big),\; 0 \right\} \qquad (17)$$

where

$$\hat{v}_2 = \arg\min_{u \in \mathbb{R}^p} \left\{ u^{\top} \tilde\Sigma^{(2)} u \;:\; \|\tilde\Sigma^{(2)} u - \hat\gamma\|_\infty \le \lambda \|\hat\gamma\|_2 \right\} \qquad (18)$$

with $\tilde\Sigma^{(2)} = W^{(2)\top}W^{(2)}/n_{2,2}$ and $\lambda \asymp \sqrt{\log p/n_{2,2}}$.
Remark 3.
Sample splitting is used here for the purpose of the theoretical analysis. In the simulation study (Section 4), the performance of the proposed estimator without sample splitting is also investigated; see Steps 5–7 in Table 1. The proposed estimator without sample splitting performs even better numerically than the one with sample splitting, since more observations are used in constructing the initial estimators $\hat\beta$ and $\hat\gamma$ and the projection vectors $\hat{v}_1$ and $\hat{v}_2$.
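Following the no-splitting variant in Steps 5–7 of Table 1, the quadratic-functional correction admits the same short sketch as the inner-product case; the positive-part truncation reflects the fact that $Q(\beta) \ge 0$.

```python
import numpy as np

def fde_quadratic(X, y, beta_hat, lam):
    """FDE estimator of Q(beta) = ||beta||_2^2 without sample splitting,
    as in Steps 5-7 of Table 1 (projection_direction defined earlier)."""
    n = X.shape[0]
    v, _ = projection_direction(X, beta_hat, lam)
    q_tilde = beta_hat @ beta_hat + 2 * v @ X.T @ (y - X @ beta_hat) / n
    return max(q_tilde, 0.0)  # Q(beta) is non-negative
```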
2.3. Estimation of $R(\beta, \gamma)$

Given the estimators $\hat{I}$, $\hat{Q}(\beta)$ and $\hat{Q}(\gamma)$ constructed in Sections 2.1 and 2.2, a natural estimator of the normalized inner product $R(\beta, \gamma)$ is given by

$$\hat{R} = \min\left\{ \max\left\{ \frac{\hat{I}}{\sqrt{\hat{Q}(\beta)\hat{Q}(\gamma)}},\; -1 \right\},\; 1 \right\}, \qquad (19)$$

where $\hat{I}$, $\hat{Q}(\beta)$ and $\hat{Q}(\gamma)$ are the estimators of $\langle \beta, \gamma \rangle$, $Q(\beta)$ and $Q(\gamma)$ defined in (12), (16) and (17), respectively. It is possible that one of $\hat{Q}(\beta)$ and $\hat{Q}(\gamma)$ is 0 when $\|\beta\|_2$ and $\|\gamma\|_2$ are close to zero; in this case, the normalized inner product is estimated as 0. Since $R(\beta, \gamma)$ is always between −1 and 1, the estimator is truncated to ensure that it lies within this range. The FDE algorithm without sample splitting for calculating the estimators $\hat{I}$, $\hat{Q}(\beta)$, $\hat{Q}(\gamma)$, and $\hat{R}$ is detailed in Table 1.
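The ratio step is then a one-liner once the three functional estimates are available; the zero and truncation conventions of (19) are made explicit below.

```python
import numpy as np

def fde_correlation(i_hat, q_beta_hat, q_gamma_hat):
    """Estimator (19) of R(beta, gamma) with the conventions of Section 2.3."""
    denom2 = q_beta_hat * q_gamma_hat
    if denom2 <= 0.0:
        return 0.0                       # a vanishing norm estimate gives R_hat = 0
    r = i_hat / np.sqrt(denom2)
    return float(np.clip(r, -1.0, 1.0))  # truncate to [-1, 1]
```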
3. Theoretical Analysis
3.1. Upper Bound Analysis
The sample sizes $n_1$ and $n_2$ are assumed to be of the same order, that is, $n_1 \asymp n_2$. Let $n = \min\{n_1, n_2\}$ be the smaller of the two sample sizes. The following assumptions are introduced to facilitate the theoretical analysis.
(A1) The population covariance matrices $\Sigma^X$ and $\Sigma^W$ satisfy $1/M_1 \le \lambda_{\min}(\Sigma^X) \le \lambda_{\max}(\Sigma^X) \le M_1$ and $1/M_1 \le \lambda_{\min}(\Sigma^W) \le \lambda_{\max}(\Sigma^W) \le M_1$, where $M_1 \ge 1$ is a positive constant. The random design matrix $X$ is assumed to be independent of the other random design matrix $W$. The noise levels $\sigma_1$ and $\sigma_2$ satisfy $\max\{\sigma_1, \sigma_2\} \le M_2$, where $M_2$ is a positive constant.

(A2) The norms of the coefficient vectors $\beta$ and $\gamma$ are bounded away from zero in the sense that

$$\min\left\{\|\beta\|_2^2,\; \|\gamma\|_2^2\right\} \ge l_0^2 \ge c_0 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right). \qquad (20)$$

Assumption (A1) places a condition on the spectra of the covariance matrices $\Sigma^X$ and $\Sigma^W$ and an upper bound on the noise levels $\sigma_1$ and $\sigma_2$. Assumption (A2) requires that the total strengths of the signals be bounded away from zero at the level $l_0$; it is only used in the upper bound analysis of the normalized inner product $R(\beta, \gamma)$.
The following theorem establishes the convergence rates of the estimators $\hat{I}$, $\hat{Q}(\beta)$ and $\hat{Q}(\gamma)$ proposed in (12), (16) and (17), respectively.

Theorem 1.

Suppose that assumption (A1) holds and $k \le c\, n/\log p$ for some sufficiently small constant $c > 0$. Then, for any fixed constant $c_0 > 0$, with probability at least $1 - p^{-c_0} - \exp(-cn)$, we have

$$\left|\hat{I} - \langle \beta, \gamma \rangle\right| \le C \left\{ \left(\|\beta\|_2 + \|\gamma\|_2\right)\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\}, \qquad (21)$$

$$\left|\hat{Q}(\beta) - \|\beta\|_2^2\right| \le C \left\{ \|\beta\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\}, \qquad (22)$$

$$\left|\hat{Q}(\gamma) - \|\gamma\|_2^2\right| \le C \left\{ \|\gamma\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\}, \qquad (23)$$

where $C$ is a positive constant.

The upper bound for estimating $\langle \beta, \gamma \rangle$ not only depends on $k$, $p$ and $n$, but also scales with the signal strengths $\|\beta\|_2$ and $\|\gamma\|_2$. For the estimation of the quadratic functional $Q(\beta)$ (or $Q(\gamma)$), the convergence rate depends on $\|\beta\|_2$ (or $\|\gamma\|_2$). The following theorem establishes the convergence rate of the estimator $\hat{R}$ proposed in (19).
Theorem 2.

Suppose that assumptions (A1) and (A2) hold and $k \le c\, n/\log p$ for some sufficiently small constant $c > 0$. Then, for any fixed constant $c_0 > 0$, with probability at least $1 - p^{-c_0} - \exp(-cn)$, we have

$$\left|\hat{R} - R(\beta, \gamma)\right| \le \frac{C}{\min\{\|\beta\|_2, \|\gamma\|_2\}} \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right), \qquad (24)$$

where $C$ is a positive constant.

In contrast to Theorem 1, Theorem 2 requires the extra assumption (A2) on the signal strengths $\|\beta\|_2$ and $\|\gamma\|_2$. The convergence rate for estimating $R(\beta, \gamma)$ scales with the inverse of the signal strength $\min\{\|\beta\|_2, \|\gamma\|_2\}$. This is different from the error bounds in Theorem 1, where the estimation accuracy scales with the signal strength itself. The lower bound results established in Theorem 3 will demonstrate the necessity of assumption (A2) for the estimation of $R(\beta, \gamma)$.
3.2. Minimax Lower Bounds
This section establishes the minimax lower bounds for estimating $\langle \beta, \gamma \rangle$, $Q(\beta)$, $Q(\gamma)$ and $R(\beta, \gamma)$. We first introduce the parameter space for $(\beta, \gamma)$, which is defined as the product of parameter spaces for $\beta$ and $\gamma$. We define the following parameter space for both $\beta$ and $\gamma$,

$$\Theta(k, M) = \left\{ (\delta, \Sigma, \sigma) : \|\delta\|_0 \le k,\; \|\delta\|_2 \le M,\; \frac{1}{M_1} \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1,\; 0 < \sigma \le M_2 \right\}, \qquad (25)$$

where $M_1 \ge 1$ and $M_2 > 0$ are positive constants. The parameter space defined in (25) requires that the signal $\delta$ contain at most $k$ non-zero coefficients and that its $\ell_2$ norm be upper bounded by $M$, where $M$ is allowed to grow with $n$ and $p$. The lower bound results in Theorem 3 show that the estimation difficulties of $\langle \beta, \gamma \rangle$, $Q(\beta)$ and $Q(\gamma)$ depend on $M$. The other conditions, on $\lambda_{\min}(\Sigma)$, $\lambda_{\max}(\Sigma)$ and $\sigma$, are regularity conditions. Based on the definition (25), the parameter space for $(\beta, \gamma)$ is defined as a product of two parameter spaces,

$$\bar\Theta(k, M) = \Theta(k, M) \times \Theta(k, M). \qquad (26)$$

For establishing optimal bounds for $R(\beta, \gamma)$, we define the following parameter space

$$\tilde\Theta(k, l_0) = \Theta_{l_0}(k) \times \Theta_{l_0}(k), \qquad (27)$$

where

$$\Theta_{l_0}(k) = \left\{ (\delta, \Sigma, \sigma) : \|\delta\|_0 \le k,\; \|\delta\|_2 \ge l_0,\; \frac{1}{M_1} \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le M_1,\; 0 < \sigma \le M_2 \right\}$$

with $M_1 \ge 1$ and $M_2 > 0$. In contrast to the parameter space $\Theta(k, M)$, where $\|\delta\|_2$ is upper bounded by $M$, the parameter space $\tilde\Theta(k, l_0)$ requires the signal strength to be lower bounded by $l_0$, where $l_0$ is allowed to vary with $n$ and $p$. The lower bound in Theorem 3 shows that the estimation difficulty of $R(\beta, \gamma)$ depends on $l_0$.
The following theorem establishes the minimax lower bounds for the convergence rates of estimating the inner product $\langle \beta, \gamma \rangle$, the quadratic functionals $Q(\beta)$ and $Q(\gamma)$, and the normalized inner product $R(\beta, \gamma)$.

Theorem 3.

Suppose $k \le c \min\{p^{\zeta}, n/\log p\}$ for some constants $c > 0$ and $0 \le \zeta < 1/2$, and $M \ge \sqrt{k \log p/n}$. Then

$$\inf_{\hat{I}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{I} - \langle \beta, \gamma \rangle\right| \ge c^* \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \ge \frac{1}{4}, \qquad (28)$$

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{Q} - Q(\beta)\right| \ge c^* \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \ge \frac{1}{4}, \qquad (29)$$

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{Q} - Q(\gamma)\right| \ge c^* \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \ge \frac{1}{4}, \qquad (30)$$

$$\inf_{\hat{R}} \sup_{\tilde\Theta(k, l_0)} \mathbb{P}\left( \left|\hat{R} - R(\beta, \gamma)\right| \ge c^* \min\left\{ \frac{1}{l_0}\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right),\; 1 \right\} \right) \ge \frac{1}{4}, \qquad (31)$$

where $c^*$ is a positive constant.
Remark 4.
Estimation of quadratic functionals has been extensively studied in the classical Gaussian sequence model; see, for example, Donoho and Nussbaum (1990); Efromovich and Low (1996); Laurent and Massart (2000); Cai and Low (2005, 2006); Collier et al. (2015). In the strong signal regime $M \ge c\, k \log p/\sqrt{n}$ for some constant $c > 0$, Theorem 2 in Collier et al. (2015) gives a lower bound of order $k\log p/n + M/\sqrt{n}$ for estimating $Q(\beta)$ in the sequence model. In contrast, an extra term of order $M\, k \log p/n$ appears in the lower bound given in (29) for estimating $Q(\beta)$ in high-dimensional linear regression. One intuitive reason for this extra term is that high-dimensional linear regression involves an additional inversion step compared with the Gaussian sequence model. Estimation of the quadratic functional in high-dimensional linear regression is thus fundamentally harder than in the Gaussian sequence model. For high-dimensional linear regression, the estimation lower bound in (29) can also be established using the general lower bounds developed in Cai and Guo (2017a); see Section 8 of Cai and Guo (2017a) for details.
3.3. Optimality of FDEs
In this section, we establish the optimality of FDEs by combining Theorems 1 and 2 over the parameter spaces $\bar\Theta(k, M)$ and $\tilde\Theta(k, l_0)$ defined in (26) and (27), respectively.

Corollary 1.

Suppose $k \le c \min\{p^{\zeta}, n/\log p\}$ and $M \ge c_1 \sqrt{k \log p/n}$ for some constants $c > 0$, $c_1 > 0$ and $0 \le \zeta < 1/2$. Then

$$\sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{I} - \langle \beta, \gamma \rangle\right| \ge C \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \to 0, \qquad (32)$$

$$\sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{Q}(\beta) - Q(\beta)\right| \ge C \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \to 0, \qquad (33)$$

$$\sup_{\bar\Theta(k, M)} \mathbb{P}\left( \left|\hat{Q}(\gamma) - Q(\gamma)\right| \ge C \left\{ M\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\} \right) \to 0, \qquad (34)$$

$$\sup_{\tilde\Theta(k, l_0)} \mathbb{P}\left( \left|\hat{R} - R(\beta, \gamma)\right| \ge \frac{C}{l_0}\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) \right) \to 0, \qquad (35)$$

where $n = \min\{n_1, n_2\}$ and $C$ is a positive constant.

Combined with Theorem 3, Corollary 1 implies that, for $M \ge c_1 \sqrt{k \log p/n}$, the estimators $\hat{I}$, $\hat{Q}(\beta)$ and $\hat{Q}(\gamma)$ proposed in (12), (16) and (17) achieve the minimax lower bounds (28), (29) and (30) within a constant factor; that is, the FDEs are minimax rate-optimal. On the other hand, if $M \le c_1 \sqrt{k \log p/n}$, estimation of $\langle \beta, \gamma \rangle$, $Q(\beta)$ and $Q(\gamma)$ is uninteresting, as the trivial estimator 0 achieves the minimax lower bound in this case. For the estimation of $R(\beta, \gamma)$, under assumption (A2), Corollary 1 shows that the estimator $\hat{R}$ given in (19) achieves the minimax lower bound in (31); hence $\hat{R}$ is the rate-optimal estimator of $R(\beta, \gamma)$ under assumption (A2). When $l_0^2 \le c_0 (k \log p/n + 1/\sqrt{n})$, estimation of $R(\beta, \gamma)$ becomes trivial, as the simple estimator 0 attains the minimax lower bound. This demonstrates the necessity of assumption (A2) in Theorem 2.
4. Simulation Evaluations and Comparisons
We compare the finite-sample performance of several estimators of $\langle \beta, \gamma \rangle$ and $R(\beta, \gamma)$ using simulations. These estimators include the plug-in scaled Lasso estimator (Sun and Zhang, 2012), the plug-in de-biased estimator (Javanmard and Montanari, 2014; van de Geer et al., 2014; Zhang and Zhang, 2014), the plug-in thresholded estimator (Zhang and Zhang, 2014, Section 3.3), and the proposed FDE. Specifically, they are defined as follows.

- FDE: the inner product is estimated by $\hat{I}$ in (12) and the ratio by $\hat{R}$ in (19). We consider FDE with sample splitting (FDE-S) and without sample splitting (FDE-NS) for estimating $R(\beta, \gamma)$.
- Plug-in scaled Lasso estimator (Lasso): the inner product is estimated by $\langle \hat\beta, \hat\gamma \rangle$ and the normalized inner product by $\langle \hat\beta, \hat\gamma \rangle/(\|\hat\beta\|_2 \|\hat\gamma\|_2)$.
- Plug-in de-biased estimator (De-biased): denote the de-biased Lasso estimators by $\tilde\beta$ and $\tilde\gamma$. The inner product is estimated by $\langle \tilde\beta, \tilde\gamma \rangle$ and the normalized inner product by $\langle \tilde\beta, \tilde\gamma \rangle/(\|\tilde\beta\|_2 \|\tilde\gamma\|_2)$.
- Plug-in thresholded estimator (Thresholded): denote the thresholded estimators by $\check\beta$ and $\check\gamma$. The inner product is estimated by $\langle \check\beta, \check\gamma \rangle$ and the normalized inner product by $\langle \check\beta, \check\gamma \rangle/(\|\check\beta\|_2 \|\check\gamma\|_2)$.
Implementation of the de-biased, thresholded and FDE estimators requires the scaled Lasso estimators $\hat\beta$ and $\hat\gamma$ in the initial step. The scaled Lasso estimator is implemented via the equivalent square-root Lasso algorithm (Belloni et al., 2011). The theoretical tuning parameter $\lambda_0 \asymp \sqrt{\log p/n}$ may be conservative in numerical studies; instead, the tuning parameter is chosen as $\lambda_0 = b\sqrt{2 \log p/n}$. The performances of all estimators were evaluated across a grid of tuning parameter values (see Supplementary Material, Section A.1), and the results showed that $b = .5$ was a good choice for all the estimators. Hence, $\lambda_0 = .5\sqrt{2 \log p/n}$ was used for the numerical studies in this section and in Section 5. To implement the FDE algorithm, the other tuning parameter is chosen as $\lambda = \sqrt{2 \log p/n}$ for the correction Steps 2, 3, 5 and 6 in Table 1.
Comparisons of the estimates of $\langle \beta, \gamma \rangle$ and $R(\beta, \gamma)$ are presented below; results on estimating the quadratic functionals are presented in the Supplementary Material, Section A.2. For each setting, with the sample sizes, dimension, sparsity and signal strength parameters specified, we generate the data and compare the different methods as follows (a code sketch of one replication is given after the list):

1. Generate the support sets of $\beta$ and $\gamma$, including their shared support. Generate the nonzero coefficients $\beta_j$ for $j$ in the support of $\beta$ and $\gamma_j$ for $j$ in the support of $\gamma$, and set all remaining coefficients to zero.
2. Generate the design matrices $X \in \mathbb{R}^{n_1 \times p}$ and $W \in \mathbb{R}^{n_2 \times p}$ with i.i.d. rows drawn from $N(0, \Sigma^X)$ and $N(0, \Sigma^W)$, respectively.
3. Generate the noise vectors $\epsilon_1 \sim N(0, \sigma_1^2 \mathrm{I}_{n_1})$ and $\epsilon_2 \sim N(0, \sigma_2^2 \mathrm{I}_{n_2})$, and generate the outcomes as $y = X\beta + \epsilon_1$ and $z = W\gamma + \epsilon_2$.
4. With $(y, X)$ and $(z, W)$, estimate $\langle \beta, \gamma \rangle$ and $R(\beta, \gamma)$ using the different estimators.
5. Repeat steps 2–4 $L$ times.
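For concreteness, one replication of steps 1–3 might look as follows; the equicorrelated covariance, signal magnitudes and sizes here are illustrative placeholders of ours, not the settings used in the paper.

```python
import numpy as np

def one_replication(rng, n1=200, n2=200, p=500, k1=10, k2=10, shared=5,
                    s1=1.0, s2=1.0, rho=0.5, sigma1=1.0, sigma2=1.0):
    """Generate (y, X, z, W, beta, gamma) for one simulation replication.
    All numeric defaults are illustrative, not the paper's settings."""
    # supports with a common block of `shared` signals
    supp_beta = np.arange(k1)
    supp_gamma = np.arange(k1 - shared, k1 - shared + k2)
    beta, gamma = np.zeros(p), np.zeros(p)
    beta[supp_beta], gamma[supp_gamma] = s1, s2
    # equicorrelated design covariance (illustrative choice)
    Sigma = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n1)
    W = rng.multivariate_normal(np.zeros(p), Sigma, size=n2)
    y = X @ beta + sigma1 * rng.standard_normal(n1)
    z = W @ gamma + sigma2 * rng.standard_normal(n2)
    return y, X, z, W, beta, gamma
```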
We evaluate the performance of an estimator by its mean squared error (MSE), defined as

$$\mathrm{MSE}(\hat{T}) = \frac{1}{L} \sum_{l=1}^{L} \left(\hat{T}^{(l)} - T\right)^2 \qquad (36)$$

for a given quantity $T$ and its estimate $\hat{T}^{(l)}$ from the $l$-th replication. We consider two different settings, each with its own set of sample size, dimension, sparsity and signal strength parameters, and the simulation for each setting is repeated $L = 300$ times.
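Given the generators and estimators sketched above, the full comparison loop per (36) is a few lines; `fde_inner_product` and `one_replication` are the hypothetical helpers from the earlier sketches.

```python
import numpy as np

def run_experiment(L=300, seed=0, **setting):
    """Repeat one simulation setting L times and report the MSE in (36)."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(L):
        y, X, z, W, beta, gamma = one_replication(rng, **setting)
        truth = beta @ gamma
        errors.append((fde_inner_product(X, y, W, z) - truth) ** 2)
    return float(np.mean(errors))  # MSE of the FDE inner-product estimate
```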
Experiment 1.
The sample sizes $(n_1, n_2)$, dimension $p$, sparsity parameters and covariance matrices $\Sigma^X = \Sigma^W$ are held fixed. For given positive values $s_1$ and $s_2$, the nonzero signals of $\beta$ are set using the strength parameter $s_1$ and the nonzero signals of $\gamma$ using $s_2$. This simulation aims to investigate the case where the coefficients for one regression are much larger than those for the other, by varying the signal strength parameters $(s_1, s_2)$ over the values listed in Table 2.
The results are summarized in Table 2. For all combinations of the signal strength parameters, FDE consistently outperformed the plug-in estimates based on the Lasso and the thresholded Lasso in terms of estimating the inner product $\langle \beta, \gamma \rangle$. Moreover, as the difference between $s_1$ and $s_2$ increased, the advantage of FDE over the plug-in estimates using the Lasso or thresholded Lasso became larger. Similar results were observed for the estimation of the normalized inner product $R(\beta, \gamma)$, where FDE-NS had consistently better performance than the other methods. Although De-biased performed well in terms of estimating $\langle \beta, \gamma \rangle$, it performed much worse than FDE-NS for estimating $R(\beta, \gamma)$.
Table 2:
Mean squared errors (MSE) of the estimates of the inner product $\langle \beta, \gamma \rangle$ and the normalized inner product $R(\beta, \gamma)$ for various signal strength parameters. Lasso: plug-in estimator with the scaled Lasso estimator; De-biased: plug-in estimator with the de-biased estimator; Thresholded: plug-in estimator with the thresholded estimator; FDE: the proposed estimator $\hat{I}$; FDE-S: the proposed estimator $\hat{R}$ with sample splitting; FDE-NS: the proposed estimator $\hat{R}$ without sample splitting.
| Strength parameters $(s_1, s_2)$ | (1.8, .4) | (2.2, .3) | (2.6, .2) | (3, .1) | (.1, 1.6) | (.2, 1.4) | (.3, 1.2) | (.4, 1) |
|---|---|---|---|---|---|---|---|---|
| Truth $\langle \beta, \gamma \rangle$ | 8.088 | 7.414 | 5.841 | 3.370 | 1.797 | 3.145 | 4.044 | 4.493 |
| MSE for $\langle \beta, \gamma \rangle$: | | | | | | | | |
| Lasso | 9.295 | 11.564 | 12.560 | 7.279 | 2.377 | 4.889 | 5.409 | 4.800 |
| De-biased | 1.733 | 2.191 | 2.324 | 1.386 | .449 | .838 | .985 | .886 |
| Thresholded | 2.029 | 3.377 | 6.463 | 5.789 | 1.877 | 3.024 | 2.432 | 1.546 |
| FDE | 1.847 | 2.471 | 2.662 | 2.118 | .734 | .995 | 1.028 | .986 |
| Truth $R(\beta, \gamma)$ | .5314 | .5314 | .5314 | .5314 | .5314 | .5314 | .5314 | .5314 |
| MSE for $R(\beta, \gamma)$: | | | | | | | | |
| Lasso | .0023 | .0075 | .0332 | .1260 | .1574 | .0624 | .0227 | .0087 |
| De-biased | .0208 | .0415 | .0864 | .1590 | .1736 | .1068 | .0627 | .0373 |
| Thresholded | .0045 | .0139 | .0585 | .1753 | .0964 | .0981 | .0389 | .0153 |
| FDE-S | .0337 | .0303 | .0621 | .0678 | .2130 | .1199 | .0694 | .0616 |
| FDE-NS | .0036 | .0064 | .0163 | .0580 | .0892 | .0237 | .0116 | .0061 |
As discussed in Section 2, sample splitting in estimating the normalized inner product is used only to facilitate the theoretical analysis and is not necessary for the algorithm. Our simulation results indicate that the proposed estimator without sample splitting (FDE-NS) performed quite well in all settings, even better than FDE-S, owing to the fact that more samples were used in the estimation and correction steps. These observations led us to use the estimator without sample splitting (FDE-NS) in the real data analysis in Section 5.
Experiment 2.
The dimension, signal strength parameters and covariance matrices $\Sigma^X = \Sigma^W$ are held fixed, and the nonzero signals of $\beta$ and of $\gamma$ are generated as in Experiment 1. This setting investigates the relationship between the performance of the estimators and the signal sparsity level: the sparsity levels of $\beta$ and $\gamma$ are varied over the values listed in Table 3, with the number of common signals held fixed. Since the number of associated variants is very large for both coefficient vectors, large values of $s_1$ and $s_2$ would induce strong signals such that all methods perform well. Instead, we consider a more challenging setting where the signal magnitudes are small.
The results are summarized in Table 3. Clearly, FDE outperformed the other methods; when the signals became denser, the improvement of FDE over the other methods was more pronounced. For the estimation of $R(\beta, \gamma)$, the results showed that FDE-NS consistently outperformed the other estimators. As the number of signals increased, the MSE of FDE-NS decreased quickly.
Table 3:
Mean squared errors (MSE) of the estimates of the inner product $\langle \beta, \gamma \rangle$ and the normalized inner product $R(\beta, \gamma)$ for various sparsity parameters. Lasso: plug-in estimator with the scaled Lasso estimator; De-biased: plug-in estimator with the de-biased estimator; Thresholded: plug-in estimator with the thresholded estimator; FDE: the proposed estimator $\hat{I}$; FDE-S: the proposed estimator $\hat{R}$ with sample splitting; FDE-NS: the proposed estimator $\hat{R}$ without sample splitting.
| Sparsity parameter $k$ | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 110 |
|---|---|---|---|---|---|---|---|---|
| Truth $\langle \beta, \gamma \rangle$ | .190 | .170 | .219 | .212 | .179 | .221 | .183 | .221 |
| MSE for $\langle \beta, \gamma \rangle$: | | | | | | | | |
| Lasso | .032 | .025 | .039 | .036 | .024 | .035 | .023 | .028 |
| De-biased | .015 | .015 | .018 | .017 | .024 | .027 | .040 | .066 |
| Thresholded | .027 | .021 | .031 | .029 | .020 | .025 | .018 | .018 |
| FDE | .020 | .014 | .021 | .022 | .011 | .013 | .008 | .008 |
| Truth $R(\beta, \gamma)$ | .4027 | .2908 | .3122 | .2592 | .1914 | .2110 | .1573 | .1725 |
| MSE for $R(\beta, \gamma)$: | | | | | | | | |
| Lasso | .1157 | .0517 | .0539 | .0370 | .0166 | .0180 | .0097 | .0079 |
| De-biased | .1267 | .0601 | .0659 | .0411 | .0160 | .0173 | .0063 | .0059 |
| Thresholded | .1392 | .0687 | .0732 | .0504 | .0262 | .0277 | .0155 | .0142 |
| FDE-S | .1154 | .1225 | .0779 | .0574 | .0456 | .0499 | .0450 | .0493 |
| FDE-NS | .0847 | .0340 | .0368 | .0294 | .0115 | .0091 | .0055 | .0047 |
5. Genetic Relatedness of Yeast Colony Growth Traits Based on Genome-Wide Association Data
Bloom et al. (2013) reported a large-scale genome-wide association study of 46 quantitative traits based on 1,008 Saccharomyces cerevisiae segregants derived from a cross between a laboratory strain and a wine strain. The data set included 11,623 unique genotype markers. Since many of these markers are highly correlated and differ only in a few samples, Bloom et al. (2013) further selected a set of 4,410 weakly dependent markers based on the linkage disequilibrium information; specifically, these markers were selected by picking the marker closest to each centimorgan position on the genetic map. The marker genotypes are coded as 1 or −1, according to which strain they came from, and satisfy the sub-Gaussian conditions. The traits of interest were the end-point colony sizes, normalized by the control growth, under 46 different growth media, including Hydrogen Peroxide, Diamide, Calcium, Yeast Nitrogen Base (YNB) and Yeast extract Peptone Dextrose (YPD). Bloom et al. (2013) showed that the genetic variants are associated with many of these trait values. It is therefore of interest to estimate the genetic relatedness among these related traits.
To demonstrate the genetic relatedness among these traits, eight traits were considered, including the normalized colony sizes under Calcium Chloride (Calcium), Diamide, Hydrogen Peroxide (Hydrogen), Paraquat, Raffinose, 6-Azauracil (Azauracil), YNB, and YPD. Each trait was normalized to have variance 1, so the quadratic functional $\|\beta\|_2^2$ represents the total genetic effect for each trait and provides an estimate of its heritability. FDE was applied to every pair of these 8 traits without sample splitting, for a total of 28 pairs. The results are summarized in Table 4, including estimates of the heritability, genetic covariance and genetic correlation for each of the 28 pairs. The estimated heritability of these traits ranged from 0.22 for Raffinose to 0.67 for YPD. About two thirds of these pairs had an estimated genetic correlation smaller than 0.1, indicating relatively weak genetic correlations among these traits.
Table 4:
FDE estimates of the heritability (bold diagonal), genetic covariance (upper off-diagonal entries) and genetic correlation (lower off-diagonal entries) for each pair of the 8 colony growth traits of the yeast segregants.
| Traits | Calcium | Diamide | Hydrogen | Paraquat | Raffinose | Azauracil | YNB | YPD |
|---|---|---|---|---|---|---|---|---|
| Calcium | **.3314** | −.0189 | −.1003 | .0084 | .0927 | .0095 | .0656 | −.0134 |
| Diamide | −.0286 | **.4390** | .0598 | −.0039 | .0500 | .0446 | −.0159 | .0803 |
| Hydrogen | −.1579 | .0942 | **.4033** | .0576 | −.1040 | .0601 | .0672 | .0637 |
| Paraquat | .0117 | −.0053 | .0799 | **.5199** | .0023 | .0365 | .1148 | .1029 |
| Raffinose | .1972 | .1065 | −.2213 | .0049 | **.2208** | .0137 | .0830 | .0331 |
| Azauracil | .0172 | .0809 | .1089 | .0661 | .0248 | **.3045** | −.0259 | .0703 |
| YNB | .0968 | −.0235 | .0991 | .1693 | .1224 | −.0383 | **.4594** | .4246 |
| YPD | −.0164 | .0983 | .0779 | .1259 | .0405 | .0860 | .5195 | **.6680** |
To further demonstrate the genetic relatedness among these pairs, for each trait a Z-score was calculated by regressing the trait value on each genetic marker $X_{\cdot j}$ separately. A larger absolute value of the Z-score implies a stronger effect of the marker on the trait. For any pair of traits, the scatter plot of the Z-scores provides a way of revealing the shared genetic relationship between them. The scatter plots of the Z-scores for all 28 pairs of traits are included in Section D of the Supplemental Materials. Figure 1(a) shows the plots for several pairs of traits: the pairs with a large positive estimated genetic covariance, YPD vs. YNB and Paraquat vs. YNB; pairs with a large negative estimated genetic covariance, Raffinose vs. Hydrogen and Calcium vs. Hydrogen; and pairs with estimated genetic covariance near 0, including Paraquat vs. Diamide and Paraquat vs. Raffinose. The plots clearly indicate a strong positive genetic covariance between YPD and YNB; the genetic covariance between Paraquat and YNB/YPD is smaller. The Raffinose/Hydrogen and Calcium/Hydrogen pairs clearly show negative genetic correlations. There are several genetic variants with very large effects on Hydrogen, but they are not associated with the other traits such as Raffinose and Calcium; the shared genetic variants have relatively weak effects, leading to smaller genetic covariances. The plots on the bottom show pairs of traits with weak genetic covariances. These plots indicate that the proposed genetic correlation measures can indeed capture the genetic sharing among different related traits.
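The marginal Z-scores underlying Figures 1 and 2 can be computed along these lines; using the slope t-statistic from a simple linear regression as the Z-score is our reading of the description above, and for samples of this size the two are essentially identical.

```python
import numpy as np
from scipy.stats import linregress

def marginal_z_scores(X, y):
    """Z-score for each marker from regressing the trait y on that marker alone."""
    zs = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        fit = linregress(X[:, j], y)
        zs[j] = fit.slope / fit.stderr   # slope t-statistic, ~ Z for large n
    return zs
```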
Figure 1:
Scatter plots of marginal regression Z-score statistics for six pairs of traits ranked by the estimated genetic covariance (gcov) based on FDE (a) or LD regression (b), including the pairs with large positive genetic covariance (left panel), negative genetic covariance (middle panel), and small genetic covariance (right panel).
Figure 2 shows six pairs of phenotypes ranked by the estimated genetic correlations from FDE, including the two with the largest positive genetic correlations, the two with the largest negative genetic correlations, and two with small genetic correlations. The pairs identified agree with the marginal Z-scores very well.
Figure 2:
Scatter plots of marginal regression Z-score statistics for six pairs of traits ranked by the estimated genetic correlation (gcor) based on FDE, including the pairs with large positive genetic correlation (left panel), negative genetic correlation (middle panel), and small genetic correlation (right panel).
As a comparison, we also obtained the estimated genetic covariance for each pair of traits using the LD score regression method proposed by Bulik-Sullivan et al. (2015). The pairs of traits with large positive, negative or weak estimated covariances are presented in Figure 1(b). The pairs with the largest positive and negative estimated covariances differ from the two pairs identified by FDE. Comparison of the scatter plots of the Z-score statistics in Figure 1 indicates that the pairs identified by FDE agree better with the marginal Z-statistics.
6. Discussion
Motivated by the problem of estimating the genetic relatedness between two traits using GWAS data, we have considered the estimation of several functionals of the regression coefficients of two linear models, including the inner product $\langle \beta, \gamma \rangle$, the quadratic functionals $\|\beta\|_2^2$ and $\|\gamma\|_2^2$, and the ratio $R(\beta, \gamma)$. The proposed method is different from plugging in the de-biased estimators proposed in Javanmard and Montanari (2014); van de Geer et al. (2014); Zhang and Zhang (2014): the correction procedures are applied to the inner product and quadratic functionals directly, which balances the bias and variance specifically for these functionals and hence results in minimax rate-optimal estimators. The proposed estimators were shown in simulations to have smaller estimation errors than direct plug-in of the de-biased estimators across different settings. Results from the analysis of the yeast segregant data suggest that the yeast colony growth sizes are under similar genetic controls for certain growth media, such as YPD and YNB, but this is not true for all pairs of growth media considered.
The algorithm for obtaining these estimates only involves applying the Lasso several times, which can be implemented efficiently using coordinate descent algorithms. The Matlab code implementing the proposed estimation methods is available at http://statgene.med.upenn.edu/software.html. An important direction for future research is to quantify the uncertainty of the proposed estimators; the upper bound analyses in (21)–(23) and (24) indicate the possibility of constructing confidence intervals, centered at the proposed estimators and of parametric length $1/\sqrt{n}$, under additional sparsity and other regularity conditions.
7. Proofs
In this section, we prove Theorem 1 and the bounds (29) and (30) of Theorem 3. The proofs of Theorem 2, of (28) and (31) of Theorem 3, and of the auxiliary lemmas are presented in the supplementary materials.
7.1. Proof of Theorem 1
For simplicity of notation, we assume $n_1 = n_2$ and use $n$ to represent the common sample size throughout the proof; the proofs can be easily generalized to the case $n_1 \asymp n_2$. Without loss of generality, we assume that the sub-Gaussian norms of the random vectors $X_{i\cdot}$ and $W_{i\cdot}$ are also upper bounded by the constant $M_1$, that is, $\max\{\|X_{i\cdot}\|_{\psi_2}, \|W_{i\cdot}\|_{\psi_2}\} \le M_1$.
Proof of (21)
The upper bound is based on the following decomposition,

$$\hat{I} - \langle \beta, \gamma \rangle = \left( \hat{u}_1^{\top} \frac{1}{n} X^{\top}(y - X\hat\beta) - \hat\gamma^{\top}(\beta - \hat\beta) \right) + \left( \hat{u}_2^{\top} \frac{1}{n} W^{\top}(z - W\hat\gamma) - \hat\beta^{\top}(\gamma - \hat\gamma) \right) - (\hat\beta - \beta)^{\top}(\hat\gamma - \gamma). \qquad (37)$$

The following lemmas are introduced to control the terms in (37); similar results were established in the analysis of the Lasso, scaled Lasso and de-biased Lasso (Cai and Guo, 2017b; Ren et al., 2015; Sun and Zhang, 2012; Ye and Zhang, 2010). The proofs of the following lemmas can be found in the supplementary material, Section D.

Lemma 1.

Suppose that assumption (A1) holds and $k \le c\, n/\log p$ for some sufficiently small constant $c > 0$. Then, with probability at least $1 - p^{-c_1} - \exp(-c_1 n)$, we have

$$\max\left\{\|\hat\beta - \beta\|_1,\; \|\hat\gamma - \gamma\|_1\right\} \le C_1\, k \sqrt{\frac{\log p}{n}}, \qquad (38)$$

$$\max\left\{\|\hat\beta - \beta\|_2^2,\; \|\hat\gamma - \gamma\|_2^2\right\} \le C_1 \frac{k \log p}{n}, \qquad (39)$$

where $c_1$ and $C_1$ are positive constants.

Lemma 2.

Suppose that assumption (A1) holds and $k \le c\, n/\log p$ for some sufficiently small constant $c > 0$. Then, with probability at least $1 - p^{-c_2} - \exp(-c_2 n)$, we have

$$\left| \hat{u}_1^{\top} \frac{1}{n} X^{\top}(y - X\hat\beta) - \hat\gamma^{\top}(\beta - \hat\beta) \right| \le C_2 \|\hat\gamma\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right), \qquad (40)$$

$$\left| \hat{u}_2^{\top} \frac{1}{n} W^{\top}(z - W\hat\gamma) - \hat\beta^{\top}(\gamma - \hat\gamma) \right| \le C_2 \|\hat\beta\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right), \qquad (41)$$

where $c_2$ and $C_2$ are positive constants.

By the decomposition (37), the Cauchy–Schwarz inequality, and the inequalities (39), (40) and (41), we obtain that

$$\left|\hat{I} - \langle \beta, \gamma \rangle\right| \le C \left\{ \left(\|\beta\|_2 + \|\gamma\|_2\right)\left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) + \frac{k \log p}{n} \right\}.$$
Proof of (22) and (23)
The proof of (23) is similar to that of (22), and only the proof of (22) is presented in the following. We introduce the estimator $\tilde{Q}(\beta) = \|\hat\beta\|_2^2 + 2\hat{v}_1^{\top} \frac{1}{n} X^{(2)\top}(y^{(2)} - X^{(2)}\hat\beta)$; since $Q(\beta)$ is non-negative and $\hat{Q}(\beta) = \max\{\tilde{Q}(\beta), 0\}$, we have $|\hat{Q}(\beta) - Q(\beta)| \le |\tilde{Q}(\beta) - Q(\beta)|$. We decompose the difference between $\tilde{Q}(\beta)$ and $Q(\beta)$,

$$\tilde{Q}(\beta) - Q(\beta) = 2\left( \hat{v}_1^{\top} \frac{1}{n} X^{(2)\top}(y^{(2)} - X^{(2)}\hat\beta) - \hat\beta^{\top}(\beta - \hat\beta) \right) - \|\beta - \hat\beta\|_2^2.$$

Combined with the above argument, the upper bound (22) follows from (39) and the following lemma, whose proof can be found in the supplementary material, Section D.

Lemma 3.

Suppose that assumption (A1) holds and $k \le c\, n/\log p$ for some sufficiently small constant $c > 0$. Then, with probability at least $1 - p^{-c_3} - \exp(-c_3 n)$,

$$\left| \hat{v}_1^{\top} \frac{1}{n} X^{(2)\top}(y^{(2)} - X^{(2)}\hat\beta) - \hat\beta^{\top}(\beta - \hat\beta) \right| \le C_3 \|\hat\beta\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right), \qquad (42)$$

$$\left| \hat{v}_2^{\top} \frac{1}{n} W^{(2)\top}(z^{(2)} - W^{(2)}\hat\gamma) - \hat\gamma^{\top}(\gamma - \hat\gamma) \right| \le C_3 \|\hat\gamma\|_2 \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right), \qquad (43)$$

where $c_3$ and $C_3$ are positive constants.
7.2. Proof of (29) and (30) in Theorem 3
We first introduce the notation used in the proofs of the lower bound results. Let $\pi$ denote a prior distribution supported on the parameter space $\Theta$. Let $f_\pi$ denote the density function of the marginal distribution of the observed data $Z$ under the prior $\pi$ on $\Theta$; more specifically, $f_\pi(z) = \int f_\theta(z)\, \pi(\theta)\, d\theta$. We define the $L_1$ distance between two density functions $f_1$ and $f_0$ by

$$\|f_1 - f_0\|_1 = \int |f_1(z) - f_0(z)|\, dz \qquad (44)$$

and the $\chi^2$ distance by $\chi^2(f_1, f_0) = \int f_1^2(z)/f_0(z)\, dz - 1$. It is well known that

$$\|f_1 - f_0\|_1 \le \sqrt{\chi^2(f_1, f_0)}. \qquad (45)$$
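For completeness, (45) is a one-line consequence of the Cauchy–Schwarz inequality, using $\int f_1\, dz = \int f_0\, dz = 1$:

$$\|f_1 - f_0\|_1 = \int \frac{|f_1 - f_0|}{\sqrt{f_0}} \sqrt{f_0}\, dz \le \left( \int \frac{(f_1 - f_0)^2}{f_0}\, dz \right)^{1/2} = \left( \int \frac{f_1^2}{f_0}\, dz - 1 \right)^{1/2}.$$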
The proof of the lower bound is based on the following version of Le Cam's lemma (LeCam (1973); Yu (1997); Ren et al. (2015)).

Lemma 4.

Let $T(\theta)$ denote a functional on the parameter space $\Theta$. Suppose that $\theta_0 \in \Theta$, $\Theta_1 \subset \Theta$, and $|T(\theta) - T(\theta_0)| \ge 2\delta$ for all $\theta \in \Theta_1$. Let $\pi$ denote a prior on the parameter space $\Theta_1$. Then we have

$$\inf_{\hat{T}} \sup_{\theta \in \Theta} \mathbb{P}_\theta\left( |\hat{T} - T(\theta)| \ge \delta \right) \ge \frac{1}{2}\left( 1 - \frac{\|f_\pi - f_{\theta_0}\|_1}{2} \right). \qquad (46)$$
The proofs of (29) and (30) are applications of Lemma 4. The key is to construct the parameter spaces $\{\theta_0\}$ and $\Theta_1$ and the prior $\pi$ on $\Theta_1$ such that (i) $\{\theta_0\}, \Theta_1 \subset \Theta$, (ii) the distance $\|f_\pi - f_{\theta_0}\|_1$ is controlled, and (iii) the distance $|T(\theta) - T(\theta_0)|$ is maximized. In the following, we provide the detailed proof of (29); the proof of (30) is similar to that of (29) and is omitted here. In the discussion of the lower bound results, we assume that the designs $X$ and $W$ follow joint normal distributions with zero means. The lower bound (29) can be decomposed into the following three lower bounds, corresponding to the three components $M\,k\log p/n$, $M/\sqrt{n}$ and $k\log p/n$ of the rate:

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( |\hat{Q} - Q(\beta)| \ge c \frac{k \log p}{n} \right) \ge \frac{1}{4}, \qquad (47)$$

and, for $M \ge c_1 k \log p/\sqrt{n}$,

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( |\hat{Q} - Q(\beta)| \ge c\, M \frac{k \log p}{n} \right) \ge \frac{1}{4}, \qquad (48)$$

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( |\hat{Q} - Q(\beta)| \ge c \frac{M}{\sqrt{n}} \right) \ge \frac{1}{4}, \qquad (49)$$

where $c$ is a positive constant. For $M \ge c_1 k \log p/\sqrt{n}$, combining (48), (49) and (47), we have

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( |\hat{Q} - Q(\beta)| \ge c' M \left(\frac{k \log p}{n} + \frac{1}{\sqrt{n}}\right) \right) \ge \frac{1}{4}. \qquad (50)$$

For $M \le c_1 k \log p/\sqrt{n}$, by (47), we have

$$\inf_{\hat{Q}} \sup_{\bar\Theta(k, M)} \mathbb{P}\left( |\hat{Q} - Q(\beta)| \ge c' \frac{k \log p}{n} \right) \ge \frac{1}{4}. \qquad (51)$$

We can establish (29) by combining (50) and (51). In the following, we establish the lower bounds (48), (49) and (47) separately.
Proof of (48)
Under the Gaussian random design model, $(y_i, X_{i\cdot})$ follows a joint Gaussian distribution with mean zero. Let $\Sigma^y$ denote the covariance matrix of $(y_i, X_{i\cdot}) \in \mathbb{R}^{p+1}$. For the indices of $\Sigma^y$, we use 0 as the index of $y_i$ and $\{1, \ldots, p\}$ as the indices for $X_{i\cdot}$, and decompose $\Sigma^y$ into the blocks $\Sigma^y_{00}$, $\Sigma^y_{\cdot\cdot}$ and $\Sigma^y_{\cdot 0}$, which denote the variance of $y_i$, the covariance matrix of $X_{i\cdot}$, and the covariance of $y_i$ and $X_{i\cdot}$, respectively. Let $\Omega = (\Sigma^y)^{-1}$ denote the precision matrix. There exists a bijective mapping $h$ between $\Sigma^y$ and the regression parameter $\theta = (\beta, \Sigma^X, \sigma_1)$, where

$$h^{-1}(\theta) = \begin{pmatrix} \beta^{\top}\Sigma^X\beta + \sigma_1^2 & \beta^{\top}\Sigma^X \\ \Sigma^X\beta & \Sigma^X \end{pmatrix} \quad \text{and} \quad h(\Sigma^y) = \left( (\Sigma^y_{\cdot\cdot})^{-1}\Sigma^y_{\cdot 0},\; \Sigma^y_{\cdot\cdot},\; \Sigma^y_{00} - \Sigma^y_{0\cdot}(\Sigma^y_{\cdot\cdot})^{-1}\Sigma^y_{\cdot 0} \right). \qquad (52)$$

Based on the bijection, it is sufficient to control the distance between two multivariate Gaussian distributions. We introduce a null parameter space consisting of a single precision matrix $\Omega^{(0)}$, which corresponds through the mapping $h$ to a regression parameter $\theta_0$ with $\|\beta_0\|_2 \asymp M$. We then introduce the alternative parameter space for the precision matrix, $\mathcal{G}_1 = \{\Omega_S : S \subset \{2, \ldots, p\},\, |S| = k\}$, where

$$\Omega_S = \Omega^{(0)} + \rho \sum_{j \in S} \left( e_0 e_j^{\top} + e_j e_0^{\top} \right) \qquad (53)$$

with $e_j$ denoting the $j$-th standard basis vector and

$$\rho = c_* \sqrt{\frac{\log p}{n}}. \qquad (54)$$

Then we construct the corresponding parameter space for $\theta$, induced by the mapping $h$ and the parameter space $\mathcal{G}_1$,

$$\Theta_1 = \left\{ h\!\left(\Omega_S^{-1}\right) : \Omega_S \in \mathcal{G}_1 \right\}. \qquad (55)$$

Similar to equation (7.15) in Cai and Guo (2017b), we can show that for $\theta \in \Theta_1$ the difference between $Q(\beta)$ and $Q(\beta_0)$ is of order $M\, k \log p/n$. By taking $\delta \asymp M\, k \log p/n$, we have $|Q(\beta) - Q(\beta_0)| \ge 2\delta$ for all $\theta \in \Theta_1$. Similar to the arguments between (7.15)–(7.18) in Cai and Guo (2017b), we can show that $\{\theta_0\} \cup \Theta_1 \subset \Theta(k, M)$. Let $\pi$ denote the uniform prior over the parameter space $\Theta_1$, induced by the uniform prior of the perturbed index set $S$ over $\{S \subset \{2, \ldots, p\} : |S| = k\}$. The control of $\|f_\pi - f_{\theta_0}\|_1$ is established in the following lemma, which follows from Lemma 2 of Cai and Guo (2017b) and is established in (7.21) of Cai and Guo (2017b).

Lemma 5.

Suppose that $k \le c \min\{p^{\zeta}, n/\log p\}$, where $0 \le \zeta < 1/2$ and $c$ is a sufficiently small positive constant. For the prior $\pi$ specified above, we establish that $\Theta_1 \subset \Theta(k, M)$ and

$$\|f_\pi - f_{\theta_0}\|_1 \le c_1 \quad \text{for some constant } c_1 < 1. \qquad (56)$$

To apply Lemma 4, we consider the functional $T(\theta) = Q(\beta)$ and calculate the distance

$$\min_{\theta \in \Theta_1} \left| Q(\beta) - Q(\beta_0) \right| \ge c\, M \frac{k \log p}{n}. \qquad (57)$$

Combined with (56), an application of Lemma 4 leads to (48).
Proof of (49)
We construct the following two-point parameter spaces,

$$\Theta_0 = \left\{ \theta^{(0)} = \left(\beta^{(0)}, \mathrm{I}_p, 1\right) \right\}, \qquad \Theta_1 = \left\{ \theta^{(1)} = \left(\beta^{(1)}, \mathrm{I}_p, 1\right) \right\}, \qquad (58)$$

where $\beta^{(0)} = (M - c_0/\sqrt{n})\, e_1$ and $\beta^{(1)} = M e_1$. Since $c_0/\sqrt{n} \le M$, we have $\|\beta^{(0)}\|_0 = \|\beta^{(1)}\|_0 = 1 \le k$ and $\|\beta^{(0)}\|_2, \|\beta^{(1)}\|_2 \le M$, so that $\Theta_0, \Theta_1 \subset \Theta(k, M)$.

The proof of the following lemma can be found in the supplementary material, Section D.

Lemma 6.

If $n \|\beta^{(1)} - \beta^{(0)}\|_2^2 \le c$, then we have

$$\|f_{\theta^{(1)}} - f_{\theta^{(0)}}\|_1 \le c_1 \quad \text{for some constant } c_1 < 1. \qquad (59)$$

To apply Lemma 4, we take $T(\theta) = Q(\beta)$ and calculate the distance $|Q(\beta^{(1)}) - Q(\beta^{(0)})| = M^2 - (M - c_0/\sqrt{n})^2 \ge c_0 M/\sqrt{n}$. Applying Lemma 4 establishes (49).
Proof of (47)
We introduce the following null and alternative parameter spaces,

$$\Theta_0 = \left\{ (0, \mathrm{I}_p, 1) \right\}, \qquad \Theta_1 = \left\{ (\beta, \mathrm{I}_p, 1) : \beta = \rho\, \mathbf{1}_S,\; S \subset \{1, \ldots, p\},\; |S| = k \right\}, \qquad (60)$$

where $\mathbf{1}_S$ denotes the indicator vector of the set $S$ and

$$\rho = c_* \sqrt{\frac{\log p}{n}}. \qquad (61)$$

Let $\pi$ denote the prior over the parameter space $\Theta_1$ induced by the uniform prior of $S$ over $\{S \subset \{1, \ldots, p\} : |S| = k\}$. The control of $\|f_\pi - f_{\theta_0}\|_1$ is established in the following lemma, which follows from Lemma 7 of Cai and Guo (2017c) and is established in (1.6) of Cai and Guo (2017c).

Lemma 7.

Suppose that $k \le c \min\{p^{\zeta}, n/\log p\}$, where $0 \le \zeta < 1/2$ and $c$ is a sufficiently small positive constant. For the prior $\pi$ specified above, we establish $\|f_\pi - f_{\theta_0}\|_1 \le c_1$ for some constant $c_1 < 1$.

By specifying $M \ge \sqrt{k \log p/n}$, the spaces $\Theta_0$ and $\Theta_1$ defined in (60) are proper subspaces of the parameter space $\Theta(k, M)$. To apply Lemma 4, we calculate the distance $\min_{\theta \in \Theta_1} |Q(\beta) - Q(0)| = k \rho^2 \asymp k \log p/n$. Applying Lemma 4, we establish (47).
Acknowledgement
We would like to thank Alexandre Tsybakov for helpful discussions on Section 3.2, and the reviewer and Associate Editor for helpful comments.
Footnotes
Supplementary Material
Supplement to “Optimal Estimation of Genetic Relatedness in High-dimensional Linear Regressions”. (.pdf file)
References
- Belloni, A., Chernozhukov, V., and Wang, L. (2011), "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, 98(4), 791–806.
- Bloom, J. S., Ehrenreich, I. M., Loo, W. T., Lite, T.-L. V., and Kruglyak, L. (2013), "Finding the sources of missing heritability in a yeast cross," Nature, 494(7436), 234–237.
- Bonnet, A., Gassiat, E., and Lévy-Leduc, C. (2015), "Heritability estimation in high dimensional sparse linear mixed models," Electronic Journal of Statistics, 9(2), 2099–2129.
- Bulik-Sullivan, B., Finucane, H. K., Anttila, V., Gusev, A., Day, F. R., Loh, P.-R., Duncan, L., Perry, J. R., Patterson, N., Robinson, E. B., et al. (2015), "An atlas of genetic correlations across human diseases and traits," Nature Genetics.
- Cai, T. T., and Guo, Z. (2017a), "Accuracy assessment for high-dimensional linear regression," The Annals of Statistics, to appear.
- Cai, T. T., and Guo, Z. (2017b), "Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity," The Annals of Statistics, 45(2), 615–646.
- Cai, T. T., and Guo, Z. (2017c), "Supplement to 'Confidence intervals for high-dimensional linear regression: minimax rates and adaptivity'," The Annals of Statistics, 45(2).
- Cai, T. T., and Low, M. G. (2005), "Nonquadratic estimators of a quadratic functional," The Annals of Statistics, 33(6), 2930–2956.
- Cai, T. T., and Low, M. G. (2006), "Optimal adaptive estimation of a quadratic functional," The Annals of Statistics, 34(5), 2298–2325.
- Collier, O., Comminges, L., and Tsybakov, A. B. (2015), "Minimax estimation of linear and quadratic functionals on sparsity classes," The Annals of Statistics, to appear.
- Donoho, D. L., and Nussbaum, M. (1990), "Minimax quadratic estimation of a quadratic functional," Journal of Complexity, 6(3), 290–323.
- Efromovich, S., and Low, M. (1996), "On optimal adaptive estimation of a quadratic functional," The Annals of Statistics, 24(3), 1106–1125.
- Fan, J., Han, X., and Gu, W. (2012), "Estimating false discovery proportion under arbitrary covariance dependence," Journal of the American Statistical Association, 107(499), 1019–1035.
- Golan, D., and Rosset, S. (2011), "Accurate estimation of heritability in genome wide studies using random effects models," Bioinformatics, 27, i317–i323.
- Guo, Z., Kang, H., Cai, T. T., and Small, D. S. (2016), "Confidence intervals for causal effects with invalid instruments using two-stage hard thresholding with voting," arXiv preprint arXiv:1603.05224.
- Janson, L., Barber, R. F., and Candes, E. (2016), "EigenPrism: inference for high dimensional signal-to-noise ratios," Journal of the Royal Statistical Society: Series B (Statistical Methodology).
- Javanmard, A., and Montanari, A. (2014), "Confidence intervals and hypothesis testing for high-dimensional regression," The Journal of Machine Learning Research, 15(1), 2869–2909.
- Laurent, B., and Massart, P. (2000), "Adaptive estimation of a quadratic functional by model selection," The Annals of Statistics, 28(5), 1302–1338.
- LeCam, L. (1973), "Convergence of estimates under dimensionality restrictions," The Annals of Statistics, 1(1), 38–53.
- Lee, S. H., Ripke, S., Neale, B. M., Faraone, S. V., Purcell, S. M., Perlis, R. H., Mowry, B. J., Thapar, A., Goddard, M. E., and Witte, J. S. (2013), "Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs," Nature Genetics, 45.
- Lee, S. H., Yang, J., Goddard, M. E., Visscher, P. M., and Wray, N. R. (2012), "Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood," Bioinformatics, 28(19), 2540–2542.
- Lee, S., and van der Werf, J. (2016), "MTG2: an efficient algorithm for multivariate linear mixed model analysis based on genomic information," Bioinformatics, 32(9), 1420–1422.
- Maier, R., Moser, G., Chen, G.-B., Ripke, S., Coryell, W., Potash, J. B., Scheftner, W. A., Shi, J., Weissman, M. M., Hultman, C. M., et al. (2015), "Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder," The American Journal of Human Genetics, 96(2), 283–294.
- Manolio, T. (2010), "Genomewide association studies and assessment of the risk of disease," New England Journal of Medicine, 363, 166–176.
- McCarthy, M. I., Abecasis, G. R., Cardon, L. R., Goldstein, D. B., Little, J., Ioannidis, J. P., and Hirschhorn, J. N. (2008), "Genome-wide association studies for complex traits: consensus, uncertainty and challenges," Nature Reviews Genetics, 9(5), 356–369.
- Purcell, S. M., Wray, N. R., Stone, J. L., Visscher, P. M., O'Donovan, M. C., Sullivan, P. F., Sklar, P., Ruderfer, D. M., McQuillin, A., Morris, D. W., et al. (2009), "Common polygenic variation contributes to risk of schizophrenia and bipolar disorder," Nature, 460(7256), 748–752.
- Ren, Z., Sun, T., Zhang, C.-H., and Zhou, H. H. (2015), "Asymptotic normality and optimalities in estimation of large Gaussian graphical models," The Annals of Statistics, 43(3), 991–1026.
- Sun, T., and Zhang, C.-H. (2012), "Scaled sparse linear regression," Biometrika, 99(4), 879–898.
- Tibshirani, R. (1996), "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
- van de Geer, S., Bühlmann, P., Ritov, Y., and Dezeure, R. (2014), "On asymptotically optimal confidence regions and tests for high-dimensional models," The Annals of Statistics, 42(3), 1166–1202.
- Vershynin, R. (2012), "Introduction to the non-asymptotic analysis of random matrices," in Compressed Sensing: Theory and Applications, eds. Y. Eldar and G. Kutyniok, Cambridge University Press, pp. 210–268.
- Verzelen, N., and Gassiat, E. (2016), "Adaptive estimation of high-dimensional signal-to-noise ratios," arXiv preprint arXiv:1602.08006.
- Wray, N. R., Goddard, M. E., and Visscher, P. M. (2007), "Prediction of individual genetic risk to disease from genome-wide association studies," Genome Research, 17(10), 1520–1528.
- Wu, T., Chen, Y., Hastie, T., Sobel, E., and Lange, K. (2009), "Genome-wide association analysis by lasso penalized logistic regression," Bioinformatics, 25, 714–721.
- Yang, L., Neale, B. M., Liu, L., Lee, S. H., Wray, N. R., Ji, N., Li, H., Qian, Q., Wang, D., Li, J., et al. (2013), "Polygenic transmission and complex neuro developmental network for attention deficit hyperactivity disorder: Genome-wide association study of both common and rare variants," American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 162(5), 419–430.
- Ye, F., and Zhang, C.-H. (2010), "Rate minimaxity of the Lasso and Dantzig selector for the ℓq loss in ℓr balls," The Journal of Machine Learning Research, 11, 3519–3540.
- Yu, B. (1997), "Assouad, Fano, and Le Cam," in Festschrift for Lucien Le Cam, Springer, pp. 423–435.
- Zhang, C.-H., and Zhang, S. S. (2014), "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 217–242.
- Zhernakova, A., van Diemen, C., and Wijmenga, C. (2009), "Detecting shared pathogenesis from the shared genetics of immune-related diseases," Nature Reviews Genetics, 10, 43–45.