Testing Differential Networks with Applications to Detecting Gene-by-Gene Interactions

Yin Xia; Tianxi Cai; T Tony Cai

doi:10.1093/biomet/asu074

Author manuscript; available in PMC: 2017 May 11.

Published in final edited form as: Biometrika. 2015 Mar 2;102(2):247–266. doi: 10.1093/biomet/asu074

PMCID: PMC5426514 NIHMSID: NIHMS848446 PMID: 28502988

Summary

Model organisms and human studies have led to increasing empirical evidence that interactions among genes contribute broadly to genetic variation of complex traits. In the presence of gene-by-gene interactions, the dimensionality of the feature space becomes extremely high relative to the sample size. This imposes a significant methodological challenge in identifying gene-by-gene interactions. In the present paper, through a Gaussian graphical model framework, we translate the problem of identifying gene-by-gene interactions associated with a binary trait D into an inference problem on the difference of two high-dimensional precision matrices, which summarize the conditional dependence network structures of the genes. We propose a procedure for testing the differential network globally that is particularly powerful against sparse alternatives. In addition, a multiple testing procedure with false discovery rate control is developed to infer the specific structure of the differential network. Theoretical justification is provided to ensure the validity of the proposed tests and optimality results are derived under sparsity assumptions. A simulation study demonstrates that the proposed tests maintain the desired error rates under the null and have good power under the alternative. The methods are applied to a breast cancer gene expression study.

Keywords: Differential network, false discovery rate, Gaussian graphical model, gene-by-gene interaction, high-dimensional precision matrix, large scale multiple testing

1. Introduction

High-throughput technologies, enabling comprehensive monitoring of a biological system, have fundamentally transformed biomedical research. Studies using such technologies have led to successful molecular classifications of diseases into clinically relevant subtypes and genetic signatures predictive of disease progression and treatment response (e.g., van’t Veer et al., 2002; Gregg et al., 2008; Hu et al., 2009). Irrespective of the technology used, analysis of high-throughput data typically considers one marker at a time and yields a list of differentially expressed genes or proteins. On the other hand, epistasis, or interactions between genes, has long been recognized as crucial to understanding the genetic architecture of disease phenotypes (Phillips, 2008; Eichler et al., 2010). Increasing empirical evidence from model organisms and human studies suggests that gene-by-gene interactions may make an important contribution to total genetic variation of complex traits (Zerba et al., 2000; Marchini et al., 2005). In this paper, we are specifically interested in gene-by-gene interactions with respect to the interactive effects of two genes on a binary disease trait D.

In the presence of gene-by-gene interactions, the dimensionality of the feature space becomes extremely high relative to the sample size. This, together with the variability of the data, imposes a significant methodological challenge in identifying gene-by-gene interactions using currently available studies, which typically have limited sample sizes and power. Recent development in interaction modeling has led to several useful methods including multi-factor dimensionality reduction (Ritchie et al., 2001; Moore, 2004), polymorphism interaction analysis (Mechanic et al., 2008), random forests (Breiman, 2001), various variations of logistic regression with interactive effects (Chatterjee et al., 2006; Chapman & Clayton, 2007; Kooperberg & Ruczinski, 2005; Kooperberg & LeBlanc, 2008) and sure independence screening (Fan & Lv, 2008). However, to overcome the high dimensionality, a majority of these methods use multistage procedures and marginal assessments of the effects of a gene pair without simultaneously accounting for the effects of other genes. Multistage procedures may have limited power in detecting genes that affect the outcome through interactions with other genes without strong main effects. The interactive effects detected through models that only consider one pair of genes at a time without conditioning on other genes may also result in false identification of interactions due to the discrepancy between conditional and unconditional effects. Furthermore, none of the existing methods provide false discovery rate control in the presence of interactions. Due to the large number of tests, the power of multiple testing procedures using the standard Bonferroni or naive false discovery rate corrections can dissipate quickly.

In this paper, through a Gaussian graphical model framework, we translate the problem of identifying gene-by-gene interactions associated with a binary trait D into the comparison of two high-dimensional precision matrices. Let G denote a p × 1 vector of genomic markers and assume that, conditional on D = d, G ~ N(μ_d, Σ_d), for d = 1, 2. Then the posterior risk given G is

pr (D = 1 | G) = g {constant - \frac{1}{2} G^{T} (Ω_{1} - Ω_{2}) G + G^{T} (Ω_{1} μ_{1} - Ω_{2} μ_{2})}

where g(x) = e^x/(1 + e^x) and $Ω_{d} = (ω_{i, j, d}) = Σ_{d}^{- 1}$ is the precision matrix for G conditional on D = d. Hence, an interaction between the gene pair (i, j) affects the disease risk if and only if δ_i,j = ω_i,j,1 − ω_i,j,2 ≠ 0. The difference between the two precision matrices, denoted by Δ = (δ_i,j) = Ω₁ − Ω₂, is called the differential network. This type of model for a differential network has been used in Li et al. (2007) and Danaher et al. (2014). We thus propose to test for gene-by-gene interactions both by testing the global hypotheses

H_{0} : Δ = 0 versus H_{1} : Δ \neq 0,

(1)

and by simultaneously testing the hypotheses

H_{0, i, j} : δ_{i, j} = 0 versus H_{1, i, j} : δ_{i, j} \neq 0, 1 \leq i < j \leq p,

while controlling for the overall false discovery rate at a pre-specified level.
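The logistic form of the posterior risk can be checked numerically against a direct application of Bayes' rule. The sketch below is only an illustration: the function names, the prior probability `prior1`, and the explicit expression for the term written as "constant" (log prior odds, log-determinant ratio and the quadratic terms in the means) are assumptions derived from the Gaussian densities and are not spelled out in the text.

```python
import numpy as np

def posterior_risk(G, mu1, Sigma1, mu2, Sigma2, prior1=0.5):
    """pr(D = 1 | G) through the logistic form with quadratic term G^T (Omega1 - Omega2) G."""
    Omega1, Omega2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
    # 'constant' collects every term that does not depend on G (assumed form).
    const = (np.log(prior1 / (1.0 - prior1))
             + 0.5 * (np.linalg.slogdet(Omega1)[1] - np.linalg.slogdet(Omega2)[1])
             - 0.5 * (mu1 @ Omega1 @ mu1 - mu2 @ Omega2 @ mu2))
    eta = const - 0.5 * G @ (Omega1 - Omega2) @ G + G @ (Omega1 @ mu1 - Omega2 @ mu2)
    return 1.0 / (1.0 + np.exp(-eta))

def gaussian_logpdf(G, mu, Sigma):
    """Log density of N(mu, Sigma) at G."""
    diff = G - mu
    return -0.5 * (len(mu) * np.log(2.0 * np.pi) + np.linalg.slogdet(Sigma)[1]
                   + diff @ np.linalg.solve(Sigma, diff))

rng = np.random.default_rng(0)
p = 4
A1, A2 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
Sigma1, Sigma2 = A1 @ A1.T + p * np.eye(p), A2 @ A2.T + p * np.eye(p)
mu1, mu2, G = rng.normal(size=p), rng.normal(size=p), rng.normal(size=p)

risk = posterior_risk(G, mu1, Sigma1, mu2, Sigma2, prior1=0.3)

# Direct Bayes rule with the same prior, for comparison.
log1 = np.log(0.3) + gaussian_logpdf(G, mu1, Sigma1)
log2 = np.log(0.7) + gaussian_logpdf(G, mu2, Sigma2)
bayes = 1.0 / (1.0 + np.exp(log2 - log1))
```

The two quantities `risk` and `bayes` agree up to rounding, which shows that G enters the risk quadratically only through Ω₁ − Ω₂.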

Few authors have considered testing the equality of two precision matrices in the high-dimensional setting. The global null hypothesis Δ = 0, or equivalently Ω₁ = Ω₂, corresponds to the hypothesis that none of the gene pairs have interactive effects on D. The equality of two precision matrices is equivalent to the equality of two covariance matrices, and the latter has been studied under various alternatives. Under the dense alternative, where Σ₁ and Σ₂ differ in a large number of entries, various sum-of-square type testing procedures have been proposed (Schott, 2007; Srivastava & Yanagihara, 2010; Li & Chen, 2012). Under the sparse alternative with Σ₁ and Σ₂ differing only in a small number of entries, Cai et al. (2013) introduced a particularly powerful test. However, in the gene-by-gene interaction setting, the goal is to identify the structure of the differential network. In such cases, it is often reasonable to assume that Δ is sparse, while Σ₁ – Σ₂ is not. Hence, testing procedures that can leverage information on the sparsity of Δ may improve power. Furthermore, due to the fundamental difference between conditional and unconditional dependences, the various procedures for testing the covariance matrices may not be well adapted to testing specific entries of the precision matrices.

The first goal of this paper is to develop a global test for H₀ : Δ = 0 that is powerful against sparse alternatives. We then develop a multiple testing procedure for simultaneously testing the hypotheses {H_0,i,j : 1 ≤ i < j ≤ p} with false discovery rate control to infer the structure of the differential network. In the high-dimensional setting, there is no sample precision matrix that one can use to approximate Ω_d. We propose to infer Ω_d by relating its elements to the coefficients of a set of regression models for G conditional on D = d. We then construct test statistics based on the covariances between the residuals from the fitted regression models. The testing procedures are easy to implement. A Matlab implementation is available in the Supplementary Material.

2. Global Testing of Differential Networks

2.1. Notation and Definitions

In this section we consider testing the global hypothesis (1). We begin with notation and definitions that will be used in the rest of the paper. Let X_k ∈ ℝ^p and Y_k ∈ ℝ^p denote G given D = 1 and D = 2, respectively, with X_k ~ N(μ₁, Σ₁) for k = 1,…, n₁ and Y_k ~ N(μ₂, Σ₂) for k = 1,…, n₂, where Σ_d = (σ_i,j,d) for d = 1, 2, and {X_k : k = 1,…, n₁} and {Y_k : k = 1,…, n₂} are independent observations from the two populations. Let $X = (X_{1}, \dots, X_{n_{1}})^{T}$ and $Y = (Y_{1}, \dots, Y_{n_{2}})^{T}$ denote the data matrices. Let $Ω_{d} = (ω_{i, j, d}) = Σ_{d}^{- 1}$, for d = 1, 2.

For subscripts, we use the convention that i stands for the i^th entry of a vector and (i,j) for the entry in the i^th row and j^th column of a matrix, k represents the k^th sample and d indexes the binary trait. Let $β_{i, 1} = (β_{1, i, 1}, \dots, β_{p - 1, i, 1})^{T}$ denote the regression coefficients of X_k,i regressed on the rest of the entries of X_k and let $β_{i, 2} = (β_{1, i, 2}, \dots, β_{p - 1, i, 2})^{T}$ denote the regression coefficients of Y_k,i regressed on the rest of the entries of Y_k.

For any vector μ_d with dimension p × 1, let μ_−i,d denote the (p − 1) × 1 vector obtained by removing the i^th entry from μ_d. For a symmetric matrix A, let λ_max(A) and λ_min(A) denote the largest and smallest eigenvalues of A. For any p × q matrix A, $A_{i, - j}$ denotes the i^th row of A with its j^th entry removed and $A_{- i, j}$ denotes the j^th column of A with its i^th entry removed. The matrix $A_{- i, - j}$ denotes the (p − 1) × (q − 1) matrix obtained by removing the i^th row and j^th column of A. For an n × p data matrix U = (U₁,…, U_n)^T, let $U_{\cdot, - i} = (U_{1, - i}^{T}, \dots, U_{n, - i}^{T})^{T}$ with dimension n × (p − 1), $\bar{U}_{\cdot, - i} = n^{- 1} \sum_{k = 1}^{n} U_{k, - i}$ with dimension 1 × (p − 1), $U_{(i)} = (U_{1, i}, \dots, U_{n, i})^{T}$ with dimension n × 1, ${\bar{U}}_{(i)} = ({\bar{U}}_{i}, \dots, {\bar{U}}_{i})^{T}$ with dimension n × 1, where ${\bar{U}}_{i} = n^{- 1} \sum_{k = 1}^{n} U_{k, i}$, and ${\bar{U}}_{(\cdot, - i)} = ({\bar{U}}_{\cdot, - i}^{T}, \dots, {\bar{U}}_{\cdot, - i}^{T})^{T}$ with dimension n × (p − 1). For tuning parameters λ, let $λ_{n_{d}, i, d}$ represent the i^th tuning parameter for binary trait d, which depends on the sample size n_d.

For a vector β = (β₁,…,β_p)^T ∈ ℝ^p, define the ℓ_q norm by $| β |_{q} = (\sum_{i = 1}^{p} | β_{i} |^{q})^{1 / q}$ for 1 ≤ q ≤ ∞. A vector β is called k-sparse if it has at most k nonzero entries. For a matrix Ω = (ω_i,j)_p×p, the matrix 1-norm is the maximum absolute column sum, $‖ Ω ‖_{L_{1}} = \max_{1 \leq j \leq p} \sum_{i = 1}^{p} | ω_{i, j} |$, the matrix elementwise infinity norm is defined to be $‖ Ω ‖_{\infty} = \max_{1 \leq i, j \leq p} | ω_{i, j} |$ and the elementwise ℓ₁ norm is $‖ Ω ‖_{1} = \sum_{i = 1}^{p} \sum_{j = 1}^{p} | ω_{i, j} |$. For a matrix Ω, we say Ω is k-sparse if each row/column has at most k nonzero entries. For a set ℋ, denote by |ℋ| the cardinality of ℋ. For two sequences of real numbers {a_n} and {b_n}, write a_n = O(b_n) if there exists a constant C such that |a_n| ≤ C|b_n| holds for all n, write a_n = o(b_n) if lim_n→∞ a_n/b_n = 0, and write a_n ≍ b_n if there are positive constants c and C such that c ≤ a_n/b_n ≤ C for all n.

2.2. Testing Procedure

It is well known (e.g., Anderson, 2003, Section 2.5), that in the Gaussian setting the precision matrix can be described in terms of regression models. Specifically, we may write

X_{k, i} = α_{i, 1} + X_{k, - i} β_{i, 1} + ε_{k, i, 1}, (i = 1, \dots, p; k = 1, \dots, n_{1}),

(2)

Y_{k, i} = α_{i, 2} + Y_{k, - i} β_{i, 2} + ε_{k, i, 2}, (i = 1, \dots, p; k = 1, \dots, n_{2}),

(3)

where $ε_{k, i, d} \sim N (0, σ_{i, i, d} - Σ_{i, - i, d} Σ_{- i, - i, d}^{- 1} Σ_{- i, i, d})$ (d = 1, 2) are independent of X_k,−i and Y_k,−i respectively, and $α_{i, d} = μ_{i, d} - Σ_{i, - i, d} Σ_{- i, - i, d}^{- 1} μ_{- i, d}$. The regression coefficient vectors β_i,d and the error terms ε_k,i,d satisfy

β_{i, d} = - ω_{i, i, d}^{- 1} Ω_{- i, i, d}, r_{i, j, d} = cov (ε_{k, i, d}, ε_{k, j, d}) = \frac{ω_{i, j, d}}{ω_{i, i, d} ω_{j, j, d}},

where cov(·,·) denotes the population covariance. Since the null hypothesis H₀ : Δ = 0 is equivalent to the hypothesis

H_{0} : max_{1 \leq i \leq j \leq p} | ω_{i, j, 1} - ω_{i, j, 2} | = 0,

a natural approach to test H₀ is to first construct estimators of ω_i,j,d, and then base the test on the maximum standardized differences. We first construct estimators of r_i,j,d.
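The two population identities linking the regression models to the precision matrix can be verified numerically. The following sketch works with a randomly generated covariance matrix, which is an arbitrary choice made for illustration; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)          # an arbitrary positive definite covariance
Omega = np.linalg.inv(Sigma)

# Row i of C maps X to the error eps_i = X_i - X_{-i} beta_i (intercepts play no role).
C = np.eye(p)
for i in range(p):
    rest = [k for k in range(p) if k != i]
    # Population regression coefficients of X_i on X_{-i}.
    beta_i = np.linalg.solve(Sigma[np.ix_(rest, rest)], Sigma[rest, i])
    # Identity beta_i = -Omega_{-i,i} / omega_ii.
    assert np.allclose(beta_i, -Omega[rest, i] / Omega[i, i])
    C[i, rest] = -beta_i

R = C @ Sigma @ C.T                      # covariance matrix of (eps_1, ..., eps_p)
d = np.diag(Omega)
R_from_omega = Omega / np.outer(d, d)    # omega_ij / (omega_ii * omega_jj)
```

Here `R` and `R_from_omega` coincide up to rounding, so the error covariances carry the off-diagonal entries of Ω after rescaling by its diagonal.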

Let ${\hat{β}}_{i, d} = {({\hat{β}}_{1, i, d}, \dots, {\hat{β}}_{p - 1, i, d})}^{T}$ be estimators of β_i,d satisfying

\max_{1 \leq i \leq p} | {\hat{β}}_{i, d} - β_{i, d} |_{1} = o_{p} \{ (\log p)^{- 1} \},

(4)

\max_{1 \leq i \leq p} | {\hat{β}}_{i, d} - β_{i, d} |_{2} = o_{p} \{ (n_{d} \log p)^{- 1 / 4} \} .

(5)

Estimators ${\hat{β}}_{i, d}$ that satisfy (4) and (5) can be obtained easily via methods such as the lasso and Dantzig selector. See Section 2.3 for details. Define the residuals by

{\hat{ε}}_{k, i, 1} = X_{k, i} - {\bar{X}}_{i} - (X_{k, - i} - {\bar{X}}_{\cdot, - i}) {\hat{β}}_{i, 1}, \quad {\hat{ε}}_{k, i, 2} = Y_{k, i} - {\bar{Y}}_{i} - (Y_{k, - i} - {\bar{Y}}_{\cdot, - i}) {\hat{β}}_{i, 2} .

A natural estimator of r_i,j,d is the sample covariance between the residuals,

{\tilde{r}}_{i, j, d} = \frac{1}{n_{d}} \sum_{k = 1}^{n_{d}} {\hat{ε}}_{k, i, d} {\hat{ε}}_{k, j, d} .

(6)

However, when i ≠ j, ${\tilde{r}}_{i, j, d}$ tends to be biased due to the correlation induced by the estimated parameters and it is desirable to construct a bias-corrected estimator. Lemma 2 shows that

{\tilde{r}}_{i, j, d} = {\tilde{R}}_{i, j, d} - {\tilde{r}}_{i, i, d} ({\hat{β}}_{i, j, d} - β_{i, j, d}) - {\tilde{r}}_{j, j, d} ({\hat{β}}_{j - 1, i, d} - β_{j - 1, i, d}) + o_{p} {{(n_{d} \log p)}^{- \frac{1}{2}}}

where ${\tilde{R}}_{i, j, d}$ is the empirical covariance between {ε_k,i,d : k = 1,…, n_d} and {ε_k,j,d : k = 1,…, n_d}. For 1 ≤ i < j ≤ p, $β_{i, j, d} = - ω_{i, j, d} / ω_{j, j, d}$ and $β_{j - 1, i, d} = - ω_{i, j, d} / ω_{i, i, d}$. Thus, we propose a bias-corrected estimator of r_i,j,d as

{\hat{r}}_{i, j, d} = - ({\tilde{r}}_{i, j, d} + {\tilde{r}}_{i, i, d} {\hat{β}}_{i, j, d} + {\tilde{r}}_{j, j, d} {\hat{β}}_{j - 1, i, d}), 1 \leq i < j \leq p .

(7)

The bias of ${\hat{r}}_{i, j, d}$ is of order $\max \{ r_{i, j, d} (\log p / n_{d})^{1 / 2}, (n_{d} \log p)^{- 1 / 2} \}$.

For i = j, note that $r_{i, i, d} = 1 / ω_{i, i, d}$. We show in Lemma 2 that

max_{1 \leq i \leq p} | {\tilde{r}}_{i, i, d} - r_{i, i, d} | = O_{p} {{(\log p / n_{d})}^{1 / 2}},

which implies that ${\hat{r}}_{i, i, d} = {\tilde{r}}_{i, i, d}$ is a nearly unbiased estimator of r_i,i,d. A natural estimator of ω_i,j,d can then be defined by

T_{i, j, d} = \frac{{\hat{r}}_{i, j, d}}{{\hat{r}}_{i, i, d} {\hat{r}}_{j, j, d}}, 1 \leq i \leq j \leq p

(8)

We test H₀ : Δ = 0 based on the estimators $\mathcal{T} = \{ T_{i, j, 1} - T_{i, j, 2} : 1 \leq i \leq j \leq p \}$.

The estimators T_i,j,1 − T_i,j,2 in $\mathcal{T}$ are heteroscedastic and possibly have a wide range of variability. We first standardize T_i,j,1 − T_i,j,2 before combining information from all entries in $\mathcal{T}$. Let $U_{i, j, d} = (1 / n_{d}) \sum_{k = 1}^{n_{d}} \{ ε_{k, i, d} ε_{k, j, d} - E (ε_{k, i, d} ε_{k, j, d}) \}$ and ${\tilde{U}}_{i, j, d} = (r_{i, j, d} - U_{i, j, d}) / (r_{i, i, d} r_{j, j, d})$. It will be shown in Lemma 2 that, uniformly in 1 ≤ i ≤ j ≤ p,

| T_{i, j, d} - {\tilde{U}}_{i, j, d} | = O_{p} {{(\log p / n_{d})}^{\frac{1}{2}}} r_{i, j, d} + o_{p} {{(n_{d} \log p)}^{- \frac{1}{2}}} .

Let $θ_{i, j, d} = var ({\tilde{U}}_{i, j, d})$ . Note that

θ_{i, j, d} = var {ε_{k, i, d} ε_{k, j, d} / (r_{i, i, d} r_{j, j, d})} / n_{d} = (1 + ρ_{i, j, d}^{2}) / (n_{d} r_{i, i, d} r_{j, j, d}),

where $ρ_{i, j, d}^{2} = β_{i, j, d}^{2} r_{i, i, d} / r_{j, j, d}$ . We then estimate θ_i,j,d by

{\hat{θ}}_{i, j, d} = (1 + {\hat{β}}_{i, j, d}^{2} {\hat{r}}_{i, i, d} / {\hat{r}}_{j, j, d}) / (n_{d} {\hat{r}}_{i, i, d} {\hat{r}}_{j, j, d}) .
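The closed form of θ_i,j,d follows from the Gaussian fourth-moment identity (Isserlis' theorem) applied to the mean-zero errors. The short derivation below is a sketch added for the reader and uses only r_i,i,d = 1/ω_i,i,d, r_i,j,d = ω_i,j,d/(ω_i,i,d ω_j,j,d) and β_i,j,d = −ω_i,j,d/ω_j,j,d for i < j.

```latex
\begin{align*}
\operatorname{var}(ε_{k,i,d}\, ε_{k,j,d})
  &= E(ε_{k,i,d}^{2} ε_{k,j,d}^{2}) - r_{i,j,d}^{2}
   = (r_{i,i,d} r_{j,j,d} + 2 r_{i,j,d}^{2}) - r_{i,j,d}^{2}
   = r_{i,i,d} r_{j,j,d} + r_{i,j,d}^{2}, \\
θ_{i,j,d}
  &= \frac{r_{i,i,d} r_{j,j,d} + r_{i,j,d}^{2}}{n_{d}\, r_{i,i,d}^{2} r_{j,j,d}^{2}}
   = \frac{1 + r_{i,j,d}^{2} / (r_{i,i,d} r_{j,j,d})}{n_{d}\, r_{i,i,d} r_{j,j,d}}, \\
\frac{r_{i,j,d}^{2}}{r_{i,i,d} r_{j,j,d}}
  &= \frac{ω_{i,j,d}^{2}}{ω_{i,i,d}\, ω_{j,j,d}}
   = β_{i,j,d}^{2}\, \frac{r_{i,i,d}}{r_{j,j,d}}
   = ρ_{i,j,d}^{2} .
\end{align*}
```

In particular ρ²_i,j,d is the squared partial correlation between the i^th and j^th coordinates, so it lies in [0, 1].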

Define the standardized statistics

W_{i, j} = \frac{T_{i, j, 1} - T_{i, j, 2}}{{({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2})}^{1 / 2}}, 1 \leq i \leq j \leq p .

(9)
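The construction in (6) to (9) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the coefficients are fitted by ordinary least squares as a stand-in for the lasso, which is only sensible when p is much smaller than n; the coefficient layout `B[k, i]` and all function names are hypothetical; and ρ² is set to 1 on the diagonal, its natural value when i = j.

```python
import numpy as np

def ols_coefficients(X):
    """Stand-in for the lasso fits (assumption): B[k, i] is the coefficient of variable k
    when variable i is regressed on all other variables, with B[i, i] = 0."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    B = np.zeros((p, p))
    for i in range(p):
        rest = [k for k in range(p) if k != i]
        B[rest, i] = np.linalg.lstsq(Xc[:, rest], Xc[:, i], rcond=None)[0]
    return B

def precision_statistics(X, B):
    """Return (T, theta_hat): the estimators in (8) and their plug-in variances."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    resid = Xc - Xc @ B                       # column i: residuals of regression i
    r_tilde = resid.T @ resid / n             # (6)
    diag = np.diag(r_tilde).copy()
    # (7): r_hat_ij = -(r_tilde_ij + r_tilde_ii * B[i, j] + r_tilde_jj * B[j, i]), i != j
    r_hat = -(r_tilde + diag[:, None] * B + diag[None, :] * B.T)
    np.fill_diagonal(r_hat, diag)             # r_hat_ii = r_tilde_ii
    T = r_hat / np.outer(diag, diag)          # (8)
    rho2 = B ** 2 * diag[:, None] / diag[None, :]
    np.fill_diagonal(rho2, 1.0)               # assumption: rho_ii^2 = 1
    theta_hat = (1.0 + rho2) / (n * np.outer(diag, diag))
    return T, theta_hat

def standardized_differences(X, Y, BX, BY):
    """W_ij as in (9); the paper uses the entries with i <= j."""
    T1, theta1 = precision_statistics(X, BX)
    T2, theta2 = precision_statistics(Y, BY)
    return (T1 - T2) / np.sqrt(theta1 + theta2)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))
Y = rng.normal(size=(200, 6))
T1, theta1 = precision_statistics(X, ols_coefficients(X))
W = standardized_differences(X, Y, ols_coefficients(X), ols_coefficients(Y))
```

With least-squares coefficients the bias correction is exact and `T1` reduces to the inverse of the sample covariance matrix (with divisor n), which gives a convenient sanity check. With lasso coefficients no such identity holds and the correction in (7) does real work.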

Finally, we propose the following test statistic for testing the global null hypothesis H₀,

M_{n} = max_{1 \leq i \leq j \leq p} W_{i, j}^{2} = max_{1 \leq i \leq j \leq p} \frac{{(T_{i, j, 1} - T_{i, j, 2})}^{2}}{{\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2}} .

(10)

The asymptotic properties of M_n will be studied in detail in Section 3. Intuitively, {W_i,j} are approximately standard normal variables under the null H₀ and they are only weakly dependent under suitable conditions. Thus M_n is the maximum of the squares of p(p + 1)/2 such random variables, so its value should be close to 2 log{p(p + 1)/2} ≈ 4 log p under H₀. We show in Section 3 that, under certain regularity conditions, M_n − 4 log p + log log p converges to a type I extreme value distribution under H₀ : Δ = 0.
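The centring and the constant (8π)^−1/2 in the limiting distribution can be motivated by a heuristic calculation that treats the W_i,j as exactly independent standard normal variables Z_1,…, Z_N with N = p(p + 1)/2. This independence is an idealization used only for the sketch below; the rigorous statement is Theorem 1.

```latex
\begin{align*}
\operatorname{pr}(Z^{2} > y) &\approx (2/π)^{1/2}\, y^{-1/2} e^{-y/2}, \qquad y \to \infty, \\
y_{N}(x) = 2 \log N - \log\log N + x
  &\;\Longrightarrow\; N \operatorname{pr}\{Z^{2} > y_{N}(x)\} \to π^{-1/2} e^{-x/2}, \\
\operatorname{pr}\Bigl(\max_{1 \leq m \leq N} Z_{m}^{2} \leq y_{N}(x)\Bigr)
  &\to \exp\{-π^{-1/2} e^{-x/2}\}, \\
2 \log N - \log\log N
  &= 4 \log p - \log\log p - 3 \log 2 + o(1), \\
\operatorname{pr}\Bigl(\max_{1 \leq m \leq N} Z_{m}^{2} - 4 \log p + \log\log p \leq t\Bigr)
  &\to \exp\{-π^{-1/2} e^{-(t + 3 \log 2)/2}\}
   = \exp\{-(8π)^{-1/2} e^{-t/2}\} .
\end{align*}
```

The factor 8^−1/2 thus comes entirely from the number of entries, p(p + 1)/2, being compared with the centring 4 log p − log log p.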

Based on the limiting null distribution of M_n, which will be developed in Section 3.1, we define the test ψ_α by

Ψ_{α} = I (M_{n} \geq q_{α} + 4 \log p - \log \log p)

(11)

where q_α is the 1 − α quantile of the type I extreme value distribution with the cumulative distribution function $\exp \{ - (8 π)^{- 1 / 2} e^{- t / 2} \}$, i.e.,

q_{α} = - \log (8 π) - 2 \log \log {(1 - α)}^{- 1} .

(12)

The hypothesis H₀ is rejected whenever ψ_α = 1.
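The quantile in (12) and the decision rule in (11) are simple to compute. The following sketch uses only the Python standard library; the function names are hypothetical and `M_n` is assumed to have been computed beforehand as the maximum of the squared W_i,j over i ≤ j.

```python
import math

def gumbel_cdf(t):
    """Limiting null distribution function exp{-(8*pi)^(-1/2) * exp(-t/2)}."""
    return math.exp(-math.exp(-t / 2.0) / math.sqrt(8.0 * math.pi))

def q_alpha(alpha):
    """The 1 - alpha quantile of the limiting distribution, equation (12)."""
    return -math.log(8.0 * math.pi) - 2.0 * math.log(math.log(1.0 / (1.0 - alpha)))

def global_test(M_n, p, alpha=0.05):
    """Return 1 if H0 is rejected and 0 otherwise, equation (11)."""
    threshold = q_alpha(alpha) + 4.0 * math.log(p) - math.log(math.log(p))
    return int(M_n >= threshold)

q05 = q_alpha(0.05)
# Critical value for M_n when p = 100 and alpha = 0.05.
critical_p100 = q05 + 4.0 * math.log(100) - math.log(math.log(100))
```

Inverting the distribution function confirms (12): `gumbel_cdf(q_alpha(alpha))` returns 1 − α, and for α = 0.05 the quantile is roughly 2.72.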

2.3. Data-driven estimation of regression coefficients

The testing procedure requires the estimation of regression coefficients β_i,d, for i = 1,…,p and d = 1, 2. Various estimators have been studied in the literature, including the lasso and Dantzig selector. Here, we use the lasso by solving the optimization problem,

{\hat{β}}_{i, 1} = D_{i, 1}^{- 1 / 2} \underset{u \in ℝ^{p - 1}}{\arg \min} \{ (2 n_{1})^{- 1} | (X_{\cdot, - i} - {\bar{X}}_{(\cdot, - i)}) D_{i, 1}^{- 1 / 2} u - (X_{(i)} - {\bar{X}}_{(i)}) |_{2}^{2} + λ_{n_{1}, i, 1} | u |_{1} \},

(13)

{\hat{β}}_{i, 2} = D_{i, 2}^{- 1 / 2} \underset{v \in ℝ^{p - 1}}{\arg \min} \{ (2 n_{2})^{- 1} | (Y_{\cdot, - i} - {\bar{Y}}_{(\cdot, - i)}) D_{i, 2}^{- 1 / 2} v - (Y_{(i)} - {\bar{Y}}_{(i)}) |_{2}^{2} + λ_{n_{2}, i, 2} | v |_{1} \},

(14)

where $D_{i, d} = diag ({\hat{Σ}}_{- i, - i, d})$ and $λ_{n_{d}, i, d} = κ_{d} ({\hat{σ}}_{i, i, d} \log p / n_{d})^{1 / 2}$, d = 1, 2. Then by Proposition 4.2 of Liu (2013), under Condition (C1) given in Section 3 and a mild condition on the sparsity of β_i,d (i = 1,…, p, d = 1, 2), the convergence rates in (4) and (5) can be guaranteed by using any κ_d > 2. The result is formally stated in Corollary 1. In practice, κ_d = 2 works well for global testing of H₀ : Δ = 0, and for the multiple testing procedure with false discovery rate control, a data-driven algorithm is proposed in Section 5 to select κ_d adaptively.
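A minimal sketch of the scaled lasso fit in (13) is given below. It uses a plain cyclic coordinate descent solver written for illustration, not the authors' Matlab code; the function names, the fixed number of sweeps and the use of sample variances with divisor n for the scaling are assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, y, lam, n_sweeps=500):
    """Minimise (2n)^{-1} |A u - y|_2^2 + lam |u|_1 by cyclic coordinate descent."""
    n, q = A.shape
    u = np.zeros(q)
    resid = y.astype(float).copy()            # invariant: resid = y - A u
    col_ms = (A ** 2).sum(axis=0) / n         # mean square of each column
    for _ in range(n_sweeps):
        for j in range(q):
            if col_ms[j] == 0.0:
                continue
            z = A[:, j] @ resid / n + col_ms[j] * u[j]
            new = soft_threshold(z, lam) / col_ms[j]
            resid += A[:, j] * (u[j] - new)
            u[j] = new
    return u

def nodewise_lasso(X, i, kappa=2.0):
    """beta_hat_i as in (13): scale predictors by D^{-1/2}, solve, then rescale."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    rest = [k for k in range(p) if k != i]
    sd = Xc[:, rest].std(axis=0)              # square roots of the diagonal of D
    lam = kappa * np.sqrt(Xc[:, i].var() * np.log(p) / n)
    u = lasso_cd(Xc[:, rest] / sd, Xc[:, i], lam)
    return u / sd, lam

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 8))
X[:, 0] = 2.0 * X[:, 1] + rng.normal(size=200)   # variable 0 depends on variable 1 only
beta_hat, lam = nodewise_lasso(X, i=0)
```

In this toy example only the first entry of `beta_hat`, the coefficient of variable 1, is expected to be clearly nonzero, shrunk towards zero by the penalty.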

2.4. Discussion

The global test Ψ_α given in (11) is based on estimators of ω_i,j,1 − ω_i,j,2. Here we estimate ω_i,j,d by first estimating $r_{i, j, d} = ω_{i, j, d} / (ω_{i, i, d} ω_{j, j, d})$ through the bias-corrected residual covariance ${\hat{r}}_{i, j, d}$ defined in (7), and then rescaling as in (8).

Liu (2013) considered multiple testing of entries of a single precision matrix Ω = (ω_i,j). In the one-sample case, ω_i,j = 0 is equivalent to r_i,j = ω_i,j/(ω_i,i ω_j,j) = 0 under the null and r_i,j is easier to estimate. The procedure in Liu (2013) is based on the estimation of r_i,j instead of ω_i,j. However, in Section 4 we will also consider multiple testing between two groups, and ω_i,j,1 = ω_i,j,2 is not equivalent to r_i,j,1 = r_i,j,2. Thus, it is necessary to construct testing procedures based directly on estimators of ω_i,j,1 − ω_i,j,2.

Testing the global hypothesis H₀ : Ω₁ = Ω₂ is equivalent to testing H₀ : Σ₁ = Σ₂, which has been well studied (Schott, 2007; Srivastava & Yanagihara, 2010; Li & Chen, 2012; Cai et al., 2013). In particular, Cai et al. (2013) constructed a global test for H₀ : Σ₁ = Σ₂ that is powerful against the alternative where Σ₁ − Σ₂ is sparse. However, in many applications, the goal is to learn the structure of the differential network, and we are interested in both testing the global hypothesis H₀ : Ω₁ = Ω₂ and multiple testing of the entrywise hypotheses $H_{0, i, j} : ω_{i, j, 1} = ω_{i, j, 2}$. In such cases, it is often reasonable to assume that Δ = Ω₁ − Ω₂ is sparse, but Σ₁ − Σ₂ is not. Hence, testing procedures for H₀ : Σ₁ = Σ₂ cannot leverage information on the sparsity of Δ and more importantly do not naturally lead to a multiple testing procedure for simultaneously testing the entrywise hypotheses $H_{0, i, j} : ω_{i, j, 1} = ω_{i, j, 2}$.

3. Theoretical Results for the Global Test

3.1. Asymptotic Null Distribution of M_n

In this section, we analyze the properties of the new test for testing the global null hypothesis H₀ : Δ = 0, including the null distribution of the test statistic M_n, the asymptotic size and power. We are particularly interested in the power of the new test under the alternative with Δ sparse. We further show that the power is minimax rate optimal.

Under assumptions (C1) and (C2), Theorem 1 indicates that under H₀, M_n − 4 log p + log log p converges weakly to a Gumbel random variable with distribution function $\exp \{ - (8 π)^{- 1 / 2} e^{- t / 2} \}$.

(C1) Assume that log p = o(n^1/5), n₁ ≍ n₂, and for some constant C₀ > 0, $C_{0}^{- 1} \leq λ_{min} (Ω_{d}) \leq λ_{max} (Ω_{d}) \leq C_{0}$, for d = 1, 2. There exists some τ > 0 such that $| A_{τ} | = o (p^{1 / 16})$ where $A_{τ} = \{ (i, j) : | ω_{i, j, d} | \geq (\log p)^{- 2 - τ}, 1 \leq i < j \leq p, \text{ for } d = 1 \text{ or } 2 \}$.
(C2) Let D_d be the diagonal of Ω_d and let $(η_{i, j, d}) = R_{d} = D_{d}^{- 1 / 2} Ω_{d} D_{d}^{- 1 / 2}$, for d = 1, 2. Assume that $\max_{1 \leq i < j \leq p} | η_{i, j, d} | \leq η_{d} < 1$ for some constant 0 < η_d < 1.

Condition (C1) on the eigenvalues is a common assumption in the high-dimensional setting and implies that most of the variables are not highly correlated with each other. Condition (C2) is also mild. For example, if $\max_{1 \leq i < j \leq p} | η_{i, j, d} | = 1$, then Ω_d is singular. The following theorem states the asymptotic null distribution for M_n.

Theorem 1

Suppose that (C1), (C2), (4) and (5) hold. Then under H₀, for any t ∈ ℝ,

pr (M_{n} - 4 \log p + \log \log p \leq t) \to \exp \{ - (8 π)^{- 1 / 2} \exp (- t / 2) \}, \text{ as } n_{1}, n_{2}, p \to \infty,

(15)

where M_n is defined in equation (10). Under H₀, the convergence in (15) is uniform for all {X_k : k = 1,…, n₁} and {Y_k : k = 1,…, n₂} satisfying (C1), (C2), (4) and (5).

Equations (4) and (5) are mild conditions on the estimator of β_i,d in order to obtain the limiting distribution in Theorem 1. As discussed in Section 2.3, these conditions can be guaranteed by the lasso estimator, for example.

Corollary 1

Suppose that (C1) and (C2) hold and $\max_{1 \leq i \leq p} | β_{i, d} |_{0} = o \{ n^{1 / 2} / (\log p)^{3 / 2} \}$. Then under H₀, for any κ_d > 2 in (13) and (14), and for any t ∈ ℝ,

pr (M_{n} - 4 \log p + \log \log p \leq t) \to \exp \{ - (8 π)^{- 1 / 2} \exp (- t / 2) \}, \text{ as } n_{1}, n_{2}, p \to \infty,

(16)

where M_n is defined in (10).

3.2. Power Analysis

We now turn to an analysis of the power of the test ψ_α given in (11). We shall define the following class of precision matrices:

U (c) = {(Ω_{1}, Ω_{2}) : max_{1 \leq i \leq j \leq p} \frac{| ω_{i, j, 1} - ω_{i, j, 2} |}{{(θ_{i, j, 1} + θ_{i, j, 2})}^{1 / 2}} \geq c {(\log p)}^{1 / 2}} .

(17)

The next theorem shows that the null parameter set in which Ω₁ = Ω₂ is asymptotically distinguishable from $U (4)$ by the test Ψ_α. That is, H₀ is rejected by the test Ψ_α with overwhelming probability if $(Ω_{1}, Ω_{2}) \in U (4)$.

Theorem 2

Let the test ψ_α be given as in (11). Suppose that (C1), (4) and (5) hold. Then

\inf_{(Ω_{1}, Ω_{2}) \in U (4)} pr (Ψ_{α} = 1) \to 1, n, p \to \infty .

The following result shows that this lower bound is rate-optimal. Let $\mathcal{T}_{α}$ be the set of all α-level tests, i.e., pr(T_α = 1) ≤ α under H₀ for all $T_{α} \in \mathcal{T}_{α}$.

Theorem 3

Suppose that log p = o(n). Let α, β > 0 and α + β < 1. Then there exists a constant c₀ > 0 such that for all sufficiently large n and p,

\inf_{(Ω_{1}, Ω_{2}) \in U (c_{0})} \sup_{T_{α} \in \mathcal{T}_{α}} pr (T_{α} = 1) \leq 1 - β .

Theorem 3 shows that, if c₀ is sufficiently small, then any α level test is unable to reject the null hypothesis correctly uniformly over $(Ω_{1}, Ω_{2}) \in U (c_{0})$ with probability tending to one. So the order (log p)^1/2 in the lower bound of $\max_{1 \leq i \leq j \leq p} \{ | ω_{i, j, 1} - ω_{i, j, 2} | / (θ_{i, j, 1} + θ_{i, j, 2})^{1 / 2} \}$ in (17) cannot be improved.

4. Multiple Testing with False Discovery Rate Control

If the global null hypothesis is rejected, it is often of interest to investigate the structure of the differential network Δ. A natural approach is to carry out simultaneous testing on the elements of Δ. In this section, we introduce a multiple testing procedure with false discovery rate control for testing (p² − p) /2 hypotheses

H_{0, i, j} : δ_{i, j} = 0 versus H_{1, i, j} : δ_{i, j} \neq 0, 1 \leq i < j \leq p .

(18)

The standardized differences of T_i,j,1 and T_i,j,2 are defined by the test statistics $W_{i, j} = (T_{i, j, 1} - T_{i, j, 2}) / ({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2})^{1 / 2}$ as in (9). Let t be the threshold level such that $H_{0, i, j}$ is rejected if |W_i,j| ≥ t. Let ℋ₀ = {(i, j) : δ_i,j = 0, 1 ≤ i < j ≤ p} be the set of true nulls. Denote by $R_{0} (t) = \sum_{(i, j) \in ℋ_{0}} I (| W_{i, j} | \geq t)$ the total number of false positives, and by $R (t) = \sum_{1 \leq i < j \leq p} I (| W_{i, j} | \geq t)$ the total number of rejections. The false discovery proportion and false discovery rate are defined as

FDP (t) = \frac{R_{0} (t)}{R (t) \lor 1}, FDR (t) = E {FDP (t)} .

An ideal choice of t would reject as many true positives as possible while controlling the false discovery rate and false discovery proportion at the pre-specified level α. That is, we select

t_{0} = \inf {0 \leq t \leq 2 {(\log p)}^{1 / 2} : FDP (t) \leq α} .

Since ℋ₀ is unknown, we can estimate $\sum_{(i, j) \in ℋ_{0}} I \{ | W_{i, j} | \geq t \}$ by $2 \{ 1 - Φ (t) \} | ℋ_{0} |$ as in Liu (2013), where Φ(t) is the standard normal cumulative distribution function. Note that $| ℋ_{0} |$ can be estimated by (p² − p)/2 due to the sparsity of Δ. This leads to the following multiple testing procedure.

1. Calculate the test statistics W_i,j.
2. For given 0 ≤ α ≤ 1, calculate
$\hat{t} = \inf \{ 0 \leq t \leq 2 (\log p)^{1 / 2} : \frac{2 \{ 1 - Φ (t) \} (p^{2} - p) / 2}{R (t) \lor 1} \leq α \} .$
If $\hat{t}$ does not exist, set $\hat{t} = 2 (\log p)^{1 / 2}$.
3. For 1 ≤ i < j ≤ p, reject $H_{0, i, j}$ if and only if $| W_{i, j} | \geq \hat{t}$.
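The thresholding steps above can be sketched as follows. The infimum is approximated by scanning a fine grid of candidate thresholds, which is a simplification chosen for this illustration; the function names, the grid size and the toy matrix of statistics are assumptions.

```python
import math
import numpy as np

def normal_tail(t):
    """1 - Phi(t) for the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def fdr_threshold(W, alpha, grid_size=4001):
    """Approximate t_hat on a grid over [0, 2 (log p)^{1/2}]; return (t_hat, reject)."""
    p = W.shape[0]
    iu = np.triu_indices(p, k=1)              # pairs with i < j
    w = np.abs(W[iu])
    q = p * (p - 1) / 2
    t_max = 2.0 * math.sqrt(math.log(p))
    t_hat = t_max                             # fallback when no grid point qualifies
    for t in np.linspace(0.0, t_max, grid_size):
        n_rej = max(int((w >= t).sum()), 1)   # R(t) v 1
        if 2.0 * normal_tail(t) * q / n_rej <= alpha:
            t_hat = float(t)
            break                             # smallest qualifying grid point
    reject = np.zeros((p, p), dtype=bool)
    reject[iu] = w >= t_hat                   # upper triangle only
    return t_hat, reject

# Toy input: five planted large statistics, all other entries exactly zero.
p = 50
W = np.zeros((p, p))
planted = [(0, 1), (2, 7), (3, 30), (10, 11), (20, 45)]
for i, j in planted:
    W[i, j] = W[j, i] = 10.0
t_hat, reject = fdr_threshold(W, alpha=0.05)
```

In this toy example the procedure rejects exactly the five planted pairs. Real statistics under the null would be spread like standard normal draws rather than being exactly zero.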

The following theorem shows that, under regularity conditions, the above procedure controls the false discovery proportion and false discovery rate at the pre-specified level α asymptotically.

Theorem 4

Let

S_{ρ} = {(i, j) : 1 \leq i < j \leq p, \frac{| ω_{i, j, 1} - ω_{i, j, 2} |}{{(θ_{i, j, 1} + θ_{i, j, 2})}^{1 / 2}} \geq {(\log p)}^{1 / 2 + ρ}} .

Suppose for some ρ > 0 and some δ > 0, $| S_{ρ} | \geq [1 / \{ (8 π)^{1 / 2} α \} + δ] (\log \log p)^{1 / 2}$. Suppose that $| A_{τ} \cap ℋ_{0} | = o (p^{ν})$ for any ν > 0, where $A_{τ}$ is given in Condition (C1). Assume that $q_{0} = | ℋ_{0} | \geq c p^{2}$ for some c > 0, and (4) and (5) hold. Let q = (p² − p)/2. Then under (C1) with p ≤ cn^r for some c > 0 and r > 0, we have

\lim_{(n, p) \to \infty} \frac{FDR (\hat{t})}{α q_{0} / q} = 1, \frac{FDP (\hat{t})}{α q_{0} / q} \to 1

in probability, as (n, p) → ∞.

The condition $| S_{ρ} | \geq [1 / \{ (8 π)^{1 / 2} α \} + δ] (\log \log p)^{1 / 2}$ in Theorem 4 is mild, since there are (p² − p)/2 hypotheses in total and this condition only requires a few entries with the standardized difference having magnitude exceeding $\{ (\log p)^{1 / 2 + ρ} / n \}^{1 / 2}$ for some constant ρ > 0. The technical condition $| A_{τ} \cap ℋ_{0} | = o (p^{ν})$ for any ν > 0 is to ensure that most of the regression residuals are not highly correlated with each other under the null hypotheses $H_{0, i, j} : δ_{i, j} = 0$.

The basic idea for the proof of Theorem 4 is similar to that in Liu (2013). However, the setting here is more complicated as ω_i,j,1 and ω_i,j,2 are not necessarily zero under $H_{0, i, j} : δ_{i, j} = 0$. So the coordinates of the regression residuals in (2) and (3) can be correlated with each other. Thus slightly stronger conditions are needed and the proof is more involved.

5. Simulation Study

The proposed testing procedures are easy to implement, and the Matlab code is available in the Supplementary Material. We carry out a simulation study to investigate the numerical performance, including the size and power, of the global test Ψ_α and the false discovery rate controlled multiple testing procedure.

We first introduce the matrix models used in the simulations. Let D = (D_i,j) be a diagonal matrix with D_i,i = Unif(0.5, 2.5) for i = 1,…,p. The following four models under the null, $Ω_{1} = Ω_{2} = Ω^{(m)} = (ω_{i, j}^{(m)}) (m = 1, \dots, 4)$ , are used to study the size of the tests.

Model 1: $Ω^{* (1)} = (ω_{i, j}^{* (1)})$ where $ω_{i, i}^{* (1)} = 1$, $ω_{i, i + 1}^{* (1)} = ω_{i + 1, i}^{* (1)} = 0.6$, $ω_{i, i + 2}^{* (1)} = ω_{i + 2, i}^{* (1)} = 0.3$ and $ω_{i, j}^{* (1)} = 0$ otherwise. $Ω^{(1)} = D^{1 / 2} Ω^{* (1)} D^{1 / 2}$.
Model 2: $Ω^{* (2)} = (ω_{i, j}^{* (2)})$ where $ω_{i, j}^{* (2)} = ω_{j, i}^{* (2)} = 0.5$ for i = 10(k − 1) + 1 and 10(k − 1) + 2 ≤ j ≤ 10(k − 1) + 10, 1 ≤ k ≤ p/10. $ω_{i, j}^{* (2)} = 0$ otherwise. $Ω^{(2)} = D^{1 / 2} \{ (Ω^{* (2)} + δ I) / (1 + δ) \} D^{1 / 2}$ with $δ = | λ_{min} (Ω^{* (2)}) | + 0.05$.
Model 3: $Ω^{* (3)} = (ω_{i, j}^{* (3)})$ where $ω_{i, i}^{* (3)} = 1$, $ω_{i, j}^{* (3)} = 0.8 \times Bernoulli (1, 0.05)$ for i < j and $ω_{j, i}^{* (3)} = ω_{i, j}^{* (3)}$. $Ω^{(3)} = D^{1 / 2} \{ (Ω^{* (3)} + δ I) / (1 + δ) \} D^{1 / 2}$ with $δ = | λ_{min} (Ω^{* (3)}) | + 0.05$.
Model 4: $Σ^{* (4)} = (σ_{i, j}^{* (4)})$ where $σ_{i, i}^{* (4)} = 1$, $σ_{i, j}^{* (4)} = 0.5$ for 2(k − 1) + 1 ≤ i ≠ j ≤ 2k, where k = 1,…, [p/2] and $σ_{i, j}^{* (4)} = 0$ otherwise. $Ω^{(4)} = D^{1 / 2} \{ (Σ^{* (4)} + δ I) / (1 + δ) \}^{- 1} D^{1 / 2}$ with $δ = | λ_{min} (Σ^{* (4)}) | + 0.05$.
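As an illustration of how such a design can be generated, the sketch below builds the precision matrix of Model 1 and draws Gaussian data from it. The function names and the random seed are hypothetical, and the other three models would need their own constructors.

```python
import numpy as np

def model1_precision(p, rng):
    """Omega^(1) = D^{1/2} Omega^{*(1)} D^{1/2} with a banded Omega^{*(1)}."""
    omega_star = (np.eye(p)
                  + 0.6 * (np.eye(p, k=1) + np.eye(p, k=-1))
                  + 0.3 * (np.eye(p, k=2) + np.eye(p, k=-2)))
    d_sqrt = np.sqrt(rng.uniform(0.5, 2.5, size=p))   # D_ii drawn from Unif(0.5, 2.5)
    return d_sqrt[:, None] * omega_star * d_sqrt[None, :]

def sample_gaussian(omega, n, rng):
    """Draw n mean-zero observations with covariance matrix omega^{-1}."""
    sigma = np.linalg.inv(omega)
    sigma = (sigma + sigma.T) / 2.0                   # remove rounding asymmetry
    return rng.multivariate_normal(np.zeros(omega.shape[0]), sigma, size=n)

rng = np.random.default_rng(4)
omega = model1_precision(50, rng)
X = sample_gaussian(omega, 100, rng)                  # n_1 = 100 observations, p = 50
```

Under the null both groups are sampled from the same `omega`, so a second call to `sample_gaussian` gives the data for the other group.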

For global testing of H₀ : Δ = 0, the sample sizes are taken to be n₁ = n₂ = 100, while the dimension p varies over the values 50, 100, 200 and 400. For each model, data are generated from multivariate normal distributions with mean zero and covariance matrices $Σ_{1} = Ω_{1}^{- 1}$ and $Σ_{2} = Ω_{2}^{- 1}$. The nominal significance level for all the tests is set at α₁ = 0.05.

To evaluate the power of the proposed tests, let U = (u_i,j) be a matrix with eight random nonzero entries. The locations of four nonzero entries are selected randomly from the upper triangle of U, each with a magnitude generated randomly and uniformly from the set $[- 2 ω (\log p / n)^{1 / 2}, - ω (\log p / n)^{1 / 2}] \cup [ω (\log p / n)^{1 / 2}, 2 ω (\log p / n)^{1 / 2}]$, where $ω = \max_{1 \leq i \leq p} ω_{i, i}^{(m)}$. The other four nonzero entries in the lower triangle are determined by symmetry. We use the following four pairs of precision matrices $(Ω_{1}^{(m)}, Ω_{2}^{(m)})$ (m = 1,…, 4) to show the power of the test, where $Ω_{1}^{(m)} = Ω^{(m)} + δ I$ and $Ω_{2}^{(m)} = Ω^{(m)} + U + δ I$, with $δ = | \min \{ λ_{min} (Ω^{(m)} + U), λ_{min} (Ω^{(m)}) \} | + 0.05$. The actual sizes and powers in percentage for the four models, reported in Table 1, are estimated from 1000 replications.
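The construction of the sparse alternative can be sketched as follows. The function name, the tridiagonal base matrix and the seed are hypothetical, and the four locations are drawn from the strict upper triangle so that symmetry yields eight nonzero entries, which is an interpretation of the description above.

```python
import numpy as np

def sparse_alternative(omega, n, rng, n_pairs=4):
    """Return (Omega_1, Omega_2, U) for the power study from a base precision matrix."""
    p = omega.shape[0]
    scale = omega.diagonal().max() * np.sqrt(np.log(p) / n)
    iu, ju = np.triu_indices(p, k=1)                  # strict upper triangle
    picks = rng.choice(len(iu), size=n_pairs, replace=False)
    U = np.zeros((p, p))
    for k in picks:
        # Uniform on [-2*scale, -scale] U [scale, 2*scale]: random sign times magnitude.
        value = rng.choice([-1.0, 1.0]) * rng.uniform(scale, 2.0 * scale)
        U[iu[k], ju[k]] = U[ju[k], iu[k]] = value
    lam_min = min(np.linalg.eigvalsh(omega + U).min(), np.linalg.eigvalsh(omega).min())
    delta = abs(lam_min) + 0.05                       # keeps both matrices positive definite
    eye = np.eye(p)
    return omega + delta * eye, omega + U + delta * eye, U

rng = np.random.default_rng(6)
n = 100
base = np.eye(20) + 0.3 * (np.eye(20, k=1) + np.eye(20, k=-1))   # toy base matrix
omega1, omega2, U = sparse_alternative(base, n, rng)
```

By construction `omega2 - omega1` equals `U`, so the differential network has exactly eight nonzero entries whose magnitude is of order (log p/n)^1/2.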

Table 1.

Empirical sizes and powers (%) for global testing with α₁ = 0.05, n₁ = n₂ = 100, and 1000 replications.

p	Model 1	Model 2	Model 3	Model 4
Size (%)
50	3.8	3.9	5.4	4.4
100	3.6	4.4	4.1	3.8
200	3.4	3.6	3.7	3.9
400	3.5	3.7	3.6	3.5
Power (%)
50	100	98.7	95.6	81.6
100	99.7	96.6	95.1	77.8
200	93.1	88.2	93.6	72.1
400	86.3	73.1	77.7	70.7

Table 1 shows that the sizes of the global test $Ψ_{α_{1}}$ are close to the nominal level in all cases. This reflects the fact that the null distribution of the test statistic M_n is well approximated by its asymptotic distribution. The empirical sizes are slightly below the nominal level in some models, due to the correlation among the variables. Similar phenomena have also been observed in Cai et al. (2013) and are theoretically justified by their Proposition 1. Table 1 shows that the proposed test is powerful in all settings, although the two precision matrices differ only in eight entries with the magnitude of the difference of the order (log p/n)^1/2.

In addition, we consider nearer alternatives by generating the nonzero entries randomly and uniformly from the set [−ω(2 log p/n)^1/2, ω(2 log p/n)^1/2]. The power results are summarized in Table 2. Under the nearer alternatives, the magnitude of the standardized difference of Ω₁ − Ω₂ is smaller and as a result the power is lower.

Table 2.

Empirical power (%) for global testing under nearer alternatives.

p	Model 1	Model 2	Model 3	Model 4
	Power under nearer alternative

50	90.3	71.6	58.9	20.6
100	89.4	70.3	60.8	22.8
200	81.9	55.2	54.2	21.7
400	73.5	54.7	57.7	17.5


More extensive simulation results are presented in the Supplementary Material. The proposed test significantly outperforms both that of Cai et al. (2013), which is powerful when Σ₁ − Σ₂ is sparse under the alternative, and that of Li & Chen (2012), which is powerful when Σ₁ − Σ₂ is dense under the alternative.

For simultaneous testing of the individual entries of the differential network Δ with false discovery rate control, we select $λ_{n_{d}, i, d}$ in (13) and (14) adaptively with the principle of making $\sum_{(i, j) \in ℋ_{0}} I (| W_{i, j} | \geq t)$ and $\{ 2 - 2 Φ (t) \} | ℋ_{0} |$ as close as possible. The algorithm is as follows.

For any given i ∈ {1,…,p}, let $λ_{n_{1}, i, 1} = (s / 20) {({\hat{Σ}}_{i, i, 1} \log p / n_{1})}^{1 / 2}$ and $λ_{n_{2}, i, 2} = (s / 20) {({\hat{Σ}}_{i, i, 2} \log p / n_{2})}^{1 / 2}$ for s = 1,…, 40. For each s, calculate ${\hat{β}}_{i, d}^{(s)}$ for i = 1,…,p and d = 1, 2. Based on the estimated regression coefficients, construct the corresponding standardized difference $W_{i, j}^{(s)}$ for each s.
Choose
$\hat{s} = \arg \min \sum_{l = 1}^{10} {\left( \frac{\sum_{1 \leq i \leq j \leq p} I \{ | W_{i, j}^{(s)} | \geq Φ^{- 1} (1 - l [1 - Φ \{ {(\log p)}^{1 / 2} \}] / 10) \}}{l p (p - 1) [1 - Φ \{ {(\log p)}^{1 / 2} \}] / 10} - 1 \right)}^{2} .$

The tuning parameters are chosen to be $λ_{n_{1}, i, 1} = (\hat{s} / 20) {({\hat{Σ}}_{i, i, 1} \log p / n_{1})}^{1 / 2}$ and $λ_{n_{2}, i, 2} = (\hat{s} / 20) {({\hat{Σ}}_{i, i, 2} \log p / n_{2})}^{1 / 2}$.
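The selection rule above can be sketched in code. The sketch assumes that the standardized differences for each candidate s have already been computed (the regressions and the construction of the standardized differences are outside its scope) and are supplied as flat lists over the index pairs in the sum. The names `selection_criterion` and `choose_s` are hypothetical, and p ≥ 2 is assumed.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def selection_criterion(w_values, p):
    """Sum over l = 1..10 of (observed / expected tail count - 1)^2.

    w_values: standardized differences W_ij^(s) for one candidate s.
    """
    tail = 1.0 - _STD_NORMAL.cdf(math.sqrt(math.log(p)))
    total = 0.0
    for l in range(1, 11):
        level = l * tail / 10.0
        threshold = _STD_NORMAL.inv_cdf(1.0 - level)
        observed = sum(1 for w in w_values if abs(w) >= threshold)
        expected = level * p * (p - 1)  # l p (p - 1) [1 - Phi{(log p)^(1/2)}] / 10
        total += (observed / expected - 1.0) ** 2
    return total


def choose_s(w_by_s, p):
    """Return the candidate s whose tail counts best match the normal tail."""
    return min(w_by_s, key=lambda s: selection_criterion(w_by_s[s], p))
```

A call such as `choose_s({s: w_list_s for s in range(1, 41)}, p)` would then return the selected index, from which the tuning parameters follow as in the last display.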

Pairwise comparisons among these four models are considered. The sample sizes are n₁ = n₂ = 100, while the dimension p = 50, 100, and 200. The false discovery rate level is set at α₂ = 0.1, and the empirical false discovery rate and the power of false discovery rate control in percentage, summarized in Table 3, are estimated from 100 replications. We examine the power based on the average powers for 100 replications as follows

\frac{1}{100} \sum_{l = 1}^{100} \frac{\sum_{(i, j) \in ℋ_{1}} I (| W_{i, j, l} | \geq \hat{t})}{| ℋ_{1} |},

where W_i,j,l denotes the standardized difference for the l-th replication and $ℋ_{1}$ denotes the set of nonzero locations. For all six cases, the false discovery rates are close to the nominal level α₂ across all dimensions. For empirical power, the procedure is powerful when the dimension p is low, and retains high power for the comparisons between Model 1 and Models 2 and 4. However, for the comparison between Model 2 and Model 3, the power is low when the dimension is high; this is because all values of $| ω_{i, j, 1} - ω_{i, j, 2} | / {(θ_{i, j, 1} n_{1} + θ_{i, j, 2} n_{2})}^{1 / 2}$ are smaller than 0.25 when p = 200 and D = I. Similarly, most nonzero entries of the standardized difference for Models 2 and 4 are smaller than 0.24, so it is difficult to detect the nonzero locations. Furthermore, under the same scenario, $ω_{i, j} / {(θ_{i, j} n)}^{1 / 2}$ is always smaller than 0.16 for Model 3, and thus detection becomes harder when we compare Model 3 with other models. As a result, the power is low whenever Model 3 is included in the comparison.
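For concreteness, the per-replication quantities behind Table 3 can be computed as in the sketch below. The power follows the displayed formula; the false discovery proportion uses the usual definition (false rejections divided by the number of rejections, or by one if there are none), which is an assumption here since the definition is given in Section 4. The function names and the dictionary layout are illustrative.

```python
def fdp_and_power(W, t_hat, support):
    """False discovery proportion and power for one replication.

    W: dict mapping index pairs (i, j), i < j, to standardized differences.
    t_hat: rejection threshold for |W_ij|.
    support: non-empty set of pairs with a truly nonzero difference (H_1).
    """
    rejected = {pair for pair, w in W.items() if abs(w) >= t_hat}
    fdp = len(rejected - support) / max(len(rejected), 1)
    power = len(rejected & support) / len(support)
    return fdp, power


def average(values):
    """Average over replications, as in the displayed power formula."""
    return sum(values) / len(values)
```

Averaging the two outputs over the 100 replications gives the empirical false discovery rate and the empirical power.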

Table 3.

Empirical false discovery rate and power (%) with α₂ = 0.1, n₁ = n₂ = 100, and 100 replications.

p	Models 1, 2	Models 1, 3	Models 2, 3	Models 1, 4	Models 2, 4	Models 3, 4
	Empirical False Discovery Rate

50	10.5	11.0	12.6	12.2	11.5	10.2
100	9.5	10.0	12.1	11.8	11.4	9.5
200	9.7	10.4	11.2	11.7	11.6	10.3

	Power

50	67.9	65.6	35.7	55.0	30.2	26.1
100	64.2	38.3	19.3	51.4	25.1	18.2
200	61.1	20.6	17.1	46.1	21.7	11.3


6. Real Data Analysis

The high throughput technology and massively parallel measurement of mRNA expression catalyzed a new area of genomic biomarkers. A number of prominent genomic markers have been identified to assist in predicting breast cancer patient survival in clinical practice, and increasingly, pharmacogenomic endpoints are being incorporated into the design of clinical trials (Olopade et al., 2008). Molecular pathways of pathogenesis for breast cancer have also been increasingly discovered and curated (Nathanson et al., 2001). However, the role of gene-by-gene interactions, within and across pathways, in breast cancer survival remains unclear. Here, we apply our procedures to identify gene-by-gene interactions important for breast cancer survival.

For illustration, we consider 32 pathways from the molecular signature database that are related to breast cancer survival. Examples include the MAPK/ERK, WNT, TGF-β, PI3K-AKT-mTOR and ATRBRCA pathways. Existing literature has indicated that a defect in the MAPK pathway may lead to uncontrolled growth, which is a step necessary for the development of all cancers (Santen et al., 2002; Downward, 2003). Mutations or deregulated expression of genes in the WNT pathway can induce cancer (Klaus & Birchmeier, 2008). The TGF-β signaling pathway is critical to a plethora of cellular processes including cell proliferation, apoptosis and differentiation (Shi & Massagué, 2003). An increase in the TGF-β2 expression is associated with response to tamoxifen for breast cancer patients (Buck & Knabbe, 2006). The ATRBRCA pathway describes the role of BRCA1, BRCA2 and ATR in cancer susceptibility (Venkitaraman, 2002). BRCA1 and BRCA2 are the best-known genes linked to breast cancer risk. Hence, these pathways may play critical roles in breast cancer progression. To examine the interactions between genes in these pathways, we applied our procedure to a recent breast cancer gene expression study of 295 patients with primary breast carcinomas from the Netherlands Cancer Institute (van de Vijver et al., 2002). Out of the 32 pathways, there are a total of p = 754 genes with available data in this study. The two populations we consider are the short term survivors, defined as those 78 patients who died within 5 years, and the long term survivors, defined as those 69 patients who survived more than 10 years. We are particularly interested in identifying gene pairs with interactive effects on the binary cancer survival trait using the proposed procedures. In this setting, the sparsity assumption about the β_i,k's is reasonable, as it is generally believed that transcriptional regulation of a single gene is defined by a small set of regulatory elements (Segal et al., 2003; Dobra et al., 2004).

Based on our proposed procedures, we identified nine pairs of gene-by-gene interactions as significant at a false discovery rate level of 0.1. An interaction here does not simply indicate a co-expression between a pair of genes, but instead represents a difference between the co-expression patterns among the long term survivors and among the short term survivors. As shown in Figure 1, the majority of the genes involved in these interactions belong to five major pathways, the MAPK, WNT, TGF-β, Apoptosis, and ATRBRCA pathways, although many of these genes belong to multiple pathways. One of the identified pairs represents a gene-by-gene interaction within a pathway and the remaining eight pairs represent cross-talk between these pathways, some of which has been previously documented. A total of five interactions are between the MAPK signaling pathway and the WNT, TGF-β, Apoptosis, ATRBRCA and MTA3 pathways. These cross-talks are not surprising since MAPK modulates a wide range of processes including gene expression, mitosis, proliferation, metabolism and apoptosis (Wada & Penninger, 2004). Several recent studies suggest extensive crosstalk between WNT and MAPK signaling pathways in cancer. For example, hyper-activation of MAPK signaling results in down-regulation of the WNT signal transduction pathway in melanoma, suggesting a negative crosstalk between the two pathways; while in colorectal cancer, stimulating the WNT pathway leads to activation of the MAPK pathway through Ras stabilization, representing a positive crosstalk (Guardavaccaro & Clevers, 2012). The observed interactive effect between the WNT and MAPK pathways suggests that the cross-talk between these two pathways may play an important role in breast cancer survival. The interaction between the tumor suppressor gene BRCA2 and the MAPK pathway has been documented in experiments with prostate cancer cells, with upregulation of BRCA2 linked to an increase in MAPK activity (Moro et al., 2007). In the WNT pathway, the WNT1 gene promotes cell survival in various cell types and it has been experimentally shown that blocking WNT1 signaling can induce apoptotic cell death (You et al., 2004). Thus the interaction between the WNT1 gene and the PRKACB gene in the Apoptosis pathway may also be crucial for breast cancer.

Fig. 1. Identified gene-by-gene interactions for the breast cancer example. The dashed lines between gene pairs represent detected interactions. Genes inside each circle belong to the same pathway, whose name is also shown.

A. Appendix: Proofs

A·1. Technical Lemmas

We prove the main results in this section. We begin by collecting technical lemmas proved in the supplementary material. The first lemma is the classical Bonferroni inequality.

Lemma A1 (Bonferroni inequality)

Let $B = \cup_{t = 1}^{p} B_{t}$ . For any k < [p/2], we have

\sum_{t = 1}^{2 k} {(- 1)}^{t - 1} F_{t} \leq pr (B) \leq \sum_{t = 1}^{2 k - 1} {(- 1)}^{t - 1} F_{t},

where $F_{t} = \sum_{1 \leq i_{1} < \dots < i_{t} \leq p} pr (B_{i_{1}} \cap \dots \cap B_{i_{t}})$ .
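A quick numerical check of Lemma A1 may help fix ideas. The sketch below assumes independent events, purely so that the probability of the union and the sums F_t have closed forms; the lemma itself needs no independence. The probabilities and function names are illustrative.

```python
import math
from itertools import combinations


def bonferroni_sums(probs):
    """F_1, ..., F_p for independent events with pr(B_i) = probs[i]."""
    p = len(probs)
    return [sum(math.prod(c) for c in combinations(probs, t)) for t in range(1, p + 1)]


def truncated(F, m):
    """Alternating partial sum: sum over t = 1..m of (-1)^(t-1) F_t."""
    return sum((-1) ** (t - 1) * F[t - 1] for t in range(1, m + 1))


probs = [0.1, 0.2, 0.05, 0.3, 0.15, 0.25]
F = bonferroni_sums(probs)
union = 1.0 - math.prod(1.0 - q for q in probs)  # pr(B) under independence
k = 2  # any k < [p/2] = 3
lower, upper = truncated(F, 2 * k), truncated(F, 2 * k - 1)
```

Here `lower` and `upper` bracket `union`, with even truncations bounding from below and odd truncations from above.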

For d = 1, 2, let $U_{i, j, d} = n_{d}^{- 1} \sum_{k = 1}^{n_{d}} (ε_{k, i, d} ε_{k, j, d} - E ε_{k, i, d} ε_{k, j, d})$ , and define ${\tilde{U}}_{i, j, d} = (r_{i, j, d} - U_{i, j, d}) / (r_{i, i, d} r_{j, j, d})$ for 1 ≤ i < j ≤ p and ${\tilde{U}}_{i, i, d} = (r_{i, i, d} + U_{i, i, d}) / (r_{i, i, d} r_{i, i, d})$ .

Lemma A2

Suppose that Conditions (C1), (4) and (5) hold. Then

\max_{1 \leq i \leq p} | {\tilde{r}}_{i, i, d} - r_{i, i, d} | = O_{p} \{ {(\log p / n_{d})}^{1 / 2} \},

and

{\tilde{r}}_{i, j, d} = {\tilde{R}}_{i, j, d} - {\tilde{r}}_{i, i, d} ({\hat{β}}_{i, j, d} - β_{i, j, d}) - {\tilde{r}}_{j, j, d} ({\hat{β}}_{j - 1, i, d} - β_{j - 1, i, d}) + o_{p} \{ {(n_{d} \log p)}^{- 1 / 2} \},

for 1 ≤ i < j ≤ p, where ${\tilde{R}}_{i, j, d}$ is the empirical covariance between {ε_k,i,d : k = 1, …, n_d} and {ε_k,j,d : k = 1, …, n_d}. Consequently, uniformly in 1 ≤ i < j ≤ p,

\begin{array}{l} {\hat{r}}_{i, j, d} - (ω_{i, i, d} {\hat{σ}}_{i, i, d, ε} + ω_{j, j, d} {\hat{σ}}_{j, j, d, ε} - 1) r_{i, j, d} = - U_{i, j, d} + o_{p} \{ {(n_{d} \log p)}^{- 1 / 2} \}, \\ | T_{i, j, d} - {\tilde{U}}_{i, j, d} | = O_{p} \{ {(\log p / n_{d})}^{1 / 2} \} r_{i, j, d} + o_{p} \{ {(n_{d} \log p)}^{- 1 / 2} \}, \end{array}

and uniformly in 1 ≤ i ≤ p,

| T_{i, i, d} - {\tilde{U}}_{i, i, d} | = o_{p} \{ {(n_{d} \log p)}^{- 1 / 2} \},

where ${\hat{r}}_{i, j, d}$ is defined in (7), $({\hat{σ}}_{i, j, d, ε}) = (1 / n_{d}) \sum_{k = 1}^{n_{d}} (ε_{k, d} - {\bar{ε}}_{d}) {(ε_{k, d} - {\bar{ε}}_{d})}^{T}, ε_{k, d} = (ε_{k, 1, d}, \dots, ε_{k, p, d})$ and ${\bar{ε}}_{d} = n_{d}^{- 1} \sum_{k = 1}^{n_{d}} ε_{k, d}$ .

Lemma A3

Let X_k ~ N(μ₁, Σ₁) for k = 1, …, n₁ and Y_k ~ N(μ₂, Σ₂) for k = 1, …, n₂. Define

{\tilde{Σ}}_{1} = {({\tilde{σ}}_{i, j, 1})}_{p \times p} = \frac{1}{n_{1}} \sum_{k = 1}^{n_{1}} (X_{k} - μ_{1}) {(X_{k} - μ_{1})}^{T}, \quad {\tilde{Σ}}_{2} = {({\tilde{σ}}_{i, j, 2})}_{p \times p} = \frac{1}{n_{2}} \sum_{k = 1}^{n_{2}} (Y_{k} - μ_{2}) {(Y_{k} - μ_{2})}^{T} .

Then, for some constant C > 0, ${\tilde{σ}}_{i, j, 1} - {\tilde{σ}}_{i, j, 2}$ satisfies the large deviation bound

\begin{array}{l} pr [\max_{(i, j) \in S} \frac{{({\tilde{σ}}_{i, j, 1} - {\tilde{σ}}_{i, j, 2} - σ_{i, j, 1} + σ_{i, j, 2})}^{2}}{var \{ (X_{k, i} - μ_{1, i}) (X_{k, j} - μ_{1, j}) \} / n_{1} + var \{ (Y_{k, i} - μ_{2, i}) (Y_{k, j} - μ_{2, j}) \} / n_{2}} \geq x^{2}] \\ \leq C | S | \{ 1 - Φ (x) \} + O (p^{- 1}) \end{array}

uniformly for 0 ≤ x ≤ (8 log p)^1/2 and any subset $S \subseteq \{ (i, j) : 1 \leq i \leq j \leq p \}$.

The following lemma is needed for false discovery rate control in Theorem 4.

Lemma A4

Let $V_{i, j} = (U_{i, j, 2} - U_{i, j, 1}) \{ var (ε_{k, i, 1} ε_{k, j, 1}) / n_{1} + var (ε_{k, i, 2} ε_{k, j, 2}) / n_{2} \}^{- 1 / 2}$. Under the same conditions as in Theorem 4, we have for any ε > 0 that

\begin{array}{l} \sum_{0 \leq t \leq t_{p}} pr [| \frac{\sum_{(i, j) \in ℋ_{0} \setminus A_{τ}} \{ I (| V_{i, j} | \geq t) - pr (| V_{i, j} | \geq t) \}}{2 q_{0} \{ 1 - Φ (t) \}} | \geq ε] = o (1), \\ \int_{0}^{t_{p}} pr [| \frac{\sum_{(i, j) \in ℋ_{0} \setminus A_{τ}} \{ I (| V_{i, j} | \geq t) - pr (| V_{i, j} | \geq t) \}}{2 q_{0} \{ 1 - Φ (t) \}} | \geq ε] d t = o (v_{p}), \end{array}

where t_p = (4 log p − log₂ p – log₃ p)^1/2 and v_p = 1/{log p(log₄ p)²}^1/2.

A·2. Proof of Theorem 1

Without loss of generality, throughout this section, we assume that ω_i,i,d = 1 for d = 1, 2 and i = 1,…, p. Let A = {(i, j) : 1 ≤ i ≤ j ≤ p}. Condition (C1) implies |A_τ| = o(p^1/16). To prove Theorem 1, we first show that the terms in A_τ are negligible. Then we use Lemma 1, together with the Gaussian approximation technique, to show that $pr (\max_{(i, j) \in A \setminus A_{τ}} W_{i, j}^{2} - 4 \log p + \log \log p \leq t) \to \exp \{ - {(8 π)}^{- 1 / 2} \exp (- t / 2) \}$, where W_i,j is defined in equation (9).

Let $V_{i, j} = (U_{i, j, 2} - U_{i, j, 1}) / \{ var (ε_{k, i, 1} ε_{k, j, 1}) / n_{1} + var (ε_{k, i, 2} ε_{k, j, 2}) / n_{2} \}^{1 / 2}$, where, for d = 1, 2, $var (ε_{k, i, d} ε_{k, j, d}) = r_{i, i, d} r_{j, j, d} (1 + ρ_{i, j, d}^{2})$ with $ρ_{i, j, d}^{2} = β_{i, j, d}^{2} r_{i, i, d} / r_{j, j, d}$. The proof of Lemma 2 yields

\max_{1 \leq i \leq p} | {\hat{r}}_{i, i, d} - r_{i, i, d} | = O_{p} \{ {(\log p / n)}^{1 / 2} \},

(A1)

and $\max_{1 \leq i \leq p} | {\hat{r}}_{i, i, d} - {\tilde{R}}_{i, i, d} | = o_{p} \{ {(n_{d} \log p)}^{- 1 / 2} \}$, where n = max{n₁, n₂}. Note that

\max_{1 \leq i \leq j \leq p} | {\hat{β}}_{i, j, d}^{2} {\hat{r}}_{i, i, d} / {\hat{r}}_{j, j, d} - ρ_{i, j, d}^{2} | = o_{p} (1 / \log p),

(A2)

and $\max_{1 \leq i \leq j \leq p} | ω_{i, i, d} {\hat{σ}}_{i, i, d, ε} + ω_{j, j, d} {\hat{σ}}_{j, j, d, ε} - 2 | = O_{p} \{ {(\log p / n)}^{1 / 2} \}$. Also note that for (i, j) ∈ A \ A_τ, we have |ω_i,j,d| = o{(log p)⁻¹}. Then by Lemma 2, it is easy to see that, under Conditions (C1), (4) and (5), $\max_{(i, j) \in A \setminus A_{τ}} | | W_{i, j} | - | V_{i, j} | | = o_{p} \{ {(\log p)}^{- 1 / 2} \}$. For (i, j) ∈ A_τ, as a result of Lemma 2, we have $W_{i, j} = V_{i, j} + b_{i, j} + o_{p} \{ {(\log p)}^{- 1 / 2} \}$, where $b_{i, j} = 2 \{ ω_{i, j} ({\hat{σ}}_{i, i, 1, ε} - {\hat{σ}}_{i, i, 2, ε}) + ω_{i, j} ({\hat{σ}}_{j, j, 1, ε} - {\hat{σ}}_{j, j, 2, ε}) \} / {({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2})}^{1 / 2}$, $({\hat{σ}}_{i, j, d, ε}) = n_{d}^{- 1} \sum_{k = 1}^{n_{d}} (ε_{k, d} - {\bar{ε}}_{d}) {(ε_{k, d} - {\bar{ε}}_{d})}^{T}$, $ε_{k, d} = (ε_{k, 1, d}, \dots, ε_{k, p, d})$ and ${\bar{ε}}_{d} = (1 / n_{d}) \sum_{k = 1}^{n_{d}} ε_{k, d}$. Note that

| b_{i, j} | \leq 2 {(\frac{2 ρ_{i, j}^{2}}{1 + ρ_{i, j}^{2}})}^{1 / 2} [\frac{| {\tilde{σ}}_{i, i, 1, ε} - {\tilde{σ}}_{i, i, 2, ε} |}{\{ var (ε_{k, i, 1}^{2}) / n_{1} + var (ε_{k, i, 2}^{2}) / n_{2} \}^{1 / 2}} + \frac{| {\tilde{σ}}_{j, j, 1, ε} - {\tilde{σ}}_{j, j, 2, ε} |}{\{ var (ε_{k, j, 1}^{2}) / n_{1} + var (ε_{k, j, 2}^{2}) / n_{2} \}^{1 / 2}}] + o \{ {(\log p)}^{- 1 / 2} \},

where ${\tilde{σ}}_{i, i, d, ε} = n_{d}^{- 1} \sum_{k = 1}^{n_{d}} ε_{k, i, d}^{2}$ . Thus, we have

pr (\max_{(i, j) \in A_{τ}} W_{i, j}^{2} \geq 4 \log p - \log \log p + t) \leq Card (A_{τ}) \{ pr (V_{i, j}^{2} \geq \log p / 8) + pr (b_{i, j}^{2} \geq 2 \log p) \} = o (1),

where the last equality is a direct result of Lemma 3. Thus it suffices to prove that

pr (\max_{(i, j) \in A \setminus A_{τ}} V_{i, j}^{2} - 4 \log p + \log \log p \leq t) \to \exp \{ - {(8 π)}^{- 1 / 2} \exp (- t / 2) \} .

We arrange the indices {(i, j) : (i, j) ∈ A \ A_τ} in any ordering and set them as {(i_m, j_m) : m = 1, …, q} with q = Card(A \ A_τ). Let n₁/n₂ ≤ K with K ≥ 1, $θ_{m, d} = var (ε_{i_{m}, d} ε_{j_{m}, d})$ for d = 1, 2, and define $Z_{k, m} = (n_{1} / n_{2}) \{ ε_{k, i_{m}, 2} ε_{k, j_{m}, 2} - E (ε_{k, i_{m}, 2} ε_{k, j_{m}, 2}) \}$ for 1 ≤ k ≤ n₂, $Z_{k, m} = - \{ ε_{k, i_{m}, 1} ε_{k, j_{m}, 1} - E (ε_{k, i_{m}, 1} ε_{k, j_{m}, 1}) \}$ for n₂ + 1 ≤ k ≤ n₁ + n₂, $V_{m} = {(n_{1}^{2} θ_{m, 2} / n_{2} + n_{1} θ_{m, 1})}^{- 1 / 2} \sum_{k = 1}^{n_{1} + n_{2}} Z_{k, m}$ and ${\hat{V}}_{m} = {(n_{1}^{2} θ_{m, 2} / n_{2} + n_{1} θ_{m, 1})}^{- 1 / 2} \sum_{k = 1}^{n_{1} + n_{2}} {\hat{Z}}_{k, m}$, where ${\hat{Z}}_{k, m} = Z_{k, m} I (| Z_{k, m} | \leq τ_{n}) - E \{ Z_{k, m} I (| Z_{k, m} | \leq τ_{n}) \}$ and τ_n = 32K₁ log(p + n). Note that $\max_{(i, j) \in A \setminus A_{τ}} V_{i, j}^{2} = \max_{1 \leq m \leq q} V_{m}^{2}$, and that

\begin{array}{l} \max_{1 \leq m \leq q} n^{- 1 / 2} \sum_{k = 1}^{n_{1} + n_{2}} E [| Z_{k, m} | I \{ | Z_{k, m} | \geq 32 K_{1} \log (p + n) \}] \\ \leq C n^{1 / 2} \max_{1 \leq k \leq n_{1} + n_{2}} \max_{1 \leq m \leq q} E [| Z_{k, m} | I \{ | Z_{k, m} | \geq 32 K_{1} \log (p + n) \}] \\ \leq C n^{1 / 2} {(p + n)}^{- 4} \max_{1 \leq k \leq n_{1} + n_{2}} \max_{1 \leq m \leq q} E [| Z_{k, m} | \exp \{ | Z_{k, m} | / (8 K_{1}) \}] \\ \leq C n^{1 / 2} {(p + n)}^{- 4} . \end{array}

Hence, $pr \{ \max_{1 \leq m \leq q} | V_{m} - {\hat{V}}_{m} | \geq {(\log p)}^{- 1} \} \leq pr (\max_{1 \leq m \leq q} \max_{1 \leq k \leq n_{1} + n_{2}} | Z_{k, m} | \geq τ_{n}) = O (p^{- 1})$. By the fact that $| \max_{1 \leq m \leq q} V_{m}^{2} - \max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} | \leq 2 \max_{1 \leq m \leq q} | {\hat{V}}_{m} | \max_{1 \leq m \leq q} | V_{m} - {\hat{V}}_{m} | + \max_{1 \leq m \leq q} {| V_{m} - {\hat{V}}_{m} |}^{2}$, it suffices to prove that for any t ∈ ℝ, as n, p → ∞,

pr (\max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} - 4 \log p + \log \log p \leq t) \to \exp \{ - {(8 π)}^{- 1 / 2} \exp (- t / 2) \} .

(A3)

By Lemma 1, for any integer l with 0 < l < q/2,

\begin{array}{l} \sum_{d = 1}^{2 l} {(- 1)}^{d - 1} \sum_{1 \leq m_{1} < \dots < m_{d} \leq q} pr (\cap_{j = 1}^{d} F_{m_{j}}) \leq pr (max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} \geq y_{p}) \\ \leq \sum_{d = 1}^{2 l - 1} {(- 1)}^{d - 1} \sum_{1 \leq m_{1} < \dots < m_{d} \leq q} pr (\cap_{j = 1}^{d} F_{m_{j}}), \end{array}

(A4)

where y_p = 4 log p − log log p + t and $F_{m_{j}} = ({\hat{V}}_{m_{j}}^{2} \geq y_{p})$. Let ${\tilde{Z}}_{k, m} = {\hat{Z}}_{k, m} / {(n_{1} θ_{m, 2} / n_{2} + θ_{m, 1})}^{1 / 2}$ for m = 1, …, q and $W_{k} = ({\tilde{Z}}_{k, m_{1}}, \dots, {\tilde{Z}}_{k, m_{d}})$ for 1 ≤ k ≤ n₁ + n₂. Define ${| a |}_{min} = \min_{1 \leq i \leq d} | a_{i} |$ for any vector a ∈ ℝ^d. Then we have

pr (\cap_{j = 1}^{d} F_{m_{j}}) = pr ({| {n_{2}}^{- \frac{1}{2}} \sum_{k = 1}^{n_{1} + n_{2}} W_{k} |}_{min} \geq y_{p}^{\frac{1}{2}}) .

Then it follows from Theorem 1 in Zaïtsev (1987) that

\begin{array}{l} pr ({| {n_{2}}^{- 1 / 2} \sum_{k = 1}^{n_{1} + n_{2}} W_{k} |}_{min} \geq y_{p}^{1 / 2}) \leq pr \{ {| N_{d} |}_{min} \geq y_{p}^{1 / 2} - ε_{n} {(\log p)}^{- 1 / 2} \} \\ + c_{1} d^{5 / 2} \exp \{ - \frac{n^{1 / 2} ε_{n}}{c_{2} d^{3} τ_{n} {(\log p)}^{1 / 2}} \}, \end{array}

(A5)

where c₁ > 0 and c₂ > 0 are constants, ε_n → 0 which will be specified later and $N_{d} = (N_{m_{1}}, \dots N_{m_{d}})$ is a normal random vector with E(N_d) = 0 and $cov (N_{d}) = n_{2} / n_{1} cov (W_{1}) + cov (W_{n_{2} + 1})$ . Recall that d is a fixed integer which does not depend on n, p. Because $\log p = o (n^{1 / 5})$ , we can let ε_n → 0 sufficiently slowly that, for any large M > 0

c_{1} d^{5 / 2} \exp \{ - \frac{n^{1 / 2} ε_{n}}{c_{2} d^{3} τ_{n} {(\log p)}^{1 / 2}} \} = O (p^{- M}) .

(A6)

Combining (A4), (A5) and (A6) we have

pr (\max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} \geq y_{p}) \leq \sum_{d = 1}^{2 l - 1} {(- 1)}^{d - 1} \sum_{1 \leq m_{1} < \dots < m_{d} \leq q} pr \{ {| N_{d} |}_{min} \geq y_{p}^{1 / 2} - ε_{n} {(\log p)}^{- 1 / 2} \} + o (1) .

(A7)

Similarly, using Theorem 1 in Zaïtsev (1987) again, we can get

pr (\max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} \geq y_{p}) \geq \sum_{d = 1}^{2 l} {(- 1)}^{d - 1} \sum_{1 \leq m_{1} < \dots < m_{d} \leq q} pr \{ {| N_{d} |}_{min} \geq y_{p}^{1 / 2} + ε_{n} {(\log p)}^{- 1 / 2} \} - o (1) .

(A8)

We recall the following lemma, which is shown in the supplementary material of Cai et al. (2013).

Lemma A5

For any fixed integer d ≥ 1 and real number t ∈ ℝ,

\sum_{1 \leq m_{1} < \dots < m_{d} \leq q} pr \{ {| N_{d} |}_{min} \geq y_{p}^{1 / 2} \pm ε_{n} {(\log p)}^{- 1 / 2} \} = \frac{1}{d!} \{ {(8 π)}^{- 1 / 2} \exp (- t / 2) \}^{d} \{ 1 + o (1) \} .

(A9)

It then follows from Lemma 5, (A7) and (A8) that

\begin{array}{l} \underset{n, p \to \infty}{\lim \sup} pr (\max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} \geq y_{p}) \leq \sum_{d = 1}^{2 l - 1} {(- 1)}^{d - 1} \frac{1}{d!} \{ {(8 π)}^{- 1 / 2} \exp (- t / 2) \}^{d}, \\ \underset{n, p \to \infty}{\lim \inf} pr (\max_{1 \leq m \leq q} {\hat{V}}_{m}^{2} \geq y_{p}) \geq \sum_{d = 1}^{2 l} {(- 1)}^{d - 1} \frac{1}{d!} \{ {(8 π)}^{- 1 / 2} \exp (- t / 2) \}^{d} \end{array}

for any positive integer l. By letting l → ∞, we obtain (A3) and Theorem 1 is proved.
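The limiting step can be made explicit. The display below is a short worked identity, not part of the original argument's text: it only uses the alternating exponential series, with x introduced here as shorthand.

```latex
% Shorthand (introduced for this note): x = (8\pi)^{-1/2}\exp(-t/2) > 0.
% The two bounds are odd and even truncations of the alternating series
\sum_{d=1}^{\infty} (-1)^{d-1}\frac{x^{d}}{d!} = 1 - e^{-x},
\qquad\text{so}\qquad
\lim_{n,p\to\infty} pr\Big(\max_{1\le m\le q}\hat{V}_{m}^{2} \ge y_{p}\Big)
  = 1 - \exp\{-(8\pi)^{-1/2}\exp(-t/2)\}.
```

Taking complements gives the limit exp{−(8π)^−1/2 exp(−t/2)} for the probability that the maximum falls below y_p.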

A·3. Proof of Theorem 2

Let $M_{n}^{1} = \max_{1 \leq i \leq j \leq p} \{ T_{i, j, 1} - T_{i, j, 2} - (ω_{i, j, 1} - ω_{i, j, 2}) \}^{2} / ({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2})$. It follows from the proof of Theorem 1 that $pr (M_{n}^{1} \leq 4 \log p - 2^{- 1} \log \log p) \to 1$ as n, p → ∞. By (A1), (A2) and the inequalities $\max_{1 \leq i \leq j \leq p} {(ω_{i, j, 1} - ω_{i, j, 2})}^{2} / ({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2}) \leq 2 M_{n}^{1} + 2 M_{n}$ and $\max_{1 \leq i \leq j \leq p} | ω_{i, j, 1} - ω_{i, j, 2} | / {({\hat{θ}}_{i, j, 1} + {\hat{θ}}_{i, j, 2})}^{1 / 2} \geq 4 {(\log p)}^{1 / 2}$, we have pr(M_n ≥ q_α + 4 log p − log log p) → 1 as n, p → ∞.

A·4. Proof of Theorem 3

To prove the lower bound result, we first construct the worst case scenario to test between Ω₁ and Ω₂, and then apply the arguments as shown in Baraud (2002).

Let ℳ denote the set of all subsets of {1,…, p} with cardinality p^r, for r < 1/2. Let $\hat{m}$ be a random subset of {1,…, p}, which is uniformly distributed on ℳ. We construct a class of Ω₁, $N = \{ Ω_{\hat{m}}, \hat{m} \in ℳ \}$, such that ω_i,j = 0 for i ≠ j and $1 / ω_{i, i} - 1 = ρ 1_{i \in \hat{m}}$, for i, j = 1,…, p and ρ = c(log p/n)^1/2, where c > 0 will be specified later. Let Ω₂ = I and Ω₁ be uniformly distributed on $N$. Let μ_ρ be the distribution of Ω₁ − I. Note that μ_ρ is a probability measure on $\{ Δ \in S (p^{r}) : ‖ Δ ‖_{F}^{2} = p^{r} ρ^{2} \}$, where $S (p^{r})$ is the class of matrices with p^r nonzero entries. Let dpr₁({X_n, Y_n}) and dpr₂({X_n, Y_n}) be the likelihood functions with precision matrices Ω₁ and Ω₂ respectively; then we have

L_{μ_{ρ}} = L_{μ_{ρ}} (\{ X_{n}, Y_{n} \}) = E_{μ_{ρ}} \{ \frac{d {pr}_{1} (\{ X_{n}, Y_{n} \})}{d {pr}_{2} (\{ X_{n}, Y_{n} \})} \},

where $E_{μ_{ρ}} (\cdot)$ is the expectation on Ω₁. By the arguments in Baraud (2002), it suffices to show that $E (L_{μ_{ρ}}^{2}) \leq 1 + o (1)$ . It is easy to check that

L_{μ_{ρ}} = E_{\hat{m}} [\prod_{i = 1}^{n} \frac{1}{| Σ_{\hat{m}} |^{1 / 2}} \exp \{ - \frac{1}{2} Z_{i}^{T} (Ω_{\hat{m}} - I) Z_{i} \}],

where $Σ_{\hat{m}} = Ω_{\hat{m}}^{- 1}$ and $Z_{1}, \dots, Z_{n} \overset{i.i.d.}{\sim} N (0, I)$. Thus, we have

\begin{array}{l} E (L_{μ_{ρ}}^{2}) = E {\left( {\binom{p}{k_{p}}}^{- 1} \sum_{m \in ℳ} [\prod_{i = 1}^{n} \frac{1}{| Σ_{m} |^{1 / 2}} \exp \{ - Z_{i}^{T} (Ω_{m} - I) Z_{i} / 2 \}] \right)}^{2} \\ = {\binom{p}{k_{p}}}^{- 2} \sum_{m, m^{'} \in ℳ} E [\prod_{i = 1}^{n} \frac{1}{| Σ_{m} |^{1 / 2}} \frac{1}{| Σ_{m^{'}} |^{1 / 2}} \exp \{ - Z_{i}^{T} (Ω_{m} + Ω_{m^{'}} - 2 I) Z_{i} / 2 \}] . \end{array}

Set Ω_m + Ω_m′ − 2I = (a_i,j). It is easy to show that a_i,j = 0 for i ≠ j, a_j,j = 0 if j ∈ (m ∪ m′)^c, a_j,j = 2{1/(1 + ρ) − 1} if j ∈ m ∩ m′ and a_j,j = 1/(1 + ρ) − 1 if j ∈ (m \ m′) ∪ (m′ \ m). Let t = |m ∩ m′|. Then

\begin{array}{l} E (L_{μ_{ρ}}^{2}) = {\binom{p}{k_{p}}}^{- 1} \sum_{t = 0}^{k_{p}} \binom{k_{p}}{t} \binom{p - k_{p}}{k_{p} - t} \frac{1}{{(1 + ρ)}^{k_{p} n}} {(1 + ρ)}^{(k_{p} - t) n} {(\frac{1 + ρ}{1 - ρ})}^{t n / 2} \\ \leq p^{k_{p}} (p - k_{p})! / p! \sum_{t = 0}^{k_{p}} \binom{k_{p}}{t} {(\frac{k_{p}}{p})}^{t} {(\frac{1}{1 - ρ^{2}})}^{t n / 2} = \{ 1 + \frac{k_{p}}{p {(1 - ρ^{2})}^{n / 2}} \}^{k_{p}} (1 + o (1)), \end{array}

for r < 1/2. Thus, by letting c be sufficiently small, we have

E (L_{μ_{ρ}}^{2}) \leq \exp \{ k_{p} \log (1 + k_{p} p^{c^{2} - 1}) \} (1 + o (1)) \leq \exp (k_{p}^{2} p^{c^{2} - 1}) (1 + o (1)) = 1 + o (1) .
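The role of "c sufficiently small" in the last display can be unpacked as follows. This is a sketch under two assumptions that the proof leaves implicit: that k_p = p^r is the cardinality of the sets in ℳ, and that n is large enough for ρ² ≤ 1/2.

```latex
% Assumptions: k_p = p^r with r < 1/2, and \rho^2 = c^2 \log p / n \le 1/2.
\begin{aligned}
(1-\rho^{2})^{-n/2} &= \exp\{-(n/2)\log(1-\rho^{2})\} \le \exp(n\rho^{2}) = p^{c^{2}}
  && \text{since } -\log(1-x) \le 2x \text{ for } 0 \le x \le 1/2, \\
k_{p}\log(1 + k_{p}p^{c^{2}-1}) &\le k_{p}^{2}p^{c^{2}-1} = p^{2r + c^{2} - 1}
  && \text{since } \log(1+x) \le x, \\
p^{2r + c^{2} - 1} &\to 0 && \text{whenever } c^{2} < 1 - 2r .
\end{aligned}
```

Any c with c² < 1 − 2r therefore suffices, which is possible precisely because r < 1/2.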

A·5. Proof of Theorem 4

We first show that $\hat{t}$ , as defined in Section 4, is obtained in the range (0, 2(log p)^1/2). Then we illustrate that R₀(t), defined in Section 4, is close to 2 {1 − Φ(t)}|ℋ₀| by first showing the terms in A_τ are negligible. We then focus on the set ℋ₀ \ A_τ and prove the result based on Lemma 4.

Under the condition of Theorem 4, we have $\sum_{1 \leq i < j \leq p} I \{ | W_{i, j} | \geq 2 {(\log p)}^{1 / 2} \} \geq [1 / \{ {(8 π)}^{1 / 2} α \} + δ] {(\log_{2} p)}^{1 / 2}$ with probability going to one. Hence we have, with probability going to one,

\frac{(p^{2} - p) / 2}{\max \{ \sum_{1 \leq i < j \leq p} I \{ | W_{i, j} | \geq 2 {(\log p)}^{1 / 2} \}, 1 \}} \leq \frac{p^{2} - p}{2} \{ \frac{1}{{(8 π)}^{1 / 2} α} + δ \}^{- 1} {(\log_{2} p)}^{- 1 / 2} .

Let t_p = (4 log p − log₂ p − log₃ p)^1/2. Because $1 - Φ (t_{p}) \sim \{ {(2 π)}^{1 / 2} t_{p} \}^{- 1} \exp (- t_{p}^{2} / 2)$, we have $pr (1 \leq \hat{t} \leq t_{p}) \to 1$ according to the definition of $\hat{t}$ in the false discovery rate control algorithm in Section 4. Note that, for $0 \leq \hat{t} \leq t_{p}$, we have

\frac{2 \{ 1 - Φ (\hat{t}) \} (p^{2} - p) / 2}{\max \{ \sum_{1 \leq i < j \leq p} I (| W_{i, j} | \geq \hat{t}), 1 \}} = α .

Thus to prove Theorem 4, it suffices to prove that $| \sum_{(i, j) \in ℋ_{0}} \{ I (| W_{i, j} | \geq t) - G (t) \} | / \{ q_{0} G (t) \} \to 0$ in probability, for 0 ≤ t ≤ {4 log p + o(log p)}^1/2, where G(t) = 2{1 − Φ(t)}. Now we consider two cases.

If t = {4 log p + o(log p)}^1/2, the proof of Theorem 1 yields that $pr (\max_{(i, j) \in A_{τ}} W_{i, j}^{2} \geq t^{2}) = o (1)$. Thus, it suffices to prove that $| \sum_{(i, j) \in ℋ_{0}} \{ I (| W_{i, j} | \geq t) - G (t) \} | / \{ q_{0} G (t) \} \to 0$ in probability. For (i, j) ∈ ℋ₀ \ A_τ, we have from the proof of Theorem 1 that $\max_{1 \leq i < j \leq p} | W_{i, j} - V_{i, j} | = o_{p} \{ {(\log p)}^{- 1 / 2} \}$. Thus, it suffices to show that

| \frac{\sum_{(i, j) \in ℋ_{0} \setminus A_{τ}} ε_{i, j} (t)}{q_{0} G (t)} | \to 0

(A10)

in probability, where ε_i,j(t) = I(|V_i,j| ≥ t) − G(t).

If t ≤ (C log p)^1/2 with C < 4, we have

| \frac{\sum_{(i, j) \in A_{τ} \cap ℋ_{0}} \{ I (| W_{i, j} | \geq t) - I (| V_{i, j} | \geq t) \}}{q_{0} G (t)} | \leq \frac{2 | A_{τ} \cap ℋ_{0} |}{O (p^{2 - C / 2})} \to 0

in probability. Thus, it is again enough to show that

| \frac{\sum_{(i, j) \in ℋ_{0} \setminus A_{τ}} ε_{i, j} (t)}{q_{0} G (t)} | \to 0

(A11)

in probability. Define ${\tilde{ℋ}}_{0} = ℋ_{0} \setminus A_{τ}$. Let 0 ≤ t₀ < ⋯ < t_m = t_p such that t_l − t_l−₁ = v_p for l = 1,…, m − 1 and t_m − t_m−₁ ≤ v_p. Thus we have m ~ t_p/v_p. For any t such that t_l−₁ ≤ t ≤ t_l, we have

\frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} I (| V_{i, j} | \geq t_{l})}{q_{0} G (t_{l})} \frac{G (t_{l})}{G (t_{l - 1})} \leq \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} I (| V_{i, j} | \geq t)}{q_{0} G (t)} \leq \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} I (| V_{i, j} | \geq t_{l - 1})}{q_{0} G (t_{l - 1})} \frac{G (t_{l - 1})}{G (t_{l})} .

Thus it suffices to prove

\max_{0 \leq l \leq m} | \sum_{(i, j) \in {\tilde{ℋ}}_{0}} ε_{i, j} (t_{l}) | / \{ q_{0} G (t_{l}) \} \to 0

in probability. Note that

\begin{array}{l} pr \{ \max_{0 \leq l \leq m} | \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} ε_{i, j} (t_{l})}{q_{0} G (t_{l})} | \geq ε \} \leq \sum_{l = 1}^{m} pr \{ | \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} ε_{i, j} (t_{l})}{q_{0} G (t_{l})} | \geq ε \} \\ \leq \frac{1}{v_{p}} \int_{0}^{t_{p}} pr \{ | \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} ε_{i, j} (t)}{q_{0} G (t)} | \geq ε \} d t + \sum_{l = m - 1}^{m} pr \{ | \frac{\sum_{(i, j) \in {\tilde{ℋ}}_{0}} ε_{i, j} (t_{l})}{q_{0} G (t_{l})} | \geq ε \} . \end{array}

Thus by (A5) with d = 1 and Lemma 4, Theorem 4 is proved.

Footnotes

Supplementary Material

Supplementary material available at Biometrika online includes more extensive simulation results comparing the numerical performance of the proposed global test with that of other tests, the proofs of Lemmas 2, 3 and 4, and the Matlab code for numerical implementation.

Contributor Information

Yin Xia, Department of Statistics & Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA.

Tianxi Cai, Department of Biostatistics, Harvard School of Public Health, Harvard University, Boston, Massachusetts 02115, USA.

T. Tony Cai, Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.

References

Anderson TW. An Introduction To Multivariate Statistical Analysis. 3rd New York: Wiley-Intersceince; 2003. [Google Scholar]
Baraud Y. Non-asymptotic minimax rates of testing in signal detection. Bernoulli. 2002;8:577–606. [Google Scholar]
Breiman L. Random forests. Mach Learn. 2001;45:5–32. [Google Scholar]
Buck MB, Knabbe C. TGF-Beta signaling in breast cancer. Ann N Y Acad Sci. 2006;1089:119–126. doi: 10.1196/annals.1386.024. [DOI] [PubMed] [Google Scholar]
Cai T, Liu W, Xia Y. Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. J Am Statist Assoc. 2013;108:265–277. [Google Scholar]
Chapman J, Clayton D. Detecting association using epistatic information. Genet Epidemiol. 2007;31:894–909. doi: 10.1002/gepi.20250. [DOI] [PubMed] [Google Scholar]
Chatterjee N, Kalaylioglu Z, Moslehi R, Peters U, Wacholder S. Powerful multilocus tests of genetic association in the presence of gene-gene and gene-environment interactions. Am J Hum Genet. 2006;79:1002–1016. doi: 10.1086/509704.
Danaher P, Wang P, Witten DM. The joint graphical lasso for inverse covariance estimation across multiple classes. J R Statist Soc B. 2014;76:373–397. doi: 10.1111/rssb.12033.
Dobra A, Hans C, Jones B, Nevins JR, Yao G, West M. Sparse graphical models for exploring gene expression data. J Multivariate Anal. 2004;90:196–212.
Downward J. Targeting RAS signalling pathways in cancer therapy. Nat Rev Cancer. 2003;3:11–22. doi: 10.1038/nrc969.
Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010;11:446–450. doi: 10.1038/nrg2809.
Fan J, Lv J. Sure independence screening for ultra-high dimensional feature space (with discussion). J R Statist Soc B. 2008;70:849–911. doi: 10.1111/j.1467-9868.2008.00674.x.
Gregg JP, Lit L, Baron CA, Hertz-Picciotto I, Walker W, Davis RA, Croen LA, Ozonoff S, Hansen R, Pessah IN, et al. Gene expression changes in children with autism. Genomics. 2008;91:22–29. doi: 10.1016/j.ygeno.2007.09.003.
Guardavaccaro D, Clevers H. Wnt/β-Catenin and MAPK Signaling: Allies and enemies in different battlefields. Sci Signal. 2012;5:pe15. doi: 10.1126/scisignal.2002921.
Hu VW, Sarachana T, Kim KS, Nguyen A, Kulkarni S, Steinberg ME, Luu T, Lai Y, Lee NH. Gene expression profiling differentiates autism case–controls and phenotypic variants of autism spectrum disorders: evidence for circadian rhythm dysfunction in severe autism. Autism Res. 2009;2:78–97. doi: 10.1002/aur.73.
Klaus A, Birchmeier W. Wnt signalling and its impact on development and cancer. Nat Rev Cancer. 2008;8:387–398. doi: 10.1038/nrc2389.
Kooperberg C, Leblanc M. Increasing the power of identifying gene × gene interactions in genomewide association studies. Genet Epidemiol. 2008;32:255–263. doi: 10.1002/gepi.20300.
Kooperberg C, Ruczinski I. Identifying interacting SNPs using Monte Carlo logic regression. Genet Epidemiol. 2005;28:157–170. doi: 10.1002/gepi.20042.
Li J, Chen SX. Two sample tests for high-dimensional covariance matrices. Ann Statist. 2012;40:908–940.
Li KC, Palotie A, Yuan S, Bronnikov D, Chen D, Wei X, Choi OW, Saarela J, Peltonen L. Finding disease candidate genes by liquid association. Genome Biol. 2007;8:R205. doi: 10.1186/gb-2007-8-10-r205.
Liu W. Gaussian graphical model estimation with false discovery rate control. Ann Statist. 2013;41:2948–2978.
Marchini J, Donnelly P, Cardon L. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet. 2005;37:413–417. doi: 10.1038/ng1537.
Mechanic L, Luke B, Goodman J, Chanock S, Harris C. Polymorphism Interaction Analysis (PIA): a method for investigating complex gene-gene interactions. BMC Bioinformatics. 2008;9:146. doi: 10.1186/1471-2105-9-146.
Moore J. Computational analysis of gene-gene interactions using multifactor dimensionality reduction. Expert Rev Mol Diagn. 2004;4:795–803. doi: 10.1586/14737159.4.6.795.
Moro L, Arbini AA, Marra E, Greco M. Constitutive activation of MAPK/ERK inhibits prostate cancer cell proliferation through upregulation of BRCA2. Int J Oncol. 2007;30:217–224. doi: 10.3892/ijo.30.1.217.
Nathanson K, Wooster R, Weber B. Breast cancer genetics: what we know and what we need. Nat Med. 2001;7:552–556. doi: 10.1038/87876.
Olopade O, Grushko T, Nanda R, Huo D. Advances in Breast Cancer: Pathways to Personalized Medicine. Clin Cancer Res. 2008;14:7988. doi: 10.1158/1078-0432.CCR-08-1211.
Phillips PC. Epistasis: the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet. 2008;9:855–867. doi: 10.1038/nrg2452.
Ritchie M, Hahn L, Roodi N, Bailey L, Dupont W, Parl F, Moore J. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet. 2001;69:138–147. doi: 10.1086/321276.
Santen RJ, Song RX, McPherson R, Kumar R, Adam L, Jeng MH, Yue W. The role of mitogen-activated protein (MAP) kinase in breast cancer. J Steroid Biochem Mol Biol. 2002;80:239–256. doi: 10.1016/s0960-0760(01)00189-3.
Schott JR. A test for the equality of covariance matrices when the dimension is large relative to the sample sizes. Comput Stat Data An. 2007;51:6535–6542.
Segal E, Shapira M, Regev A, Pe’er D, Botstein D, Koller D, Friedman N. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003;34:166–176. doi: 10.1038/ng1165.
Shi Y, Massagué J. Mechanisms of TGF-β signaling from cell membrane to the nucleus. Cell. 2003;113:685–700. doi: 10.1016/s0092-8674(03)00432-x.
Srivastava MS, Yanagihara H. Testing the equality of several covariance matrices with fewer observations than the dimension. J Multivariate Anal. 2010;101:1319–1329.
van de Vijver M, He Y, Van’t Veer L, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002;347:1999–2009. doi: 10.1056/NEJMoa021967.
van’t Veer LJ, Dai H, Van De Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, Van Der Kooy K, Marton MJ, Witteveen AT, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530–536. doi: 10.1038/415530a.
Venkitaraman AR. Cancer susceptibility and the functions of BRCA1 and BRCA2. Cell. 2002;108:171–182. doi: 10.1016/s0092-8674(02)00615-3.
Wada T, Penninger JM. Mitogen-activated protein kinases in apoptosis regulation. Oncogene. 2004;23:2838–2849. doi: 10.1038/sj.onc.1207556.
You L, He B, Uematsu K, Xu Z, Mazieres J, Lee A, McCormick F, Jablons DM. Inhibition of Wnt-1 signaling induces apoptosis in β-catenin-deficient mesothelioma cells. Cancer Res. 2004;64:3474–3478. doi: 10.1158/0008-5472.CAN-04-0115.
Zaïtsev AY. On the Gaussian approximation of convolutions under multidimensional analogues of S. N. Bernstein’s inequality conditions. Probab Theory Rel. 1987;74:535–566.
Zerba K, Ferrell R, Sing C. Complex adaptive systems and human health: the influence of common genotypes of the apolipoprotein E (ApoE) gene polymorphism and age on the relational order within a field of lipid metabolism traits. Hum Genet. 2000;107:466–475. doi: 10.1007/s004390000394.

