Author manuscript; available in PMC: 2020 Nov 1.
Published in final edited form as: J Multivar Anal. 2019 Jul 2;174:104524. doi: 10.1016/j.jmva.2019.05.009

Roy’s largest root under rank-one perturbations: the complex valued case and applications

Prathapasinghe Dharmawansa a, Boaz Nadler b,*, Ofer Shwartz b
PMCID: PMC6716615  NIHMSID: NIHMS1533229  PMID: 31474779

Abstract

The largest eigenvalue of a single or a double Wishart matrix, both known as Roy’s largest root, plays an important role in a variety of applications. Recently, via a small noise perturbation approach with fixed dimension and degrees of freedom, Johnstone and Nadler derived simple yet accurate approximations to its distribution in the real valued case, under a rank-one alternative. In this paper, we extend their results to the complex valued case for five common single matrix and double matrix settings. In addition, we study the finite sample distribution of the leading eigenvector. We demonstrate the utility of our results in several signal detection and communication applications, and illustrate their accuracy via simulations.

Keywords: Complex Wishart distribution, Rank-one perturbation, Roy’s largest root, Signal detection in noise

2010 MSC: Primary 60B20, Secondary 62H10, 33C15

1. Introduction

Wishart matrices, both real and complex valued, play a central role in statistics, with numerous engineering applications, specifically signal processing and communications. Of particular interest are the roots of a single Wishart matrix H, and of a double Wishart matrix E⁻¹H, with H and E independent [1]. The latter can be viewed as the multivariate analogue of the univariate F distribution and is also closely related to the multivariate beta distribution [32, Section 3.3]. Here we consider the largest eigenvalue ℓ1 of either the matrix H or the matrix E⁻¹H, a test statistic proposed by Roy [38, 39], known as Roy’s largest root [32, Section 10.6]. Specifically, we focus on the complex-valued case where H, E are independent complex-valued Wishart matrices. Throughout this paper, we consider m × m matrices, where E follows a complex valued central Wishart distribution with nE degrees of freedom and identity covariance matrix ΣE = I, denoted E ~ CWm(nE, I). The distribution of the matrix H will either be central, H ~ CWm(nH, ΣH), or non-central, H ~ CWm(nH, ΣH, Ω). For the definition of central and non-central complex valued Wishart matrices, see for example [15] and [19, Section 8].

Obtaining simple expressions, exact or approximate, for the distribution of this top eigenvalue, denoted by ℓ1, in the single or double matrix case has been a subject of intense research for more than 50 years. Khatri [27] derived an exact expression for the distribution of ℓ1 in the single central matrix case with an identity covariance matrix (ΣH = I). His result was generalized to several other settings, such as an arbitrary covariance matrix or a non-centrality matrix [24, 28, 36, 37, 41]. The resulting expressions are, in general, challenging to evaluate numerically. More recently, Zanella et al. [44] derived simpler exact, yet recursive expressions, both for the central case with arbitrary ΣH and for the non-central case but with ΣH = I. Alternative recursive formulas in the real-valued case and in the complex-valued case were derived by Chiani [4, 6].

A different approach to derive approximate distributions for the largest eigenvalue when ΣE = ΣH = I is based on random matrix theory. Considering the limit as nH and m (and in the double matrix case also nE) tend to infinity, with their ratios converging to constants, ℓ1 in the single matrix case and ln(ℓ1) in the double matrix case asymptotically follow a Tracy-Widom distribution [20–22]. Furthermore, with suitable centering and scaling, the convergence to these limiting distributions is quite fast [10, 31].

In this paper, motivated by statistical signal detection and communication applications, we consider complex valued Wishart matrices H whose population covariance is a rank-one perturbation of a base covariance matrix. Specifically, in the central case we assume ΣH = I + λvv*, where λ is a measure of signal strength, the unit norm vector v ∈ Cm is its direction, and v* denotes the conjugate transpose of v. Similarly, in the non-central case, we assume that H ~ CWm(nH, I, Ω), with a rank-one non-centrality matrix Ω = λvv*. Our goal is to study the distribution of ℓ1 and its dependence on λ, which as discussed below is a central quantity of interest in various applications. A classical result in the single-matrix case is that with dimension m fixed, as nH → ∞, the largest eigenvalue of H converges to a Gaussian distribution [1]. In the random matrix setting, as both nH and m tend to infinity with their ratio tending to a constant, Baik et al. [3] and Paul [34] proved that if λ > √(m/nH) then ℓ1 still converges to a Gaussian distribution, but with a different variance. In the two-matrix case, the location of the phase transition and the limiting value of the largest eigenvalue of E⁻¹H were recently studied by Nadakuditi and Silverstein [33]. Dharmawansa et al. [8] proved that above the phase transition, ℓ1 converges to a Gaussian distribution and provided an explicit expression for its asymptotic variance.

Whereas the above results assume that the dimension and degrees of freedom tend to infinity, in various common applications these quantities are relatively small. In such settings, the above-mentioned asymptotic results may provide a poor approximation to the distribution of the largest eigenvalue ℓ1, which can be quite far from Gaussian; see Fig. 1 (left) for an illustrative example. Accurate expressions for the distribution of ℓ1, for small dimension and degrees of freedom, were recently derived for single and double real-valued Wishart matrices by Johnstone and Nadler [23], via a small noise perturbation approach. In this paper, we build upon their work and extend their results to the complex valued case and to the study of the distribution of the leading sample eigenvector, not considered in their work. As discussed below, both are important quantities in various applications.

Fig. 1.

Density of the largest eigenvalue in Case 1 (left) and Case 2 (right). The parameters for Case 1 are nH = m = 5, λ = 5 and σ2 = 0.01. For Case 2 they are nH = m = 5, ω = 1 and σ2 = 0.01. The red solid line corresponds to Propositions 1 and 2.

Propositions 1-5 in Section 2 provide approximate expressions for the distribution of 1 under the five single-matrix and double-matrix cases outlined in Table 1. In Section 3 we study the finite sample fluctuations of the leading eigenvector and its overlap with the population eigenvector. Next, in Section 4 we illustrate the utility of these approximations in signal detection and communication applications. Specifically, Section 4.1 considers the power of Roy’s largest root test under two common signal models, whereas Section 4.2 considers the outage probability in a specific multiple-input and multiple-output (MIMO) communication system [24]. For a rank-one Rician fading channel, we show analytically that to minimize the outage probability it is preferable to have an equal number of transmitting and receiving antennas. This important design property was previously observed via simulations [24].

Table 1.

Five common single-matrix and double-matrix cases. For each case, the distribution of the covariance matrices of the observed data is listed; in the first two cases only one sample covariance matrix is computed. Several relevant applications are also listed.

Case 1. Distribution: H ~ CWm(nH, Σ + λvv*), with Σ known. Application: signal detection in noise, known noise covariance matrix.
Case 2. Distribution: H ~ CWm(nH, Σ, ωvv*), with Σ known. Application: constant modulus signal detection in noise, known noise covariance matrix.
Case 3. Distribution: H ~ CWm(nH, Σ + λvv*), E ~ CWm(nE, Σ). Application: signal detection in noise, estimated noise covariance matrix.
Case 4. Distribution: H ~ CWm(nH, Σ, ωvv*), E ~ CWm(nE, Σ). Application: constant modulus signal detection in noise, estimated noise covariance matrix.
Case 5. Distribution: H ~ CWp(q, Φ, Ω), E ~ CWp(n − q, Φ), with Ω a rank-one matrix. Application: canonical correlation analysis between two groups of sizes p and q, with p ≤ q.

2. On the Distribution of Roy’s Largest Root

Table 1 outlines five common single matrix and double matrix complex Wishart cases, along with some representative applications. Propositions 1–5 below are the complex analogues of those in [23], and provide simple approximations to the distribution of Roy’s largest root in these cases. As outlined in the appendix, their proof follows those of [23], with some notable differences. In particular, we present complex valued analogues of some well known results for real valued Wishart matrices. In what follows we denote by E the expectation operator. We also denote by χ²_k the chi-squared distribution with k degrees of freedom and by χ²_k(η) the non-central chi-squared distribution with non-centrality parameter η. Throughout the manuscript we follow the standard definition of complex valued multivariate Gaussian random variables, see [15]. Specifically, if X ~ CN(0, σ²) then it can be written as (A + ιB)/√2, where A, B ∈ ℝ are independent N(0, σ²) random variables and ι = √−1.
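As a quick numerical sanity check of this convention (a sketch assuming NumPy is available; not part of the original derivation), one can draw CN(0, σ²) variates as (A + ιB)/√2 and verify that E|X|² = σ²:

```python
import numpy as np

def complex_normal(rng, sigma2, size):
    # CN(0, sigma2) variate: (A + iB)/sqrt(2) with independent A, B ~ N(0, sigma2)
    a = rng.normal(0.0, np.sqrt(sigma2), size)
    b = rng.normal(0.0, np.sqrt(sigma2), size)
    return (a + 1j * b) / np.sqrt(2.0)

rng = np.random.default_rng(0)
x = complex_normal(rng, sigma2=2.0, size=200_000)
# With this normalization E|X|^2 = sigma2, and |X|^2 ~ (sigma2/2) * chi^2_2.
print(np.mean(np.abs(x) ** 2))
```

Under this convention the real and imaginary parts each carry half the total variance, which is what makes |X|² a scaled chi-squared variate with two degrees of freedom.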

We start with the simplest Case 1 in Table 1, involving a single central Wishart matrix, H ~ CWm(nH, Σ + λvv*). In various engineering applications the matrix Σ denotes the covariance of the noise measured at m sensors and is often assumed to be known, whereas λ is a measure of the signal strength and the unit norm vector v denotes its direction. Without loss of generality, we thus assume Σ = σ²I, where σ² denotes the noise variance. In contrast to previous asymptotic approaches, whereby the number of samples nH → ∞ and possibly also the dimension m → ∞, in the following we keep nH and m fixed, and study the distribution of the largest eigenvalue in the limit of small noise, namely as σ → 0. To emphasize that we study the dependence of the largest eigenvalue of H on the parameter σ, we shall denote it by ℓ1(σ).

Proposition 1. Let H ~ CWm(nH, λvv* + σ²I), with ‖v‖ = 1, λ > 0, and let ℓ1(σ) be its largest eigenvalue. Then, with (m, nH, λ) fixed, as σ → 0,

ℓ1(σ) = ((λ + σ²)/2) A + (σ²/2) B + (σ⁴/(2(λ + σ²))) (BC/A) + OP(σ⁴),   (1)

where A, B, C are independent random variables, distributed as A ~ χ²_{2nH}, B ~ χ²_{2m−2}, and C ~ χ²_{2nH−2}.

Remark 1. Given that ℓ1 is the largest eigenvalue of a Wishart matrix, it has finite mean and variance. Approximate formulas for these quantities follow directly from (1). Since E{χ²_k} = k, Var{χ²_k} = 2k, and E{1/χ²_k} = 1/(k − 2) for k > 2, then for nH > 1,

E{ℓ1(σ)} = λnH + (nH + m − 1)σ² + (σ⁴/(λ + σ²))(m − 1) + o(σ⁴),

and similarly,

Var{ℓ1(σ)} = λ²nH + 2λnH σ² + (nH + m − 1)σ⁴ + o(σ⁴).

Remark 2. The exact distribution of the largest eigenvalue ℓ1 in the setting of Proposition 1, with number of samples larger than the dimension, has been recently derived by Chiani [6, Theorem 4, part 3]. The result is given in terms of the determinant of an m × m matrix whose entries depend on the generalized incomplete gamma function, with parameters that depend on λ and on σ. In contrast, while (1) is approximate, the dependence on the values of λ and σ is more explicit.
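The mean formula of Remark 1 is easy to check by direct simulation. The following sketch (assuming NumPy is available; the parameter values mirror Fig. 1, left) draws complex Wishart matrices with the spiked covariance of Proposition 1 and compares the Monte Carlo mean of ℓ1 with the approximation:

```python
import numpy as np

def largest_eig_samples(n_trials, m, n_h, lam, sigma2, seed=0):
    # Draws of the largest eigenvalue of H ~ CW_m(n_H, lam*e1 e1* + sigma2*I)
    rng = np.random.default_rng(seed)
    scale = np.full(m, np.sqrt(sigma2))
    scale[0] = np.sqrt(lam + sigma2)          # rank-one spike along e1
    g = (rng.standard_normal((n_trials, m, n_h))
         + 1j * rng.standard_normal((n_trials, m, n_h))) / np.sqrt(2)
    y = scale[None, :, None] * g              # columns ~ CN(0, Sigma)
    h = y @ y.conj().transpose(0, 2, 1)
    return np.linalg.eigvalsh(h)[:, -1]

m = n_h = 5
lam, sigma2 = 5.0, 0.01
ell1 = largest_eig_samples(20_000, m, n_h, lam, sigma2)
# Remark 1: E{ell1} = lam*nH + (nH+m-1)*sigma2 + sigma2^2*(m-1)/(lam+sigma2) + o(sigma^4)
mean_approx = lam * n_h + (n_h + m - 1) * sigma2 + sigma2**2 * (m - 1) / (lam + sigma2)
print(ell1.mean(), mean_approx)
```

At this noise level the agreement is well within Monte Carlo error, consistent with Fig. 1 (left).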

The next proposition considers a non-central single Wishart, Case 2 in Table 1.

Proposition 2. Let H ~ CWm(nH, σ²I, (ω/σ²)vv*) with ‖v‖ = 1, ω > 0, and let ℓ1(σ) be its largest eigenvalue. Then, with (m, nH, ω) fixed, as σ → 0,

ℓ1(σ) = (σ²/2)(A + B + BC/A) + OP(σ⁴),   (2)

where A, B, C are all independent and distributed as A ~ χ²_{2nH}(2ω/σ²), B ~ χ²_{2m−2} and C ~ χ²_{2nH−2}.

Remark 3. By definition, E{χ²_k(η)} = k + η. Furthermore, it is easy to show that as η → ∞, E{(χ²_k(η))⁻¹} = (k − 2 + η)⁻¹{1 + O(η⁻¹)} and Var{(χ²_k(η))⁻¹} = 2{(k + η − 2)²(k + η − 4)}⁻¹{1 + O(η⁻¹)}. Note that as σ → 0, the non-centrality parameter 2ω/σ², which appears in the random variable A in (2), tends to infinity. Hence, for small σ, we can approximate the mean and variance of ℓ1(σ) in (2) by

E{ℓ1(σ)} ≈ ω + (nH + m − 1)σ² + (nH − 1)(m − 1)σ⁴/((nH − 1)σ² + ω)

and

Var{ℓ1(σ)} ≈ 2ωσ² + σ⁴{nH + m − 1 + (nH − 1)(m − 1)²/((nH + σ⁻²ω − 1)²(nH + σ⁻²ω − 2))}.

The next two propositions provide approximations to the distribution of Roy’s largest root in the central and non-central double matrix settings, which correspond to Cases 3 and 4 in Table 1. For Case 3, for example, in principle we need to study ℓ1(E⁻¹H) where E ~ CWm(nE, Σ) and H ~ CWm(nH, Σ + λ̃ww*). However, a simplification can be made based on the following observations: (i) the matrix E⁻¹H has the same eigenvalues as Σ^{1/2}E⁻¹HΣ^{−1/2}, which is equal to (Σ^{−1/2}EΣ^{−1/2})⁻¹(Σ^{−1/2}HΣ^{−1/2}); (ii) the matrix Σ^{−1/2}EΣ^{−1/2} ~ CWm(nE, I); and (iii) the matrix Σ^{−1/2}HΣ^{−1/2} ~ CWm(nH, I + λvv*), where v = Σ^{−1/2}w/‖Σ^{−1/2}w‖ has unit norm and λ = ‖Σ^{−1/2}w‖² λ̃. Hence, in the following propositions we assume without loss of generality that the covariance matrix of E is Σ = I.

Proposition 3. Let H ~ CWm(nH, I + λvv*) and E ~ CWm(nE, I) be independent, with nE > m + 1 and ‖v‖ = 1. Let ℓ1 be the largest eigenvalue of E⁻¹H. Then, with (m, nH, nE) fixed, as λ becomes large,

ℓ1(λ) ≈ (1 + λ) a1 F_{b1,c1} + a2 F_{b2,c2} + a3,   (3)

where the two F distributed random variates are independent and

a1 = nH/(nE − m + 1), a2 = (m − 1)/(nE − m + 2), a3 = (m − 1)/((nE − m)(nE − m − 1)),
b1 = 2nH, b2 = 2m − 2, c1 = 2nE − 2m + 2, c2 = 2nE − 2m + 4.   (4)

Proposition 4. Suppose that H ~ CWm(nH, I, ωvv*) and E ~ CWm(nE, I) are independent, with nE > m + 1, ω > 0, and ‖v‖ = 1. Let ℓ1 be the largest eigenvalue of E⁻¹H. Then, with (m, nH, nE) fixed, as ω becomes large,

ℓ1(ω) ≈ a1 F_{b1,c1}(2ω) + a2 F_{b2,c2} + a3,   (5)

where F_{b1,c1}(2ω) denotes a non-central F distributed variate with non-centrality parameter 2ω, the two F distributed random variates are independent, and the parameters ai, bi, ci are given in (4).
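Both double-matrix approximations are straightforward to sample. The sketch below (assuming NumPy and SciPy are available; the parameters mirror Fig. 3, left) draws from (3) and compares the Monte Carlo mean to its closed form, using E[F_{b,c}] = c/(c − 2):

```python
import numpy as np
from scipy import stats

def case3_params(m, n_h, n_e):
    # Constants (4), shared by Propositions 3 and 4
    a1 = n_h / (n_e - m + 1)
    a2 = (m - 1) / (n_e - m + 2)
    a3 = (m - 1) / ((n_e - m) * (n_e - m - 1))
    return a1, a2, a3, 2 * n_h, 2 * m - 2, 2 * n_e - 2 * m + 2, 2 * n_e - 2 * m + 4

def roys_root_case3(n_trials, m, n_h, n_e, lam, seed=2):
    # Approximation (3): ell1 ~ (1 + lam)*a1*F_{b1,c1} + a2*F_{b2,c2} + a3
    a1, a2, a3, b1, b2, c1, c2 = case3_params(m, n_h, n_e)
    f1 = stats.f.rvs(b1, c1, size=n_trials, random_state=seed)
    f2 = stats.f.rvs(b2, c2, size=n_trials, random_state=seed + 1)
    return (1 + lam) * a1 * f1 + a2 * f2 + a3

m, n_h, n_e, lam = 5, 10, 10, 50.0
ell1 = roys_root_case3(100_000, m, n_h, n_e, lam)
a1, a2, a3, b1, b2, c1, c2 = case3_params(m, n_h, n_e)
mean_approx = (1 + lam) * a1 * c1 / (c1 - 2) + a2 * c2 / (c2 - 2) + a3
print(ell1.mean(), mean_approx)
```

Replacing f1 by a non-central F draw (scipy.stats.ncf with non-centrality 2ω) gives the corresponding sampler for Proposition 4.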

Remark 4. In the limit as nE → ∞, the two F-distributed random variables in (3) and (5) converge to scaled χ²-distributed random variables, thus recovering the leading order terms in (1) and (2), respectively.

Let us illustrate the accuracy of our approximations via several simulations. Fig. 1 compares the empirical density of the largest eigenvalue, computed from 10⁵ independent Monte Carlo realizations, in Cases 1 and 2 defined in Table 1, to the two corresponding propositions. For reference, we also plot the standard Gaussian density. The accuracy of our propositions for computing tail probabilities of the form Pr(ℓ1 > t) is illustrated in Fig. 2 for Case 1. Similar results (not shown) hold for other cases. Results for Cases 3 and 4 of Table 1 are shown in Fig. 3. As can be seen, in all cases, due to the small sample size and dimension, the distribution of the largest root deviates significantly from the asymptotic Gaussian one, with our propositions being significantly more accurate.

Fig. 2.

Tail probabilities for largest eigenvalue in Case 1, same parameters as in Fig. 1.

Fig. 3.

Density of the largest eigenvalue in Case 3 (left) and Case 4 (right). In both plots nE = nH = 10, m = 5. In Case 3, λ = 50 and in Case 4 ω = 150. The blue solid line corresponds to Propositions 3 and 4.

2.1. On the leading canonical correlation coefficient

We now consider the fifth Case of Table 1 and study the largest sample canonical correlation coefficient between a first group of p variables and a second group of q variables, in the presence of a single large canonical correlation coefficient in the population. Canonical correlation analysis is widely used in a variety of applications, for example in medical image processing [7, 26, 30], signal processing [2, 35, 40], and array processing [11].

Since the canonical correlation is invariant under unitary transformations within each of the two groups of variables, in the presence of a single large correlation coefficient, without loss of generality we can choose the following form for the matrix Σ,

Σ = ( Ip   P̃ )
    ( P̃⊤  Iq ).

Here P̃ = (P  0p×(q−p)) with P = diag(ρ, 0, …, 0) ∈ ℝp×p, and ρ is the value of the correlation coefficient.

To study the sample canonical correlation, consider n + 1 complex-valued m-dimensional multivariate Gaussian observations xi ~ CN(0, Σ), i ∈ {1, …, n + 1}, on m = p + q variables, where without loss of generality p ≤ q. The corresponding sample covariance matrix S decomposes as

nS = ( Y*Y  Y*X )
     ( X*Y  X*X ),

where Y ∈ Cn×p and X ∈ Cn×q represent the first p variables and the remaining q variables, respectively.

Our interest is in the largest sample canonical correlation coefficient, denoted by r1. Similar to the real valued case [32, Chapter 10], its square r1² is the largest root of the following characteristic equation

det(r² Y*Y − Y*QY) = 0,   (6)

where Q = X(X*X)⁻¹X*. Introducing the notation H = Y*QY and E = Y*(In − Q)Y, (6) can be rewritten as

det(r²(H + E) − H) = 0.

Hence, we may equivalently study the largest root ℓ1 of E⁻¹H, since it is related to r1² by ℓ1 = r1²/(1 − r1²).

Similar to [23], it can be shown that with Φ = Ip − P², conditional on X, the two matrices H and E are independent and distributed as

H | X ~ CWp(q, Φ, Ω) and E | X ~ CWp(n − q, Φ),   (7)

with the non-centrality matrix given by

Ω = Φ⁻¹ P̃ X*X P̃⊤ = (ρ²/(1 − ρ²)) (X*X)11 e1e1⊤ = ω e1e1⊤, where ω = (ρ²/(1 − ρ²)) (X*X)11.   (8)

Since X*X ~ CWq(n, Iq), all diagonal entries of X*X follow a chi-squared distribution; in particular, (X*X)11 ~ ½χ²_{2n}. The next proposition provides an approximation to the distribution of the largest sample canonical correlation in the presence of a single population canonical correlation. To this end, we introduce the following notation. We denote by F^χ_{a,b}(c, n) a random variable which is defined as a function of three other random variables as follows. First, generate a random variable Z ~ cχ²_n. Next, generate two independent random variables, one distributed as χ²_a(Z) and the other as χ²_b. Finally, compute their ratio

(χ²_a(Z)/a) / (χ²_b/b) ~ F^χ_{a,b}(c, n).   (9)
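The three-step recipe above translates directly into a sampler (a sketch assuming NumPy; the parameter values are arbitrary illustrations). Its mean also has a simple closed form, E[F^χ_{a,b}(c, n)] = (1 + cn/a)·b/(b − 2), which follows from E[χ²_a(Z) | Z] = a + Z and E[b/χ²_b] = b/(b − 2):

```python
import numpy as np

def f_chi_rvs(a, b, c, n, size, seed=3):
    # F^chi_{a,b}(c, n) per (9): Z ~ c*chi2_n, then (chi2_a(Z)/a) / (chi2_b/b)
    rng = np.random.default_rng(seed)
    z = c * rng.chisquare(n, size)                    # random non-centrality
    num = rng.noncentral_chisquare(a, z, size) / a    # chi2_a(Z), Z varies per draw
    den = rng.chisquare(b, size) / b                  # independent chi2_b
    return num / den

a, b, c, n = 8, 20, 0.5, 6
x = f_chi_rvs(a, b, c, n, 400_000)
mean_exact = (1 + c * n / a) * b / (b - 2)
print(x.mean(), mean_exact)
```

For c = 0 the mixing variable vanishes and the construction reduces to an ordinary central F_{a,b} variate.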

Proposition 5. Let ℓ1 = r1²/(1 − r1²), where r1 is the largest sample canonical correlation between two groups of sizes p ≤ q, computed from n + 1 i.i.d. observations, with ν = n − p − q > 1. Then, in the presence of a single large population correlation coefficient ρ between the two groups, asymptotically as ρ → 1,

ℓ1 ≈ a1 F^χ_{b1,c1}(ρ²/(1 − ρ²), 2n) + a2 F_{b2,c2} + a3,

where

a1 = q/(ν + 1), a2 = (p − 1)/(ν + 2), a3 = (p − 1)/(ν(ν − 1)),
b1 = 2q, b2 = 2p − 2, c1 = 2(ν + 1), c2 = 2(ν + 2).

Remark 5. It can be shown that the probability density of F^χ_{a,b}(c, n) is

f_X(x) = (1 + c)^{−n/2}/B(a/2, b/2) · (b/a)^{b/2} x^{a/2−1} (x + b/a)^{−(a+b)/2} · ₂F₁(n/2, (a + b)/2; a/2; xc/((c + 1)(x + b/a))),

where ₂F₁(a, b; c; z) is the Gauss hypergeometric function and B(p, q) is the beta function. This formula is useful for numerical evaluation for small parameter values.

Fig. 4 illustrates the accuracy of Proposition 5. A good match between the theoretical approximation formula and simulation results is clearly visible, particularly at the right tail of the distribution.

Fig. 4.

Density function of ℓ1(E⁻¹H) in canonical correlation analysis.

3. Distribution of the Leading Sample Eigenvector

Another key quantity of both theoretical and practical importance is the squared inner product between the leading sample eigenvector, denoted v̂, and its corresponding population eigenvector v. Assuming ‖v‖ = ‖v̂‖ = 1, define

R = |v̂*v|².   (10)

A practical application where it is important to understand the behavior of R under a rank one spike involves the design of dominant mode rejection (DMR) adaptive beamformers in array processing [42]. The main purpose of this beamformer is to eliminate interferences from undesired directions other than the steering direction. As shown in [43], an important parameter which determines the performance of the DMR scheme is the correlation between the random sample eigenvectors and the unknown population eigenvectors. Specifically, in the presence of a single dominant interferer, the population covariance matrix takes the form of a rank one spiked model [43, Eq. 17], and the effectiveness of the DMR depends on the quantity R. Another application where the quantity R plays a key role is passive radar detection with digital illuminators having several periodic identical pulses [14]. In a sequence of papers [12–14], the authors developed a new framework for passive radar detection based on the leading eigenvector of the sample covariance matrix. This detection scheme outperforms traditional detectors [14]. Motivated by these and other applications, we now develop stochastic approximations to R. For Case 1 of Table 1, we have:

Proposition 6. Let H ~ CWm(nH, λvv* + σ²I), with ‖v‖ = 1 and λ > 0. Let v̂ be the eigenvector corresponding to the largest eigenvalue of H. Then, with (m, nH, λ) fixed, for small σ,

R⁻¹ ≈ 1 + (σ²/(λ + σ²)) (B/A) + (2σ⁴/(λ + σ²)²) (BC/A²),

where A ~ χ²_{2nH}, B ~ χ²_{2m−2} and C ~ χ²_{2nH−2} are all independent.
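A direct simulation of R can be set against the stochastic approximation of Proposition 6 (a sketch assuming NumPy; the parameters mirror Fig. 1):

```python
import numpy as np

def r_direct(n_trials, m, n_h, lam, sigma2, seed=6):
    # R = |vhat* v|^2 for H ~ CW_m(n_H, lam*e1 e1* + sigma2*I), with v = e1
    rng = np.random.default_rng(seed)
    scale = np.full(m, np.sqrt(sigma2))
    scale[0] = np.sqrt(lam + sigma2)
    g = (rng.standard_normal((n_trials, m, n_h))
         + 1j * rng.standard_normal((n_trials, m, n_h))) / np.sqrt(2)
    y = scale[None, :, None] * g
    h = y @ y.conj().transpose(0, 2, 1)
    v_hat = np.linalg.eigh(h)[1][:, :, -1]     # leading sample eigenvectors
    return np.abs(v_hat[:, 0]) ** 2            # squared overlap with e1

def r_approx(n_trials, m, n_h, lam, sigma2, seed=7):
    # Proposition 6: 1/R ~ 1 + sigma2/(lam+sigma2)*B/A + 2*sigma2^2/(lam+sigma2)^2*B*C/A^2
    rng = np.random.default_rng(seed)
    a = rng.chisquare(2 * n_h, n_trials)
    b = rng.chisquare(2 * m - 2, n_trials)
    c = rng.chisquare(2 * n_h - 2, n_trials)
    inv_r = (1 + sigma2 / (lam + sigma2) * b / a
             + 2 * sigma2**2 / (lam + sigma2)**2 * b * c / a**2)
    return 1.0 / inv_r

m = n_h = 5
lam, sigma2 = 5.0, 0.01
print(r_direct(20_000, m, n_h, lam, sigma2).mean(),
      r_approx(20_000, m, n_h, lam, sigma2).mean())
```

At high signal-to-noise ratio both means are close to one, reflecting a leading sample eigenvector nearly aligned with the population spike.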

The distribution of R in Case 2 of Table 1 is given by the following proposition.

Proposition 7. Let H ~ CWm(nH, σ²I, (ω/σ²)vv*), with ‖v‖ = 1 and ω > 0. Let v̂ be the eigenvector corresponding to the largest eigenvalue of H. Then, with (m, nH, ω) fixed, for small σ,

R⁻¹ ≈ 1 + B/Aσ + 2BC/(Aσ)²,

where Aσ ~ χ²_{2nH}(2ω/σ²), B ~ χ²_{2m−2} and C ~ χ²_{2nH−2} are all independent.

Propositions 6 and 7 can be used to analyze theoretically various DMR and radar detection schemes, and to shed light on their dependence on the relevant system parameters.

For the double-matrix Case 3 in Table 1, we have

Proposition 8. Let H ~ CWm(nH, λvv* + I) and E ~ CWm(nE, I) be independent, with nE > m + 1 and ‖v‖ = 1. Let v̂ be the eigenvector corresponding to the largest eigenvalue of E⁻¹H. Then, with (m, nH, nE) fixed, for large λ,

R⁻¹ ≈ 1 + B/D,

where B ~ χ²_{2m−2} and D ~ χ²_{2nE−2m+4} are independent.

In the context of array processing, the double matrix Case 3 of Table 1 corresponds to a setting where the noise characteristics of the m sensors are not perfectly known, but rather their covariance matrix is estimated from nE samples that do not contain any signal. Comparing Proposition 8 with Proposition 6 sheds light on the effect of estimating the covariance matrix of the noise. Whereas in Case 1, as the signal strength λ → ∞, the quantity R converges to one, in Case 3 the random variable R does not converge to one, but rather to a Beta-distributed random variable.

Figs. 5 and 6 illustrate the accuracy of our approximate distributions of the squared inner product between the leading sample and population eigenvectors.

Fig. 5.

Empirical versus theoretical density of R in Case 1 (left) and Case 2 (right).

Fig. 6.

Comparison of empirical density of R in Case 3 of Table 1 with Proposition 8, for nH = 10, nE = 16, m = 5 and λ = 100.

4. Applications

We now demonstrate the utility of our approximations to Roy’s largest root distribution under a rank-one perturbation in three different engineering applications. The first two are concerned with common problems in signal detection, whereas the third concerns the outage probability of a rank-one Rician fading MIMO channel.

4.1. Signal Detection in Noise

Detecting the presence of a signal in a noisy environment is a fundamental problem in detection theory. Specific examples include spectrum sensing in cognitive radio [17] and target detection in sonar and radar [42]. Assuming additive Gaussian noise, the observed vector y(t) ∈ Cm at time t is of the form

y(t) = √λ s(t) u + n(t),   (11)

where s(t) ∈ C is the time dependent signal, u ∈ Cm with ‖u‖ = 1 is its direction, λ ≥ 0 is a measure of the signal strength, and n ∈ Cm is a zero mean complex valued random noise vector, assumed to be independent of the signal and distributed as n ~ CN(0, Σ). The positive definite Hermitian matrix Σ is thus the population covariance of the additive random noise. In some cases it is assumed to be explicitly known, whereas in others it needs to be estimated. The signal s(t) is often modeled as a random quantity with E{|s(t)|²} = 1. For example, in multiple antenna spectrum sensing for cognitive radio a common model is that s(t) ~ CN(0, 1), namely s(t) = (s1(t) + ιs2(t))/√2, where s1(t) and s2(t) are real valued and independent random variables distributed N(0, 1) [45, 46]. Similarly, in detection of constant modulus signals (e.g., FM signals [18]), s(t) = exp(ιϕ(t)), where ϕ(t) is random.

When the covariance matrix Σ of the noise vector n is assumed known, the observed data used to detect if a signal is present are often nH i.i.d. observations y1, …, ynH from (11). A popular approach is to compute the sample covariance matrix H = Σ_{j=1}^{nH} yj yj*, and declare that a signal is present if some function of its eigenvalues is larger than a suitable threshold. Several such detection tests have been proposed [18, 45, 46], including Roy’s largest root [29]. As discussed below, depending on the model of the signal, this leads precisely to Cases 1 and 2 in Table 1.

In other situations, Σ is unknown, but it is possible to observe both the nH samples yi of (11) as well as an additional set of nE independent realizations n1, …, nnE of the noise vector n. The latter are measured, for example, in time slots at which it is a-priori known that no signals are emitted. Here, a typical approach is to form both the matrix H as above and the matrix E = Σ_{j=1}^{nE} nj nj*, and detect the presence of a signal via some function of the eigenvalues of E⁻¹H. Signal detection based on the largest eigenvalue of E⁻¹H leads to Cases 3 and 4 in Table 1.

As discussed in Section 2, one may assume without loss of generality that Σ = σ²I. Thus, when s ~ CN(0, 1),

H ~ CWm(nH, λuu* + σ²I).

In contrast, if s = exp(ιϕ), then conditional on ϕ1, …, ϕnH,

H ~ CWm(nH, σ²I, (λnH/σ²)uu*).

Propositions 1–4 can thus be used to approximate the detection power of Roy’s largest root test as a function of the signal strength λ, in both the single matrix cases and the double matrix cases,

PD = Pr{ℓ1 > μ | signal present with strength λ},   (12)

where μ is a given threshold parameter. The accuracy of (12) is illustrated in Fig. 7.
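For instance, PD in Case 1 can be estimated directly from the stochastic representation (1) (a sketch assuming NumPy; the parameters mirror Fig. 7). Since the same draws are reused for every threshold, the estimated power curve is monotone in μ by construction:

```python
import numpy as np

def detection_power(mu, m, n_h, lam, sigma2, n_trials=200_000, seed=5):
    # P_D = Pr(ell1 > mu) in Case 1, sampled via representation (1)
    rng = np.random.default_rng(seed)
    a = rng.chisquare(2 * n_h, n_trials)
    b = rng.chisquare(2 * m - 2, n_trials)
    c = rng.chisquare(2 * n_h - 2, n_trials)
    ell1 = (0.5 * (lam + sigma2) * a + 0.5 * sigma2 * b
            + sigma2**2 / (2 * (lam + sigma2)) * b * c / a)
    return float(np.mean(ell1 > mu))

m = n_h = 5
lam, sigma2 = 1.0, 0.01
powers = [detection_power(mu, m, n_h, lam, sigma2) for mu in (2.0, 5.0, 8.0)]
print(powers)
```

Sampling from (3) instead of (1) yields the corresponding power estimate for the double matrix Case 3.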

Fig. 7.

Detection power as a function of the threshold μ for several signal to noise ratios, for a known noise covariance matrix and for an estimated one. In both cases λ = 1, nH = 5, m = 5. In the right panel nE = 10. From top to bottom, σ = 1/10, 2/10 and 3/10.

4.2. Rank-One Rician-Fading MIMO Channel

As a last application, consider the outage probability of a MIMO communication channel with nT transmitters and nR receivers. Here, the transmitted signals x ∈ CnT and received signals y ∈ CnR are related as

y = Hx + n,

where H is the nR × nT channel matrix and n is additive random complex valued noise, assumed to be distributed as n ~ CN(0, σn²I), where σn² is its (real-valued) variance. Due to fluctuations in the environment, the channel matrix H is modeled as a random quantity. In particular, under a common Rician fading model [16], H has the form

H = √(K/(K + 1)) H1 + √(1/(K + 1)) H2,   (13)

where H1 represents the specular (Rician) component from a direct line-of-sight between transmitter and receiver antennas and H2 represents the scattered Rayleigh-fading component. With fixed sender and receiver locations, the matrix H1 is constant whereas H2 is random, with entries modeled as i.i.d. complex Gaussians CN(0, σH²). Under the normalization tr(H1H1*) = nRnT, the factor K represents the ratio of deterministic-to-scattered power of the environment.

Under the maximal ratio transmission strategy, where the transmitter sends information along the leading eigenvector of H*H, the channel signal to noise ratio is given by

μ = (ΩD/σn²) ℓ1(HH*),   (14)

where ΩD = E[‖x‖²] is the power of the transmitted signal vectors [24]. An important quantity is the channel’s outage probability, defined as the probability of failing to achieve a specified minimal SNR μmin required for satisfactory reception. Based on (14), the outage probability Pout can be written as

Pout = Pr((ΩD/σn²) ℓ1 ≤ μmin).   (15)

One particularly interesting case is when the Rician component H1 is assumed to be of rank one, H1 = uv*, where u ∈ CnR and v ∈ CnT. An important design question is which configuration of antennas minimizes (15), under the constraint that the total number of transmitting and receiving antennas is fixed. Via simulations, [24] showed it is best to have an equal number of transmitting and receiving antennas. Here we analytically prove this result asymptotically in the limit of small scattering variance (i.e., σH ≪ 1).

Proposition 9. Consider a rank-one Rician fading channel with a fixed total number of antennas, nT + nR = N. Then, for σH ≪ 1, the outage probability is minimized at nT = nR = N/2 for N even (or at nT = ⌊N/2⌋, nR = ⌈N/2⌉ for N odd).

Proof. Under the model in (13) and the assumption that H1 = uv* is rank one, the j-th column of H, of dimension nR, is distributed as CN(√(K/(K + 1)) u v̄j, (σH²/(K + 1)) InR). Therefore,

HH* ~ CWnR(nT, α²InR, (β²/α²) ww*)

is non-central Wishart, with

w = u/‖u‖, α² = σH²/(K + 1) and β² = (K/(K + 1)) ‖u‖² ‖v‖² = (K/(K + 1)) nRnT.

Thus, Proposition 2 implies that for fixed (nT, nR, K),

μ = (ΩD/σn²) ℓ1 = c1 (A + B + BC/A) + OP(σH⁴),   (16)

where A, B, C are independent random variables distributed as

A ~ χ²_{2nT}(c2), B ~ χ²_{2nR−2}, C ~ χ²_{2nT−2},

and

c1 = ΩD σH²/(2(K + 1) σn²), c2 = 2β²/α² = 2K nRnT/σH².   (17)

Since E(A) = 2nT + c2, and c2 → ∞ as σH → 0, we may neglect the third term in (16). Furthermore, since A and B are independent,

μ ≈ c1(A + B) = c1(χ²_{2nT}(c2) + χ²_{2nR−2}) = c1 χ²_{2N−2}(c2).

Clearly Pout of (15) is minimal when the largest eigenvalue ℓ1 is stochastically as large as possible, or in turn, when its non-centrality parameter c2 is maximal. Since by (17) c2 ∝ nTnR, the proposition follows. □
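The proof suggests a simple numerical check (a sketch assuming SciPy is available; K, σH, σn and ΩD follow Fig. 8, while N = 8 and μmin = 30 are our illustrative choices): evaluating Pout ≈ Pr(c1 χ²_{2N−2}(c2) ≤ μmin) with the non-central chi-squared CDF exhibits both the minimum at nT = N/2 and the symmetry in (nT, nR):

```python
from scipy import stats

def outage_prob(n_t, big_n, k, sigma_h, sigma_n, omega_d, mu_min):
    # Small-sigma_H approximation of (15): P_out ~ Pr(c1 * chi2_{2N-2}(c2) <= mu_min)
    n_r = big_n - n_t
    c1 = omega_d * sigma_h**2 / (2 * (k + 1) * sigma_n**2)
    c2 = 2 * k * n_r * n_t / sigma_h**2      # non-centrality, proportional to nT*nR
    return stats.ncx2.cdf(mu_min / c1, 2 * big_n - 2, c2)

big_n = 8
p_out = [outage_prob(n_t, big_n, k=2.0, sigma_h=0.3, sigma_n=1.0,
                     omega_d=5.0, mu_min=30.0) for n_t in range(1, big_n)]
print(p_out)
```

Since c2 depends on the antennas only through the product nTnR, the curve is symmetric about nT = N/2, where the product, and hence the non-centrality, is maximal.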

Fig. 8.

Outage probability as a function of nT, with nT + nR fixed. Circles represent a Monte-Carlo simulation whereas the solid line is our approximation (which can be computed for any non-integer nT ∈ ℝ+). These graphs support Proposition 9 and demonstrate the accuracy of our approximations. In both graphs, K = 2, σH = 0.3, σn = 1 and ΩD = 5.

Acknowledgments

We thank the editor in chief, the associate editor and the referees for their constructive comments and suggestions. This work was supported in part by grants NIH BIB R01EB1988 (PD) and BSF 2012-159 (PD, BN, OS). B.N. is incumbent of the William Petschek professorial chair of mathematics.

Appendix A. Proofs of main propositions

We prove our main results using the analytical framework developed in [23]. For a complex-valued number z ∈ C, its real and imaginary parts are denoted ℜ(z) and ℑ(z), respectively, whereas z̄ is its complex conjugate. We begin with the following auxiliary lemma, which describes the analytic structure of the leading eigenvalue and eigenvector of a covariance matrix constructed from vectors all in the same direction, which without loss of generality we choose as the standard vector e1 = (1, 0, …, 0)⊤, but corrupted by small perturbations. Its proof is in Appendix B.

Lemma 1. Let {xj}, j ∈ {1, …, n}, be n vectors in Cm of the form

xj = uj e1 + ϵ ξj,   (A.1)

where the uj are complex valued scalars, ξj = (0, ξ̃j⊤)⊤ with ξ̃j ∈ Cm−1 are the perturbations in directions orthogonal to e1, and ϵ ∈ ℝ is a small parameter. Define z ∈ ℝ, b ∈ Cm−1 and Z ∈ C(m−1)×(m−1) by

z = Σ_{j=1}^n uj ūj, b = z^{−1/2} Σ_{j=1}^n ūj ξ̃j, Z = Σ_{j=1}^n ξ̃j ξ̃j*.   (A.2)

Let ℓ1(ϵ) be the largest eigenvalue of H(ϵ) = Σ_{j=1}^n xj xj*, with corresponding leading eigenvector v1(ϵ) normalized such that e1*v1(ϵ) = 1. Then ℓ1(ϵ) is an even analytic function of ϵ, whereas v1(ϵ) − e1 is an odd function of ϵ. In particular, the Taylor expansions of ℓ1(ϵ) and v1(ϵ) around ϵ = 0 are given by

ℓ1(ϵ) = z + ‖b‖² ϵ² + z⁻¹ b*(Z − bb*)b ϵ⁴ + ⋯,   (A.3)
v1(ϵ) = e1 + z^{−1/2} (0, b⊤)⊤ ϵ + z^{−3/2} (0, (Zb − ‖b‖²b)⊤)⊤ ϵ³ + ⋯.

Proof of Propositions 1 and 2. Since the eigenvalues of H do not depend on the direction of the vector v, without loss of generality we assume that v = e1. Then, H may be realized from nH i.i.d. observations of the form (A.1) with ϵ replaced by σ,

ξ̃j ~ CN(0, Im−1), with uj ~ CN(0, σ² + λ) for Proposition 1, and uj ~ CN(μj, σ²) for Proposition 2,   (A.4)

where the μj are arbitrary complex numbers satisfying Σj |μj|² = ω.

For each realization of u = (uk) and Ξ = [ξ̃1, …, ξ̃nH] ∈ C(m−1)×nH, Lemma 1 yields the approximation (A.3) for ℓ1(σ). To derive the distributions of the various terms in (A.3) we proceed as follows. Define o1 = ū/‖u‖ ∈ CnH, choose columns o2, …, onH so that O = [o1, …, onH] is an nH × nH unitary matrix, and consider the (m − 1) × nH matrix V = ΞO. Its first column is v1 = Ξū/‖u‖ = b, and thus the O(ϵ²) term in (A.3) is b*b = ‖v1‖². For the fourth order term, observe that Z = ΞΞ* = VV*, and so the quantity D = b*(Z − bb*)b may be written as

D = v1*(VV* − v1v1*)v1 = ‖V*v1‖² − ‖v1‖⁴ = Σ_{j=2}^{nH} |v1*vj|².

Hence, (A.3) becomes

ℓ1(ϵ) = V0 + V2 ϵ² + V4 ϵ⁴ + ⋯,

where V0 = ‖u‖², V2 = ‖v1‖² and V4 = V0⁻¹ D. To study the distributions of V0, V2, V4, note that by assumption in (A.4), uj = (aj + ιbj)/√2 with

aj, bj ~ N(0, λ + σ²) for Proposition 1, and aj ~ N(√2 ℜ(μj), σ²), bj ~ N(√2 ℑ(μj), σ²) for Proposition 2.

Therefore, ‖u‖² = ½ Σ_{j=1}^{nH} (aj² + bj²) is a sum of 2nH independent squares of either mean centered or non-centered Gaussian random variables. This in turn gives

V0 = ‖u‖² ~ ((σ² + λ)/2) χ²_{2nH} for Proposition 1, and V0 ~ (σ²/2) χ²_{2nH}(2ω/σ²) for Proposition 2.

Since, given $u$, the matrix $O$ is unitary and fixed, we have $v_j \,|\, u \sim \mathcal{CN}(0, I_{m-1})$. Since this conditional distribution does not depend on $u$, also unconditionally $v_j \sim \mathcal{CN}(0, I_{m-1})$. By similar arguments,

$V_2 = \|v_1\|^2 \sim \tfrac{1}{2}\chi^2_{2m-2},$

which is independent of $\|u\|^2$. Finally, conditioned on $(u, v_1)$, we have $v_1^* v_j \sim \mathcal{CN}(0, \|v_1\|^2)$ and $|v_1^* v_j|^2 \sim \tfrac{\|v_1\|^2}{2}\chi^2_2$. Thus,

$D \,\big|\, (u, v_1) = \sum_{j=2}^{n_H} |v_1^* v_j|^2 \,\Big|\, (u, v_1) \sim \dfrac{\|v_1\|^2}{2}\,\chi^2_{2n_H-2},$

where the $\chi^2_{2n_H-2}$ variate is independent of $(u, v_1)$. We conclude that

$V_4 \sim \begin{cases} \dfrac{1}{2\sigma^2 + 2\lambda}\,(\chi^2_{2n_H})^{-1}\,\chi^2_{2m-2}\,\chi^2_{2n_H-2}, & \text{Proposition 1}, \\[2mm] \dfrac{1}{2\sigma^2}\,\big(\chi^2_{2n_H}(2\omega/\sigma^2)\big)^{-1}\,\chi^2_{2m-2}\,\chi^2_{2n_H-2}, & \text{Proposition 2}. \end{cases}$

Since the random variables $V_0, V_2, V_4$ are independent, so are $A$, $B$ and $C$ in either (1) or (2). This completes the proof of Propositions 1 and 2. □
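The chi-squared representation of $V_0$ can be probed by simulation. A minimal sketch, assuming hypothetical values of $n_H$, $\lambda$ and $\sigma^2$ (our addition, not part of the proof), checks the mean and variance of $V_0 = \|u\|^2$ against those of $\tfrac{\sigma^2+\lambda}{2}\chi^2_{2n_H}$ in the setting of Proposition 1:

```python
import numpy as np

rng = np.random.default_rng(1)
nH, lam, sigma2, reps = 8, 2.0, 0.5, 200_000

# u_j ~ CN(0, sigma^2 + lambda): real and imaginary parts are N(0, (sigma^2 + lambda)/2)
s = np.sqrt((sigma2 + lam) / 2)
u = s * (rng.standard_normal((reps, nH)) + 1j * rng.standard_normal((reps, nH)))
V0 = np.sum(np.abs(u) ** 2, axis=1)           # V0 = ||u||^2, one value per replicate

# ((sigma^2+lam)/2) chi^2_{2 nH} has mean (sigma^2+lam) nH and variance (sigma^2+lam)^2 nH
assert abs(V0.mean() - (sigma2 + lam) * nH) < 0.1
assert abs(V0.var() - (sigma2 + lam) ** 2 * nH) < 1.0
```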

To prove Propositions 3 and 4, we first introduce some additional notation and two auxiliary lemmas, whose proofs are deferred to Appendix B. For a matrix $S$, denote by $S_{jk}$ and $S^{jk}$ the $(j,k)$-th entries of $S$ and $S^{-1}$, respectively.

Lemma 2. Let $E \sim \mathcal{CW}_m(n_E, I)$ and $M = [e_1, \hat{b}] \in \mathbb{C}^{m\times 2}$, with the vector $\hat{b} = (0, b^\top)^\top$ fixed and orthogonal to $e_1$. Define the $2\times 2$ diagonal matrix $D = \mathrm{diag}(1, 1/\|b\|^2)$. Then

$S = (M^* E^{-1} M)^{-1} \sim \mathcal{CW}_2(n_E - m + 2, D),$

and the two random variables $S^{11}$ and $S_{22}$ are independent with

$S^{11} \sim \dfrac{2}{\chi^2_{2n_E-2m+2}}, \qquad S_{22} \sim \dfrac{\chi^2_{2n_E-2m+4}}{2\|b\|^2}.$

Lemma 3. Let $E \sim \mathcal{CW}_m(n_E, I)$ and let $A_2 = \begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix}$, where $Z$ is an $(m-1)\times(m-1)$ random matrix independent of $E$, with $\mathbb{E}(Z) = I_{m-1}$. Then

$\mathbb{E}\!\left(\dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{e_1^* E^{-1} e_1}\right) = \dfrac{m-1}{(n_E - m)(n_E - m + 1)}.$

Proof of Propositions 3 and 4. Without loss of generality we may assume that the signal direction is $v = e_1$. Hence

$H \sim \begin{cases} \mathcal{CW}_m(n_H, I + \lambda e_1 e_1^*), & \text{Proposition 3}, \\ \mathcal{CW}_m(n_H, I, \omega e_1 e_1^*), & \text{Proposition 4}. \end{cases}$

Next, we apply a perturbation approach similar to the one used in the previous proof. To introduce a small parameter, set

$\epsilon^2 = \begin{cases} 1/(1+\lambda), & \text{Proposition 3}, \\ 1/\omega, & \text{Proposition 4}. \end{cases}$

The matrix $H_\epsilon = \epsilon^2 H$ has a representation of the form $XX^*$ with $X = [x_1, \ldots, x_{n_H}]$, where each $x_j$ follows (A.1), but now with

$\xi_j \sim \mathcal{CN}(0, I_{m-1}), \qquad u_j \sim \begin{cases} \mathcal{CN}(0, 1), & \text{Proposition 3}, \\ \mathcal{CN}(\mu_j/\sqrt{\omega}, 1/\omega), & \text{Proposition 4}, \end{cases}$

where $\sum_j |\mu_j|^2 = \omega$. In particular,

$z = \sum_{j=1}^{n_H} |u_j|^2 \sim \begin{cases} \tfrac{1}{2}\chi^2_{2n_H}, & \text{Proposition 3}, \\ \tfrac{1}{2\omega}\chi^2_{2n_H}(2\omega), & \text{Proposition 4}. \end{cases}$

With b as in (A.2), using the same arguments as in the previous proof, we have that b~CN(0,Im1), independently of u.
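The non-central law of $z$ in the setting of Proposition 4 admits a simple numerical check. In this added sketch (the values of $n_H$, $\omega$ and the means $\mu_j$ are hypothetical), $2\omega z$ should behave as a $\chi^2_{2n_H}(2\omega)$ variate, whose mean is $2n_H + 2\omega$:

```python
import numpy as np

rng = np.random.default_rng(6)
nH, omega, reps = 6, 9.0, 100_000

# any means mu_j with sum |mu_j|^2 = omega; then u_j ~ CN(mu_j/sqrt(omega), 1/omega)
mu = np.sqrt(omega / nH) * np.ones(nH)
noise = (rng.standard_normal((reps, nH)) + 1j * rng.standard_normal((reps, nH))) / np.sqrt(2 * omega)
u = mu / np.sqrt(omega) + noise
z = np.sum(np.abs(u) ** 2, axis=1)            # z = sum_j |u_j|^2

# 2*omega*z ~ chi^2_{2 nH}(2 omega), with mean 2 nH + 2 omega
assert abs((2 * omega * z).mean() - (2 * nH + 2 * omega)) < 0.2
```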

The matrix $H_\epsilon$ may be written as $H_\epsilon = A_0 + \epsilon A_1 + \epsilon^2 A_2$, where

$A_0 = \begin{pmatrix} z & 0 \\ 0 & 0_{m-1} \end{pmatrix}, \qquad A_1 = \sqrt{z}\begin{pmatrix} 0 & b^* \\ b & 0_{m-1} \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix},$ (A.5)

with $Z$ as in (A.2). For future use, we define the following quantities:

$E_{11} = e_1^* E^{-1} e_1, \qquad \hat{b} = \binom{0}{b}, \qquad E_{b1} = \hat{b}^* E^{-1} e_1, \qquad E_{bb} = \hat{b}^* E^{-1} \hat{b}.$

Note that the condition $n_E \geq m$ ensures that $E$ is invertible with probability 1; this follows, for example, from Theorem 3.2 in [9].

The matrix $E^{-1}H_\epsilon$ is similar to the Hermitian matrix $E^{-1/2}H_\epsilon E^{-1/2}$; therefore, all its eigenvalues are real-valued for any value of $\epsilon$. Furthermore, since $E^{-1/2}H_\epsilon E^{-1/2}$ is a Hermitian matrix depending holomorphically (indeed, polynomially) on the real parameter $\epsilon$, it follows from Kato [25, Theorem 6.1, page 120] that the largest eigenvalue $\ell_1$ and its eigenprojection $\tilde{P}(\epsilon)$ are analytic functions of $\epsilon$ in some neighborhood of zero in which the largest eigenvalue has multiplicity one. The projection onto the corresponding eigenspace of $E^{-1}H_\epsilon$ is $P(\epsilon) = E^{-1/2}\tilde{P}(\epsilon)E^{1/2}$. As the matrix $E$ does not depend on $\epsilon$, this projection is also an analytic function in some neighborhood of $\epsilon = 0$.

At $\epsilon = 0$, $E^{-1}e_1$ is an eigenvector of $E^{-1}H_0$ with eigenvalue $zE_{11}$, that is,

$E^{-1}H_0 E^{-1}e_1 = z E^{-1}e_1 \cdot e_1^* E^{-1} e_1 = z E_{11}\, E^{-1} e_1,$

from which we obtain

$e_1^* P(0) E^{-1} e_1 = e_1^* E^{-1} e_1 = E_{11}.$ (A.6)

Since $P(\epsilon)$ is an analytic function of $\epsilon$ and the inner product is a smooth function, there exists a neighborhood of $\epsilon = 0$ in which $e_1^* P(\epsilon) E^{-1} e_1$ is both analytic in $\epsilon$ and strictly positive. In this neighborhood, we may define

$v_1(\epsilon) = \dfrac{E_{11}}{e_1^* P(\epsilon) E^{-1} e_1}\, P(\epsilon) E^{-1} e_1.$ (A.7)

Clearly $v_1(\epsilon)$ is the eigenvector corresponding to the eigenvalue $\ell_1(\epsilon)$, and it is also analytic. We thus expand

$\ell_1(\epsilon) = \sum_{j=0}^\infty \lambda_j \epsilon^j, \qquad v_1(\epsilon) = \sum_{j=0}^\infty w_j \epsilon^j.$ (A.8)

Inserting these expansions into the eigenvalue-eigenvector equation $E^{-1}H_\epsilon v_1 = \ell_1 v_1$ gives the following equations: at the $O(1)$ level,

$E^{-1}A_0 w_0 = \lambda_0 w_0,$

whose solution is

$\lambda_0 = z E_{11}, \qquad w_0 = \mathrm{const} \cdot E^{-1} e_1.$ (A.9)

By (A.6)-(A.7), w0 = v1(0) = E−1e1, so the above constant is one.

By (A.7), $e_1^* v_1(\epsilon) = E_{11} = e_1^* w_0$. Hence $e_1^* w_j = 0$ for all $j \geq 1$. Furthermore, since $A_0 = z\, e_1 e_1^*$, then $A_0 w_j = 0$ for all $j \geq 1$. The $O(\epsilon)$ equation is thus

$E^{-1}A_1 w_0 + E^{-1}A_0 w_1 = \lambda_1 w_0 + \lambda_0 w_1.$ (A.10)

However, $A_0 w_1 = 0$. Multiplying this equation by $e_1^*$ gives that

$\lambda_1 = \dfrac{e_1^* E^{-1} A_1 w_0}{E_{11}} = \dfrac{\sqrt{z}}{E_{11}}\, e_1^* E^{-1}\!\begin{pmatrix} 0 & b^* \\ b & 0 \end{pmatrix}\! E^{-1} e_1 = \dfrac{\sqrt{z}}{E_{11}}\left(E_{11} E_{b1} + \overline{E_{b1}}\, E_{11}\right) = 2\sqrt{z}\,\Re(E_{b1}).$ (A.11)

Inserting the expression for $\lambda_1$ into (A.10) gives that

$w_1 = \dfrac{1}{\sqrt{z}\,E_{11}}\left(E_{b1} E^{-1} e_1 + E_{11} E^{-1}\hat{b} - 2\Re(E_{b1})\, E^{-1} e_1\right) = \dfrac{1}{\sqrt{z}}\left(E^{-1}\hat{b} - \dfrac{\overline{E_{b1}}}{E_{11}}\, E^{-1} e_1\right).$

The next, $O(\epsilon^2)$, equation is

$E^{-1}A_2 w_0 + E^{-1}A_1 w_1 + E^{-1}A_0 w_2 = \lambda_2 w_0 + \lambda_1 w_1 + \lambda_0 w_2.$

Multiplying this equation by $e_1^*$ and recalling that $A_0 w_2 = 0$, $e_1^* w_1 = 0$ and $e_1^* w_0 = E_{11}$ gives

$\lambda_2 = \dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{E_{11}} + \dfrac{E_{11} E_{bb} + \overline{E_{b1}}^2 - 2\overline{E_{b1}}\,\Re(E_{b1})}{E_{11}} = \dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{E_{11}} + \dfrac{E_{11} E_{bb} - E_{b1}\overline{E_{b1}}}{E_{11}}.$ (A.12)

Combining (A.9)-(A.12), we obtain the following approximate stochastic representation for the largest eigenvalue $\ell_1$ of $E^{-1}H_\epsilon$:

$\ell_1(\epsilon) = z E_{11} + 2\epsilon\sqrt{z}\,\Re(E_{b1}) + \epsilon^2\, \dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{E_{11}} + \epsilon^2\, \dfrac{E_{11}E_{bb} - E_{b1}\overline{E_{b1}}}{E_{11}} + O_P(\epsilon^3).$ (A.13)
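Representation (A.13) can be tested numerically before proceeding. The sketch below (our illustrative addition; the realization of $E$, $b$, $z$ and $Z$ is arbitrary) builds $H_\epsilon = A_0 + \epsilon A_1 + \epsilon^2 A_2$ as in (A.5) and compares the largest eigenvalue of $E^{-1}H_\epsilon$ with the expansion up to $O(\epsilon^2)$:

```python
import numpy as np

rng = np.random.default_rng(8)
m, nE, eps, z = 4, 30, 1e-2, 2.5
e1 = np.eye(m)[0]

# a fixed realization of E ~ CW_m(nE, I) and of the quantities b and Z
Xe = (rng.standard_normal((m, nE)) + 1j * rng.standard_normal((m, nE))) / np.sqrt(2)
E = Xe @ Xe.conj().T
b = (rng.standard_normal(m - 1) + 1j * rng.standard_normal(m - 1)) / np.sqrt(2)
Z = np.diag(rng.uniform(0.5, 1.5, m - 1))
bhat = np.concatenate([[0.0], b])

# H_eps = A0 + eps A1 + eps^2 A2 as in (A.5)
A0 = np.zeros((m, m), dtype=complex); A0[0, 0] = z
A1 = np.sqrt(z) * (np.outer(e1, bhat.conj()) + np.outer(bhat, e1))
A2 = np.zeros((m, m), dtype=complex); A2[1:, 1:] = Z
Heps = A0 + eps * A1 + eps ** 2 * A2

ell1 = np.linalg.eigvals(np.linalg.solve(E, Heps)).real.max()

Einv = np.linalg.inv(E)
E11 = (e1 @ Einv @ e1).real                   # e_1^* E^{-1} e_1
Eb1 = bhat.conj() @ Einv @ e1                 # bhat^* E^{-1} e_1
Ebb = (bhat.conj() @ Einv @ bhat).real        # bhat^* E^{-1} bhat
quad = (e1 @ Einv @ A2 @ Einv @ e1).real      # e_1^* E^{-1} A2 E^{-1} e_1

approx = (z * E11 + 2 * eps * np.sqrt(z) * Eb1.real
          + eps ** 2 * quad / E11 + eps ** 2 * (E11 * Ebb - abs(Eb1) ** 2) / E11)
assert abs(ell1 - approx) < 1e-4              # remainder is O(eps^3)
```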

Next, to derive the approximate distribution of $\ell_1$ corresponding to the above equation, we study a $2\times 2$ Hermitian matrix $S$, whose inverse is defined by

$S^{-1} = \begin{pmatrix} E_{11} & \overline{E_{b1}} \\ E_{b1} & E_{bb} \end{pmatrix} = M^* E^{-1} M,$

where $M = [e_1, \hat{b}] \in \mathbb{C}^{m\times 2}$. Inverting this matrix gives

$S = \dfrac{1}{E_{11}E_{bb} - E_{b1}\overline{E_{b1}}}\begin{pmatrix} E_{bb} & -\overline{E_{b1}} \\ -E_{b1} & E_{11} \end{pmatrix}.$

Hence, in terms of the entries of $S$ and $S^{-1}$ (note that $S^{11} = E_{11}$), (A.13) can be written as

$\ell_1(\epsilon) = z S^{11} + 2\epsilon\sqrt{z}\,\Re(E_{b1}) + \dfrac{\epsilon^2}{S_{22}} + \epsilon^2\, \dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{S^{11}} + O_P(\epsilon^3).$ (A.14)

To establish Propositions 3 and 4, we start from (A.14). We neglect the second term $T_1 = 2\epsilon\sqrt{z}\,\Re(E_{b1})$, which is symmetric with mean zero and whose variance is much smaller than that of the first term. We also approximate the last term, denoted $T_2$, by its mean value, using Lemma 3. We now have

$\ell_1(\epsilon) \approx z S^{11} + \epsilon^2\left\{\dfrac{1}{S_{22}} + c(m, n_E)\right\},$

where $c(m, n_E)$ is the expectation from Lemma 3. Since $\ell_1(\epsilon)$ is the largest eigenvalue of $E^{-1}H_\epsilon = \epsilon^2 E^{-1}H$, the above expression should be divided by $\epsilon^2$ to obtain the largest eigenvalue of $E^{-1}H$. Doing so, and inserting the distributions of $S^{11}$ and $S_{22}$ from Lemma 2, gives

$\ell_1 \approx \dfrac{2z}{\epsilon^2\, \chi^2_{2n_E-2m+2}} + \dfrac{2\|b\|^2}{\chi^2_{2n_E-2m+4}} + \dfrac{m-1}{(n_E-m)(n_E-m+1)}.$

Next, by inserting the distributions of $\|b\|^2$ and $z$, and the relevant value of $\epsilon$, we get for Proposition 3

$\ell_1(\lambda) \approx (1+\lambda)\,\dfrac{\chi^2_{2n_H}}{\chi^2_{2n_E-2m+2}} + \dfrac{\chi^2_{2m-2}}{\chi^2_{2n_E-2m+4}} + \dfrac{m-1}{(n_E-m)(n_E-m+1)},$

and for Proposition 4

$\ell_1(\omega) \approx \dfrac{\chi^2_{2n_H}(2\omega)}{\chi^2_{2n_E-2m+2}} + \dfrac{\chi^2_{2m-2}}{\chi^2_{2n_E-2m+4}} + \dfrac{m-1}{(n_E-m)(n_E-m+1)}.$

From Lemma 2 and the independence of $b$ and $z$, all of the above $\chi^2$ random variables are independent. Finally, since ratios of independent $\chi^2$ random variables follow an F distribution, the two propositions follow. □
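A Monte Carlo experiment can probe the accuracy of the resulting approximation for Proposition 3. In this added sketch (the dimensions and the spike strength $\lambda$ are hypothetical choices, and the additive constant is taken from Lemma 3), the empirical mean of $\ell_1(E^{-1}H)$ is compared with the mean of the chi-squared ratio representation above:

```python
import numpy as np

rng = np.random.default_rng(2)
m, nH, nE, lam, reps = 4, 10, 40, 40.0, 20_000

def cgauss(shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# largest eigenvalue of E^{-1} H with H ~ CW_m(nH, I + lam e1 e1^*), E ~ CW_m(nE, I)
sqrt_cov = np.eye(m); sqrt_cov[0, 0] = np.sqrt(1 + lam)
ell = np.empty(reps)
for r in range(reps):
    Xh = sqrt_cov @ cgauss((m, nH))
    Xe = cgauss((m, nE))
    H, E = Xh @ Xh.conj().T, Xe @ Xe.conj().T
    ell[r] = np.linalg.eigvals(np.linalg.solve(E, H)).real.max()

# samples from the chi-squared ratio representation derived above
approx = ((1 + lam) * rng.chisquare(2 * nH, reps) / rng.chisquare(2 * nE - 2 * m + 2, reps)
          + rng.chisquare(2 * m - 2, reps) / rng.chisquare(2 * nE - 2 * m + 4, reps)
          + (m - 1) / ((nE - m) * (nE - m + 1)))

assert abs(ell.mean() - approx.mean()) < 0.05 * approx.mean()
```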

Proof of Proposition 5. By (8), the non-centrality parameter $\omega$ depends on the data only through $X^* X$. Conditioning on $X^* X$, following (7), we invoke Proposition 4 with the parameters $m = p$, $n_H = q$, and $n_E = n - q$ to obtain

$\ell_1(E^{-1}H)\,\big|\,X \approx a_1 F_{b_1, c_1}\!\left(\dfrac{2\rho^2}{1-\rho^2}\,(X^* X)_{11}\right) + a_2 F_{b_2, c_2} + a_3.$

Now the final result follows by integrating over the distribution of $(X^* X)_{11} \sim \tfrac{1}{2}\chi^2_{2n}$, and using the definition of $F^{\chi}_{a,b}(c, n)$ given in (9). □

Proof of Propositions 6 and 7. Let us assume without loss of generality that $v = e_1$. If $\hat{v}$ is not normalized, then we can write (10) as $R = |\hat{v}^* e_1|^2/\|\hat{v}\|^2$. From Lemma 1, we have

$\hat{v} = w_0 + \sigma w_1 + \sigma^3 w_3 + \cdots,$

where

$w_0 = e_1, \qquad w_1 = \dfrac{1}{\|u\|}\binom{0}{v_1}, \qquad w_3 = \dfrac{1}{\|u\|^3}\binom{0}{\sum_{j=2}^{n_H} v_j v_j^* v_1},$

with i.i.d. vectors $v_j \sim \mathcal{CN}(0, I_{m-1})$, all independent of $u \in \mathbb{C}^{n_H}$, where

$u \sim \begin{cases} \mathcal{CN}(0, (\lambda+\sigma^2) I_{n_H}), & \text{for } H \sim \mathcal{CW}_m(n_H, \lambda e_1 e_1^* + \sigma^2 I_m), \\ \mathcal{CN}(\mu, \sigma^2 I_{n_H}), & \text{for } H \sim \mathcal{CW}_m(n_H, \sigma^2 I_m, (\omega/\sigma^2) e_1 e_1^*), \end{cases}$

and $\|\mu\|^2 = \omega$. Therefore,

$R = \dfrac{1}{1 + \sigma^2\, \dfrac{\|v_1\|^2}{\|u\|^2} + 2\sigma^4\, \dfrac{\sum_{j=2}^{n_H} |v_1^* v_j|^2}{\|u\|^4} + O_P(\sigma^6)}.$

The result follows from the distributions of these quantities. □
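The leading-order behavior of $R$ can be checked directly on a single realization. The following added sketch (parameter values are our own) computes $R$ from the exact leading eigenvector of $H$ and compares it with the $O(\sigma^2)$ truncation above:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, sigma = 5, 30, 1e-2

# one draw of u and Xi = [xi_1, ..., xi_n]; the comparison is deterministic given these
u = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
Xi = (rng.standard_normal((m - 1, n)) + 1j * rng.standard_normal((m - 1, n))) / np.sqrt(2)
X = np.vstack([u[None, :], sigma * Xi])
H = X @ X.conj().T

vhat = np.linalg.eigh(H)[1][:, -1]            # unit-norm leading eigenvector
R = np.abs(vhat[0]) ** 2                      # R = |vhat^* e1|^2 / ||vhat||^2

v1 = Xi @ u.conj() / np.linalg.norm(u)        # the vector v_1 (= b) of the proof
R_lead = 1.0 / (1.0 + sigma ** 2 * np.linalg.norm(v1) ** 2 / np.linalg.norm(u) ** 2)
assert abs(R - R_lead) < 1e-6                 # the next correction enters at O(sigma^4)
```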

Proof of Proposition 8. Let us rewrite (A.8) as follows:

$\hat{v}(\epsilon) = w_0 + w_1 \epsilon + O_P(\epsilon^2),$

where $w_0 = E^{-1}e_1$ and $w_1 = \frac{1}{\sqrt{z}}\big(E^{-1}\hat{b} - \frac{\overline{E_{b1}}}{e_1^* E^{-1} e_1}E^{-1}e_1\big)$. For convenience, decompose the matrices $E$ and $E^{-1}$ as

$E = \begin{pmatrix} E_{11} & E_{12}^* \\ E_{12} & E_{22} \end{pmatrix}, \qquad E^{-1} = \begin{pmatrix} E^{11} & (E^{12})^* \\ E^{12} & E^{22} \end{pmatrix},$ (A.15)

where $E_{11} \in \mathbb{R}$, $E_{12} \in \mathbb{C}^{(m-1)\times 1}$, and $E_{22} \in \mathbb{C}^{(m-1)\times(m-1)}$. Consequently, $E^{11} = 1/(E_{11} - E_{12}^* E_{22}^{-1} E_{12}) \in \mathbb{R}$ and $E^{12} = -E^{11} E_{22}^{-1} E_{12} \in \mathbb{C}^{(m-1)\times 1}$; in this notation, $e_1^* E^{-1} e_1 = E^{11}$. The exact form of $E^{22}$ is unimportant, as it does not affect our calculations.

Let us now focus on the numerator of $R$. Since $e_1^* w_1 = 0$, we have

$\hat{v}^* e_1 = E^{11} + O_P(\epsilon^2),$

from which we obtain

$|\hat{v}^* e_1|^2 = (E^{11})^2 + O_P(\epsilon^2).$

The denominator of $R$ can be written as

$\|\hat{v}\|^2 = e_1^* (E^{-1})^2 e_1 + \dfrac{2}{\sqrt{z}}\,\Re\{e_1^* (E^{-1})^2 \hat{b}\}\,\epsilon - \dfrac{2}{\sqrt{z}\,E^{11}}\,\Re\{E_{b1}\}\, e_1^* (E^{-1})^2 e_1\, \epsilon + O_P(\epsilon^2).$

Using the decomposition of $E^{-1}$ given in (A.15), we get

$\|\hat{v}\|^2 = (E^{11})^2 + \|E^{12}\|^2 + \dfrac{2}{\sqrt{z}}\,\Re\{(E^{12})^* (E^{11} I_{m-1} + E^{22})\, b\}\,\epsilon - \dfrac{2}{\sqrt{z}\,E^{11}}\,\Re\{E_{b1}\}\,\{(E^{11})^2 + \|E^{12}\|^2\}\,\epsilon + O_P(\epsilon^2).$

Now we can conveniently express $R$ as

$R = \dfrac{(E^{11})^2}{(E^{11})^2 + \|E^{12}\|^2} + \dfrac{(E^{11})^2}{\{(E^{11})^2 + \|E^{12}\|^2\}^2}\,(P_E - Q_E)\,\epsilon + O_P(\epsilon^2),$

where

$P_E = \dfrac{2}{\sqrt{z}\,E^{11}}\,\{(E^{11})^2 + \|E^{12}\|^2\}\,\Re\{E_{b1}\}, \qquad Q_E = \dfrac{2}{\sqrt{z}}\,\Re\{(E^{12})^* (E^{11} I_{m-1} + E^{22})\, b\}.$

Since $P_E$ and $Q_E$ are zero-mean random variables, we neglect them to obtain

$R \approx \dfrac{1}{1 + \|E_{22}^{-1}E_{12}\|^2},$

where we have used the relation $E^{12} = -E^{11} E_{22}^{-1} E_{12} \in \mathbb{C}^{(m-1)\times 1}$. Noting that $E_{22}^{-1}E_{12}\,|\,E_{22} \sim \mathcal{CN}(0, E_{22}^{-1})$ with $E_{22} \sim \mathcal{CW}_{m-1}(n_E, I_{m-1})$, we can show that $1/(1 + \|E_{22}^{-1}E_{12}\|^2)$ is beta distributed with parameters $n_E - m + 2$ and $m - 1$. Now the final result follows from the observation that, for independent $X \sim \chi^2_p$ and $Y \sim \chi^2_q$, $X/(X + Y)$ is beta distributed with parameters $p/2$ and $q/2$. □
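The beta law claimed for $R$ is straightforward to verify by simulation. A minimal added sketch (with hypothetical $m$ and $n_E$) compares the empirical mean of $1/(1 + \|E_{22}^{-1}E_{12}\|^2)$ with the mean $a/(a+b)$ of a Beta$(n_E - m + 2, m - 1)$ variate:

```python
import numpy as np

rng = np.random.default_rng(3)
m, nE, reps = 4, 20, 50_000

Rs = np.empty(reps)
for r in range(reps):
    X = (rng.standard_normal((m, nE)) + 1j * rng.standard_normal((m, nE))) / np.sqrt(2)
    E = X @ X.conj().T
    w = np.linalg.solve(E[1:, 1:], E[1:, 0])  # w = E_22^{-1} E_12
    Rs[r] = 1.0 / (1.0 + np.linalg.norm(w) ** 2)

a, b = nE - m + 2, m - 1                      # claimed Beta(a, b) parameters
assert abs(Rs.mean() - a / (a + b)) < 0.005   # Beta(a, b) has mean a / (a + b)
```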

Appendix B. Proof of Auxiliary Lemmas

Proof of Lemma 1. Write the $m\times n$ matrix $X(\epsilon) = [x_1, \ldots, x_n]$ and observe that $X(-\epsilon) = U X(\epsilon)$, where $U = \mathrm{diag}(1, -1, \ldots, -1)$ is an orthogonal matrix. Thus, $H(-\epsilon) = U H(\epsilon) U$ has the same eigenvalues as $H(\epsilon)$. In particular, the largest eigenvalue $\ell_1$ and its corresponding eigenvector $v_1$ satisfy

$\ell_1(-\epsilon) = \ell_1(\epsilon), \qquad v_1(-\epsilon) = U v_1(\epsilon).$ (B.1)

Hence $\ell_1$ and the first component of $v_1$ are even functions of $\epsilon$, whereas the remaining components of $v_1$ are odd.

We decompose the matrix $H(\epsilon) = \sum_{j=1}^n x_j x_j^*$ as

$H(\epsilon) = \sum_{j=1}^n (u_j e_1 + \epsilon\hat\xi_j)(u_j e_1 + \epsilon\hat\xi_j)^* = \sum_{j=1}^n |u_j|^2 e_1 e_1^* + \epsilon\sum_{j=1}^n \left[\bar{u}_j \hat\xi_j e_1^* + u_j e_1 \hat\xi_j^*\right] + \epsilon^2 \sum_{j=1}^n \hat\xi_j \hat\xi_j^*$
$= \begin{pmatrix} z & 0 \\ 0 & 0_{m-1} \end{pmatrix} + \epsilon\sqrt{z}\begin{pmatrix} 0 & b^* \\ b & 0_{m-1} \end{pmatrix} + \epsilon^2\begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix} = A_0 + \epsilon A_1 + \epsilon^2 A_2,$

where the matrices $A_0$, $A_1$ and $A_2$ are given in (A.5). Following arguments similar to those leading to (A.7) and (A.8), with $E = I$, we can establish that $\ell_1(\epsilon)$ and $v_1(\epsilon)$ are analytic in some neighborhood of zero. Therefore, we have the following Taylor series expansions:

$\ell_1(\epsilon) = \lambda_0 + \epsilon^2\lambda_2 + \epsilon^4\lambda_4 + \cdots \quad\text{and}\quad v_1(\epsilon) = w_0 + \epsilon w_1 + \epsilon^2 w_2 + \epsilon^3 w_3 + \epsilon^4 w_4 + \cdots$ (B.2)

Also, the eigenprojection $P(\epsilon)$ of $\ell_1$ satisfies

$v_1(\epsilon) = \dfrac{1}{e_1^* P(\epsilon) e_1}\, P(\epsilon)\, e_1.$ (B.3)

Inserting the expansions (B.2) into the eigenvalue equation $H v_1 = \ell_1 v_1$ gives the following set of equations for $r \geq 0$:

$A_0 w_r + A_1 w_{r-1} + A_2 w_{r-2} = \lambda_0 w_r + \lambda_2 w_{r-2} + \lambda_4 w_{r-4} + \cdots$ (B.4)

with the convention that vectors with negative subscripts are zero. From the $r = 0$ equation, $A_0 w_0 = \lambda_0 w_0$, we readily find that

$\lambda_0 = z, \qquad w_0 = \mathrm{const}\cdot e_1.$

Eq. (B.3) implies that $e_1^* v_1 = 1$, so that $w_0 = v_1(0) = e_1$ and, for $j \geq 1$, $w_j$ is orthogonal to $e_1$, that is, orthogonal to $w_0$.

From the eigenvector remarks following (B.1), it follows that $w_{2j} = 0$ for $j \geq 1$. These remarks allow considerable simplification of (B.4); we use it for $r = 1$ and $r = 3$:

$A_1 w_0 = \lambda_0 w_1, \qquad A_2 w_1 = \lambda_0 w_3 + \lambda_2 w_1,$ (B.5)

from which we obtain

$w_1 = z^{-1/2}\,\hat{b}, \qquad w_3 = \lambda_0^{-1}(A_2 - \lambda_2 I)\, w_1.$ (B.6)

Multiplying (B.4) on the left by $w_0^*$ and using the first equation of (B.5), we obtain, for $r$ even,

$\lambda_r = (A_1 w_0)^* w_{r-1} = \lambda_0\, w_1^* w_{r-1},$

and hence

$\lambda_2 = \lambda_0\, w_1^* w_1 = b^* b \qquad\text{and}\qquad \lambda_4 = w_1^*(A_2 - \lambda_2 I)\, w_1 = z^{-1}\, b^*(Z - bb^*)\, b.$

Therefore, we can further simplify (B.6) to yield

$w_1 = z^{-1/2}\,\hat{b}, \qquad w_3 = z^{-3/2}(A_2 - \|b\|^2 I)\,\hat{b} = z^{-3/2}\binom{0}{Zb - \|b\|^2 b}.$

To prove Lemmas 2 and 3, we shall use the following two claims, which are the complex-valued analogues of Theorems 3.2.10 and 3.2.11 in Muirhead [32]. While their proofs are similar to those in the real-valued case, for completeness we present them below.

Claim 1. Suppose $A \sim \mathcal{CW}_m(n, \Sigma)$ with $n > m-1$, where $A$ and $\Sigma$ are partitioned as follows:

$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$

with $A_{11}, \Sigma_{11} \in \mathbb{C}^{k\times k}$, and let $A_{11\cdot 2} = A_{11} - A_{12}A_{22}^{-1}A_{21}$ and $\Sigma_{11\cdot 2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$. Then $A_{11\cdot 2}$ is distributed as $\mathcal{CW}_k(n - m + k, \Sigma_{11\cdot 2})$ and is independent of $A_{12}$, $A_{21}$ and $A_{22}$.
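Claim 1 implies, in particular, that for $\Sigma = I$ the Schur complement $A_{11\cdot 2}$ has mean $(n-m+k)\,I_k$. The following added sketch (all parameter values are hypothetical) checks this first-moment consequence by simulation:

```python
import numpy as np

rng = np.random.default_rng(7)
m, k, n, reps = 5, 2, 10, 40_000

# empirical mean of A_{11.2} for A ~ CW_m(n, I); Claim 1 predicts (n - m + k) I_k
acc = np.zeros((k, k), dtype=complex)
for _ in range(reps):
    X = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    A = X @ X.conj().T
    A112 = A[:k, :k] - A[:k, k:] @ np.linalg.solve(A[k:, k:], A[k:, :k])
    acc += A112

emp = acc / reps
assert np.max(np.abs(emp - (n - m + k) * np.eye(k))) < 0.1
```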

Claim 2. Let $A \sim \mathcal{CW}_m(n, \Sigma)$ and let $M$ be a $k\times m$ matrix of rank $k$, independent of $A$. Then $(M A^{-1} M^*)^{-1} \sim \mathcal{CW}_k(n - m + k, (M \Sigma^{-1} M^*)^{-1})$.

Proof of Claim 1. Let $C = \Sigma^{-1}$. We partition it as follows:

$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$ (B.7)

where $C_{11} \in \mathbb{C}^{k\times k}$, $C_{22} \in \mathbb{C}^{(m-k)\times(m-k)}$, and $C_{12} \in \mathbb{C}^{k\times(m-k)}$ with $C_{12} = C_{21}^*$. Consequently, $\Sigma_{11\cdot 2}^{-1} = C_{11}$.

Following [15, 19], the density of $A$ is given by

$f(A) = \dfrac{\det^{n-m}(A)}{\tilde\Gamma_m(n)\,\det^n(\Sigma)}\, e^{-\mathrm{tr}(\Sigma^{-1}A)},$ (B.8)

where $\mathrm{tr}(\cdot)$ denotes the trace operator and

$\tilde\Gamma_m(n) = \pi^{m(m-1)/2}\prod_{j=1}^m \Gamma(n - j + 1),$

with $\Gamma(\cdot)$ denoting the classical gamma function.

To prove the claim, we shall study the form of $\det(A)$ and of $\mathrm{tr}(\Sigma^{-1}A)$. First of all, we have that

$\det(A) = \det(A_{22})\,\det(A_{11\cdot 2}).$

Next, we introduce a change of variables from the entries of the matrix $A$ to $A_{11\cdot 2} = A_{11} - A_{12}A_{22}^{-1}A_{21}$, $B_{12} = A_{12}$, $B_{22} = A_{22}$. The Jacobian of this transformation is an upper triangular matrix with all diagonal entries equal to one. Hence, the volume element in (B.8) is $dA = dA_{11}\,dA_{12}\,dA_{22} = dA_{11\cdot 2}\,dB_{12}\,dB_{22}$. Furthermore, using the expansion

$\mathrm{tr}(\Sigma^{-1}A) = \mathrm{tr}\!\left[\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}\begin{pmatrix} A_{11\cdot 2} + B_{12}B_{22}^{-1}B_{21} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}\right] = \mathrm{tr}(C_{11}A_{11\cdot 2}) + \mathrm{tr}(C_{11}B_{12}B_{22}^{-1}B_{21}) + \mathrm{tr}(C_{12}B_{21}) + \mathrm{tr}(C_{21}B_{12}) + \mathrm{tr}(C_{22}B_{22}),$

along with the fact that $B_{21} = B_{12}^*$, yields that

$f(A_{11\cdot 2}, B_{12}, B_{22}) = \dfrac{\det^{n-m}(B_{22})\,\det^{n-m}(A_{11\cdot 2})}{\det^n(\Sigma_{22})\,\det^n(\Sigma_{11\cdot 2})\,\tilde\Gamma_m(n)}\, e^{-\mathrm{tr}(\Sigma_{11\cdot 2}^{-1}A_{11\cdot 2}) - \mathrm{tr}(\Sigma_{11\cdot 2}^{-1}B_{12}B_{22}^{-1}B_{12}^*)} \times e^{-\mathrm{tr}(C_{12}B_{21}) - \mathrm{tr}(C_{21}B_{12}) - \mathrm{tr}(C_{22}B_{22})}.$ (B.9)

Now we may use the decomposition

$\tilde\Gamma_m(n) = \pi^{k(k-1)/2}\prod_{j=1}^k \Gamma(n-m+k-j+1) \times \pi^{(m-k)(m+k-1)/2}\prod_{j=1}^{m-k}\Gamma(n-j+1) = \tilde\Gamma_k(n-m+k)\times \pi^{(m-k)(m+k-1)/2}\prod_{j=1}^{m-k}\Gamma(n-j+1)$

to rewrite (B.9) as

$f(A_{11\cdot 2}, B_{12}, B_{22}) = f_1(A_{11\cdot 2})\times f_2(B_{12}, B_{22}),$ (B.10)

where

$f_1(A_{11\cdot 2}) = \dfrac{\det^{(n-m+k)-k}(A_{11\cdot 2})}{\det^{n-m+k}(\Sigma_{11\cdot 2})\,\tilde\Gamma_k(n-m+k)}\, e^{-\mathrm{tr}(\Sigma_{11\cdot 2}^{-1}A_{11\cdot 2})},$ (B.11)

and

$f_2(B_{12}, B_{22}) = \dfrac{\det^{n-m}(B_{22})}{\pi^{(m-k)(m+k-1)/2}\prod_{j=1}^{m-k}\Gamma(n-j+1)\,\det^{m-k}(\Sigma_{11\cdot 2})\,\det^n(\Sigma_{22})} \times e^{-\mathrm{tr}(\Sigma_{11\cdot 2}^{-1}B_{12}B_{22}^{-1}B_{12}^*) - \mathrm{tr}(C_{12}B_{21}) - \mathrm{tr}(C_{21}B_{12}) - \mathrm{tr}(C_{22}B_{22})}.$

The factorization in (B.10) establishes that $A_{11\cdot 2}$ is independent of $A_{12}$ and $A_{22}$. Finally, (B.11) implies that $A_{11\cdot 2} \sim \mathcal{CW}_k(n-m+k, \Sigma_{11\cdot 2})$, which concludes the proof. □

Proof of Claim 2. Set $B = \Sigma^{-1/2}A\Sigma^{-1/2}$, so that $B \sim \mathcal{CW}_m(n, I)$. For $R = M\Sigma^{-1/2}$, we have $(MA^{-1}M^*)^{-1} = (RB^{-1}R^*)^{-1}$ and $(M\Sigma^{-1}M^*)^{-1} = (RR^*)^{-1}$. Thus, it suffices to prove that $(RB^{-1}R^*)^{-1} \sim \mathcal{CW}_k(n-m+k, (RR^*)^{-1})$. Let $R = L[I_k : 0]H^*$, where $L$ is $k\times k$ and nonsingular and $H$ is $m\times m$ unitary; such a decomposition exists since $R$ has rank $k$. Now,

$(RB^{-1}R^*)^{-1} = \left(L[I_k:0]H^* B^{-1} H [I_k:0]^* L^*\right)^{-1} = (L^*)^{-1}\left([I_k:0](H^* B H)^{-1}[I_k:0]^*\right)^{-1} L^{-1} = (L^*)^{-1}\left([I_k:0]\, C^{-1}\,[I_k:0]^*\right)^{-1} L^{-1},$

where $C = H^* B H \sim \mathcal{CW}_m(n, I)$. Let

$F = C^{-1} = \begin{pmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{pmatrix}, \qquad C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$

where $F_{11}$ and $C_{11}$ are $k\times k$. Then $(RB^{-1}R^*)^{-1} = (L^*)^{-1}F_{11}^{-1}L^{-1}$, and since $F_{11}^{-1} = C_{11} - C_{12}C_{22}^{-1}C_{21}$, it follows from Claim 1 that $F_{11}^{-1} \sim \mathcal{CW}_k(n-m+k, I_k)$. Hence $(L^*)^{-1}F_{11}^{-1}L^{-1} \sim \mathcal{CW}_k(n-m+k, (LL^*)^{-1})$, and since $(LL^*)^{-1} = (RR^*)^{-1}$, the proof is complete. □
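As with Claim 1, the conclusion of Claim 2 can be probed through first moments: for $\Sigma = I$, the matrix $(MA^{-1}M^*)^{-1}$ should have mean $(n-m+k)(MM^*)^{-1}$. A minimal added sketch (with an arbitrary fixed $M$ and hypothetical dimensions):

```python
import numpy as np

rng = np.random.default_rng(5)
m, k, n, reps = 5, 2, 12, 40_000

M = np.zeros((k, m)); M[0, 0], M[1, 1] = 1.0, 2.0     # a fixed rank-k matrix
target = (n - m + k) * np.linalg.inv(M @ M.T)         # mean of CW_k(n-m+k, (M M^*)^{-1})

acc = np.zeros((k, k), dtype=complex)
for _ in range(reps):
    X = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    A = X @ X.conj().T                                 # A ~ CW_m(n, I)
    acc += np.linalg.inv(M @ np.linalg.inv(A) @ M.conj().T)

emp = acc / reps
assert np.max(np.abs(emp - target)) < 0.15
```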

Proof of Lemma 2. Note that $S^{11} = e_1^* E^{-1} e_1$. Then, by Claim 2 (applied with $M = e_1^*$ and $k = 1$), $(S^{11})^{-1} \sim \mathcal{CW}_1(n_E - m + 1, 1) = \tfrac{1}{2}\chi^2_{2n_E-2m+2}$, meaning $S^{11} \sim 2/\chi^2_{2n_E-2m+2}$. Next, by definition $S = (M^* E^{-1} M)^{-1}$, with $M$ fixed. Thus, by the same claim, $S \sim \mathcal{CW}_2(n_E - m + 2, D)$, from which we obtain $S_{22} \sim \chi^2_{2n_E-2m+4}/(2\|b\|^2)$. Finally, since $(S^{11})^{-1} = S_{11} - S_{12}S_{22}^{-1}S_{21}$, by Claim 1, $(S^{11})^{-1}$ is independent of $S_{22}$. □

Proof of Lemma 3. First we decompose the expectation as follows:

$\mathbb{E}\!\left(\dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{e_1^* E^{-1} e_1}\right) = \mathbb{E}_E\!\left\{\mathbb{E}_{A_2|E}\!\left(\dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{e_1^* E^{-1} e_1}\right)\right\}.$

Next, since $A_2$ is independent of $E$,

$\mathbb{E}(A_2 \,|\, E) = \mathbb{E}(A_2) = \begin{pmatrix} 0 & 0 \\ 0 & I_{m-1} \end{pmatrix}.$

Combining the above two equations gives that

$\mathbb{E}\!\left(\dfrac{e_1^* E^{-1} A_2 E^{-1} e_1}{e_1^* E^{-1} e_1}\right) = \mathbb{E}\!\left(\dfrac{\sum_{j=2}^m |E^{1j}|^2}{E^{11}}\right) = (m-1)\,\mathbb{E}\!\left(\dfrac{|E^{12}|^2}{E^{11}}\right).$

To compute this expectation, consider the matrix $S^{-1} = [e_1, e_2]^* E^{-1} [e_1, e_2] = \begin{pmatrix} E^{11} & E^{12} \\ \overline{E^{12}} & E^{22} \end{pmatrix}$. Since $S^{22} = E^{22}$ and $S_{22} = E^{11}/(E^{11}E^{22} - |E^{12}|^2)$, we have

$\dfrac{1}{S_{22}} = E^{22} - \dfrac{|E^{12}|^2}{E^{11}}.$ (B.12)

Noting that $S_{22} \sim \tfrac{1}{2}\chi^2_{2n_E-2m+4}$ and $E^{22} \sim 2/\chi^2_{2n_E-2m+2}$, we take the expectation of both sides of (B.12) to obtain

$\mathbb{E}\!\left\{\dfrac{|E^{12}|^2}{E^{11}}\right\} = \mathbb{E}(E^{22}) - \mathbb{E}\!\left(\dfrac{2}{\chi^2_{2n_E-2m+4}}\right) = \dfrac{1}{n_E-m} - \dfrac{1}{n_E-m+1} = \dfrac{1}{(n_E-m)(n_E-m+1)},$

which completes the proof. □
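Lemma 3 (taking $Z = I_{m-1}$, for which the inner conditional expectation is exact) can be verified by direct simulation; the sketch below is our added illustration with hypothetical $m$ and $n_E$:

```python
import numpy as np

rng = np.random.default_rng(9)
m, nE, reps = 4, 12, 100_000

vals = np.empty(reps)
for r in range(reps):
    X = (rng.standard_normal((m, nE)) + 1j * rng.standard_normal((m, nE))) / np.sqrt(2)
    Einv = np.linalg.inv(X @ X.conj().T)      # E^{-1} with E ~ CW_m(nE, I)
    # sum_{j >= 2} |E^{1j}|^2 / E^{11}, i.e. the quantity in Lemma 3 with Z = I
    vals[r] = np.sum(np.abs(Einv[0, 1:]) ** 2) / Einv[0, 0].real

target = (m - 1) / ((nE - m) * (nE - m + 1))
assert abs(vals.mean() - target) < 0.05 * target
```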


References

• [1] Anderson TW, An introduction to multivariate statistical analysis, Wiley, New York, third edition, 2003.
• [2] Asendorf N, Nadakuditi RR, Improved detection of correlated signals in low-rank-plus-noise type data sets using informative canonical correlation analysis (ICCA), IEEE Trans. Inform. Theory 63 (2017) 3451–3467.
• [3] Baik J, Ben Arous G, Péché S, Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices, Ann. Probab. (2005) 1643–1697.
• [4] Chiani M, Distribution of the largest eigenvalue for real Wishart and Gaussian random matrices and a simple approximation for the Tracy-Widom distribution, J. Mult. Anal. 129 (2014) 68–81.
• [5] Chiani M, Distribution of the largest root of a matrix for Roy's test in multivariate analysis of variance, J. Mult. Anal. 143 (2016) 467–471.
• [6] Chiani M, On the probability that all eigenvalues of Gaussian, Wishart, and double Wishart random matrices lie within an interval, IEEE Trans. Inform. Theory 63 (2017) 4521–4531.
• [7] Correa NM, Adali T, Li YO, Calhoun VD, Canonical correlation analysis for data fusion and group inferences, IEEE Sig. Proc. Magazine 27 (2010) 39–50.
• [8] Dharmawansa P, Johnstone IM, Onatski A, Local asymptotic normality of the spectrum of high-dimensional spiked F-ratios, arXiv preprint arXiv:1411.3875 (2014).
• [9] Edelman A, Eigenvalues and condition numbers of random matrices, SIAM J. Matrix Anal. Appl. 9 (1988) 543–560.
• [10] El Karoui N, A rate of convergence result for the largest eigenvalue of complex white Wishart matrices, Ann. Probab. 34 (2006) 2077–2117.
• [11] Ge H, Kirsteins IP, Wang X, Does canonical correlation analysis provide reliable information on data correlation in array processing?, in: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2113–2116.
• [12] Gogineni S, Setlur P, Rangaswamy M, Nadakuditi RR, Random matrix theory inspired passive bistatic radar detection of low-rank signals, in: IEEE Radar Conference, pp. 1656–1659.
• [13] Gogineni S, Setlur P, Rangaswamy M, Nadakuditi RR, Comparison of passive radar detectors with noisy reference signal, in: IEEE Statistical Signal Processing Workshop (SSP), pp. 1–5.
• [14] Gogineni S, Setlur P, Rangaswamy M, Nadakuditi RR, Passive radar detection with noisy reference signal using measured data, in: IEEE Radar Conference, pp. 858–861.
• [15] Goodman NR, Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction), Ann. Math. Statist. 34 (1963) 152–177.
• [16] Hansen J, Bölcskei H, A geometrical investigation of the rank-1 Ricean MIMO channel at high SNR, in: Intl. Symp. on Inform. Theory, IEEE, p. 64.
• [17] Haykin S, Cognitive radio: brain-empowered wireless communications, IEEE J. Sel. Areas Commun. 23 (2005) 201–220.
• [18] Haykin S, Moher M, Communication systems, Wiley, New York, 5th edition, 2009.
• [19] James AT, Distributions of matrix variates and latent roots derived from normal samples, Ann. Math. Statist. 35 (1964) 475–501.
• [20] Johansson K, Shape fluctuations and random matrices, Comm. Math. Phys. 209 (2000) 437–476.
• [21] Johnstone IM, On the distribution of the largest eigenvalue in principal components analysis, Ann. Statist. 29 (2001) 295–327.
• [22] Johnstone IM, Approximate null distribution of the largest root in multivariate analysis, Ann. Appl. Statist. 3 (2009) 1616–1633.
• [23] Johnstone IM, Nadler B, Roy's largest root test under rank-one alternatives, Biometrika 104 (2017) 181–193.
• [24] Kang M, Alouini M-S, Largest eigenvalue of complex Wishart matrices and performance analysis of MIMO MRC systems, IEEE J. Sel. Areas Commun. 21 (2003) 418–426.
• [25] Kato T, Perturbation theory for linear operators, Springer, Berlin, second edition, 1995.
• [26] Khalid MU, Seghouane AK, Improving functional connectivity detection in FMRI by combining sparse dictionary learning and canonical correlation analysis, in: IEEE 10th Intl. Symp. on Biomedical Imaging, pp. 286–289.
• [27] Khatri C, Distribution of the largest or the smallest characteristic root under null hypothesis concerning complex multivariate normal populations, Ann. Math. Statist. 35 (1964) 1807–1810.
• [28] Khatri C, Non-central distributions of the i-th largest characteristic roots of three matrices concerning complex multivariate normal populations, Ann. I. Stat. Math. 21 (1969) 23–32.
• [29] Kritchman S, Nadler B, Non-parametric detection of the number of signals: hypothesis testing and random matrix theory, IEEE Trans. Signal Process. 57 (2009) 3930–3941.
• [30] Lin D, Zhang J, Li J, Calhoun V, Wang YP, Identifying genetic connections with brain functions in schizophrenia using group sparse canonical correlation analysis, in: IEEE 10th Intl. Symp. Biomedical Imaging, pp. 278–281.
• [31] Ma Z, Accuracy of the Tracy-Widom limits for the extreme eigenvalues in white Wishart matrices, Bernoulli 18 (2012) 322–359.
• [32] Muirhead RJ, Aspects of multivariate statistical theory, Wiley, New York, 1982.
• [33] Nadakuditi RR, Silverstein JW, Fundamental limit of sample generalized eigenvalue based detection of signals in noise using relatively few signal-bearing and noise-only samples, IEEE J. Sel. Topics Sig. Proc. 4 (2010) 468–480.
• [34] Paul D, Asymptotics of sample eigenstructure for a large dimensional spiked covariance model, Statist. Sinica 17 (2007) 1617–1642.
• [35] Pezeshki A, Scharf LL, Azimi-Sadjadi MR, Lundberg M, Empirical canonical correlation analysis in subspaces, in: Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, volume 1, pp. 994–997.
• [36] Ratnarajah T, Vaillancourt R, Alvo M, Eigenvalues and condition numbers of complex random matrices, SIAM J. Matrix Anal. Appl. 26 (2004) 441–456.
• [37] Ratnarajah T, Vaillancourt R, Alvo M, Complex random matrices and Rician channel capacity, Problems of Information Transmission 41 (2005) 1–22.
• [38] Roy SN, On a heuristic method of test construction and its use in multivariate analysis, Ann. Math. Statist. 24 (1953) 220–238.
• [39] Roy SN, Some aspects of multivariate analysis, Wiley, New York, 1957.
• [40] Scharf L, Thomas JK, Wiener filters in canonical coordinates for transform coding, filtering, and quantizing, IEEE Trans. Signal Process. 46 (1998) 647–654.
• [41] Sugiyama T, Distributions of the largest latent root of the multivariate complex Gaussian distribution, Ann. I. Stat. Math. 24 (1972) 87–94.
• [42] Van Trees HL, Optimum array processing: Part IV of detection, estimation, and modulation theory, John Wiley & Sons, New York, 2002.
• [43] Wage KE, Buck JR, Snapshot performance of the dominant mode rejection beamformer, IEEE J. Oceanic Eng. 39 (2014) 212–225.
• [44] Zanella A, Chiani M, Win MZ, On the marginal distribution of the eigenvalues of Wishart matrices, IEEE Trans. Commun. 57 (2009) 1050–1060.
• [45] Zeng Y, Liang Y-C, Eigenvalue-based spectrum sensing algorithms for cognitive radio, IEEE Trans. Commun. 57 (2009) 1784–1793.
• [46] Zeng Y, Liang Y-C, Hoang AT, Zhang R, A review on spectrum sensing for cognitive radio: challenges and solutions, EURASIP J. Adv. Sig. Pr. 2010 (2010) 381465.
