. Author manuscript; available in PMC: 2013 Nov 22.
Published in final edited form as: J Multivar Anal. 2010 Nov;101(10). doi: 10.1016/j.jmva.2010.04.015

A characterization of multivariate normality through univariate projections

Yongzhao Shao a,b, Ming Zhou b
PMCID: PMC3837532  NIHMSID: NIHMS213961  PMID: 24273352

Abstract

This paper introduces a new characterization of multivariate normality of a random vector based on univariate normality of linear combinations of its components.

Keywords: Goodness of fit, Linear combination of components, Marginal distribution, Multivariate normal distribution, Non-normality

1. Introduction

As is well known, the multivariate normal distribution is central to multivariate analysis. Therefore, characterizations and assessments of multivariate normality have attracted sustained interest from researchers, as demonstrated in the monographs and papers [1], [2], [3], [4], among others.

Commonly used assessments of multivariate normality or non-normality of a random vector include a variety of approaches based on linear combinations of variates. In particular, many types of univariate-based plots are both easy to make and simple to use for detecting skewness, outliers, and other departures from multivariate normality [3]. In addition, there exist many formal tests for multivariate normality of a random vector based on examination of selected linear combinations of its components [3, 5]. Indeed, as pointed out by Anderson [1, p. 23], “One of the reasons that the study of normal multivariate distributions is so useful is that marginal distributions and conditional distributions derived from multivariate normal distributions are also normal distributions. Moreover, linear combinations of multivariate normal variates are again normally distributed.”
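The closure property quoted above is easy to check numerically. The following sketch (all parameter values are our illustrative choices, not taken from the paper) samples a bivariate normal vector through its Cholesky factor and verifies that the projection uTX has mean uTμ and variance uTΣu:

```python
import math
import random

# Illustrative parameters (assumptions, not from the paper):
# X ~ N_2(mu, Sigma) with Sigma = [[2.0, 0.6], [0.6, 1.0]].
random.seed(0)
mu = (1.0, -2.0)
s11, s12, s22 = 2.0, 0.6, 1.0

# Cholesky factor L of Sigma (Sigma = L L^T), so X = mu + L Z, Z iid N(0,1).
l11 = math.sqrt(s11)
l21 = s12 / l11
l22 = math.sqrt(s22 - l21 * l21)

u = (0.8, 0.6)  # a unit vector: 0.8^2 + 0.6^2 = 1

n = 200_000
proj = []
for _ in range(n):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    x1 = mu[0] + l11 * z1
    x2 = mu[1] + l21 * z1 + l22 * z2
    proj.append(u[0] * x1 + u[1] * x2)

mean_hat = sum(proj) / n
var_hat = sum((p - mean_hat) ** 2 for p in proj) / n

# If X is multivariate normal, u^T X ~ N(u^T mu, u^T Sigma u).
mean_theory = u[0] * mu[0] + u[1] * mu[1]
var_theory = u[0] ** 2 * s11 + 2 * u[0] * u[1] * s12 + u[1] ** 2 * s22
print(mean_hat, mean_theory, var_hat, var_theory)
```

With 200,000 draws, both sample quantities agree with the theoretical values to within Monte Carlo error.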

On the other hand, it is well known that a non-normal random vector may have some normally distributed linear combinations of its components [2, 5]. This raises a serious question about the effectiveness of the common statistical practice of assessing multivariate normality by examining a few linear combinations of components; after all, only a few of the infinitely many linear combinations can be plotted or tested in practice. Therefore, it is of theoretical interest to characterize or measure the size of the set of normally distributed linear combinations. Probabilistically, one might ask how large the chance is that a randomly selected linear combination of components of a non-normal random vector is normally distributed. Indeed, this problem has attracted the attention of many researchers for a long time [6], [7], [8]. Remarkably, Hamedani and Tata [7] proved that a bivariate random vector is normally distributed if it has an infinite collection of distinct normally distributed linear combinations of its components. In particular, this result implies that a non-normal bivariate random vector can have only finitely many normally distributed linear combinations of its components. However, this characterization of bivariate normality cannot be extended to the multivariate case in a straightforward way [8]. The main objective of this paper is to introduce a new characterization of multivariate normality through univariate projections that holds in all dimensions. We show that, for any multivariate random variable, the set of normally distributed linear combinations of its components is negligible among all possible linear combinations. In particular, in any dimension, the probability is zero that a randomly selected linear combination of components of a non-normal random vector is normally distributed. This finding includes the existing bivariate result of Hamedani and Tata [7] as a corollary (see Remark 2 in Section 2 for more details).
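The phenomenon that a non-normal vector may have normally distributed projections can be made concrete with a standard counterexample (our illustration, not an example from the paper): the density f(x1, x2) = 2ϕ(x1)ϕ(x2) on the quadrants where x1x2 > 0, and 0 elsewhere, has exactly standard normal marginals, so both coordinate projections are normally distributed, yet the vector is not bivariate normal, as the kurtosis of the diagonal projection reveals:

```python
import random

# A non-normal bivariate density with standard normal marginals (a standard
# counterexample; our construction, not from the paper):
#   f(x1, x2) = 2 * phi(x1) * phi(x2) if x1 * x2 > 0, else 0.
# Sampling: draw z1, z2 iid N(0,1) and force x2 to share the sign of x1.
random.seed(1)

def kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 * m2)

n = 200_000
x1s, x2s, diag = [], [], []
for _ in range(n):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    x1 = z1
    x2 = abs(z2) if z1 > 0 else -abs(z2)
    x1s.append(x1)
    x2s.append(x2)
    diag.append(x1 + x2)

# Each marginal is exactly N(0, 1) (kurtosis 3), but the diagonal projection
# is non-normal: its kurtosis is (12 + 32/pi) / (2 + 4/pi)^2, about 2.07.
print(kurtosis(x1s), kurtosis(x2s), kurtosis(diag))
```

So the two coordinate directions are normally distributed projections of a vector that is nevertheless not bivariate normal.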
Given the eminent role of normal distributions in multivariate statistical analysis [1], [2], [3], the finding of this paper bears certain significance to the assessment of multivariate normality, thus might be of interest to many researchers.

In the next section, we establish a new characterization of multivariate normality for a random vector by assessing normality of linear combinations of its components. These linear combinations will also be called projections on the corresponding vectors of coefficients. Section 3 contains concluding remarks.

2. Main Results

The first subsection introduces the basic notation, the multivariate normal distribution, the normal directions, and a few lemmas. The proofs of these lemmas are rather elementary, but are included for completeness. The new characterization of multivariate normality can be found in the second subsection.

2.1. Notation and Lemmas

Let ℝp (p ≥ 1) be the p-dimensional Euclidean space. The inner product of two vectors x = (x1, …, xp)T and y = (y1, …, yp)T ∈ ℝp is denoted by $x^T y = \sum_{i=1}^{p} x_i y_i$. We use S = {u ∈ ℝp ∣ uTu = 1} to denote the unit sphere and m the Lebesgue measure in ℝp, i.e., m(A) denotes the Lebesgue measure of a measurable set A. Also, let Π denote the uniform measure on the unit sphere S, and ℕ the set of natural numbers.

We will say that a random vector X = (X1, …, Xp)T has a multivariate normal distribution if the support of X is the entire space ℝp and there exist a p-vector μ and a symmetric, positive-definite p × p matrix Σ such that the probability density function of X can be expressed as

$$f_X(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right),$$

where |Σ| is the determinant of Σ. The vector μ is the expected value and the matrix Σ is the covariance matrix of X. By the above definition, if a random vector has a p-variate normal distribution, it must have a density function and a non-singular covariance matrix. As is well known, given independent and identically distributed random observations, the mean vector μ and the covariance matrix Σ can be consistently estimated by their sample counterparts, the sample mean and the sample covariance matrix $S_n^2$, respectively. Moreover, given the existence of a p-variate Lebesgue density of X, the sample covariance matrix $S_n^2$ is non-singular almost surely [9], [10]. Therefore, knowledge of the mean vector μ and the covariance matrix Σ is not essential; indeed, both are commonly assumed unknown in statistics and many other applications. Throughout this paper, we consider a given random vector X = (X1, …, Xp)T ∈ ℝp possessing a density function f(x) relative to the Lebesgue measure m. In particular, we call a vector u = (u1, …, up)T ∈ S a normal direction of X (or of fX) if the one-dimensional projection of X on u, namely uTX, has a univariate normal distribution.
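As a sanity check on the density formula, the sketch below (with arbitrary illustrative parameters μ = 0 and Σ = [[2, 0.6], [0.6, 1]], our choices) evaluates fX for p = 2 and confirms numerically that it integrates to 1:

```python
import math

# Bivariate normal density from the formula above, for illustrative
# parameters mu = 0 and Sigma = [[2.0, 0.6], [0.6, 1.0]] (our choice).
s11, s12, s22 = 2.0, 0.6, 1.0
det = s11 * s22 - s12 * s12                       # |Sigma| = 1.64
i11, i12, i22 = s22 / det, -s12 / det, s11 / det  # entries of Sigma^{-1}

def f(x1, x2):
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) with mu = 0.
    q = i11 * x1 * x1 + 2 * i12 * x1 * x2 + i22 * x2 * x2
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

# Midpoint-rule integration over [-8, 8]^2; the tail mass outside this
# square is negligible for these parameters.
h = 0.05
steps = int(16 / h)
total = 0.0
for i in range(steps):
    x1 = -8 + (i + 0.5) * h
    for j in range(steps):
        x2 = -8 + (j + 0.5) * h
        total += f(x1, x2) * h * h

print(total)  # close to 1
```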

Note that when uTX is normally distributed, its moment generating function exists, and we denote its mean and variance by $\mu_u$ and $\sigma_u^2$, respectively. Thus uTX is normally distributed if and only if $E\{\exp(t\,u^T X)\} = \exp(\mu_u t + t^2 \sigma_u^2 / 2)$ with $\sigma_u^2 > 0$, for all t ∈ ℝ; or equivalently, in terms of the density f of X,

$$\int_{\mathbb{R}^p} \exp(t\,u^T x)\, f(x)\, dx = \exp(\mu_u t + t^2 \sigma_u^2 / 2), \quad \sigma_u^2 > 0, \ \text{for all } t \in \mathbb{R}. \tag{1}$$

Let G be the set of lines in ℝp that lie on normal directions of X and pass through the origin, that is,

$$G = \{u \in \mathbb{R}^p \mid u^T X \text{ is normally distributed}\}. \tag{2}$$

By convention, 0 ∈ G. Also, let U denote the set of normal directions of X, so that U = G ∩ S. Since a univariate normal distribution is completely determined by its moments [11, pp. 389], U can be written in terms of moment equations. Let ϕ be the density of the standard normal distribution; then

$$U = \left\{ u \in S \;\middle|\; \int_{\mathbb{R}^p} (u^T x)^n f(x)\, dx - \int_{\mathbb{R}} t^n \frac{1}{\sigma_u}\,\phi\!\left(\frac{t - \mu_u}{\sigma_u}\right) dt = 0, \ \text{for all } n \in \mathbb{N} \right\}. \tag{3}$$

With the above notation, it is well known that X is normally distributed if and only if G = ℝp or, equivalently, U = S. Next we show that X is normally distributed as long as G has positive Lebesgue measure. The first lemma shows that G is a closed set and thus Lebesgue measurable.

Lemma 1

The set G = {u ∈ ℝp ∣ uTX is normally distributed} is closed if X has a density inp.

Proof

It suffices to show that the set G contains all its limit points. Suppose a non-zero sequence {un}n≥1 ⊂ G converges to u0 ≠ 0. Then $u_n^T X$ converges to $u_0^T X$ in distribution, where $u_0^T X$ is non-degenerate because X has a Lebesgue density by assumption. Let $\alpha_n = E(u_n^T X)$ and $\beta_n^2 = \mathrm{Var}(u_n^T X)$; then $\beta_n^{-1}(u_n^T X - \alpha_n)$ has a standard normal distribution. By the convergence of types theorem [11, pp. 193], there exist real numbers β > 0 and α such that limn→∞ αn = α, limn→∞ βn = β, and $u_0^T X$ has a normal distribution. Thus u0 ∈ G, and G is a closed set in ℝp.

Before proving that X is normally distributed, it is necessary to show that all moments of X exist, which is true if G has positive Lebesgue measure, i.e. m(G) > 0, as asserted by the next lemma.

Lemma 2

For a random vector X with a Lebesgue density inp, all moments of X exist if the set G = {u ∈ ℝp ∣ uTX is normally distributed} has positive Lebesgue measure.

Proof

Let m be the Lebesgue measure in ℝp. Since m(G) > 0, there exists a basis {u1,…, up} ⊂ G of ℝp. Otherwise, there would exist {ui1,…, uir} ⊂ G with r < p such that every element of G is a linear combination of ui1,…, uir; then G would be a subset of the linear subspace spanned by ui1,…, uir, which has Lebesgue measure 0 in ℝp, so m(G) = 0, a contradiction. With such a basis {u1,…, up}, let Y = (Y1,…, Yp)T = (u1,…, up)TX. Then E|Yi|m < ∞ for all m ∈ ℕ and i = 1,…, p, because each Yi = uiTX is normal. Moreover, X = {(u1,…, up)T}−1Y, that is, each Xi is a linear combination of normal random variables. Thus for each i, E|Xi|m < ∞ for all m ∈ ℕ, or equivalently, E{|X1|r1 ⋯ |Xp|rp} < ∞ for all r1,…, rp ∈ ℕ.

Remark 1

It is clear that m(G) = 0 if and only if Π(U) = 0, where m is the Lebesgue measure in ℝp and Π is the uniform measure on the unit sphere S.

When all moments of X exist, let W = (W1,…, Wp)T be a normal random vector having the same mean and covariance matrix as X and define the following moment equations

$$g_n(u) = E\{(u^T X)^n\} - E\{(u^T W)^n\}, \quad u = (u_1, \ldots, u_p)^T \in \mathbb{R}^p, \ n \in \mathbb{N}. \tag{4}$$

Let Hn be the set of solutions to the above moment equations gn = 0, that is,

$$H_n = \{u \in \mathbb{R}^p \mid g_n(u) = 0\}, \quad n \in \mathbb{N}. \tag{5}$$

Lemma 3

Using the notation in (2), (4), (5), if all moments of X exist, then G = ∩n≥1Hn. Moreover, for each n, either m(Hn) = 0 or Hn = ℝp.

Proof

The identity G = ∩n≥1Hn follows from the fact that the univariate normal distribution is determined by its moments. When all moments of X exist, gn(u) is a homogeneous polynomial in u1,…, up of degree n. If gn is the zero function, then Hn = ℝp. If gn is not the zero function, denote Hn(u1,…, up−1) = {up ∈ ℝ ∣ (u1,…, up)T ∈ Hn}. For every (u1,…, up−1)T outside a null set (the common zero set of the coefficient polynomials of gn viewed as a polynomial in up), the map up ↦ gn(u1,…, up) is a non-zero polynomial of degree at most n, and thus Hn(u1,…, up−1) contains at most n points by the fundamental theorem of algebra (a non-zero polynomial of degree n has at most n roots). Let m1 be the Lebesgue measure in ℝ, so that the Lebesgue measure m in ℝp is the product measure $m_1^p = m_1 \times \cdots \times m_1$. By Tonelli’s theorem [12, pp. 152],
$$m(H_n) = \int_{\mathbb{R}^{p-1}} m_1\{H_n(u_1, \ldots, u_{p-1})\}\, dm_1^{p-1} = 0.$$
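The polynomial structure used in this proof can be seen concretely in a small simulation. In the illustrative non-normal example below (our construction, not the paper's), X = (Z² − 1, Z′)T with Z, Z′ independent standard normal; matching W to the mean and covariance of X, the odd moments of uTW vanish, and g3(u) reduces to the homogeneous cubic 8u1³, whose zero set {u1 = 0} is Lebesgue-null in ℝ²:

```python
import random

# Illustrative non-normal vector (our example): X = (Z^2 - 1, Z')^T with
# Z, Z' iid N(0,1), so E X = 0 and Cov X = diag(2, 1). For the matching
# normal W, E(u^T W)^3 = 0, and since E(Z^2 - 1)^3 = 8 while all mixed
# third moments vanish,
#     g_3(u) = E(u^T X)^3 - E(u^T W)^3 = 8 * u1^3,
# a homogeneous cubic whose zero set H_3 = {u1 = 0} is Lebesgue-null.
random.seed(2)

def g3_hat(u1, u2, n=300_000):
    # Monte Carlo estimate of E(u^T X)^3; the matching normal term is 0 here.
    total = 0.0
    for _ in range(n):
        z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        total += (u1 * (z1 * z1 - 1.0) + u2 * z2) ** 3
    return total / n

for u1, u2 in [(1.0, 0.0), (0.6, 0.8), (0.0, 1.0)]:
    print((u1, u2), g3_hat(u1, u2), 8 * u1 ** 3)
```

Only the direction with u1 = 0 solves the moment equation g3 = 0, in line with the measure-zero conclusion of the lemma.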

2.2. A New Characterization of Multivariate Normality

If the set G = {u ∈ ℝp ∣ uTX is normally distributed} has positive Lebesgue measure, then, by Lemma 2, all moments of X exist. Then, since G = ∩n≥1Hn has positive measure, no Hn can be a null set, so Hn = ℝp for every n and G = ℝp by Lemma 3. On the other hand, if G has zero measure, then clearly X cannot be normally distributed. This yields the following theorem.

Theorem 1

A random vector X ∈ ℝp with a Lebesgue density f is not normally distributed if and only if the set of normal directions, U = {u ∈ S ∣ uTX is normally distributed}, has measure 0, i.e., Π(U) = 0.

One might think that a set with Lebesgue measure zero is not necessarily small. For example, the set of points with rational coordinates has Lebesgue measure zero but is dense in ℝp. However, the set G here is nowhere dense: it is closed by Lemma 1 and, having measure zero, has empty interior. In particular, in the bivariate case, if X is not normally distributed, U not only has measure zero but is in fact a finite set, as asserted in the next corollary.
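Theorem 1 matches what one sees in simulation. In the illustrative example below (our choice, not the paper's), X = (Z², Z′)T in ℝ²; the projection uTX = u1Z² + u2Z′ inherits the skewness of the χ²₁ component whenever u1 ≠ 0, so the set U of normal directions is the two-point, Π-null set {±(0, 1)T}:

```python
import math
import random

# Illustrative non-normal vector (our example): X = (Z^2, Z')^T with
# Z, Z' iid N(0,1). The projection u^T X = u1*Z^2 + u2*Z' is normal only
# when u1 = 0, so U = {(0, 1)^T, (0, -1)^T} -- a Pi-null (indeed finite)
# subset of the unit circle.
random.seed(3)

def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

n = 200_000
sample = [(random.gauss(0, 1) ** 2, random.gauss(0, 1)) for _ in range(n)]

# For u = (cos t, sin t), the skewness of u^T X is
# 8*u1^3 / (2*u1^2 + u2^2)^1.5, which vanishes only at u1 = 0.
results = {}
for theta in (0.3, 1.0, 2.0):  # arbitrary directions, none near (0, +/-1)
    u1, u2 = math.cos(theta), math.sin(theta)
    results[theta] = skewness([u1 * x1 + u2 * x2 for x1, x2 in sample])
    print(theta, results[theta], 8 * u1 ** 3 / (2 * u1 ** 2 + u2 ** 2) ** 1.5)

# The exceptional direction u = (0, 1): the projection is exactly N(0, 1).
print(skewness([x2 for _, x2 in sample]))
```

Every direction tested off the u1 = 0 line shows clearly non-zero skewness, while the exceptional direction gives an exactly normal projection.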

Corollary 1

If a bivariate random vector X (or its density) is not normal, then X has at most finitely many normal directions, i.e., U = {u ∈ S ∣ uTX is normally distributed} is a finite set.

Proof

Suppose U contains two linearly independent directions (otherwise U has at most two points and is trivially finite). Then the same arguments as in the proof of Lemma 2 show that X has finite moments of all orders, and U satisfies all the moment equations gn = 0 by Lemma 3. Since X is not normal, gn is not the zero function for some n; by homogeneity, such a gn restricted to the unit circle is essentially a univariate polynomial, which has finitely many zeros there. Thus U is a finite set if X is not normal.

Remark 2

A result equivalent to the above corollary for the bivariate case was established previously by Hamedani and Tata [7] and was also claimed as part of the results in Ferguson [6]. While Ferguson [6] did not give a proof, Hamedani and Tata [7] proved the fact using characteristic functions. In particular, Theorem 3 of Hamedani and Tata [7] asserts that if {(ak, bk), k = 1, 2, ⋯} is a sequence of distinct points in ℝ2 such that, for each k, akX1 + bkX2 is a normal random variable, then X = (X1, X2)T is a bivariate normal random vector. To see that this fact follows directly from the above corollary, it suffices to take uk = (u1k, u2k)T with $u_{1k} = a_k/\sqrt{a_k^2 + b_k^2}$ and $u_{2k} = b_k/\sqrt{a_k^2 + b_k^2}$; then akX1 + bkX2 is a normal random variable if and only if uk is a normal direction of X = (X1, X2)T. However, the above result, as stated in Hamedani and Tata [7] for the bivariate case, does not hold in three or higher dimensions, as pointed out in Hamedani [8]. Thus, Theorem 1 of this paper, which holds for any dimension p ≥ 2, provides a non-straightforward generalization of the existing result for the bivariate case.

Suppose Y is another random vector with a Lebesgue density. If X is not normally distributed, then m(G) = m({u ∈ ℝp ∣ uTX is normally distributed}) = 0 by Theorem 1. Thus P(Y ∈ G) = 0, or $P\{(Y^T Y)^{-1/2}\, Y \in U\} = 0$, since the probability measure of Y is dominated by m. Therefore we obtain the following corollary.

Corollary 2

If a random vector X is not normally distributed, then for any other random vector Y ∈ ℝp with a Lebesgue density, the probability that $(Y^T Y)^{-1/2}\, Y$ takes a value in the set of normal directions of X is zero.

Remark 3

Formal tests for multivariate normality of a random vector might be constructed based on randomly selected linear combinations of its components. Suppose X, X1, …, Xn is an independent random sample from an unknown density f. Then we can consider the univariate data XTXi, i = 1, …, n, which can be viewed as projections of X1, …, Xn on X. If f is not normal, then, conditioned on X, the XTXi are almost surely not normally distributed and thus can be tested using a consistent univariate test for normality [3, 5]. By Corollary 2, such a univariate-based test would have power against any non-normal alternative density. Thus one may construct univariate tests for multivariate normality based on a randomly selected direction. Tests based on such univariate projections may be found in [3] and others.
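A minimal sketch of this construction follows. The Jarque–Bera statistic stands in for "a consistent univariate test for normality" (one possible choice; the paper does not prescribe a specific test), and the two sampling densities are our illustrative assumptions:

```python
import math
import random

# Sketch of the construction in Remark 3 (illustrative assumptions: the
# Jarque-Bera statistic as the univariate test, and our own sampling
# densities). We project a bivariate sample on independently drawn random
# directions and test each univariate projection for normality.
random.seed(4)

def jarque_bera(xs):
    """JB = n/6 * (skew^2 + (kurt - 3)^2 / 4); approx. chi^2_2 under H0."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return n / 6.0 * ((m3 / m2 ** 1.5) ** 2 + (m4 / m2 ** 2 - 3.0) ** 2 / 4.0)

def random_direction():
    d1, d2 = random.gauss(0, 1), random.gauss(0, 1)
    r = math.hypot(d1, d2)
    return d1 / r, d2 / r

def max_jb(sample, n_directions=7):
    # The random directions play the role of X in the projections X^T X_i;
    # we keep the largest statistic over several independent draws.
    best = 0.0
    for _ in range(n_directions):
        u1, u2 = random_direction()
        best = max(best, jarque_bera([u1 * x1 + u2 * x2 for x1, x2 in sample]))
    return best

n = 5_000
normal_sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
skewed_sample = [(random.gauss(0, 1) ** 2, random.gauss(0, 1)) for _ in range(n)]

print(max_jb(normal_sample))  # moderate: each JB is approx. chi^2_2
print(max_jb(skewed_sample))  # large with high probability: f is non-normal
```

Taking the maximum over a few independent directions reduces the chance that the single drawn direction happens to lie close to one of the (measure-zero) exceptional normal directions.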

3. Concluding Remarks

This paper establishes that a multivariate density is not normal if and only if its set of normal directions has Lebesgue measure zero. Consequently, the normal directions of a non-normal density are indeed quite rare. Note that this characterization of a non-normal multivariate density holds in any fixed dimension. Moreover, this new characterization is not an asymptotic result; its validity therefore does not depend on typical assumptions such as a large sample size. The main finding of this paper may have some significance for the assessment of multivariate normality, which is of great relevance in multivariate analysis.

Acknowledgments

The authors thank the editor and the reviewers for their valuable comments and suggestions. This research is partially supported by a research grant from the Stony Wold-Herbert Foundation (YS) and by a translational research grant NIH/NCI P30 CA 16087-24 (YS).


Contributor Information

Yongzhao Shao, Email: shaoy01@nyu.edu.

Ming Zhou, Email: mingzhou@iastate.edu.

References

  • 1. Anderson TW. An Introduction to Multivariate Statistical Analysis. 3rd ed. New York: Wiley; 2003.
  • 2. Tong YL. The Multivariate Normal Distribution. New York: Springer-Verlag; 1990.
  • 3. Thode HC Jr. Testing for Normality. New York: Marcel Dekker, Inc.; 2002.
  • 4. Sinz F, Gerwinn S, Bethge M. Characterization of the p-generalized normal distribution. Journal of Multivariate Analysis. 2009;100:817–820.
  • 5. Looney SW. How to use tests for univariate normality to assess multivariate normality. American Statistician. 1995;49:64–70.
  • 6. Ferguson T. On the determination of the joint distributions from the marginal distributions of linear combinations (abstract). Annals of Mathematical Statistics. 1959;30:255.
  • 7. Hamedani GG, Tata MN. On the determination of the bivariate normal distribution from distributions of linear combinations of the variables. The American Mathematical Monthly. 1975;82:913–915.
  • 8. Hamedani GG. Nonnormality of linear combinations of normal random variables. American Statistician. 1984;38:295–296.
  • 9. Eaton ML, Perlman MD. The non-singularity of generalized sample covariance matrices. The Annals of Statistics. 1973;1:710–717.
  • 10. Dykstra RL. Establishing the positive definiteness of the sample covariance matrix. The Annals of Mathematical Statistics. 1970;41:2153–2154.
  • 11. Billingsley P. Probability and Measure. 3rd ed. New York: Wiley; 1995.
  • 12. Athreya KB, Lahiri SN. Measure Theory and Probability Theory. New York: Springer; 2006.
