Author manuscript; available in PMC: 2017 Dec 1.
Published in final edited form as: Stat Probab Lett. 2016 Sep 10;119:317–325. doi: 10.1016/j.spl.2016.09.004

Double asymptotics for the chi-square statistic

Grzegorz A Rempała a, Jacek Wesołowski b
PMCID: PMC5383219  NIHMSID: NIHMS818460  PMID: 28392612

Abstract

We consider the distributional limit of the Pearson chi-square statistic when the number of classes $m_n$ increases with the sample size $n$ and $n/\sqrt{m_n} \to \lambda \in [0, \infty]$. Under mild moment conditions, the limit is Gaussian for $\lambda = \infty$, Poisson for finite $\lambda > 0$, and degenerate for $\lambda = 0$.

Keywords: Pearson chi-square statistic, central limit theorem, Poisson limit theorem, weak convergence

1. Preliminaries

The Pearson chi-square statistic is probably one of the best-known and most important objects of statistical science and has played a major role in statistical applications ever since its first appearance in Karl Pearson's work on "randomness testing" (Pearson, 1900). The standard goodness-of-fit test based on the Pearson chi-square statistic tacitly assumes that the support of the discrete distribution of interest is fixed (whether finite or not) and unaffected by the sampling process. However, this assumption may be unrealistic for modern "big-data" problems, which involve complex, adaptive data acquisition processes (see, e.g., Grotzinger et al., 2014 for an example in astrobiology). In many such cases the associated statistical testing problems may be more accurately described in terms of triangular arrays of discrete distributions whose finite supports depend upon the collected samples and increase with the sample size (Pietrzak et al., 2016). Motivated by such "big-data" applications, in this note we establish some asymptotic results for the Pearson chi-square statistic for triangular arrays of discrete random variables whose number of classes $m_n$ grows with the sample size $n$. Specifically, let $X_{n,k}$, $k = 1, \ldots, n$, be iid random variables having the same distribution as $X_n$, where

$$\mathbb{P}(X_n = i) = p_n(i) > 0, \qquad i = 1, 2, \ldots, m_n < \infty, \quad n = 1, 2, \ldots$$

Recall that the standard Pearson chi-square statistic is defined as

$$\chi_n^2 = n \sum_{i=1}^{m_n} \frac{(\hat{p}_n(i) - p_n(i))^2}{p_n(i)}, \tag{1}$$

where the empirical frequencies $\hat{p}_n(i)$ are

$$\hat{p}_n(i) = n^{-1} \sum_{k=1}^n I(X_{n,k} = i), \qquad i = 1, \ldots, m_n.$$

As stated above, in what follows we are interested in the double asymptotics of the weak limit of $\chi_n^2$, that is, the case when $m_n \to \infty$ as $n \to \infty$.
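For concreteness, the statistic (1) is straightforward to compute from a sample. The following sketch (Python/NumPy, not part of the original paper; class labels $0, \ldots, m-1$ are an assumption of the illustration) evaluates (1) directly from the empirical frequencies:

```python
import numpy as np

def pearson_chi2(sample, p):
    """Pearson chi-square statistic (1) for an iid sample from a discrete
    distribution with probability vector p over classes 0, ..., m-1."""
    n, m = len(sample), len(p)
    p_hat = np.bincount(sample, minlength=m) / n  # empirical frequencies
    return n * np.sum((p_hat - p) ** 2 / p)

# Uniform case p_n(i) = 1/m: a sample with perfectly balanced counts
# gives a statistic of exactly zero.
m = 5
p = np.full(m, 1 / m)
balanced = np.repeat(np.arange(m), 100)
print(pearson_chi2(balanced, p))
```

A sample concentrated on a single class, by contrast, produces a large value, as expected from a goodness-of-fit statistic.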

Observe that $\chi_n^2$ given in (1) can be decomposed into a sum of two uncorrelated components as follows:

$$\chi_n^2 = n^{-1}(U_n + S_n) - n, \tag{2}$$

where

$$U_n = \sum_{1 \le k \ne l \le n} \frac{I(X_{n,k} = X_{n,l})}{p_n(X_{n,k})} \tag{3}$$

and

$$S_n = \sum_{k=1}^n \frac{1}{p_n(X_{n,k})} = \sum_{k=1}^n p_n^{-1}(X_{n,k}). \tag{4}$$

The second equality above introduces the notational convention we use throughout. Note that for fixed $n$ the statistic $S_n$ is simply a sum of iid random variables, while $U_n$ is an unnormalized $U$-statistic (see, e.g., Korolyuk and Borovskich, 2013). It is routine to check that

$$\mathbb{E}U_n = n(n-1) \qquad \text{and} \qquad \mathbb{E}S_n = n m_n,$$

and consequently

$$\mathbb{E}\chi_n^2 = m_n - 1.$$

Moreover, since we also have $\mathbb{C}\mathrm{ov}(U_n, S_n) = 0$, it follows that

$$\mathbb{V}\mathrm{ar}\,\chi_n^2 = n^{-2}\left(\mathbb{V}\mathrm{ar}\,S_n + \mathbb{V}\mathrm{ar}\,U_n\right) = n^{-1}\left[\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n) + 2(n-1)(m_n-1)\right].$$
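The decomposition (2) is an algebraic identity and can be checked pathwise; the following sketch (Python/NumPy, added for illustration; class labels $0, \ldots, m-1$ and the test distribution are arbitrary choices) compares the direct evaluation of (1) with the right-hand side of (2):

```python
import numpy as np

def chi2_direct(x, p):
    """Statistic (1) computed from empirical frequencies."""
    n, m = len(x), len(p)
    p_hat = np.bincount(x, minlength=m) / n
    return n * np.sum((p_hat - p) ** 2 / p)

def chi2_via_decomposition(x, p):
    """Right-hand side of (2): n^{-1}(U_n + S_n) - n, with U_n the sum
    over ordered pairs k != l of I(X_k = X_l)/p(X_k), as in (3)-(4)."""
    n = len(x)
    eq = x[:, None] == x[None, :]   # I(X_k = X_l) for all pairs (k, l)
    np.fill_diagonal(eq, False)     # exclude the diagonal k = l
    U = np.sum(eq / p[x][:, None])
    S = np.sum(1.0 / p[x])
    return (U + S) / n - n

rng = np.random.default_rng(1)
m, n = 8, 200
p = rng.dirichlet(np.ones(m))       # an arbitrary positive distribution
x = rng.choice(m, size=n, p=p)
```

The two evaluations agree up to floating-point error for any sample, which is a quick sanity check on (2)-(4).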

When $m_n = m$ is constant, the classical result (see, e.g., Shao, 2003, chapter 6) implies that the statistic $\chi_n^2$ asymptotically follows the $\chi^2$-distribution with $m-1$ degrees of freedom. Consequently, when $m$ is large the standardized statistic $(\chi_n^2 - (m-1))/\sqrt{2(m-1)}$ may be approximated by the standard normal distribution. However, when $m_n \to \infty$ as $n \to \infty$, matters are more subtle, and the above normal approximation may or may not be valid depending on the asymptotic relation between $m_n$ and $n$, as described below. Since $S_n$ is a sum of iid random variables, the case when $S_n$ contributes to the limit of the normalized $\chi_n^2$ may be largely handled with the standard theory for arrays of iid variables. Consequently, we focus here on the seemingly more interesting case when the asymptotic influence of $U_n$ dominates that of $S_n$. Specifically, throughout the paper we assume that, as $n, m_n \to \infty$,

$$(m_n n)^{-1}\,\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n) \to 0. \tag{C}$$

Note that (C) implies $n^{-1}(S_n - n m_n)/\sqrt{2m_n} \to 0$ in probability and, in particular, is trivially satisfied when $X_n$ is a uniform random variable on $\{1, \ldots, m_n\}$, that is, when $p_n(i) = m_n^{-1}$ for $i = 1, \ldots, m_n$. Under condition (C) we obtain a rather complete picture of the limiting behavior of $\chi_n^2$. Our main results are presented in Section 2, where we discuss the Poissonian and Gaussian asymptotics. Some examples, relations to asymptotics known in the literature, and further discussion are provided in Section 3. The basic tools used in our derivations are listed in the appendix. In what follows, limits are taken as $n \to \infty$ with $m_n \to \infty$, and $\xrightarrow{d}$ denotes convergence in distribution.
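For a given probability vector the quantity in (C) is easy to evaluate, using $\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n) = \sum_i p_n^{-1}(i) - m_n^2$. A small sketch (a hypothetical helper we add for illustration, not from the paper):

```python
import numpy as np

def condition_C_ratio(p, n):
    """(m_n n)^{-1} Var[p_n^{-1}(X_n)] for a probability vector p.
    Condition (C) asks that this ratio vanish as n, m_n -> infinity.
    Note Var[1/p(X)] = sum_i 1/p_i - m^2."""
    m = len(p)
    var_inv_p = np.sum(1.0 / p) - m ** 2
    return var_inv_p / (m * n)

# Uniform distribution: Var[1/p(X)] = 0, so (C) holds trivially.
p_unif = np.full(100, 1 / 100)
r_unif = condition_C_ratio(p_unif, n=50)
```

For a skewed distribution the ratio is positive, and whether (C) holds then depends on how the variance grows relative to $m_n n$.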

2. Poissonian and Gaussian asymptotics

We start with the case when a naive normal approximation for the standardized $\chi_n^2$ statistic fails. Indeed, as it turns out, when $m_n$ is asymptotically of order $n^2$, we have the following Poisson limit theorem for $\chi_n^2$.

Theorem 2.1

Assume that the condition (C) holds, as well as

$$\frac{n}{\sqrt{m_n}} \to \lambda \in (0, \infty). \tag{5}$$

Then

$$\frac{\chi_n^2 - m_n}{\sqrt{2m_n}} \xrightarrow{d} \frac{\sqrt{2}}{\lambda}\,Z - \frac{\lambda}{\sqrt{2}}, \qquad Z \sim \mathrm{Pois}\!\left(\frac{\lambda^2}{2}\right). \tag{6}$$

Proof

Due to (C) it suffices to consider the asymptotics of Un alone. We write

$$\frac{U_n - n(n-1)}{n\sqrt{2m_n}} = \frac{\sqrt{2m_n}}{n}\sum_{k=1}^n A_{n,k} - \frac{n-1}{\sqrt{2m_n}}, \tag{7}$$

where $A_{n,1} = 0$ and, for $k = 2, \ldots, n$,

$$A_{n,k} = m_n^{-1}\sum_{j=1}^{k-1}\frac{I(X_{n,j} = X_{n,k})}{p_n(X_{n,j})} = m_n^{-1}\,p_n^{-1}(X_{n,k})\sum_{j=1}^{k-1} I(X_{n,j} = X_{n,k}). \tag{8}$$

The above representation implies that to prove (6) we only need to show that $\sum_{k=1}^n A_{n,k} \xrightarrow{d} \mathrm{Pois}(\lambda^2/2)$. To this end we verify the conditions of Theorem A.1 in the appendix, due to Beśka, Kłopotowski and Słomiński (Beśka et al., 1982). Denote $\mathcal{F}_{n,0} = \{\emptyset, \Omega\}$ and $\mathcal{F}_{n,k} = \sigma(X_{n,1}, \ldots, X_{n,k})$, $k = 1, \ldots, n$. Then, using the first form of $A_{n,k}$ from (8), we see that

$$\max_{1\le k\le n}\mathbb{E}(A_{n,k}\mid\mathcal{F}_{n,k-1}) = m_n^{-1}\max_{1\le k\le n}\sum_{j=1}^{k-1}\mathbb{E}\!\left(\frac{I(X_{n,j}=X_{n,k})}{p_n(X_{n,j})}\,\Big|\,\mathcal{F}_{n,k-1}\right) = \max_{1\le k\le n}\frac{k-1}{m_n} = \frac{n-1}{m_n} \to 0$$

due to (5) and thus (A.1) holds. Similarly,

$$\sum_{k=1}^n \mathbb{E}(A_{n,k}\mid\mathcal{F}_{n,k-1}) = \sum_{k=1}^n \frac{k-1}{m_n} = \frac{n(n-1)}{2m_n} \to \frac{\lambda^2}{2}, \tag{9}$$

and thus (A.2) also follows, with $\eta = \lambda^2/2$. Since $A_{n,k} \ge 0$, the required convergence in (A.3) (for any $\varepsilon > 0$) will follow from the convergence of the unconditional moments

$$\sum_{k=1}^n \mathbb{E}\,A_{n,k}\, I(|A_{n,k} - 1| > \varepsilon) \le \varepsilon^{-2}\sum_{k=1}^n\left(\mathbb{E}A_{n,k}^3 - 2\,\mathbb{E}A_{n,k}^2 + \mathbb{E}A_{n,k}\right). \tag{10}$$

Using the second form of $A_{n,k}$ from (8), we see that, given $X_{n,k}$, the conditional distribution of $m_n\,p_n(X_{n,k})\,A_{n,k}$ is binomial $\mathrm{Binom}(k-1, p_n(X_{n,k}))$. Since for $M \sim \mathrm{Binom}(r, p)$ we have $\mathbb{E}M = rp$, $\mathbb{E}M^2 = rp + r(r-1)p^2$ and $\mathbb{E}M^3 = rp + 3r(r-1)p^2 + r(r-1)(r-2)p^3$, we thus obtain

$$\sum_{k=1}^n \mathbb{E}A_{n,k} = \frac{1}{m_n}\sum_{k=1}^n (k-1) \sim \frac{n^2}{2m_n} \to \frac{\lambda^2}{2}, \qquad \sum_{k=1}^n \mathbb{E}A_{n,k}^2 = \frac{1}{m_n^2}\sum_{k=1}^n\left((k-1)m_n + (k-1)(k-2)\right) \sim \frac{n^2}{2m_n} + \frac{n^3}{3m_n^2} \to \frac{\lambda^2}{2}.$$

Similarly,

$$\sum_{k=1}^n \mathbb{E}A_{n,k}^3 = \frac{1}{m_n^3}\sum_{k=1}^n\left((k-1)\,\mathbb{E}p_n^{-2}(X_n) + 3(k-1)(k-2)\,m_n + (k-1)(k-2)(k-3)\right) \sim \frac{n^2}{2m_n^3}\,\mathbb{E}p_n^{-2}(X_n) + \frac{n^3}{m_n^2} + \frac{n^4}{4m_n^3}.$$

Note that (C) and (5) imply $m_n^{-2}\,\mathbb{E}p_n^{-2}(X_n) \to 1$ and therefore

$$\sum_{k=1}^n \mathbb{E}A_{n,k}^3 \sim \frac{n^2}{2m_n^3}\,\mathbb{E}\frac{1}{p_n^2(X_n)} \to \frac{\lambda^2}{2}.$$

Combining the limits of the last three expressions we conclude that the right-hand side of (10) tends to zero and hence (A.3) of Theorem A.1 is also satisfied. The result follows.
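The three binomial moment formulas invoked in the proof can be verified by direct summation over the binomial pmf; a quick sketch (added for illustration, with arbitrary parameters):

```python
from math import comb

def binom_raw_moments(r, p):
    """Exact raw moments E[M^j], j = 1, 2, 3, for M ~ Binom(r, p),
    computed by summing k^j against the pmf."""
    pmf = [comb(r, k) * p**k * (1 - p)**(r - k) for k in range(r + 1)]
    return [sum(k**j * q for k, q in enumerate(pmf)) for j in (1, 2, 3)]

r, p = 7, 0.3
m1, m2, m3 = binom_raw_moments(r, p)
# Closed forms used in the proof:
#   E M   = rp
#   E M^2 = rp + r(r-1) p^2
#   E M^3 = rp + 3 r(r-1) p^2 + r(r-1)(r-2) p^3
```

The enumerated moments match the closed forms exactly, up to floating-point rounding.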

Let us now consider the case $n/\sqrt{m_n} \to \infty$. As it turns out, under this condition the statistic $\chi_n^2$ is asymptotically Gaussian.

Theorem 2.2

Assume that condition (C) is satisfied and that there exists δ > 0 such that

$$\sup_n\, m_n^{-(1+\delta)}\,\mathbb{E}p_n^{-(1+\delta)}(X_n) < \infty, \tag{11}$$

as well as

$$\frac{n}{\sqrt{m_n}} \to \infty. \tag{12}$$

Then

$$\frac{\chi_n^2 - m_n}{\sqrt{2m_n}} \xrightarrow{d} N, \qquad N \sim \mathrm{Norm}(0, 1). \tag{13}$$

Remark 2.3

Note that under (C) the conditions (11) (with $\delta = 1$) and (12) are implied by the condition $n/m_n \to \lambda \in (0, \infty)$.

Proof

As in Theorem 2.1, under our assumption (C) it suffices to show convergence in distribution to $N \sim \mathrm{Norm}(0, 1)$ of the normalized variable

$$\frac{U_n - n(n-1)}{\sqrt{2n(n-1)(m_n-1)}} = \sum_{k=1}^n Y_{n,k},$$

where

$$Y_{n,k} = \sqrt{\frac{2}{n(n-1)(m_n-1)}}\,\sum_{j=1}^{k-1}\left(\frac{I(X_{n,j}=X_{n,k})}{p_n(X_{n,j})} - 1\right) = \frac{\sqrt{2}\,B_{n,k}}{\sqrt{n(n-1)(m_n-1)}}, \tag{14}$$

and the last equality defines $B_{n,k}$. Since $\mathbb{E}(I(X_{n,k} = X_{n,j})\mid\mathcal{F}_{n,k-1}) = p_n(X_{n,j})$ for any $j = 1, \ldots, k-1$, it follows that $\mathbb{E}(Y_{n,k}\mid\mathcal{F}_{n,k-1}) = 0$. Consequently, $(Y_{n,k}, \mathcal{F}_{n,k})_{k=1,\ldots,n}$ are martingale differences. Therefore, to prove (13) we may use the Lyapunov version of the CLT for martingale differences (see Theorem A.2 in the appendix).

Due to (14) we have

$$\mathbb{E}(B_{n,k}^2\mid\mathcal{F}_{n,k-1}) = \sum_{j=1}^{k-1}\frac{\mathbb{V}\mathrm{ar}(I(X_n = X_{n,j})\mid\mathcal{F}_{n,k-1})}{p_n^2(X_{n,j})} + \sum_{1\le i\ne j\le k-1}\frac{\mathbb{C}\mathrm{ov}(I(X_n = X_{n,i}),\, I(X_n = X_{n,j})\mid\mathcal{F}_{n,k-1})}{p_n(X_{n,i})\,p_n(X_{n,j})}.$$

Since $\mathbb{V}\mathrm{ar}(I(X_n = X_{n,j})\mid\mathcal{F}_{n,k-1}) = p_n(X_{n,j})(1 - p_n(X_{n,j}))$ and

$$\mathbb{C}\mathrm{ov}(I(X_n = X_{n,i}),\, I(X_n = X_{n,j})\mid\mathcal{F}_{n,k-1}) = I(X_{n,i} = X_{n,j})\,p_n(X_{n,i}) - p_n(X_{n,i})\,p_n(X_{n,j}),$$

we obtain

$$\mathbb{E}(B_{n,k}^2\mid\mathcal{F}_{n,k-1}) = \sum_{j=1}^{k-1}\left(p_n^{-1}(X_{n,j}) - 1\right) + \sum_{1\le i\ne j\le k-1}\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right).$$

Consequently, (A.4) is equivalent to

$$\frac{\sum_{k=1}^n\sum_{j=1}^{k-1}\left(p_n^{-1}(X_{n,j}) - m_n\right)}{\frac{n(n-1)}{2}(m_n-1)} + \frac{\sum_{k=1}^n\sum_{1\le i\ne j\le k-1}\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right)}{\frac{n(n-1)}{2}(m_n-1)} \to 0. \tag{15}$$

To show the above, we separately consider moments of the summands on the left-hand side of (15). For the first one, note that

$$\sum_{k=1}^n\sum_{j=1}^{k-1}\left(p_n^{-1}(X_{n,j}) - m_n\right) = \sum_{j=1}^{n-1}(n-j)\left(p_n^{-1}(X_{n,j}) - m_n\right) \stackrel{d}{=} \sum_{j=1}^{n-1} j\left(p_n^{-1}(X_{n,j}) - m_n\right),$$

where the last equality denotes the distributional equality of random variables. Therefore, using inequality (B.2) given in the appendix, we get (possibly with different universal constants C from line to line)

$$\mathbb{E}\left|\frac{\sum_{k=1}^n\sum_{j=1}^{k-1}(p_n^{-1}(X_{n,j})-m_n)}{\frac{n(n-1)}{2}(m_n-1)}\right|^{1+\delta} \le \frac{C\,\mathbb{E}\left|\sum_{j=1}^{n-1} j\,(p_n^{-1}(X_{n,j})-m_n)\right|^{1+\delta}}{n^{2+2\delta}\,m_n^{1+\delta}} \le \frac{C\,\mathbb{E}\left|p_n^{-1}(X_n)-m_n\right|^{1+\delta}\, n^{\left(\frac{\delta-1}{2}\right)\vee 0}\sum_{j=1}^{n-1} j^{1+\delta}}{n^{2+2\delta}\,m_n^{1+\delta}} \le \frac{C\,\mathbb{E}\left|p_n^{-1}(X_n)-m_n\right|^{1+\delta}\, n^{\frac{3(1+\delta)}{2}\vee(2+\delta)}}{n^{2+2\delta}\,m_n^{1+\delta}} = \frac{C\,\mathbb{E}\left|p_n^{-1}(X_n)-m_n\right|^{1+\delta}}{n^{\frac{1+\delta}{2}\wedge\delta}\,m_n^{1+\delta}}.$$

In view of this and the elementary inequality $|a+b|^p \le C(|a|^p + |b|^p)$, valid for any $p > 0$ and any real $a, b$, we have for some constants $C_1, C_2$

$$\mathbb{E}\left|\frac{\sum_{k=1}^n\sum_{j=1}^{k-1}(p_n^{-1}(X_{n,j})-m_n)}{\frac{n(n-1)}{2}(m_n-1)}\right|^{1+\delta} \le \frac{C_1}{n^{\frac{1+\delta}{2}\wedge\delta}}\,\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{m_n^{1+\delta}} + \frac{C_2}{n^{\frac{1+\delta}{2}\wedge\delta}} \to 0.$$

For the numerator of the second part on the left-hand side of (15) we may write

$$\sum_{k=1}^n\sum_{1\le i\ne j\le k-1}\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right) = 2\sum_{1\le i<j\le n-1}(n-j)\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right).$$

Moreover,

$$\mathbb{E}\left(\sum_{1\le i<j\le n-1}(n-j)\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right)\right)^2 = \sum_{1\le i<j\le n-1}(n-j)^2\,\mathbb{E}\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right)^2,$$

since the expectations of the other terms resulting from squaring the expression in large brackets above are equal to zero. Consequently,

$$\mathbb{E}\left(\sum_{1\le i<j\le n-1}(n-j)\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right)\right)^2 = (m_n-1)\sum_{1\le i<j\le n-1}(n-j)^2 \le C\,m_n\,n^4,$$

and thus for the squared expectation of the second term in (15) we get

$$\mathbb{E}\left(\frac{\sum_{k=1}^n\sum_{1\le i\ne j\le k-1}\left(\frac{I(X_{n,i}=X_{n,j})}{p_n(X_{n,i})} - 1\right)}{\frac{n(n-1)}{2}(m_n-1)}\right)^2 \le C\,m_n^{-1} \to 0.$$

Note that here we used the fact that $m_n \to \infty$. To finish the proof we only need to show (A.5). Again we rely on the representation of $Y_{n,k}$ given in (14). Note that

$$\mathbb{E}|Y_{n,k}|^{2+\delta} \le \frac{C}{n^{2+\delta}\,m_n^{1+\frac{\delta}{2}}}\,\mathbb{E}\left(p_n^{-(2+\delta)}(X_{n,k})\left|\sum_{j=1}^{k-1}\left(I(X_{n,j}=X_{n,k}) - p_n(X_{n,k})\right)\right|^{2+\delta}\right).$$

Since $I(X_{n,j} = X_{n,k}) - p_n(X_{n,k})$, $j = 1, \ldots, k-1$, are conditionally iid given $X_{n,k}$ and

$$\mathbb{E}\left(\left(I(X_{n,j}=X_{n,k}) - p_n(X_{n,k})\right)\mid X_{n,k}\right) = 0,$$

then, by conditioning on $X_{n,k}$ and applying Rosenthal's inequality (see (B.1) in the appendix) to the conditional moment of the sum, we obtain

$$\sum_{k=1}^n\mathbb{E}|Y_{n,k}|^{2+\delta} \le \frac{C}{n^{2+\delta}\,m_n^{1+\frac{\delta}{2}}}\sum_{k=1}^n\mathbb{E}\left(p_n^{-(2+\delta)}(X_n)\left((k-1)\,p_n(X_n) + \left[(k-1)\,p_n(X_n)\right]^{1+\frac{\delta}{2}}\right)\right) \le C\left(\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{n^{\delta}\,m_n^{1+\frac{\delta}{2}}} + \frac{\mathbb{E}p_n^{-(1+\frac{\delta}{2})}(X_n)}{n^{\frac{\delta}{2}}\,m_n^{1+\frac{\delta}{2}}}\right). \tag{16}$$

By virtue of the Schwarz inequality we obtain

$$\frac{\mathbb{E}p_n^{-(1+\frac{\delta}{2})}(X_n)}{n^{\frac{\delta}{2}}\,m_n^{1+\frac{\delta}{2}}} = \frac{\mathbb{E}\left[p_n^{-\frac{1}{2}}(X_n)\,p_n^{-\frac{1+\delta}{2}}(X_n)\right]}{n^{\frac{\delta}{2}}\,m_n^{1+\frac{\delta}{2}}} \le \frac{1}{n^{\frac{\delta}{2}}}\sqrt{\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{m_n^{1+\delta}}} \to 0$$

in view of (11). Therefore, it suffices to show that the first term in the last expression in (16) converges to zero. But this follows from (11) and (12), since

$$\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{n^{\delta}\,m_n^{1+\frac{\delta}{2}}} = \left(\frac{\sqrt{m_n}}{n}\right)^{\delta}\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{m_n^{1+\delta}} \to 0.$$

3. Discussion

We now illustrate the results of the previous section with some examples and place them in the broader context of earlier work by others. For the sake of completeness, we first note the following.

Remark 3.1. The case λ = 0

Consider $n/\sqrt{m_n} \to 0$. Then the last term on the right-hand side of (7) converges to zero and we are left with a sum of non-negative random variables which satisfies

$$\frac{\sqrt{2m_n}}{n}\sum_{k=1}^n A_{n,k} \xrightarrow{P} 0.$$

To see the above, it suffices to consider the convergence of the first moments. To this end note that

$$\frac{\sqrt{2m_n}}{n}\sum_{k=1}^n \mathbb{E}A_{n,k} = \frac{\sqrt{2m_n}}{n}\sum_{k=1}^n \frac{k-1}{m_n} = \frac{n-1}{\sqrt{2m_n}} \to 0.$$
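In the uniform case the degenerate limit is visible already for moderate $n$: when $m_n \gg n^2$, repeated values ("collisions") in the sample are rare, and on the no-collision event the standardized statistic equals $-n/\sqrt{2m_n}$, close to zero. A hedged Monte Carlo sketch (parameters chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 100_000          # n / sqrt(m) ~ 0.06: the lambda = 0 regime
reps = 200
std_stats = np.empty(reps)
for r in range(reps):
    x = rng.integers(0, m, size=n)
    counts = np.bincount(x, minlength=m)
    chi2 = (m / n) * np.sum(counts ** 2) - n  # (1) for uniform p = 1/m
    std_stats[r] = (chi2 - m) / np.sqrt(2 * m)

# Most replicates have no repeated values, giving a value near
# -n/sqrt(2m) ~ -0.045; the rare collision produces a large outlier.
share_near_zero = np.mean(np.abs(std_stats) < 1)
```

The overwhelming majority of replicates land near zero, consistent with the degenerate limit, while the occasional collision shows why the convergence is only in probability, not in moments of all orders.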

A simple illustration of Theorem 2.2 is as follows.

Example 3.1

Let $\alpha \in [0, 1)$ and set $p_n(i) = (C_\alpha i^\alpha)^{-1}$ for $i = 1, \ldots, m_n$. Here $C_\alpha = \sum_{i=1}^{m_n} i^{-\alpha} \sim m_n^{1-\alpha}/(1-\alpha)$ in view of the general formula

$$\sum_{i=1}^{m_n} i^{\beta} \sim \frac{m_n^{\beta+1}}{\beta+1} \qquad \text{for } \beta > -1. \tag{17}$$

Note that for $0 < \alpha < 1$ the condition (C) is equivalent to

$$n/m_n \to \infty \tag{18}$$

and implies (12). Applying (17) again we see that for any δ > 0

$$\frac{\mathbb{E}p_n^{-(1+\delta)}(X_n)}{m_n^{1+\delta}} = \frac{C_\alpha^{\delta}\sum_{i=1}^{m_n} i^{\alpha\delta}}{m_n^{1+\delta}} \sim \frac{m_n^{(1-\alpha)\delta}\,m_n^{1+\alpha\delta}}{(1-\alpha)^{\delta}(1+\alpha\delta)\,m_n^{1+\delta}} = (1-\alpha)^{-\delta}(1+\alpha\delta)^{-1} < \infty,$$

and therefore (11) is also satisfied. Hence the conclusion of Theorem 2.2 holds under (18) for $0 < \alpha < 1$.
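Both the asymptotic relation (17) and the boundedness of the moment ratio in (11) for this power-law family can be checked numerically; a sketch (constants and parameter values are arbitrary, for illustration only):

```python
import numpy as np

def ratio_17(beta, m):
    """sum_{i=1}^m i^beta divided by m^{beta+1}/(beta+1); tends to 1 by (17)."""
    i = np.arange(1, m + 1, dtype=float)
    return np.sum(i ** beta) * (beta + 1) / m ** (beta + 1)

def moment_ratio_11(alpha, delta, m):
    """E p_n^{-(1+delta)}(X_n) / m^{1+delta} for p_n(i) = (C_alpha i^alpha)^{-1};
    by the display above it approaches (1-alpha)^{-delta} (1+alpha*delta)^{-1}."""
    i = np.arange(1, m + 1, dtype=float)
    C = np.sum(i ** (-alpha))
    return C ** delta * np.sum(i ** (alpha * delta)) / m ** (1 + delta)
```

For instance, with $\alpha = 1/2$ and $\delta = 1$ the predicted limit of the moment ratio is $(1-\alpha)^{-1}(1+\alpha)^{-1} = 4/3$, and the finite-$m$ values approach it quickly.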

Note that in the above example the assumption (5) of Theorem 2.1 cannot be satisfied for $0 < \alpha < 1$ (see (18)) but can hold for $\alpha = 0$, that is, when the distribution is uniform. We remark that in our present setting such a distribution is of interest, for instance, when testing for a signal-noise threshold in data with a large number of support points (Pietrzak et al., 2016). Combining the results of Theorems 2.1 and 2.2 and Remark 3.1, one obtains the following.

Corollary 3.2 (Asymptotics of $\chi_n^2$ for the uniform distribution)

Assume that $p_n(i) = m_n^{-1}$ for $i = 1, 2, \ldots, m_n$ and $n = 1, 2, \ldots$, as well as

$$\frac{n}{\sqrt{m_n}} \to \lambda \in [0, \infty].$$

Then

$$\frac{\chi_n^2 - m_n}{\sqrt{2m_n}} \xrightarrow{d} \begin{cases} 0 & \text{when } \lambda = 0, \\[4pt] \dfrac{\sqrt{2}}{\lambda}\,Z - \dfrac{\lambda}{\sqrt{2}}, \quad Z \sim \mathrm{Pois}\!\left(\dfrac{\lambda^2}{2}\right) & \text{when } \lambda \in (0, \infty), \\[4pt] N \sim \mathrm{Norm}(0, 1) & \text{when } \lambda = \infty. \end{cases}$$
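The Poisson regime of the corollary lends itself to a quick Monte Carlo check (a simulation we add for illustration; all parameters are arbitrary). With $n/\sqrt{m_n} = \lambda = 1$ the standardized statistic should be approximately distributed as $\sqrt{2}\,Z - 1/\sqrt{2}$ with $Z \sim \mathrm{Pois}(1/2)$, which has mean $0$ and variance $1$ but a markedly non-Gaussian, lattice-like shape:

```python
import numpy as np

def simulate_standardized_chi2(n, m, reps, seed=0):
    """Simulate (chi2 - m)/sqrt(2m) for the uniform distribution on m classes."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        counts = np.bincount(rng.integers(0, m, size=n), minlength=m)
        chi2 = (m / n) * np.sum(counts ** 2) - n  # (1) for uniform p = 1/m
        out[r] = (chi2 - m) / np.sqrt(2 * m)
    return out

# lambda = n / sqrt(m) = 1: limit is sqrt(2) Z - 1/sqrt(2), Z ~ Pois(1/2)
vals = simulate_standardized_chi2(n=50, m=2500, reps=400)
```

The simulated values cluster near the lattice points $\sqrt{2}\,k - 1/\sqrt{2}$, $k = 0, 1, 2, \ldots$, so a histogram makes the failure of the naive normal approximation in this regime plainly visible.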

We note that the asymptotic distribution of $\chi_n^2$ when both $n$ and $m_n$ tend to infinity has been considered by several authors, typically in the context of asymptotics of families of goodness-of-fit statistics related to different divergence distances. Some of these results also considered the asymptotic behavior of such statistics not only under the null hypothesis (as we did here) but also under simple alternatives, and hence are, in that sense, more general. However, when applied to the chi-square statistic under the null hypothesis they appear to be special cases of our theorems in Section 2. We briefly review below some of the most relevant results.

Tumanyan (1954, 1956) proved asymptotic normality of $\chi_n^2$ under the assumption $\min_{1\le i\le m_n} n\,p_n(i) \to \infty$, which in the case of the uniform distribution is equivalent to $n/m_n \to \infty$, a condition obviously stronger than the $n/\sqrt{m_n} \to \infty$ we use (see Corollary 3.2).

Steck (1957) generalized these results on normal asymptotics assuming, among other conditions, that $\inf_n n/m_n > 0$, which again is stronger than $n/\sqrt{m_n} \to \infty$. He also obtained the Poissonian and degenerate limits in the case of the uniform distribution, in agreement with the first two cases of our Corollary 3.2. The main result of Holst (1972) for the chi-square statistic gives normal asymptotics under the regime $n/m_n \to \lambda \in (0, \infty)$ and $\max_{1\le j\le m_n} p_n(j) < \beta/n$, which is also stronger than our assumptions. In the uniform case under this regime the result was proved earlier by Harris and Park (1971). The main result of Morris (1975) for the chi-square statistic gives asymptotic normality under $n \min_{1\le j\le m_n} p_n(j) > \varepsilon > 0$ for all $n \ge 1$, $\max_{1\le j\le m_n} p_n(j) \to 0$, and the "uniformly asymptotically negligible" condition of the form $\max_{1\le i\le m_n} \sigma_n^2(i)/s_n^2 \to 0$, where $\sigma_n^2(i) = 2 + \frac{(1 - m_n p_n(i))^2}{n\,p_n(i)}$, $i = 1, \ldots, m_n$, and $s_n^2 = \sum_{i=1}^{m_n}\sigma_n^2(i)$. In the case of the uniform distribution this gives asymptotic normality of $\chi_n^2$ under the condition $n/m_n > \varepsilon > 0$, a result apparently weaker than the third part of Corollary 3.2.

Following the paper of Cressie and Read (1984), which introduced the family of power divergence statistics (of which the chi-square statistic is a member), much effort was directed at proving asymptotic normality for wider families of divergence distances, as well as for more than one independent multinomial sample; see, e.g., Menéndez et al. (1998) and Pérez and Pardo (2002) (in both papers the authors considered the regime $n/m_n \to \lambda \in (0, \infty)$), Inglot et al. (1991) and Morales et al. (2003) (in both papers the authors considered the regime $m_n^{1+\beta}\log^2(n)/n \to 0$ and $m_n^{\beta}\min_{1\le j\le m_n} p_n(j) > c > 0$ for some $\beta \ge 1$), or Pietrzak et al. (2016) (with the regime $n/m_n \to \infty$). Note that for the asymptotic normality results all these regimes are again more stringent than what we consider here.

Finally, for completeness, we briefly address one of the scenarios when condition (C) does not hold.

Remark 3.3

Note that if $m_n n\,/\,\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n) \to 0$, then the asymptotic behavior of the standardized $\chi_n^2$ is the same as that of $Z_n = \sum_{k=1}^n Y_{n,k}$, where

$$Y_{n,k} = \frac{p_n^{-1}(X_{n,k}) - m_n}{\sqrt{n\,\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n)}}, \qquad k = 1, \ldots, n.$$

Since for any fixed $n \ge 1$ the random variables $Y_{n,k}$, $k = 1, \ldots, n$, are iid with zero mean and $\mathbb{V}\mathrm{ar}\,Y_{n,k} = n^{-1}$, it follows that $\{Y_{n,k},\ k = 1, \ldots, n\}_{n\ge 1}$ is an infinitesimal array. Therefore the classical CLT for row-wise iid triangular arrays (cf., e.g., Shao, 2003, chapter 1) applies. Note also that the remaining case, when $m_n n\,/\,\mathbb{V}\mathrm{ar}\,p_n^{-1}(X_n) \to \lambda \in (0, \infty)$, appears more complicated and requires a different approach.

Acknowledgments

The research was conducted when the second author was visiting The Mathematical Biosciences Institute at OSU. Both authors thank the Institute for its logistical support and funding through US NSF grant DMS-1440386. The research was also partially funded by US NIH grant R01CA-152158 and US NSF grant DMS-1318886. The authors wish to gratefully acknowledge helpful comments made by the referee and the associate editor on the early version of the manuscript.

Appendix A. Limit Theorems

Below, for the convenience of the reader, we recall the results used in the proofs. The first is found in Beśka et al. (1982), and the second is a version of the martingale CLT (see, e.g., Hall and Heyde, 1980).

Theorem A.1 (Poissonian conditional limit theorem)

Let $\{Z_{n,k},\ k = 1, \ldots, n;\ n \ge 1\}$ be a double sequence of non-negative random variables adapted to a row-wise increasing double sequence of $\sigma$-fields $\{\mathcal{G}_{n,k-1},\ k = 1, \ldots, n;\ n \ge 1\}$. If, as $n \to \infty$,

$$\max_{1\le k\le n}\mathbb{E}(Z_{n,k}\mid\mathcal{G}_{n,k-1}) \xrightarrow{P} 0, \tag{A.1}$$

$$\sum_{k=1}^n \mathbb{E}(Z_{n,k}\mid\mathcal{G}_{n,k-1}) \xrightarrow{P} \eta > 0, \tag{A.2}$$

and for any ε > 0

$$\sum_{k=1}^n \mathbb{E}\left(Z_{n,k}\, I(|Z_{n,k} - 1| > \varepsilon)\mid\mathcal{G}_{n,k-1}\right) \xrightarrow{P} 0, \tag{A.3}$$

then $\sum_{k=1}^n Z_{n,k} \xrightarrow{d} Z$, where $Z \sim \mathrm{Pois}(\eta)$ is a Poisson random variable.

Theorem A.2 (Lyapunov-type martingale CLT)

Let $\{(Z_{n,k}, \mathcal{F}_{n,k}),\ k = 1, \ldots, n;\ n \ge 1\}$ be a double sequence of martingale differences. If

$$\sum_{k=1}^n \mathbb{E}(Z_{n,k}^2\mid\mathcal{F}_{n,k-1}) \xrightarrow{P} 1 \tag{A.4}$$

and, for some $\delta > 0$,

$$\sum_{k=1}^n \mathbb{E}|Z_{n,k}|^{2+\delta} \to 0, \tag{A.5}$$

then $\sum_{k=1}^n Z_{n,k} \xrightarrow{d} N$, where $N \sim \mathrm{Norm}(0, 1)$ is a standard normal random variable.

Appendix B. Moment Inequalities

The following moment inequalities are used in Section 2.

Rosenthal inequality

Rosenthal (1970). If $X_1, \ldots, X_n$ are independent centered random variables such that $\mathbb{E}|X_i|^r < \infty$, $i = 1, \ldots, n$, for some $r > 2$, then

$$\mathbb{E}\left|\sum_{i=1}^n X_i\right|^r \le C_r\max\left\{\sum_{i=1}^n \mathbb{E}|X_i|^r,\ \left(\sum_{i=1}^n \mathbb{E}X_i^2\right)^{\frac{r}{2}}\right\} \le C_r\left(\sum_{i=1}^n \mathbb{E}|X_i|^r + \left(\sum_{i=1}^n \mathbb{E}X_i^2\right)^{\frac{r}{2}}\right). \tag{B.1}$$

MZ-BE inequality

Marcinkiewicz and Zygmund (1937) for $r \ge 2$; von Bahr and Esseen (1965) for $1 \le r \le 2$. If $X_1, \ldots, X_n$ are independent centered random variables such that $\mathbb{E}|X_i|^r < \infty$, $i = 1, \ldots, n$, then for $r > 1$

$$\mathbb{E}\left|\sum_{i=1}^n X_i\right|^r \le C_r\, n^{r'}\sum_{i=1}^n \mathbb{E}|X_i|^r, \tag{B.2}$$

where $r' = 0 \vee \left(\frac{r}{2} - 1\right)$.


References

  1. Beśka M, Kłopotowski A, Słomiński L. Limit theorems for random sums of dependent d-dimensional random vectors. Probability Theory and Related Fields. 1982;61(1):43–57.
  2. Cressie N, Read TR. Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society, Series B (Methodological). 1984:440–464.
  3. Grotzinger JP, Sumner D, Kah L, Stack K, Gupta S, Edgar L, Rubin D, Lewis K, Schieber J, Mangold N, et al. A habitable fluvio-lacustrine environment at Yellowknife Bay, Gale Crater, Mars. Science. 2014;343(6169):1242777. doi: 10.1126/science.1242777.
  4. Hall P, Heyde CC. Martingale Limit Theory and Its Application. New York: Academic Press; 1980.
  5. Harris B, Park C. The distribution of linear combinations of the sample occupancy numbers. Indagationes Mathematicae (Proceedings). 1971;74:121–134.
  6. Holst L. Asymptotic normality and efficiency for certain goodness-of-fit tests. Biometrika. 1972;59(1):137–145.
  7. Inglot T, Jurlewicz T, Ledwina T. Asymptotics for multinomial goodness of fit tests for a simple hypothesis. Theory of Probability & Its Applications. 1991;35(4):771–777.
  8. Korolyuk VS, Borovskich YV. Theory of U-statistics. Vol. 273. Springer Science & Business Media; 2013.
  9. Marcinkiewicz J, Zygmund A. Quelques théorèmes sur les fonctions indépendantes. Fundamenta Mathematicae. 1937;29:60–90.
  10. Menéndez M, Morales D, Pardo L, Vajda I. Asymptotic distributions of φ-divergences of hypothetical and observed frequencies on refined partitions. Statistica Neerlandica. 1998;52(1):71–89.
  11. Morales D, Pardo L, Vajda I. Asymptotic laws for disparity statistics in product multinomial models. Journal of Multivariate Analysis. 2003;85(2):335–360.
  12. Morris C. Central limit theorems for multinomial sums. The Annals of Statistics. 1975:165–188.
  13. Pearson K. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science. 1900;50(302):157–175.
  14. Pérez T, Pardo J. Asymptotic normality for the $K_\phi$-divergence goodness-of-fit tests. Journal of Computational and Applied Mathematics. 2002;145:301–317.
  15. Pietrzak M, Rempała GA, Seweryn M, Wesołowski J. Limit theorems for empirical Rényi entropy and divergence with applications to molecular diversity analysis. TEST. 2016:1–20.
  16. Rosenthal HP. On the subspaces of $l^p$ ($p > 2$) spanned by sequences of independent random variables. Israel Journal of Mathematics. 1970;8(3):273–303.
  17. Shao J. Mathematical Statistics. Springer Texts in Statistics. Springer; 2003.
  18. Steck GP. Limit theorems for conditional distributions. University of California Publications in Statistics. 1957;2(12):237–284.
  19. Tumanyan SK. On the asymptotic distribution of the chi-square criterion. Dokl Akad Nauk SSSR. 1954;94:1011–1012.
  20. Tumanyan SK. Asymptotic distribution of the chi-square criterion when the number of observations and number of groups increase simultaneously. Teor Veroyat Yeyo Primen. 1956;1(1):131–145.
  21. von Bahr B, Esseen CG. Inequalities for the r-th absolute moment of a sum of random variables, 1 ≦ r ≦ 2. The Annals of Mathematical Statistics. 1965;36(1):299–303.
