Abstract
Many useful statistics equal the ratio of a possibly noncentral chi-square to a quadratic form in Gaussian variables with all positive weights. Expressing the density and distribution function as positively weighted sums of corresponding F functions has many advantages. The mixture forms have analytic value when embedded within a more complex problem, as well as computational value. The expansions work well with quadratic forms having few components and small degrees of freedom, a setting in which a more general algorithm from earlier literature can take longer or fail to converge. Many approximations have been suggested for the problem. A positively weighted noncentral quadratic form can always have two moments matched to a noncentral chi-square. For a single quadratic form, the new noncentral approximation performs neither uniformly more nor less accurately than older approximations. The approach also gives a noncentral F approximation for any ratio of a positively weighted noncentral form to a positively weighted central quadratic form, and provides better accuracy for noncentral ratios than approximations based on a single chi-square. The accuracy suffices for many practical applications, such as power analysis, even with few degrees of freedom. Naturally the approximation is much faster and simpler to compute than any exact method. Embedding the approximation in analytic expressions provides simple forms which correctly guarantee that only positive values have nonzero probability, and which automatically reduce to partially or fully exact results when either quadratic form has only one term.
Keywords: Cumulative distribution function, Mixture distribution, Noncentral F
1. Introduction
Many statistics reduce to the ratio of a possibly noncentral chi-square to a positively weighted quadratic form in Gaussian variables. We focus primarily on computing exactly, or accurately approximating, the distribution function of such a ratio. The results lead us to consider more general but closely related ratios, with the single chi-square replaced by a positively weighted noncentral quadratic form.
Often neither the density nor the cumulative distribution function (CDF) of such ratios can be expressed in closed form, except in special cases. However, the distribution function may be expressed in terms of a probability for a quadratic form with both positive and negative weights. Davies's (1980) method applies to such forms, and hence allows computing probabilities for the original ratio of interest. The approach centers on numerical integration of equation (3.2) in Imhof (1961).
Davies's approach has important drawbacks. First, it is strictly a numerical procedure, based on numerical integration of the characteristic function, and therefore does not allow convenient analytic manipulation when nested inside other statistical problems. Second, careful implementation requires a large amount of code. Third, as Davies noted, the method may converge slowly, or fail to converge, particularly when a single term with few degrees of freedom dominates. Forms of interest in the present article seem likely to exhibit these problems.
Interest in ratios of quadratic forms has a long history, especially from the perspective of mixture representations. Robbins and Pitman (1949) discussed mixture distributions in general. They also described series expansions for (1) a positively weighted sum of central quadratic forms; (2) a ratio of such forms; and (3) the ratio of a single, noncentral chi-square to the weighted sum of two central chi-square variables. Ruben (1962) described an expansion for the distribution function of positively weighted sums of noncentral chi-square as a mixture of central chi-square variables, as well as a second expansion as a mixture of noncentral chi-square variables. A number of Robbins and Pitman's results are special cases of Ruben's results.
Many others have contributed. Johnson and Kotz (1970, chap. 29, pp. 169–173) and Johnson, Kotz, and Balakrishnan (1994) summarized both theory and computational practice. Davies (1980) and Shively, Ansley, and Kohn (1990) devised algorithms for computing the density and CDF of a ratio of quadratic forms. Mathai and Provost (1992) reviewed approximations to the CDF of quadratic forms. One approach focuses on determining the moments, or approximations to the moments, to define an approximating CDF (Provost and Rudiuk 1992; Smith 1996). In a related approach, Lieberman (1994) described saddlepoint approximations for the density of a ratio of quadratic forms in normal variables in both the central and noncentral cases.
A wide variety of CDF approximations for such ratios use the CDF of another random variable to simplify computation. Matching moments of the quadratic form in the ratio to a scaled chi-square, which corresponds to Satterthwaite's (1946) approach, gives an F approximation. Expressing the desired CDF in terms of the CDF of a quadratic form with both positive and negative weights leads to a different set of approximations. Imhof (1961) described an especially accurate approximation, based on matching three moments of the quadratic form to a central chi-square. Muller and Barton (1989) suggested approximating the special case of a positively weighted sum of noncentral chi-square variables with a single scaled noncentral chi-square which matches the first noncentral moment and two corresponding central moments. The approximation works well, although overall not as well as Imhof's in many cases.
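For reference, the Satterthwaite-type match for a central, positively weighted quadratic form takes the standard two-moment form (a sketch of the familiar construction, not a formula quoted from the works above):

```latex
% Two-moment (Satterthwaite) match of a central Q = sum_k lambda_k X_k, X_k ~ chi^2(nu_k),
% to a scaled central chi-square lambda* chi^2(nu*):
\[
E(Q) = \sum_{k} \lambda_k \nu_k , \qquad
V(Q) = 2 \sum_{k} \lambda_k^{2} \nu_k ,
\qquad\Longrightarrow\qquad
\lambda^{*} = \frac{V(Q)}{2\,E(Q)} , \qquad
\nu^{*} = \frac{2\,[E(Q)]^{2}}{V(Q)} .
\]
```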
Applying Imhof's approximation to a ratio involves a quadratic form with both positive and negative coefficients. Such forms can have a symmetric distribution, with skewness of zero, in which case Imhof's approximation does not exist. In practice, skewness values near zero cause the approximation to fail because the approximating chi-square degrees of freedom become too large for accurate computation with finite precision arithmetic. Muller and Barton (1989) used the obvious alternative of approximating the numerator and denominator separately by a scaled chi-square, which leads to a central or noncentral F approximation. That approach guarantees that a numerically stable approximation exists for a far wider range of conditions. It also has a simple interpretation, and automatically reduces to partially exact or exact results when one or both components of the ratio have only one term.
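The degrees-of-freedom difficulty can be seen from the general shape of a three-moment central chi-square match (the standard Pearson-type construction; the exact constants used by Imhof are not reproduced here), in which the approximating degrees of freedom are driven by the reciprocal of the squared skewness:

```latex
% A three-moment (Pearson-type) central chi-square match sets the approximating
% degrees of freedom h from the skewness gamma_1 of the target form T
% (second and third cumulants kappa_2, kappa_3):
\[
\gamma_1(T) = \frac{\kappa_3}{\kappa_2^{3/2}} , \qquad
\sqrt{8/h} = \gamma_1(T)
\;\Longrightarrow\;
h = \frac{8}{\gamma_1(T)^{2}} = \frac{8\,\kappa_2^{3}}{\kappa_3^{2}} ,
\]
```

so h → ∞ as the skewness approaches zero, which is exactly the failure mode described above.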
A scaled noncentral chi-square provides the obvious approximation for a sum of noncentral chi-square variables, with the natural choice matching three moments to determine the three parameters (scale, degrees of freedom, and noncentrality). However, no such three-moment approximation exists in all cases (Johnson and Kotz 1970, sec. 29.5.8). As we show in Section 4, a two-moment match always exists.
Two applications will illustrate the results. The first application centers on the independent groups t test with Gaussian but heterogeneous data, the Behrens-Fisher problem. The second application involves the distribution of Cronbach's alpha (Cronbach 1951), an estimate of reliability of a measurement process, such as a psychometric test, which contains multiple measures of performance (such as items in multiple choice tests). Kistner and Muller (2004) recently demonstrated that the exact distribution can be expressed in terms of the ratio of a chi-square to a central quadratic form in Gaussian variables.
2. Notation
Lower case bold always indicates a single column, a vector, such as 1n = [1 ⋯ 1]′, or λ = [λ1 ⋯ λK]′, with the prime denoting the transpose. Similarly, upper case bold indicates a matrix, such as Σ, a p × p covariance matrix.
The CDF of random variable X is written FX(x), with corresponding density denoted fX(x). The corresponding mean and variance are written E(X) and V(X). Writing X ∼ N(μ, σ²) indicates that X follows a scalar Gaussian distribution with mean μ and variance σ², while x ∼ Np(μ, Σ) indicates the p × 1 vector x follows a multivariate Gaussian distribution. In turn, X ∼ χ²(ν, ω) indicates X follows a noncentral chi-square distribution with ν degrees of freedom and noncentrality ω. Similarly, X ∼ F(ν1, ν2, ω) indicates X has a noncentral F distribution with ν1 numerator degrees of freedom, ν2 denominator degrees of freedom, and noncentrality ω. The corresponding CDF's are denoted by Fχ²(x; ν, ω) and FF(f; ν1, ν2, ω). A weighted sum of K independent and possibly noncentral chi-square variables, with positive weights {λk}, degrees of freedom {νk}, and noncentrality parameters {ωk}, will be indicated by Q(λ, ν, ω). More precisely,
Q(λ, ν, ω) = ∑_{k=1}^K λk Xk,  with Xk ∼ χ²(νk, ωk) mutually independent.  (2.1)
The following lemma gives a standard representation that will be used in Section 3.
Lemma 1
If X1 ∼ χ²(ν1, ω) independently of X2 ∼ χ²(ν2), and F = (X1/ν1)/(X2/ν2), then F ∼ F(ν1, ν2, ω), so that
Pr{X1/X2 ≤ x} = FF[x ν2/ν1; ν1, ν2, ω].  (2.2)
3. A noncentral χ2 over a positively weighted quadratic form
3.1 An Expansion in terms of F Random Variables
We use Ruben's formulas for quadratic forms with all positive weights to specify an expansion for the ratio of a possibly noncentral chi-square to a positively weighted quadratic form. The approach provides a single series expansion, which may be chosen to be a mixture, for the density and CDF in terms of F densities and CDF's. A variety of bounds are also provided. Some special cases of the results are equivalent to expansions derived by Robbins and Pitman.
We assume a weight vector λR with elements λqR > 0, and a noncentrality vector ωR with elements ωqR ≥ 0. Both are ν+ × 1, with
ν+ = ∑_{k=1}^K νk.  (3.1)
With X0 ∼ χ²(ν0, ω0), Xk ∼ χ²(νk), all mutually independent, and Uq ∼ χ²(1, 0), we focus on
R = λ0X0 / ∑_{k=1}^K λkXk = λ0X0 / ∑_{q=1}^{ν+} λqR Uq.  (3.2)
Theorem 1
If X0 ∼ χ2(ν0, ω0) and Xk ∼ χ2(νk) are all mutually independent with λ0 > 0, λk > 0, then
Pr{R ≤ r} = ∑_{j=0}^∞ cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0].  (3.3)
The coefficients may be computed as
c0 = ∏_{k=1}^K (β/λk)^{νk/2}  (3.4)
and, for j ≥ 1,
cj = (2j)^{−1} ∑_{i=0}^{j−1} g_{j−i} ci,  with g_m = ∑_{k=1}^K νk(1 − β/λk)^m.  (3.5)
Choosing β ≤ λ1 gives 0 ≤ cj < 1 and ∑_{j=0}^∞ cj = 1, which gives a mixture.
Proof: Ruben's (1962) expansion for the CDF of a central quadratic form gives
Pr{R ≤ r} = ∫_0^∞ [1 − ∑_{j=0}^∞ cj Pr{χ²(ν+ + 2j) < λ0x/(rβ)}] fX0(x) dx = ∑_{j=0}^∞ cj ∫_0^∞ Pr{χ²(ν+ + 2j) ≥ λ0x/(rβ)} fX0(x) dx.  (3.6)
The interchange of integration and summation is allowed by Fubini's theorem. An individual term in the series may be reduced to the desired form by using Lemma 1 to write
∫_0^∞ Pr{χ²(ν+ + 2j) ≥ λ0x/(rβ)} fX0(x) dx = Pr{λ0X0 ≤ rβ χ²(ν+ + 2j)} = FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0].  (3.7)
Specifying the coefficients completes the proof. Ruben (1962) and Johnson and Kotz (1970, equations 88–89) provided explicit and convenient forms.
Johnson and Kotz summarized Ruben's extensive discussion of choices for β in terms of a bound on truncation error for the CDF (equation 43 in Johnson and Kotz 1970, p. 158). For β ≤ λ1, which guarantees the expansion is a mixture, Ruben's bound on truncation error is least when β = λ1. For a (possibly nonmixture) expansion with β ≥ λK, the bound on truncation error is least when β = λK. However, Ruben recommended β = 2λ1λK/ (λ1 + λK) to improve convergence because it yields a much smaller bound on truncation error than does β = λK or β = λ1. Unfortunately, in the setting discussed here, the (generalized) bounds rarely provide any useful guidance. Hence we do not provide any details.
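A minimal computational sketch of the expansion follows, assuming Ruben's central-case recursion for the coefficients (Johnson and Kotz 1970, eqs. 88–89) and the mixture-of-F form of Equation (3.3), together with standard SciPy routines; the function name and interface are illustrative and are not the authors' SAS/IML implementation.

```python
import numpy as np
from scipy.stats import ncf  # noncentral F distribution

def ratio_cdf_series(r, lam0, nu0, omega0, lam, nu, tol=1e-4, max_terms=5000):
    """Pr{ lam0*X0 / sum_k lam[k]*X_k <= r } for X0 ~ chi2(nu0, omega0) and
    central X_k ~ chi2(nu[k]), via a mixture of noncentral F CDF's.
    Illustrative sketch: uses Ruben's central-case coefficients with beta = min(lam)."""
    lam = np.asarray(lam, dtype=float)
    nu = np.asarray(nu, dtype=float)
    beta = lam.min()                      # beta = lambda_1 guarantees a mixture
    nu_plus = nu.sum()
    # c_0 = prod_k (beta/lam_k)^(nu_k/2);
    # c_j = (2j)^(-1) sum_{i=0}^{j-1} g_{j-i} c_i,  g_m = sum_k nu_k (1 - beta/lam_k)^m
    c = [float(np.prod((beta / lam) ** (nu / 2.0)))]
    shrink = 1.0 - beta / lam
    g = []                                # g[m-1] holds g_m
    total = c[0]
    cdf = 0.0
    for j in range(max_terms):
        df2 = nu_plus + 2 * j
        cdf += c[j] * ncf.cdf(r * beta * df2 / (lam0 * nu0), nu0, df2, omega0)
        if 1.0 - total <= tol:            # uniform truncation bound (Corollary 3)
            return cdf
        g.append(float(np.sum(nu * shrink ** (j + 1))))
        c_next = sum(g[j - i] * c[i] for i in range(j + 1)) / (2.0 * (j + 1))
        c.append(c_next)
        total += c_next
    raise RuntimeError("series did not reach the requested tolerance")
```

The stopping rule uses the uniform bound of Corollary 3, 1 − ∑_{j≤J} cj, so the returned value lies within tol of the exact probability whenever the expansion is a mixture.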
Corollary 1
For yik ∼ N(μk, σk²), i ∈ {1, …, Nk}, k ∈ {1, 2}, all independent, Sk = ∑_{i=1}^{Nk}(yik − ȳk)², and s² = (S1 + S2)(1/N1 + 1/N2)/(N1 + N2 − 2), the Behrens-Fisher statistic is t = (ȳ1 − ȳ2)/s. With λ0 = σ1²/N1 + σ2²/N2, ωt = δ²/λ0, β as in the theorem, and cj in terms of the weights λk = σk²(1/N1 + 1/N2)/(N1 + N2 − 2), the CDF of t² is
Pr{t² ≤ r} = ∑_{j=0}^∞ cj FF[rβ(ν+ + 2j)/λ0; 1, ν+ + 2j, ωt],  with ν+ = N1 + N2 − 2.  (3.8)
Proof: The null hypothesis is H0: μ1 = μ2, or H0: μ1 − μ2 = δ = 0. Also, μ̂k = ȳk = ∑_{i=1}^{Nk} yik/Nk. The test statistic is
t = (ȳ1 − ȳ2) / [(S1 + S2)(1/N1 + 1/N2)/(N1 + N2 − 2)]^{1/2}.  (3.9)
It is straightforward to prove that {S1, S2, μ̂1, μ̂2} are mutually independent. Also,
(ȳ1 − ȳ2) ∼ N[δ, σ1²/N1 + σ2²/N2].  (3.10)
The denominator is a weighted sum of independent central chi-square variables, with weights λk = σk²(1/N1 + 1/N2)/(N1 + N2 − 2), since Sk/σk² ∼ χ²(Nk − 1). If ωt = δ²/(σ1²/N1 + σ2²/N2), then (ȳ1 − ȳ2)² = λ0X0 with λ0 = σ1²/N1 + σ2²/N2 and X0 ∼ χ²(1, ωt), so that
t² = λ0X0 / (λ1X1 + λ2X2),  with independent Xk ∼ χ²(Nk − 1).  (3.11)
In the notation of Theorem 1, ν0 = 1, ν1 = N1 − 1, ν2 = N2 − 1, and ω0 = ωt. In turn, ν+ = N1 + N2 − 2, and the result follows immediately.
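As a usage illustration of Corollary 1, the exact size of the uncorrected t test could be computed along the following lines; the sketch assumes the hypothetical ratio_cdf_series() above and the weights identified in the proof, and is not the authors' code.

```python
from scipy.stats import f as f_dist

def behrens_fisher_size(N1, N2, var1, var2, alpha=0.05):
    """Exact size of the two-sided, uncorrected (pooled) t test with unequal
    variances, via Corollary 1; assumes ratio_cdf_series() from the sketch above."""
    lam0 = var1 / N1 + var2 / N2                       # weight on chi2(1, omega_t); omega_t = 0 under H0
    scale = (1.0 / N1 + 1.0 / N2) / (N1 + N2 - 2)
    lam = [var1 * scale, var2 * scale]                 # weights on chi2(N_k - 1)
    nu = [N1 - 1, N2 - 1]
    crit = f_dist.ppf(1.0 - alpha, 1, N1 + N2 - 2)     # nominal critical value for t^2
    return 1.0 - ratio_cdf_series(crit, lam0, 1, 0.0, lam, nu)
```

For power rather than size, the noncentrality ωt = δ²/(σ1²/N1 + σ2²/N2) replaces zero in the call.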
Corollary 2
For N independent samples of y ∼ Np(μ, Σ), the matrix A2, determined by Σ and the quantile rα as given by Kistner and Muller (2004), has one positive and p − 1 negative eigenvalues, {λA2,k}. With β as in the theorem and cj in terms of the positive weights {λqR} = {−λA2,2, …, −λA2,p}, the CDF of the sample estimate of Cronbach's α is
Pr{ρ̂α ≤ rα} = ∑_{j=0}^∞ cj FF[β(ν+ + 2j)/(λA2,1(N − 1)); N − 1, ν+ + 2j, 0],  with ν+ = (p − 1)(N − 1).  (3.12)
With intraclass correlation ρI such that ρα = p/[(1/ρI) + (p − 1)],
Pr{ρ̂I ≤ rI} = Pr{ρ̂α ≤ p/[(1/rI) + (p − 1)]},  which may be computed from Equation (3.12).  (3.13)
Proof: A sample covariance matrix is C = {cij}, while ρ̂I indicates an estimate, and rI a particular value of intraclass correlation. Here ρ̂I is
ρ̂I = [∑_{i≠j} cij / (p(p − 1))] / [∑_{i=1}^p cii / p],  (3.14)
and ρ̂α = p/[(1/ρ̂I) + (p − 1)] is the estimate of Cronbach's α. With (N − 1) > p and Σ known, the distribution function of ρ̂α exactly equals the distribution function of a weighted sum of independent central chi-square variables (Kistner and Muller 2004). The weights are {λA2,k}, the single positive and p − 1 negative eigenvalues of the matrix A2, which depends on Σ and rα. In turn, with independent Xk ∼ χ²(N − 1),
Pr{ρ̂α ≤ rα} = Pr{∑_{k=1}^p λA2,k Xk ≤ 0} = Pr{λA2,1X1 / ∑_{k=2}^p (−λA2,k)Xk ≤ 1}.  (3.15)
Applying the theorem to the last equation directly gives Equation (3.12), the desired result for ρ̂α. In turn, ρI is a one-to-one, increasing function of ρα. Hence Pr{ρ̂I ≤ rI} = Pr{ρ̂α ≤ p/[(1/rI) + (p − 1)]}.
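A corresponding usage sketch for Corollary 2, assuming the eigenvalues {λA2,k} (one positive, p − 1 negative) are already available for the quantile of interest from the construction in Kistner and Muller (2004), and again relying on the hypothetical ratio_cdf_series(); the reduction to Pr{R ≤ 1} follows Equation (3.15).

```python
import numpy as np

def cronbach_alpha_cdf(eigs_A2, N):
    """Pr{ rho_hat_alpha <= r_alpha }, given the eigenvalues {lambda_A2,k} of the
    matrix in Kistner and Muller (2004) evaluated at r_alpha (one positive, the
    rest negative); assumes ratio_cdf_series() from the earlier sketch."""
    eigs = np.sort(np.asarray(eigs_A2, dtype=float))[::-1]   # positive eigenvalue first
    lam0 = eigs[0]                                           # numerator weight
    lam = -eigs[1:]                                          # positive denominator weights
    nu = [N - 1] * len(lam)
    # Pr{ sum_k lambda_A2,k X_k <= 0 } = Pr{ lam0*X0 / sum(lam_k*X_k) <= 1 }
    return ratio_cdf_series(1.0, lam0, N - 1, 0.0, lam, nu)
```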
3.2 Error Bounds for the Expansion
Corollary 3
For a mixture representation, a uniform upper bound on truncation error is
Pr{R ≤ r} − ∑_{j=0}^J cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] ≤ 1 − ∑_{j=0}^J cj.  (3.16)
Proof: Robbins and Pitman (1949) provided the basis of the proof in Equation (9) and surrounding text of their article. A mixture has all nonnegative coefficients, which sum to 1, and FF[·] ≤ 1. Hence
Pr{R ≤ r} − ∑_{j=0}^J cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] = ∑_{j=J+1}^∞ cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] ≤ ∑_{j=J+1}^∞ cj = 1 − ∑_{j=0}^J cj.  (3.17)
Although Robbins and Pitman described series expansions for a ratio of quadratic forms, their approach to computing coefficients is very complicated. Ruben presented a convenient form for the computations. However, he considered a single quadratic form, not a ratio. We applied Ruben's method to compute the coefficients for the ratio of interest, and considered a bound on the error term of the mixture representation presented by Robbins and Pitman.
Corollary 4
For a mixture representation, a uniform lower bound on truncation error is
(1 − ∑_{j=0}^J cj) FF[J] ≤ Pr{R ≤ r} − ∑_{j=0}^J cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0],  (3.18)
with FF [j] = FF [rβ (ν+ + 2j + 2) / (λ0 ν0); ν0,ν+ + 2j + 2, ω0].
Proof: If ε > 0, then FF(f; ν1, ν2, ω0) ≤ FF(f; ν1, ν2 + ε, ω0) and FF(f; ν1, ν2, ω0) ≤ FF(f + ε; ν1, ν2, ω0). Hence
FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] ≥ FF[rβ(ν+ + 2J + 2)/(λ0ν0); ν0, ν+ + 2J + 2, ω0] = FF[J]  for all j ≥ J + 1.  (3.19)
In turn,
Pr{R ≤ r} − ∑_{j=0}^J cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] = ∑_{j=J+1}^∞ cj FF[rβ(ν+ + 2j)/(λ0ν0); ν0, ν+ + 2j, ω0] ≥ FF[J] ∑_{j=J+1}^∞ cj = (1 − ∑_{j=0}^J cj) FF[J].  (3.20)
As J increases, the lower and upper bounds grow closer together. Except for very small r, even moderate values of J give FF [J] near 1 and a narrow bounding interval. Any stopping rule should insure that the lower bound is less than the tolerable error.
Corollary 5
The probability Pr{R ≤ r} is an increasing function of λk for k ∈ {1, …, K}, for each {r, ν, ω0}. Assuming λk, for k ∈ {1, …, K}, are sorted from smallest to largest, stochastic bounds on the probability are
FF[rλ1ν+/(λ0ν0); ν0, ν+, ω0] ≤ Pr{R ≤ r} ≤ FF[rλKν+/(λ0ν0); ν0, ν+, ω0].  (3.21)
Proof: Here
λ1 ∑_{k=1}^K Xk ≤ ∑_{k=1}^K λkXk ≤ λK ∑_{k=1}^K Xk,  with ∑_{k=1}^K Xk ∼ χ²(ν+, 0),  (3.22)
and applying Lemma 1 to the ratio with all weights equal to λ1 or to λK gives the bounds.
Corollary 6
If ω0 = 0, a global bound can be specified using equation (3.18) in Ramirez and Jensen (1991). For J + 1 > (K − 2) /|log ε|, K* the smallest even integer with K* ≥K, , ε = max1≤i≤K (1 − β/λi), and V0 ∼ χ2 (K*), the error in approximating Pr {R ≤ r} by truncating at J terms is bounded by
(3.23) |
Proof: For the Cronbach α and intraclass correlation application, we are interested in a random variable of the form
(3.24) |
with corresponding realizations r and y. We seek to compute Pr {R ≤ r}, and bound the truncation error. Integrating equation (3.5) from Ramirez and Jensen (1991) gives
(3.25) |
In turn,
(3.26) |
and
(3.27) |
Differentiating with respect to r gives
(3.28) |
Therefore
(3.29) |
Corollary 7
If ω0 = 0, a local bound can be specified using equation (3.21) in Ramirez and Jensen (1991). For J + 1 > (ν0 − 1) /|log εt|, (ν0* + 1) equals the smallest even integer with (ν0* +1) ≥ (ν0 +1), , ε = max1≤i≤ (1 − β/λi), c = [A · (rβ)−(K−2)/2]/[β · Γ (K/2) · Γ[(ν0 − K + 1)/2] · (1 + 1/rβ)(ν0 + 1)/2], t = (1 + 1/rβ) / (rβ) and V0 ∼ χ2 (ν0* + 1). The error in approximating Pr {R ≤ r} by truncating at J terms is bounded by
(3.30) |
Proof: The proof parallels the proof of Corollary 6.
4. Approximations
4.1 A Single Noncentral Quadratic Form with Positive Weights
Existing approximations of positively weighted noncentral quadratic forms have a variety of shortcomings. They may lack accuracy in some cases (Muller and Barton 1989), or may not reduce to the exact noncentral form when all weights are equal (Imhof 1961). Also, Imhof's approximation may be improper, in the sense of assigning nonzero probability to negative values, which is undesirable when used in analytic manipulations. The approximation in Lemma 2 simultaneously achieves excellent accuracy, automatic reduction to special cases, and proper support in a single form. As noted earlier, no single approximation can always match three moments of a noncentral form.
Lemma 2
If Q = ∑_{k=1}^K λkXk, with λk > 0 and all mutually independent {Xk} for Xk ∼ χ²(νk, ωk), then strictly positive {λ*, ν*} and ω* ≥ 0 always exist such that Q* = λ*X* with X* ∼ χ²(ν*, ω*) ensures E(Q) = E(Q*) and E[Q − E(Q)]² = V(Q) = V(Q*). Hence
Pr{Q ≤ q} ≈ Pr{Q* ≤ q} = Fχ²(q/λ*; ν*, ω*).  (4.1)
The required parameters are
(4.2) |
(4.3) |
(4.4) |
with
(4.5) |
(4.6) |
(4.7) |
Proof: If Q* = λ*X* and X* ∼ χ²(ν*, ω*), then E[Q | maxk(ωk) = 0] = S1, E[Q | maxk(ωk) > 0] = S1 + S2, and V[Q | maxk(ωk) > 0] = 2S3 + 4S4. Solving for the parameters of Q* in terms of {Sm} completes the proof.
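For completeness, a sketch of the moment facts invoked in the proof, under the natural reading of S1, …, S4 (the particular solution for {λ*, ν*, ω*} is the one given by Equations (4.2)–(4.7) and is not repeated here):

```latex
% Standard noncentral chi-square moments, with the assumed labels
% S_1 = sum_k lambda_k nu_k,    S_2 = sum_k lambda_k omega_k,
% S_3 = sum_k lambda_k^2 nu_k,  S_4 = sum_k lambda_k^2 omega_k :
\[
E(Q) = \sum_{k} \lambda_k (\nu_k + \omega_k) = S_1 + S_2 , \qquad
V(Q) = 2 \sum_{k} \lambda_k^{2} (\nu_k + 2\omega_k) = 2 S_3 + 4 S_4 ,
\]
\[
E(Q^{*}) = \lambda^{*} (\nu^{*} + \omega^{*}) , \qquad
V(Q^{*}) = 2 \lambda^{*2} (\nu^{*} + 2\omega^{*}) .
\]
```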
4.2 The Ratio of a Noncentral to a Central Quadratic form with All Positive Weights
Theorem 2
If Q1 = ∑_{k=1}^{K1} λkXk and Q2 = ∑_{k=K1+1}^{K} λkXk, with all mutually independent {Xk}, λk > 0, Xk ∼ χ²(νk, ωk), and ωk = 0 for k > K1, then strictly positive {λ*1, λ*2, ν*1, ν*2} and ω*1 ≥ 0 always exist such that Q*1/λ*1 = X*1 ∼ χ²(ν*1, ω*1), Q*2/λ*2 = X*2 ∼ χ²(ν*2, 0), E(Q1) = E(Q*1), V(Q1) = V(Q*1), E(Q2) = E(Q*2), and V(Q2) = V(Q*2). In turn, with r* = (rλ*2/λ*1)(ν*2/ν*1),
Pr{Q1/Q2 ≤ r} ≈ FF(r*; ν*1, ν*2, ω*1).  (4.8)
Proof: Applying Lemma 2 separately to Q1 and Q2 gives the required {λ*1, λ*2, ν*1, ν*2, ω*1}. The last equation completes the proof.
The theorem guarantees that any ratio of a positively weighted noncentral quadratic form to a positively weighted central quadratic form may be approximated by a noncentral F. If all distributions are central, the approximation reduces to a Satterthwaite type. If either the numerator or the denominator has all weights equal, then the approximation automatically reduces to the correct result. The approximation applies to the Behrens-Fisher statistic, as well as to Cronbach's α and intraclass correlation for general covariance. For compound symmetric covariance, the approximation automatically reduces to the appropriate exact F.
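A minimal sketch of the Theorem 2 approximation for the fully central special case, in which Lemma 2 reduces to the Satterthwaite match sketched in Section 1 and all parameters are determined; the noncentral numerator case additionally requires the solution of Equations (4.2)–(4.7), which is not reproduced here.

```python
import numpy as np
from scipy.stats import f as f_dist

def satterthwaite(lam, nu):
    """Two-moment match of a central sum_k lam[k]*chi2(nu[k]) to lam_star*chi2(nu_star)."""
    lam = np.asarray(lam, dtype=float)
    nu = np.asarray(nu, dtype=float)
    mean = np.sum(lam * nu)
    var = 2.0 * np.sum(lam ** 2 * nu)
    return var / (2.0 * mean), 2.0 * mean ** 2 / var

def central_ratio_f_approx(r, lam1, nu1, lam2, nu2):
    """Approximate Pr{ Q1/Q2 <= r } for independent central Q1, Q2 by a central F CDF,
    following the r* transformation of Theorem 2 with omega*_1 = 0."""
    l1, n1 = satterthwaite(lam1, nu1)
    l2, n2 = satterthwaite(lam2, nu2)
    r_star = r * (l2 / l1) * (n2 / n1)
    return f_dist.cdf(r_star, n1, n2)
```

When either form has a single term, the match returns that term's own scale and degrees of freedom, so the approximation reduces to the exact central F, consistent with the reduction property noted above.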
5. Numerical Results
5.1 Computing Strategy
Two problems motivated the research. We illustrate the value of the new expansions and approximations both for the Behrens-Fisher problem and for intraclass correlation and Cronbach's α with general covariance.
We used Davies's algorithm and the series expansion to compute the distribution function. The series was terminated according to the error bounds, with β = λ1 (the smallest weight). A maximum error of 0.0001 was allowed for both algorithms. We also computed Imhof and F approximations, as well as stochastic bounds.
5.2 Behrens-Fisher Statistic
Moser, Stevens, and Watts (1989) expressed the CDF of the t statistic as a numerical integration. They tabled the CDF for a range of values, including ω ∈ {0, 1, 5, 10} and Nk ∈ {6, 51}. Table 1 contains the exact test size of the uncorrected t, as computed by the new series expansion and Davies's algorithm. All values were also checked against the results of Moser et al. (1989). Table 2 contains results for non-null cases. In all cases, the new series expansion provides the desired accuracy.
Table 1.
Exact and Approximate Uncorrected t (Behrens-Fisher) Test Size (ωt = 0). Sample sizes are N1 and N2 for the Two Groups.
| N1 | N2 | Variance ratio | Davies | Series | Imhof | ∼̇ F | Lower | Upper |
|---|---|---|---|---|---|---|---|---|
| 6 | 6 | 5 | 0.0593 | 0.0595 | 0.1018 | 0.0616 | 0.0165 | 0.2273 |
| 6 | 6 | 10 | 0.0653 | 0.0654 | 0.1133 | 0.0675 | 0.0132 | 0.3645 |
| 6 | 51 | 5 | 0.0007 | 0.0008 | <0.0001 | 0.0007 | 0.0004 | 0.0984 |
| 6 | 51 | 10 | 0.0001 | 0.0002 | <0.0001 | 0.0001 | <0.0001 | 0.1566 |
| 51 | 6 | 5 | 0.2819 | 0.2820 | 0.2849 | 0.2822 | 0.0409 | 0.3531 |
| 51 | 6 | 10 | 0.3801 | 0.3802 | 0.3774 | 0.3809 | 0.0398 | 0.5082 |
| 51 | 51 | 5 | 0.0512 | 0.0513 | 0.0564 | 0.0512 | 0.0119 | 0.2548 |
| 51 | 51 | 10 | 0.0518 | 0.0519 | 0.0578 | 0.0518 | 0.0087 | 0.3996 |
Table 2.
Exact and Approximate Uncorrected t (Behrens-Fisher) Power With . Sample Sizes are N1 And N2 for the Two Groups.
N1 | N2 | ωt | Davies | Series | Imhof | ∼̇ F | Lower | Upper |
---|---|---|---|---|---|---|---|---|
6 | 6 | 5 | 0.5367 | 0.5368 | 0.5368 | 0.5365 | 0.2829 | 0.9009 |
6 | 6 | 10 | 0.8082 | 0.8083 | 0.7981 | 0.8077 | 0.5792 | 0.9857 |
6 | 51 | 5 | 0.0270 | 0.0271 | 0.0263 | 0.0270 | 0.0178 | 0.7880 |
6 | 51 | 10 | 0.1416 | 0.1417 | 0.1443 | 0.1416 | 0.1061 | 0.9570 |
51 | 6 | 5 | 0.9101 | 0.9102 | 0.9098 | 0.9102 | 0.5544 | 0.9437 |
51 | 6 | 10 | 0.9879 | 0.9879 | 0.9830 | 0.9879 | 0.8519 | 0.9938 |
51 | 51 | 5 | 0.6012 | 0.6012 | 0.6081 | 0.6012 | 0.3355 | 0.9187 |
51 | 51 | 10 | 0.8785 | 0.8785 | 0.8806 | 0.8785 | 0.6862 | 0.9897 |
Tables 1 and 2 also include the F and Imhof approximations, as well as stochastic lower and upper bounds. For the examples, the new F approximation performs nearly identically to the exact series expansion in most cases.
5.3 Cronbach's α and Intraclass Correlation
Kistner and Muller (2004) computed the exact CDF of Cronbach's α with N = 10, p = 4, and known covariance. They tabled some CDF values, including cases with constant correlation (CS, compound symmetry) for ρ = 0.5, and AR(1) with ρ ∈ {0.2, 0.5, 0.8}. The variance pattern was either [1 1 1 1]′, [1 2 3 4]′, or [4 3 2 1]′. An AR(1) correlation structure can be expressed as
Corr(yi, yj) = ρ^{|i − j|},  i, j ∈ {1, …, p}.  (5.1)
Table 3 displays values of the exact CDF computed with the new series and Davies's algorithm, as well as approximations and stochastic bounds. Figure 1 illustrates the corresponding exact and approximate densities, with the exact and F approximation essentially indistinguishable. The density based on the Imhof approximation has a gap caused by missing values resulting from skewness values near zero for the quadratic form. The density in the figure obviously has positive skewness. However, the weights needed are eigenvalues of a matrix that varies with rα, and hence the moments of the approximating quadratic form also vary as a function of rα. Most awkwardly, the skewness changes from positive to zero to negative for rα ∈ (0.78, 0.80) in the example. Corresponding approximate degrees of freedom of ν* ≈ 10⁶ or greater pose a difficulty in computing the approximating chi-square probability.
Table 3.
Exact and Approximate CDF's of Estimated Cronbach's α for Pr{ρ̂α ≤ 0.70 } with Σ Known, N= 10.
| Correlation | ρ | Variances | Davies | Series | Imhof | ∼̇ F | Lower | Upper |
|---|---|---|---|---|---|---|---|---|
| CS | 0.5 |  | 0.2689 | 0.2689 | 0.2755 | 0.2689 | 0.2689 | 0.2689 |
| AR(1) | 0.5 |  | 0.5628 | 0.5627 | 0.5605 | 0.5631 | 0.2226 | 0.8427 |
| AR(1) | 0.2 |  | 0.9442 | 0.9442 | 0.9389 | 0.9440 | 0.8778 | 0.9777 |
| AR(1) | 0.8 |  | 0.0430 | 0.0429 | 0.0428 | 0.0429 | 0.0020 | 0.2171 |
| CS | 0.5 |  | 0.4697 | 0.4696 | 0.4722 | 0.4705 | 0.0097 | 0.8746 |
Figure 1.
Density for ρ̂α. Solid Line for Exact, Dashed Line for F Approximation, and Dotted Line for Imhof's Approximation for p = 4, CS, ρ = 0.5, and .
Table 4 shows the results of the exact CDF for a range of quantiles with N = 10, p = 3, AR(1), ρ = 0.5, as well as the F and Imhof approximations. Figure 2 displays corresponding exact and approximate densities, with the exact and F approximation essentially indistinguishable. The density based on the Imhof approximation is subject to the same skewness difficulty described above for rα ∈ (0.66, 0.67).
Figure 2.
Density for ρ̂α. Solid Line for Exact, Dashed Line for F Approximation, and Dotted Line for Imhof's Approximation for p = 3, AR(1) with ρ = 0.5, and .
For all cases in Tables 3 and 4, the new series expansion achieves the desired accuracy. In addition, the corresponding density calculations remain straightforward. The F approximation performs better than the Imhof approximation in all cases. The approximate density remains well-defined and simple to compute, with degrees of freedom parameters roughly 9 and 18.4 for rα ∈ (0.78, 0.80) in Figure 1, and 9 and 13.2 for rα ∈ (0.66, 0.67) in Figure 2.
Table 4.
Exact and Approximate CDF's of Estimated Cronbach's α, Pr{ρ̂α ≤ rα }, for a Range of Quantiles with Σ Known, AR(1), ρ = 0.5, , N= 10.
rα | Davies | Series | Imhof | ∼̇ F | Lower | Upper |
---|---|---|---|---|---|---|
0.10 | 0.0614 | 0.0613 | 0.0629 | 0.0614 | 0.0024 | 0.1932 |
0.20 | 0.0899 | 0.0898 | 0.0940 | 0.0900 | 0.0041 | 0.2605 |
0.30 | 0.1349 | 0.1349 | 0.1428 | 0.1353 | 0.0075 | 0.3529 |
0.40 | 0.2072 | 0.2072 | 0.2186 | 0.2079 | 0.0146 | 0.4766 |
0.50 | 0.3231 | 0.3231 | 0.3339 | 0.3242 | 0.0315 | 0.6317 |
0.60 | 0.5010 | 0.5010 | 0.5012 | 0.5020 | 0.0758 | 0.8008 |
0.70 | 0.7367 | 0.7368 | 0.7322 | 0.7361 | 0.2035 | 0.9372 |
0.80 | 0.9418 | 0.9418 | 0.9111 | 0.9391 | 0.5497 | 0.9942 |
0.90 | 0.9992 | 0.9992 | 0.9691 | 0.9989 | 0.9708 | 1 |
5.4 Additional Evaluations
The new scaled, noncentral chi-square and the Imhof approximations, as well as the Muller and Barton approximation, were compared to exact values for an additional wide range of quadratic forms. Cases included the sums of 2, 4, and 8 positively weighted chi-square variables for exact quantiles with probabilities in the range 0.00001–0.99999, in increments of 0.00001. Given the agreement with the displayed results, we provide only rough summaries for the sake of brevity. For positively weighted quadratic forms, the Imhof approximation performed noticeably better than the new approximation in the lower tail for all cases, and only slightly better in the upper tail in cases with only two weighted chi-square variables. Elsewhere, the new approximation performed almost identically to Imhof's. Both greatly surpassed the Muller and Barton approximation throughout the distribution for all cases.
5.5 Timing Evaluations
In order to provide stable timing estimates, we repeated the calculations for each table 20,000 times. Averaging results for Tables 1 and 2, the Imhof approximation was fastest, while the F approximation took approximately 1.5 times longer. In turn, Davies's algorithm took approximately 158.6 times longer, while the series took approximately 418.1 times longer. Averaging results for Tables 3 and 4, the Imhof approximation was fastest, while the F approximation took approximately 1.3 times longer. In turn, Davies's algorithm took approximately 52.9 times longer, while the series took approximately 70.2 times longer.
6. Conclusions
The simple form of the series lends itself to analytic manipulation in more complex forms. Such a problem occurs in integral expressions for test size and power of internal pilot designs (Coffey and Muller 1999). The expansion has the additional advantage of being expressible as a mixture.
The series is simple to program, and appears to be numerically stable and accurate. In contrast, Davies's (more general) algorithm requires substantially more code to implement. However, overall, Davies's algorithm is much faster, especially for noncentral cases. Naturally, when approximations provide sufficient accuracy, they are much faster than the series approach or Davies's algorithm.
As Davies noted in his original article, his algorithm has difficulties when the quadratic form is dominated by a single term. Although we did not present such cases, the series appears to work better in such situations (with either only one positive term or one negative term). The role of alternative β values may also be worth considering in the study of accuracy and speed.
Approximations for the noncentral case were examined in some detail. The new approximation seems by far the best for ratios with a noncentral component. The new approximation uses four parameters (two each for the numerator and denominator), with proper support for numerator and denominator separately, and hence for the ratio. Imhof's method uses three parameters, with an improper support. Hence it is not surprising that the F approximation performs better. In addition, Imhof's approximation can be computationally ill-behaved.
In the “univariate” approach to repeated measures ANOVA, the test statistic can be expressed as the ratio of a noncentral quadratic form to a central form, all with positive weights. Such a ratio includes the ratios studied here as special cases. We expect the approaches that worked well in the present setting can be extended to our ongoing research on power approximations for the “univariate” approach to repeated measures.
We emphasize the conclusion that no single method dominates in computing probabilities involving quadratic forms. Free SAS/IML® software implementing the series method, Davies's algorithm and the F approximation is available at http://www.bios.unc.edu/∼muller.
Acknowledgments
Muller's work was supported in part by NCI RO-1 CA095749-01A1, NCI PO1 CA47 982-04 and NIAID 9P30 AI 50410 and NIBIB EB000219. The authors gratefully acknowledge helpful comments from anonymous reviewers and an associate editor which stimulated us to both strengthen our results and simplify the presentation.
Contributor Information
Hae-Young Kim, Email: hkim@bios.unc.edu, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420.
Matthew J. Gribbin, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420.
Keith E. Muller, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420.
Douglas J. Taylor, Family Health International, Research Triangle Park, North Carolina 27709.
References
- Coffey CS, Muller KE. Exact Test Size and Power of a Gaussian Error Linear Model for an Internal Pilot Study. Statistics in Medicine. 1999;18:1199–1214. doi: 10.1002/(sici)1097-0258(19990530)18:10<1199::aid-sim124>3.0.co;2-0.
- Cronbach LJ. Coefficient Alpha and the Internal Structure of Tests. Psychometrika. 1951;16:297–334.
- Davies RB. Algorithm AS 155: The Distribution of a Linear Combination of χ² Random Variables. Applied Statistics. 1980;29:323–333.
- Imhof JP. Computing the Distribution of Quadratic Forms in Normal Variables. Biometrika. 1961;48:419–426.
- Johnson NL, Kotz S. Continuous Univariate Distributions—2. Boston: Houghton Mifflin; 1970.
- Johnson NL, Kotz S, Balakrishnan N. Continuous Univariate Distributions—1. 2nd ed. New York: Wiley; 1994.
- Kirk RE. Experimental Design: Procedures for the Behavioral Sciences. 3rd ed. Belmont, CA: Brooks Cole; 1995.
- Kistner EO, Muller KE. Exact Distributions of Cronbach's Alpha and Intraclass Correlation With Gaussian Data and General Covariance. Psychometrika. 2004;69:459–474. doi: 10.1007/BF02295646.
- Lieberman O. Saddlepoint Approximation for the Distribution of a Ratio of Quadratic Forms in Normal Variables. Journal of the American Statistical Association. 1994;89:924–928.
- Mathai AM, Provost SB. Quadratic Forms in Random Variables: Theory and Applications. New York: Marcel Dekker; 1992.
- Moser BK, Stevens GR, Watts CL. The Two-Sample t Test Versus Satterthwaite's Approximate F Test. Communications in Statistics: Theory and Methods. 1989;18:3963–3975.
- Muller KE, Barton CN. Approximate Power for Repeated Measures ANOVA Lacking Sphericity. Journal of the American Statistical Association. 1989;84:549–555. Correction (1991), Journal of the American Statistical Association, 86:255–256.
- Provost SB, Rudiuk EM. The Exact Distribution Function of the Ratio of Two Quadratic Forms in Noncentral Normal Variables. Metron. 1992;50:33–58.
- Ramirez DE, Jensen DR. Misspecified T² Tests. II. Series Expansions. Communications in Statistics, Part B—Simulation and Computation. 1991;20:97–108.
- Robbins H, Pitman EJG. Applications of the Method of Mixtures to Quadratic Forms in Normal Variates. Annals of Mathematical Statistics. 1949;20:552–560.
- Ruben H. Probability Content of Regions Under Spherical Normal Distributions, IV: The Distribution of Homogeneous and Non-Homogeneous Quadratic Functions of Normal Variables. Annals of Mathematical Statistics. 1962;33:542–570.
- Satterthwaite FE. An Approximate Distribution of Estimates of Variance Components. Biometrics Bulletin. 1946;2:110–114.
- Shively TS, Ansley CF, Kohn R. Fast Evaluation of the Distribution of the Durbin-Watson and Other Invariant Test Statistics in Time Series Regression. Journal of the American Statistical Association. 1990;85:676–685.
- Smith MD. Comparing Approximations to the Expectation of a Ratio of Quadratic Forms in Normal Variables. Econometric Reviews. 1996;15:81–95.