On the Expected Values of Sequences of Functions

Deborah H Glueck; Keith E Muller

doi:10.1081/STA-100002037

Author manuscript; available in PMC: 2013 Dec 16.

Published in final edited form as: Commun Stat Theory Methods. 2007 Feb 15;30(2):10.1081/STA-100002037. doi: 10.1081/STA-100002037

PMCID: PMC3864817 NIHMSID: NIHMS446107 PMID: 24353369

Abstract

We prove new extensions to lemmas about combinations of convergent sequences of distribution functions and absolutely continuous bounded functions. New lemma one, a generalized Helly theorem, allows computing the limit of the expected value of a sequence of functions with respect to a sequence of measures. Previously published results allow either the function or the measure to be a sequence, but not both. Lemma two allows computing the expected value of an absolutely continuous monotone function by integrating the probabilities of the inverse function values. Previous results were restricted to the identity function. Lemma three gives a computationally and analytically convenient form for the limit of the expected value of a sequence of functions of a sequence of random variables. This is a new result that follows directly from the first two lemmas. Although the lemmas resemble standard results and seem obviously true, we have found only similar looking and related but quite distinct results in the literature. We provide examples which highlight the value of the new results.

Keywords: Absolutely continuous, Inversion, Integrals

1. Introduction

Computing expectations, both analytically and numerically, remains one of the central problems in statistics. We present three new lemmas which aid both analytic and numerical calculation for some applications in small and large samples.

In particular, the first lemma allows calculation of the limits of expectations when both the function and the random variable are converging to limits. The second lemma suggests natural transformations for the computation of expectations, a common statistical task. We suggest transformations based on probability distribution functions and their inverses. The required numerical functions are widely available in common statistical programming languages. The choice automatically simplifies numerical computation of expectations by leading to the evaluation of bounded functions on bounded regions. Although a transformation is a standard numerical technique, the use of probability functions eliminates the guesswork involved in choosing the transformation, and provides an ideal match between statistical thinking and ease of computation. The third lemma combines the results and allows one to calculate an expectation when both the function and the random variable are converging to a limit.

The need for the results presented here arose from a desire to derive computable expressions for the power of certain hypothesis tests in multivariate regression with both fixed and random predictors. We provide examples of how we used the lemmas to numerically calculate small sample expectations, and to correctly find limiting results.

Calculating the expected value of variables and functions can be difficult, especially in the limit. Often the integrand, the limits of integration, or both are unbounded. In attempting to derive large sample properties, a sequence of cumulative distribution functions (CDFs) may converge to a point mass, while their support converges to a set of measure zero. The Riemann integral fails under these conditions. Thus the lemmas that follow must be stated in terms of Lebesgue integrals computed with respect to probability measures.

Although the lemmas resemble standard results and seem obviously true, we have found only similar looking and related but quite distinct results in the literature. All depend on various restrictive regularity conditions that often arise in practice. For example, for a sequence of cumulative distribution functions, {F_n}, Serfling (1, p16) proved that if F_n ⇒ F then for any bounded continuous function g,

$$\lim_{n \to \infty} \int g \, dF_n = \int g \, dF. \tag{1}$$

Pratt (2, p74) and Loève (3, p126) gave results similar to Equation 1. We give a more general result in which both the integrand and the measure are converging. Gibbons and Chakraborti (4, p37–38) mentioned a quantile transform result, without proof. They define the quantile function using the infimum. Our Corollary 3 is a special case of their result. We illustrate the value of the new results with three examples concerning the power of certain multivariate tests.

2. Three Lemmas

Lemma 1

Consider a random variable, X, and a sequence of random variables, {X₁, X₂, …}, with corresponding CDFs F and {F₁, F₂, …}. Suppose F_n converges to F, and thus X_n converges in distribution to X. Let {g₁(x), g₂(x), …} be a sequence of continuous bounded functions such that g_n(x) converges uniformly to g(x), a continuous bounded function. Assume that ∀n, ∫ g_n dF_n < ∞ and ∫ g dF < ∞. Taking the integrals with respect to the Lebesgue-Stieltjes probability measures induced by F and {F₁, F₂, …} (5, p69),

$$\lim_{n \to \infty} \int g_n \, dF_n = \int g \, dF. \tag{2}$$

The special case of ∫ g dF_n corresponds to a Helly theorem (5, p192–194). Also, the results of exercise 9-2 in Burrill (5, p195) indicate that less stringent regularity conditions would be hard to find for the special case of ∫ g_n dF.

Proof. It suffices to show that ∀∊ > 0, ∃ M(∊) > 0 such that for n > M(∊),

$$\left| \int g_n \, dF_n - \int g \, dF \right| < \epsilon. \tag{3}$$

By assumption, g_n(x) converges uniformly to g(x). Thus, ∀∊ > 0, there exists a corresponding number M₁(∊) > 0 such that for all n > M₁ and for all x in the domain of g, |g_n(x) − g(x)| < ∊/2 (6, p530). Thus, for n > M₁(∊),

$$\left| \int g_n \, dF_n - \int g \, dF_n \right| = \left| \int (g_n - g) \, dF_n \right| \leq \int |g_n - g| \, dF_n < \int \frac{\epsilon}{2} \, dF_n \leq \frac{\epsilon}{2} \int dF_n \leq \frac{\epsilon}{2}. \tag{4}$$

The last step follows because F_n is a cumulative probability distribution function, and hence ∀n, ∫ dF_n = 1. By part 3 of a theorem in Serfling (1, p16), ∀∊ > 0, we may conclude that ∃ M₂(∊) > 0 such that for n > M₂(∊),

$$\left| \int g \, dF_n - \int g \, dF \right| < \frac{\epsilon}{2}. \tag{5}$$

Now, ∀∊ > 0 and ∀x in the domain of g, choose n > max[M₁(∊), M₂(∊)]. Then

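$$\left| \int g_n \, dF_n - \int g \, dF \right| \leq \left| \int g_n \, dF_n - \int g \, dF_n \right| + \left| \int g \, dF_n - \int g \, dF \right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \tag{6}$$

with the first inequality following from the triangle inequality and the second from the bounds on |∫ g_n dF_n − ∫ g dF_n| and |∫ g dF_n − ∫ g dF| established above.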
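The convergence in Lemma 1 can be checked numerically. The sketch below is only an illustration under assumed choices that do not come from the paper: g(x) = 1/(1 + e^x), g_n(x) = g(x) + sin(x)/n (so that |g_n − g| ≤ 1/n uniformly), and X_n normal with mean 1/n and standard deviation 1 + 1/n, which converges in distribution to a standard normal. The integral is approximated by a midpoint rule on the probability scale using `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def g(x):
    """Limit function: bounded, continuous, strictly decreasing (illustrative choice)."""
    return 1.0 / (1.0 + math.exp(x))


def g_n(x, n):
    """Perturbed function with |g_n(x) - g(x)| <= 1/n for every x (uniform convergence)."""
    return g(x) + math.sin(x) / n


def expect_gn_xn(n, m=4000):
    """Midpoint-rule approximation of the integral of g_n with respect to F_n.

    X_n ~ Normal(mean 1/n, sd 1 + 1/n), so F_n converges to the standard normal CDF.
    The integral is evaluated on the probability scale, p = F_n(x).
    """
    dist = NormalDist(mu=1.0 / n, sigma=1.0 + 1.0 / n)
    total = 0.0
    for i in range(m):
        p = (i + 0.5) / m
        total += g_n(dist.inv_cdf(p), n)
    return total / m


# For X ~ N(0, 1), E[g(X)] = 1/2 because g(x) + g(-x) = 1 and X is symmetric.
LIMIT = 0.5

for n in (1, 10, 100, 1000, 10000):
    value = expect_gn_xn(n)
    print(n, value, abs(value - LIMIT))
```

The printed gap between the approximation and 1/2 should shrink as n grows, which is the behavior Equation 2 describes for this particular pair of sequences.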

Lemma 2

(Corollary to Lemma 2.1, 7, p243) Let X be a continuous random variable with density f_X(x) and distribution function F_X(x). Let g(x) be a real valued absolutely continuous function that is strictly monotone decreasing in x, so that g(x) > y iff x < g⁻¹(y). Let

$$A = \{y : y \geq 0\} \cap \{y : P\{g(X) > y\} > 0\} \quad \text{and} \quad \mathcal{B} = \{y : y > 0\} \cap \{y : P\{g(X) < -y\} > 0\}.$$

Then

$$\mathcal{E}[g(X)] = \int_{A} F_X[g^{-1}(y)] \, dy - \int_{\mathcal{B}} \{1 - F_X[g^{-1}(-y)]\} \, dy. \tag{7}$$

Proof. Note that

$$\begin{aligned} \mathcal{E}[g(X)] &= \int_{A} P\{g(X) > y\} \, dy - \int_{\mathcal{B}} P\{g(X) < -y\} \, dy \\ &= \int_{A} P\{X < g^{-1}(y)\} \, dy - \int_{\mathcal{B}} P\{X > g^{-1}(-y)\} \, dy \\ &= \int_{A} P\{X < g^{-1}(y)\} \, dy - \int_{\mathcal{B}} \left(1 - P\{X < g^{-1}(-y)\}\right) dy. \end{aligned} \tag{8}$$

The result follows because, for continuous X, P{X < x} = F_X(x).
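As a rough numerical check of Lemma 2, consider an assumed example that is not taken from the paper: X exponential with rate 2 and g(x) = e^{−x}. Here g(X) is always positive, so ℬ is empty and A = [0, 1), while g⁻¹(y) = −log y and F_X[g⁻¹(y)] = 1 − y². The helper `midpoint` is a hypothetical name for a plain composite midpoint rule.

```python
import math


def midpoint(func, lo, hi, m=20000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(func(lo + (i + 0.5) * h) for i in range(m))


RATE = 2.0  # X ~ Exponential(rate = 2), an illustrative assumption


def cdf_x(x):
    """F_X(x) for the exponential distribution."""
    return 1.0 - math.exp(-RATE * x) if x > 0 else 0.0


def g_inv(y):
    """Inverse of g(x) = exp(-x), defined for 0 < y <= 1."""
    return -math.log(y)


# g(X) > 0 always, so the set B is empty and A = [0, 1).
lemma2_value = midpoint(lambda y: cdf_x(g_inv(y)), 0.0, 1.0)

# Direct calculation: E[exp(-X)] = rate / (rate + 1) = 2/3.
exact = RATE / (RATE + 1.0)
print(lemma2_value, exact)
```

The integrand on the right of Equation 7 is a CDF value, so it stays in [0, 1] over a bounded range of y, which is what makes the numerical integration easy.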

Corollary 2.1

With the same conditions as in Lemma 2, and for b > a > 0, suppose ∀x ∈ ℛ, g(x) ∈ [a, b]. Then

$$\mathcal{E}[g(X)] = \int_{a}^{b} F_X[g^{-1}(y)] \, dy. \tag{9}$$

Corollary 2.2

With the same conditions as in Lemma 2, consider instead h(x), a real valued absolutely continuous function that is strictly monotone increasing in x, so that h(x) > y iff x > h⁻¹(y). Then

$$\mathcal{E}[h(X)] = \int_{A} \{1 - F_X[h^{-1}(y)]\} \, dy - \int_{\mathcal{B}} F_X[h^{-1}(-y)] \, dy. \tag{10}$$
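A small sketch of the increasing case, again under assumptions chosen only for illustration: X uniform on (−1, 2) and h(x) = x³, so that h(X) takes both signs, with A = [0, 8) and ℬ = (0, 1). The direct value is E[X³] = 15/12 = 1.25, and the two integrals in Equation 10 should reproduce it.

```python
import math


def midpoint(func, lo, hi, m=50000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(func(lo + (i + 0.5) * h) for i in range(m))


def cdf_x(x):
    """F_X(x) for X ~ Uniform(-1, 2), an illustrative assumption."""
    if x <= -1.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (x + 1.0) / 3.0


def h_inv(y):
    """Inverse of h(x) = x**3: the real cube root, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


# h(X) ranges over (-1, 8), so A = [0, 8) and B = (0, 1).
positive_part = midpoint(lambda y: 1.0 - cdf_x(h_inv(y)), 0.0, 8.0)
negative_part = midpoint(lambda y: cdf_x(h_inv(-y)), 0.0, 1.0)
cor22_value = positive_part - negative_part

# Direct calculation: E[X^3] = (2^4 - (-1)^4) / (4 * 3) = 1.25.
exact = (2.0 ** 4 - (-1.0) ** 4) / (4.0 * 3.0)
print(cor22_value, exact)
```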

Lemma 3

Consider the continuous random variable, X, and the sequence of continuous random variables, {X₁, X₂, …}, with the same assumptions as Lemma 1. Let g(x) and the sequence {g₁(x), g₂(x), …} be real valued absolutely continuous bounded functions that are strictly monotone decreasing in x, so that g_n(x) > y iff $x < g_n^{-1}(y)$. Suppose the sequence {g₁(x), g₂(x), …} converges uniformly to g(x). Assume that ∫ g dF < ∞ and ∀n, ∫ g_n dF_n < ∞. For b > a > 0, suppose ∀x ∈ ℛ, g(x) ∈ [a, b]. Then

$$\lim_{n \to \infty} \mathcal{E}[g_n(X_n)] = \int_{a}^{b} F_X[g^{-1}(y)] \, dy. \tag{11}$$

Proof. Follows directly from Lemma 1 and Lemma 2.
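To make the statement concrete, the sketch below compares both sides of Equation 11 for an assumed pair of sequences that is not from the paper: X_n exponential with rate 2 + 1/n and g_n(x) = (1 + 1/n)e^{−x} on x ≥ 0, converging uniformly to g(x) = e^{−x}. In this illustration the range of g is (0, 1], so the integral runs from 0 to 1, as in the Example for Corollary 2.1 where the lower end of the range is 0. The left side is available in closed form, so only the right side needs numerical integration.

```python
import math


def midpoint(func, lo, hi, m=20000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(func(lo + (i + 0.5) * h) for i in range(m))


def expect_gn_xn(n):
    """Closed form of E[g_n(X_n)] for g_n(x) = (1 + 1/n) exp(-x), X_n ~ Exponential(2 + 1/n)."""
    rate = 2.0 + 1.0 / n
    return (1.0 + 1.0 / n) * rate / (rate + 1.0)


def cdf_limit(x):
    """F_X(x) for the limiting variable X ~ Exponential(2)."""
    return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0


def g_inv(y):
    """Inverse of the limit function g(x) = exp(-x), for 0 < y <= 1."""
    return -math.log(y)


# Right side of Equation 11 for this illustration (range of g taken as (0, 1]).
limit_integral = midpoint(lambda y: cdf_limit(g_inv(y)), 0.0, 1.0)

for n in (1, 10, 100, 1000):
    print(n, expect_gn_xn(n), limit_integral)
```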

Corollary 3

(Quantile transformation: see 4, §2.5, p37–38) Consider a real valued random variable X with strictly monotone distribution function F_X(x) and density function f_X(x), defined on the interval (a, b), with −∞ < a < b < ∞. Let y = F_X(x). Then

$$\mathcal{E}(X) = \int_{0}^{1} F_X^{-1}(y) \, dy. \tag{12}$$

When resorting to numerical techniques for calculating expectations, either the density or the region of integration, or both, may be unbounded. This transformation reduces the problem to an integral of a bounded function over a bounded interval.
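A minimal illustration of Equation 12, assuming a distribution chosen only for convenience: F_X(x) = x² on (0, 1), so that F_X⁻¹(y) = √y and the mean is 2/3. The helper name `midpoint` is hypothetical.

```python
import math


def midpoint(func, lo, hi, m=20000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(func(lo + (i + 0.5) * h) for i in range(m))


def quantile(y):
    """F_X^{-1}(y) for the assumed CDF F_X(x) = x**2 on (0, 1)."""
    return math.sqrt(y)


# Corollary 3: E(X) is the integral of the quantile function over (0, 1).
mean_via_quantiles = midpoint(quantile, 0.0, 1.0)

# Direct calculation: integral of x * f_X(x) = x * 2x over (0, 1) is 2/3.
exact = 2.0 / 3.0
print(mean_via_quantiles, exact)
```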

3. Examples

Example for Lemma 1

Glueck (8) considered taking the limit, under a sequence of Pitman local alternatives, of an approximation for the power of the Hotelling-Lawley trace statistic, with Gaussian predictors. The asymptotic power can be written as the expected value of a noncentral F, with respect to the distribution of a random noncentrality value. In this setting, the Riemann integral is undefined because the support for the random noncentrality parameter converges to a set of measure zero as the parameter converges to a point. Let $F(\nu_1, \nu_{2N}, \omega_N)$ indicate a noncentral F random variable, with denominator degrees of freedom and noncentrality depending on N. Suppose

$$g_N = \Pr\{F(\nu_1, \nu_{2N}, \omega_N) \leq f_N\}. \tag{13}$$

Consider integrating g_N with respect to F_N, the distribution function of a sum of independent scaled χ² random variables for which the scaling constants depend on ω_N, and the degrees of freedom depend on N. With α the type 1 error rate, c_crit chosen so that $F_{\chi^2}(c_{\mathrm{crit}}; ab) = 1 - \alpha$, and $\lim_{N \to \infty} \omega_N = \omega_L$,

$$\lim_{N \to \infty} \int g_N \, dF_N = 1 - F_{\chi^2}(c_{\mathrm{crit}}; \nu_1, \omega_L). \tag{14}$$

Example for Corollary 2.1

Glueck (8) sought a computational form for the small sample power of the Hotelling-Lawley trace statistic, with Gaussian predictors. Suppose F_ω(w) is the distribution function of ω, a sum of independent scaled χ² random variables. Define

$$g(\omega) = \Pr\{F(\nu_1, \nu_2, \omega) < f\}, \tag{15}$$

with ω ∈ [0, ∞) and f chosen so that g(ω) ∈ [0, 1 − α]. Then

$$\mathcal{E}[g(\omega)] = \int_{0}^{\infty} g(\omega) \, dF_{\omega} = \int_{0}^{1 - \alpha} F_{\omega}[F^{-1}(f; \nu_1, \nu_2, y)] \, dy. \tag{16}$$

Example for Corollary 3

Muller and Pasour (9) defined

$$g(t) = F_{\chi^2}\left(\frac{\nu_{1s}}{\nu_{2s}} f_R t; \nu_{1s}, \omega_s\right) - F_{\chi^2}\left(\frac{\nu_{1s}}{\nu_{2s}} f_L t; \nu_{1s}, \omega_s\right), \tag{17}$$

and considered integrating

$$F_V(z) = \pi_s^{-1} \int_{0}^{z_*} g(t) f_{\chi^2}(t; \nu_{2s}) \, dt. \tag{18}$$

They used a particular quantile transformation to produce a much better behaved numerical integral. If $p = F_{\chi^2}(t; \nu_{2s})$, then $t = F_{\chi^2}^{-1}(p; \nu_{2s})$ and $dp = f_{\chi^2}(t; \nu_{2s}) \, dt$. If $p_0 = F_{\chi^2}(z\nu_{2s}/\sigma^2; \nu_{2s})$ then

$$F_V(z) = \pi_s^{-1} \int_{0}^{p_0} g_R[F_{\chi^2}^{-1}(p; \nu_{2s})] \, dp - \pi_s^{-1} \int_{0}^{p_0} g_L[F_{\chi^2}^{-1}(p; \nu_{2s})] \, dp. \tag{19}$$
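The change of variable p = F(t) used above can be sketched in a simplified setting. The code below is not the Muller and Pasour calculation: it assumes a central χ² with 2 degrees of freedom (whose CDF and quantile function have closed forms) and a hypothetical bounded integrand g(t) = e^{−t}, purely to show that the original integral against the density and the transformed integral over [0, p₀] agree.

```python
import math


def midpoint(func, lo, hi, m=20000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(func(lo + (i + 0.5) * h) for i in range(m))


def g(t):
    """Hypothetical bounded integrand standing in for g(t) in Equation 17."""
    return math.exp(-t)


def chi2_pdf_2df(t):
    """Density of a central chi-square with 2 degrees of freedom."""
    return 0.5 * math.exp(-t / 2.0)


def chi2_cdf_2df(t):
    """CDF of a central chi-square with 2 degrees of freedom."""
    return 1.0 - math.exp(-t / 2.0)


def chi2_inv_2df(p):
    """Quantile function of a central chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)


z = 5.0  # illustrative upper limit of integration

# Original form: integrate g(t) against the density over [0, z].
original = midpoint(lambda t: g(t) * chi2_pdf_2df(t), 0.0, z)

# Transformed form: substitute p = F(t), so dp = f(t) dt and the limits become [0, p0].
p0 = chi2_cdf_2df(z)
transformed = midpoint(lambda p: g(chi2_inv_2df(p)), 0.0, p0)

# Closed form for this illustration: (1 - exp(-3z/2)) / 3.
exact = (1.0 - math.exp(-1.5 * z)) / 3.0
print(original, transformed, exact)
```

In this toy case the transformed integrand is simply (1 − p)², a polynomial on a bounded interval, which hints at why the transformed integral tends to be better behaved numerically.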

Acknowledgments

The authors gratefully acknowledge the help of Drs. G. G. Koch, R. M. Hamer, L. M. LaVange, D. F. Ransohoff and P. W. Stewart. In addition, an anonymous Associate Editor stimulated us to clarify the motivation of the paper, provide more compelling examples, and improve the readability of the paper. Also, an anonymous Associate Editor brought the work of Gibbons and Chakraborti to our attention. Glueck was supported in part by grant number T32HS000589-04 from the Agency for Health Care Policy and Research to the University of Medicine and Dentistry of New Jersey, and by National Cancer Institute Grant IP30CA46934-01 to the University of Colorado. Muller's work was supported in part by NCI program project grant P01 CA47 982-04.

References

1. Serfling RJ. Approximation Theorems of Mathematical Statistics. John Wiley and Sons; New York: 1980.
2. Pratt JW. On interchanging limits and integrals. Annals of Mathematical Statistics. 1960;31:74–77.
3. Loève M. Probability Theory II. 4th. Springer-Verlag; New York: 1977.
4. Gibbons JD, Chakraborti S. Nonparametric Statistical Inference. 3rd. Marcel Dekker; New York: 1992.
5. Burrill CW. Measure, Integration and Probability. McGraw-Hill; New York: 1972.
6. Courant R, John F. Introduction to Calculus and Analysis, Volume One. Interscience Publishers; New York: 1965.
7. Ross S. A First Course in Probability. 2nd. Macmillan; New York: 1984.
8. Glueck DH. Power for a Generalization of the General Linear Multivariate Model with Fixed and Random Predictors, Mimeo Series No 2158T. Institute of Statistics; Chapel Hill, North Carolina: 1996.
9. Muller KE, Pasour VB. Bias in linear model power and sample size due to estimating variance. Communications in Statistics: Theory and Methods. 1997;26:839–851. doi: 10.1080/03610929708831953.

