Published in final edited form as: Proc Am Math Soc. 2010;138(12):4331–4344. doi: 10.1090/S0002-9939-2010-10448-3

How many Laplace transforms of probability measures are there?

Fuchang Gao, Wenbo V. Li, Jon A. Wellner

Abstract

A bracketing metric entropy bound for the class of Laplace transforms of probability measures on [0, ∞) is obtained through its connection with the small deviation probability of a smooth Gaussian process. Our results for the particular smooth Gaussian process seem to be of independent interest.

Keywords: Laplace Transform, bracketing metric entropy, completely monotone functions, smooth Gaussian process, small deviation probability

1 Introduction

Let μ be a finite measure on [0, ∞). The Laplace transform of μ is a function on (0, ∞) defined by

\[
f(t)=\int_0^\infty e^{-ty}\,\mu(dy). \tag{1}
\]

It is easy to check that such a function has the property that (−1)^n f^{(n)}(t) ≥ 0 for all non-negative integers n and all t > 0. A function on (0, ∞) with this property is called a completely monotone function on (0, ∞). A characterization due to Bernstein (cf. Williamson (1956)) says that f is completely monotone on (0, ∞) if and only if there is a non-negative measure μ (not necessarily finite) on [0, ∞) such that (1) holds. Therefore, due to monotonicity, the class of Laplace transforms of finite measures on [0, ∞) is the same as the class of bounded completely monotone functions on (0, ∞). These functions can be extended to continuous functions on [0, ∞), and we will call them completely monotone on [0, ∞).
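
As a simple illustration, take μ in (1) to be the standard exponential distribution; then
\[
f(t)=\int_0^\infty e^{-ty}e^{-y}\,dy=\frac{1}{1+t},\qquad (-1)^n f^{(n)}(t)=\frac{n!}{(1+t)^{n+1}}\ge 0,
\]
so f is completely monotone on (0, ∞) and extends continuously to [0, ∞) with f(0) = 1.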

Completely monotonic functions have remarkable applications in various fields, such as probability and statistics, physics and potential theory. The main properties of these functions are given in Widder (1941), Chapter IV. For example, the class of completely monotonic functions is closed under sums, products and pointwise convergence. We refer to Alzer and Berg (2002) for a detailed list of references on completely monotonic functions. Closely related to the class of completely monotonic functions are the so-called k-monotone functions, where the non-negativity of (−1)^n f^{(n)} is required for all integers n ≤ k. In fact, completely monotonic functions can be viewed as the limiting case of the k-monotone functions as k → ∞. In this sense, the present work is a partial extension of Gao (2008) and Gao and Wellner (2009).

Let ℳ be the class of completely monotone functions on [0, ∞) that are bounded by 1. Then

\[
\mathcal{M}=\Big\{f:[0,\infty)\to[0,\infty)\ \Big|\ f(t)=\int_0^\infty e^{-tx}\,\mu(dx),\ \|\mu\|\le 1\Big\}.
\]

It is well known (see e.g. Feller (1971), Theorem 1, page 439) that the sub-class of ℳ with f(0) = 1 corresponds exactly to the Laplace transforms of the class of probability measures μ on [0, ∞). For a random variable X with distribution function F(t) = P(X ≤ t), define the survival function S(t) = 1 − F(t) = P(X > t). Thus the class

\[
\mathcal{S}=\Big\{S:[0,\infty)\to[0,\infty)\ \Big|\ S(t)=\int_0^\infty e^{-tx}\,\mu(dx),\ \|\mu\|=1\Big\}
\]

is exactly the class of survival functions of all scale mixtures of the standard exponential distribution (with survival function e^{−t}), with corresponding densities

\[
p(t)=-S'(t)=\int_0^\infty x e^{-xt}\,\mu(dx),\qquad t\ge 0.
\]

It is easily seen that the class P of such densities with p(0) < ∞ is also a class of completely monotone functions corresponding to probability measures μ on [0, ∞) with finite first moment. These classes have many applications in statistics; see e.g. Jewell (1982) for a brief survey. Jewell (1982) considered nonparametric estimation of a completely monotone density and showed that the nonparametric maximum likelihood estimator (or MLE) for this class is almost surely consistent. The bracketing entropy bounds derived below can be considered as a first step toward global rates of convergence of the MLE.
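
In particular, the restriction p(0) < ∞ in the definition of P is exactly a first-moment condition on the mixing measure, since the display above gives
\[
p(0)=\int_0^\infty x\,\mu(dx).
\]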

In probability and statistical applications, one way to understand the complexity of a function class is by way of its metric entropy under certain common distances. Recall that the metric entropy of a function class ℱ under a distance ρ is defined to be log N(ε, ℱ, ρ), where N(ε, ℱ, ρ) is the minimum number of open balls of radius ε needed to cover ℱ. In statistical applications, bracketing metric entropy is sometimes needed. Recall that the bracketing entropy is defined as log N_[ ](ε, ℱ, ρ), where

\[
N_{[\,]}(\varepsilon,\mathcal{F},\rho):=\min\Big\{n:\exists\,\underline{f}_1,\overline{f}_1,\dots,\underline{f}_n,\overline{f}_n\ \text{s.t.}\ \rho(\overline{f}_k,\underline{f}_k)\le\varepsilon,\ \mathcal{F}\subset\bigcup_{k=1}^n[\underline{f}_k,\overline{f}_k]\Big\},
\]

and

\[
[\underline{f}_k,\overline{f}_k]=\{g\in\mathcal{F}:\underline{f}_k\le g\le\overline{f}_k\}.
\]

Clearly N(ε, ℱ, ρ) ≤ N_[ ](ε, ℱ, ρ), and the two quantities are closely related in our setting below.
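
Indeed, for an L_p-type distance ρ, any g in a bracket satisfies |g − f̲_k| ≤ f̄_k − f̲_k pointwise, so
\[
\rho(g,\underline{f}_k)\le\rho(\overline{f}_k,\underline{f}_k)\le\varepsilon,
\]
and every ε-bracket is contained in a ball of radius ε centered at its lower endpoint; this gives the stated comparison.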

In this paper, we study the metric entropy of ℳ under the L_p(ν) norm given by

\[
\|f\|_{L_p(\nu)}^p=\int_0^\infty|f(x)|^p\,\nu(dx),\qquad 1\le p<\infty,
\]

where ν is a probability measure on [0, ∞). Our main result is the following

Theorem 1.1

  (i) Let ν be a probability measure on [0, ∞). There exists a constant C depending only on p ≥ 1 such that for any 0 < ε < 1/4,
    \[
    \log N_{[\,]}(\varepsilon,\mathcal{M},L_p(\nu))\le C\log(\Gamma/\gamma)\,|\log\varepsilon|^2
    \]
    for any 0 < γ < Γ < ∞ such that ν([γ, Γ]) ≥ 1 − 4^{−p}ε^p. In particular, if there exists a constant K > 1 such that ν([ε^K, ε^{−K}]) ≥ 1 − 4^{−p}ε^p, then
    \[
    \log N_{[\,]}(\varepsilon,\mathcal{M},L_p(\nu))\le CK|\log\varepsilon|^3.
    \]
  (ii) If ν is Lebesgue measure on [0, 1], then
    \[
    \log N_{[\,]}(\varepsilon,\mathcal{M},L_2(\nu))\asymp\log N(\varepsilon,\mathcal{M},L_2(\nu))\asymp|\log\varepsilon|^3,
    \]
    where A ≍ B means that there exist universal constants C₁, C₂ > 0 such that C₁B ≤ A ≤ C₂B.

As an equivalent result for part (ii) of the above theorem, we have the following important small deviation probability estimates for an associated smooth Gaussian process. In particular, it may be of interest to find a probabilistic proof for the lower bound directly.

Theorem 1.2

Let Y(t), t > 0, be a Gaussian process with covariance E Y(t)Y(s) = (1 − e^{−t−s})/(t + s). Then for 0 < ε < 1,

\[
-\log\mathbb{P}\Big(\sup_{t>0}|Y(t)|<\varepsilon\Big)\asymp|\log\varepsilon|^3.
\]

The rest of the paper is organized as follows. In Section 2, we provide the upper bound estimate in the main result by explicit construction. In Section 3, we summarize various connections between entropy numbers of a set (and its convex hull) and small ball probabilities for the associated Gaussian process. Some of our observations in a general setting are stated explicitly for the first time. Finally, we identify the particular Gaussian process suited to our entropy estimates. Then in Section 4, we obtain the required upper bound small ball probability estimate (which implies the lower bound entropy estimate, as discussed in Section 3) by a simple determinant estimate. This method of small ball estimates is made explicit here for the first time and can be used in many more problems. The technical determinant estimates are also of independent interest.

2 Upper Bound Estimate

In this section, we provide an upper bound for N_[ ](ε, ℳ, ∥·∥_{L_p(ν)}), where ν is a probability measure on [0, ∞) and 1 ≤ p ≤ ∞. Before we start, we note that ℳ is the convex hull of 𝒦 := {K(t, ·) : t ∈ [0, ∞)}, where for each t ∈ [0, ∞), K(t, ·) is a function on [0, ∞) defined by K(t, x) = e^{−tx}. There are some general results on the metric entropy of convex hulls conv(T) using the metric entropy of T (cf. Dudley (1987), Ball and Pajor (1990), van der Vaart and Wellner (1996), Carl (1997), Carl et al. (1999), Li and Linde (2000), Gao (2004), etc.). For example, Carl et al. (1999) proved that if N(ε, T, ∥·∥) = O(ε^{−α}), α > 0, then

\[
\log N(\varepsilon,\mathrm{conv}(T),\|\cdot\|)=O\big(\varepsilon^{-2\alpha/(2+\alpha)}\big),
\]

where ∥·∥ is any Banach space norm. Although these results are best possible in the general case, when applied to specific problems they can be far from sharp. This is especially the case when the metric entropy of T grows at a polynomial rate. For example, in our case, because the functions e^{−kx}, k = 1, 2, …, n, have mutual L₂[0, 1]-distance at least of the order n^{−3/2} (a short verification is sketched below), we immediately have N(ε, 𝒦, ∥·∥_{L₂[0,1]}) ≥ cε^{−2/3} for some constant c > 0. Thus, in the case p = 2 and with ν taken to be Lebesgue measure on [0, 1], the best upper bound we can hope to obtain from the general convex hull result quoted above is

\[
\log N(\varepsilon,\mathcal{M},\|\cdot\|_{L_2(\nu)})\le C_2\varepsilon^{-1/2},
\]

which is much larger (at least in the dependence on ε) than the upper bound

\[
\log N(\varepsilon,\mathcal{M},\|\cdot\|_{L_2(\nu)})\le C|\log\varepsilon|^3
\]

which we will obtain later in the section.
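
To justify the polynomial rate claimed above, here is a quick sketch (with no attempt at optimal constants): for 1 ≤ j < k ≤ n, using 1 − e^{−x} ≥ x/2 on [0, 1],
\[
\|e^{-jx}-e^{-kx}\|_{L_2[0,1]}^2\ \ge\ \int_0^1 e^{-2kx}\big(1-e^{-x}\big)^2\,dx\ \ge\ \frac{1}{4}\int_0^{1/(2k)}x^2e^{-1}\,dx\ =\ \frac{1}{96\,e\,k^3},
\]
so the n functions e^{−kx}, k ≤ n, are mutually separated by a constant multiple of n^{−3/2} in L₂[0, 1]; taking n of the order ε^{−2/3} then yields N(ε, 𝒦, ∥·∥_{L₂[0,1]}) ≥ cε^{−2/3}.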

We will obtain our upper bound estimate by an explicit construction of ε-brackets under Lp(ν) distance.

For each 0 < ε < 1/4, we choose γ > 0 and Γ = 2^m γ, where m is a positive integer, such that ν([γ, Γ]) ≥ 1 − 4^{−p}ε^p. We use the notation I(a ≤ t < b) to denote the indicator function of the interval [a, b). Now, for each f ∈ ℳ, we first write f in block form

\[
f(t)=I(0\le t<\gamma)f(t)+I(t\ge\Gamma)f(t)+\sum_{i=1}^m I(2^{i-1}\gamma\le t<2^i\gamma)f(t).
\]

Then for each block 2^{i−1}γ ≤ t < 2^iγ, we separate the integration limits at the level 2^{2−i}|log ε|/γ and use the first N terms of the Taylor series expansion of e^{−u}, with error terms associated with ξ = ξ_{u,N}, 0 ≤ ξ ≤ 1, to rewrite

\[
f(t)=I(0\le t<\gamma)f(t)+I(t\ge\Gamma)f(t)+\sum_{i=1}^m\big(p_i(t)+q_i(t)+r_i(t)\big),
\]

where

\[
\begin{aligned}
p_i(t)&:=I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^N\frac{(-1)^n t^n}{n!}\int_0^{2^{2-i}|\log\varepsilon|/\gamma}x^n\,\mu(dx),\\
q_i(t)&:=I(2^{i-1}\gamma\le t<2^i\gamma)\int_0^{2^{2-i}|\log\varepsilon|/\gamma}\frac{(-\xi tx)^{N+1}}{(N+1)!}\,\mu(dx),\\
r_i(t)&:=I(2^{i-1}\gamma\le t<2^i\gamma)\int_{2^{2-i}|\log\varepsilon|/\gamma}^\infty e^{-tx}\,\mu(dx).
\end{aligned}
\]

We choose the integer N so that

\[
4e^2|\log\varepsilon|-1\le N<4e^2|\log\varepsilon|. \tag{2}
\]

Then, by using the inequality k! ≥ (k/e)^k and the fact that 0 < ξ < 1, we have within the block 2^{i−1}γ ≤ t < 2^iγ,

\[
|q_i(t)|\le\int_0^{2^{2-i}|\log\varepsilon|/\gamma}\frac{(tx)^{N+1}}{(N+1)!}\,\mu(dx)\le\frac{|4\log\varepsilon|^{N+1}}{(N+1)!}\le\Big(\frac{4e|\log\varepsilon|}{N+1}\Big)^{N+1}\le e^{-(N+1)}\le\varepsilon^{4e^2},
\]

where we used tx ≤ 2^iγ · 2^{2−i}|log ε|/γ = 4|log ε| in the second inequality above. This implies, due to the disjoint supports of the q_i(t),

\[
\Big|\sum_{i=1}^m q_i(t)\Big|\le\varepsilon^{4e^2}. \tag{3}
\]

Next, we notice that for t ≥ 2^{i−1}γ and x ≥ 2^{2−i}γ^{−1}|log ε|, we have e^{−tx} ≤ ε². Thus

\[
\Big|\sum_{i=1}^m r_i(t)\Big|\le\sum_{i=1}^m I(2^{i-1}\gamma\le t<2^i\gamma)\int_{2^{2-i}\gamma^{-1}|\log\varepsilon|}^\infty\varepsilon^2\,\mu(dx)\le\varepsilon^2. \tag{4}
\]

Finally, because |f| ≤ 1 and ν([0, γ)) + ν([Γ, ∞)) ≤ 4^{−p}ε^p, we have

\[
\big\|I(0\le t<\gamma)f(t)+I(t\ge\Gamma)f(t)\big\|_{L_p(\nu)}\le\varepsilon/4.
\]

Together with (3) and (4), we see that the set

\[
R:=\Big\{\sum_{i=1}^m q_i(t)+\sum_{i=1}^m r_i(t)+I(0\le t<\gamma)f(t)+I(t\ge\Gamma)f(t):f\in\mathcal{M}\Big\}
\]

has diameter in L_p(ν)-distance at most ε² + ε^{4e^2} + ε/4 < ε/2.

Therefore, if we denote 𝒫ᵢ = {pᵢ(t) : f ∈ ℳ}, then the expansion of f above implies that ℳ ⊂ Σ_{i=1}^m 𝒫ᵢ + R, and consequently we have

\[
N_{[\,]}(\varepsilon,\mathcal{M},L_p(\nu))\le N_{[\,]}\Big(\varepsilon/2,\sum_{i=1}^m\mathcal{P}_i,L_p(\nu)\Big).
\]

For any 1 ≤ i ≤ m and any pᵢ ∈ 𝒫ᵢ, we can write

\[
p_i(t)=I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^{N}(-1)^n a_{ni}\big(2^{-i}\gamma^{-1}t\big)^n, \tag{5}
\]

where 0 ≤ a_{ni} ≤ |4 log ε|^n/n!. Now we can construct

\[
\overline{p}_i=I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^N(-1)^n b_{ni}\big(2^{-i}\gamma^{-1}t\big)^n,\qquad
\underline{p}_i=I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^N(-1)^n c_{ni}\big(2^{-i}\gamma^{-1}t\big)^n,
\]

where

\[
b_{ni}=\begin{cases}\varepsilon 2^{-n-2}\big\lceil 2^{n+2}a_{ni}/\varepsilon\big\rceil&\text{if }n\text{ is even},\\[0.5ex]\varepsilon 2^{-n-2}\big\lfloor 2^{n+2}a_{ni}/\varepsilon\big\rfloor&\text{if }n\text{ is odd},\end{cases}\qquad
c_{ni}=\begin{cases}\varepsilon 2^{-n-2}\big\lfloor 2^{n+2}a_{ni}/\varepsilon\big\rfloor&\text{if }n\text{ is even},\\[0.5ex]\varepsilon 2^{-n-2}\big\lceil 2^{n+2}a_{ni}/\varepsilon\big\rceil&\text{if }n\text{ is odd}.\end{cases}
\]

Clearly, $\underline{p}_i(t) \le p_i(t) \le \overline{p}_i(t)$, and

\[
|\overline{p}_i-\underline{p}_i|\le I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^N|c_{ni}-b_{ni}|\big(2^{-i}\gamma^{-1}t\big)^n\le I(2^{i-1}\gamma\le t<2^i\gamma)\sum_{n=0}^N\varepsilon 2^{-n-2}\big(2^{-i}\gamma^{-1}t\big)^n\le\frac{\varepsilon}{2}\,I(2^{i-1}\gamma\le t<2^i\gamma).
\]

Hence

\[
\sum_{i=1}^m\underline{p}_i\le\sum_{i=1}^m p_i\le\sum_{i=1}^m\overline{p}_i\le\sum_{i=1}^m\underline{p}_i+\varepsilon/2.
\]

That is, the sets

\[
\underline{P}:=\Big\{\sum_{i=1}^m\underline{p}_i:p_i\in\mathcal{P}_i,\ 1\le i\le m\Big\}\qquad\text{and}\qquad\overline{P}:=\Big\{\sum_{i=1}^m\overline{p}_i:p_i\in\mathcal{P}_i,\ 1\le i\le m\Big\}
\]

form ε/2-brackets of $\sum_{i=1}^m\mathcal{P}_i$ in the L_∞-norm, and thus in the L_p(ν)-norm for all 1 ≤ p ≤ ∞.

Now we count the number of different realizations of $\overline{P}$ and $\underline{P}$. Note that, due to the uniform bound on a_{ni} in (5), there are no more than

\[
\frac{2^{n+1}}{\varepsilon}\,\frac{|4\log\varepsilon|^n}{n!}+1
\]

realizations for b_{ni}. So, the number of realizations of $\overline{p}_i$ is bounded by

\[
\prod_{n=0}^N\Big(\frac{2^{n+1}}{\varepsilon}\,\frac{|4\log\varepsilon|^n}{n!}+1\Big).
\]

Because n! > (n/e)^n for all 1 ≤ n ≤ N, we have

\[
\frac{2^{n+1}}{\varepsilon}\,\frac{|4\log\varepsilon|^n}{n!}+1\le\frac{3}{\varepsilon}\Big(\frac{8e|\log\varepsilon|}{n}\Big)^n.
\]

Thus, the number of realizations of $\overline{p}_i$ is bounded by

\[
\begin{aligned}
\Big(\frac{3}{\varepsilon}\Big)^{N+1}&\exp\Big(\sum_{n=1}^N\big(n\log|8e\log\varepsilon|-n\log n\big)\Big)\\
&\le\Big(\frac{3}{\varepsilon}\Big)^{N+1}\exp\Big(\frac{N(N+1)}{2}\log|8e\log\varepsilon|-\int_1^N x\log x\,dx\Big)\\
&\le\Big(\frac{3}{\varepsilon}\Big)^{N+1}\exp\Big(\frac{N(N+1)}{2}\log|8e\log\varepsilon|-\frac{N^2}{2}\log N+\frac{N^2}{4}\Big)\\
&\le\exp\big(C|\log\varepsilon|^2\big)
\end{aligned}
\]

for some absolute constant C, where in the last inequality we used the bounds on N given in (2).

Hence the total number of realizations of $\overline{P}$ is bounded by exp(Cm|log ε|²). A similar estimate holds for the total number of realizations of $\underline{P}$, and we finally obtain

\[
\log N_{[\,]}(\varepsilon,\mathcal{M},L_p(\nu))\le C'm|\log\varepsilon|^2
\]

for some different constant C′. This finishes the proof, since m = log₂(Γ/γ).
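
For the record, the second display in Theorem 1.1(i) follows from this bound by taking γ = ε^K and Γ = ε^{−K} (so that, up to rounding m to an integer),
\[
m=\log_2(\Gamma/\gamma)=\frac{2K|\log\varepsilon|}{\log 2},\qquad
\log N_{[\,]}(\varepsilon,\mathcal{M},L_p(\nu))\le C'm|\log\varepsilon|^2\le\frac{2C'}{\log 2}\,K\,|\log\varepsilon|^3.
\]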

3 Entropy of Convex Hulls

A lower bound estimate of metric entropy is typically difficult, because it often involves the construction of a well-separated set of maximal cardinality. In this section we therefore introduce some soft analytic arguments that avoid this difficulty and turn the problem into a more familiar one. The hard estimates are given in the next section.

First note that ℳ is just the convex hull of the functions k_s(·), 0 < s < ∞, where k_s(t) = e^{−ts}. We recall a general method to bound the entropy of convex hulls that was introduced in Gao (2004). Let T be a set in ℝⁿ or in a Hilbert space. The convex hull of T can be expressed as

\[
\mathrm{conv}(T)=\Big\{\sum_{n=1}^\infty a_nt_n:t_n\in T,\ a_n\ge0,\ \forall n,\ \sum_{n=1}^\infty a_n=1\Big\};
\]

while the absolute convex hull of T is defined by

\[
\mathrm{abconv}(T)=\Big\{\sum_{n=1}^\infty a_nt_n:t_n\in T,\ \forall n,\ \sum_{n=1}^\infty|a_n|\le1\Big\}.
\]

Clearly, by using probability measures and signed measures, we can express

\[
\begin{aligned}
\mathrm{conv}(T)&=\Big\{\int_T t\,\mu(dt):\mu\ \text{is a probability measure on}\ T\Big\};\\
\mathrm{abconv}(T)&=\Big\{\int_T t\,\mu(dt):\mu\ \text{is a signed measure on}\ T,\ \|\mu\|_{TV}\le1\Big\}.
\end{aligned}
\]

The following is clear:

\[
\mathrm{conv}(T)\subset\mathrm{abconv}(T)\subset\mathrm{conv}(T)-\mathrm{conv}(T).
\]

Therefore, for any norm ∥ · ∥,

\[
N(\varepsilon,\mathrm{conv}(T),\|\cdot\|)\le N(\varepsilon,\mathrm{abconv}(T),\|\cdot\|)\le\big[N(\varepsilon/2,\mathrm{conv}(T),\|\cdot\|)\big]^2.
\]

In particular, at the logarithmic level, the two entropy numbers are comparable, modulo constant factors on ε. The benefit of using the absolute convex hull is that it is symmetric and can be viewed as the unit ball of a Banach space, which allows us to use the following duality lemma of metric entropy: there exist constants c1, c2, K1, K2 > 0 such that for all ε > 0,

\[
K_1\log N\big(c_1\varepsilon,\mathrm{abconv}(T),\|\cdot\|_2\big)\le\log N\big(\varepsilon,B,\|\cdot\|_T\big)\le K_2\log N\big(c_2\varepsilon,\mathrm{abconv}(T),\|\cdot\|_2\big),
\]

where B is the unit ball of the dual norm of ∥ · ∥, and ∥ · ∥T is the norm induced by T, that is,

\[
\|x\|_T:=\sup_{t\in T}|\langle t,x\rangle|=\sup_{t\in\mathrm{abconv}(T)}|\langle t,x\rangle|.
\]

Strictly speaking, the duality lemma remains a conjecture in the general case. However, when the norm ∥ · ∥ is a Hilbert space norm, this has been proved, see Tomczak-Jaegermann (1987), Bourgain et al. (1989), and Artstein et al. (2004).

A striking relation discovered by Kuelbs and Li (1993) says that the entropy number log N(ε, B, ∥ · ∥T) is determined by the Gaussian measure of the set

\[
D_\varepsilon:=\{x\in H:\|x\|_T\le\varepsilon\}
\]

under some very weak regularity assumptions. For details, see Kuelbs and Li (1993), Li and Linde (1999), and also Corollary 2.2 of Aurzada et al. (2009). Using this relation, we can now summarize the connection between the metric entropy of convex hulls and the Gaussian measure of Dε as follows:

Proposition 3.1

Let T be a precompact set in a Hilbert space. For α > 0 and β ∈ ℝ, there exists a constant C1 > 0 such that for all 0 < ε < 1,

\[
\log\mathbb{P}(D_\varepsilon)\le-C_1\varepsilon^{-\alpha}|\log\varepsilon|^{\beta}
\]

if and only if there exists a constant C2 > 0 such that for all 0 < ε < 1,

\[
\log N\big(\varepsilon,\mathrm{conv}(T),\|\cdot\|_2\big)\ge C_2\,\varepsilon^{-\frac{2\alpha}{2+\alpha}}|\log\varepsilon|^{\frac{2\beta}{2+\alpha}};
\]

and for β > 0 and γ ∈ ℝ, there exists a constant C3 > 0 such that for all 0 < ε < 1,

\[
\log\mathbb{P}(D_\varepsilon)\le-C_3|\log\varepsilon|^{\beta}(\log|\log\varepsilon|)^{\gamma}
\]

if and only if there exists a constant C4 > 0 such that for all 0 < ε < 1,

\[
\log N\big(\varepsilon,\mathrm{conv}(T),\|\cdot\|_2\big)\ge C_4|\log\varepsilon|^{\beta}(\log|\log\varepsilon|)^{\gamma}.
\]

Furthermore, the results also hold if the directions of the inequalities are switched.

The result of this proposition is implicit in Gao (2004), where an explanation of the relation between N(ε, B, ∥·∥_T) and the Gaussian measure of D_ε is also given.

Perhaps the most useful case of Proposition 3.1 is when T indexes a set of functions K(t, ·), t ∈ T, where for each fixed t ∈ T, K(t, ·) is a function in L₂(Ω), and where Ω is a bounded set in ℝ^d, d ≥ 1. For this special case, we have

Corollary 3.2

Let X(t) = ∫_Ω K(t, x)dB(x), t ∈ T, where the K(t, ·) are square-integrable functions on a bounded set Ω in ℝ^d, d ≥ 1, and B(x) is the d-dimensional Brownian sheet on Ω. If ℱ is the convex hull of the functions K(t, ·), t ∈ T, then

\[
-\log\mathbb{P}\Big(\sup_{t\in T}|X(t)|<\varepsilon\Big)\asymp\varepsilon^{-\alpha}|\log\varepsilon|^{\beta}
\]

for α > 0 and β ∈ ℝ if and only if

\[
\log N\big(\varepsilon,\mathcal{F},\|\cdot\|_2\big)\asymp\varepsilon^{-\frac{2\alpha}{2+\alpha}}|\log\varepsilon|^{\frac{2\beta}{2+\alpha}};
\]

and for β > 0 and γ ∈ ℝ,

\[
-\log\mathbb{P}\Big(\sup_{t\in T}|X(t)|<\varepsilon\Big)\asymp|\log\varepsilon|^{\beta}(\log|\log\varepsilon|)^{\gamma}
\]

if and only if

\[
\log N\big(\varepsilon,\mathcal{F},\|\cdot\|_2\big)\asymp|\log\varepsilon|^{\beta}(\log|\log\varepsilon|)^{\gamma}.
\]

The authors have found this corollary especially useful. For example, it was used in Blei et al. (2007) and Gao (2008) to convert a problem of metric entropy into a problem of small deviation probability for a Gaussian process, which is relatively easier. The proof is given in Gao (2008) for the case Ω = [0, 1], and in Blei et al. (2007) for the case [0, 1]^d. The general case can be proved in just the same way. Indeed, the only thing we need to prove is that ℙ(D_ε) coincides with the small deviation probability ℙ(sup_{t∈T} |X(t)| ≤ ε). We outline a proof below. Let {ϕ_n} be an orthonormal basis of L₂(Ω); then

\[
X(t)=\int_\Omega K(t,s)\,dB(s)=\sum_{n=1}^\infty\xi_n\int_\Omega K(t,s)\phi_n(s)\,ds,
\]

where the ξ_n are i.i.d. standard normal random variables. Thus,

\[
\begin{aligned}
\mathbb{P}(D_\varepsilon)&=\mathbb{P}\Big(\Big\{g\in L_2(\Omega):\Big|\int_\Omega f(s)g(s)\,ds\Big|\le\varepsilon,\ \forall f\in\mathrm{abconv}\{K(t,\cdot):t\in T\}\Big\}\Big)\\
&=\mathbb{P}\Big(\Big\{g\in L_2(\Omega):\Big|\int_T\int_\Omega K(t,s)g(s)\,ds\,\mu(dt)\Big|\le\varepsilon,\ \forall\,\|\mu\|_{TV}\le1\Big\}\Big)\\
&=\mathbb{P}\Big(\Big\{\sum_{n=1}^\infty a_n\phi_n(s):\sum_{n=1}^\infty a_n^2<\infty,\ \Big|\sum_{n=1}^\infty a_n\int_T\int_\Omega K(t,s)\phi_n(s)\,ds\,\mu(dt)\Big|\le\varepsilon,\ \forall\,\|\mu\|_{TV}\le1\Big\}\Big)\\
&=\mathbb{P}\Big(\Big\{\sum_{n=1}^\infty a_n\phi_n(s):\sum_{n=1}^\infty a_n^2<\infty,\ \sup_{t\in T}\Big|\sum_{n=1}^\infty a_n\int_\Omega K(t,s)\phi_n(s)\,ds\Big|\le\varepsilon\Big\}\Big)\\
&=\mathbb{P}\Big(\sup_{t\in T}|X(t)|\le\varepsilon\Big).
\end{aligned}
\]

Returning to the problem of estimating log N(ε, ℳ, ∥·∥₂) in statement (ii) of Theorem 1.1, where ∥·∥₂ is the L₂ norm with respect to Lebesgue measure on [0, 1], we notice that ℳ is the convex hull of the functions K(t, ·), t ∈ [0, ∞), on [0, 1], with K(t, s) = e^{−ts}. Clearly, for each fixed t, K(t, ·) is a square-integrable function on the bounded set [0, 1]. Now, for this K, the corresponding X(t) is a Gaussian process on [0, ∞) with covariance

\[
\mathbb{E}X(t)X(s)=\frac{1-e^{-t-s}}{t+s},\qquad s,t\ge0. \tag{6}
\]
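
Indeed, since Ω = [0, 1] and d = 1 here, B is simply standard Brownian motion on [0, 1], and (6) is a one-line computation with the Wiener integral isometry:
\[
\mathbb{E}X(t)X(s)=\int_0^1K(t,x)K(s,x)\,dx=\int_0^1e^{-(t+s)x}\,dx=\frac{1-e^{-t-s}}{t+s}.
\]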

Thus, the problem becomes how to sharply bound the probability

\[
\mathbb{P}\Big(\sup_{t>0}|X(t)|<\varepsilon\Big).
\]

This will be done in the next section.

4 Lower Bound Estimate

Let X(t), t ≥ 0, be the centered Gaussian process with covariance given in (6). Our goal in this section is to prove that

\[
\log\mathbb{P}\Big(\sup_{t\ge0}|X(t)|<\varepsilon\Big)\le-C|\log\varepsilon|^3,
\]

for some constant C > 0.

Note that for any sequence of positive numbers $\{\delta_i\}_{i=1}^n$,

\[
\begin{aligned}
\mathbb{P}\Big(\sup_{t\ge0}|X(t)|<\varepsilon\Big)&\le\mathbb{P}\Big(\max_{1\le i\le n}|X(\delta_i)|<\varepsilon\Big)\\
&=(2\pi)^{-n/2}(\det\Sigma)^{-1/2}\int_{\max_{1\le i\le n}|y_i|\le\varepsilon}\exp\Big(-\tfrac12\langle y,\Sigma^{-1}y\rangle\Big)\,dy_1\cdots dy_n\\
&\le(2\pi)^{-n/2}(\det\Sigma)^{-1/2}(2\varepsilon)^n\le\varepsilon^n(\det\Sigma)^{-1/2},
\end{aligned} \tag{7}
\]

where we use the covariance matrix

\[
\Sigma=\big(\mathbb{E}X(\delta_i)X(\delta_j)\big)_{1\le i,j\le n}=\Big(\frac{1-e^{-\delta_i-\delta_j}}{\delta_i+\delta_j}\Big)_{1\le i,j\le n}.
\]

To find a lower bound for det(Σ), we need the following lemma:

Lemma 4.1

If 0 < b_{ij} < a_{ij} for all 1 ≤ i, j ≤ n, then

\[
\det(a_{ij}-b_{ij})\ge\det(a_{ij})-\sum_{k=1}^n\max_{1\le l\le n}\frac{b_{kl}}{a_{kl}}\cdot\mathrm{per}(a_{ij}),
\]

where per(a_{ij}) is the permanent of the matrix (a_{ij}).

Proof

For notational simplicity, we write c_{ij} = a_{ij} − b_{ij}. Then

\[
\begin{aligned}
\det(a_{ij}-b_{ij})-\det(a_{ij})&=\sum_\sigma(-1)^\sigma c_{1,\sigma(1)}c_{2,\sigma(2)}\cdots c_{n,\sigma(n)}-\sum_\sigma(-1)^\sigma a_{1,\sigma(1)}a_{2,\sigma(2)}\cdots a_{n,\sigma(n)}\\
&=\sum_\sigma(-1)^\sigma\sum_{k=1}^n\big[c_{1,\sigma(1)}\cdots c_{k-1,\sigma(k-1)}\big]\big(c_{k,\sigma(k)}-a_{k,\sigma(k)}\big)\big[a_{k+1,\sigma(k+1)}\cdots a_{n,\sigma(n)}\big]\\
&\ge-\sum_\sigma\sum_{k=1}^n\big[a_{1,\sigma(1)}\cdots a_{k-1,\sigma(k-1)}\big]\,b_{k,\sigma(k)}\,\big[a_{k+1,\sigma(k+1)}\cdots a_{n,\sigma(n)}\big]\\
&\ge-\sum_{k=1}^n\max_{1\le l\le n}\frac{b_{kl}}{a_{kl}}\sum_\sigma\big[a_{1,\sigma(1)}\cdots a_{k-1,\sigma(k-1)}\big]\,a_{k,\sigma(k)}\,\big[a_{k+1,\sigma(k+1)}\cdots a_{n,\sigma(n)}\big]\\
&=-\sum_{k=1}^n\max_{1\le l\le n}\frac{b_{kl}}{a_{kl}}\,\mathrm{per}(a_{ij}).
\end{aligned}
\]

In order to use Lemma 4.1 to estimate det(Σ), we set

\[
a_{ij}=\frac{1}{\delta_i+\delta_j},\qquad b_{ij}=e^{-\delta_i-\delta_j}a_{ij},
\]

for a specific sequence $\{\delta_i\}_{i=1}^n$ defined by

\[
\delta_{mp+q}=4^{p+m}(m+q),\qquad 0\le p<m,\ 1\le q\le m,
\]

for n = m².

Clearly, we have

\[
0<b_{kl}/a_{kl}\le e^{-2m4^m},\qquad 1\le k,l\le n=m^2. \tag{8}
\]
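
To see (8), note that the smallest of the δ_i is δ₁ = 4^m(m + 1), so that for all k, l,
\[
\frac{b_{kl}}{a_{kl}}=e^{-\delta_k-\delta_l}\le e^{-2\cdot4^m(m+1)}\le e^{-2m4^m}.
\]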

It remains to estimate det(a_{ij}) and per(a_{ij}); this is done in the following lemma.

Lemma 4.2

For the matrix (a_{ij}) defined above, we have per(a_{ij}) ≤ 1 and det(a_{ij}) ≥ (240e)^{−2m^3}.

Proof

It is easy to see

\[
\mathrm{per}(a_{ij})\le n!\,\big(\max_{i,j}a_{ij}\big)^n\le(m^2)!\,(2m4^m)^{-m^2}\le1,
\]

since a_{ij} ≤ (2m4^m)^{−1} for 1 ≤ i, j ≤ n = m².
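
The middle bound is at most one because (m²)! ≤ (m²)^{m²} and m ≤ 2·4^m, so that
\[
(m^2)!\,(2m4^m)^{-m^2}\le\Big(\frac{m^2}{2m4^m}\Big)^{m^2}=\Big(\frac{m}{2\cdot4^m}\Big)^{m^2}\le1.
\]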

To estimate det(a_{ij}), we use Cauchy's determinant identity (see Krattenthaler (1999)):

\[
\det(a_{ij})=\det\Big(\frac{1}{\delta_i+\delta_j}\Big)=\frac{\prod_{1\le i<j\le n}(\delta_j-\delta_i)^2}{\prod_{1\le i,j\le n}(\delta_j+\delta_i)}=\frac{1}{2^n\prod_{i=1}^n\delta_i}\prod_{1\le i<j\le n}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2.
\]

To estimate the last product, we partition the set {(i, j) : 1 ≤ i < j ≤ n = m²} into three sets and estimate each part separately. For 1 ≤ i < j ≤ n = m², write i = mp + q and j = mr + s with 1 ≤ q, s ≤ m. Denote

\[
\begin{aligned}
A&=\{(i,j):i=mp+q,\ j=mp+s,\ 0\le p\le m-1,\ 1\le q<s\le m\},\\
B&=\{(i,j):i=mp+q,\ j=m(p+1)+s,\ 0\le p\le m-2,\ 1\le q,s\le m\},\\
C&=\{(i,j):i=mp+q,\ j=mr+s,\ 0\le p\le m-3,\ p+2\le r\le m-1,\ 1\le q,s\le m\}.
\end{aligned}
\]

Thus, A, B and C form a partition of {(i, j) : 1 ≤ i < j ≤ n = m²}.

First, for (i, j) ∈ A,

\[
\frac{\delta_j-\delta_i}{\delta_j+\delta_i}=\frac{s-q}{2m+s+q}>\frac{s-q}{4m}.
\]

Thus

\[
\prod_{(i,j)\in A}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2\ge\prod_{p=0}^{m-1}\prod_{1\le q<s\le m}\Big(\frac{s-q}{4m}\Big)^2=\prod_{k=1}^{m-1}\prod_{q=1}^{m-k}\Big(\frac{k}{4m}\Big)^{2m}\ge\prod_{k=1}^{m-1}\Big(\frac{k}{4m}\Big)^{2m^2}=\Big(\frac{(m-1)!}{(4m)^{m-1}}\Big)^{2m^2}\ge(8e)^{-2m^3}.
\]

Second, for (i, j) ∈ B,

\[
\frac{\delta_j-\delta_i}{\delta_j+\delta_i}=\frac{(4m+4s)-(m+q)}{(4m+4s)+(m+q)}\ge\frac15.
\]

Thus we have

\[
\prod_{(i,j)\in B}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2\ge\prod_{p=0}^{m-2}\prod_{1\le q,s\le m}5^{-2}\ge5^{-2m^3}.
\]

Third, for (i, j) ∈ C, we have r − p ≥ 2, and

\[
\frac{\delta_j-\delta_i}{\delta_j+\delta_i}=\frac{4^r(m+s)-4^p(m+q)}{4^r(m+s)+4^p(m+q)}=1-\frac{2\cdot4^p(m+q)}{4^r(m+s)+4^p(m+q)}>1-\frac{1}{4^{r-p-1}}.
\]

Thus, since ∏(1 − x_k) ≥ 1 − ∑x_k for 0 < x_k < 1,

\[
\prod_{(i,j)\in C}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2\ge\prod_{p=0}^{m-3}\prod_{r=p+2}^{m-1}\prod_{1\le q,s\le m}\Big(1-\frac{1}{4^{r-p-1}}\Big)^2\ge\prod_{k=1}^{m-2}\big(1-4^{-k}\big)^{2m^3}\ge\Big(1-\sum_{k=1}^{m-2}4^{-k}\Big)^{2m^3}\ge(2/3)^{2m^3}.
\]

Therefore, we have

\[
\prod_{1\le i<j\le n}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2=\prod_{(i,j)\in A}\prod_{(i,j)\in B}\prod_{(i,j)\in C}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2\ge(60e)^{-2m^3}.
\]

On the other hand, it is not difficult to see that

\[
2^n\prod_{i=1}^n\delta_i=2^{m^2}\prod_{q=1}^m\prod_{p=0}^{m-1}4^{p+m}(m+q)<2^{m^2}\,4^{m^2(m-1)/2+m^3}\,(2m)^{m^2}=4^{3m^3/2+m^2/2+m^2\log_4 m}<4^{2m^3}
\]

for m > 1. Hence,

\[
\det(a_{ij})=\Big(2^n\prod_{i=1}^n\delta_i\Big)^{-1}\prod_{1\le i<j\le n}\Big(\frac{\delta_j-\delta_i}{\delta_j+\delta_i}\Big)^2\ge(240e)^{-2m^3}.
\]

Now combining the two lemmas above, and using the estimate in (8), we obtain

\[
\det(\Sigma)\ge(240e)^{-2m^3}-m^2e^{-2m4^m}\ge e^{-16m^3},
\]

provided that m is large enough. Plugging into (7), we have

\[
\mathbb{P}\Big(\sup_{t\ge0}|X(t)|<\varepsilon\Big)\le e^{8m^3}\varepsilon^{m^2}.
\]

Minimizing the right-hand side by choosing m ≈ | log ε|/12, we obtain

\[
\mathbb{P}\Big(\sup_{t\ge0}|X(t)|<\varepsilon\Big)\le\exp\big(-(432)^{-1}|\log\varepsilon|^3\big).
\]
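
For the record, the choice of m balances the two factors in the previous display: writing m = c|log ε| in the exponent gives
\[
8m^3+m^2\log\varepsilon=\big(8c^3-c^2\big)|\log\varepsilon|^3,\qquad\min_{c>0}\big(8c^3-c^2\big)=-\frac{1}{432}\ \ \text{at}\ \ c=\tfrac{1}{12},
\]
and rounding m to an integer only changes the constant.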

Statement (ii) of Theorem 1.1 now follows by applying Corollary 3.2, and this also completes the proof of Theorem 1.2.

Acknowledgments

We owe thanks to a referee for a number of helpful suggestions.

References

  1. Alzer H, Berg C. Some classes of completely monotonic functions. Ann Acad Sci Fenn Math. 2002;27:445–460.
  2. Artstein S, Milman V, Szarek S, Tomczak-Jaegermann N. On convexified packing and entropy duality. Geom Funct Anal. 2004;14:1134–1141.
  3. Aurzada F, Ibragimov I, Lifshits M, van Zanten JH. Small deviations of smooth stationary Gaussian processes. Theory of Probability and Its Applications. 2009;53:697–707.
  4. Ball K, Pajor A. The entropy of convex bodies with "few" extreme points. In: Geometry of Banach spaces (Strobl, 1989), vol 158 of London Math Soc Lecture Note Ser. Cambridge Univ. Press; Cambridge: 1990. pp. 25–32.
  5. Blei R, Gao F, Li WV. Metric entropy of high dimensional distributions. Proc Amer Math Soc. 2007;135:4009–4018.
  6. Bourgain J, Pajor A, Szarek SJ, Tomczak-Jaegermann N. On the duality problem for entropy numbers of operators. In: Geometric aspects of functional analysis (1987–88), vol 1376 of Lecture Notes in Math. Springer; Berlin: 1989. pp. 50–63.
  7. Carl B. Metric entropy of convex hulls in Hilbert spaces. Bull London Math Soc. 1997;29:452–458.
  8. Carl B, Kyrezi I, Pajor A. Metric entropy of convex hulls in Banach spaces. J London Math Soc. 1999;60(2):871–896.
  9. Dudley RM. Universal Donsker classes and metric entropy. Ann Probab. 1987;15:1306–1326.
  10. Feller W. An introduction to probability theory and its applications, Vol. II. Second edition. John Wiley & Sons Inc; New York: 1971.
  11. Gao F. Entropy of absolute convex hulls in Hilbert spaces. Bull London Math Soc. 2004;36:460–468.
  12. Gao F. Entropy estimate for k-monotone functions via small ball probability of integrated Brownian motion. Electron Commun Probab. 2008;13:121–130.
  13. Gao F, Wellner JA. On the rate of convergence of the maximum likelihood estimator of a k-monotone density. Science in China, Series A: Mathematics. 2009;52:1525–1538. doi: 10.1007/s11425-009-0102-y.
  14. Jewell NP. Mixtures of exponential distributions. Ann Statist. 1982;10:479–484.
  15. Krattenthaler C. Advanced determinant calculus. Sém Lothar Combin. 1999;42 (The Andrews Festschrift, Maratea, 1998): Art. B42q, 67 pp. (electronic).
  16. Kuelbs J, Li WV. Metric entropy and the small ball problem for Gaussian measures. J Funct Anal. 1993;116:133–157.
  17. Li WV, Linde W. Approximation, metric entropy and small ball estimates for Gaussian measures. Ann Probab. 1999;27:1556–1578.
  18. Li WV, Linde W. Metric entropy of convex hulls in Hilbert spaces. Studia Math. 2000;139:29–45.
  19. Tomczak-Jaegermann N. Dualité des nombres d'entropie pour des opérateurs à valeurs dans un espace de Hilbert. C R Acad Sci Paris Sér I Math. 1987;305:299–301.
  20. van der Vaart AW, Wellner JA. Weak convergence and empirical processes. With applications to statistics. Springer Series in Statistics. Springer-Verlag; New York: 1996.
  21. Widder DV. The Laplace Transform. Princeton Mathematical Series, v. 6. Princeton University Press; Princeton, NJ: 1941.
  22. Williamson RE. Multiply monotone functions and their Laplace transforms. Duke Math J. 1956;23:189–207.
