Stein’s method and approximating the quantum harmonic oscillator

IAN W MCKEAGUE; EROL A PEKÖZ; YVIK SWAN

doi:10.3150/17-BEJ960

Author manuscript; available in PMC: 2019 Jun 5.

Published in final edited form as: Bernoulli (Andover). 2018 Dec 12;25(1):89–111. doi: 10.3150/17-BEJ960

PMCID: PMC6550468 NIHMSID: NIHMS1029584 PMID: 31178654

Abstract

Hall et al. (2014) recently proposed that quantum theory can be understood as the continuum limit of a deterministic theory in which there is a large, but finite, number of classical “worlds.” A resulting Gaussian limit theorem for particle positions in the ground state, agreeing with quantum theory, was conjectured in Hall et al. (2014) and proven by McKeague and Levin (2016) using Stein’s method. In this article we show how quantum position probability densities for higher energy levels beyond the ground state may arise as distributional fixed points in a new generalization of Stein’s method. These are then used to obtain a rate of distributional convergence for conjectured particle positions in the first energy level above the ground state to the (two-sided) Maxwell distribution; new techniques must be developed for this setting where the usual “density approach” Stein solution (see Chatterjee and Shao (2011)) has a singularity.

Keywords: Interacting particle system, Higher energy levels, Maxwell distribution, Stein’s method

1. Introduction

Hall et al. (2014) proposed a many interacting worlds (MIW) theory for interpreting quantum mechanics in terms of a large but finite number of classical “worlds.” In the case of the MIW harmonic oscillator, an energy minimization argument was used to derive a recursion giving the location of the oscillating particle as viewed in each of the worlds. Hall et al. conjectured that the empirical distribution of these locations converges to Gaussian as the total number of worlds N increases. McKeague and Levin (2016) recently proved such a result and provided a rate of convergence. More specifically, McKeague and Levin showed that if x₁, … x_N is a decreasing, zero-mean sequence of real numbers satisfying the recursion relation

x_{n + 1} = x_{n} - \frac{1}{x_{1} + \dots + x_{n}},

(1)

then the empirical distribution of the x_n tends to standard Gaussian when N → ∞. Here x_n represents the location of the oscillating particle in the nth world, and the Gaussian limit distribution agrees with quantum theory for a particle in the lowest energy (ground) state.

The hypothesized correspondence with quantum theory suggests that stable configurations should also exist at higher energies in the MIW theory. Moreover, the empirical distributions of these configurations should converge to distributions with densities of the form

p_{k} (x) = \frac{({He}_{k} (x))^{2}}{k!} φ (x), x \in R,

(2)

where φ(x) is the standard normal density,

{He}_{k} (x) = (- 1)^{k} e^{x^{2} ∕ 2} \frac{d^{k}}{d x^{k}} e^{- x^{2} ∕ 2}

is the (probabilist’s) kth Hermite polynomial, and k is a non-negative integer. The ground state discussed above corresponds to k = 0 and has the standard Gaussian limit. However, the question of how to characterize higher energy MIW states corresponding to k ≥ 1 is still unresolved as far as we know.
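The densities in (2) are easy to evaluate numerically. The following is a minimal sketch in plain Python, not part of the original argument; the function names `hermite_he`, `p_k` and `total_mass` are illustrative choices, and the Hermite polynomials are computed from the standard three-term recurrence He_{j+1}(x) = x He_j(x) − j He_{j−1}(x) rather than from the derivative formula above.

```python
import math

def hermite_he(k, x):
    """Probabilist's Hermite polynomial He_k(x) via He_{j+1} = x He_j - j He_{j-1}."""
    prev, curr = 1.0, x  # He_0 and He_1
    if k == 0:
        return prev
    for j in range(1, k):
        prev, curr = curr, x * curr - j * prev
    return curr

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def p_k(k, x):
    """Density (2): He_k(x)^2 / k! times the standard normal density."""
    return hermite_he(k, x) ** 2 / math.factorial(k) * phi(x)

def total_mass(k, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation to the integral of p_k over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(p_k(k, lo + (i + 0.5) * dx) for i in range(steps)) * dx
```

For k = 1 this reproduces the two-sided Maxwell density x²φ(x); in particular `p_k(1, 0.0)` is zero, which is the interior zero of the density discussed later in connection with the Stein solution.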

The energy minimization approach of Hall et al. (2014) starts with an analysis of the Hamiltonian for the MIW harmonic oscillator:

H_{0} (x, p) = E (p) + V (x) + U_{0} (x),

where the locations of particles (having unit mass) in the N worlds are specified by x = (x₁, … , x_N) with x₁ > x₂ > … > x_N, and their momenta by p = (p₁, … , p_N). Here $E (p) = \sum_{n = 1}^{N} p_{n}^{2} ∕ 2$ is the kinetic energy, $V (x) = \sum_{n = 1}^{N} x_{n}^{2}$ is the potential energy (for the parabolic trap), and

U_{0} (x) = \sum_{n = 1}^{N} {(\frac{1}{x_{n + 1} - x_{n}} - \frac{1}{x_{n} - x_{n - 1}})}^{2}

is called the “interworld” potential, where x₀ = ∞ and x_{N+1} = −∞. In the ground state, there is no movement because all the momenta p_n have to vanish for the total energy to be minimized. In this case, as mentioned above, Hall et al. (2014) showed that the particle locations x_n satisfy (1) and McKeague and Levin (2016) showed that the empirical distribution tends to a standard Gaussian distribution.

Our contribution in the present article is to derive an interworld potential for the second energy state (k = 1) and show that the empirical distribution of the configuration that minimizes the corresponding Hamiltonian has a limit distribution that again agrees with quantum theory. The interworld potential in this case is shown to be

U_{1} (x) = 9 \sum_{n = 1}^{N} {(\frac{1}{x_{n + 1}^{3} - x_{n}^{3}} - \frac{1}{x_{n}^{3} - x_{n - 1}^{3}})}^{2} x_{n}^{4}

(3)

and the minimizer of the corresponding Hamiltonian H₁(x, p) = E(p) + V(x) + U₁(x) is shown to satisfy the recursion

x_{n + 1}^{3} = x_{n}^{3} - 3 {(\sum_{i = 1}^{n} \frac{1}{x_{i}})}^{- 1} .

(4)

Further, we show that if x₁, … , x_N is a decreasing, zero-mean solution of (4), then the empirical distribution of the x_n converges to the (two-sided) Maxwell distribution having density $p_{1} (x) = x^{2} e^{- x^{2} ∕ 2} ∕ \sqrt{2 π}$. The entire sequence x₁, … , x_N should be viewed as indexed by N, though we suppress notation for this dependence and write x₁, … , x_N instead of x_{1,N}, … , x_{N,N}. We also give a rate of convergence using a new extension of Stein’s method. Our approach is generalizable to recursions that converge to the distributions of other higher energy states of the quantum harmonic oscillator, although we do not pursue such extensions here.

We initially thought that the MIW interpretation could be based on a “universal” interworld potential function U₀ that applies to all energy levels, with the densities p_k(x) then arising as limits of local minima of H₀. However, this idea turned out to be analytically unworkable. Here we propose an alternative approach in terms of adapting the interworld potential to each higher energy level. Minimizing the resulting Hamiltonian is then tractable and the solution can be shown to converge to p_k(x), at least in the case k = 1. Hall et al. (2014) derived their interworld potential U₀ as a discretization of Bohm’s quantum potential summed over the particle ensemble, see Bohm (1952). The challenge in general is to extend this derivation to higher-energy wave functions in a way that leads to an explicit recursion minimizing the resulting Hamiltonian, and to show that it agrees with p_k(x) in the limit. A major contribution here, in addition to providing a rate of convergence, is a general method for finding such interworld potential functions and their associated particle recursions.

Stein’s method (see Stein (1986), Chen et al. (2010) and Ross (2011)) is a well-established technique for obtaining explicit error bounds for distributional limit theorems. However, the usual “density approach” (see Chatterjee and Shao (2011)) for applying Stein’s method does not seem to work in cases where the density function vanishes at a point in the interior of the support of the target distribution (here we have p₁ (0) = 0 and the support is the whole real line). As we elaborate later, in this case the solution to the Stein equation will have a singularity and also unbounded derivatives. This motivates the new technique we will develop to handle such distributions. While there are plenty of examples of Stein’s method applied to distributions with a density having a zero on the boundary of the support (the gamma and beta distributions, for example), there have been no examples (that we know of) with a zero in the interior of the support; the higher energy distributions p_k (x), for k ≥ 1, appear to be the first such distributions considered. The price one has to pay with our approach for handling these zeros is more complicated estimates involving couplings. In our case, however, analytical properties of the recursion (4) can fortunately be exploited to establish such estimates.

In Section 2 we generalize the argument of Hall et al. (2014) to derive the interworld potential, and show how it leads to the solution (4). In Section 3 we introduce the notion of a generalized zero-bias transformation, and show that the distributional properties of eigenstates of the quantum harmonic oscillator can be characterized in terms of fixed points of this transformation. Also, we derive the generalized zero-bias distribution for the empirical distribution of general configurations. Section 4 develops our results based on the new extension of Stein’s method to show convergence of the configuration that minimizes the Hamiltonian of the second energy state.

2. Interworld potentials for higher energy states

Hall et al. (2014) introduced their MIW theory from the perspective of the de Broglie–Bohm interpretation of quantum mechanics, which is mathematically equivalent to standard quantum theory. They used this approach to construct an ansatz for the conjectured interworld potential U₀ governing the ground state wave function of the quantum harmonic oscillator. In this section we introduce an extended version of this ansatz aimed at providing a MIW characterization of the higher energy eigenstates.

Our argument follows along the lines of Section IIIA of Hall et al. (2014) with the major difference being that we now need to introduce a more general way of approximating the density of particle location for a stationary wave function ψ(x), namely for a density of the form p(x) = ∣ψ(x)∣² = b(x)φ(x), where b(x) is a non-negative, even, smooth function having finitely many zeros. Here b represents a “baseline” that varies more rapidly than φ(x). Let x₁ > x₂ > … > x_N. Bohm’s quantum potential summed over the ensemble {x_n} is defined by

U_{ψ} (x) = \sum_{n = 1}^{N} [p^{'} (x_{n}) ∕ p (x_{n})]^{2}

(5)

where we are using dimensionless units. An approximation to p(x_n) based on ignoring φ(x) is given (up to a normalizing constant) by

\tilde{p} (x_{n}) = \frac{b (x_{n})}{B (x_{n}) - B (x_{n + 1})},

where $B (x) = \int_{0}^{x} b (t) d t$ is the cumulative baseline function. This suggests

\frac{p^{'} (x_{n})}{p (x_{n})} \approx \frac{\tilde{p} (x_{n}) - \tilde{p} (x_{n - 1})}{(x_{n} - x_{n - 1}) \tilde{p} (x_{n})} \approx [\frac{1}{B (x_{n}) - B (x_{n + 1})} - \frac{1}{B (x_{n - 1}) - B (x_{n})}] b (x_{n}),

where we set B(x₀) = ∞ and B(x_{N+1}) = −∞. Our proposed ansatz for the interworld potential is then based on inserting the above expression into (5) to obtain

U_{b} (x) = \sum_{n = 1}^{N} {[\frac{1}{B (x_{n + 1}) - B (x_{n})} - \frac{1}{B (x_{n}) - B (x_{n - 1})}]}^{2} b (x_{n})^{2} .

(6)

Note that our earlier assumptions about b imply that B is strictly increasing, so U_b is well-defined. In the simplest cases b(x) = 1 and b(x) = x² the above expression for U_b agrees with the interworld potentials U₀ and U₁ defined in the Introduction. We conjecture that the interworld potential U_b is suitable for obtaining MIW approximations to the class of target distributions of the form p_k(x). Indeed, there may be a natural affinity between our new version of Stein’s method and the densities p_k(x) for all the energy levels of the quantum harmonic oscillator.
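As a quick check of the second case (our own expansion of the substitution, under the conventions B(x₀) = ∞ and B(x_{N+1}) = −∞ stated above), take b(x) = x², so that B(x) = x³/3:

```latex
\[
\frac{1}{B(x_{n+1}) - B(x_n)} = \frac{3}{x_{n+1}^{3} - x_n^{3}},
\]
\[
U_b(x) = \sum_{n=1}^{N} \left[\frac{3}{x_{n+1}^{3} - x_n^{3}} - \frac{3}{x_n^{3} - x_{n-1}^{3}}\right]^{2} x_n^{4}
       = 9 \sum_{n=1}^{N} \left(\frac{1}{x_{n+1}^{3} - x_n^{3}} - \frac{1}{x_n^{3} - x_{n-1}^{3}}\right)^{2} x_n^{4}
       = U_1(x).
\]
```

With b(x) = 1 we have B(x) = x and b(x_n)² = 1, and the same substitution recovers U₀.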

Specializing to the case b(x) = x², the following argument characterizes the minimizer of the Hamiltonian H₁ (i.e., the ground state when the interworld potential is U₁) in terms of a solution to the recursion (4). In any ground state the particles do not move, so the kinetic energy E vanishes. Then, adapting the argument of Hall et al. (2014) to apply to H₁, we have

9 (N - 1)^{2} = 9 {[\sum_{n = 1}^{N - 1} \frac{x_{n + 1}^{3} - x_{n}^{3}}{x_{n + 1}^{3} - x_{n}^{3}}]}^{2} = 9 {[\sum_{n = 1}^{N} (\frac{1}{x_{n + 1}^{3} - x_{n}^{3}} - \frac{1}{x_{n}^{3} - x_{n - 1}^{3}}) x_{n}^{2} (x_{n} - \bar{x_{N}^{3}} ∕ x_{n}^{2})]}^{2} \leq 9 [\sum_{n = 1}^{N} {(\frac{1}{x_{n + 1}^{3} - x_{n}^{3}} - \frac{1}{x_{n}^{3} - x_{n - 1}^{3}})}^{2} x_{n}^{4}] [\sum_{n = 1}^{N} (x_{n} - \bar{x_{N}^{3}} ∕ x_{n}^{2})^{2}] \leq U_{1} (x) V (x),

where the first inequality is Cauchy–Schwarz. So U₁ ≥ 9(N – 1)²/V, leading to

H_{1} = U_{1} + V \geq 9 (N - 1)^{2} ∕ V + V \geq 6 (N - 1)

with the last inequality being equality for V = 3(N – 1). It follows that H₁ is minimized when V = 3(N – 1), the mean $\bar{x_{N}^{3}}$ of $\{x_{n}^{3}, n = 1, \dots, N\}$ vanishes, and

\frac{1}{x_{n}} = α [\frac{1}{x_{n + 1}^{3} - x_{n}^{3}} - \frac{1}{x_{n}^{3} - x_{n - 1}^{3}}]

for some constant α. Summing the above display over the first n indices, the right-hand side telescopes (recall that x₀ = ∞), leading to the recursion (4) by rearranging and noting that α = −V/(N – 1) = −3.
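The telescoping step can be written out as follows. This is a sketch filling in the intermediate algebra; the shorthand d_i is introduced here only for illustration.

```latex
\[
d_i := \frac{1}{x_{i+1}^{3} - x_i^{3}}, \qquad d_0 = 0 \quad (\text{since } x_0 = \infty),
\]
\[
\sum_{i=1}^{n} \frac{1}{x_i}
  = \alpha \sum_{i=1}^{n} (d_i - d_{i-1})
  = \alpha\, d_n
  = \frac{\alpha}{x_{n+1}^{3} - x_n^{3}}
\quad\Longrightarrow\quad
x_{n+1}^{3} = x_n^{3} + \alpha \left(\sum_{i=1}^{n} \frac{1}{x_i}\right)^{-1}.
\]
```

Setting α = −3 in the last identity gives exactly the recursion (4).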

The following lemma provides the basic properties we need to ensure the existence of a solution of the Maxwell recursion (4) that minimizes the Hamiltonian H₁, as well as ensuring that the solution is unique. This result is analogous to Lemma 1 of McKeague and Levin (2016) concerning solutions of (1), but the difference here is that the variance is 3, agreeing with the Maxwell distribution (rather than close to standard normal in the case of (1)).

Lemma 2.1. Suppose N is even. Every zero-median solution x₁, … , x_N of (4) satisfies:

(P1) Zero-mean: x₁ + … + x_N = 0.
(P2) Maxwell variance: $x_{1}^{2} + \dots + x_{N}^{2} = 3 (N - 1)$ .
(P3) Symmetry: x_n = −x_{N+1−n} for n = 1, … , N.

Further, there exists a unique solution x₁, … , x_N such that (P1) and

(P4) Strictly decreasing: x₁ > … > x_N

hold. This solution has the zero-median property, and thus also satisfies (P2) and (P3).

Proof. The proof follows identical steps to the proof of Lemma 1 of McKeague and Levin (2016), apart from the variance property (P2), which is proved using (P1) and (P3) as follows. Denote $S_{n} = \sum_{i = 1}^{n} x_{i}^{- 1}$ for n = 1, … , N, and set S₀ = 0. Using (4) we can write

3 (N - 1) = 3 \sum_{n = 1}^{N - 1} S_{n} S_{n}^{- 1} = \sum_{n = 1}^{N - 1} S_{n} (x_{n}^{3} - x_{n + 1}^{3}) = \sum_{n = 1}^{N - 1} [(S_{n - 1} + x_{n}^{- 1}) x_{n}^{3} - S_{n} x_{n + 1}^{3}] = \sum_{n = 1}^{N - 1} [S_{n - 1} x_{n}^{3} - S_{n} x_{n + 1}^{3} + x_{n}^{2}] = x_{1}^{2} + \dots + x_{N - 1}^{2} - S_{N - 1} x_{N}^{3},

where we used the recursion in the second equality, and the last equality is from a telescoping sum. (P3) implies S_N = 0, so −S_{N−1} = 1/x_N, and (P2) follows. □
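The solution described in Lemma 2.1 can be approximated numerically. The sketch below is one possible approach and is not taken from the original argument: it assumes N is even, runs (4) forward from a trial value of x₁ over the m = N/2 positive points, and uses bisection on x₁ to enforce the zero-median condition x_{m+1} = −x_m, which by (4) is equivalent to 2x_m³S_m = 3. The function name `maxwell_configuration`, the bracket [0.1, 3√N] and the iteration count are illustrative assumptions.

```python
import math

def maxwell_configuration(N, iters=200):
    """Shooting sketch for the symmetric decreasing solution of recursion (4), N even."""
    if N < 2 or N % 2:
        raise ValueError("this sketch assumes an even N >= 2")
    m = N // 2

    def upper_half(x1):
        # Run (4) forward for the m positive points; (None, None) if one is not positive.
        xs, S = [x1], 1.0 / x1
        for _ in range(m - 1):
            cube = xs[-1] ** 3 - 3.0 / S
            if cube <= 0.0:
                return None, None
            xs.append(cube ** (1.0 / 3.0))
            S += 1.0 / xs[-1]
        return xs, S

    def mismatch(x1):
        # Zero-median condition x_{m+1} = -x_m, rewritten as 2 x_m^3 S_m - 3 = 0.
        xs, S = upper_half(x1)
        return -3.0 if xs is None else 2.0 * xs[-1] ** 3 * S - 3.0

    lo, hi = 0.1, 3.0 * math.sqrt(N)  # assumed bracket: mismatch(lo) < 0 < mismatch(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    xs, _ = upper_half(hi)
    # Property (P3): the lower half mirrors the upper half.
    return xs + [-x for x in reversed(xs)]
```

For N = 2 the sketch returns ±√(3/2), which can be checked by hand from (4). For larger even N, properties (P1) to (P4) can be verified directly on the returned list, for instance that the sum of squares equals 3(N − 1) up to rounding.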

Although in the sequel we concentrate on the case k = 1 (see Figure 1), to conclude this section we briefly discuss general densities of the form p_k given in (2). The above argument for b(x) = x² can be extended to general U_b under the condition that B(x) is proportional to xb(x), which is the case when b(x) is proportional to x^r for some even non-negative integer r (but not for the square of the kth Hermite polynomial unless k = 0 or 1). Under this condition, it can be shown that the minimizer of the Hamiltonian based on U_b is a symmetric solution of the recursion

B (x_{n + 1}) = B (x_{n}) - {(\sum_{i = 1}^{n} \frac{x_{i}}{b (x_{i})})}^{- 1} .

(7)

Figure 1. Example with b(x) = x², N = 22, showing the piecewise constant density having mass 1/(N – 1) uniformly distributed over the intervals between successive x_n compared with the Maxwell density, where the breaks in the histogram are the successive x_n satisfying the recursion (4).

We have not been able to show that this recursion minimizes the Hamiltonian for general b, but our numerical results suggest that it is very close if not identical to a minimizer. With k = 2 we have b(x) = (x² – 1)²/2, B(x) = x⁵/10 – x³/3 + x/2, and the symmetric solution of the resulting recursion produces a remarkably good agreement with p_k, see Figure 2.

Figure 2. Example with b(x) = He_k(x)²/k! for k = 2, N = 41, where the breaks in the histogram are the successive x_n satisfying the recursion (7) and the red curve is p_k(x).

3. Generalized zero-bias transformations

Let W be a symmetric random variable and b: $R \to R$ a non-negative function such that $σ^{2} = E [W^{2} ∕ b (W)] < \infty$ . Goldstein and Reinert (1997) give a distributional fixed point characterization of the Gaussian distribution, which we generalize in the definition below.

Definition 3.1. If there is a random variable W* such that

σ^{2} E [\frac{f^{'} (W^{⋆})}{b (W^{⋆})}] = E [\frac{W f (W)}{b (W)}]

for all absolutely continuous functions f: $R \to R$ such that $E ∣ W f (W) ∕ b (W) ∣ < \infty$ , we say that W* has the b-generalized-zero-bias distribution of W.

Remark 3.2. Goldstein and Reinert (1997) study the case b(x) = 1 and show that W* has the same distribution as W if and only if W has a Gaussian distribution. Distributional fixed point characterizations for exponential, gamma and other nonnegative distributions and the connection with Stein’s method have been studied in Peköz and Röllin (2011), Peköz et al. (2013), and Peköz et al. (2016).

Remark 3.3. By a routine extension of the proof of Proposition 2.1 of Chen et al. (2010), it can be shown that there exists a unique distribution for W*, and it is absolutely continuous with density

p^{⋆} (x) \propto b (x) E [\frac{W}{b (W)} 1_{W ⩾ x}] .

We note in passing that the σ² should be on the other side of the equality in the first display of Chen et al.’s proposition, which corresponds to b(x) = 1, the usual zero-bias distribution of W. The composition of the b-generalized-zero-bias transformation with the (1/b)-generalized-zero-bias transformation is the usual zero-bias transformation.

Remark 3.4. With φ the standard normal density and b a φ-integrable function, if W has density

p (x) = b (x) φ (x),

(8)

then its distribution is a fixed point for the b-generalized-zero-bias transformation, since

p^{⋆} (x) = b (x) \int_{x}^{\infty} \frac{t}{b (t)} p (t) d t = b (x) \int_{x}^{\infty} t φ (t) d t = p (x) .

The following result gives the b-generalized-zero-bias distribution of the uniform distribution on N points.

Proposition 3.5. Given an integer N > 1, let x₁ > x₂ > … > x_N be such that b(x_n) > 0 for all n. Let $P_{N}$ be the empirical distribution of the x_n:

P_{N} (A) = \frac{\# \{n : x_{n} \in A\}}{N}

for any Borel set $A \subset R$ . Under the symmetry condition x_n = −x_{N−n+1} for n = 1, … , N, the b-generalized-zero-bias distribution $P_{N}^{⋆}$ of $P_{N}$ is defined, and has density

p^{⋆} (x) \propto b (x) [\sum_{i = 1}^{n} \frac{x_{i}}{b (x_{i})}]

for x_{n+1} < x ≤ x_n (n = 1, … , N – 1), and p*(x) = 0 if x > x₁ or x ≤ x_N.

Proof. Immediate from Remark 3.3. □
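To illustrate Proposition 3.5 in the case b(x) = x², suppose in addition that the x_n form a symmetric solution of the recursion (4). The following short computation, which is our own worked example rather than a statement from the original text, shows that the generalized zero-bias density then places the same mass on every interval between successive x_n. Here S_n denotes the partial sum of the 1/x_i, as in the proof of Lemma 2.1.

```latex
\[
S_n = \sum_{i=1}^{n} \frac{x_i}{b(x_i)} = \sum_{i=1}^{n} \frac{1}{x_i},
\qquad
\int_{x_{n+1}}^{x_n} x^{2} S_n \, dx
  = \frac{S_n}{3}\left(x_n^{3} - x_{n+1}^{3}\right)
  = \frac{S_n}{3} \cdot \frac{3}{S_n}
  = 1
\quad (n = 1, \dots, N-1),
\]
\[
\text{so that}\quad
p^{\star}(x) = \frac{x^{2} S_n}{N-1} \ \text{ for } x_{n+1} < x \le x_n,
\qquad
P_N^{\star}\big((x_{n+1}, x_n]\big) = \frac{1}{N-1}.
\]
```

The second equality in the first line uses (4) in the form x_n³ − x_{n+1}³ = 3/S_n.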

Recall the following distances between distribution functions F and G. The Kolmogorov distance is

d_{K} (F, G) = sup_{x \in R} ∣ F (x) - G (x) ∣,

and the Wasserstein distance is

d_{W} (F, G) = sup_{h \in H} ∣ \int_{R} h d F - \int_{R} h d G ∣

where

H = \{h : R \to R \text{ Lipschitz with } ‖ h^{'} ‖ ⩽ 1\}

and ∥ · ∥ is the supremum norm. Using Proposition 1.2 in Ross (2011), these two metrics are seen to be related by

d_{K} (F, G) \leq \sqrt{2 C d_{W} (F, G)}

if G has density bounded by C.
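When G is the two-sided Maxwell distribution, the constant C can be made explicit, since the density x²φ(x) is maximized at x = ±√2. The snippet below is a small illustrative sketch of the resulting conversion from a Wasserstein bound to a Kolmogorov bound; the names `C_MAXWELL` and `kolmogorov_bound` are ours and do not appear in the original text.

```python
import math

def maxwell_density(x):
    """Two-sided Maxwell density x^2 * phi(x)."""
    return x * x * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# The density is maximized at x = +/- sqrt(2), so C = 2 e^{-1} / sqrt(2 pi).
C_MAXWELL = maxwell_density(math.sqrt(2.0))

def kolmogorov_bound(d_w, C=C_MAXWELL):
    """Upper bound sqrt(2 C d_W) on the Kolmogorov distance, given a Wasserstein distance d_w."""
    return math.sqrt(2.0 * C * d_w)
```

Any bound on the Wasserstein distance to the Maxwell distribution can be passed to `kolmogorov_bound` to obtain a (generally cruder) bound in the Kolmogorov metric.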

Restricting attention to the special case b(x) = x², we can now state our main result, along with an important corollary.

Theorem 3.6. Suppose W* is constructed on the same probability space as the zero-mean random variable W and is distributed according to the x²-generalized-zero-bias distribution of W. Let M have the two-sided Maxwell density $x^{2} e^{- x^{2} ∕ 2} ∕ \sqrt{2 π}$ . Then there exist positive finite constants λ₁, λ₂, λ₃ and λ₄ such that

d_{W} (L (W), L (M)) \leq λ_{1} E ∣ W - W^{⋆} ∣ + λ_{2} E [∣ W ∣ ∣ W - W^{⋆} ∣] + λ_{3} E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ + λ_{4} E ∣ 1 - \frac{W^{⋆}}{W} ∣ .

(9)

Proof. The inequality follows immediately from Theorem 4.4. Finiteness of the constants (along with explicit upper bounds) is detailed in Proposition 4.5. □

The following corollary gives a rate of convergence of the solution to (4) to the two-sided Maxwell distribution in terms of the Wasserstein distance; we postpone the proof until Section 4.3.

Corollary 3.7. Suppose x₁, … x_N is a monotonic, zero-mean, finite sequence of real numbers satisfying (4), let $P_{N}$ be the empirical distribution of these values, and let M be as in Theorem 3.6. Then there is a constant C > 0 such that

d_{W} (P_{N}, L (M)) \leq C \sqrt{\frac{\log N}{N}} .

4. The Stein equation and its solutions

4.1. General considerations on Stein’s method and the problem with Stein’s density approach

Let F and G be two cumulative distribution functions which one wishes to compare. Denote by L¹(F) (resp., L¹(G)) the class of Borel measurable functions $h : R \to R$ such that ∫ ∣h∣ dF < ∞ (resp., ∫ ∣h∣ dG < ∞). A discrepancy measure between F and G is an integral probability metric if it can be written in the form

d_{H} (F, G) ≔ sup_{h \in H} ∣ \int h d F - \int h d G ∣

for some class of test functions $H \subset L^{1} (F) \cap L^{1} (G)$ . The aforementioned Kolmogorov and Wasserstein distances are two important examples of integral probability metrics.

Suppose that F is absolutely continuous with density p on the real line, and introduce the operator h → f_h which, to each h ∈ L¹(F), assigns the function

f_{h} (x) = \frac{1}{p (x)} \int_{x}^{\infty} (h (u) - F (h)) p (u) d u

(10)

where F(h) = ∫ h dF. The integrability condition h ∈ L¹(F) guarantees that f_h is the unique absolutely continuous solution to the differential equation

f_{h}^{'} (x) + \frac{p^{'} (x)}{p (x)} f_{h} (x) = h (x) - F (h)

that also satisfies the boundary conditions lim_{x→±∞} p(x)f_h(x) = 0. Under the assumption that $H \subset L^{1} (G)$ we can integrate with respect to G on both sides of the differential equation (known, in the Stein community argot, as a “Stein equation”) to get

d_{H} (F, G) = sup_{h \in H} ∣ E [f_{h}^{'} (W) + \frac{p^{'} (W)}{p (W)} f_{h} (W)] ∣,

(11)

with W a random variable distributed according to G. This last expression provides a means of bounding integral probability metrics (and thus in particular the Kolmogorov and Wasserstein distances) in terms of the action of a differential operator over a class of functions.

The steps outlined above form the basis of what is known as the “density approach” to Stein’s method (see e.g. Chatterjee and Shao (2011)), which is the most intuitive extension of Stein’s method of normal approximation (as described in Chen et al. (2010)) to arbitrary continuous target distributions. In order for (11) to be of practical use, however, it is crucial that the functions p′/p, f_h and $f_{h}^{'}$ be amenable to computations; it is particularly important that f_h and its first derivative be bounded. Such conditions are not met in the case of the two-sided Maxwell distribution p(x) = x²φ(x) with which we are concerned in this paper. Indeed, for such a density, we have on the one hand p′(x)/p(x) = 2/x – x and, on the other hand, $f_{h} (x) = x^{- 2} e^{x^{2} ∕ 2} \int_{x}^{\infty} (h (u) - F (h)) u^{2} e^{- u^{2} ∕ 2} d u$, both of which have a singularity at x = 0. Because of this, applying the classical Stein’s method toolkit to the right-hand side of (11) will ultimately lead to trivial upper bounds and more elaborate methods need to be devised. This is done in the coming sections.
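The singularity can be seen explicitly for a simple test function. As a worked example of our own choosing, take h(u) = u, for which F(h) = 0 by symmetry of the Maxwell density, and evaluate (10) in closed form:

```latex
\[
\int_{x}^{\infty} u^{3} e^{-u^{2}/2} \, du = (x^{2} + 2)\, e^{-x^{2}/2},
\qquad\text{hence}\qquad
f_h(x) = x^{-2} e^{x^{2}/2} (x^{2} + 2)\, e^{-x^{2}/2} = 1 + \frac{2}{x^{2}} .
\]
```

The first identity follows by differentiating the right-hand side. The resulting f_h is unbounded near x = 0, even though h itself is Lipschitz with constant 1.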

Before proceeding to the description of our proposal, we stress that the classical version of the “density approach” to Stein’s method that we have just described actually breaks down for any target density p such that p(x₀) = 0 at some x₀ not on the edges of the support. Indeed, in general, the Stein solution (10) is the product of two terms: one term with a singularity wherever the density has a zero, and a second term that vanishes at the endpoints of the range of the target random variable. This results in the peculiar behavior of singularities inside the range of the random variable when the density has a zero there. Note that for the one-sided Maxwell distribution the solution f_h(x) has the same form as for its two-sided counterpart, though it is now only defined for x ⩾ 0; since in this one-sided case when x = 0 the second term $\int_{0}^{\infty} (h (u) - F (h)) u^{2} e^{- u^{2} ∕ 2} d u$ vanishes, we would have f_h(0) = 0. This term doesn’t vanish (unless h is an even function) for the two-sided Maxwell case, thus giving rise to the singularity at x = 0.

4.2. Coupling based Stein’s method for densities of the form (8)

Let X be a random variable with probability density function p which we assume to be of the form (8). Let W be a symmetric random variable whose distribution we want to compare with that of X. First, we introduce the random variable W* proposed in Definition 3.1 and write

E [h (W)] - E [h (X)] = E [\frac{f^{'} (W)}{b (W)} - W \frac{f (W)}{b (W)}] = E [\frac{f^{'} (W)}{b (W)} - \frac{f^{'} (W^{⋆})}{b (W^{⋆})}]

(12)

for f = f_h solutions to the differential equation

\frac{f^{'} (w)}{b (w)} - w \frac{f (w)}{b (w)} = h (w) - E [h (X)] .

(13)

Taking suprema over all $h \in H$ , we deduce

d_{H} (L (W), L (X)) = sup_{h \in H} ∣ E [\frac{f^{'} (W)}{b (W)} - \frac{f^{'} (W^{⋆})}{b (W^{⋆})}] ∣

(14)

for all $H$ such that the solutions to (13) are well-defined. Expression (14) provides an alternative to (11) which we will now prove to be useful to our purpose.

At this stage the next typical “Stein-method” step is to write W* = W + (W* – W) and Taylor expand the integrand in (14) around W to deduce a bound on $d_{H} (L (W), L (X))$ expressed in terms of the difference between W* and W. Unfortunately, for similar reasons as those described in Section 4.1, the solutions to (13) also have singularities which make this intuition unexploitable directly. We propose to bypass this difficulty by introducing intermediate functions τ_X and g – to be defined later on in the text – for which

E [h (W)] - E [h (X)] = E [\frac{f^{'} (W)}{b (W)} - \frac{f^{'} (W^{⋆})}{b (W^{⋆})}] = E [W (τ_{X} (W) - 1) g (W) - W^{⋆} (τ_{X} (W^{⋆}) - 1) g (W^{⋆})] + E [τ_{X} (W) g^{'} (W) - τ_{X} (W^{⋆}) g^{'} (W^{⋆})] .

(15)

Bounding integral probability metrics $d_{H} (\cdot, \cdot)$ between $L (W)$ and $L (X)$ then boils down to finding bounds on the four terms provided in (15). Obviously this will only lead to reasonable results if the intermediate functions τ_X and g are chosen wisely.

4.3. The Stein kernel equation for densities of the form (8)

We start by introducing the integral operator

h - Φ (h) \mapsto T_{φ}^{- 1} (h - Φ (h)) (w) ≔ \frac{1}{φ (w)} \int_{w}^{\infty} (h (u) - Φ (h)) φ (u) d u

(16)

with Φ(h) = ∫ hdΦ and Φ the standard Gaussian cumulative distribution function. (The notation $T_{φ}^{- 1}$ is taken from Ley, Reinert and Swan (2017).) We also introduce the function

τ_{X} (x) = \frac{1}{p (x)} \int_{x}^{\infty} u p (u) d u

which is called the “Stein kernel” of X (or, equivalently, of p) – again we refer to Ley, Reinert and Swan (2017) for intuition and first properties.

Remark 4.1. Stein kernels were introduced in Stein (1986); Cacoullos and Papathanasiou (1989), and have proven to be of great use in Gaussian analysis, see, e.g., Nourdin and Peccati (2009) and Chatterjee (2009). Their importance in the abstract approach to Stein’s method has been investigated in Döbler (2015), where it is shown that they have a regularizing effect on the solutions to general Stein equations.

Lemma 4.2. Let x ↦ b(x) be a nonnegative even function with support a subset of (−∞, ∞) and such that lim_{x→±∞} b(x)φ(x) = 0. Suppose furthermore that b is absolutely continuous and integrable w.r.t. φ with integral $\int_{- \infty}^{\infty} b (x) φ (x) d x = 1$. Let X be a random variable with density x ↦ b(x)φ(x). Then

τ_{X} (x) = 1 + \frac{T_{φ}^{- 1} b^{'} (x)}{b (x)}

(17)

under the convention that the ratio is set to zero at all points x such that b(x) = 0 and $T_{φ}^{- 1} b^{'} (x) \neq 0$ . Let $h : R \to R$ be a Borel function such that E∣h(X)∣ < ∞, and set $\tilde{h} = h - E [h (X)] .$ Then

g_{h} (x) = \frac{\int_{x}^{\infty} b (u) \tilde{h} (u) φ (u) d u}{b (x) φ (x) + \int_{x}^{\infty} b^{'} (u) φ (u) d u}

(18)

is the unique solution g of the ODE

τ_{X} (x) g^{'} (x) - x g (x) = \tilde{h} (x)

(19)

which satisfies the asymptotic property lim_{x→±∞} τ_X(x)φ(x)b(x)g(x) = 0.

Proof. Integrating by parts in the definition of the Stein kernel for p = bφ we get (assuming that lim_{x→±∞} b(x)φ(x) = 0)

\int_{x}^{+ \infty} y p (y) d y = \int_{x}^{\infty} b (y) (- φ^{'} (y)) d y = b (x) φ (x) + \int_{x}^{\infty} b^{'} (y) φ (y) d y

so that (17) follows by definition (16) of the inverse Stein operator. For the second claim we follow (Nourdin and Peccati, 2012, Proposition 3.2.2) and note how

τ_{X} (x) g^{'} (x) - x g (x) = \frac{(τ_{X} (x) g (x) p (x))^{'}}{p (x)}

so that any solution to (19) has the form

g (x) = \frac{1}{τ_{X} (x) p (x)} \int_{- \infty}^{x} \tilde{h} (u) p (u) d u + \frac{d}{τ_{X} (x) p (x)},

(20)

where $d \in R$ . By dominated convergence, one infers that

lim_{x \to \pm \infty} \int_{- \infty}^{x} \tilde{h} (y) b (y) φ (y) d y = 0,

so that the first summand in (20) has the announced form (18) and the asymptotic property is satisfied if and only if d = 0. □
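For b(x) = x², formula (17) gives the closed form τ_X(x) = 1 + 2/x², which is the expression used in the proof of Theorem 4.4. The following sketch compares this closed form with a direct numerical evaluation of the defining integral of the Stein kernel. It is an illustration only: the midpoint rule, the truncation point `upper` and the step count are our own assumptions, and the function names are hypothetical.

```python
import math

def p(x):
    """Two-sided Maxwell density x^2 * phi(x)."""
    return x * x * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def stein_kernel(x, upper=30.0, steps=100000):
    """tau_X(x) = (1/p(x)) * int_x^inf u p(u) du, via the midpoint rule on [x, upper]."""
    du = (upper - x) / steps
    total = 0.0
    for i in range(steps):
        u = x + (i + 0.5) * du
        total += u * p(u)
    return total * du / p(x)

def stein_kernel_closed_form(x):
    """Closed form from (17) with b(x) = x^2."""
    return 1.0 + 2.0 / (x * x)
```

The two functions agree to several digits at any x ≠ 0, and both blow up as x approaches 0, where p vanishes.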

Our next result provides the connection between the Stein equations (13) and (19).

Lemma 4.3. Suppose that b only has isolated zeros. Let all notations be as above and introduce the function g = g_f defined at all x such that b(x) > 0 through

\frac{f^{'} (x) - x f (x)}{b (x)} = τ_{X} (x) g^{'} (x) - x g (x) .

Then

E [\frac{f^{'} (W)}{b (W)} - \frac{f^{'} (W^{⋆})}{b (W^{⋆})}] = E [W (τ_{X} (W) - 1) g (W) - W^{⋆} (τ_{X} (W^{⋆}) - 1) g (W^{⋆})] + E [τ_{X} (W) g^{'} (W) - τ_{X} (W^{⋆}) g^{'} (W^{⋆})] .

(21)

Proof. Since

\frac{f^{'} (x) - x f (x)}{b (x)} = \frac{(f (x) φ (x))^{'}}{b (x) φ (x)}

and

τ_{X} (x) g^{'} (x) - x g (x) = \frac{(b (x) τ_{X} (x) g (x) φ (x))^{'}}{b (x) φ (x)}

at all x for which b(x) ≠ 0, we deduce that f and g are mutually defined by f = (bτ_X)g. This in turn gives

\frac{f^{'} (x)}{b (x)} = (\frac{b^{'} (x)}{b (x)} τ_{X} (x) + τ_{X}^{'} (x)) g (x) + τ_{X} (x) g^{'} (x) ≕ ψ (x) g (x) + τ_{X} (x) g^{'} (x)

(50)

which, combined with ψ(x) = x(τ_X(x) – 1) (that is easily derived using the various definitions involved), leads to the useful identity

\frac{f^{'} (x)}{b (x)} = x (τ_{X} (x) - 1) g (x) + τ_{X} (x) g^{'} (x)

(22)

from which (21) is directly derived. □
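For completeness, the identity ψ(x) = x(τ_X(x) − 1) used in the proof can be derived in a few lines. The computation below is our own sketch; it uses only p = bφ and the definition of the Stein kernel.

```latex
\[
\tau_X(x)\, p(x) = \int_{x}^{\infty} u\, p(u)\, du
\;\Longrightarrow\;
\tau_X'(x)\, p(x) + \tau_X(x)\, p'(x) = -x\, p(x),
\qquad
\frac{p'(x)}{p(x)} = \frac{b'(x)}{b(x)} - x,
\]
\[
\tau_X'(x) = -x - \tau_X(x)\left(\frac{b'(x)}{b(x)} - x\right)
\;\Longrightarrow\;
\psi(x) = \frac{b'(x)}{b(x)}\,\tau_X(x) + \tau_X'(x) = x\,\big(\tau_X(x) - 1\big).
\]
```

For b(x) = x² and τ_X(x) = 1 + 2/x² this gives ψ(x) = 2/x.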

Combining identities (12) and (21) we get (15), as promised. As already mentioned in the introduction, the price to pay for circumventing the singularities is the necessity to bound several additional quantities concerning the couplings we obtain. The explicit nature of the recursion described in Section 2 nevertheless allows us to compute the resulting quantities satisfactorily, leading to the bounds claimed in Theorem 3.6 and Corollary 3.7. This we perform in the Maxwell case in the next sections.

4.4. Approximating the two-sided Maxwell distribution

Theorem 4.4. Let p(x) = x²φ(x), and take f a solution to the Stein equation

f^{'} (w) ∕ w^{2} - w f (w) ∕ w^{2} = \tilde{h} (w),

(23)

where $\tilde{h}$ is a function having bounded first derivative and zero-mean under p. Set $c = ‖ {\tilde{h}}^{'} ‖$ . Then for any coupling of W and W* on a joint probability space such that W* has the x²-generalized zero biased distribution for W,

∣ E [\frac{f^{'} (W)}{(W)^{2}} - \frac{f^{'} (W^{⋆})}{(W^{⋆})^{2}}] ∣ \leq λ_{1} E ∣ W - W^{⋆} ∣ + λ_{2} E [∣ W ∣ ∣ W - W^{⋆} ∣] + λ_{3} E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ + λ_{4} E ∣ 1 - \frac{W^{⋆}}{W} ∣

(24)

with

λ_{1} \leq 6 c, λ_{2} \leq 7 c, λ_{3} \leq 18 c \text{ and } λ_{4} \leq 22 c .

(25)

Proof. With b(x) = x² we have τ_X(x) = 1 + 2/x² and ψ(x) = 2/x, so that (21) becomes

E [\frac{f^{'} (W^{⋆})}{(W^{⋆})^{2}} - \frac{f^{'} (W)}{(W)^{2}}] = E [\frac{2}{W^{⋆}} g (W^{⋆}) - \frac{2}{W} g (W)] + E [(1 + \frac{2}{(W^{⋆})^{2}}) g^{'} (W^{⋆}) - (1 + \frac{2}{(W)^{2}}) g^{'} (W)] = 2 E [(\frac{1}{W^{⋆}} - \frac{1}{W}) g (W^{⋆})] + 2 E [\frac{1}{W} (g (W^{⋆}) - g (W))] + E [g^{'} (W^{⋆}) - g^{'} (W)] + 2 E [\frac{1}{(W^{⋆})^{2}} g^{'} (W^{⋆}) - \frac{1}{(W)^{2}} g^{'} (W)] .

The first two terms are dealt with easily to get

∣ 2 E [(\frac{1}{W^{⋆}} - \frac{1}{W}) g (W^{⋆})] + 2 E [\frac{1}{W} (g (W^{⋆}) - g (W))] ∣ \leq 2 ‖ g ‖ E [∣ \frac{1}{W^{⋆}} - \frac{1}{W} ∣] + 2 ‖ g^{'} ‖ E [\frac{1}{∣ W ∣} ∣ W^{⋆} - W ∣] .

For the last two terms we introduce the function

χ (x) = g^{'} (x) ∕ x

to get on the one hand

E [g^{'} (W^{⋆}) - g^{'} (W)] = E [W^{⋆} \frac{g^{'} (W^{⋆})}{W^{⋆}} - W \frac{g^{'} (W)}{W}] = E [(W^{⋆} - W) χ (W^{⋆})] + E [W (χ (W^{⋆}) - χ (W))]

so that

∣ E [g^{'} (W^{⋆}) - g^{'} (W)] ∣ \leq ‖ χ ‖ E [∣ W^{⋆} - W ∣] + ‖ χ^{'} ‖ E [∣ W (W^{⋆} - W) ∣]

and, on the other hand

E [\frac{1}{(W^{⋆})^{2}} g^{'} (W^{⋆}) - \frac{1}{(W)^{2}} g^{'} (W)] = E [\frac{1}{W^{⋆}} χ (W^{⋆}) - \frac{1}{W} χ (W)] = E [(\frac{1}{W^{⋆}} - \frac{1}{W}) χ (W^{⋆})] + E [\frac{1}{W} (χ (W^{⋆}) - χ (W))]

so that

2 ∣ E [\frac{1}{(W^{⋆})^{2}} g^{'} (W^{⋆}) - \frac{1}{(W)^{2}} g^{'} (W)] ∣ \leq 2 ‖ χ ‖ E [∣ \frac{1}{W^{⋆}} - \frac{1}{W} ∣] + 2 ‖ χ^{'} ‖ E [\frac{1}{∣ W ∣} ∣ W^{⋆} - W ∣] .

Combining these different estimates we obtain (24), with λ₁, λ₂, λ₃ and λ₄ expressed in terms of ∥χ∥, ∥χ′∥, ∥g∥ and ∥g′∥ as follows:

λ_{1} = ‖ χ ‖, λ_{2} = ‖ χ^{'} ‖, λ_{3} = 2 (‖ g ‖ + ‖ χ ‖) and λ_{4} = 2 (‖ g^{'} ‖ + ‖ χ^{'} ‖) .

The inequalities in (25) are proved in Proposition 4.5 below. □
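The arithmetic behind (25) can be sketched in a few lines. The function name `lambda_bounds` and its parameters are hypothetical; the default multipliers are simply the norm bounds stated in Proposition 4.5, inserted into the expressions for λ₁, …, λ₄ given above.

```python
def lambda_bounds(c, g=3.0, g_prime=4.0, chi=6.0, chi_prime=7.0):
    """Upper bounds on (lambda_1, ..., lambda_4) for a given c = ||h'||.

    The default multipliers are the bounds of Proposition 4.5:
    ||g|| <= 3c, ||g'|| <= 4c, ||chi|| <= 6c, ||chi'|| <= 7c.
    """
    lam1 = chi * c                            # lambda_1 = ||chi||
    lam2 = chi_prime * c                      # lambda_2 = ||chi'||
    lam3 = 2.0 * (g + chi) * c                # lambda_3 = 2(||g|| + ||chi||)
    lam4 = 2.0 * (g_prime + chi_prime) * c    # lambda_4 = 2(||g'|| + ||chi'||)
    return lam1, lam2, lam3, lam4


print(lambda_bounds(1.0))  # (6.0, 7.0, 18.0, 22.0)
```

With c = 1 this reproduces the constants 6, 7, 18 and 22 appearing in (25).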

The next step is to bound ∥χ∥, ∥χ′∥, ∥g∥ and ∥g′∥ in a nontrivial way; this we achieve in the next proposition.

Proposition 4.5. Let $h : R \to R$ be absolutely continuous and integrable with respect to p(x) = x²φ(x). Set c = ∥h′∥ which we suppose to be finite. Let X ~ p, define

g_{0} (x) = \begin{cases} e^{x^{2} ∕ 2} \int_{x}^{\infty} y^{2} (h (y) - E [h (X)]) e^{- y^{2} ∕ 2} d y & \text{if } x > 0 \\ e^{x^{2} ∕ 2} \int_{- \infty}^{x} y^{2} (h (y) - E [h (X)]) e^{- y^{2} ∕ 2} d y & \text{if } x \leq 0 \end{cases}

(26)

and set

g (x) = \frac{g_{0} (x)}{x^{2} + 2} and χ (x) = \frac{g^{'} (x)}{x} .

(27)

Then

‖ g ‖ \leq 3 c, ‖ g^{'} ‖ \leq 4 c, ‖ χ ‖ \leq 6 c and ‖ χ^{'} ‖ \leq 7 c .

Remark 4.6. The function g₀ defined in (26) satisfies

g_{0}^{'} (x) - x g_{0} (x) = x^{2} (h (x) - E [Z^{2} h (Z)])

(28)

with Z ~ φ a standard Gaussian random variable.

Remark 4.7. The function g defined in (27) satisfies

\frac{(g (x) τ (x) p (x))^{'}}{p (x)} = h (x) - E [h (X)]

with X ~ p and τ(x) = 1 + 2/x².
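To make these definitions concrete, here is a small worked example. It assumes the test function h(x) = x (our choice for illustration, so that c = 1 and E[h(X)] = E[Z³] = 0) and considers only the branch x ≤ 0 of (26). Using the antiderivative ∫ y³e^{–y²/2} dy = –(y² + 2)e^{–y²/2}:

```latex
\[
\begin{aligned}
g_0(x) &= e^{x^{2}/2} \int_{-\infty}^{x} y^{3} e^{-y^{2}/2}\,dy = -(x^{2} + 2),
\qquad
g(x) = \frac{g_0(x)}{x^{2} + 2} = -1, \qquad x \le 0, \\
g_0'(x) - x\,g_0(x) &= -2x + x\,(x^{2} + 2) = x^{3} = x^{2}\,h(x).
\end{aligned}
\]
```

On this branch the second line is exactly (28), and ∣g∣ = 1 ≤ 3c, g′ = 0 and χ = 0, in line with the bounds of Proposition 4.5.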

Proof. To simplify notation we introduce $Φ (x) = \int_{- \infty}^{x} φ (t) d t$, $\overset{‒}{Φ} (x) = \int_{x}^{\infty} φ (t) d t$, $Υ (x) = e^{x^{2} ∕ 2} \int_{x}^{\infty} t^{2} e^{- t^{2} ∕ 2} d t$ and $\overset{‒}{Υ} (x) = e^{x^{2} ∕ 2} \int_{- \infty}^{x} t^{2} e^{- t^{2} ∕ 2} d t$. Using the identity

\int_{a}^{b} t^{2} e^{- t^{2} ∕ 2} d t = a e^{- a^{2} ∕ 2} - b e^{- b^{2} ∕ 2} + \int_{a}^{b} e^{- t^{2} ∕ 2} d t, - \infty \leq a \leq b \leq \infty,

(29)

we deduce that $Υ (x) = x + e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} d t$ and $\overset{‒}{Υ} (x) = - x + e^{x^{2} ∕ 2} \int_{- \infty}^{x} e^{- t^{2} ∕ 2} d t$ and thus

Υ (x), \overset{‒}{Υ} (x) \leq ∣ x ∣ + \sqrt{\frac{π}{2}} at all x \in R and lim_{x \to \infty} \frac{Υ (x)}{x} = lim_{x \to - \infty} \frac{\overset{‒}{Υ} (x)}{x} = 1 .

(30)
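The closed form for Υ and the first bound in (30) are easy to check numerically for x ≥ 0. The following is a minimal sketch, not part of the proof: the function names are hypothetical, the integral in the definition of Υ is approximated by Simpson's rule, and the upper limit is truncated at x + 12 (an assumption; the neglected tail is negligible for the points used).

```python
import math


def upsilon_closed(x):
    """Closed form: x + e^{x^2/2} * int_x^inf e^{-t^2/2} dt, written via erfc."""
    tail = math.sqrt(math.pi / 2.0) * math.erfc(x / math.sqrt(2.0))
    return x + math.exp(x * x / 2.0) * tail


def upsilon_quadrature(x, width=12.0, n=20000):
    """Definition: e^{x^2/2} * int_x^inf t^2 e^{-t^2/2} dt, by Simpson's rule.

    The integral is truncated at x + width; n must be even.
    """
    step = width / n

    def f(t):
        return t * t * math.exp(-t * t / 2.0)

    total = f(x) + f(x + width)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(x + k * step)
    return math.exp(x * x / 2.0) * total * step / 3.0


for x in (0.0, 0.5, 1.0, 2.0, 4.0):
    closed = upsilon_closed(x)
    # The two expressions for Upsilon agree up to quadrature error.
    assert abs(closed - upsilon_quadrature(x)) < 1e-8
    # First claim in (30), checked here for x >= 0 only.
    assert closed <= x + math.sqrt(math.pi / 2.0) + 1e-12
    print(x, closed)
```

At x = 0 the bound is attained, since Υ(0) = √(π/2).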

The proof is now broken down into several steps.

Step 1: rewrite the solutions. Following Chen et al. (2010, page 39) we rewrite the test functions in terms of their derivatives (still with Z a standard normal random variable):

h (y) - E [h (X)] = h (y) - E [Z^{2} h (Z)] = \int_{- \infty}^{\infty} z^{2} (h (y) - h (z)) φ (z) d z = \int_{- \infty}^{y} z^{2} (\int_{z}^{y} h^{'} (t) d t) φ (z) d z - \int_{y}^{\infty} z^{2} (\int_{y}^{z} h^{'} (t) d t) φ (z) d z .

Changing the order of integration and then using (29), the right-hand side becomes

\int_{- \infty}^{y} h^{'} (t) [\int_{- \infty}^{t} z^{2} φ (z) d z] d t - \int_{y}^{\infty} h^{'} (t) [\int_{t}^{\infty} z^{2} φ (z) d z] d t = \int_{- \infty}^{y} h^{'} (t) [- t φ (t) + \int_{- \infty}^{t} φ (z) d z] d t - \int_{y}^{\infty} h^{'} (t) [t φ (t) + \int_{t}^{\infty} φ (z) d z] d t = - \int_{- \infty}^{\infty} h^{'} (t) t φ (t) d t + \int_{- \infty}^{y} h^{'} (t) Φ (t) d t - \int_{y}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t,

and thus

h (y) - E [h (X)] = \int_{- \infty}^{y} h^{'} (t) Φ (t) d t - \int_{y}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t - E [Z h^{'} (Z)] .

(31)
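As a sanity check of (31), assume the test function h(y) = y (an illustrative choice), so that h′ ≡ 1, E[Z h′(Z)] = E[Z] = 0 and E[h(X)] = E[Z³] = 0. With the standard antiderivatives ∫_{–∞}^{y} Φ(t) dt = yΦ(y) + φ(y) and ∫_{y}^{∞} Φ̄(t) dt = φ(y) – yΦ̄(y), the right-hand side of (31) reduces to

```latex
\[
\int_{-\infty}^{y} \Phi(t)\,dt - \int_{y}^{\infty} \bar{\Phi}(t)\,dt
  = \bigl(y\,\Phi(y) + \varphi(y)\bigr) - \bigl(\varphi(y) - y\,\bar{\Phi}(y)\bigr)
  = y\,\bigl(\Phi(y) + \bar{\Phi}(y)\bigr)
  = y ,
\]
```

which is indeed h(y) – E[h(X)].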

We deduce the following useful bound

\frac{h (x) - E [h (X)]}{x^{2} + 2} \leq c \frac{2 (x + 1 ∕ \sqrt{2 π}) + \sqrt{2 ∕ π}}{x^{2} + 2} \leq 2 c .

(32)

Plugging (31) in (26) leads to (we restrict the discussion to x > 0, the other case following by symmetry)

g_{0} (x) = \underbrace{- E [Z h^{'} (Z)] Υ (x)}_{≕ I (x)} + \underbrace{e^{x^{2} ∕ 2} \int_{x}^{\infty} \int_{- \infty}^{y} y^{2} e^{- y^{2} ∕ 2} h^{'} (t) Φ (t) d t d y}_{≕ II (x)} - \underbrace{e^{x^{2} ∕ 2} \int_{x}^{\infty} \int_{y}^{\infty} y^{2} e^{- y^{2} ∕ 2} h^{'} (t) \overset{‒}{Φ} (t) d t d y}_{≕ III (x)} .

To deal with the quantities II(x) and III(x) we again interchange integrations to get

I I (x) = e^{x^{2} ∕ 2} \int_{- \infty}^{x} (\int_{x}^{\infty} y^{2} e^{- y^{2} ∕ 2} d y) h^{'} (t) Φ (t) d t + e^{x^{2} ∕ 2} \int_{x}^{\infty} (\int_{t}^{\infty} y^{2} e^{- y^{2} ∕ 2} d y) h^{'} (t) Φ (t) d t = Υ (x) \int_{- \infty}^{x} h^{'} (t) Φ (t) d t + e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} Υ (t) h^{'} (t) Φ (t) d t

and

I I I (x) = e^{x^{2} ∕ 2} \int_{x}^{\infty} (\int_{x}^{t} y^{2} e^{- y^{2} ∕ 2} d y) h^{'} (t) \overset{‒}{Φ} (t) d t = e^{x^{2} ∕ 2} \int_{x}^{\infty} (e^{- x^{2} ∕ 2} Υ (x) - e^{- t^{2} ∕ 2} Υ (t)) h^{'} (t) \overset{‒}{Φ} (t) d t = Υ (x) \int_{x}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t - e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} Υ (t) h^{'} (t) \overset{‒}{Φ} (t) d t

and thus if x ≥ 0 we have

g_{0} (x) = - E [Z h^{'} (Z)] Υ (x) + Υ (x) \int_{- \infty}^{x} h^{'} (t) Φ (t) d t - Υ (x) \int_{x}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t + e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} Υ (t) h^{'} (t) d t .

(33)

By a similar argument we deduce that if x < 0 then

g_{0} (x) = - E [Z h^{'} (Z)] Υ (x) + Υ (x) \int_{- \infty}^{x} h^{'} (t) \overset{‒}{Φ} (t) d t - Υ (x) \int_{x}^{\infty} h^{'} (t) Φ (t) d t + e^{x^{2} ∕ 2} \int_{- \infty}^{x} e^{- t^{2} ∕ 2} Υ (t) h^{'} (t) d t .

(34)
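Formula (33) can be checked numerically against (26) in a simple case. The sketch below assumes the test function h(y) = y (our choice for illustration), for which h′ ≡ 1, E[Z h′(Z)] = 0, the two Φ-integrals in (33) combine to x Υ(x), and the x > 0 branch of (26) evaluates to x² + 2. All function names are hypothetical, and the outer integral is computed by Simpson's rule on a truncated range.

```python
import math

SQRT_HALF_PI = math.sqrt(math.pi / 2.0)


def upsilon(x):
    """Upsilon(x) = x + e^{x^2/2} * int_x^inf e^{-t^2/2} dt, via erfc."""
    return x + math.exp(x * x / 2.0) * SQRT_HALF_PI * math.erfc(x / math.sqrt(2.0))


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0


def g0_from_33(x, width=12.0):
    """Right-hand side of (33) for h(y) = y and x >= 0."""
    # With h' = 1 the terms in Phi and Phi-bar combine to x * Upsilon(x).
    first = x * upsilon(x)

    # e^{-t^2/2} * Upsilon(t), expanded to avoid multiplying huge and tiny numbers.
    def weight(t):
        return t * math.exp(-t * t / 2.0) + SQRT_HALF_PI * math.erfc(t / math.sqrt(2.0))

    last = math.exp(x * x / 2.0) * simpson(weight, x, x + width)
    return first + last


def g0_from_26(x):
    """(26) for h(y) = y and x > 0: e^{x^2/2} * int_x^inf y^3 e^{-y^2/2} dy."""
    return x * x + 2.0


for x in (0.0, 1.0, 2.5):
    assert abs(g0_from_33(x) - g0_from_26(x)) < 1e-7
```

The agreement reflects the fact that (33) is only a rearrangement of (26); no bound on h′ is involved at this stage.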

Step 2: a bound on ∥g∥. Supposing ∥h′∥ ≤ c we can use (33) and the first claim in (30) to deduce that for x ≥ 0:

∣ g_{0} (x) ∣ \leq c E ∣ Z ∣ (x + \sqrt{π ∕ 2}) + c (x + \sqrt{π ∕ 2}) \int_{- \infty}^{x} Φ (t) d t + c (x + \sqrt{π ∕ 2}) \int_{x}^{\infty} \overset{‒}{Φ} (t) d t + c e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} (t + \sqrt{π ∕ 2}) d t .

The last two terms decrease strictly to 0 as x → ∞, with maximum value c/2 and c(1 + π/2), respectively. The first term is equal to $c (\sqrt{2 ∕ π} x + 1)$ and the second one is equal to

c (x + \sqrt{π ∕ 2}) \int_{- \infty}^{x} Φ (t) d t = c (x + \sqrt{π ∕ 2}) (x Φ (x) + φ (x)) \leq c (x^{2} + (\sqrt{π ∕ 2} + 1 ∕ \sqrt{2 π}) x + 1 ∕ 2) .

Similar (symmetric) bounds hold for x ≤ 0 and thus, collecting all these estimates, we may conclude:

∣ g (x) ∣ = \frac{∣ g_{0} (x) ∣}{x^{2} + 2} \leq 3 c .

(35)
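The constant 3 in (35) can be checked numerically by adding up the four estimates of Step 2 and dividing by x² + 2. This is a sketch under the normalization c = 1 and for x ≥ 0 only; the function name and the grid are hypothetical choices.

```python
import math


def step2_majorant(x):
    """Sum of the four upper bounds from Step 2 with c = 1, for x >= 0."""
    a = math.sqrt(math.pi / 2.0)
    b = 1.0 / math.sqrt(2.0 * math.pi)
    term1 = math.sqrt(2.0 / math.pi) * x + 1.0   # c E|Z| (x + sqrt(pi/2))
    term2 = x * x + (a + b) * x + 0.5            # bound on the term involving Phi
    term3 = 0.5                                  # maximal value of the Phi-bar term
    term4 = 1.0 + math.pi / 2.0                  # maximal value of the last term
    return term1 + term2 + term3 + term4


grid = [k / 1000.0 for k in range(0, 50001)]     # x in [0, 50]
worst = max(step2_majorant(x) / (x * x + 2.0) for x in grid)
assert worst < 3.0
print(worst)
```

On this grid the ratio stays below 3, consistent with ∣g(x)∣ ≤ 3c.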

Step 3: a bound on ∥g′∥. Here we start by rewriting the derivative as

g^{'} (x) = \frac{g_{0}^{'} (x)}{x^{2} + 2} - \frac{2 x}{(x^{2} + 2)^{2}} g_{0} (x) .

(36)

Using (35), the second summand is easily seen to be uniformly bounded (by 3c). We are left with the first summand for which we start by rewriting the numerator, for x ≥ 0, using (33):

g_{0}^{'} (x) = - Υ^{'} (x) E [Z h^{'} (Z)] + Υ^{'} (x) \int_{- \infty}^{x} h^{'} (t) Φ (t) d t - Υ^{'} (x) \int_{x}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t + Υ (x) (h^{'} (x) Φ (x) + h^{'} (x) \overset{‒}{Φ} (x)) + x e^{x^{2} ∕ 2} \int_{x}^{\infty} Υ (t) h^{'} (t) e^{- t^{2} ∕ 2} d t - e^{x^{2} ∕ 2} Υ (x) h^{'} (x) e^{- x^{2} ∕ 2}

which leads to

g_{0}^{'} (x) = - Υ^{'} (x) E [Z h^{'} (Z)] + Υ^{'} (x) \int_{- \infty}^{x} h^{'} (t) Φ (t) d t - Υ^{'} (x) \int_{x}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t + x e^{x^{2} ∕ 2} \int_{x}^{\infty} Υ (t) h^{'} (t) e^{- t^{2} ∕ 2} d t .

(37)

Now we can use the fact that $Υ^{'} (x) = x e^{x^{2} ∕ 2} \int_{x}^{\infty} e^{- t^{2} ∕ 2} d t \leq 1$ for all x ≥ 0 as well as all the arguments outlined at the previous step to deduce the bound: $∣ g_{0}^{'} (x) ∣ \leq c (\sqrt{\frac{2}{π}} + 2 (x + 1 ∕ \sqrt{2 π}) + \frac{1}{\sqrt{2 π}}) \leq 2 c (x + 1)$ whence

\frac{∣ g_{0}^{'} (x) ∣}{x^{2} + 2} \leq c \frac{2 x + 2}{x^{2} + 2} \leq c .

(38)

Similar (symmetric) arguments hold also for negative x and thus ∣g′(x)∣ ≤ 4c.

Step 4: a bound on χ(x) = g′(x)/x. Using (36) we know that

χ (x) = \frac{g_{0}^{'} (x)}{x (x^{2} + 2)} - \frac{2}{(x^{2} + 2)^{2}} g_{0} (x) .

(39)

The second summand in (39) is bounded using (35) and 2/(x² + 2) ≤ 1 to get

\frac{2}{(x^{2} + 2)^{2}} ∣ g_{0} (x) ∣ \leq 3 c .

(40)

For the first summand we use (37) to deduce

\frac{g_{0}^{'} (x)}{x} = - \frac{Υ^{'} (x)}{x} E [Z h^{'} (Z)] + \frac{Υ^{'} (x)}{x} \int_{- \infty}^{x} h^{'} (t) Φ (t) d t - \frac{Υ^{'} (x)}{x} \int_{x}^{\infty} h^{'} (t) \overset{‒}{Φ} (t) d t + e^{x^{2} ∕ 2} \int_{x}^{\infty} Υ (t) h^{'} (t) e^{- t^{2} ∕ 2} d t .

At this stage it is useful to remark that, for x ≥ 0, the function $Υ^{'} (x) ∕ x$ is strictly decreasing with maximal value $\sqrt{π ∕ 2}$ and hence $∣ \frac{g_{0}^{'} (x)}{x} ∣ \leq c (1 + 2 (x + \frac{1}{\sqrt{2 π}}) + \frac{1}{\sqrt{2 π}}) \leq c (2 x + 3)$ and thus $∣ \frac{g_{0}^{'} (x)}{x (x^{2} + 2)} ∣ \leq 3 c$ which, combined with (40), leads (after applying the symmetric arguments for x ≤ 0) to ∣χ(x)∣ ≤ 6c.

Step 5: a bound on ∥χ′∥. Direct computations using (28) give

χ (x) = \frac{1}{x^{2} + 2} (1 - \frac{2}{x^{2} + 2}) g_{0} (x) - \frac{x}{x^{2} + 2} (h (x) - E [Z^{2} h (Z)])

and thus

χ^{'} (x) = - \frac{2 x^{3}}{(x^{2} + 2)^{2}} \frac{g_{0} (x)}{x^{2} + 2} + (1 - \frac{2}{x^{2} + 2}) \frac{g_{0}^{'} (x)}{x^{2} + 2} - \frac{2 - x^{2}}{x^{2} + 2} \frac{h (x) - E [Z^{2} h (Z)]}{x^{2} + 2} - \frac{x}{x^{2} + 2} h^{'} (x) .

Using the bounds ∣2x³/(x² + 2)²∣ ≤ 1, ∣1 – 2/(x² + 2)∣ ≤ 1, ∣(2 – x²)/(x² + 2)∣ ≤ 1 and ∣x/(x² + 2)∣ ≤ 1 as well as (35), (38) and (32) we conclude (after applying the symmetric arguments for x ≤ 0) ∣χ′(x)∣ ≤ 7c. □

4.5. Verifying bounds on expectations

In this section we find bounds on the expectations in Theorem 3.6 in order to prove Corollary 3.7. We will make use of the following lemma.

Lemma 4.8. If x₁, … , x_N is the unique strictly decreasing zero-mean solution of (4), then $x_{1} = O (\sqrt{\log N})$ .

Proof. To simplify the notation, note that it suffices to consider the rescaled recursion $x_{n + 1}^{3} = x_{n}^{3} - S_{n}^{- 1}$ , where S_n is defined in the proof of Lemma 2.1. By expressing $x_{1}^{3}$ as a telescoping sum,

x_{1}^{3} = \sum_{n = 1}^{m - 1} (x_{n}^{3} - x_{n + 1}^{3}) + x_{m}^{3} = \sum_{n = 1}^{m - 1} S_{n}^{- 1} + x_{m}^{3} \leq \sum_{n = 1}^{m - 1} (n ∕ x_{1})^{- 1} + x_{m}^{3} \leq x_{1} (1 + \log m) + x_{m}^{3},

where we have used Euler’s approximation to the harmonic sum for the last inequality. By the variance property (P2) (in this rescaled case $x_{1}^{2} + \dots + x_{N}^{2} = N - 1$) we have that x₁ is bounded away from zero (as a sequence indexed by N) and x_m is bounded, so $x_{m}^{3} ∕ x_{1}$ is bounded. Dividing the above display by x₁, we then obtain $x_{1} = O (\sqrt{\log N})$ . □
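The harmonic-sum estimate used in the last inequality of the telescoping display, namely 1 + 1/2 + … + 1/(m – 1) ≤ 1 + log m, can be confirmed numerically. A minimal sketch (the function name is hypothetical):

```python
import math


def harmonic(m):
    """Partial harmonic sum 1 + 1/2 + ... + 1/(m - 1)."""
    return sum(1.0 / n for n in range(1, m))


for m in (2, 10, 1000, 100000):
    # Bound used in the proof of Lemma 4.8.
    assert harmonic(m) <= 1.0 + math.log(m)
```

Multiplying by x₁ gives the term x₁(1 + log m) in the display.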

Proof of Corollary 3.7. From Proposition 3.5 and the recursion (4), note that p*(x) puts mass 1/(N – 1) on each interval between successive x_n, so it is easy to create a coupling of $W \sim P_{N}$ with W* ~ p*(x) such that

∣ W - W^{⋆} ∣ \leq ∣ x_{n} - x_{n + 1} ∣

when W ∈ [x_{n+1}, x_n]. For a detailed proof of such a coupling, see the construction given in McKeague and Levin (2016). From (P3) (see Lemma 2.1) and Lemma 4.8 we then have

E ∣ W - W^{⋆} ∣ \leq \frac{1}{N - 1} \sum_{n = 1}^{N - 1} (x_{n} - x_{n + 1}) = \frac{2 x_{1}}{N - 1} = O (\frac{\sqrt{\log N}}{N}) .

(41)

Second, using $∣ W ∣ \leq x_{1} = O (\sqrt{\log N})$ it follows immediately that

E [∣ W ∣ ∣ W - W^{⋆} ∣] = O (\frac{\log N}{N}) .

Third, the zero-median property gives

2 x_{m}^{3} = x_{m}^{3} - x_{m + 1}^{3} = S_{m}^{- 1} \geq (m ∕ x_{m})^{- 1} = x_{m} ∕ m,

where m = N/2 + 1, so $x_{m} \geq 1 ∕ \sqrt{N}$ . By symmetry

E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ = E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ 1_{W^{⋆} \in (x_{m + 1}, x_{m}]} + 2 \sum_{n = 1}^{m - 1} E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ 1_{W^{⋆} \in (x_{n + 1}, x_{n}]} .

From Proposition 3.5 note that p*(x) ∝ x² for x ∈ (x_{m+1}, x_m]. Also using the fact that p*(x) puts mass 1/(N – 1) on this interval, the first term above can be written

\frac{6}{x_{m}^{3} (N - 1)} \int_{0}^{x_{m}} (\frac{1}{x} - \frac{1}{x_{m}}) x^{2} d x \leq \frac{3}{x_{m} (N - 1)} = O (\frac{1}{\sqrt{N}}) .

The second term is bounded above by the telescoping sum

\frac{2}{N - 1} \sum_{n = 1}^{m - 1} (\frac{1}{x_{n + 1}} - \frac{1}{x_{n}}) = \frac{2}{N - 1} (\frac{1}{x_{m}} - \frac{1}{x_{1}}) = O (\frac{1}{\sqrt{N}}),

so we have

E ∣ \frac{1}{W} - \frac{1}{W^{⋆}} ∣ = O (\frac{1}{\sqrt{N}}) .
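For completeness, the integral appearing in the bound on the first term can be evaluated in closed form (a routine computation, included only as an illustration), which shows that the constant 3 used there is not tight:

```latex
\[
\frac{6}{x_m^{3}(N-1)} \int_{0}^{x_m} \Bigl(\frac{1}{x} - \frac{1}{x_m}\Bigr) x^{2}\,dx
  = \frac{6}{x_m^{3}(N-1)} \Bigl(\frac{x_m^{2}}{2} - \frac{x_m^{2}}{3}\Bigr)
  = \frac{1}{x_m\,(N-1)} .
\]
```

Combined with x_m ≥ 1/√N, this is of order 1/√N, as stated.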

Fourth,

E ∣ 1 - \frac{W^{⋆}}{W} ∣ \leq \sqrt{N} E ∣ W - W^{⋆} ∣ = O (\sqrt{\frac{\log N}{N}}),

using $∣ W ∣ \geq x_{m} \geq 1 ∕ \sqrt{N}$ and (41). The corollary now follows from Theorem 3.6. □

Acknowledgements

The research of Ian McKeague was partially supported by NSF Grant DMS-1307838 and NIH Grant 2R01GM095722-05. The research of Yvik Swan was partially supported by the Fonds de la Recherche Scientifique - FNRS under Grant no F.4539.16. We also thank the Institute for Mathematical Sciences at National University of Singapore for support during the Workshop on New Directions in Stein’s Method (May 18–29, 2015) where work on the paper was initiated.

References

Bohm D (1952). A suggested interpretation of the quantum theory in terms of “hidden” variables. I. Phys. Rev. 85, 166–179.
Cacoullos T and Papathanasiou V (1989). Characterizations of distributions by variance bounds. Statist. Probab. Lett. 7, 351–356.
Chatterjee S (2009). Fluctuations of eigenvalues and second order Poincaré inequalities. Probab. Theory Related Fields 143, 1–40.
Chatterjee S and Shao Q-M (2011). Non-normal approximation by Stein’s method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Probab. 21, 464–483.
Chen L, Goldstein L and Shao Q-M (2010). Normal Approximation by Stein’s Method. Springer Verlag.
Döbler C (2015). Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Probab. 20, 1–34.
Goldstein L and Reinert G (1997). Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 7, 935–952.
Hall MJW, Deckert DA and Wiseman HM (2014). Quantum phenomena modeled by interactions between many classical worlds. Phys. Rev. X 4, 041013.
Ley C, Reinert G and Swan Y (2017). Stein’s method for comparison of univariate distributions. Probab. Surv. 14, 1–52.
McKeague IW and Levin B (2016). Convergence of empirical distributions in an interpretation of quantum mechanics. Ann. Appl. Probab. 26, 2540–2555.
Nourdin I and Peccati G (2009). Stein’s method on Wiener chaos. Probab. Theory Related Fields 145, 75–118.
Nourdin I and Peccati G (2012). Normal approximations with Malliavin calculus: from Stein’s method to universality. Vol. 192. Cambridge University Press.
Peköz E and Röllin A (2011). New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 39, 587–608.
Peköz E, Röllin A and Ross N (2013). Degree asymptotics with rates for preferential attachment random graphs. Ann. Appl. Probab. 23, 1188–1218.
Peköz E, Röllin A and Ross N (2016). Generalized gamma approximation with rates for urns, walks and trees. Ann. Probab. 44, 1776–1816.
Ross N (2011). Fundamentals of Stein’s method. Probab. Surv. 8, 210–293.
Stein C (1986). Approximate Computation of Expectations. Institute of Mathematical Statistics Lecture Notes–Monograph Series, 7. Institute of Mathematical Statistics, Hayward, CA.
