Abstract
Hall et al. (2014) recently proposed that quantum theory can be understood as the continuum limit of a deterministic theory in which there is a large, but finite, number of classical “worlds.” A resulting Gaussian limit theorem for particle positions in the ground state, agreeing with quantum theory, was conjectured in Hall et al. (2014) and proven by McKeague and Levin (2016) using Stein’s method. In this article we show how quantum position probability densities for higher energy levels beyond the ground state may arise as distributional fixed points in a new generalization of Stein’s method. These are then used to obtain a rate of distributional convergence for conjectured particle positions in the first energy level above the ground state to the (two-sided) Maxwell distribution; new techniques must be developed for this setting where the usual “density approach” Stein solution (see Chatterjee and Shao (2011)) has a singularity.
Keywords: Interacting particle system, Higher energy levels, Maxwell distribution, Stein’s method
1. Introduction
Hall et al. (2014) proposed a many interacting worlds (MIW) theory for interpreting quantum mechanics in terms of a large but finite number of classical “worlds.” In the case of the MIW harmonic oscillator, an energy minimization argument was used to derive a recursion giving the location of the oscillating particle as viewed in each of the worlds. Hall et al. conjectured that the empirical distribution of these locations converges to a Gaussian distribution as the total number of worlds N increases. McKeague and Levin (2016) recently proved such a result and provided a rate of convergence. More specifically, McKeague and Levin showed that if x1, … , xN is a decreasing, zero-mean sequence of real numbers satisfying the recursion relation
$$x_{n+1} = x_n - \frac{1}{x_1 + \cdots + x_n}, \qquad n = 1, \ldots, N-1, \tag{1}$$
then the empirical distribution of the xn tends to the standard Gaussian distribution as N → ∞. Here xn represents the location of the oscillating particle in the nth world, and the Gaussian limit distribution agrees with quantum theory for a particle in the lowest energy (ground) state.
The hypothesized correspondence with quantum theory suggests that stable configurations should also exist at higher energies in the MIW theory. Moreover, the empirical distributions of these configurations should converge to distributions with densities of the form
$$p_k(x) = \frac{\mathrm{He}_k(x)^2}{k!}\,\varphi(x), \tag{2}$$
where φ(x) is the standard normal density,
$$\mathrm{He}_k(x) = (-1)^k e^{x^2/2}\, \frac{d^k}{dx^k}\, e^{-x^2/2}$$
is the (probabilist’s) kth Hermite polynomial, and k is a non-negative integer. The ground state discussed above corresponds to k = 0 and has the standard Gaussian limit. However, the question of how to characterize higher energy MIW states corresponding to k ≥ 1 is still unresolved as far as we know.
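As a quick numerical illustration, the following minimal sketch evaluates the densities pk(x) = Hek(x)²φ(x)/k! with NumPy's probabilists' Hermite module and checks that each integrates to one, and that p1 has second moment 3. The helper name `p_k` and the integration grid are illustrative choices, not part of the original development.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as herme


def p_k(x, k):
    """Density He_k(x)^2 * phi(x) / k! (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0  # select the k-th probabilists' Hermite polynomial
    phi = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
    return herme.hermeval(x, coeffs) ** 2 * phi / math.factorial(k)


# Riemann sum on a wide, fine grid; the mass beyond |x| = 12 is negligible.
grid = np.linspace(-12.0, 12.0, 240001)
dx = grid[1] - grid[0]
masses = [float(np.sum(p_k(grid, k)) * dx) for k in range(4)]
second_moment_p1 = float(np.sum(grid**2 * p_k(grid, 1)) * dx)
```

For k = 1 the density reduces to x²φ(x), which vanishes at the origin; this interior zero is the feature that complicates the Stein-method analysis later on.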
The energy minimization approach of Hall et al. (2014) starts with an analysis of the Hamiltonian for the MIW harmonic oscillator:
$$H_0(\mathbf{x}, \mathbf{p}) = E(\mathbf{p}) + V(\mathbf{x}) + U_0(\mathbf{x}),$$
where the locations of particles (having unit mass) in the N worlds are specified by x = (x1, … , xN) with x1 > x2 > … > xN, and their momenta by p = (p1, … , pN). Here $E(\mathbf{p}) = \sum_{n=1}^{N} p_n^2/2$ is the kinetic energy, $V(\mathbf{x}) = \sum_{n=1}^{N} x_n^2$ is the potential energy (for the parabolic trap), and
$$U_0(\mathbf{x}) = \sum_{n=1}^{N} \left[ \frac{1}{x_{n+1} - x_n} - \frac{1}{x_n - x_{n-1}} \right]^2$$
is called the “interworld” potential, where x0 = ∞ and xN+1 = −∞. In the ground state, there is no movement because all the momenta pn have to vanish for the total energy to be minimized. In this case, as mentioned above, Hall et al. (2014) showed that the particle locations xn satisfy (1) and McKeague and Levin (2016) showed that the empirical distribution tends to a standard Gaussian distribution.
Our contribution in the present article is to derive an interworld potential for the second energy state (k = 1) and show that the empirical distribution of the configuration that minimizes the corresponding Hamiltonian has a limit distribution that again agrees with quantum theory. The interworld potential in this case is shown to be
| (3) |
and the minimizer of the corresponding Hamiltonian H1(x, p) = E(p) + V(x) + U1(x) is shown to satisfy the recursion
$$x_{n+1}^3 = x_n^3 - \frac{3}{x_1^{-1} + \cdots + x_n^{-1}}, \qquad n = 1, \ldots, N-1. \tag{4}$$
Further, we show that if x1, … , xN is a decreasing, zero-mean solution, then the empirical distribution of the xn converges to the (two-sided) Maxwell distribution having density p1(x) = x²φ(x). The entire sequence x1, … , xN should be viewed as indexed by N, though we suppress notation for this dependence and write x1, … , xN instead of x1,N, … , xN,N. We also give a rate of convergence using a new extension of Stein’s method. Our approach is generalizable to recursions that converge to the distributions of other higher energy states of the quantum harmonic oscillator, although we do not pursue such extensions here.
We initially thought that the MIW interpretation could be based on a “universal” interworld potential function U0 that applies to all energy levels, with the densities pk(x) then arising as limits of local minima of H0. However, this idea turned out to be analytically unworkable. Here we propose an alternative approach in which the interworld potential is adapted to each higher energy level. Minimizing the resulting Hamiltonian is then tractable and the solution can be shown to converge to pk(x), at least in the case k = 1. Hall et al. (2014) derived their interworld potential U0 as a discretization of Bohm’s quantum potential summed over the particle ensemble; see Bohm (1952). The challenge in general is to extend this derivation to higher-energy wave functions in a way that leads to an explicit recursion minimizing the resulting Hamiltonian, and to show that it agrees with pk(x) in the limit. A major contribution here, in addition to providing a rate of convergence, is a general method for finding such interworld potential functions and their associated particle recursions.
Stein’s method (see Stein (1986), Chen et al. (2010) and Ross (2011)) is a well-established technique for obtaining explicit error bounds for distributional limit theorems. However, the usual “density approach” (see Chatterjee and Shao (2011)) for applying Stein’s method does not seem to work in cases where the density function vanishes at a point in the interior of the support of the target distribution (here we have p1(0) = 0 and the support is the whole real line). As we elaborate later, in this case the solution to the Stein equation will have a singularity and also unbounded derivatives. This motivates the new technique we will develop to handle such distributions. While there are plenty of examples of Stein’s method applied to distributions with a density having a zero on the boundary of the support (the gamma and beta distributions, for example), there have been no examples (that we know of) with a zero in the interior of the support; the higher energy distributions pk(x), for k ≥ 1, appear to be the first such distributions considered. The price one has to pay with our approach for handling these zeros is more complicated estimates involving couplings. In our case, however, analytical properties of the recursion (4) can fortunately be exploited to establish such estimates.
In Section 2 we generalize the argument of Hall et al. (2014) to derive the interworld potential, and show how it leads to the solution (4). In Section 3 we introduce the notion of a generalized zero-bias transformation, and show that the distributional properties of eigenstates of the quantum harmonic oscillator can be characterized in terms of fixed points of this transformation. Also, we derive the generalized zero-bias distribution for the empirical distribution of general configurations. Section 4 develops our results based on the new extension of Stein’s method to show convergence of the configuration that minimizes the Hamiltonian of the second energy state.
2. Interworld potentials for higher energy states
Hall et al. (2014) introduced their MIW theory from the perspective of the de Broglie–Bohm interpretation of quantum mechanics, which is mathematically equivalent to standard quantum theory. They used this approach to construct an ansatz for the conjectured interworld potential U0 governing the ground state wave function of the quantum harmonic oscillator. In this section we introduce an extended version of this ansatz aimed at providing a MIW characterization of the higher energy eigenstates.
Our argument follows along the lines of Section IIIA of Hall et al. (2014) with the major difference being that we now need to introduce a more general way of approximating the density of particle location for a stationary wave function ψ(x), namely for a density of the form p(x) = |ψ(x)|² = b(x)φ(x), where b(x) is a non-negative, even, smooth function having finitely many zeros. Here b represents a “baseline” that varies more rapidly than φ(x). Let x1 > x2 > … > xN. Bohm’s quantum potential summed over the ensemble {xn} is defined by
| (5) |
where we are using dimensionless units. An approximation to p(xn) based on ignoring φ(x) is given (up to a normalizing constant) by
where is the cumulative baseline function. This suggests
where we set B(x0) = ∞ and B(xN+1) = −∞. Our proposed ansatz for the interworld potential is then based on inserting the above expression into (5) to obtain
| (6) |
Note that our earlier assumptions about b imply that B is strictly increasing, so Ub is well-defined. In the simplest cases b(x) = 1 and b(x) = x² the above expression for Ub agrees with the interworld potentials U0 and U1 defined in the Introduction. We conjecture that the interworld potential Ub is suitable for obtaining MIW approximations to the class of target distributions of the form pk(x). Indeed, there may be a natural affinity between our new version of Stein’s method and the densities pk(x) for all the energy levels of the quantum harmonic oscillator.
Specializing to the case b(x) = x², the following argument characterizes the minimizer of the Hamiltonian H1 (i.e., the ground state when the interworld potential is U1) in terms of a solution to the recursion (4). In any ground state the particles do not move, so the kinetic energy E vanishes. Then, adapting the argument of Hall et al. (2014) to apply to H1, we have
where the first inequality is Cauchy–Schwarz. So U1 ≥ 9(N – 1)²/V, leading to
with the last inequality being equality for V = 3(N – 1). It follows that H1 is minimized when V = 3(N – 1), the mean of vanishes, and
for some constant α. The sum on the right of the above display telescopes, leading to the recursion (4) by rearranging and noting that α = −V/(N – 1) = −3.
The following lemma provides the basic properties we need to ensure the existence of a solution of the Maxwell recursion (4) that minimizes the Hamiltonian H1, as well as ensuring that the solution is unique. This result is analogous to Lemma 1 of McKeague and Levin (2016) concerning solutions of (1), but the difference here is that the variance is 3, agreeing with the Maxwell distribution (rather than close to standard normal in the case of (1)).
Lemma 2.1. Suppose N is even. Every zero-median solution x1, … , xN of (4) satisfies:
(P1) Zero-mean: x1 + … + xN = 0.
(P2) Maxwell variance: x1² + … + xN² = 3(N – 1).
(P3) Symmetry: xn = −xN+1−n for n = 1, … , N.
Further, there exists a unique solution x1, … , xN such that (P1) and
(P4) Strictly decreasing: x1 > … > xN
hold. This solution has the zero-median property, and thus also satisfies (P2) and (P3).
Proof. The proof follows identical steps to the proof of Lemma 1 of McKeague and Levin (2016), apart from the variance property (P2), which is proved using (P1) and (P3) as follows. Denote Sn = x1⁻¹ + … + xn⁻¹ for n = 1, … , N, and set S0 = 0. Using (4) we can write
where we used the recursion in the second equality, and the last equality is from a telescoping sum. (P3) implies SN = 0, so −SN–1 = 1/xN, and (P2) follows. □
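The properties in Lemma 2.1 can be explored numerically. The following is a minimal sketch, assuming that recursion (4) takes the form x_{n+1}³ = x_n³ − 3/S_n with S_n = x_1⁻¹ + … + x_n⁻¹ (the form consistent with the telescoping argument above). For even N = 2m it finds x_1 by bisection so that the zero-median condition x_{m+1} = −x_m holds, then fills in the lower half by symmetry. The function name, the bracketing interval and the iteration count are illustrative assumptions.

```python
import numpy as np


def maxwell_configuration(N, iters=200):
    """Symmetric decreasing solution of the assumed recursion (4) for even N."""
    if N < 2 or N % 2:
        raise ValueError("N must be a positive even integer")
    m = N // 2

    def upper_half(x1):
        # Iterate x_{n+1}^3 = x_n^3 - 3 / S_n, S_n = 1/x_1 + ... + 1/x_n,
        # for the m largest points; return None if a point fails to stay positive.
        xs, S = [x1], 1.0 / x1
        for _ in range(m - 1):
            cube = xs[-1] ** 3 - 3.0 / S
            if cube <= 0.0:
                return None
            xs.append(cube ** (1.0 / 3.0))
            S += 1.0 / xs[-1]
        return np.array(xs), S

    def mismatch(x1):
        # Zero median, x_{m+1} = -x_m, is equivalent to x_m^3 * S_m = 3/2.
        out = upper_half(x1)
        if out is None:
            return -1.0  # x1 too small: treat as "below the root"
        xs, S = out
        return xs[-1] ** 3 * S - 1.5

    lo, hi = 0.5, 10.0 * np.sqrt(N)  # mismatch(lo) < 0 < mismatch(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    top, _ = upper_half(hi)
    return np.concatenate([top, -top[::-1]])  # impose symmetry (P3)
```

Calling `maxwell_configuration(22)` returns a strictly decreasing, symmetric vector whose squared entries sum to 3(N − 1) up to rounding, in line with (P1) to (P4). For N = 2 the sketch returns ±√(3/2).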
Although in the sequel we concentrate on the case k = 1 (see Figure 1), to conclude this section we briefly discuss general densities of the form pk given in (2). The above argument for b(x) = x² can be extended to general Ub under the condition that B(x) is proportional to xb(x), which is the case when b(x) is proportional to x^r for some even non-negative integer r (but not for the square of the kth Hermite polynomial unless k = 0 or 1). Under this condition, it can be shown that the minimizer of the Hamiltonian based on Ub is a symmetric solution of the recursion
| (7) |
Figure 1.

Example with b(x) = x², N = 22, showing the piecewise constant density having mass 1/(N – 1) uniformly distributed over the intervals between successive xn compared with the Maxwell density, where the breaks in the histogram are the successive xn satisfying the recursion (4).
We have not been able to show that this recursion minimizes the Hamiltonian for general b, but our numerical results suggest that it is very close if not identical to a minimizer. With k = 2 we have b(x) = (x² – 1)²/2, B(x) = x⁵/10 – x³/3 + x/2, and the symmetric solution of the resulting recursion produces a remarkably good agreement with pk; see Figure 2.
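The k = 2 baseline and its cumulative version can be checked mechanically. The short sketch below uses NumPy's polynomial classes to confirm that B′ = b, where b is the squared second Hermite polynomial divided by 2!; the variable names are illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial import hermite_e as herme

# He_2(x) = x^2 - 1, converted from the Hermite basis to ordinary coefficients.
he2 = Polynomial(herme.herme2poly([0.0, 0.0, 1.0]))

# Baseline b(x) = He_2(x)^2 / 2! = (x^2 - 1)^2 / 2.
b = he2**2 / 2.0

# Cumulative baseline B(x) = x^5/10 - x^3/3 + x/2 (coefficients by increasing degree).
B = Polynomial([0.0, 0.5, 0.0, -1.0 / 3.0, 0.0, 0.1])

derivative_matches = bool(np.allclose(B.deriv().coef, b.coef))
```

Here B is the antiderivative of b normalized so that B(0) = 0, which keeps B odd when b is even.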
Figure 2.
Example with b(x) = Hek(x)²/k! for k = 2, N = 41, where the breaks in the histogram are the successive xn satisfying the recursion (7) and the red curve is pk(x).
3. Generalized zero-bias transformations
Let W be a symmetric random variable and b: a non-negative function such that . Goldstein and Reinert (1997) gives a distributional fixed point characterization of the Gaussian distribution, which we generalize in the definition below.
Definition 3.1. If there is a random variable W* such that
for all absolutely continuous functions f: such that , we say that W* has the b-generalized-zero-bias distribution of W.
Remark 3.2. Goldstein and Reinert (1997) study the case b(x) = 1 and show that W* has the same distribution as W if and only if W has a Gaussian distribution. Distributional fixed point characterizations for exponential, gamma and other nonnegative distributions and the connection with Stein’s method have been studied in Peköz and Röllin (2011), Peköz et al. (2013), and Peköz et al. (2016).
Remark 3.3. By a routine extension of the proof of Proposition 2.1 of Chen et al. (2010), it can be shown that there exists a unique distribution for W*, and it is absolutely continuous with density
We note in passing that the σ² should be on the other side of the equality in the first display of Chen et al.’s proposition, which corresponds to b(x) = 1, the usual zero-bias distribution of W. The composition of the b-generalized-zero-bias transformation with the (1/b)-generalized-zero-bias transformation is the usual zero-bias transformation.
Remark 3.4. With φ the standard normal density and b a φ-integrable function, if W has density
| (8) |
then its distribution is a fixed point for the b-generalized-zero-bias transformation since
The following result gives the b-generalized-zero-bias distribution of the uniform distribution on N points.
Proposition 3.5. Given an integer N > 1, let x1 > x2 > … > xN be such that b(xn) > 0 for all n. Let be the empirical distribution of the xn:
for any Borel set . Under the symmetry condition xn = −xN−n+1 for n = 1, … , N, the b-generalized-zero-bias distribution of is defined, and has density
for xn+1 < x ≤ xn (n = 1, … , N – 1), and p* (x) = 0 if x > x1 or x ≤ xN.
Proof. Immediate from Remark 3.3. □
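Recall the following distances between distribution functions F and G. The Kolmogorov distance is
$$d_K(F, G) = \sup_{x \in \mathbb{R}} |F(x) - G(x)|$$
and the Wasserstein distance is
$$d_W(F, G) = \sup_{h \in \mathcal{H}} \left| \int h\,dF - \int h\,dG \right|,$$
where
$$\mathcal{H} = \{ h : \mathbb{R} \to \mathbb{R} \ \text{Lipschitz with} \ \|h'\| \le 1 \}$$
and ∥ · ∥ is the supremum norm. Using Proposition 1.2 in Ross (2011), these two metrics are seen to be related by
$$d_K(F, G) \le \sqrt{2C\, d_W(F, G)}$$
if G has density bounded by C.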
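To make the two metrics concrete, here is a minimal sketch that computes both distances between the empirical distribution of N points and the standard normal law, and then compares them through the bound d_K ≤ √(2C d_W) of Ross (2011, Proposition 1.2). It relies on the standard one-dimensional identity that the Wasserstein distance equals the integral of |F − G|. The function names, the integration grid and the choice of test points (normal quantiles) are illustrative assumptions.

```python
import math
from statistics import NormalDist

import numpy as np

std_normal = NormalDist()


def kolmogorov_distance(points):
    """sup_x |F_N(x) - Phi(x)| for the empirical CDF F_N of the given points."""
    xs = np.sort(np.asarray(points, dtype=float))
    n = len(xs)
    cdf = np.array([std_normal.cdf(x) for x in xs])
    upper = np.arange(1, n + 1) / n  # F_N at each jump
    lower = np.arange(0, n) / n      # F_N just before each jump
    return float(max(np.max(np.abs(upper - cdf)), np.max(np.abs(cdf - lower))))


def wasserstein_distance(points, half_width=10.0, steps=200001):
    """Integral of |F_N - Phi|, approximated by a Riemann sum on a fixed grid."""
    xs = np.sort(np.asarray(points, dtype=float))
    grid = np.linspace(-half_width, half_width, steps)
    emp = np.searchsorted(xs, grid, side="right") / len(xs)
    cdf = np.array([std_normal.cdf(g) for g in grid])
    return float(np.sum(np.abs(emp - cdf)) * (grid[1] - grid[0]))


N = 50
quantile_points = [std_normal.inv_cdf((i - 0.5) / N) for i in range(1, N + 1)]
d_k = kolmogorov_distance(quantile_points)
d_w = wasserstein_distance(quantile_points)
C = 1.0 / math.sqrt(2.0 * math.pi)  # bound on the standard normal density
```

For these quantile points the Kolmogorov distance is 1/(2N) by construction, and the computed pair (d_k, d_w) satisfies the stated inequality with room to spare.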
Restricting attention to the special case b(x) = x2, we can now state our main result, along with an important corollary.
Theorem 3.6. Suppose W* is constructed on the same probability space as the zero-mean random variable W and is distributed according to the x²-generalized-zero-bias distribution of W. Let M have the two-sided Maxwell density x²φ(x). Then there exist positive finite constants λ1, λ2, λ3 and λ4 such that
| (9) |
Proof. The inequality follows immediately from Theorem 4.4. Finiteness of the constants (along with explicit upper bounds) is detailed in Proposition 4.5. □
The following corollary gives a rate of convergence of the solution to (4) to the two-sided Maxwell distribution in terms of the Wasserstein distance; we postpone the proof until Section 4.3.
Corollary 3.7. Suppose x1, … , xN is a monotonic, zero-mean, finite sequence of real numbers satisfying (4), let be the empirical distribution of these values, and let M be as in Theorem 3.6. Then there is a constant C > 0 such that
4. The Stein equation and its solutions
4.1. General considerations on Stein’s method and the problem with Stein’s density approach
Let F and G be two cumulative distribution functions which one wishes to compare. Denote by L¹(F) (resp., L¹(G)) the class of Borel measurable functions h such that ∫ |h| dF < ∞ (resp., ∫ |h| dG < ∞). A discrepancy measure between F and G is an integral probability metric if it can be written in the form
$$d_{\mathcal{H}}(F, G) = \sup_{h \in \mathcal{H}} \left| \int h\,dF - \int h\,dG \right|$$
for some class of test functions $\mathcal{H}$. The aforementioned Kolmogorov and Wasserstein distances are two important examples of integral probability metrics.
Suppose that F is absolutely continuous with density p on the real line, and introduce the operator h ↦ fh which, to each h ∈ L¹(F), assigns the function
$$f_h(x) = \frac{1}{p(x)} \int_{-\infty}^{x} \bigl(h(u) - F(h)\bigr)\, p(u)\,du, \tag{10}$$
where F(h) = ∫ h dF. The integrability condition h ∈ L¹(F) guarantees that fh is the unique absolutely continuous solution to the differential equation
$$f_h'(x) + \frac{p'(x)}{p(x)}\, f_h(x) = h(x) - F(h)$$
that also satisfies the boundary conditions limx→±∞ p(x)fh(x) = 0. Assuming that both sides of the differential equation (known, in the Stein community argot, as a “Stein equation”) can be integrated with respect to G, we get
$$E\,h(W) - F(h) = E\left[ f_h'(W) + \frac{p'(W)}{p(W)}\, f_h(W) \right], \tag{11}$$
with W a random variable distributed according to G. This last expression provides a means of bounding integral probability metrics (and thus in particular the Kolmogorov and Wasserstein distances) in terms of the action of a differential operator over a class of functions.
The steps outlined above form the basis of what is known as the “density approach” to Stein’s method (see e.g. Chatterjee and Shao (2011)), which is the most intuitive extension of Stein’s method of normal approximation (as described in Chen et al. (2010)) to arbitrary continuous target distributions. In order for (11) to be of practical use, however, it is crucial that the functions p′/p, fh and fh′ be amenable to computations; it is particularly important that fh and its first derivatives be bounded. Such conditions are not met in the case of the two-sided Maxwell distribution p(x) = x²φ(x) with which we are concerned in this paper. Indeed, for such a density, we have on the one hand p′(x)/p(x) = 2/x – x and, on the other hand, $f_h(x) = \frac{1}{x^2\varphi(x)} \int_{-\infty}^{x} (h(u) - F(h))\, u^2 \varphi(u)\,du$, both of which have a singularity at x = 0. Because of this, applying the classical Stein’s method toolkit to the right-hand side of (11) will ultimately lead to trivial upper bounds, and more elaborate methods need to be devised. This will be done in the coming sections.
Before proceeding to the description of our proposal, we stress that the classical version of the “density approach” to Stein’s method that we have just described actually breaks down for any target density p such that p(x0) = 0 at some x0 not on the edges of the support. Indeed, in general, the Stein solution (10) is the product of two terms: one term with a singularity wherever the density has a zero, and a second term that vanishes at the endpoints of the range of the target random variable. This results in the peculiar behavior of singularities inside the range of the random variable when the density has a zero there. Note that for the one-sided Maxwell distribution the solution fh(x) has the same form as for its two-sided counterpart, though it is now only defined for x ⩾ 0; since in this one-sided case when x = 0 the second term vanishes, we would have fh(0) = 0. This term doesn’t vanish (unless h is an even function) for the two-sided Maxwell case, thus giving rise to the singularity at x = 0.
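To make the singularity concrete, consider the illustrative non-even test function h(x) = x, and take the density-approach solution in its standard form f_h(x) = p(x)⁻¹ ∫_{−∞}^x (h(u) − F(h)) p(u) du. For p(x) = x²φ(x) symmetry gives F(h) = 0, and since the derivative of −(u² + 2)φ(u) is u³φ(u), the integral can be evaluated in closed form:

```latex
f_h(x) = \frac{1}{x^2\varphi(x)} \int_{-\infty}^{x} u^3 \varphi(u)\,du
       = \frac{-(x^2+2)\,\varphi(x)}{x^2\varphi(x)}
       = -\Bigl(1 + \frac{2}{x^2}\Bigr), \qquad x \neq 0 .
```

So even for this Lipschitz test function the solution diverges like −2/x² at the origin. Up to sign, this is the function 1 + 2/x² that reappears as the Stein kernel of the Maxwell density in Section 4.4.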
4.2. Coupling based Stein’s method for densities of the form (8)
Let X be a random variable with probability density function p which we assume to be of the form (8). Let W be a symmetric random variable whose distribution we want to compare with that of X. First, we introduce the random variable W* proposed in Definition 3.1 and write
| (12) |
for f = fh solutions to the differential equation
| (13) |
Taking suprema over all , we deduce
| (14) |
for all such that the solutions to (13) are well-defined. Expression (14) provides an alternative to (11) which we will now prove to be useful to our purpose.
At this stage the next typical “Stein-method” step is to write W* = W + (W* – W) and Taylor expand the integrand in (14) around W to deduce a bound on expressed in terms of the difference between W* and W. Unfortunately, for similar reasons as those described in Section 4.1, the solutions to (13) also have singularities which make this intuition unexploitable directly. We propose to bypass this difficulty by introducing intermediate functions τX and g – to be defined later on in the text – for which
| (15) |
Bounding integral probability metrics between and then boils down to finding bounds on the four terms provided in (15). Obviously this will only lead to reasonable results if the intermediate functions τX and g are chosen wisely.
4.3. The Stein kernel equation for densities of the form (8)
We start by introducing the integral operator
| (16) |
with Φ(h) = ∫ h dΦ and Φ the standard Gaussian cumulative distribution function. (The notation is taken from Ley, Reinert and Swan (2017).) We also introduce the function
which is called the “Stein kernel” of X (or, equivalently, of p) – again we refer to Ley, Reinert and Swan (2017) for intuition and first properties.
Remark 4.1. Stein kernels were introduced in Stein (1986) and Cacoullos and Papathanasiou (1989), and have proven to be of great use in Gaussian analysis; see, e.g., Nourdin and Peccati (2009) and Chatterjee (2009). Their importance in the abstract approach to Stein’s method has been investigated in Döbler (2015), where it is shown that they have a regularizing effect on the solutions to general Stein equations.
Lemma 4.2. Let x ↦ b(x) be a nonnegative even function with support a subset of (−∞, ∞) and such that limx→±∞ b(x)φ(x) = 0. Suppose furthermore that b is absolutely continuous and integrable w.r.t. φ with integral ∫ b(x)φ(x) dx = 1. Let X be a random variable with density x ↦ b(x)φ(x). Then
| (17) |
under the convention that the ratio is set to zero at all points x such that b(x) = 0 and . Let be a Borel function such that E∣h(X)∣ < ∞, and set Then
| (18) |
is the unique solution g of the ODE
| (19) |
which satisfies the asymptotic property limx→±∞ τ(x)φ(x)b(x)g(x) = 0.
Proof. Integrating by parts in the definition of the Stein kernel for p = bφ we get (assuming that limx→±∞ b(x)φ(x) = 0)
so that (17) follows by definition (16) of the inverse Stein operator. For the second claim we follow (Nourdin and Peccati, 2012, Proposition 3.2.2) and note how
so that any solution to (19) has the form
| (20) |
where . By dominated convergence, one infers that
so that the first summand in (20) has the announced form (18) and the asymptotic property is satisfied if and only if d = 0. □
Our next result provides the connection between the Stein equations (13) and (19).
Lemma 4.3. Suppose that b only has isolated zeros. Let all notations be as above and introduce the function g = gf defined at all x such that b(x) > 0 through
Then
| (21) |
Proof. Since
and
at all x for which b(x) ≠ 0, we deduce that f and g are mutually defined by f = (bτX)g. This in turn gives
| (50) |
which, combined with ψ(x) = x(τX(x) – 1) (that is easily derived using the various definitions involved), leads to the useful identity
| (22) |
from which (21) is directly derived. □
Combining identities (12) and (21) we get (15), as promised. As already mentioned in the introduction, the price to pay for circumventing the singularities is the necessity to bound several additional quantities concerning the couplings we obtain. The explicit nature of the recursion described in Section 2 nevertheless allows us to compute the resulting quantities satisfactorily, leading to the bounds claimed in Theorem 3.6 and Corollary 3.7. This we perform in the Maxwell case in the next sections.
4.4. Approximating the two-sided Maxwell distribution
Theorem 4.4. Let p(x) = x²φ(x), and let f be a solution to the Stein equation
| (23) |
where is a function having bounded first derivative and zero-mean under p. Set . Then for any coupling of W and W* on a joint probability space such that W* has the x2-generalized zero biased distribution for W,
| (24) |
with
| (25) |
Proof. With b(x) = x² we have τX(x) = 1 + 2/x² and ψ(x) = 2/x, so that (21) becomes
The first two terms are dealt with easily to get
For the last two terms we introduce the function
to get on the one hand
so that
and, on the other hand
so that
Combining these different estimates we obtain (24), with λ1, λ2, λ3 and λ4 expressed in terms of ∥χ∥, ∥χ′∥, ∥g∥ and ∥g′∥ as follows:
The inequalities in (25) are proved in Proposition 4.5 below. □
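The two closed forms used at the start of the proof can be sanity-checked numerically. The sketch below assumes the usual definition of the Stein kernel of a zero-mean density, τ(x) = p(x)⁻¹ ∫_x^∞ u p(u) du, and the identity ψ(x) = x(τX(x) − 1) from the proof of Lemma 4.3. The function names, the truncation point and the midpoint-rule grid are illustrative choices.

```python
import math

import numpy as np


def maxwell_density(x):
    """Two-sided Maxwell density p(x) = x^2 * phi(x)."""
    return x**2 * np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)


def stein_kernel_numeric(x, upper=12.0, cells=200000):
    """Midpoint-rule approximation of (1/p(x)) * integral_x^upper u p(u) du."""
    edges = np.linspace(x, upper, cells + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    width = (upper - x) / cells
    integral = np.sum(mids * maxwell_density(mids)) * width
    return float(integral / maxwell_density(x))


points = [0.5, 1.0, 2.0, 3.0]
tau_numeric = [stein_kernel_numeric(x) for x in points]
tau_closed = [1.0 + 2.0 / x**2 for x in points]
psi_closed = [x * (t - 1.0) for x, t in zip(points, tau_closed)]  # equals 2 / x
```

The numerical kernel agrees with 1 + 2/x² at each test point, and both τX and ψ blow up as x approaches 0, which is why the bounds are expressed through g and χ rather than through the Stein solution itself.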
The next step is to bound ∥χ∥, ∥χ′∥, ∥g∥ and ∥g′∥ in a nontrivial way; this we achieve in the next proposition.
Proposition 4.5. Let h be absolutely continuous and integrable with respect to p(x) = x²φ(x). Set c = ∥h′∥, which we suppose to be finite. Let X ~ p, define
| (26) |
and set
| (27) |
Then
Remark 4.6. The function g0 defined in (26) satisfies
| (28) |
with Z ~ φ a standard Gaussian random variable.
Remark 4.7. The function g defined in (27) satisfies
with X ~ p and τ(x) = 1 + 2/x².
Proof. In order to simplify future notations we introduce , , and . Using the identity
| (29) |
We deduce that and and thus
| (30) |
The proof is now broken down into several steps.
Step 1: rewrite the solutions. Following (Chen et al., 2010, page 39) we rewrite the test functions in terms of their derivatives (still with Z a standard normal random variable)
Changing the order of integration and then using (29), the right-hand side becomes
and thus
| (31) |
We deduce the following useful bound
| (32) |
Plugging (31) in (26) leads to (we restrict the discussion to x > 0, the other case following by symmetry)
To deal with the quantities II(x) and III(x) we again interchange integrations to get
and
and thus if x ≥ 0 we have
| (33) |
By a similar argument we deduce that if x < 0 then
| (34) |
Step 2: a bound on ∥g∥. Supposing ∥h′∥ ≤ c we can use (33) and the first claim in (30) to deduce that for x ≥ 0:
The last two terms decrease strictly to 0 as x → ∞, with maximum value c/2 and c(1 + π/2), respectively. The first term is equal to and the second one is equal to
Similar (symmetric) bounds hold for x ≤ 0 and thus, collecting all these estimates, we may conclude:
| (35) |
Step 3: a bound on ∥g′∥. Here we start by rewriting the derivative as
| (36) |
Using (35), the second summand is easily seen to be uniformly bounded (by 3c). We are left with the first summand for which we start by rewriting the numerator, for x ≥ 0, using (33):
which leads to
| (37) |
Now we can use the fact that for all x ≥ 0 as well as all the arguments outlined at the previous step to deduce the bound: whence
| (38) |
Similar (symmetric) arguments hold also for negative x and thus ∣g′(x)∣ ≤ 4c.
Step 4: a bound on χ(x) = g′(x)/x. Using (36) we know that
| (39) |
The second summand in (39) is bounded using (34) to get
| (40) |
For the first summand we use (37) to deduce
At this stage it is useful to remark that, for x ≥ 0, the function is strictly decreasing with maximal value and hence and thus which, combined with (40), leads (after applying the symmetric arguments for x ≤ 0) to ∣χ(x)∣ ≤ 6c.
Step 5: a bound on ∥χ′∥. Direct computations using (28)
and thus
Using the bounds |2x³/(x² + 2)²| ≤ 1, |1 – 2/(x² + 2)| ≤ 1, |(2 – x²)/(x² + 2)| ≤ 1 and |x/(x² + 2)| ≤ 1 as well as (35), (38) and (32) we conclude (after applying the symmetric arguments for x ≤ 0) |χ′(x)| ≤ 7c. □
4.5. Verifying bounds on expectations
In this section we find bounds on the expectations in Theorem 3.6 in order to prove Corollary 3.7. We will make use of the following lemma.
Lemma 4.8. If x1, … , xN is the unique strictly decreasing zero-mean solution of (4), then .
Proof. To simplify the notation, note that it suffices to consider the rescaled recursion , where Sn is defined in the proof of Lemma 2.1. By expressing as a telescoping sum,
where we have used Euler’s approximation to the harmonic sum for the last inequality. By the variance property (P2) (in this rescaled case we have that x1 is bounded away from zero (as a sequence indexed by N) and xm is bounded, so is bounded. Dividing the above display by x1, we then obtain . □
Proof of Corollary 3.7. From Proposition 3.5 and the recursion (4), note that p*(x) puts mass 1/(N – 1) on each interval between successive xn, so it is easy to create a coupling of with W* ~ p*(x) such that
when W ∈ [xn+1, xn]. For a detailed proof of such a coupling, see the construction given in McKeague and Levin (2016). From (P3) (see Lemma 2.1) and Lemma 4.8 we then have
| (41) |
Second, using it follows immediately that
Third, the zero-median property gives
where m = N/2 + 1, so . By symmetry
From Proposition 3.5 note that p*(x) ∝ x² for x ∈ (xm+1, xm]. Also using the fact that p*(x) puts mass 1/(N – 1) on this interval, the first term above can be written
The second term is bounded above by the telescoping sum
so we have
Fourth,
using and (41). The corollary now follows from Theorem 3.6. □
Acknowledgements
The research of Ian McKeague was partially supported by NSF Grant DMS-1307838 and NIH Grant 2R01GM095722-05. The research of Yvik Swan was partially supported by the Fonds de la Recherche Scientifique - FNRS under Grant no F.4539.16. We also thank the Institute for Mathematical Sciences at National University of Singapore for support during the Workshop on New Directions in Stein’s Method (May 18–29, 2015) where work on the paper was initiated.
References
- Bohm D (1952). A suggested interpretation of the quantum theory in terms of “hidden” variables. I. Phys. Rev. 85, 166–179.
- Cacoullos T and Papathanasiou V (1989). Characterizations of distributions by variance bounds. Statist. Probab. Lett. 7, 351–356.
- Chatterjee S (2009). Fluctuations of eigenvalues and second order Poincaré inequalities. Probab. Theory Related Fields 143, 1–40.
- Chatterjee S and Shao Q-M (2011). Non-normal approximation by Stein’s method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Probab. 21, 464–483.
- Chen L, Goldstein L and Shao Q-M (2010). Normal Approximation by Stein’s Method. Springer Verlag.
- Döbler C (2015). Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Probab. 20, 1–34.
- Goldstein L and Reinert G (1997). Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 7, 935–952.
- Hall MJW, Deckert DA and Wiseman HM (2014). Quantum phenomena modeled by interactions between many classical worlds. Phys. Rev. X 4, 041013.
- Ley C, Reinert G and Swan Y (2017). Stein’s method for comparison of univariate distributions. Probab. Surv. 14, 1–52.
- McKeague IW and Levin B (2016). Convergence of empirical distributions in an interpretation of quantum mechanics. Ann. Appl. Probab. 26, 2540–2555.
- Nourdin I and Peccati G (2009). Stein’s method on Wiener chaos. Probab. Theory Related Fields 145, 75–118.
- Nourdin I and Peccati G (2012). Normal approximations with Malliavin calculus: from Stein’s method to universality. Vol. 192. Cambridge University Press.
- Peköz E and Röllin A (2011). New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 39, 587–608.
- Peköz E, Röllin A and Ross N (2013). Degree asymptotics with rates for preferential attachment random graphs. Ann. Appl. Probab. 23, 1188–1218.
- Peköz E, Röllin A and Ross N (2016). Generalized gamma approximation with rates for urns, walks and trees. Ann. Probab. 44, 1776–1816.
- Ross N (2011). Fundamentals of Stein’s method. Probab. Surv. 8, 210–293.
- Stein C (1986). Approximate Computation of Expectations. Institute of Mathematical Statistics Lecture Notes–Monograph Series, 7. Institute of Mathematical Statistics, Hayward, CA.

