Entropy. 2019 Jan 18;21(1):89. doi: 10.3390/e21010089

Poincaré and Log–Sobolev Inequalities for Mixtures

André Schlichting 1
PMCID: PMC7514199  PMID: 33266805

Abstract

This work studies mixtures of probability measures on ℝⁿ and gives bounds on the Poincaré and the log–Sobolev constants of two-component mixtures, provided that each component satisfies the functional inequality and both components are close in the χ²-distance. The estimation of these constants for a mixture can be far more subtle than it is for its components. Even mixing Gaussian measures may produce a measure whose Hamiltonian potential possesses multiple wells, leading to metastability and large constants in Sobolev-type inequalities. In particular, the Poincaré constant stays bounded in the mixture parameter, whereas the log–Sobolev constant may blow up as the mixture ratio goes to 0 or 1. This observation generalizes the one by Chafaï and Malrieu to the multidimensional case. For a class of examples, this blow-up is shown to be not a mere artifact of the method.

Keywords: Poincaré inequality, log–Sobolev inequality, relative entropy, Fisher information, Dirichlet form, mixture, finite Gaussian mixtures

1. Introduction

A mixture of two probability measures μ_0 and μ_1 on ℝⁿ is, for a parameter p ∈ [0,1], the probability measure μ_p defined by

μ_p := p μ_0 + (1 − p) μ_1.  (1)

Hereby, both measures μ_0 and μ_1 are assumed to be absolutely continuous with respect to the Lebesgue measure, and their supports are nested, i.e., supp μ_0 ⊆ supp μ_1 or supp μ_1 ⊆ supp μ_0. Under these assumptions, at least one of the measures is absolutely continuous with respect to the other one,

μ_0 ≪ μ_1  or  μ_1 ≪ μ_0,

which implies that at least one of the measures has a density with respect to the other one

dμ_0 = (dμ_0/dμ_1) dμ_1  or  dμ_1 = (dμ_1/dμ_0) dμ_0.

This work establishes simple criteria under which a mixture of measures satisfies a Poincaré inequality PI(ϱ) or a log–Sobolev inequality LSI(α) with constant ϱ or α, respectively, provided that each of the components satisfies one.

Definition 1

(PI(ϱ) and LSI(α)). A probability measure μ on ℝⁿ satisfies the Poincaré inequality with constant ϱ > 0, if for all functions f : ℝⁿ → ℝ,

Var_μ[f] := ∫ |f − ∫ f dμ|² dμ ≤ (1/ϱ) ∫ |∇f|² dμ.  PI(ϱ)

A probability measure μ satisfies the log–Sobolev inequality with constant α > 0, if for all functions f : ℝⁿ → ℝ₊,

Ent_μ[f] := ∫ f log f dμ − ∫ f dμ log(∫ f dμ) ≤ (1/α) ∫ (|∇f|²/(2f)) dμ.  LSI(α)

By the change of variable f ↦ f², the log–Sobolev inequality LSI(α) is equivalent to

Ent_μ[f²] ≤ (2/α) ∫ |∇f|² dμ.  (2)

The question of how the constants ϱ_p and α_p in PI(ϱ_p) and LSI(α_p) for a mixture μ_p depend on the parameter p ∈ [0,1] was first studied by Chafaï and Malrieu [1] for measures on ℝⁿ. The aim is to deduce simple criteria under which the measure μ_p in (1) satisfies PI(ϱ_p) and LSI(α_p), knowing that μ_0 and μ_1 satisfy PI(ϱ_0), PI(ϱ_1) and LSI(α_0), LSI(α_1), respectively. The approach by Chafaï and Malrieu [1] is based on a functional depending on the distribution functions of the measures μ_0 and μ_1, which leads to bounds on the Poincaré and log–Sobolev constants of the mixture in one dimension.

This work generalizes part of the results of Chafaï and Malrieu [1] to the multidimensional case by a simple argument. The estimates on the Poincaré and log–Sobolev constants hold in the case where the χ²-distance of μ_0 and μ_1 is bounded (see (5) for its definition). For this to be true, at least one of the measures μ_0 and μ_1 needs to be absolutely continuous with respect to the other, which is also a necessary condition for the mixture to have connected support. The resulting bound is optimal in the scaling behavior of the mixture parameter p → 0, 1, i.e., a logarithmic blow-up in p for the log–Sobolev constant, whereas the Poincaré constant stays bounded. This different behavior of the Poincaré and log–Sobolev constants was also observed in the setting of metastability in ([2], Remark 2.20).

Let us first introduce the principle for the Poincaré inequality in Section 2 and then for the log–Sobolev inequality in Section 3. Then, the procedure is illustrated on specific examples of mixtures in Section 4.

2. Poincaré Inequality

To keep the presentation concise, the following notation for the mean of a function f : ℝⁿ → ℝ with respect to a measure μ is introduced:

E_μ[f] := ∫ f dμ.

In this way, the variance in PI(ϱ) and relative entropy in LSI(α) become

Var_μ[f] = E_μ[(f − E_μ[f])²] = E_μ[f²] − (E_μ[f])²  and  Ent_μ[f] = E_μ[f log f] − E_μ[f] log(E_μ[f]).

Likewise, the covariance of two functions f, g : ℝⁿ → ℝ is defined by

Cov_μ[f,g] = E_μ[(f − E_μ[f])(g − E_μ[g])] = E_μ[fg] − E_μ[f] E_μ[g].

The Cauchy–Schwarz inequality for the covariance now takes the form

Cov_μ[f,g] ≤ √(Var_μ[f]) √(Var_μ[g]).

The argument is based on an easy but powerful observation for measures μ_0 and μ_1 with equal support.

Lemma 1

(Mean-difference as covariance). If supp μ_0 = supp μ_1, then for any ϑ ∈ [0,1] and any function f : ℝⁿ → ℝ,

E_μ0[f] − E_μ1[f] = −ϑ Cov_μ0[f, dμ_1/dμ_0] + (1−ϑ) Cov_μ1[f, dμ_0/dμ_1].  (3)

Proof. 

The change of measure formula yields that the covariances above are just differences of expectations:

Cov_μ0[f, dμ_1/dμ_0] = E_μ0[f (dμ_1/dμ_0)] − E_μ0[f] E_μ0[dμ_1/dμ_0] = E_μ1[f] − E_μ0[f],

and likewise Cov_μ1[f, dμ_0/dμ_1] = E_μ0[f] − E_μ1[f]. □

The subsequent strategy is based on the identity (3) by using a Cauchy–Schwarz inequality to arrive at the product of two variances. Then, PI(ϱ0) or PI(ϱ1) can be applied and the parameter ϑ leaves freedom to optimize the resulting expression. This allows for proving the following theorem, which is the generalization of ([1], Theorem 4.4) to the multidimensional case for the Poincaré inequality provided μ0 and μ1 are absolutely continuous to each other.

Theorem 1 (PI for absolutely continuous mixtures).

Let μ_0 and μ_1 satisfy PI(ϱ_0) and PI(ϱ_1), respectively, and let both measures be absolutely continuous with respect to each other. Then, for all p ∈ (0,1) and q = 1−p, the mixture measure μ_p = p μ_0 + q μ_1 satisfies PI(ϱ_p) with

1/ϱ_p ≤ 1/ϱ_0,  if ϱ_1 ≥ ϱ_0 (1 + pχ_1),
1/ϱ_p ≤ 1/ϱ_1,  if ϱ_0 ≥ ϱ_1 (1 + qχ_0),
1/ϱ_p ≤ (pχ_1 + pqχ_0χ_1 + qχ_0)/(ϱ_0 pχ_1 + ϱ_1 qχ_0),  otherwise,  (4)

where

χ_0 := Var_μ0[dμ_1/dμ_0]  and  χ_1 := Var_μ1[dμ_0/dμ_1].  (5)
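For concreteness, the case distinction (4) can be evaluated numerically. The following sketch (the function name `poincare_bound` is ours, not from the paper) returns the upper bound on 1/ϱ_p given the constants of the two components:

```python
def poincare_bound(rho0, rho1, chi0, chi1, p):
    """Upper bound on 1/rho_p for mu_p = p*mu_0 + q*mu_1 from Theorem 1, Eq. (4).

    rho0, rho1 -- Poincare constants of the components,
    chi0, chi1 -- chi^2-distances from Eq. (5).
    """
    q = 1.0 - p
    if rho1 >= rho0 * (1.0 + p * chi1):   # first case: mu_1 dominates
        return 1.0 / rho0
    if rho0 >= rho1 * (1.0 + q * chi0):   # second case: mu_0 dominates
        return 1.0 / rho1
    # interpolation case
    return (p * chi1 + p * q * chi0 * chi1 + q * chi0) / (rho0 * p * chi1 + rho1 * q * chi0)
```

For χ_0 = χ_1 = χ, the interpolation case reduces to (1 + pqχ)/(pϱ_0 + qϱ_1), in accordance with Remark 2 below.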

Proof. 

The variance of f with respect to μ_p is decomposed as

Var_μp[f] = p Var_μ0[f] + q Var_μ1[f] + pq (E_μ0[f] − E_μ1[f])².

Hereby, the first two terms are just the expectations of the conditional variances, and the third term is the variance with respect to a Bernoulli measure. Now, the mean-difference is rewritten by Lemma 1, and the square is estimated with the Young inequality, which introduces an additional parameter η > 0:

(a + b)² ≤ (1 + η) a² + (1 + η⁻¹) b².

Then, the Cauchy–Schwarz inequality is applied to the covariances to obtain

Var_μp[f] ≤ p Var_μ0[f] + q Var_μ1[f] + pq( (1+η) ϑ² Cov²_μ0[f, dμ_1/dμ_0] + (1+η⁻¹)(1−ϑ)² Cov²_μ1[f, dμ_0/dμ_1] )
 ≤ (1 + (1+η) ϑ² qχ_0) p Var_μ0[f] + (1 + (1+η⁻¹)(1−ϑ)² pχ_1) q Var_μ1[f]
 ≤ ((1 + (1+η) ϑ² qχ_0)/ϱ_0) ∫ |∇f|² p dμ_0 + ((1 + (1+η⁻¹)(1−ϑ)² pχ_1)/ϱ_1) ∫ |∇f|² q dμ_1
 ≤ max{ (1 + (1+η) ϑ² qχ_0)/ϱ_0 , (1 + (1+η⁻¹)(1−ϑ)² pχ_1)/ϱ_1 } ∫ |∇f|² dμ_p.  (6)

The resulting maximum is now minimized in η > 0 and ϑ ∈ [0,1]. To do so, ϱ_0 ≥ ϱ_1 is assumed without loss of generality; the other case is obtained by interchanging the roles of μ_0 and μ_1. If ϱ_0 > ϱ_1, then ϑ = 1 and η → 0 is optimal as long as

(1 + qχ_0)/ϱ_0 ≤ 1/ϱ_1.

This corresponds to the second case in (4). By symmetry, the first case follows if ϱ_1 ≥ ϱ_0.

Now, in the case ϱ_0 ≥ ϱ_1 and ϱ_0 ≤ ϱ_1 (1 + qχ_0), there exists by monotonicity, for every ϑ ∈ (0,1), a unique η* = η*(ϑ) > 0 such that both terms in the max on the right-hand side of (6) are equal, and hence the max is minimal. Since qχ_0 > 0 and pχ_1 > 0, it remains to minimize the sum of the coefficients in front, which for given η is the function h(ϑ) := (1+η) ϑ² + (1+η⁻¹)(1−ϑ)². The minimization of h over ϑ ∈ (0,1) leads to ϑ* = 1/(1+η), for which

h(ϑ*) = 1/(1+η) + η/(1+η) = 1

holds. Hence, in this case, the parameter s=(1+η*)ϑ*2=11+η*(0,1) and (1+η*1)(1ϑ*)2=η*1+η*=1s. Thus, the problem can be rephrased: Find s*(0,1) which solves

1+sqχ0ϱ0=1+(1s)pχ1ϱ1.

The solution s* is given by

s*=(1+pχ1)ϱ0ϱ1ϱ0pχ1+ϱ1qχ0.

For this value of s*, the value of the max in (6) is given by

(1 + s* qχ_0)/ϱ_0 = (pχ_1 + (ϱ_1/ϱ_0) qχ_0 + (1 + pχ_1) qχ_0 − (ϱ_1/ϱ_0) qχ_0) / (ϱ_0 pχ_1 + ϱ_1 qχ_0) = (pχ_1 + pqχ_0χ_1 + qχ_0) / (ϱ_0 pχ_1 + ϱ_1 qχ_0).

 □

Remark 1.

The constants χ_0 and χ_1 can be rewritten, if μ_0 and μ_1 are mutually absolutely continuous, as

χ_0 = ∫ (dμ_1/dμ_0)² dμ_0 − 1 = ∫ (dμ_1/dμ_0) dμ_1 − 1  and  χ_1 = ∫ (dμ_0/dμ_1)² dμ_1 − 1 = ∫ (dμ_0/dμ_1) dμ_0 − 1.

This quantity is also known as the χ²-distance on the space of probability measures (cf. [3]). The χ²-distance is a rather weak distance and therefore bounds many other probability distances, among them the relative entropy. Indeed, the concavity of the logarithm and the Jensen inequality yield

Ent_μ0[dμ_1/dμ_0] = ∫ log(dμ_1/dμ_0) dμ_1 ≤ log(∫ (dμ_1/dμ_0) dμ_1) = log(1 + χ_0) ≤ χ_0.

Remark 2.

The proof of Theorem 1 shows that the expression for 1/ϱ_p in the last case of (4) can be bounded above and below by

max{1/ϱ_0, 1/ϱ_1} ≤ (pχ_1 + pqχ_0χ_1 + qχ_0)/(ϱ_0 pχ_1 + ϱ_1 qχ_0) ≤ max{(1+qχ_0)/ϱ_0, (1+pχ_1)/ϱ_1}.  (7)

In the case where χ_0 = χ_1 = χ, the formula (4) for ϱ_p simplifies to

1/ϱ_p ≤ (1 + pqχ)/(p ϱ_0 + q ϱ_1).  (8)

Corollary 1.

Let μ_0 ≪ μ_1 and let μ_0, μ_1 satisfy PI(ϱ_0), PI(ϱ_1), respectively. Then, for all p ∈ [0,1] with q = 1−p, the mixture measure μ_p = p μ_0 + q μ_1 satisfies PI(ϱ_p) with

1/ϱ_p ≤ max{ 1/ϱ_0 , (1 + pχ_1)/ϱ_1 }.

Likewise, if μ_1 ≪ μ_0, then it holds that

1/ϱ_p ≤ max{ 1/ϱ_1 , (1 + qχ_0)/ϱ_0 }.

Proof. 

The proof is a simple consequence of Lemma 1 with ϑ=0 and a similar line of estimates as in (6). □

3. Log–Sobolev Inequality

In this section, a criterion for LSI(α) is established. It will be convenient to work with the form (2). For a function g : ℝⁿ → ℝ₊ and two probability measures μ_0 and μ_1, the averaged function ḡ : {0,1} → ℝ₊ is defined by

ḡ(0) := E_μ0[g]  and  ḡ(1) := E_μ1[g].

Moreover, the mixture of two Dirac measures δ_0 and δ_1 is, by slight abuse of notation, denoted by δ_p := p δ_0 + q δ_1 for p ∈ [0,1] and q = 1−p. Then, the entropy of the mixture μ_p = p μ_0 + q μ_1 is given by

Ent_μp[f²] = p Ent_μ0[f²] + q Ent_μ1[f²] + Ent_δp[f̄²].  (9)

The following discrete log–Sobolev inequality for a Bernoulli random variable is used to estimate the entropy of the averaged function f̄². The optimal log–Sobolev constant was found independently by Higuchi and Yoshida [4] and by Diaconis and Saloff-Coste ([5], Theorem A.2).

Lemma 2 (Optimal log–Sobolev inequality for Bernoulli measures).

A Bernoulli measure on {0,1}, i.e., a mixture of two Dirac measures δ_p = p δ_0 + q δ_1 with p ∈ [0,1] and q = 1−p, satisfies the discrete log–Sobolev inequality

Ent_δp[g] ≤ (pq/Λ(p,q)) (√g(0) − √g(1))²  for all g : {0,1} → ℝ₊,

where Λ : ℝ₊ × ℝ₊ → ℝ₊ is the logarithmic mean defined by

Λ(p,q) := (p − q)/(log p − log q)  for p ≠ q,  and  Λ(p,p) := lim_{q→p} Λ(p,q) = p.
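Lemma 2 is easy to probe numerically. The sketch below (the helper names `log_mean` and `bernoulli_entropy` are ours) computes the logarithmic mean and the Bernoulli entropy and checks the inequality on a small grid of parameters:

```python
import math

def log_mean(p, q):
    """Logarithmic mean Lambda(p, q) = (p - q)/(log p - log q), with Lambda(p, p) = p."""
    if p == q:
        return p
    return (p - q) / (math.log(p) - math.log(q))

def bernoulli_entropy(p, g0, g1):
    """Ent_{delta_p}[g] = E[g log g] - E[g] log E[g] for g on {0, 1}."""
    q = 1.0 - p
    mean = p * g0 + q * g1
    return p * g0 * math.log(g0) + q * g1 * math.log(g1) - mean * math.log(mean)

# Check Ent_{delta_p}[g] <= pq/Lambda(p,q) * (sqrt(g(0)) - sqrt(g(1)))^2 on a grid.
holds = all(
    bernoulli_entropy(p, g0, g1)
    <= p * (1 - p) / log_mean(p, 1 - p) * (math.sqrt(g0) - math.sqrt(g1)) ** 2 + 1e-12
    for p in (0.05, 0.3, 0.5, 0.9)
    for g0 in (0.2, 1.0, 3.0)
    for g1 in (0.5, 2.0, 7.0)
)
```

Since the constant in Lemma 2 is optimal, the two sides come out close on parts of the grid (e.g., for p = 1/2).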

The above result allows for estimating the coarse-grained entropy in (9).

Lemma 3 (Estimate of the coarse-grained entropy).

Let f̄² : {0,1} → ℝ₊ be given by f̄²(i) := E_μi[f²] for i ∈ {0,1}. Then, for all p ∈ [0,1] and q = 1−p,

Ent_δp[f̄²] ≤ (pq/Λ(p,q)) ( Var_μ0[f] + Var_μ1[f] + (E_μ0[f] − E_μ1[f])² )  (10)

holds.

Proof. 

Lemma 2 applied to Ent_δp[f̄²] yields

Ent_δp[f̄²] ≤ (pq/Λ(p,q)) (√(f̄²(0)) − √(f̄²(1)))².  (11)

The square-root mean-difference on the right-hand side of (11) can be estimated by using the fact that the function (a,b) ↦ (√a − √b)² is jointly convex on ℝ₊ × ℝ₊. Indeed, by introducing the functions f_0, f_1 : ℝⁿ × ℝⁿ → ℝ₊ defined by f_0(x,y) = f(x) and f_1(x,y) = f(y), an application of the Jensen inequality yields the estimate

(√(E_μ0[f²]) − √(E_μ1[f²]))² = (√(E_{μ0×μ1}[f_0²]) − √(E_{μ0×μ1}[f_1²]))² ≤ E_{μ0×μ1}[(f_0 − f_1)²] = E_μ0[f²] − 2 E_μ0[f] E_μ1[f] + E_μ1[f²] = Var_μ0[f] + Var_μ1[f] + (E_μ0[f] − E_μ1[f])².  (12)

Now, a combination of (11) and (12) gives (10). □

The decomposition (9) together with (10) yields that a mixture μ_p = p μ_0 + q μ_1 with p ∈ [0,1] and q = 1−p satisfies

Ent_μp[f²] ≤ p Ent_μ0[f²] + q Ent_μ1[f²] + (pq/Λ(p,q)) ( Var_μ0[f] + Var_μ1[f] + (E_μ0[f] − E_μ1[f])² ).  (13)

The right-hand side of (13) consists of quantities which can be estimated under the assumption that μ_0 and μ_1 satisfy LSI(α_0) and LSI(α_1). The following theorem provides an extension of the result ([1], Theorem 4.4) to the multidimensional case for the log–Sobolev inequality.

Theorem 2 (LSI for absolutely continuous mixtures).

Let μ_0 and μ_1 satisfy LSI(α_0) and LSI(α_1), respectively, and let both measures be absolutely continuous with respect to each other. Then, for all p ∈ (0,1) and q = 1−p, the mixture measure μ_p = p μ_0 + q μ_1 satisfies LSI(α_p) with

1/α_p ≤ (1 + qλ_p)/α_0,  if α_1 ≥ α_0 (1 + pλ_p(1+χ_1))/(1 + qλ_p),
1/α_p ≤ (1 + pλ_p)/α_1,  if α_0 ≥ α_1 (1 + qλ_p(1+χ_0))/(1 + pλ_p),
1/α_p ≤ (p(1+qλ_p)χ_1 + pqλ_pχ_0χ_1 + q(1+pλ_p)χ_0)/(α_0 pχ_1 + α_1 qχ_0),  otherwise.  (14)

Hereby, χ_0 and χ_1 are given in (5), and λ_p denotes the inverse logarithmic mean

λ_p := 1/Λ(p,q) = (log p − log q)/(p − q)  for p ≠ 1/2,  and  λ_{1/2} = 2.
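As for Theorem 1, the case distinction (14) can be turned into a small numerical routine; a sketch under the same conventions as before (function names ours):

```python
import math

def inv_log_mean(p):
    """Inverse logarithmic mean lambda_p = (log p - log q)/(p - q), with lambda_{1/2} = 2."""
    q = 1.0 - p
    if p == q:
        return 2.0
    return (math.log(p) - math.log(q)) / (p - q)

def lsi_bound(alpha0, alpha1, chi0, chi1, p):
    """Upper bound on 1/alpha_p for mu_p = p*mu_0 + q*mu_1 from Theorem 2, Eq. (14)."""
    q = 1.0 - p
    lam = inv_log_mean(p)
    if alpha1 >= alpha0 * (1 + p * lam * (1 + chi1)) / (1 + q * lam):
        return (1 + q * lam) / alpha0
    if alpha0 >= alpha1 * (1 + q * lam * (1 + chi0)) / (1 + p * lam):
        return (1 + p * lam) / alpha1
    num = p * (1 + q * lam) * chi1 + p * q * lam * chi0 * chi1 + q * (1 + p * lam) * chi0
    return num / (alpha0 * p * chi1 + alpha1 * q * chi0)
```

In the symmetric situation α_0 = α_1 = 1 and χ_0 = χ_1 = 1, the bound at p = 1/2 evaluates to 5/2, while for p → 0 it grows like |log p|, illustrating the logarithmic blow-up discussed below.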

Proof. 

The starting point is the splitting (13). The variances and the mean-difference in (13) can be estimated in the same way as in the estimate (6) in the proof of Theorem 1, using the form (2) of LSI(α_i) for the entropies. Additionally, the fact [6] that LSI(α) implies PI(α) is used to derive, for any η > 0 and any ϑ ∈ (0,1),

Ent_μp[f²] ≤ (1/α_0)(2 + qλ_p(1 + (1+η)ϑ² χ_0)) ∫ |∇f|² p dμ_0 + (1/α_1)(2 + pλ_p(1 + (1+η⁻¹)(1−ϑ)² χ_1)) ∫ |∇f|² q dμ_1
 ≤ 2 max{ (1 + qλ_p(1 + (1+η)ϑ² χ_0))/α_0 , (1 + pλ_p(1 + (1+η⁻¹)(1−ϑ)² χ_1))/α_1 } ∫ |∇f|² dμ_p.  (15)

By introducing reduced log–Sobolev constants

α̃_0 := α_0/(1 + qλ_p)  and  α̃_1 := α_1/(1 + pλ_p),  (16)

as well as defining the constants χ˜0 and χ˜1 by

χ̃_0 := qχ_0 λ_p/(1 + qλ_p)  and  χ̃_1 := pχ_1 λ_p/(1 + pλ_p),  (17)

the bound (15) takes the form

Ent_μp[f²] ≤ 2 max{ (1 + (1+η)ϑ² χ̃_0)/α̃_0 , (1 + (1+η⁻¹)(1−ϑ)² χ̃_1)/α̃_1 } ∫ |∇f|² dμ_p.  (18)

The estimate (18) has the same structure as the estimate (6), where α̃_0, α̃_1 play the roles of ϱ_0, ϱ_1, and χ̃_0, χ̃_1 the roles of qχ_0, pχ_1. Hence, the optimization procedure from the proof of Theorem 1 applies to this case; by the equivalent formulation (2) of the log–Sobolev inequality, the last step consists of translating the constants α̃_0, α̃_1 and χ̃_0, χ̃_1 back into the original ones. □

Remark 3.

Let the bound for 1/α_p in the last case of (14) be denoted by 1/A_p. Then, the proof shows that it can be bounded above and below, in the same way as in (7), in terms of the reduced constants (16) and (17):

max{ (1+qλ_p)/α_0 , (1+pλ_p)/α_1 } ≤ 1/A_p ≤ max{ (1+qλ_p(1+χ_0))/α_0 , (1+pλ_p(1+χ_1))/α_1 }.

In the case χ_0 = χ_1 = χ, the simplified bound

1/α_p ≤ (1 + λ_p + pqλ_pχ)/(p α_0 + q α_1)  (19)

holds. The inverse logarithmic mean λ_p = 1/Λ(p,q) blows up logarithmically for p → 0, 1. Hence, even in the case χ = 0, the bound (19) diverges logarithmically. This logarithmic divergence looks at first sight artificial, especially in comparison with (8), which shows that the Poincaré constant stays bounded. However, the examples in the next section show that this blow-up may actually occur. Hence, the bound in (14) is optimal on this level of generality.

A statement analogous to Corollary 1 for the Poincaré constant is obtained for the log–Sobolev constant; its proof follows along the same lines and is omitted.

Corollary 2.

Let μ_0 ≪ μ_1 and let μ_0, μ_1 satisfy LSI(α_0) and LSI(α_1), respectively. Then, for any p ∈ (0,1) and q = 1−p, the mixture measure μ_p = p μ_0 + q μ_1 satisfies LSI(α_p) with

1/α_p ≤ max{ (1 + qλ_p)/α_0 , (1 + pλ_p(1+χ_1))/α_1 }.

Likewise, if μ_1 ≪ μ_0, then

1/α_p ≤ max{ (1 + pλ_p)/α_1 , (1 + qλ_p(1+χ_0))/α_0 }

holds.

4. Examples

The results of Theorems 1 and 2 are illustrated for some specific examples and are also compared with the results of ([1], Section 4.5), which, however, are restricted to one-dimensional measures. Although the criterion of Theorems 1 and 2 can only give upper bounds in the multidimensional case when at least one of the mixture components is absolutely continuous with respect to the other, it is still possible to obtain the optimal results in terms of scaling in the mixture parameter p → 0, 1.

4.1. Mixture of Two Gaussian Measures with Equal Covariance Matrix

Let us consider the mixture of two Gaussians μ_0 := N(0,Σ) and μ_1 := N(y,Σ), for some y ∈ ℝⁿ and a strictly positive definite covariance matrix Σ ⪯ σ Id in the sense of quadratic forms for some σ > 0. Then, μ_0 and μ_1 satisfy PI(σ⁻¹) and LSI(σ⁻¹) by the Bakry–Émery criterion (Theorem A1), i.e., ϱ_0 = α_0 = ϱ_1 = α_1 = σ⁻¹. Furthermore, the χ²-distance between μ_0 and μ_1 can be explicitly calculated as a Gaussian integral (see also [7]):

χ_0 = χ_1 = (2π)^{−n/2} (det Σ)^{−1/2} ∫ exp( −x·Σ⁻¹x + ½(x−y)·Σ⁻¹(x−y) ) dx − 1 = exp(y·Σ⁻¹y) (2π)^{−n/2} (det Σ)^{−1/2} ∫ exp( −½(x+y)·Σ⁻¹(x+y) ) dx − 1 ≤ e^{|y|²/σ} − 1.

Then, the bound from Theorem 1 in the form (8) yields

1/ϱ_p ≤ (1 + pq(e^{|y|²/σ} − 1)) σ.  (20)

Likewise, the interpolation bound of Theorem 2 (the third case of (14)) leads to the log–Sobolev bound

1/α_p ≤ (1 + pqλ_p(e^{|y|²/σ} + 1)) σ.

By noting that pq ≤ 1/4 and pqλ_p ≤ 1/2, both constants stay uniformly bounded in p. The large exponential factor e^{|y|²/σ} in the distance cannot be avoided on this level of generality, since the mixture measure μ_p has a bimodal structure leading to metastable effects ([2], Remark 2.20).

The result ([1], Corollary 4.7) deduced the following bound on 1/ϱ_p for the mixture of two one-dimensional standard Gaussians, i.e., σ = 1 in (20):

1/ϱ_p ≤ 1 + pq|y|² ( Φ(|y|) e^{|y|²} + (|y|/√(2π)) e^{|y|²/2} + 1/2 ),  (21)

where Φ(a) := (1/√(2π)) ∫_{−∞}^a e^{−s²/2} ds denotes the standard Gaussian distribution function. The elementary inequalities e^{a²} − 1 ≤ a² e^{a²} and 1 − Φ(a) ≤ ½ e^{−a²/2} for a ≥ 0 show that the bound (20) is better than the bound (21) for all parameter values p ∈ [0,1] and |y| ≥ 0.

Hence, this example shows that, for mixtures with components that are absolutely continuous with respect to each other and whose tail behavior is controlled in terms of the χ²-distance, Theorems 1 and 2 even improve the bound of [1] and generalize it to the multidimensional case.
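The identity χ_0 = χ_1 = e^{y·Σ⁻¹y} − 1 can be checked by direct numerical integration in one dimension (σ = 1); a self-contained sketch with a midpoint rule (the discretization parameters are arbitrary choices of ours):

```python
import math

def chi2_gaussian_shift(y, num=200_000, cutoff=30.0):
    """chi^2-distance between mu_0 = N(0,1) and mu_1 = N(y,1), computed as
    int (dmu_1/dmu_0)^2 dmu_0 - 1 with a midpoint rule; closed form: exp(y^2) - 1."""
    h = 2.0 * cutoff / num
    total = 0.0
    for i in range(num):
        x = -cutoff + (i + 0.5) * h
        ratio = math.exp(0.5 * x * x - 0.5 * (x - y) ** 2)        # dmu_1/dmu_0
        dmu0 = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # density of mu_0
        total += ratio ** 2 * dmu0 * h
    return total - 1.0
```

For y = 1 this is numerically indistinguishable from e − 1, and the Poincaré bound (20) then reads 1/ϱ_p ≤ 1 + pq(e − 1).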

4.2. Mixture of a Gaussian and Sub-Gaussian Measure

Let us consider μ_1 = N(0,Σ), where Σ ⪯ σ Id is strictly positive definite. In addition, let the density of μ_0 with respect to μ_1 be bounded uniformly by some κ ≥ 1, that is, the relative density satisfies dμ_0/dμ_1 ≤ κ almost everywhere on ℝⁿ. By the Bakry–Émery criterion (Theorem A1), ϱ_1 = α_1 = 1/σ holds. Furthermore, an upper bound for χ_1 is obtained from the bound on the relative density:

χ_1 = Var_μ1[dμ_0/dμ_1] = ∫ (dμ_0/dμ_1)² dμ_1 − 1 ≤ κ² − 1.

Provided that μ_0 satisfies PI(ϱ_0), the Poincaré constant of the mixture μ_p = p μ_0 + q μ_1 satisfies by Corollary 1 the estimate

1/ϱ_p ≤ max{ 1/ϱ_0 , (1 + p(κ² − 1)) σ }.

Similarly, Corollary 2 provides, whenever μ_0 satisfies LSI(α_0), the following bound for the log–Sobolev constant of the mixture measure μ_p:

1/α_p ≤ max{ (1 + qλ_p)/α_0 , (1 + pλ_p κ²) σ }.

In this case, the logarithmic blow-up of the log–Sobolev constant cannot be ruled out for p → 0, 1 without any further information on μ_0.

4.3. Mixture of Two Centered Gaussians with Different Variance

For μ_0 = N(0, Id) and μ_1 = N(0, σ Id), the Bakry–Émery criterion (Theorem A1) implies ϱ_0 = α_0 = 1 and ϱ_1 = α_1 = σ⁻¹. The calculation of the χ²-distance can be done using the spherical symmetry and is reduced to a one-dimensional integral:

χ_0 = ∫ (dμ_1/dμ_0) dμ_1 − 1 = ( H^{n−1}(∂B_1)/((2π)^{n/2} σ^n) ) ∫_{ℝ₊} r^{n−1} e^{−(1/σ − 1/2) r²} dr − 1.

Hereby, H^{n−1}(∂B_1) denotes the (n−1)-dimensional Hausdorff measure of the sphere ∂B_1 = {x ∈ ℝⁿ : |x| = 1}. The integral only exists for σ < 2; in this case, it can be evaluated and simplified. The bound for the constant χ_1 follows by duality under the substitution σ ↦ σ⁻¹, and is given by

χ_0 = (σ(2−σ))^{−n/2} − 1 for σ < 2, and χ_0 = +∞ for σ ≥ 2;  χ_1 = (σ⁻¹(2−σ⁻¹))^{−n/2} − 1 for σ > 1/2, and χ_1 = +∞ for σ ≤ 1/2.  (22)

If σ ≤ 1/2, that is, for χ_1 = +∞, the bound given in Corollary 1 yields

1/ϱ_p ≤ max{ σ , 1 + qχ_0 } = max{ σ , (1−q) + q(σ(2−σ))^{−n/2} } = p + q(σ(2−σ))^{−n/2}.

Similarly, if σ ≥ 2, that is, for χ_0 = +∞, the bound becomes

1/ϱ_p ≤ max{ 1 , (1 + pχ_1)σ } ≤ σ( q + p(σ⁻¹(2−σ⁻¹))^{−n/2} ).

In the case 1/2 < σ < 2, the interpolation bound (4) of Theorem 1 could be applied. However, the scaling behavior of the Poincaré constant can already be observed with the estimate (7) of Remark 2, where again, thanks to the symmetry σ ↦ σ⁻¹,

1/ϱ_p ≤ p + q(σ(2−σ))^{−n/2}  for σ ≤ 1,  and  1/ϱ_p ≤ σ( q + p(σ⁻¹(2−σ⁻¹))^{−n/2} )  for σ ≥ 1  (23)

holds. Hence, the Poincaré constant stays bounded for the full range of parameters p ∈ [0,1] and σ > 0.
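The closed-form expression (22) can be cross-checked numerically in one dimension (n = 1); the following sketch integrates ∫ (dμ_1/dμ_0) dμ_1 with a midpoint rule (discretization parameters are arbitrary choices of ours):

```python
import math

def chi0_different_variance(sigma, num=400_000, cutoff=40.0):
    """chi_0 between mu_0 = N(0,1) and mu_1 = N(0,sigma) in one dimension,
    computed as int (dmu_1/dmu_0) dmu_1 - 1; Eq. (22) predicts
    (sigma*(2 - sigma))**(-1/2) - 1 for sigma < 2."""
    h = 2.0 * cutoff / num
    total = 0.0
    for i in range(num):
        x = -cutoff + (i + 0.5) * h
        ratio = sigma ** -0.5 * math.exp(-0.5 * x * x / sigma + 0.5 * x * x)  # dmu_1/dmu_0
        dmu1 = math.exp(-0.5 * x * x / sigma) / math.sqrt(2.0 * math.pi * sigma)
        total += ratio * dmu1 * h
    return total - 1.0
```

For σ = 1/2, both the integral and the formula give (3/4)^{−1/2} − 1 ≈ 0.1547; for σ → 2 the integral diverges, in accordance with χ_0 = +∞ for σ ≥ 2.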

For the log–Sobolev constant, the bound from Corollary 2 gives

1/α_p ≤ 1 + qλ_p(σ(2−σ))^{−n/2}  for σ ≤ 1,  and  1/α_p ≤ σ( 1 + pλ_p(σ⁻¹(2−σ⁻¹))^{−n/2} )  for σ ≥ 1.  (24)

The bound (24) blows up logarithmically for p → 0, 1 in general. However, the special case σ = 1, although trivial, allows for the combined bound 1/α_p ≤ 1 + min{p,q} λ_p, which stays bounded. This behavior can be extended to the range σ ∈ (1/2, 2) thanks to (22) and the interpolation bound of Theorem 2.

The result (23) can be compared with the one of ([1], Section 4.5.2), which states that, for some C > 0, all σ > 1 and p ∈ (0, 1/2),

1/ϱ_{p,CM} ≤ σ + C p^{1−σ⁻¹}  (25)

holds. In general, depending on the constant C, the bound (23) is better for σ small, whereas the scaling in σ is better for (25), namely linear instead of σ^{3/2} as in (23).

4.4. Mixture of Uniform and Gaussian Measure

Let μ0=N(0,Id) and μ1=1Hn(B1)1B1 with B1 the unit ball around zero. Then, ϱ0=1 holds by the Bakry–Émergy criterion (Theorem A1) and ϱ1π2diam(B1)2=π24 by the result of [8]. Furthermore, since μ1μ0, the χ2-distance between μ0 and μ1 becomes thanks to the spherical symmetry

χ0+1=(μ1μ0)2dμ0=(2π)n2Hn(B1)2B1e|x|/2dx=(2π)n2Hn1(B1)Hn(B1)201rn1er2/2dr. (26)

The volume H^n(B_1) of the unit ball and its surface area H^{n−1}(∂B_1) satisfy the relations

H^{n−1}(∂B_1)/H^n(B_1) = n  and  (2π)^{n/2}/H^n(B_1) = 2^{n/2} Γ(n/2 + 1) =: g_n.  (27)

The integral on the right-hand side of (26) can be bounded below by 1/n and above by √e/n, which altogether yields

g_n ≤ χ_0 + 1 ≤ √e g_n.
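The constant g_n and the two-sided bound on χ_0 + 1 can be verified numerically from (26) and (27); a sketch (helper names ours):

```python
import math

def g_const(n):
    """g_n = (2*pi)^{n/2} / H^n(B_1) = 2^{n/2} * Gamma(n/2 + 1), Eq. (27)."""
    return 2.0 ** (n / 2.0) * math.gamma(n / 2.0 + 1.0)

def chi0_plus_one(n, num=200_000):
    """chi_0 + 1 from Eq. (26): g_n * n * int_0^1 r^{n-1} e^{r^2/2} dr (midpoint rule)."""
    h = 1.0 / num
    integral = 0.0
    for i in range(num):
        r = (i + 0.5) * h
        integral += r ** (n - 1) * math.exp(0.5 * r * r) * h
    return g_const(n) * n * integral
```

For every n, the value lies strictly between g_n and √e g_n, since 1 ≤ e^{r²/2} ≤ √e on [0,1].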

Corollary 1 implies that the Poincaré constant of the mixture μ_p = p μ_0 + q μ_1 satisfies

1/ϱ_p ≤ max{ 1/ϱ_1 , 1 + qχ_0 } ≤ p + q√e g_n,  (28)

where the last inequality follows from 4/π² ≤ p + q√e g_n for n ≥ 1 and all p ∈ [0,1].

The estimate of the log–Sobolev constant uses α_0 = 1, by the Bakry–Émery criterion (Theorem A1), and α_1 ≥ 2/e from (A1). Then, Corollary 2 yields the bound

1/α_p ≤ max{ (1 + pλ_p)/α_1 , (1 + qλ_p(1+χ_0))/α_0 } ≤ max{ (1 + pλ_p) e/2 , 1 + qλ_p √e g_n }.  (29)

The bound blows up logarithmically for p → 0, 1.

The blow-up for p → 1 is artificial, which can be shown by a combination of the Bakry–Émery criterion and the Holley–Stroock perturbation principle. To do so, the Hamiltonian of μ_p is decomposed into a convex function and an error term:

H_p(x) := −log μ_p(x) = −log( p (2π)^{−n/2} e^{−|x|²/2} + ((1−p)/H^n(B_1)) 𝟙_{B_1}(x) ) = −log( e^{−(|x|²−1)/2} + ((1−p)/p) ((2π)^{n/2}/H^n(B_1)) √e 𝟙_{B_1}(x) ) + C_{p,n} = (|x|²−1)/2 − ψ_p(x) + C̃_{p,n},  (30)

where

ψ_p(x) := ( log( e^{−(|x|²−1)/2} + ((1−p)/p) ((2π)^{n/2}/H^n(B_1)) √e ) + (|x|²−1)/2 ) 𝟙_{B_1}(x).

The function ψ_p is radially increasing towards the boundary of B_1, which yields for |x| ≤ 1 the bound

0 ≤ ψ_p(x) ≤ log( 1 + ((1−p)/p) ((2π)^{n/2}/H^n(B_1)) √e ).  (31)

From (30), the Hamiltonian H_p is thus a perturbation of the convex potential (|x|²−1)/2, with the bound (31) on the perturbation ψ_p. Together, the Bakry–Émery criterion (Theorem A1) and the Holley–Stroock perturbation principle (Theorem A2) yield that μ_p satisfies PI(ϱ̃_p) and LSI(α̃_p) with

1/ϱ̃_p ≤ 1/α̃_p ≤ 1 + ((1−p)/p) √e g_n,  (32)

where g_n is the same constant as in (27). This bound only blows up for p → 0; however, the blow-up is like 1/p. Furthermore, the bound on the Poincaré constant is worse than the one from (28). Therefore, both approaches need to be combined.

The combination of the bounds obtained in (29) and (32) results in the improved bound

1/α_p ≤ C_n (1 + qλ_p g_n),  with C_n some universal constant,  (33)

which blows up only logarithmically for p → 0.

This example shows that the Poincaré constant and the log–Sobolev constant may have different scaling behavior for p → 0. Indeed, Ref. [1] shows for this specific mixture, in the one-dimensional case, that the log–Sobolev constant can be bounded below by

C |log p| ≤ 1/α_p,

for p small enough and a constant C independent of p. In one dimension, lower bounds are accessible via the functional introduced by Bobkov and Götze [9]. Hence, the bound (33) is optimal in the one-dimensional case, which strongly indicates optimality, in terms of scaling in the mixture ratio p, also in the higher-dimensional case.

To conclude, the Bakry–Émery criterion in combination with the Holley–Stroock perturbation principle is effective for detecting blow-ups of the log–Sobolev constant for mixtures, but has, in general, the wrong scaling behavior in the mixture parameter p. On the other hand, the criterion presented in Theorem 2 provides the right scaling of the blow-up, but may give artificial blow-ups if the components of the mixture become singular in the sense of the χ²-distance.

5. Conclusions

Recently, mixtures have been investigated in many different applications, and the main results of this work may be useful for the investigation of asymmetric Kalman filter estimates [10], the study of asymmetric mixtures in marine biology [11] and econometrics [12], gradient-quadratic and fixed-point iteration algorithms [7], and estimates for multivariate Gaussian mixtures [13].

Theorems 1 and 2 provide a simple estimate of the Poincaré and log–Sobolev constants of a two-component mixture measure μ_p = p μ_0 + q μ_1, if the χ²-distance of μ_0 and μ_1 is bounded and each of the components satisfies a Poincaré or log–Sobolev inequality. Section 4 reviews several examples with the following findings:

  • For mixtures with components that are mutually absolutely continuous and whose tail behavior is mutually controlled in terms of the χ²-distance, Theorems 1 and 2 are very effective.

  • If only one of the components is absolutely continuous with respect to the other one, with bounded density, then it is still possible to obtain a bound on the Poincaré and log–Sobolev constants. However, the log–Sobolev constant blows up logarithmically in the mixture parameter p approaching 0 or 1. It is shown for specific examples that this blow-up is, at least for one of the limits p → 0 or p → 1, not an artifact of the applied method.

  • A necessary condition for the finiteness of the χ²-distance between two measures is that at least one of the measures μ_0 and μ_1 is absolutely continuous with respect to the other one, which in particular forces the mixture to have connected support. This condition is restrictive, since one can easily decompose a measure into a mixture where the joint support of the components is a null set. In this case, the present approach is not helpful, even though the mixture may still satisfy both functional inequalities.

Future work could overcome the limits of the present approach by revisiting the crucial ingredient for both the Poincaré and the log–Sobolev inequality, namely the representation of the mean-difference in terms of covariances in Lemma 1. Formula (3) from Lemma 1 applies only in the case where both measures are mutually absolutely continuous. However, the idea of an interpolation bound can be generalized to suitable weighted Sobolev spaces. For this, since μ_0, μ_1 ≪ μ_p for all p ∈ (0,1), one can formally write and estimate

E_μ0[f] − E_μ1[f] = Cov_μp[f, dμ_0/dμ_p − dμ_1/dμ_p] ≤ ‖f‖_{Ḣ¹(μ_p)} ‖dμ_0/dμ_p − dμ_1/dμ_p‖_{Ḣ⁻¹(μ_p)}.  (34)

Hereby, Ḣ¹(μ_p) is the homogeneous weighted Sobolev space with norm ‖f‖²_{Ḣ¹(μ_p)} := ∫ |∇f|² dμ_p, and Ḣ⁻¹(μ_p) is its dual space with norm

‖ω‖²_{Ḣ⁻¹(μ_p)} := sup_{f ∈ Ḣ¹(μ_p)} { 2 ⟨f, ω⟩_{μ_p} − ‖f‖²_{Ḣ¹(μ_p)} }.

The representation (34) applies to many more situations, in which the components of the mixture need not be absolutely continuous. Similar ideas for estimating mean-differences were successfully applied in the metastable setting [2,14], where suitable bounds on the Ḣ⁻¹-norm are obtained. In this regard, the bound (34) promises many interesting new insights for future studies.

Acknowledgments

This work is based on part of the Ph.D. thesis [15] written under the supervision of Stephan Luckhaus at the University of Leipzig. The author thanks the Max-Planck-Institute for Mathematics in the Sciences in Leipzig for providing excellent working conditions. The author thanks Georg Menz for many discussions on mixtures and metastability.

Appendix A. Bakry–Émery Criterion and Holley–Stroock Perturbation Principle

Two classical conditions for Poincaré and log–Sobolev inequalities are stated in this part of the appendix. The Bakry–Émery criterion relates convexity of the Hamiltonian of a measure, and positive curvature of the underlying space, to constants in the Poincaré and log–Sobolev inequalities. Although the result is classical for the case of ℝⁿ, the result for general convex domains was established in ([16], Theorem 2.1).

Theorem A1

(Bakry–Émery criterion ([17], Proposition 3, Corollary 2), ([16], Theorem 2.1)). Let Ω ⊆ ℝⁿ be convex and let H : Ω → ℝ be a Hamiltonian with Gibbs measure μ(dx) = Z_μ⁻¹ e^{−H(x)} 𝟙_Ω(x) dx, and assume that ∇²H(x) ⪰ κ > 0 for all x ∈ supp μ. Then, μ satisfies PI(ϱ) and LSI(α) with

ϱ ≥ κ  and  α ≥ κ.

The second condition is the Holley–Stroock perturbation principle, which allows one to show Poincaré and log–Sobolev inequalities for a very large class of measures.

Theorem A2

(Holley–Stroock perturbation principle ([18], p. 1184)). Let Ω ⊆ ℝⁿ, let H : Ω → ℝ, and let ψ : Ω → ℝ be a bounded function. Let μ and μ̃ be the Gibbs measures with Hamiltonians H and H + ψ, respectively:

μ(dx) = (1/Z_μ) e^{−H(x)} 𝟙_Ω(x) dx  and  μ̃(dx) = (1/Z_μ̃) e^{−H(x)−ψ(x)} 𝟙_Ω(x) dx.

If μ satisfies PI(ϱ) and LSI(α), then μ̃ satisfies PI(ϱ̃) and LSI(α̃), respectively. Hereby, the constants satisfy

ϱ̃ ≥ e^{−osc_Ω ψ} ϱ  and  α̃ ≥ e^{−osc_Ω ψ} α,

where osc_Ω ψ := sup_Ω ψ − inf_Ω ψ.

Proofs relying on semigroup theory of Theorems A1 and A2 can be found in the exposition by Ledoux ([6], Corollary 1.4, Corollary 1.6 and Lemma 1.2).

Example A1 (Uniform measure on the ball).

The measure μ_1 = (1/H^n(B_1)) 𝟙_{B_1}, with B_1 the unit ball around zero, satisfies LSI(α_1) with

α_1 ≥ 2/e.  (A1)

The proof compares the measure μ_1 with the family of measures

ν_σ(dx) = (1/Z_σ) exp(−σ|x|² + σ/2) 𝟙_{B_1}(x) dx  for σ > 0.

By the Bakry–Émery criterion (Theorem A1), ν_σ satisfies LSI(2σ). Moreover, sup_{x∈B_1} |−σ|x|² + σ/2| = σ/2, so that the perturbation between μ_1 and ν_σ has oscillation at most σ, and hence μ_1 satisfies LSI(2σe^{−σ}) by the Holley–Stroock perturbation principle (Theorem A2) for all σ > 0. Optimizing the expression 2σe^{−σ} in σ (with maximum at σ = 1) gives the bound (A1).

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  • 1. Chafaï D., Malrieu F. On fine properties of mixtures with respect to concentration of measure and Sobolev type inequalities. Annales de l'Institut Henri Poincaré Probabilités et Statistiques. 2010;46:72–96. doi: 10.1214/08-AIHP309.
  • 2. Menz G., Schlichting A. Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape. Ann. Probab. 2014;42:1809–1884. doi: 10.1214/14-AOP908.
  • 3. Gibbs A.L., Su F.E. On Choosing and Bounding Probability Metrics. Int. Stat. Rev. 2002;70:419–435. doi: 10.1111/j.1751-5823.2002.tb00178.x.
  • 4. Higuchi Y., Yoshida N. Analytic Conditions and Phase Transition for Ising Models. 1995. Unpublished lecture notes in Japanese.
  • 5. Diaconis P., Saloff-Coste L. Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Probab. 1996;6:695–750. doi: 10.1214/aoap/1034968224.
  • 6. Ledoux M. Logarithmic Sobolev Inequalities for Unbounded Spin Systems Revisited. Springer; Berlin, Germany: 1999. pp. 167–194. Séminaire de Probabilités XXXV.
  • 7. Carreira-Perpinan M.A. Mode-finding for mixtures of Gaussian distributions. IEEE Trans. Pattern Anal. Mach. Intell. 2000;22:1318–1323. doi: 10.1109/34.888716.
  • 8. Payne L.E., Weinberger H.F. An optimal Poincaré inequality for convex domains. Arch. Ration. Mech. Anal. 1960;5:286–292. doi: 10.1007/BF00252910.
  • 9. Bobkov S.G., Götze F. Exponential Integrability and Transportation Cost Related to Logarithmic Sobolev Inequalities. J. Funct. Anal. 1999;163:1–28. doi: 10.1006/jfan.1998.3326.
  • 10. Nurminen H., Ardeshiri T., Piche R., Gustafsson F. Skew-t Filter and Smoother with Improved Covariance Matrix Approximation. IEEE Trans. Signal Process. 2018;66:5618–5633. doi: 10.1109/TSP.2018.2865434.
  • 11. Contreras-Reyes J., López Quintero F., Yáñez A. Towards Age Determination of Southern King Crab (Lithodes santolla) Off Southern Chile Using Flexible Mixture Modeling. J. Mar. Sci. Eng. 2018;6:157. doi: 10.3390/jmse6040157.
  • 12. Tasche D. Exact Fit of Simple Finite Mixture Models. J. Risk Financ. Manag. 2014;7:150–164. doi: 10.3390/jrfm7040150.
  • 13. McLachlan G., Peel D. Finite Mixture Models. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2000. (Wiley Series in Probability and Statistics).
  • 14. Schlichting A., Slowik M. Poincaré and logarithmic Sobolev constants for metastable Markov chains via capacitary inequalities. arXiv 2017. arXiv:1705.05135.
  • 15. Schlichting A. The Eyring-Kramers Formula for Poincaré and Logarithmic Sobolev Inequalities. Ph.D. Thesis. Universität Leipzig; Leipzig, Germany: 2012.
  • 16. Kolesnikov A.V., Milman E. Riemannian metrics on convex sets with applications to Poincaré and log–Sobolev inequalities. Calc. Var. Part. Differ. Equ. 2016;55:1–36. doi: 10.1007/s00526-016-1018-3.
  • 17. Bakry D., Émery M. Diffusions Hypercontractives. Springer; Berlin, Germany: 1985. pp. 177–206. Séminaire de Probabilités, XIX.
  • 18. Holley R., Stroock D. Logarithmic Sobolev inequalities and stochastic Ising models. J. Stat. Phys. 1987;46:1159–1194. doi: 10.1007/BF01011161.

Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
