Journal of Inequalities and Applications. 2018 Nov 12;2018(1):306. doi: 10.1186/s13660-018-1897-2

Jensen–Steffensen inequality for strongly convex functions

M Klaričić Bakula
PMCID: PMC6244717  PMID: 30839851

Abstract

The Jensen inequality for convex functions holds under the assumption that all of the included weights are nonnegative. If we allow some of the weights to be negative, such an inequality is called the Jensen–Steffensen inequality for convex functions. In this paper we prove the Jensen–Steffensen inequality for strongly convex functions.

Keywords: Strongly convex functions, Jensen inequality, Jensen–Steffensen inequality

Introduction

Let $I\subseteq\mathbb{R}$ be an interval. It is well known that if a function $f\colon I\to\mathbb{R}$ is convex, then

\[ f\left(\sum_{i=1}^{n} p_i x_i\right) \le \sum_{i=1}^{n} p_i f(x_i) \tag{1.1} \]

for all $n\in\mathbb{N}$, $x_1,\dots,x_n\in I$, and $p_1,\dots,p_n>0$ with $p_1+\dots+p_n=1$. If $f$ is strictly convex, then (1.1) is strict unless all $x_i$ are equal [7, p. 43]. This classical Jensen inequality is one of the most important inequalities in convex analysis, and it has various applications in mathematics, statistics, economics, and engineering sciences.

It is also known that the assumption $p_1,\dots,p_n>0$ can be relaxed at the expense of restricting $x_1,\dots,x_n$ more severely [9]. Namely, if $\mathbf{p}=(p_1,\dots,p_n)$ is a real n-tuple such that for every $k\in\{1,\dots,n\}$

\[ 0 \le p_1+\dots+p_k \le p_1+\dots+p_n = 1, \tag{1.2} \]

then for any monotonic n-tuple $\mathbf{x}=(x_1,\dots,x_n)\in I^n$ (increasing or decreasing) we get

\[ \bar{x} = p_1x_1+\dots+p_nx_n \in I, \]

and for any function $f$ convex on $I$, (1.1) still holds. Under such assumptions (1.1) is called the Jensen–Steffensen inequality for convex functions, and (1.2) are called Steffensen’s conditions after J. F. Steffensen. Again, for a strictly convex function $f$, (1.1) remains strict under certain additional assumptions on $\mathbf{x}$ and $\mathbf{p}$ [1]. Needless to say, a mathematical mind has to question the limitation $p_1,\dots,p_n>0$ even if in the usual practice we can cope with it.
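The following is a minimal numerical sketch, not part of the original paper, of what Steffensen's conditions (1.2) permit. The helper names `satisfies_steffensen` and `jensen_gap`, the sample data, and the choice of the exponential as the convex function are assumptions made purely for illustration.

```python
import math

def satisfies_steffensen(p, tol=1e-12):
    """Return True if every partial sum P_k lies in [0, 1] and P_n equals 1, as in (1.2)."""
    partial = 0.0
    for weight in p:
        partial += weight
        if partial < -tol or partial > 1.0 + tol:
            return False
    return abs(partial - 1.0) <= tol

def jensen_gap(f, x, p):
    """Return sum(p_i * f(x_i)) - f(sum(p_i * x_i)); nonnegative when (1.1) holds."""
    mean = sum(pi * xi for pi, xi in zip(p, x))
    return sum(pi * f(xi) for pi, xi in zip(p, x)) - f(mean)

# Illustrative data: an increasing tuple and weights containing one negative entry.
x = [0.0, 1.0, 2.0, 3.0]
p = [0.5, -0.25, 0.25, 0.5]      # partial sums: 0.5, 0.25, 0.5, 1.0

print(satisfies_steffensen(p))           # the conditions (1.2) are met
print(jensen_gap(math.exp, x, p) >= 0)   # (1.1) still holds for the convex function exp
```

The weight $p_2$ is negative here, so the classical Jensen inequality does not apply, yet the partial sums stay in $[0,1]$ and the tuple is monotonic, which is exactly the situation covered by the Jensen–Steffensen inequality.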

Variants of the Jensen inequality are proved for various classes of generalized convex functions, and the class of strongly convex functions is among them. Recall that a function $f\colon I\to\mathbb{R}$ is called strongly convex with modulus $c>0$ if

\[ f(tx+(1-t)y) \le t f(x)+(1-t)f(y) - c\,t(1-t)(x-y)^2 \tag{1.3} \]

for all $x,y\in I$ and $t\in[0,1]$ [8]. Strongly convex functions are useful in optimization theory, mathematical economics and approximation theory, and an interested reader can find more about them in an excellent survey paper [5].
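As a quick worked check of definition (1.3), added here for illustration and not taken from the paper, consider $f(x)=x^2$. Expanding the square gives

```latex
\begin{align*}
t x^2 + (1-t) y^2 - \bigl(t x + (1-t) y\bigr)^2
  &= t(1-t)x^2 - 2t(1-t)xy + t(1-t)y^2 \\
  &= t(1-t)(x-y)^2 .
\end{align*}
```

Hence $f(tx+(1-t)y) = t f(x) + (1-t) f(y) - t(1-t)(x-y)^2$, so $x^2$ satisfies (1.3) with modulus $c=1$, with equality throughout, and with no larger modulus.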

As we can easily see, strong convexity is a strengthening of the notion of convexity, and some properties of strongly convex functions are just “stronger versions” of analogous properties of convex functions (for more details, see [5]). One example of such a stronger version is the Jensen inequality for strongly convex functions (see [4] or [5]). If $f\colon I\to\mathbb{R}$, $I\subseteq\mathbb{R}$, is strongly convex with modulus $c$, then

\[ f\left(\sum_{i=1}^{n} p_i x_i\right) \le \sum_{i=1}^{n} p_i f(x_i) - c\sum_{i=1}^{n} p_i (x_i-\bar{x})^2 \tag{1.4} \]

for all $x_1,\dots,x_n\in I$ and all $p_1,\dots,p_n>0$ such that $p_1+\dots+p_n=1$. If we compare (1.4) with (1.1), we see that (1.4) provides a better upper bound for $f(\bar{x})$ since the term $c\sum_{i=1}^{n} p_i(x_i-\bar{x})^2$ is always nonnegative. Of course, if $c=0$, we go right back to convex functions and (1.1).

We must emphasize here that proving a Jensen type inequality for some class of generalized convex functions does not necessarily mean that such an inequality also holds under Steffensen’s conditions. The goal of this paper is to prove that for the class of strongly convex functions it does.

Main result

Strongly convex functions have a very useful characterization: they always have a specific convex representation. This is stated in the following theorem (see [3] or [6]).

Theorem 1

Let $I$ be an interval in $\mathbb{R}$. A function $f\colon I\to\mathbb{R}$ is strongly convex with modulus $c$ if and only if the function $g=f-c(\cdot)^2$ is convex.

The Jensen inequality for strongly convex functions can be proved either using Theorem 1 and the Jensen inequality for convex functions or (for I open) directly, using the “support parabola” property [5, Theorem 1]. In this section we prove the Jensen–Steffensen inequality for strongly convex functions using Theorem 1.

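In the rest of the paper we use the following notation related to the n-tuples $\mathbf{x}=(x_1,\dots,x_n)$ and $\mathbf{p}=(p_1,\dots,p_n)$, $n\in\mathbb{N}$:

\[ \bar{x}=p_1x_1+\dots+p_nx_n,\qquad P_k=p_1+\dots+p_k,\quad k\in\{1,2,\dots,n\},\qquad \overline{P}_k=p_k+\dots+p_n,\quad k\in\{1,2,\dots,n\}. \]

Theorem 2

Let $I$ be an interval in $\mathbb{R}$. If $f\colon I\to\mathbb{R}$ is a strongly convex function with modulus $c$, then for every monotonic n-tuple $\mathbf{x}=(x_1,\dots,x_n)\in I^n$ and every real n-tuple $\mathbf{p}=(p_1,\dots,p_n)$ such that, for every $i\in\{1,2,\dots,n\}$,

\[ 0\le P_i\le P_n=1 \]

the following inequality holds:

\[ f\left(\sum_{i=1}^{n} p_i x_i\right) \le \sum_{i=1}^{n} p_i f(x_i) - c\sum_{i=1}^{n} p_i (x_i-\bar{x})^2. \]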

Proof

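Suppose that $\mathbf{x}$ is increasing (for $\mathbf{x}$ decreasing the proof is analogous). It can be easily seen that Steffensen’s conditions yield

\[ \overline{P}_k\ge 0,\quad k\in\{1,2,\dots,n\}, \]

and

\[ x_n-\bar{x}=P_n(x_n-\bar{x})=\sum_{i=1}^{n-1}P_i(x_{i+1}-x_i)\ge 0, \]

hence we obtain $\bar{x}\le x_n$. Analogously,

\[ \bar{x}-x_1=P_n(\bar{x}-x_1)=\sum_{i=2}^{n}\overline{P}_i(x_i-x_{i-1})\ge 0, \]

and $x_1\le\bar{x}$. From that we may conclude $\bar{x}\in[x_1,x_n]\subseteq I$, which means that $g(\bar{x})=g\left(\sum_{i=1}^{n}p_ix_i\right)$ is defined.

Using the convex representation $g=f-c(\cdot)^2$ as in Theorem 1 and applying the Jensen–Steffensen inequality for convex functions, we obtain

\[ g\left(\sum_{i=1}^{n}p_ix_i\right)\le\sum_{i=1}^{n}p_ig(x_i). \]

Returning back to $f$, we get

\[ f\left(\sum_{i=1}^{n}p_ix_i\right)-c\left(\sum_{i=1}^{n}p_ix_i\right)^2\le\sum_{i=1}^{n}p_i\left(f(x_i)-cx_i^2\right)=\sum_{i=1}^{n}p_if(x_i)-c\sum_{i=1}^{n}p_ix_i^2, \]

or written differently

\[
\begin{aligned}
f\left(\sum_{i=1}^{n}p_ix_i\right)
&\le\sum_{i=1}^{n}p_if(x_i)-c\left[\sum_{i=1}^{n}p_ix_i^2-\left(\sum_{i=1}^{n}p_ix_i\right)^2\right]\\
&=\sum_{i=1}^{n}p_if(x_i)-c\left[\sum_{i=1}^{n}p_ix_i^2-\bar{x}^2\right]\\
&=\sum_{i=1}^{n}p_if(x_i)-c\left[\sum_{i=1}^{n}p_ix_i^2-2\bar{x}^2+\bar{x}^2\right]\\
&=\sum_{i=1}^{n}p_if(x_i)-c\left[\sum_{i=1}^{n}p_ix_i^2-2\bar{x}\sum_{i=1}^{n}p_ix_i+\bar{x}^2\sum_{i=1}^{n}p_i\right]\\
&=\sum_{i=1}^{n}p_if(x_i)-c\sum_{i=1}^{n}p_i(x_i-\bar{x})^2.
\end{aligned}
\]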

 □
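The statement of Theorem 2 can be checked numerically on a small instance. The sketch below is an illustration only: the function name `strong_jensen_steffensen_gap`, the sample function $f(t)=t^2+e^t$ (strongly convex with modulus 1 by Theorem 1, since $f-(\cdot)^2=\exp$ is convex), and the data are assumptions, not taken from the paper.

```python
import math

def strong_jensen_steffensen_gap(f, c, x, p):
    """Right-hand side minus left-hand side of the Theorem 2 inequality.

    A nonnegative value means the inequality holds for this data.
    """
    mean = sum(pi * xi for pi, xi in zip(p, x))
    weighted_f = sum(pi * f(xi) for pi, xi in zip(p, x))
    spread = sum(pi * (xi - mean) ** 2 for pi, xi in zip(p, x))
    return weighted_f - c * spread - f(mean)

def f(t):
    # f(t) - 1 * t^2 = exp(t) is convex, so f is strongly convex with modulus 1.
    return t * t + math.exp(t)

# Increasing tuple and weights with one negative entry satisfying Steffensen's conditions.
x = [0.0, 1.0, 2.0, 3.0]
p = [0.5, -0.25, 0.25, 0.5]

gap = strong_jensen_steffensen_gap(f, 1.0, x, p)
mean = sum(pi * xi for pi, xi in zip(p, x))
convex_gap = sum(pi * math.exp(xi) for pi, xi in zip(p, x)) - math.exp(mean)

print(gap >= 0)                      # Theorem 2 holds for this instance
print(abs(gap - convex_gap) < 1e-9)  # the gap equals the Jensen-Steffensen gap of g = exp
```

The second check mirrors the proof: the quadratic part of $f$ cancels exactly against the term $c\sum p_i(x_i-\bar{x})^2$, so what remains is the gap in the Jensen–Steffensen inequality for the convex function $g$.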

Alternative approach

What would happen if we tried to prove (1.4) under Steffensen’s conditions directly, using the support parabola property? The question is a reasonable one: in the case of the Jensen inequality for strongly convex functions both ways produce the same inequality (1.4), but, generally speaking, any negative weights in $\mathbf{p}$ can at some point interrupt the chain of conclusions in a proof. This is exactly the reason why it is considerably more difficult to prove (1.1) under Steffensen’s conditions. We will see what happens in this case in the next theorem, but first we need the following lemma, which basically says that the support parabola in $x_0$ can be “shifted up” from $x_0$ to $y$ and still remain “under” $f(x)$ if $x\le y\le x_0$.

Lemma 1

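Let $I\subseteq\mathbb{R}$ be an open interval, let $f\colon I\to\mathbb{R}$ be a strongly convex function with modulus $c$, and for $x_0\in I$ let

\[ y=f(x_0)+\lambda(x-x_0)+c(x-x_0)^2 \tag{3.1} \]

be the support parabola for $f$ in $x_0$. Then for every $x,y\in I$ such that $x\le y\le x_0$

\[ f(x)-f(y)\ge\lambda(x-y)+c(x-y)^2, \tag{3.2} \]

and for $x,y\in I$ such that $x_0\le x\le y$

\[ f(y)-f(x)\ge\lambda(y-x)+c(y-x)^2. \tag{3.3} \]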

Proof

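Since (3.1) is a support parabola for $f$ in $x_0$, it follows that for every $x\in I$

\[ f(x)-f(x_0)\ge\lambda(x-x_0)+c(x-x_0)^2. \tag{3.4} \]

Let $x,y\in I$ be such that $x<y<x_0$. The middle element $y$ can be represented as a convex combination of $x$ and $x_0$ in the following way:

\[ y=\frac{x_0-y}{x_0-x}\,x+\frac{y-x}{x_0-x}\,x_0. \]

From the definition of strong convexity we have

\[ f(y)\le\frac{x_0-y}{x_0-x}f(x)+\frac{y-x}{x_0-x}f(x_0)-c\,\frac{x_0-y}{x_0-x}\cdot\frac{y-x}{x_0-x}(x-x_0)^2, \]

and since

\[ \frac{x_0-y}{x_0-x}+\frac{y-x}{x_0-x}=1, \]

we can write

\[ f(y)=\frac{x_0-y}{x_0-x}f(y)+\frac{y-x}{x_0-x}f(y)\le\frac{x_0-y}{x_0-x}f(x)+\frac{y-x}{x_0-x}f(x_0)-c\,\frac{x_0-y}{x_0-x}\cdot\frac{y-x}{x_0-x}(x-x_0)^2. \]

After a simple calculation we obtain

\[ (x_0-y)\bigl(f(x)-f(y)\bigr)\ge(x-y)\bigl(f(x_0)-f(y)\bigr)+c\,\frac{(x_0-y)(y-x)}{x_0-x}(x-x_0)^2 \]

and, dividing by $(x_0-y)(x-y)<0$,

\[ \frac{f(x)-f(y)}{x-y}\le\frac{f(x_0)-f(y)}{x_0-y}-c(x_0-x). \tag{3.5} \]

The support parabola property (3.4) gives

\[ f(y)-f(x_0)\ge\lambda(y-x_0)+c(y-x_0)^2, \]

and since $y-x_0<0$

\[ \frac{f(x_0)-f(y)}{x_0-y}\le\lambda-c(x_0-y). \]

Using the above inequality and (3.5), we obtain

\[ \frac{f(x)-f(y)}{x-y}\le\frac{f(x_0)-f(y)}{x_0-y}-c(x_0-x)\le\lambda-c(x_0-y)-c(x_0-x)=\lambda+c(x+y-2x_0). \]

Since $x-y<0$ we get

\[ f(x)-f(y)\ge\lambda(x-y)+c(x-y)(x+y-2x_0), \]

and because of $x+y-2x_0<x-y$, we end up with

\[ f(x)-f(y)\ge\lambda(x-y)+c(x-y)^2. \]

If $x_0<x<y$, in an analogous way we can prove

\[ f(y)-f(x)\ge\lambda(y-x)+c(y-x)^2. \]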

Note that the above inequalities still hold in the trivial way if x=y. □
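Lemma 1 can be spot-checked numerically. The sketch below is illustrative only and rests on assumptions not made in the paper: the sample function $f(t)=t^2+e^t$ with modulus $c=1$, the slope $\lambda=f'(x_0)$ (which gives the support parabola for this differentiable $f$), and the helper name `shifted_gap`.

```python
import math

C = 1.0  # assumed modulus of strong convexity of the sample function below

def f(t):
    # f(t) - C * t^2 = exp(t) is convex, so f is strongly convex with modulus C.
    return t * t + math.exp(t)

def f_prime(t):
    return 2.0 * t + math.exp(t)

def shifted_gap(far, near, x0):
    """f(far) - f(near) - lam*(far - near) - C*(far - near)^2, with lam = f'(x0).

    Lemma 1 states this is nonnegative whenever `near` lies between x0 and `far`.
    """
    lam = f_prime(x0)
    step = far - near
    return f(far) - f(near) - lam * step - C * step * step

x0 = 0.0
grid = [i / 4 for i in range(13)]  # 0, 0.25, ..., 3

# (3.3): x0 <= x <= y, with far = y and near = x.
right = [shifted_gap(y, x, x0) for x in grid for y in grid if x <= y]
# (3.2): x <= y <= x0, obtained by mirroring the grid, with far = x and near = y.
left = [shifted_gap(-y, -x, x0) for x in grid for y in grid if x <= y]

print(min(right) >= -1e-12, min(left) >= -1e-12)
```

A small tolerance is used in the comparison because the pairs with `far == near` give a gap of zero, the trivial case mentioned at the end of the proof.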

Remark 1

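(3.2) and (3.3) can be also proved using the convex representation $g=f-c(\cdot)^2$. We start from the support parabola property in $x_0\in I$

\[ f(x)-f(x_0)\ge\lambda(x-x_0)+c(x-x_0)^2. \]

Then

\[ g(x)-g(x_0)+cx^2-cx_0^2\ge\lambda(x-x_0)+c(x-x_0)^2, \]

that is,

\[ g(x)-g(x_0)\ge\lambda(x-x_0)+c(x-x_0)^2-cx^2+cx_0^2=(\lambda-2cx_0)(x-x_0)=\lambda^{*}(x-x_0), \]

hence $g$ has a support line in $x_0$ for $\lambda^{*}=\lambda-2cx_0$. Since $g$ is convex, we know that for every $x_0\le x\le y$ [7]

\[ g(y)-g(x)\ge\lambda^{*}(y-x)=(\lambda-2cx_0)(y-x). \]

Returning to $f$, we obtain

\[ f(y)-cy^2-f(x)+cx^2\ge(\lambda-2cx_0)(y-x), \]

hence

\[
\begin{aligned}
f(y)-f(x)&\ge(\lambda-2cx_0)(y-x)+cy^2-cx^2\\
&=\lambda(y-x)+c(y-x)(x+y-2x_0)\\
&\ge\lambda(y-x)+c(y-x)(x+y-2x)\\
&=\lambda(y-x)+c(y-x)^2.
\end{aligned}
\]

Consequently,

\[ f(y)-f(x)\ge\lambda(y-x)+c(y-x)^2,\quad x_0\le x\le y. \]

Analogously, we can prove

\[ f(x)-f(y)\ge\lambda(x-y)+c(x-y)^2,\quad x\le y\le x_0. \]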

Theorem 3

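Let $I\subseteq\mathbb{R}$ be an open interval. If $f\colon I\to\mathbb{R}$ is a strongly convex function with modulus $c$, then for every monotonic n-tuple $\mathbf{x}=(x_1,\dots,x_n)\in I^n$ and every real n-tuple $\mathbf{p}=(p_1,\dots,p_n)$ such that for every $i\in\{1,2,\dots,n\}$

\[ 0\le P_i\le P_n=1, \]

there exists $k\in\{1,\dots,n-1\}$ such that $\bar{x}\in[x_k,x_{k+1}]$ for $\mathbf{x}$ increasing or $\bar{x}\in[x_{k+1},x_k]$ for $\mathbf{x}$ decreasing, and

\[ \sum_{i=1}^{n}p_if(x_i)-f\left(\sum_{i=1}^{n}p_ix_i\right)\ge c\left[\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2\right]\ge 0. \]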

Proof

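Suppose that $\mathbf{x}$ is increasing (for $\mathbf{x}$ decreasing the proof is analogous).

First observe that as in Theorem 2 we know that $\bar{x}\in[x_1,x_n]\subseteq I$, and we may conclude that there exists some $k\in\{1,\dots,n-1\}$ such that $\bar{x}\in[x_k,x_{k+1}]$.

From (3.4), choosing $x_0=\bar{x}$, we get

\[ f(x)-f(\bar{x})\ge\lambda(x-\bar{x})+c(x-\bar{x})^2 \]

for some $\lambda\in\mathbb{R}$ and every $x\in I$.

Next we use the Abel transformation to obtain the identities (similar ones can be found in [1])

\[ 0=\sum_{i=1}^{n}p_ix_i-\bar{x}=\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})+P_k(x_k-\bar{x})+\overline{P}_{k+1}(x_{k+1}-\bar{x})+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1}) \tag{3.6} \]

and

\[ \sum_{i=1}^{n}p_if(x_i)-f(\bar{x})=\sum_{i=1}^{k-1}P_i\bigl(f(x_i)-f(x_{i+1})\bigr)+P_k\bigl(f(x_k)-f(\bar{x})\bigr)+\overline{P}_{k+1}\bigl(f(x_{k+1})-f(\bar{x})\bigr)+\sum_{i=k+2}^{n}\overline{P}_i\bigl(f(x_i)-f(x_{i-1})\bigr), \tag{3.7} \]

where in the case $k=1$ we assume $\sum_{i=1}^{k-1}$ to be 0, while in the case $k=n-1$ we assume $\sum_{i=k+2}^{n}$ to be 0.

From (3.7), using (3.2), (3.3), and then (3.6), we get

\[
\begin{aligned}
\sum_{i=1}^{n}p_if(x_i)-f(\bar{x})
&\ge\sum_{i=1}^{k-1}P_i\bigl(\lambda(x_i-x_{i+1})+c(x_i-x_{i+1})^2\bigr)+P_k\bigl(\lambda(x_k-\bar{x})+c(x_k-\bar{x})^2\bigr)\\
&\quad+\overline{P}_{k+1}\bigl(\lambda(x_{k+1}-\bar{x})+c(x_{k+1}-\bar{x})^2\bigr)+\sum_{i=k+2}^{n}\overline{P}_i\bigl(\lambda(x_i-x_{i-1})+c(x_i-x_{i-1})^2\bigr)\\
&=\lambda\left[\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})+P_k(x_k-\bar{x})+\overline{P}_{k+1}(x_{k+1}-\bar{x})+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})\right]\\
&\quad+c\left[\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2\right]\\
&=c\left[\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2\right].
\end{aligned}
\]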

 □

One might have hoped that this way we would end up with

\[ \sum_{i=1}^{n}p_if(x_i)-f(\bar{x})\ge c\sum_{i=1}^{n}p_i(x_i-\bar{x})^2 \]

since this is exactly what happens in the analogous proofs (direct and indirect) for convex functions. It would be possible if

\[ \sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2\ge\sum_{i=1}^{n}p_i(x_i-\bar{x})^2, \tag{3.8} \]

but sadly this is not generally true.

Example 1

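Let $\mathbf{x}=(1,2,3,4)$, $\mathbf{p}=(1,-1,0,1)$. Then

\[
\begin{aligned}
&P_1=1,\ P_2=0,\ P_3=0,\ P_4=1,\qquad \overline{P}_1=1,\ \overline{P}_2=0,\ \overline{P}_3=1,\ \overline{P}_4=1,\\
&\bar{x}=3\in[2,3],\qquad k=2\ (\text{or } k=3),\\
&\sum_{i=1}^{1}P_i(x_i-x_{i+1})^2+P_2(x_2-\bar{x})^2+\overline{P}_3(x_3-\bar{x})^2+\sum_{i=4}^{4}\overline{P}_i(x_i-x_{i-1})^2=(1-2)^2+0+(3-3)^2+(4-3)^2=2,\\
&\sum_{i=1}^{4}p_i(x_i-\bar{x})^2=(1-3)^2-(2-3)^2+0+(4-3)^2=4>2.
\end{aligned}
\]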
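The arithmetic of Example 1 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `abel_bound_sum` and `weighted_spread` are hypothetical, and `k` is taken 1-based to match the text.

```python
from itertools import accumulate

def abel_bound_sum(x, p, k):
    """Bracketed sum from Theorem 3 for a 1-based k with mean in [x_k, x_{k+1}]."""
    n = len(x)
    mean = sum(pi * xi for pi, xi in zip(p, x))
    P = list(accumulate(p))                      # P[i-1] = p_1 + ... + p_i
    Pbar = list(accumulate(reversed(p)))[::-1]   # Pbar[i-1] = p_i + ... + p_n
    total = sum(P[i - 1] * (x[i - 1] - x[i]) ** 2 for i in range(1, k))
    total += P[k - 1] * (x[k - 1] - mean) ** 2
    total += Pbar[k] * (x[k] - mean) ** 2
    total += sum(Pbar[i - 1] * (x[i - 1] - x[i - 2]) ** 2 for i in range(k + 2, n + 1))
    return total

def weighted_spread(x, p):
    """Return sum(p_i * (x_i - mean)^2)."""
    mean = sum(pi * xi for pi, xi in zip(p, x))
    return sum(pi * (xi - mean) ** 2 for pi, xi in zip(p, x))

x = [1, 2, 3, 4]
p = [1, -1, 0, 1]

print(abel_bound_sum(x, p, 2), abel_bound_sum(x, p, 3), weighted_spread(x, p))  # 2 2 4
```

Both admissible choices of $k$ give the value 2, which is strictly smaller than the weighted spread 4, so (3.8) fails for this data.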

In fact, the following theorem holds.

Theorem 4

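Let $f$, $\mathbf{p}$, $\mathbf{x}$, and $k$ be as in Theorem 3. Then

\[ \sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2\le\sum_{i=1}^{n}p_i(x_i-\bar{x})^2. \]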

Proof

For the sake of simplicity, we introduce the following notation:

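\[ I_k=\sum_{i=1}^{k-1}P_i(x_i-x_{i+1})^2+P_k(x_k-\bar{x})^2+\overline{P}_{k+1}(x_{k+1}-\bar{x})^2+\sum_{i=k+2}^{n}\overline{P}_i(x_i-x_{i-1})^2,\qquad \overline{x^2}=\sum_{i=1}^{n}p_ix_i^2. \]

Suppose that $\mathbf{x}$ is increasing (for $\mathbf{x}$ decreasing the proof is analogous). First note that for $k$ as in Theorem 3 we have

\[ x_i\le\bar{x},\quad i=1,2,\dots,k,\qquad \bar{x}\le x_i,\quad i=k+1,\dots,n. \]

Using this notation, we get

\[
\begin{aligned}
I_k&=\sum_{i=1}^{k-1}P_i(x_i^2-x_{i+1}^2)+P_k\bigl(x_k^2-\overline{x^2}\bigr)+\overline{P}_{k+1}\bigl(x_{k+1}^2-\overline{x^2}\bigr)+\sum_{i=k+2}^{n}\overline{P}_i(x_i^2-x_{i-1}^2)\\
&\quad-2\sum_{i=1}^{k-1}P_ix_ix_{i+1}+2\sum_{i=1}^{k-1}P_ix_{i+1}^2+P_k(-2x_k\bar{x}+\bar{x}^2)+P_k\overline{x^2}\\
&\quad+\overline{P}_{k+1}\overline{x^2}+\overline{P}_{k+1}(-2x_{k+1}\bar{x}+\bar{x}^2)-2\sum_{i=k+2}^{n}\overline{P}_ix_ix_{i-1}+2\sum_{i=k+2}^{n}\overline{P}_ix_{i-1}^2.
\end{aligned}
\]

Applying (3.6) to $\mathbf{p}$ and $\mathbf{x}^2=(x_1^2,\dots,x_n^2)$, we obtain

\[ \sum_{i=1}^{k-1}P_i(x_i^2-x_{i+1}^2)+P_k\bigl(x_k^2-\overline{x^2}\bigr)+\overline{P}_{k+1}\bigl(x_{k+1}^2-\overline{x^2}\bigr)+\sum_{i=k+2}^{n}\overline{P}_i(x_i^2-x_{i-1}^2)=0, \]

hence

\[
\begin{aligned}
I_k&=2\sum_{i=1}^{k-1}P_ix_{i+1}(x_{i+1}-x_i)+2\sum_{i=k+2}^{n}\overline{P}_ix_{i-1}(x_{i-1}-x_i)+P_k\overline{x^2}+\overline{P}_{k+1}\overline{x^2}\\
&\quad+P_k(-2x_k\bar{x}+\bar{x}^2)+\overline{P}_{k+1}(-2x_{k+1}\bar{x}+\bar{x}^2)\\
&=2\sum_{i=1}^{k-1}P_ix_{i+1}(x_{i+1}-x_i)+2\sum_{i=k+2}^{n}\overline{P}_ix_{i-1}(x_{i-1}-x_i)+\overline{x^2}\\
&\quad+P_k(-2x_k\bar{x}+\bar{x}^2)+\overline{P}_{k+1}(-2x_{k+1}\bar{x}+\bar{x}^2).
\end{aligned}
\]

Taking into account that $\mathbf{x}$ is increasing and

\[ P_i,\overline{P}_i\ge 0,\quad i=1,2,\dots,n,\qquad x_i\le\bar{x},\quad i=1,2,\dots,k,\qquad \bar{x}\le x_i,\quad i=k+1,\dots,n, \]

we obtain

\[ I_k\le 2\bar{x}\sum_{i=1}^{k-1}P_i(x_{i+1}-x_i)+2\bar{x}\sum_{i=k+2}^{n}\overline{P}_i(x_{i-1}-x_i)+\overline{x^2}-2P_kx_k\bar{x}-2\overline{P}_{k+1}x_{k+1}\bar{x}+\bar{x}^2. \]

Applying again (3.6) to $\mathbf{p}$ and $\mathbf{x}$, we get

\[ \sum_{i=1}^{k-1}P_i(x_{i+1}-x_i)+\sum_{i=k+2}^{n}\overline{P}_i(x_{i-1}-x_i)=P_k(x_k-\bar{x})+\overline{P}_{k+1}(x_{k+1}-\bar{x}), \]

hence

\[
\begin{aligned}
I_k&\le 2\bar{x}\bigl[P_k(x_k-\bar{x})+\overline{P}_{k+1}(x_{k+1}-\bar{x})\bigr]+\overline{x^2}-2P_kx_k\bar{x}-2\overline{P}_{k+1}x_{k+1}\bar{x}+\bar{x}^2\\
&=-2P_k\bar{x}^2-2\overline{P}_{k+1}\bar{x}^2+\bar{x}^2+\overline{x^2}\\
&=-2\bar{x}^2+\bar{x}^2+\overline{x^2}=\overline{x^2}-\bar{x}^2,
\end{aligned}
\]

or written differently

\[ I_k\le\sum_{i=1}^{n}p_ix_i^2-\bar{x}^2=\sum_{i=1}^{n}p_i(x_i-\bar{x})^2. \]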

 □
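Theorem 4 lends itself to a randomized sanity check. The sketch below is not from the paper: the helper names, the way random weights are drawn (partial sums uniform in $[0,1]$, which guarantees Steffensen's conditions), and the floating-point tolerance are all assumptions for illustration.

```python
import random
from itertools import accumulate

def abel_bound_sum(x, p, k, mean):
    """I_k from the proof of Theorem 4, for a 1-based k with mean in [x_k, x_{k+1}]."""
    n = len(x)
    P = list(accumulate(p))                      # P[i-1] = p_1 + ... + p_i
    Pbar = list(accumulate(reversed(p)))[::-1]   # Pbar[i-1] = p_i + ... + p_n
    total = sum(P[i - 1] * (x[i - 1] - x[i]) ** 2 for i in range(1, k))
    total += P[k - 1] * (x[k - 1] - mean) ** 2 + Pbar[k] * (x[k] - mean) ** 2
    total += sum(Pbar[i - 1] * (x[i - 1] - x[i - 2]) ** 2 for i in range(k + 2, n + 1))
    return total

def random_steffensen_weights(n, rng):
    """Weights whose partial sums P_1..P_{n-1} are uniform in [0, 1] and P_n = 1."""
    partial = [rng.random() for _ in range(n - 1)] + [1.0]
    return [partial[0]] + [partial[i] - partial[i - 1] for i in range(1, n)]

rng = random.Random(0)
smallest_slack = float("inf")
for _ in range(1000):
    n = rng.randint(2, 8)
    x = sorted(rng.uniform(-5.0, 5.0) for _ in range(n))   # increasing tuple
    p = random_steffensen_weights(n, rng)
    mean = sum(pi * xi for pi, xi in zip(p, x))
    # 1-based k with x_k <= mean <= x_{k+1}, clamped to {1, ..., n-1}.
    k = max(1, min(n - 1, sum(1 for xi in x if xi <= mean)))
    spread = sum(pi * (xi - mean) ** 2 for pi, xi in zip(p, x))
    smallest_slack = min(smallest_slack, spread - abel_bound_sum(x, p, k, mean))

print(smallest_slack >= -1e-9)   # I_k never exceeds the weighted spread, up to rounding
```

For $n=2$ the two sides coincide, so the slack is zero up to rounding; this is why the comparison uses a small negative tolerance rather than a strict sign test.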

We have just proven that the Jensen–Steffensen inequality for strongly convex functions behaves differently than the Jensen inequality for strongly convex functions: applying the same proof techniques, we end up with two different bounds, and surprisingly the indirect proof gives the better one.

Integral version

The integral version of the Jensen–Steffensen inequality for convex functions was proved by Boas in 1970 [2].

Theorem 5

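Let $x\colon[\alpha,\beta]\to(a,b)$ be a continuous and monotonic function, where $-\infty<\alpha<\beta<+\infty$ and $-\infty\le a<b\le+\infty$, and let $f\colon(a,b)\to\mathbb{R}$ be a convex function. If $\lambda\colon[\alpha,\beta]\to\mathbb{R}$ is either continuous or of bounded variation satisfying

\[ (\forall t\in[\alpha,\beta])\quad\lambda(\alpha)\le\lambda(t)\le\lambda(\beta),\qquad\lambda(\beta)-\lambda(\alpha)>0, \]

then

\[ f\left(\frac{\int_{\alpha}^{\beta}x(t)\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}\right)\le\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}. \]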

Since the indirect proof as in Theorem 2 produced a better bound, we will use the same technique to prove the integral version of the Jensen–Steffensen inequality for strongly convex functions.

Theorem 6

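Let $x\colon[\alpha,\beta]\to(a,b)$ be a continuous and monotonic function, where $-\infty<\alpha<\beta<+\infty$ and $-\infty\le a<b\le+\infty$, and let $f\colon(a,b)\to\mathbb{R}$ be a strongly convex function with modulus $c$. If $\lambda\colon[\alpha,\beta]\to\mathbb{R}$ is either continuous or of bounded variation satisfying

\[ (\forall t\in[\alpha,\beta])\quad\lambda(\alpha)\le\lambda(t)\le\lambda(\beta),\qquad\lambda(\beta)-\lambda(\alpha)>0, \]

then

\[ f(\mu)\le\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\,\frac{\int_{\alpha}^{\beta}(x(t)-\mu)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}, \]

where

\[ \mu=\frac{\int_{\alpha}^{\beta}x(t)\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}. \]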

Proof

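Using the convex representation $g=f-c(\cdot)^2$ as in Theorem 1 and applying the integral Jensen–Steffensen inequality for convex functions, we obtain

\[ g(\mu)=g\left(\frac{\int_{\alpha}^{\beta}x(t)\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}\right)\le\frac{\int_{\alpha}^{\beta}g(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}. \]

Going back to $f$ we get

\[ f(\mu)-c\mu^2\le\frac{\int_{\alpha}^{\beta}\bigl(f(x(t))-cx(t)^2\bigr)\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}=\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\,\frac{\int_{\alpha}^{\beta}x(t)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}, \]

or written differently

\[
\begin{aligned}
f(\mu)&\le\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\,\frac{\int_{\alpha}^{\beta}x(t)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}+c\mu^2\\
&=\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\left[\frac{\int_{\alpha}^{\beta}x(t)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-\mu^2\right]\\
&=\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\left[\frac{\int_{\alpha}^{\beta}x(t)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-2\mu^2+\mu^2\right]\\
&=\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\left[\frac{\int_{\alpha}^{\beta}x(t)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-2\mu\,\frac{\int_{\alpha}^{\beta}x(t)\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}+\mu^2\,\frac{\int_{\alpha}^{\beta}d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}\right]\\
&=\frac{\int_{\alpha}^{\beta}f(x(t))\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}-c\,\frac{\int_{\alpha}^{\beta}(x(t)-\mu)^2\,d\lambda(t)}{\int_{\alpha}^{\beta}d\lambda(t)}.
\end{aligned}
\]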

 □
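A rough numerical illustration of Theorem 6 can be obtained by replacing the Stieltjes integrals with right-endpoint Riemann–Stieltjes sums. Everything specific in the sketch below is an assumption chosen for illustration and does not come from the paper: the helper `stieltjes_sum`, the non-monotonic integrator $\lambda(t)=t+0.3\sin(2\pi t)$ on $[0,1]$ (which satisfies $\lambda(0)\le\lambda(t)\le\lambda(1)$), the function $x(t)=t$, and $f(u)=u^2+e^u$ with modulus $c=1$.

```python
import math

def stieltjes_sum(h, x, lam, alpha, beta, steps=2000):
    """Right-endpoint Riemann-Stieltjes sum approximating the integral of h(x(t)) d lam(t)."""
    total = 0.0
    prev = lam(alpha)
    for j in range(1, steps + 1):
        t = alpha + (beta - alpha) * j / steps
        cur = lam(t)
        total += h(x(t)) * (cur - prev)
        prev = cur
    return total

def f(u):
    # f(u) - 1 * u^2 = exp(u) is convex, so f is strongly convex with modulus 1.
    return u * u + math.exp(u)

def lam(t):
    # Not monotonic, but lam(0) <= lam(t) <= lam(1) on [0, 1].
    return t + 0.3 * math.sin(2.0 * math.pi * t)

def x(t):
    return t  # continuous and increasing

alpha, beta, c = 0.0, 1.0, 1.0
mass = lam(beta) - lam(alpha)
mu = stieltjes_sum(lambda u: u, x, lam, alpha, beta) / mass
mean_f = stieltjes_sum(f, x, lam, alpha, beta) / mass
spread = stieltjes_sum(lambda u: (u - mu) ** 2, x, lam, alpha, beta) / mass

gap = mean_f - c * spread - f(mu)
print(gap > 0)   # the bound of Theorem 6 holds for this discretized instance
```

The discretized sums are themselves an instance of Theorem 2, since the normalized increments of $\lambda$ have partial sums in $[0,1]$ and the sampled values of $x$ are monotonic, so the sign of `gap` does not depend on the accuracy of the quadrature.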

Availability of data and materials

Not applicable.

Authors’ contributions

The author read and approved the final manuscript.

Funding

University of Split, Faculty of Science, Split, Croatia.

Competing interests

The author declares that there are no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Abramovich S., Klaričić Bakula M., Matić M., Pečarić J. A variant of Jensen–Steffensen’s inequality and quasi-arithmetic means. J. Math. Anal. Appl. 2005;307(1):370–386. doi: 10.1016/j.jmaa.2004.10.027.
  • 2. Boas R.P., Jr. The Jensen–Steffensen inequality. Publ. Elektroteh. Fak. Univ. Beogr., Ser. Mat. Fiz. 1970;302–319:1–8.
  • 3. Hiriart-Urruty J.-B., Lemaréchal C. Fundamentals of Convex Analysis. Abridged Version of Convex Analysis and Minimization Algorithms I and II. Berlin: Springer; 2001.
  • 4. Merentes N., Nikodem K. Remarks on strongly convex functions. Aequ. Math. 2010;80(1–2):193–199. doi: 10.1007/s00010-010-0043-0.
  • 5. Nikodem K. Handbook of Functional Equations. New York: Springer; 2014. On strongly convex functions and related classes of functions; pp. 365–405.
  • 6. Nikodem K., Páles Z. Characterizations of inner product spaces by strongly convex functions. Banach J. Math. Anal. 2011;5(1):83–87. doi: 10.15352/bjma/1313362982.
  • 7. Pečarić J.E., Proschan F., Tong Y.L. Convex Functions, Partial Orderings, and Statistical Applications. Boston: Academic Press; 1992.
  • 8. Polyak B.T. Existence theorems and convergence of minimizing sequences in extremum problems with restrictions. Sov. Math. Dokl. 1966;7:72–75.
  • 9. Steffensen J.F. On certain inequalities and methods of approximation. J. Inst. Actuar. 1919;51:274–297.

Articles from Journal of Inequalities and Applications are provided here courtesy of Springer
