Proc Natl Acad Sci USA. 2014 Oct 20;111(44):15681–15686. doi: 10.1073/pnas.1412216111

Approximation of the expected value of the harmonic mean and some applications

Calyampudi Radhakrishna Rao a,b,1, Xiaoping Shi c, Yuehua Wu c
PMCID: PMC4226087  PMID: 25331886

Significance

The harmonic mean (HM) filter is better at removing positive outliers than the arithmetic mean (AM) filter. Accurate evaluation of the expected HM raises especially difficult issues in applications such as image denoising and marginal likelihood evaluation. A major challenge is to develop a higher-order approximation of the expected HM when the central limit theorem is not applicable. A two-term approximation of the expected HM is derived in this paper. This approximation enables us to develop a new filtering procedure that denoises a noisy image with improved performance, and to construct a truncated HM estimator with a faster convergence rate in marginal likelihood evaluation.

Keywords: harmonic mean, second-order approximation, arithmetic mean, image denoising, marginal likelihood

Abstract

Although the harmonic mean (HM) is mentioned in textbooks, along with the arithmetic mean (AM) and the geometric mean (GM), as one of three possible ways of summarizing the information in a set of observations, textbooks say little about when the HM is the appropriate summary in statistical applications. During the last 10 y, a number of papers have been published describing statistical applications in which the HM is appropriate and performs better than the AM. In the present paper some additional applications of the HM are considered. The key problem is to find a good approximation to $E(H_n)$, the expectation of the harmonic mean of $n$ observations from a probability distribution. In this paper a second-order approximation to $E(H_n)$ is derived and applied to a number of problems.


The harmonic mean $H_n$ of $n$ observations $Z_1,\ldots,Z_n$ drawn from a population is defined by

$$H_n=\frac{n}{\sum_{i=1}^{n}1/Z_i}. \qquad [1]$$

There have been a number of applications of the harmonic mean in recent papers. A more general version of $H_n$, with weights $w_1,\ldots,w_n$, is

$$H_n(w)=\frac{\sum_{i=1}^{n}w_i}{\sum_{i=1}^{n}w_i/Z_i}, \qquad [2]$$

where $w=(w_1,\ldots,w_n)^{T}$. The harmonic mean $H_n$ is used to provide the average rate in physics and to measure the price ratio in finance as well as the program execution rate in computer engineering. Some statistical applications of the harmonic mean are given in refs. 1–4, among others. $H_n(w)$ has been used in the evaluation of the portfolio price-to-earnings ratio (ref. 5, p. 339) and the signal-to-interference-and-noise ratio (6), among others. The asymptotic properties of $H_n$, including the asymptotic expansion of $E(H_n)$, are investigated in refs. 7 and 8 by assuming either that some moments of $1/Z_i$ are finite or that the $Z_i$'s follow the Poisson distribution. It is noted that recent papers (9, 10) enable one to use the saddle-point approximation to give the asymptotic expansion of $E(H_n)$ to any given order of $1/n$ for some constants $c_0,c_1,c_2,\ldots$, i.e.,

$$E(H_n)=c_0+\frac{c_1}{n}+\frac{c_2}{n^2}+\cdots. \qquad [3]$$

However, such methods are not applicable for obtaining the asymptotic expansion of $E(H_n)$ when the first moment of $1/Z_i$ is infinite. In ref. 3, the $Z_i$'s are assumed to follow the uniform distribution on the interval $(0,1)$, i.e., $U(0,1)$, motivated by learning theory. Using the property that the inverse of $H_n$ converges to a stable law, ref. 3 showed that

$$E(H_n)\sim\frac{1}{\log(n)}, \qquad [4]$$

where the symbol “∼” means asymptotic equivalence as $n\to\infty$. Our interest in this paper is in determining the second term in the asymptotic expansion of $E(H_n)$, or of the more general version $E(H_n(w))$, under more general assumptions on the distribution of the $Z_i$'s. We show that, under mild assumptions,

$$E(H_n)\sim\frac{1}{\log(n)}\left\{1+\frac{c_1}{\sqrt{\log(n)}}\right\}, \qquad [5]$$

where the constant $c_1$ will be given explicitly. In addition, we apply the approach used to obtain [5] to the case in which the first moment of $1/Z_i$ is finite, motivated by the evaluation of the marginal likelihood in ref. 11.
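Before proceeding, the two definitions above are easy to compute directly. The following Python sketch (our illustration, not part of the paper; the sample size and seed are arbitrary) evaluates [1] and [2] and shows how far the HM falls below the AM for $U(0,1)$ draws.

```python
# Numerical sketch of definitions [1] and [2]; illustrative only.
import numpy as np

def harmonic_mean(z):
    """H_n = n / sum(1/Z_i), as in [1]."""
    z = np.asarray(z, dtype=float)
    return len(z) / np.sum(1.0 / z)

def weighted_harmonic_mean(z, w):
    """H_n(w) = sum(w_i) / sum(w_i / Z_i), as in [2]."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    return np.sum(w) / np.sum(w / z)

rng = np.random.default_rng(0)          # arbitrary seed
z = rng.uniform(0.0, 1.0, size=1000)    # Z_i ~ U(0,1)
print(harmonic_mean(z), z.mean())       # HM falls far below AM for U(0,1) draws
print(weighted_harmonic_mean(z, np.ones_like(z)))  # equal weights recover [1]
```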

Approximations

We derive the asymptotic approximation of $E(H_n)$ when the first moment of $1/Z_i$ is not finite. Let $\{Z_i\}$ be a sequence of independent and identically distributed (i.i.d.) random variables such that the first moment of $1/Z_i$ may be infinite. Suppose that there exist constants $A_n$ and $B_n$ such that the distribution $F_n(x)$ of

$$X_n=\frac{1/Z_1+1/Z_2+\cdots+1/Z_n}{B_n}-A_n \qquad [6]$$

converges weakly to a nondegenerate distribution F(x) such that

$$F(x)=\frac{d_1+o(1)}{|x|^{\alpha}}\quad\text{as}\ x\to-\infty, \qquad [7]$$
$$1-F(x)=\frac{d_2+o(1)}{|x|^{\alpha}}\quad\text{as}\ x\to\infty, \qquad [8]$$

where $\alpha$, $d_1$, and $d_2$ are constants with $0<\alpha<2$, $d_1,d_2\geq 0$, and $d_1+d_2>0$. The set of all distributions whose normalized sums converge to $F(x)$ is called the domain of attraction of $F(x)$. It is known that only stable laws with index $\alpha$ $(0<\alpha<2)$ have nonempty domains of attraction, as shown in refs. 12 (chap. 7) and 13 (chap. 2).

Assume that there is a positive constant $d_3$, which does not depend on $n$, such that

$$X_n+A_n\geq d_3>0. \qquad [9]$$

We further assume a uniform rate of convergence of Fn(x) to F(x) such that

$$\sup_x|F_n(x)-F(x)|=o(n^{-\beta}), \qquad [10]$$

for some positive constant $\beta<1$. Our assumptions are mild: ref. 14 showed that $\sup_x|F_n(x)-F(x)|$ has the rate $o\{n^{-1}\log(n)\}$ under some regularity assumptions.
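To make conditions [6]–[8] concrete, the following Python sketch (illustrative; the sample sizes, seed, and cutoffs are arbitrary choices of ours) simulates the normalized sum $X_n$ for $U(0,1)$ data with $A_n=\log(n)$ and $B_n=n$, and checks that its right tail decays roughly like $1/x$, as [8] with $\alpha=1$ and $d_2=1$ suggests, while the left tail is thin ($d_1=0$).

```python
# Sketch: X_n = (sum of 1/Z_i)/B_n - A_n for Z_i ~ U(0,1), A_n = log(n), B_n = n.
import numpy as np

rng = np.random.default_rng(1)         # arbitrary seed
n, reps = 1000, 10_000                 # arbitrary simulation sizes
z = rng.uniform(size=(reps, n))
x = (1.0 / z).sum(axis=1) / n - np.log(n)
for t in (5.0, 10.0, 20.0):            # right tail: roughly d2/t with d2 = 1
    print(t, (x > t).mean(), 1.0 / t)
print((x < -5.0).mean())               # left tail: essentially empty (d1 = 0)
```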

We have the following asymptotic approximation of $E(H_n)$:

Theorem 1. Assume that conditions [7]–[10] are satisfied and $A_n=\log(n)$, $B_n=n$, $\alpha=1$, $d_1=0$, and $d_2=1$. Then we have the following two-term approximation:

$$E(H_n)=E(X_n+\log n)^{-1}=\ell_n^{-2}-\ell_n^{-3}+o(\ell_n^{-3}), \qquad [11]$$

where $\ell_n=\sqrt{\log(n)}$.

The proof is given in Appendix: Proof of Theorem 1. Because $n^{-\beta}$ in [10] is of smaller order than the terms retained in [11], the coefficients of both $\ell_n^{-2}$ and $\ell_n^{-3}$ in [11] do not depend on $\beta$.
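As a quick numerical illustration of Theorem 1 (ours, not the paper's; the replication counts and seed are arbitrary), the sketch below compares the two-term approximation $\ell_n^{-2}-\ell_n^{-3}$ with a Monte Carlo estimate of $E(H_n)$ for $U(0,1)$ observations. Convergence in $\ell_n=\sqrt{\log(n)}$ is very slow, so the two values should be read as agreeing in order of magnitude rather than closely at small $n$.

```python
# Sketch: Theorem 1's two-term approximation vs. Monte Carlo E(H_n), Z_i ~ U(0,1).
import numpy as np

rng = np.random.default_rng(2)             # arbitrary seed
for n in (10, 50, 200):
    z = rng.uniform(size=(50_000, n))      # 50,000 replications (arbitrary)
    hm = n / (1.0 / z).sum(axis=1)
    ln = np.sqrt(np.log(n))                # l_n as defined in Theorem 1
    print(n, hm.mean(), ln**-2 - ln**-3)
```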

Remark 1: For an extension of Theorem 1 to the weighted harmonic mean in [2], we consider the following normalized partial sum:

$$X_n(w)=\frac{w_1/Z_1+w_2/Z_2+\cdots+w_n/Z_n}{W_n}-A_n, \qquad [12]$$

where $W_n=(\sum_{i=1}^{n}|w_i|^{\alpha})^{1/\alpha}$. Motivated by ref. 15, we may assume the following two conditions: the weights $w_i$ satisfy

$$\max_{1\leq i\leq n}|w_i|=o(W_n), \qquad [13]$$

and the characteristic function $\phi(t)$ of $1/Z_i$ in [6] satisfies

$$\phi(t)=1-c|t|^{\alpha}+o(|t|^{\alpha})\quad\text{as}\ t\to 0, \qquad [14]$$

for some constant $c>0$.

Under conditions [13] and [14], ref. 15 showed that the distribution of $X_n(w)$ converges to a stable distribution with characteristic function $\exp(-c|t|^{\alpha})$. For example, if the $Z_i$'s follow the uniform distribution $U(0,1)$, condition [14] is satisfied with $A_n=\log n$ and $\alpha=1$. Following the proof of Theorem 1, it can be shown that

$$E\{H_n(w)\}=\ell_n^{-2}-\ell_n^{-3}+o(\ell_n^{-3}), \qquad [15]$$

where $\ell_n=\sqrt{\log(n)}$. It is noted that the weights in [2] do not have to be nonnegative, but they must satisfy both conditions [9] and [13].

By Theorem 1, $c_1$ in [5] has the value $-1$. It is noted that Theorem 1 holds true if the $Z_i$'s follow the uniform distribution $U(0,1)$.

A higher-order approximation may be obtained similarly, but extra conditions on $F_n(x)$ beyond [7] and [8] may be needed. In view of the proof of Theorem 1 given in Appendix: Proof of Theorem 1, the next higher-order term should be of the form $\log[\log(n)]\,\ell_n^{-4}$. Because it is difficult to obtain the coefficient of this term theoretically, it may be determined empirically. As a demonstration, we consider the case in which the $Z_i$'s follow the uniform distribution $U(0,1)$. We perform a Monte Carlo simulation with 1,000,000 replications of $n$ independent observations from the standard uniform distribution $U(0,1)$ for each value of $n=10,15,20,\ldots,200$. The coefficient of $\log[\log(n)]\,\ell_n^{-4}$ is estimated to be 0.5673 by fitting the simulated data by least squares to the following model:

$$\log(n)H_n-1+\frac{1}{\sqrt{\log(n)}}=\beta\,\frac{\log[\log(n)]}{\log(n)}.$$

Thus, we obtain the following approximation:

$$E(H_n)\approx\ell_n^{-2}-\ell_n^{-3}+0.5673\,\log[\log(n)]\,\ell_n^{-4}. \qquad [16]$$
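The empirical fit just described is easy to reproduce in outline. The sketch below (ours; it uses far fewer replications than the paper's 1,000,000, so the fitted coefficient will only be near 0.5673) regresses the residual of the two-term approximation on $\log[\log(n)]/\log(n)$ by least squares through the origin.

```python
# Sketch of the least-squares fit behind [16]; illustrative replication counts.
import numpy as np

rng = np.random.default_rng(3)                 # arbitrary seed
ns = np.arange(10, 201, 5)
xs, ys = [], []
for n in ns:
    z = rng.uniform(size=(20_000, n))          # paper uses 1,000,000 replications
    hn_bar = (n / (1.0 / z).sum(axis=1)).mean()
    ys.append(np.log(n) * hn_bar - 1.0 + 1.0 / np.sqrt(np.log(n)))  # model LHS
    xs.append(np.log(np.log(n)) / np.log(n))                        # regressor
xs, ys = np.array(xs), np.array(ys)
beta = xs.dot(ys) / xs.dot(xs)                 # least squares through the origin
print(beta)                                    # should land near 0.5673
```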

As in ref. 3, suppose that the $Z_i$'s follow the uniform distribution $U(0,1)$. The distribution of $Y_i=1/Z_i$ is easily seen to be given by

$$P(Y\leq t)=(1-1/t)I(t\geq 1),$$

where $I(\cdot)$ is an indicator function. It is well known that the mean of $Y_i$ is infinite but that $E(Y_i^r)<\infty$ for $r<1$. By considering the limiting stable distribution, with index $\alpha=1$, of the distribution of $X_n$ for $A_n=\log(n)$ and $B_n=n$, ref. 3 obtained the result [4], which is

$$E\{[\log(n)]H_n\}\sim 1. \qquad [17]$$

According to our Theorem 1 and the approximation [16],

$$E\{[\log(n)]H_n\}\approx 1-\frac{1}{\sqrt{\log(n)}}, \qquad [18]$$
$$E\{[\log(n)]H_n\}\approx 1-\frac{1}{\sqrt{\log(n)}}+0.5673\,\frac{\log[\log(n)]}{\log(n)}. \qquad [19]$$

Fig. 1 displays the approximations given in [17]–[19], compared with the sample mean of $[\log(n)]H_n$ over 1,000,000 replications of $n$ independent observations from the uniform distribution $U(0,1)$, which serves as a proxy for the exact value of $E\{[\log(n)]H_n\}$. Here $n$ takes the values $10,15,20,\ldots,200$. From Fig. 1, it can be seen that approximation [18] is better than approximation [17]. Although approximation [19] is purely empirical, it basically achieves the desired result, as shown in Fig. 1; it clearly gives a much better approximation of $E\{[\log(n)]H_n\}$ than its two counterparts.

Fig. 1.

Comparison of the three approximations of $E\{[\log(n)]H_n\}$ with the sample mean of $[\log(n)]H_n$ over 1,000,000 replications of $n$ independent observations from $U(0,1)$, for $n=10,15,20,\ldots,200$. (i) “L-M” denotes the approximation of $E\{[\log(n)]H_n\}$ by [17] less the sample mean. (ii) “F-M” denotes the approximation of $E\{[\log(n)]H_n\}$ by [18] less the sample mean. (iii) “S-M” denotes the approximation of $E\{[\log(n)]H_n\}$ by [19] less the sample mean.

We now consider the case $\alpha>1$. In this case, $B_n=n^{1/\alpha}$ and $A_n=E(1/Z_1)n^{1-1/\alpha}$. Thus, we have

$$H_n=\frac{n^{1-1/\alpha}}{X_n+n^{1-1/\alpha}E(1/Z_1)}. \qquad [20]$$

In light of the proof of Theorem 1, we have the following asymptotic approximation of $E(H_n)$:

Theorem 2. Assume that the conditions in [7]–[10] are satisfied and $A_n=E(1/Z_1)n^{1-1/\alpha}$, $B_n=n^{1/\alpha}$, $\alpha>1$, $d_1=0$, and $d_2=1$; then we have the following approximation:

$$E(H_n)=n^{1-1/\alpha}\{\ell_n^{-2}+\ell_n^{-3}+o(\ell_n^{-3})\}, \qquad [21]$$

where $\ell_n=\sqrt{A_n}$.

Remark 2: A result similar to Theorem 2 can be obtained for the weighted harmonic mean in [2] by assuming that conditions [13] and [14] are satisfied with $\alpha>1$ and $A_n=E(1/Z_1)\sum_{i=1}^{n}w_i/W_n$. It can be shown that

$$E\{H_n(w)\}=\{\ell_n^{-2}+\ell_n^{-3}+o(\ell_n^{-3})\}\sum_{i=1}^{n}w_i/W_n, \qquad [22]$$

where $\ell_n=\sqrt{A_n}$.

Some Applications

We present two applications that involve the use of the approximation of $E(H_n)$.

Image Denoising.

Image denoising is very important in image processing, and the image-processing literature contains many denoising methods. We are interested in local filters, such as the arithmetic mean and harmonic mean filters, which have been used in image denoising. The harmonic mean filter is better than the arithmetic mean filter at removing positive outliers and preserving edge features. However, both filters fail when the image is contaminated by uniform noise. Because the two means differ by different amounts on different segments, we use the ratio of the harmonic mean to the arithmetic mean (defined in [23]) as a local filter and select the corresponding threshold for the ratio using the improved approximation [16] together with a saddle-point approximation. This application shows how such a local filter can improve the performance of image denoising. The details are given below.

For demonstration, we consider a 250 × 250 test image (Fig. 2A) containing a disk, a hand, a human body, a ring, a sunflower, and a triangle, as shown in figure 2 of ref. 16. We contaminate the image with uniform noise; the result is displayed in Fig. 2B. The usual harmonic mean filter in image denoising replaces the value of each pixel with the harmonic mean of the values of the pixels in a surrounding region. For each pixel we consider a square of 9 pixels with that pixel at its center. Here the variable $Z_i$ represents the pixel value, taking values 0 (black), $1/255,\ldots,255/255$ (white) in this 256-level grayscale image, and the sample size is 9. Pixels on the border of the 250 × 250 image, such as those in the first or last row and column, are not surrounded by a full square; we therefore replicate them into the neighboring area, so that the padded image becomes 252 × 252. Note that this handling is only for convenience of filtering; the added pixels are not analyzed. From Fig. 2 C and D, it can be seen that even though the harmonic mean filter outperforms the arithmetic mean filter, both filters fail to denoise the noisy image in Fig. 2B. However, we can first use the ratio of the harmonic mean to the arithmetic mean, jointly with a threshold $\theta$, to transform the pixel $Z_{i,j}$ at location $(i,j)$ as follows:

$$\tilde Z_{i,j}=\begin{cases}1,&\text{if}\ H_{i,j}/A_{i,j}\geq\theta,\\ 0,&\text{otherwise},\end{cases} \qquad [23]$$

where $H_{i,j}$ and $A_{i,j}$ are, respectively, the harmonic mean and the arithmetic mean of the 9 pixels centered at $Z_{i,j}$. We then apply the arithmetic or harmonic mean filter to the transformed pixels $\{\tilde Z_{i,j}\}$ to denoise the image. From Fig. 2 E and F, it can be seen that both resulting images look much better than those in Fig. 2 C and D. The image in Fig. 2F (obtained with the harmonic mean filter) looks almost the same as the original noise-free image.
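The following Python sketch illustrates the procedure on a toy image. It is our reconstruction under stated assumptions (a 3 × 3 window, replicated borders, $\theta=0.85$, and the thresholding direction in [23] as written above), not the authors' code.

```python
# Sketch of the ratio filter [23] on a synthetic grayscale image.
import numpy as np

def local_means(img):
    """Return 3x3 harmonic and arithmetic means at every pixel."""
    p = np.pad(img, 1, mode='edge')     # replicate border pixels, as in the text
    eps = 1.0 / 255.0                   # guard against zero pixels in the HM
    win = np.lib.stride_tricks.sliding_window_view(p, (3, 3))
    am = win.mean(axis=(2, 3))
    hm = 9.0 / (1.0 / np.maximum(win, eps)).sum(axis=(2, 3))
    return hm, am

def ratio_filter(img, theta=0.85):
    """Binarize via [23]: 1 where HM/AM >= theta, 0 otherwise."""
    hm, am = local_means(img)
    return (hm / am >= theta).astype(float)

rng = np.random.default_rng(4)                            # arbitrary seed
clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0     # one white square
noisy = np.clip(clean + rng.uniform(size=clean.shape), 0.0, 1.0)
binary = ratio_filter(noisy)                              # transformed pixels [23]
hm_filtered, _ = local_means(binary)                      # HM filter, as in Fig. 2F
print(np.abs(hm_filtered.round() - clean).mean())         # misclassified fraction
```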

Fig. 2.

(A) Original noise-free image. (B) Image obtained by adding $U(0,1)$ noise to each pixel value of the image in A. (C) Image obtained by denoising the noisy image B using the arithmetic mean filter. (D) Image obtained by denoising the noisy image B using the harmonic mean filter. (E) The arithmetic mean filtered image of $\{\tilde Z_{i,j}\}$ (see [23]). (F) The harmonic mean filtered image of $\{\tilde Z_{i,j}\}$.

We note that the ratio of the harmonic mean to the arithmetic mean is used only to assign 1 or 0 according to the threshold $\theta$ in [23], and $\theta$ is determined by the asymptotic behavior of the ratio of the two expected values. How to select the threshold $\theta$ is important in practice. To demonstrate the selection of $\theta$, we consider two cases of uniform distributions with sample size $n$: (i) $Z_i\sim U(0,1)$; (ii) $Z_i\sim U(0.2,0.8)$. Let $H_n$ and $A_n$ be, respectively, the harmonic mean and the arithmetic mean of such a sample. An approximation to $H_n/A_n$ is the ratio of their means, $E(H_n)/E(A_n)$, as in ref. 9. For case (i), $E(H_n)$ can be approximated by [16], an improvement over the result of Theorem 1. For case (ii), $1/Z_{i,j}$ has moments of all orders; hence the saddle-point approximation [3.12] in ref. 10 can be applied, and $E(H_n)$ can be approximated by the three terms of that expansion. Fig. 3 displays the approximate ratios $E(H_n)/E(A_n)$ for $n=5,6,\ldots,20$ in both cases. It can be seen that the approximation for case (ii) is larger than that for case (i). Based on this figure, a practical recommendation for the threshold is $\theta=0.85$, which is the value used to obtain the images displayed in Fig. 2 E and F.
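A crude way to see where $\theta=0.85$ falls between the two cases, without the analytic expansions, is direct simulation of $E(H_n)/E(A_n)$ for the 3 × 3 window ($n=9$). The sketch below does this (our illustration; the seed and replication count are arbitrary).

```python
# Sketch: simulate E(H_n)/E(A_n) for cases (i) U(0,1) and (ii) U(0.2,0.8), n = 9.
import numpy as np

rng = np.random.default_rng(5)      # arbitrary seed
n, reps = 9, 200_000                # 3x3 window; arbitrary replication count
for low, high, label in ((0.0, 1.0, "R1"), (0.2, 0.8, "R2")):
    z = rng.uniform(low, high, size=(reps, n))
    hm = n / (1.0 / np.maximum(z, 1e-12)).sum(axis=1)  # guard exact zeros
    am = z.mean(axis=1)
    print(label, hm.mean() / am.mean())  # R1 falls below 0.85, R2 above
```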

Fig. 3.

Ratios $E(H_n)/E(A_n)$ for $n=5,6,\ldots,20$ in both cases. “R1” denotes the ratio for case (i), whereas “R2” stands for the ratio for case (ii). The dotted line is at 0.85.

Evaluating Marginal Likelihood.

It is of importance to calculate the marginal likelihood in the process of likelihood maximization. Let $\pi(\theta|x)=f(x|\theta)\pi_0(\theta)/f_m(x)$ be the posterior density for the prior $\pi_0(\theta)$, which implies that $[f_m(x)]^{-1}=E_{\pi}\{[f(x|\theta)]^{-1}\}$. Ref. 11 proposed the harmonic mean estimator of the marginal likelihood $f_m(x)$ obtained by letting $Z_i=f(x|\theta_i)$ in [1], where the $\theta_i$'s are i.i.d. draws from the posterior distribution. Ref. 11 noted that $1/Z_i$ can have infinite variance, in which case the central limit theorem is not applicable to the partial sums. Later, ref. 17 showed that in typical applications $[f(x|\theta_i)]^{-1}$ may lie in the domain of attraction of a one-sided $\alpha$-stable law with index $\alpha\in(1,2]$. If the sample information exceeds the prior information in an application, the limit law for the harmonic mean estimator is stable with index $\alpha$ close to 1, and the convergence is very slow, at rate $n^{1/\alpha-1}$. In the following, we demonstrate via one of their examples that if the $\{1/Z_i\}$ are properly right-truncated, a good approximation can be constructed that converges to the expected harmonic mean of the right-truncated $\{1/Z_i\}$, which in turn converges to the marginal likelihood.

Suppose we want to evaluate the marginal likelihood $f(\bar X)$ based on $X_1,\ldots,X_r$, independent normally distributed $N(\theta,1)$ variables with mean $\theta$ and variance 1, for a sample $\{X_i\}$ of size $r=10$ with sample mean $\bar X$. Set the prior distribution $\theta\sim N(0,1)$. The exact marginal likelihood for $r=10$ is available analytically, $f(\bar X)=(2.2\pi)^{-1/2}e^{-\bar X^2/2.2}$. Our aim is to estimate the marginal likelihood $f(\bar X)$, where $P(\bar X=0)=0$. The harmonic mean estimate of the marginal likelihood is $H_n=n/[\sum_{i=1}^{n}1/Z_i]$, where $1/Z_i=\sqrt{\pi/5}\,e^{5(\theta_i-\bar X)^2}$ for i.i.d. draws $\theta_i$ from the posterior distribution $N(10\bar X/11,1/11)$. Ref. 17 showed that the convergence rate of $H_n$ to the marginal likelihood $f(\bar X)$ is slow because $\alpha=1.1$, and the harmonic mean estimator behaves badly (Fig. 4). As described above, in light of the truncation method used in refs. 18 and 19, we consider the right-truncated variable $(1/Z_i)I(1/Z_i<n^{\delta})$, where $I(\cdot)$ is an indicator function and $\delta$ is a positive constant. Let

$$\tilde H_n=\frac{n}{\sum_{i=1}^{n}(1/Z_i)I(1/Z_i<n^{\delta})}. \qquad [24]$$

By Theorem 2, it follows that

$$E(H_n)\approx E(\tilde H_n)\approx f(\bar X)+f^{3/2}(\bar X)/n^{(1-1/\alpha)/2}. \qquad [25]$$
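The estimators $H_n$ and $\tilde H_n$ for this example are easy to code. The sketch below is our illustration under stated assumptions ($\bar X$ is fixed at 1.0, the seed is arbitrary, and $\delta=1.5$ as in Fig. 4); it draws from the posterior, forms $1/Z_i$, and compares the plain and truncated estimators with the exact marginal likelihood.

```python
# Sketch of the truncated harmonic mean estimator [24] for the normal example.
import numpy as np

rng = np.random.default_rng(6)                    # arbitrary seed
xbar, delta, n = 1.0, 1.5, 300                    # illustrative settings
f_exact = np.exp(-xbar**2 / 2.2) / np.sqrt(2.2 * np.pi)       # exact f(xbar)
theta = rng.normal(10 * xbar / 11, np.sqrt(1 / 11), size=n)   # posterior draws
inv_z = np.sqrt(np.pi / 5) * np.exp(5 * (theta - xbar) ** 2)  # 1/Z_i

hn = n / inv_z.sum()                              # plain estimator [1]
keep = inv_z < n ** delta                         # right truncation at n^delta
hn_tilde = n / inv_z[keep].sum()                  # truncated estimator [24]
print(f_exact, hn, hn_tilde)
```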

Fig. 4.

Comparison of four approximations of the marginal likelihood for $n=10,20,30,\ldots,300$. (i) “M” denotes the sample mean of $H_n$ in [1] over 100,000 replications of $n$ independent draws from the posterior distribution. (ii) “T” denotes the sample mean of $\tilde H_n$ in [24] (with $\delta=1.5$) over 100,000 replications of $n$ independent draws from the posterior distribution. (iii) “L” denotes the one-term approximation $f(\bar X)$ in [25]. (iv) “F” denotes the two-term approximation $f(\bar X)+f^{3/2}(\bar X)/n^{(1-1/\alpha)/2}$ in [25].

As displayed in Fig. 4, the convergence rate of $H_n$ is very slow, as described in ref. 17; the main reason is that the value of $\alpha$ is close to 1. From Fig. 4, it can also be seen that $\tilde H_n$ given in [24] converges much faster to the two-term approximation in [25]. It is noted that this two-term approximation converges to the marginal likelihood $f(\bar X)$. Thus, $\tilde H_n$ may be used as an approximation of the marginal likelihood.

Similar results are obtained for other values of $\delta$: when $\delta$ is larger than 1.5 (e.g., $\delta=2$), the convergence is faster but less accurate, and when $\delta$ is smaller (e.g., $\delta=1$), the convergence is slower.

Appendix: Proof of Theorem 1

We prove the case $\alpha=1$, which implies that $A_n=\log(n)$ and $B_n=n$ in [6], and that the distribution of $X_n$ converges to the stable distribution $F(x)$ with index $\alpha=1$ satisfying [7] and [8] with $d_1=0$ and $d_2=1$. Denote $\ell_n=\sqrt{\log(n)}$.

$$E(H_n)=\int_{-\infty}^{-\ell_n}\frac{dF_n(x)}{x+\log(n)}+\int_{-\ell_n}^{\ell_n}\frac{dF_n(x)}{x+\log(n)}+\int_{\ell_n}^{\infty}\frac{dF_n(x)}{x+\log(n)}=I_{1n}+I_{2n}+I_{3n}.$$

Integrating by parts, we have

$$I_{1n}=\frac{F_n(-\ell_n)}{-\ell_n+\log(n)}+\int_{-\infty}^{-\ell_n}\frac{F_n(x)}{\{x+\log(n)\}^{2}}\,dx=I_{1n,1}+I_{1n,2}.$$

By [7] and [10], $I_{1n,1}=o(\ell_n^{-3})$. We now show that $I_{1n,2}=o(\ell_n^{-3})$. By [9],

$$I_{1n,2}=\int_{d_3}^{\ell_n^{2}/2}\frac{F_n(x-\log(n))}{x^{2}}\,dx+\int_{-\ell_n^{2}/2}^{-\ell_n}\frac{F_n(x)}{\{x+\log(n)\}^{2}}\,dx=I_{1n,2,1}+I_{1n,2,2}.$$

Because $I_{1n,2,1}\leq F_n(-\log(n)/2)/d_3\to 0$, by applying l'Hôpital's rule,

$$\lim_{n\to\infty}\frac{I_{1n,2,1}}{1/\log^{2}(n)}=\lim_{n\to\infty}\log(n)F_n(-\log(n)/2)=o(1),$$

and then $I_{1n,2,1}=o(\ell_n^{-4})$. For the other part, $I_{1n,2,2}\leq F_n(-\ell_n)\,O(\log^{-1}(n))=o(\ell_n^{-3})$. So, $I_{1n}=o(\ell_n^{-3})$.

Using a Taylor expansion, we have

$$I_{2n}=\frac{1}{\log(n)}\int_{-\ell_n}^{\ell_n}dF_n(x)-\frac{1}{\log(n)}\int_{-\ell_n}^{\ell_n}\frac{x}{\log(n)}\,dF_n(x)+\frac{1}{\log(n)}\int_{-\ell_n}^{\ell_n}O\!\left(\frac{x^{2}}{\log^{2}(n)}\right)dF_n(x)=I_{2n,1}+I_{2n,2}+o(\ell_n^{-4}).$$

By [8] and [10]

$$I_{2n,1}=\frac{F_n(\ell_n)-F_n(-\ell_n)}{\log(n)}=\frac{1-\{1-F_n(\ell_n)\}-F_n(-\ell_n)}{\log(n)}=\frac{1-\ell_n^{-1}+o(\ell_n^{-1})}{\log(n)},$$
$$I_{2n,2}=-\frac{\ell_nF_n(\ell_n)+\ell_nF_n(-\ell_n)}{\log^{2}(n)}+\int_{-\ell_n}^{\ell_n}\frac{F_n(x)}{\log^{2}(n)}\,dx=-\frac{\ell_n-1}{\log^{2}(n)}+\int_{-\ell_n}^{\ell_n}\frac{F(x)}{\log^{2}(n)}\,dx+o(\ell_n^{-4}).$$

By applying l’Hôpital’s rule, we obtain

$$\lim_{n\to\infty}\frac{\int_{-\ell_n}^{\ell_n}[F(x)/\log(n)]\,dx}{\ell_n^{-1}}=\lim_{n\to\infty}\int_{-1}^{1}F(t\ell_n)\,dt=1,$$

which implies that $\int_{-\ell_n}^{\ell_n}[F(x)/\log(n)]\,dx=\ell_n^{-1}+o(\ell_n^{-1})$, and hence $I_{2n,2}=-(\ell_n-1)/\log^{2}(n)+[\ell_n^{-1}+o(\ell_n^{-1})]/\log(n)=o(\ell_n^{-3})$.

Because $\int_{\ell_n}^{\infty}\frac{dx}{x^{2}\{x+\log(n)\}}=\ell_n^{-3}-\ell_n^{-4}\log(1+\ell_n)$, we have $I_{3n}=\ell_n^{-3}+o(\ell_n^{-3})$ by [8] and [10]. In sum, we have

$$E(H_n)=\ell_n^{-2}-\ell_n^{-3}+o(\ell_n^{-3}).$$

Acknowledgments

This work was partially supported by the Natural Sciences and Engineering Research Council of Canada.

Footnotes

The authors declare no conflict of interest.

References

1. Hamerly G, Elkan C (2002) Alternatives to the k-means algorithm that find better clusterings. Proceedings of the 11th International Conference on Information and Knowledge Management (ACM, New York), pp 600–607.
2. Iman RL (1983) Harmonic mean. Encyclopedia of Statistical Sciences, eds Kotz S, Johnson NL (Wiley, New York), Vol 3, pp 575–576.
3. Komarova NL, Rivin I (2003) Harmonic mean, random polynomials and stochastic matrices. Adv Appl Math 31(2):501–526.
4. Zhang B, Hsu M, Dayal U (1999) K-harmonic means: A data clustering algorithm. Technical Report HPL-1999-124 (Hewlett-Packard Labs). Available at www.hpl.hp.com/techreports/1999/HPL-1999-124.html. Accessed October 10, 2014.
5. Pinto JE, Henry E, Robinson TR, Stowe JD (2010) Equity Asset Valuation (John Wiley & Sons, Hoboken, NJ), 2nd Ed.
6. Lim MCH, McLernon DC, Ghogho M (2009) Weighted harmonic mean SINR maximization for the MIMO downlink. IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, Taipei, Taiwan), pp 2381–2384.
7. Jones CM (2003) Approximating negative and harmonic mean moments for the Poisson distribution. Math Commun 8(2):157–172.
8. Pakes AG (1999) On the convergence of moments of geometric and harmonic means. Stat Neerl 53(1):96–110.
9. Shi X, Reid N, Wu Y (2014) Approximation to the moments of ratios of cumulative sums. Can J Stat 42(2):325–336.
10. Shi X, Wang X-S, Reid N (2014) Saddlepoint approximation of nonlinear moments. Stat Sin.
11. Newton MA, Raftery AE (1994) Approximate Bayesian inference with the weighted likelihood bootstrap. J R Stat Soc B 56(1):3–48.
12. Gnedenko BV, Kolmogorov AN (1954) Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, MA).
13. Ibragimov IA, Linnik YV (1971) Independent and Stationary Sequences of Random Variables (Wolters-Noordhoff, Groningen, The Netherlands).
14. Hall P (1981) On the rate of convergence to a stable law. J Lond Math Soc 23:179–192.
15. Berkes I, Tichy R (2014) Lacunary series and stable distributions. Available at www.math.tugraz.at/discrete/publications/projects/files/berkes_tichy_lac.pdf. Accessed March 10, 2014.
16. Aue A, Lee TCM (2011) On image segmentation using information theoretic criteria. Ann Stat 39(6):2912–2935.
17. Wolpert RL, Schmidler SC (2012) α-Stable limit laws for harmonic mean estimators of marginal likelihoods. Stat Sin 22(3):1233–1251.
18. Shi X, Wu Y, Liu Y (2010) A note on asymptotic approximations of inverse moments of nonnegative random variables. Stat Probab Lett 80(15-16):1260–1264.
19. Wu T-J, Shi X, Miao B (2009) Asymptotic approximation of inverse moments of nonnegative random variables. Stat Probab Lett 79(11):1366–1371.
