Tight bounds for the median of a gamma distribution

Richard F Lyon

2023 Sep 8;18(9):e0288601. doi: 10.1371/journal.pone.0288601

Editor: Pablo Martin Rodriguez

PMCID: PMC10490949 PMID: 37682854

Abstract

The median of a standard gamma distribution, as a function of its shape parameter k, has no known representation in terms of elementary functions. In this work we prove the tightest upper and lower bounds of the form $2^{- 1 / k} (A + k)$: an upper bound with $A = e^{- γ}$ (with γ being the Euler–Mascheroni constant) and a lower bound with $A = log (2) - \frac{1}{3}$. These bounds are valid over the entire domain of k > 0, staying between the 48th and 55th percentiles. We derive and prove several other new tight bounds in support of the proofs.

Introduction

We prove some of the bounds conjectured by Lyon [1] for the median of a gamma distribution, relying on previously known and new bounds that are tighter in some regions of the domain, and also relying on numerically transparent evaluations of the CDF at a few points, using a convergent series.

The gamma distribution’s probability density function (PDF) is $x^{k - 1} e^{- x / θ} / (Γ (k) θ^{k})$, but we’ll use θ = 1 because both the mean and median simply scale with this parameter. Thus we use this “standard gamma distribution” PDF with just the shape parameter k, with k > 0 and x > 0:

p_{k} (x) = \frac{1}{Γ (k)} x^{k - 1} e^{- x} .

The mean μ of this distribution is well known to be μ(k) = k, which is easy to verify since the first moment of the PDF evaluates to Γ(k + 1)/Γ(k) = k. The median ν(k) is the value of x at which the cumulative distribution function (CDF) equals one-half:

\frac{1}{2} = \int_{0}^{ν (k)} p_{k} (x) d x = \frac{1}{Γ (k)} \int_{0}^{ν (k)} x^{k - 1} e^{- x} d x .
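As a numerical illustration of this definition, the following minimal Python sketch evaluates the CDF by integrating the Taylor series of the exponential term by term and then locates the median by bisection. The function names `gamma_cdf` and `gamma_median`, the stopping tolerance, and the bracket [0, k] (which relies on the known bound ν(k) < k) are choices made here for illustration only; the alternating series loses accuracy for large arguments, so the sketch is meant for moderate k (roughly k up to 10).

```python
import math

def gamma_cdf(x, k, rel_tol=1e-16):
    """CDF of the standard gamma distribution with shape k, by series."""
    if x <= 0.0:
        return 0.0
    total, n = 0.0, 0
    power, fact = x ** k, 1.0          # x^(n+k) and n!
    while True:
        term = power / (fact * (n + k))
        total += term if n % 2 == 0 else -term
        # Stop once the terms shrink geometrically and are negligible.
        if n > 2.0 * x and term <= rel_tol * abs(total):
            break
        n += 1
        power *= x
        fact *= n
    return total / math.gamma(k)

def gamma_median(k, iterations=200):
    """Median nu(k): the x in (0, k) at which the CDF equals one-half."""
    lo, hi = 0.0, k                    # nu(k) < k brackets the root
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for k in (0.5, 1.0, 2.5):
    print(k, gamma_median(k))
```

For k = 1 the result can be compared with the exact value log(2), the median of the exponential distribution.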

We seek to prove closed-form tight bounds for ν(k) that achieve 50th percentile in the limit for k → 0 and for k → ∞. By “tight” we mean a bound is equal to the true value at a point, or in the limit at 0 or ∞. Informally, there are degrees of tightness, based on how many derivatives are also matched; so one tight bound might be tighter than another.

See Fig 1 for some known linear and piecewise-linear bounds, illustrating the lack of good known bounds for 0 < k < 1. Many prior publications have specifically only considered k ≥ 1 [2, 3], or only positive integer values [4], or positive half-integers for the Chi-square distribution [5]; the full range k > 0 has also been considered [6, 7], but the bounds found leave room for improvement, particularly at low k.

Theorems to prove

We will show that, for all 0 < k < ∞, the median of the gamma distribution is bounded above and below by:

2^{- 1 / k} (A_{L} + k) < ν (k) < 2^{- 1 / k} (A_{U} + k)

with closed-form scalar constants $A_{L} = log (2) - \frac{1}{3}$ and A_U = e^−γ (with γ ≈ 0.5772157 being the Euler–Mascheroni constant), and that these bounds are asymptotically tight for k → ∞ and k → 0, respectively. Equivalently, we define the function A(k) and tightly bound it with these constants:

A (k) = 2^{1 / k} ν (k) - k

\Rightarrow log (2) - \frac{1}{3} < A (k) < e^{- γ} .
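To make the claim concrete, here is a small Python sketch that evaluates A(k) numerically at a few shape values and compares it with the two constants. The helper names (`cdf`, `median`, `A`), the series-plus-bisection method, and the sample grid are illustrative assumptions; a finite grid is only a sanity check, not a proof.

```python
import math

EULER_GAMMA = 0.5772156649015329       # Euler-Mascheroni constant
A_LOWER = math.log(2.0) - 1.0 / 3.0    # bottom of the box
A_UPPER = math.exp(-EULER_GAMMA)       # top of the box

def cdf(x, k):
    """Standard gamma CDF via the alternating series (moderate x only)."""
    total, power, fact, n = 0.0, x ** k, 1.0, 0
    while True:
        term = power / (fact * (n + k))
        total += term if n % 2 == 0 else -term
        if n > 2.0 * x and term <= 1e-17 * abs(total):
            return total / math.gamma(k)
        n += 1
        power *= x
        fact *= n

def median(k):
    """Bisection on [0, k], using the known bound nu(k) < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid, k) < 0.5 else (lo, mid)
    return 0.5 * (lo + hi)

def A(k):
    """A(k) = 2^(1/k) nu(k) - k."""
    return 2.0 ** (1.0 / k) * median(k) - k

GRID = (0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0)
for k in GRID:
    print(f"k={k:5.2f}  A(k)={A(k):.6f}  box=({A_LOWER:.6f}, {A_UPPER:.6f})")
```

On this grid the printed values should fall from near the top of the box at small k toward the bottom at large k.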

In addition, we’d like to prove that A(k) is monotonically decreasing between its low-k and high-k limits, as suggested by asymptotic and graphical numerical observations. If we could prove monotonicity, the other proof would be easier, but we don’t see how yet. So, the strategy here is to show that over various subsets of the k domain, with their union covering 0 < k < ∞, there are other bounds that we can prove are between our new closed-form bounds and the true median. Therefore, along the way, we derive several other new upper and lower bounds that are tighter over portions of the k domain, and some asymptotic values and slopes.

We prove the new theorems exhibited in Table 1, culminating in Theorems U6 and L8.

Table 1. Bounds table.

Summary comparison of several upper and lower bounds. Gr&M refers to Groeneveld and Meeden 1977 [11], C&R refers to Chen and Rubin 1986 [4], B&P refers to Berg and Pedersen 2006 [6], and Ga&M refers to Gaunt and Merkle 2021 [3]. The * refers to bounds presented with informal proofs or derivations, and ** for bounds presented as conjectures without proof, in Lyon 2021 [1]; in the present paper they are treated as new theorems, with proofs.

Upper bounds, with names and formulae	Domain	Tight at	Notes
ν(k) < U₀(k) = k	k > 0	k → 0	Gr&M, C&R
$ν (k) < U_{1} (k) = k e^{- 1 / (3 k)}$	k > 0	k → ∞	B&P
ν(k) ≤ U₂(k) = log(2) + (k − 1)	k ≥ 1	k = 1	Ga&M
ν(k) ≤ U₃(k) = log(2)k	k ≤ 1	k = 1	Theorem U3*
$ν (k) < U_{4} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{k}{k + 1} U_{1} (k)})^{1 / k}$	0 < k ≤ 1	k → 0	Theorem U4
$ν (k) < U_{5} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{k}{k + 1} U_{3} (k)})^{1 / k}$	0 < k ≤ 1	k → 0	Theorem U5
$ν (k) < U_{6} (k) = 2^{- 1 / k} (e^{- γ} + k)$	k > 0	k → 0	Theorem U6**
Lower bounds, with names and formulae	Domain	Tight at	Notes
ν(k) > L₀(k) = 0	k > 0	k → 0	trivial lower bound
$ν (k) > L_{1} (k) = k - \frac{1}{3}$	k > 0	k → ∞	Doodson 1917 [12], C&R
$ν (k) > L_{2} (k) = 2^{- 1 / k} k$	k > 0	k → 0	B&P
$ν (k) \geq L_{3} (k) = log (2) + (k - 1) ν^{'} (1)$	k > 0	k = 1	Theorem L3*
$ν (k) > L_{4} (k) = 2^{- 1 / k} Γ {(k + 1)}^{1 / k}$	k > 0	k → 0	Theorem L4*
$ν (k) > L_{5} (k) = 2^{- 1 / k} e^{- γ}$	k > 0	k → 0	B&P asymptote, Theorem L5*
$ν (k) > L_{6} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{1 - e^{- k}}{k + 1} L_{3} (k)})^{1 / k}$	k > 0	k → 0	Theorem L6
ν(k) > L₇(k) = ν_Li + k − k_i	0 < k ≤ k_i	—	Theorem L7 (for ν_Li < ν(k_i))
$ν (k) > L_{8} (k) = 2^{- 1 / k} (log (2) - \frac{1}{3} + k)$	k > 0	k → ∞	Theorem L8**

Recent related work

In addition to the works mentioned above, there have been several more recent works on asymptotic properties and bounds for medians and other quantiles of gamma distributions and of the closely related Poisson and negative binomial (or Polya or Pascal) distributions, with a variety of interesting approaches [8–10].

Theorems and proofs

Chords and tangents

The convexity of the median (i.e., nonnegative second derivative) proved by Berg and Pedersen [7] implies that any tangent line is a lower bound, tight at the point of tangency, and that any chord, the straight line segment defined by two points of intersection, is an upper bound over the k interval delimited by the points of intersection.

The point k = 1 with ν(1) = log(2) is a point for which we have a known value, so is a good place to make a tangent lower bound. We can also use it for a chord with the other point of intersection at k → 0 or k → ∞. The Gaunt and Merkle upper bound [3] U₂(k) = log(2) + (k − 1) can be viewed as the limit of chords with a point of intersection at k = 1 and the other at k → ∞ where the slope approaches 1. At the zero end is our U₃(k), a rather trivial but apparently new observation.

Theorem U3

The median of the standard gamma distribution is bounded above by the chord between k = 0 and k = 1:

ν (k) \leq log (2) k for 0 < k \leq 1 .

Proof: The convexity of the median implies that a chord is an upper bound, between its points of intersection, tight at those points. The point k = 1, ν(1) = log(2) (from the median of the exponential distribution, a well-known result and an easy computation), and the limiting point at k = 0, ν(0) = 0, are the only places we have definite known expressions for the value of the median, so we can provide a formula for that chord. The straight line between (0, 0) and (1, log(2)) is the formula given.

Theorem L3

The median of the standard gamma distribution is bounded below by the tangent line at k = 1:

ν (k) \geq log (2) + (k - 1) ν^{'} (1) with ν^{'} (1) = γ - 2 Ei (- log (2)) + log (log (2)) .

Proof: The convexity of the median [7] implies that a tangent line is a lower bound, tight at the point of tangency. At k = 1 we know ν(1) = log(2) and can compute the slope to form the equation for the tangent line, log(2) + (k − 1)ν′(1).

The slope ν′(k) is not generally tractable, but is at the special point k = 1, where the CDF P_k(x) (the lower incomplete gamma function) and PDF p_k(x) are both exponential functions.

P_{k} (x) = \int_{0}^{x} p_{k} (t) d t = \int_{0}^{x} \frac{t^{k - 1}}{Γ (k)} e^{- t} d t .

At the point where $P_{k} (x) = \frac{1}{2}$ , where x = ν(k), the slope is:

ν^{'} (k) = \frac{d ν}{d k} = - \frac{\partial P_{k} (x)}{\partial k} / \frac{\partial P_{k} (x)}{\partial x} .

The derivative with respect to x is easy,

\frac{\partial P_{k} (x)}{\partial x} = p_{k} (x) = \frac{x^{k - 1}}{Γ (k)} e^{- x},

except that we only have a closed-form relation between x and k at k = 1, where we know x = ν(1) = log(2) and $p_{k} (x) = e^{- log (2)} = \frac{1}{2}$ , so the derivative is $\frac{1}{2}$ there. The derivative with respect to k is messier:

\frac{\partial P_{k} (x)}{\partial k} = - Γ {(k)}^{- 2} \frac{d Γ (k)}{d k} \int_{0}^{x} t^{k - 1} e^{- t} d t + Γ {(k)}^{- 1} \int_{0}^{x} \frac{d t^{k - 1}}{d k} e^{- t} d t .

At k = 1, using Γ(k) = 1 and dΓ(k)/dk = −γ, this derivative evaluates to

\frac{\partial P_{k} (x)}{\partial k} |_{k = 1} = \frac{γ}{2} + \int_{0}^{x} log t e^{- t} d t

= \frac{γ}{2} + Ei (- log (2)) - γ - \frac{1}{2} log (log (2)),

where Ei(−log(2)) ≈ −0.3786710 is the exponential integral (integration and evaluation assisted by Wolfram Alpha and independently verified). Putting these results together we get the slope of the median at 1, and hence the slope of the tangent-line lower bound there:

ν^{'} {(k) |}_{k = 1} = γ - 2 Ei (- log (2)) + log (log (2)) \approx 0.9680448 .
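The numbers in this derivation can be reproduced with the standard library alone. The sketch below is an illustration, not the procedure used in the paper: it evaluates Ei at a negative argument from the convergent series Ei(x) = γ + log|x| + Σ xⁿ/(n·n!), then assembles the two partial derivatives at k = 1. The function name `expint_ei` and the number of series terms are assumptions made here.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def expint_ei(x, terms=60):
    """Ei(x) for x != 0 from gamma + ln|x| + sum_{n>=1} x^n / (n * n!)."""
    total, power, fact = 0.0, 1.0, 1.0
    for n in range(1, terms + 1):
        power *= x
        fact *= n
        total += power / (n * fact)
    return EULER_GAMMA + math.log(abs(x)) + total

ln2 = math.log(2.0)
ei = expint_ei(-ln2)

# Partial derivatives of the CDF at k = 1, x = log(2), as derived above.
dP_dk = EULER_GAMMA / 2.0 + ei - EULER_GAMMA - 0.5 * math.log(ln2)
dP_dx = 0.5

slope = -dP_dk / dP_dx             # nu'(1)
closed_form = EULER_GAMMA - 2.0 * ei + math.log(ln2)

print(f"Ei(-log 2) = {ei:.7f}")
print(f"nu'(1)     = {slope:.7f}")
```

Both `slope` and `closed_form` are computed so that the ratio of partial derivatives can be compared with the closed-form expression.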

Using bounds for the exponential

To find new bounds for the median, we can bound the exponential in the integrand. Since e^−x is a decreasing function of x (derivative is −e^−x < 0), we can upper bound it for x > 0 by its starting value: e^−x < 1. Since it is convex (second derivative is e^−x > 0), we can lower bound it by a tangent line: 1 − x < e^−x, and can upper bound it over the interval 0 < x < k by a chord: e^−x < 1 − x(1 − e^−k)/k.

In the proofs below, the superscripts a, b, and c are just names, not exponents.

Theorem L4

The median of the standard gamma distribution is bounded below by:

ν (k) > L_{4} (k) = 2^{- 1 / k} Γ {(k + 1)}^{1 / k} for all k > 0 .

Proof: Use the constant upper bound to the exponential, e^−x < 1 for x > 0, in the CDF integrand, to notice this inequality:

\frac{1}{Γ (k)} \int_{0}^{ν (k)} x^{k - 1} d x > \frac{1}{Γ (k)} \int_{0}^{ν (k)} e^{- x} x^{k - 1} d x = \frac{1}{2},

which integrates to:

\frac{ν {(k)}^{k}}{Γ (k + 1)} > \frac{1}{2} .

Since the denominator and the exponent are positive, this expression can be decreased to achieve equality to one-half by substituting for ν(k) a positive function of sufficiently lower value, which we’ll call L₄(k):

\frac{L_{4} {(k)}^{k}}{Γ (k + 1)} = \frac{1}{2} \Rightarrow L_{4} (k) < ν (k) for all k > 0 .

Solving, we find the expression given: L₄(k) = 2^−1/kΓ(k + 1)^1/k.

We choose to write the result factored this way, rather than a single fraction with exponent 1/k or −1/k, because the two parts emphasize the shape of the median function in two regions: $2^{- 1 / k}$, which is just a small fraction bigger than Berg and Pedersen’s asymptote L₅(k) [6] at low k, and $Γ {(k + 1)}^{1 / k}$, which is just a small offset below the Chen and Rubin straight-line bound $k - \frac{1}{3}$ [4] at high k. Most of our other new bounds keep the $2^{- 1 / k}$ factor, which characterizes the “hockey stick” shape at low k.

Theorem L5

The median of the standard gamma distribution, and its lower bound L₄(k), are bounded below by Berg and Pedersen’s asymptote L₅(k):

ν (k) > L_{4} (k) > L_{5} (k) = 2^{- 1 / k} e^{- γ} .

Proof: Comparing to Theorem L4, L₅(k) < L₄(k) is implied if $e^{- γ} < Γ {(k + 1)}^{1 / k}$ for all k > 0. We prove this by showing that $Γ {(k + 1)}^{1 / k}$ monotonically increases from $e^{- γ}$, its limit at 0.

That $lim_{k \to 0} Γ {(k + 1)}^{1 / k} = e^{- γ}$ follows from the Taylor series about 0 of Γ(k + 1), which is 1 − γk + O(k²). That $Γ {(k + 1)}^{1 / k}$ increases monotonically from there, even though Γ(k + 1) itself is decreasing near 0, is proved by showing that its derivative is everywhere positive. Differentiating, in terms of the digamma function ψ⁽⁰⁾, the logarithmic derivative of the gamma function, we have:

\frac{d}{d k} Γ {(k + 1)}^{1 / k} = \frac{Γ {(k + 1)}^{1 / k} (k ψ^{(0)} (k + 1) - log (Γ (k + 1)))}{k^{2}} .

The only factor here that is not obviously positive for k > 0 is kψ⁽⁰⁾(k + 1) − log(Γ(k + 1)), which at k = 0 is equal to 0, and which has a surprisingly simple derivative:

\frac{d}{d k} (k ψ^{(0)} (k + 1) - log (Γ (k + 1))) = k ψ^{(1)} (k + 1) .

Here ψ⁽¹⁾ is the trigamma function, the derivative of the digamma function. This derivative is positive since the trigamma function, a special case of the Hurwitz zeta function, is positive for real arguments, because it has a series expansion with all positive terms:

ψ^{(1)} (z) = \sum_{n = 0}^{\infty} \frac{1}{{(z + n)}^{2}} .

Since it starts at zero and has a positive derivative everywhere, the factor in question is positive for k > 0, so Γ(k + 1)^1/k is monotonically increasing.
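A quick numerical look at this monotonicity, as a sanity check rather than a substitute for the proof: the sketch below evaluates Γ(k + 1)^{1/k} through the log-gamma function on a geometric grid of k values and compares the smallest-k value with e^{−γ}. The function name `gamma_root` and the grid are assumptions made for this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def gamma_root(k):
    """Gamma(k+1)^(1/k), computed through log-gamma for stability."""
    return math.exp(math.lgamma(k + 1.0) / k)

ks = [0.001 * 2 ** j for j in range(15)]   # 0.001, 0.002, ..., 16.384
values = [gamma_root(k) for k in ks]
limit = math.exp(-EULER_GAMMA)             # claimed limit as k -> 0

for k, v in zip(ks, values):
    print(f"k={k:8.3f}  Gamma(k+1)^(1/k)={v:.6f}  excess={v - limit:.6f}")
```

The printed excess over e^{−γ} should be small and positive at the low end of the grid and grow with k.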

Theorem U4

The median of the standard gamma distribution is bounded above by this expression, which is asymptotic at low k to the lower bound L₄(k):

ν (k) < U_{4} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{k^{2}}{k + 1} e^{- 1 / (3 k)}})^{1 / k} when k \leq 1 .

Proof: Use the tangent-line-at-0 lower bound to the exponential, 1 − x < e^−x, in the integrand, to notice this inequality with easy integrals:

\frac{1}{Γ (k)} \int_{0}^{ν (k)} (1 - x) x^{k - 1} d x < \frac{1}{Γ (k)} \int_{0}^{ν (k)} e^{- x} x^{k - 1} d x = \frac{1}{2}

\Rightarrow \frac{ν {(k)}^{k}}{Γ (k + 1)} - \frac{ν {(k)}^{k + 1} k}{Γ (k + 1) (k + 1)} < \frac{1}{2} .

In this difference of terms, there exists a U^a(k) such that the first term can be increased to achieve equality to one-half by substituting U^a(k) for ν(k):

\frac{U^{a} {(k)}^{k}}{Γ (k + 1)} - \frac{ν {(k)}^{k + 1} k}{Γ (k + 1) (k + 1)} = \frac{1}{2} \Rightarrow U^{a} (k) > ν (k) .

Now, picking any known upper bound U^b(k) ≥ ν(k), we have $U^{a} {(k)}^{k} U^{b} (k) > ν {(k)}^{k + 1}$, so we can write this inequality where the subtracted term has been increased:

\frac{U^{a} {(k)}^{k}}{Γ (k + 1)} - \frac{U^{a} {(k)}^{k} U^{b} (k) k}{Γ (k + 1) (k + 1)} < \frac{1}{2}

\frac{U^{a} {(k)}^{k}}{Γ (k + 1)} (1 - \frac{U^{b} (k) k}{k + 1}) < \frac{1}{2} .

As long as both factors are positive, there exists a U^c(k) such that we can increase the first factor by substituting U^c(k) for U^a(k) to achieve equality:

\frac{U^{c} {(k)}^{k}}{Γ (k + 1)} (1 - \frac{U^{b} (k) k}{k + 1}) = \frac{1}{2} \Rightarrow U^{c} (k) > U^{a} (k) > ν (k) .

So we can solve for this new bound U^c(k) in terms of the known bound U^b(k):

ν (k) < U^{c} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{k}{k + 1} U^{b} (k)})^{1 / k} as long as U^{b} (k) \frac{k}{k + 1} < 1 .

Using U₁(k) = ke^−1/(3k) as the known bound U^b, and verifying the positivity constraint by noting that U₁(k) < 1 for k ≤ 1 completes the proof.

By this method, we’ve taken an upper bound that’s relatively loose at low k and converted it to one that’s asymptotically tight, approaching the lower bound L₄(k), at low k. Let’s do another like that.
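The conversion from a known upper bound U^b to the tightened bound U^c is mechanical, so it can be written as a small helper. The following Python sketch is one way to do that under the stated positivity constraint; the names `tighten_upper`, `U1`, `U4`, and `L4` are labels chosen here to mirror the text. It compares U₄ with the exact median at k = 1 and with L₄ at low k.

```python
import math

def tighten_upper(k, u_b):
    """U^c(k) built from a known upper-bound value u_b = U^b(k).

    Requires the positivity constraint U^b(k) * k / (k + 1) < 1.
    """
    denom = 1.0 - k / (k + 1.0) * u_b
    if denom <= 0.0:
        raise ValueError("positivity constraint violated")
    return 0.5 ** (1.0 / k) * (math.gamma(k + 1.0) / denom) ** (1.0 / k)

def U1(k):
    """Berg and Pedersen upper bound k * exp(-1/(3k))."""
    return k * math.exp(-1.0 / (3.0 * k))

def U4(k):
    """Theorem U4: U1 pushed through the tightening step (k <= 1)."""
    return tighten_upper(k, U1(k))

def L4(k):
    """Theorem L4 lower bound, for comparison."""
    return 0.5 ** (1.0 / k) * math.gamma(k + 1.0) ** (1.0 / k)

for k in (0.05, 0.25, 0.5, 1.0):
    print(f"k={k:4.2f}  L4={L4(k):.6e}  U4={U4(k):.6e}  U4/L4={U4(k) / L4(k):.6f}")
```

The ratio U₄/L₄ approaching 1 at small k is what "asymptotically tight at low k" means here.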

Theorem U5

The median of the standard gamma distribution is bounded above by:

ν (k) < U_{5} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{k^{2}}{k + 1} log (2)})^{1 / k} when k \leq 1 .

Proof: Same as Theorem U4, but use U₃(k) = k log(2) as the known upper bound U^b(k). Note that U₃(k) is a bound only for k ≤ 1, and that the positivity constraint $U^{b} (k) \frac{k}{k + 1} < 1$ holds through k ≤ 1, since both factors U^b(k) = U₃(k) and $\frac{k}{k + 1}$ are less than 1 in that domain.

Theorem L6

The median of the standard gamma distribution is bounded below by:

L_{6} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{1 - e^{- k}}{k + 1} L_{3} (k)})^{1 / k} < ν (k) for all k > 0 .

Proof: Like Theorem U4, but with an upper, as opposed to lower, bound on the exponential, and changing all the directions of the inequalities. That is, use the chord from x = 0 to x = k upper bound to the exponential, $e^{- x} < 1 - x \frac{1 - e^{- k}}{k}$ , in the integrand, to notice this inequality with easy integrals:

\frac{1}{Γ (k)} \int_{0}^{ν (k)} (1 - x \frac{1 - e^{- k}}{k}) x^{k - 1} d x > \frac{1}{Γ (k)} \int_{0}^{ν (k)} e^{- x} x^{k - 1} d x = \frac{1}{2}

\Rightarrow \frac{ν {(k)}^{k}}{Γ (k + 1)} - \frac{ν {(k)}^{k + 1} (1 - e^{- k})}{Γ (k + 1) (k + 1)} > \frac{1}{2} .

In this difference of terms, the first term can be decreased to achieve equality to one-half by substituting for ν(k) a sufficiently smaller (but positive) L^a(k):

\frac{L^{a} {(k)}^{k}}{Γ (k + 1)} - \frac{ν {(k)}^{k + 1} (1 - e^{- k})}{Γ (k + 1) (k + 1)} = \frac{1}{2} \Rightarrow L^{a} (k) < ν (k) .

Now, picking any other lower bound L^b(k) ≤ ν(k), we have $L^{a} {(k)}^{k} L^{b} (k) < ν {(k)}^{k + 1}$ (even if L^b(k) < 0), so we can write this inequality where the subtracted term has been decreased:

\frac{L^{a} {(k)}^{k}}{Γ (k + 1)} - \frac{L^{a} {(k)}^{k} L^{b} (k) (1 - e^{- k})}{Γ (k + 1) (k + 1)} > \frac{1}{2}

\Rightarrow \frac{L^{a} {(k)}^{k}}{Γ (k + 1)} (1 - \frac{L^{b} (k) (1 - e^{- k})}{k + 1}) > \frac{1}{2} .

As long as both factors are positive we can decrease the first factor by substituting a sufficiently smaller L^c(k) to achieve equality:

\frac{L^{c} {(k)}^{k}}{Γ (k + 1)} (1 - \frac{L^{b} (k) (1 - e^{- k})}{k + 1}) = \frac{1}{2} \Rightarrow L^{c} (k) < L^{a} (k) < ν (k) .

So we can solve for a new lower bound L^c(k) in terms of a known lower bound L^b(k):

ν (k) > L^{c} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{(1 - e^{- k})}{k + 1} L^{b} (k)})^{1 / k} as long as \frac{(1 - e^{- k})}{k + 1} L^{b} (k) < 1 .

The constraint is met for all positive k, with any lower bound L^b(k), since L^b(k) < ν(k) < k ⇒ L^b(k)/(k + 1) < 1.

Using L₃(k) = log(2) + (k − 1)ν′(1), the tangent at 1, as the known bound L^b completes the proof.

ν (k) > L_{6} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{1 - e^{- k}}{k + 1} (log (2) + (k - 1) ν^{'} (1))})^{1 / k} for all positive k .
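The lower-bound version of the tightening step can be sketched the same way. The Python below uses the factor (1 − e^{−k})/(k + 1) from the derivation and the slope value ν′(1) ≈ 0.9680448 quoted earlier; the names `tighten_lower`, `L3`, `L4`, `L6`, and `cdf_k2` are chosen here for illustration. For k = 1 and k = 2 the gamma CDF has a closed form, which gives an independent check that L₆ falls below the median at those two points.

```python
import math

NU_PRIME_1 = 0.9680448   # slope of the median at k = 1, as quoted in the text

def tighten_lower(k, l_b):
    """L^c(k) built from a known lower-bound value l_b = L^b(k)."""
    denom = 1.0 - (1.0 - math.exp(-k)) / (k + 1.0) * l_b
    return 0.5 ** (1.0 / k) * (math.gamma(k + 1.0) / denom) ** (1.0 / k)

def L3(k):
    """Tangent line to the median at k = 1 (may be negative at small k)."""
    return math.log(2.0) + (k - 1.0) * NU_PRIME_1

def L4(k):
    """Theorem L4 lower bound, for comparison."""
    return 0.5 ** (1.0 / k) * math.gamma(k + 1.0) ** (1.0 / k)

def L6(k):
    """Theorem L6: the tangent L3 pushed through the tightening step."""
    return tighten_lower(k, L3(k))

def cdf_k2(x):
    """Exact CDF of the standard gamma distribution with k = 2."""
    return 1.0 - (1.0 + x) * math.exp(-x)

for k in (0.1, 0.5, 1.0, 2.0):
    print(f"k={k:3.1f}  L3={L3(k):+.4f}  L4={L4(k):.6f}  L6={L6(k):.6f}")
```

Where L₃(k) is negative the denominator exceeds 1 and L₆ drops below L₄; where L₃(k) is positive L₆ is the tighter of the two.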

Corollary 1

ν (k) > L_{61} (k) = 2^{- 1 / k} (\frac{Γ (k + 1)}{1 - \frac{1 - e^{- k}}{k + 1} (log (2) + k - 1)})^{1 / k} for k \leq 1 .

Proof: Use L^b(k) = log(2) + k − 1 < L₃(k), a lower bound to the tangent at k = 1 for k ≤ 1, since the slope 1 exceeds the tangent-line slope ν′(1).

We could obviously write some more corollaries, replacing negative lower bounds by zero, i.e. using max(0, L₃(k)) or max(0, log(2) + k − 1) as L^b(k). Where the negative values are clipped to zero, the bound will turn to follow the tighter L₄(k).

Bounds in a box

Our theorems U6 and L8 follow if we can prove that

log (2) - \frac{1}{3} < A (k) < e^{- γ} where A (k) = 2^{1 / k} ν (k) - k .

These bounds $A_{L} = log (2) - \frac{1}{3}$ and $A_{U} = e^{- γ}$ define the bottom and top edges of a “box” to which we need A(k) to be constrained. We can visualize that by mapping other bounds through the same function, plotting them, and showing that at least one upper bound and at least one lower bound is inside the box at every k. See Fig 2.

Fig 2 — Upper (blue) and lower bounds (red) mapped for comparison to A(k) (black dotted). Bounds U₆(k) and L₈(k) define the top and bottom of the box via A_U6(k) = A_U = e^−γ and A_L8(k) = A_L = log(2) − 1/3, while the left and right are defined by the limits of arctan(k) for 0 < k < ∞. To prove that the top and bottom are bounds of A(k), our approach is to find other upper and lower bounds “inside the box” over domains covering all k > 0. In this figure, we have no lower bound in the box around 1.7 < k < 3.0.

The function that maps the median and its bounds is $f (k, x (k)) = 2^{1 / k} x (k) - k$, where x(k) is any real-valued function of positive k. We map bounds into the same space as A(k) with this, and identify them with subscripts. In particular, consider these upper bounds (plotted in Fig 2):

A_{U 1} (k) = 2^{1 / k} U_{1} (k) - k = 2^{1 / k} k e^{- 1 / (3 k)} - k,

A_{U 2} (k) = 2^{1 / k} U_{2} (k) - k = 2^{1 / k} (log (2) + (k - 1)) - k,

A_{U 3} (k) = 2^{1 / k} U_{3} (k) - k = 2^{1 / k} (k log (2)) - k,

A_{U 4} (k) = 2^{1 / k} U_{4} (k) - k = (\frac{Γ (k + 1)}{1 - \frac{k}{k + 1} U_{1} (k)})^{1 / k} - k,

A_{U 5} (k) = 2^{1 / k} U_{5} (k) - k = (\frac{Γ (k + 1)}{1 - \frac{k}{k + 1} U_{3} (k)})^{1 / k} - k .

It’s easy to see, graphically, that A_U1(k) and A_U2(k) are “in the box” for k ≥ 1, and that A_U4(k) and A_U5(k) are “in the box” for k ≤ 1. Whether it’s easy to prove is another matter. At least we’re working with well-defined expressions and functions with known properties, not with the implicitly defined ν(k) itself.

We can do the same for some lower bounds, and see where they end up relative to the box:

A_{L 1} (k) = 2^{1 / k} L_{1} (k) - k = 2^{1 / k} (k - \frac{1}{3}) - k,

A_{L 3} (k) = 2^{1 / k} L_{3} (k) - k = 2^{1 / k} (log (2) + (k - 1) ν^{'} (1)) - k,

A_{L 4} (k) = 2^{1 / k} L_{4} (k) - k = Γ {(k + 1)}^{1 / k} - k,

A_{L 6} (k) = 2^{1 / k} L_{6} (k) - k = (\frac{Γ (k + 1)}{1 - \frac{1 - e^{- k}}{k + 1} L_{3} (k)})^{1 / k} - k .

The lower bounds L₁(k), L₃(k), and L₄(k) are well inside the box (greater than $A_{L} = log (2) - \frac{1}{3}$ ) for k near enough to ∞, 1, and 0, respectively, but they leave significant holes between them, where they don’t constrain A(k) against sagging out of the bottom of the box. That’s why we resort to the additional complexity of L₆(k) and L₇(k), to fill in the holes.

Line bounds from point bounds

It is straightforward to find or verify bounds for ν(k) at a finite set of discrete values of k, using numerical evaluation of the CDF integral via a rapidly converging series based on the Taylor series for the exponential function. For Theorem L7, we prove slope-1 line lower bounds based on point bounds at eight k points, each valid for k values at or below its point k_i (for higher values of k, the line will cross to be greater than ν(k)). We use these line bounds to supplement our other lower bounds, in service of proving Theorem L8.

Theorem L7

The median of the standard gamma distribution is bounded below by lines of slope 1 below point lower bounds:

ν (k) > L_{7} (k) = ν_{L i} + k - k_{i} for k \leq k_{i},

where the values k_i and ν_Li represent eight point lower bounds ν_Li < ν(k_i) per this array of values:

\begin{matrix} i & k_{i} & ν_{L i} \\ 1 & 0.40 & 0.145 \\ 2 & 0.44 & 0.177 \\ 3 & 0.50 & 0.227 \\ 4 & 0.60 & 0.315 \\ 5 & 0.75 & 0.454 \\ 6 & 1.00 & 0.693 \\ 7 & 1.50 & 1.182 \\ 8 & 3.50 & 3.172 . \end{matrix}

Proof: Since the slope of ν(k) is everywhere less than 1, these lines of slope 1 below point lower bounds are below the tangent line lower bounds at those points. To demonstrate that each ν_Li is indeed a lower bound at each of the k_i points, we use a transparent technique not relying on anybody’s implementation of incomplete gamma functions. The point lower bounds are verified to lead to percentiles (100 times the CDF estimate) below 50%, by more than the corresponding estimation error bounds. In each case, we have chosen the ν_Li to be the largest multiple of 0.001 that can be verified to be a lower bound.

These calculations are shown in the S1 Appendix, with intermediate steps printed out from the algorithm given there. The formulae used for the CDF include approximations for the incomplete gamma function integral. The incomplete gamma function integral in the CDF can be written as a convergent series, using the Taylor series of the exponential function:

\int_{0}^{ν} x^{k - 1} e^{- x} d x = \int_{0}^{ν} x^{k - 1} \sum_{n = 0}^{\infty} \frac{{(- x)}^{n}}{n!} d x = \sum_{n = 0}^{\infty} \frac{{(- 1)}^{n}}{n!} \int_{0}^{ν} x^{n + k - 1} d x = \sum_{n = 0}^{\infty} \frac{{(- 1)}^{n} ν^{n + k}}{n! (n + k)} .

Since the magnitudes of the terms are decreasing by half or better after enough terms that n > 2ν, the error in truncating the series is bounded by the magnitude of the last term added (and due to the alternation, the error is actually quite a bit lower than that). So to verify a point lower bound ν < ν(k), we accumulate terms until that condition is met and the sum plus the bound on the residue, divided by a lower bound of Γ(k), is less than 0.5.

We lower-bound Γ(k) at these points by truncating values from a standard math package to 5 decimal digits. We also computed these bounds from scratch, using Euler’s product formula with 80 or more factors, and using upper and lower bounds on the truncated tail residual factor computed by using integrals to bound the sum of logarithms of the factors. That analysis is too long and complicated to include, but we list the values we used so that others can independently verify them.

See the S1 Appendix for the numerical verifications of these values by this method.
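The verification idea described above can be sketched in a few lines of Python. This is not the algorithm from the S1 Appendix, only a minimal reconstruction under stated assumptions: the tail of the alternating series is bounded by the magnitude of the last term once n > 2ν, and Γ(k) is lower-bounded by truncating `math.gamma` to 5 decimal digits. The names `gamma_lower` and `verifies_lower_bound` and the cap of 100 terms are hypothetical.

```python
import math

def gamma_lower(k, digits=5):
    """Lower bound on Gamma(k): library value truncated to `digits` decimals."""
    scale = 10 ** digits
    return math.floor(math.gamma(k) * scale) / scale

def verifies_lower_bound(k, nu, max_terms=100):
    """True if the truncated series proves that the CDF at nu is below 1/2."""
    gamma_lo = gamma_lower(k)
    total, power, fact = 0.0, nu ** k, 1.0   # power = nu^(n+k), fact = n!
    for n in range(max_terms):
        term = power / (fact * (n + k))
        total += term if n % 2 == 0 else -term
        # For n > 2*nu the terms at least halve, so |term| bounds the tail.
        if n > 2.0 * nu and (total + term) / gamma_lo < 0.5:
            return True
        power *= nu
        fact *= n + 1
    return False

# (k_i, nu_Li) pairs from the array in Theorem L7.
POINTS = [(0.40, 0.145), (0.44, 0.177), (0.50, 0.227), (0.60, 0.315),
          (0.75, 0.454), (1.00, 0.693), (1.50, 1.182), (3.50, 3.172)]

for k_i, nu_li in POINTS:
    print(k_i, nu_li, verifies_lower_bound(k_i, nu_li))
```

Because the check only returns True when an upper estimate of the CDF is below one-half, a False result means "not verified", not "disproved".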

The reason for the particular set of k_i selected for this theorem will become clear later (or may be obvious from looking at Fig 3).

Piecing the bounds together

This is where it gets complicated. We have to find regions where various of the previous bounds are tighter than the main upper and lower bounds that we set out to prove. To this end we will prove the lemmas in Table 2.

Table 2. Lemmas table.

Lemmas to prove in support of Theorems U6 and L8.

Upper bounds in box	Domain	Lemma
A_U1(k) < e^−γ	k ≥ 0.5	Lemma 1
A_U5(k) < e^−γ	k ≤ 1	Lemma 2
Lower bounds in box	Domain	Lemma
A_L1(k) > log(2) − 1/3	k ≥ 3.1	Lemma 3
A_L4(k) > log(2) − 1/3	0 < k ≤ 0.36	Lemma 4
A_L7i(k) > log(2) − 1/3	k_{i−1} ≤ k ≤ k_i, k₀ = 0.36	Lemma 5

Lemma 1

A_{U 1} (k) = 2^{1 / k} k e^{- 1 / (3 k)} - k < e^{- γ} for k \geq 0.5 .

Proof:

A_{U 1} (k) = 2^{1 / k} k e^{- 1 / (3 k)} - k = k e^{(log (2) - 1 / 3) / k} - k

\Rightarrow \frac{d}{d k} A_{U 1} (k) = e^{(log (2) - 1 / 3) / k} \frac{k - log (2) + 1 / 3}{k} - 1 .

The exponential can be upperbounded using e^x < 1/(1 − x) for x < 1:

\frac{d}{d k} A_{U 1} (k) < \frac{1}{1 - (log (2) - 1 / 3) / k} \frac{k - log (2) + 1 / 3}{k} - 1 for k > log (2) - 1 / 3

\Rightarrow \frac{d}{d k} A_{U 1} (k) < 0 for k > 0.360 > log (2) - 1 / 3 .

Concluding that the slope of A_U1(k) is negative for k > 0.360, and evaluating A_U1(0.5) < 0.527 and e^−γ > 0.561, we can conclude that A_U1(k) < e^−γ for k ≥ 0.5.
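The two numerical evaluations and the sign of the derivative are easy to spot-check. The Python sketch below does so on a grid; the names `A_U1` and `dA_U1` and the grid itself are assumptions made here, and the grid check illustrates the inequality that the proof establishes analytically.

```python
import math

EULER_GAMMA = 0.5772156649015329       # Euler-Mascheroni constant
C = math.log(2.0) - 1.0 / 3.0          # log(2) - 1/3, about 0.3598

def A_U1(k):
    """A_U1(k) = k * exp((log 2 - 1/3) / k) - k."""
    return k * math.exp(C / k) - k

def dA_U1(k):
    """Derivative of A_U1 with respect to k, from the formula in the proof."""
    return math.exp(C / k) * (k - C) / k - 1.0

grid = [0.361 + 0.05 * j for j in range(400)]   # 0.361 ... about 20.3
print("A_U1(0.5)      =", A_U1(0.5))
print("exp(-gamma)    =", math.exp(-EULER_GAMMA))
print("max slope seen =", max(dA_U1(k) for k in grid))
```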

Lemma 2

A_{U 5} (k) = 2^{1 / k} U_{5} (k) - k < e^{- γ} for k \leq 1 .

Proof:

A_{U 5} (k) = 2^{1 / k} U_{5} (k) - k = (\frac{Γ (k + 1)}{1 - \frac{k^{2}}{k + 1} log (2)})^{1 / k} - k .

The low-k limit is again $e^{- γ}$; the slope is initially negative, but A_U5(k) eventually crosses above the limit and diverges to infinity where the denominator expression goes to 0. To show that it doesn’t cross above the limit until somewhere after k = 1, we’ll use several steps of bounding of functions and derivatives, starting by invoking an upper bound on the gamma function, Theorem 1.4 of Batir [13]:

Γ (k + 1) \leq {(e^{- γ})}^{(- e^{- γ})} e^{- k} {(k + e^{- γ})}^{(k + e^{- γ})} .

The condition A_U5(k) ≤ e^−γ is satisfied if substituting this upper bound satisfies the constraint:

(\frac{{(e^{- γ})}^{(- e^{- γ})} e^{- k} {(k + e^{- γ})}^{(k + e^{- γ})}}{1 - \frac{k^{2}}{k + 1} log (2)})^{1 / k} - k \leq e^{- γ}

\frac{{(e^{- γ})}^{(- e^{- γ})} e^{- k} {(k + e^{- γ})}^{(k + e^{- γ})}}{1 - \frac{k^{2}}{k + 1} log (2)} \leq {(k + e^{- γ})}^{k} .

(since the denominator is positive in the domain we care about, 0 < k < 1). Dividing by the positive right-hand side gives the equivalent condition

\frac{{(e^{- γ})}^{(- e^{- γ})} e^{- k} {(k + e^{- γ})}^{e^{- γ}}}{1 - \frac{k^{2}}{k + 1} log (2)} \leq 1,

which simplifies to

\frac{e^{- k} {(k e^{γ} + 1)}^{e^{- γ}}}{1 - \frac{k^{2}}{k + 1} log (2)} \leq 1

g (k) = (k + 1) (e^{- k} {(k e^{γ} + 1)}^{e^{- γ}}) - (k + 1) + k^{2} log (2) \leq 0 .

This function has a Taylor series at 0 starting with a negative k² term and a positive k³ term, so it is negative in some region of low enough k, as needed. By showing that the third derivative stays positive in 0 < k < 1, we can conclude that it crosses its starting value not more than once, and we can bound where that happens with a few evaluations. So first we need the derivatives; with help from WolframAlpha:

g^{'} (k) = e^{- k} (1 - e^{γ} k^{2}) {(e^{γ} k + 1)}^{e^{- γ} - 1} + 2 k log (2) - 1,

g^{″} (k) = \frac{e^{- k + e^{- γ} γ + γ} (e^{γ} k^{3} - e^{γ} k^{2} - 3 k - 1) {(k + e^{- γ})}^{e^{- γ}}}{{(e^{γ} k + 1)}^{2}} + 2 log (2),

g^{‴} (k) = e^{γ - k} {(e^{γ} k + 1)}^{e^{- γ} - 3} (2 e^{γ} (3 k^{2} + k + 1) - e^{2 γ} (k - 2) k^{3} - 3) .

To show g‴(k) > 0 over the domain of interest (0 < k < 1), we can throw away the leading obviously-positive factors, and the condition becomes $2 e^{γ} (3 k^{2} + k + 1) - e^{2 γ} (k - 2) k^{3} - 3 > 0$, a simple fourth-degree polynomial-in-k condition:

- e^{2 γ} k^{4} + 2 e^{2 γ} k^{3} + 6 e^{γ} k^{2} + 2 e^{γ} k + (2 e^{γ} - 3) > 0 .

All coefficients except the highest-order one are positive. Without that fourth-order term, this polynomial would be positive for all k ≥ 0, with one real root below zero; with that term, it has a positive real root, below which it is positive, including about 0.562 at k = 0 and about 17.98 at k = 1, so it’s positive for at least 0 ≤ k ≤ 1. This is all we need; let’s unwind.

We have g(0) = 0 and g′(0) = 0, while g″(0) = 2 log(2) − e^γ < 0, so g(k) starts out decreasing. The positive g‴(k) means that the curvature, g″(k), is increasing between k = 0 and k = 1, so g″(k) changes sign at most once there. Hence the slope g′(k), which starts at zero and goes negative, can return to zero at most once, and g(k) stays negative until after that point; if g(k) became positive anywhere in 0 < k ≤ 1 it would still be increasing, and hence positive, at k = 1. Evaluating a few points, g(0.5) = −0.0258, g(1) = −0.00025, g(1.01) = 0.0018. Since g(1) < 0, the original condition holds: A_U5(k) is “in the box” through k = 1, the end of its domain of validity.
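The quoted values of g and of the quartic polynomial can be reproduced directly. The following Python sketch evaluates both; the function names `g` and `quartic` and the grids are choices made here, and the grid checks are illustrative rather than part of the proof.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant
EG = math.exp(EULER_GAMMA)         # e^gamma
EMG = math.exp(-EULER_GAMMA)       # e^-gamma
LN2 = math.log(2.0)

def g(k):
    """g(k) = (k+1) e^-k (k e^gamma + 1)^(e^-gamma) - (k+1) + k^2 log 2."""
    return ((k + 1.0) * math.exp(-k) * (k * EG + 1.0) ** EMG
            - (k + 1.0) + k * k * LN2)

def quartic(k):
    """Polynomial factor whose positivity gives g'''(k) > 0."""
    return (-EG ** 2 * k ** 4 + 2.0 * EG ** 2 * k ** 3
            + 6.0 * EG * k ** 2 + 2.0 * EG * k + (2.0 * EG - 3.0))

for k in (0.5, 1.0, 1.01):
    print(f"g({k}) = {g(k):+.5f}")
print("quartic(0) =", quartic(0.0), " quartic(1) =", quartic(1.0))

g_grid = [0.01 * j for j in range(1, 101)]      # 0.01 ... 1.00
q_grid = [0.01 * j for j in range(0, 101)]      # 0.00 ... 1.00
```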

Lemma 3

A_{L 1} (k) = 2^{1 / k} L_{1} (k) - k > log (2) - 1 / 3 for k \geq 3.1 .

Proof: L₁(k) maps to $A_{L 1} (k) = 2^{1 / k} (k - 1 / 3) - k$. The Laurent series at infinity of this function is easily found (with the help of Wolfram Alpha) to be:

A_{L 1} (k) = 2^{1 / k} (k - 1 / 3) - k = \sum_{n \geq 0} \frac{{log}^{n} (2) (log (2) - 1 / 3 - n / 3)}{k^{n} (n + 1)!}

= log (2) - 1 / 3 + log (2) (log (2) / 2 - 1 / 3) k^{- 1} + O (k^{- 2}),

where the coefficient of k⁻¹ is positive and all of the coefficients in the O(k⁻²) term are negative (where n > 3 log(2) − 1). Therefore, for sufficiently large k, A_L1(k) exceeds A_L = log(2) − 1/3, and for smaller k, the value can only cross below log(2) − 1/3 ≈ 0.35981 once due to all the higher-order coefficients being negative. Evaluating at k = 3 we find A_L1(k) ≈ 0.35979, outside the box, and evaluating at k = 3.1 we find A_L1(k) ≈ 0.35990, inside the box. Thus we can conclude that A_L1(k) is inside the box for k ≥ 3.1.
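The series and the two evaluations can be checked numerically. The Python sketch below compares the closed form with a truncated version of the series and inspects the signs of the coefficients; the names `A_L1`, `series_coeff`, and `A_L1_series` and the truncation at 40 terms are illustrative assumptions.

```python
import math

LN2 = math.log(2.0)
BOX_BOTTOM = LN2 - 1.0 / 3.0       # A_L = log(2) - 1/3

def A_L1(k):
    """Closed form 2^(1/k) (k - 1/3) - k."""
    return 2.0 ** (1.0 / k) * (k - 1.0 / 3.0) - k

def series_coeff(n):
    """Coefficient of k^-n in the Laurent series at infinity."""
    return LN2 ** n * (LN2 - 1.0 / 3.0 - n / 3.0) / math.factorial(n + 1)

def A_L1_series(k, terms=40):
    """Truncated series, for comparison with the closed form."""
    return sum(series_coeff(n) / k ** n for n in range(terms))

print("box bottom =", BOX_BOTTOM)
for k in (3.0, 3.1):
    print(f"k={k}: closed={A_L1(k):.6f}  series={A_L1_series(k):.6f}")
print("coefficients n=0..4:", [series_coeff(n) for n in range(5)])
```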

Lemma 4

A_{L 4} (k) = 2^{1 / k} L_{4} (k) - k > log (2) - 1 / 3 for 0 < k \leq 0.36 .

Proof: L₄(k) maps to $A_{L 4} (k) = Γ {(k + 1)}^{1 / k} - k$, which approaches $e^{- γ}$ (at the top of the box) as k → 0, and exits the box somewhere in 0.36 < k < 0.37 since A_L4(0.36) ≈ 0.3638 > 0.3598 and A_L4(0.37) ≈ 0.3583 < 0.3598. We just need to show it can only exit the box once in 0 < k < 0.37 to conclude A_L4(k) > log(2) − 1/3 for all 0 < k ≤ 0.36. $Γ {(k + 1)}^{1 / k}$ has a positive derivative, as we showed in the proof of Theorem L5. Now, if we can prove that derivative is less than 1, then A_L4(k) is monotonically decreasing, so it can only go out of the box once.

\frac{d}{d k} Γ {(k + 1)}^{1 / k} = Γ {(k + 1)}^{1 / k} \frac{k ψ^{(0)} (k + 1) - log (Γ (k + 1))}{k^{2}} < 1 in 0 < k < 1

because both factors are less than 1; Γ(x) < 1 in 1 < x < 2 (a well-known property of the gamma function); to show (kψ⁽⁰⁾(k + 1) − log(Γ(k + 1)))/k² < 1, we need to do a bit more work. First, define the function h(k) that this factor is the derivative of:

h (k) = \frac{log (Γ (k + 1))}{k} \Rightarrow \frac{d}{d k} h (k) = h^{'} (k) = (k ψ^{(0)} (k + 1) - log (Γ (k + 1))) / k^{2} .

Then rewrite the log of the gamma function in terms of this identity, derived from the Weierstrass product formula and the zeta function:

log (Γ (z + 1)) = - γ z + \int_{0}^{\infty} \frac{e^{- z t} - 1 + z t}{t (e^{t} - 1)} d t

so that

h (k) = \frac{log (Γ (k + 1))}{k} = - γ + \int_{0}^{\infty} \frac{\frac{e^{- k t} - 1}{k t} + 1}{e^{t} - 1} d t .

The factor e^t − 1 in the denominator of the integrand makes this integral absolutely convergent, so we can differentiate under the integral sign twice, getting:

h^{'} (k) = \int_{0}^{\infty} \frac{- (k t + 1) e^{- k t} + 1}{k^{2} t (e^{t} - 1)} d t,

h^{″} (k) = \int_{0}^{\infty} \frac{[{(k t + 1)}^{2} + 1] e^{- k t} - 2}{k^{3} t (e^{t} - 1)} d t,

in which the integrand has a positive denominator and a negative numerator: for any k > 0 we can define y = kt and write the numerator as [(y + 1)² + 1]e^−y − 2, which starts at 0 at y = 0 and has derivative −y²e^−y < 0 for all y > 0. Since h″(k) < 0, h′(k) only decreases from its starting value, the limit π²/12 ≈ 0.822 as k → 0, which is less than 1. This completes the proof that A_L4(k) decreases monotonically, so it is “in the box” for 0 < k ≤ 0.36.
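As a numerical sanity check on Lemma 4 (an illustration only, not a substitute for the proof), the sketch below samples A_L4(k) on a grid and confirms that it decreases and leaves the box between k = 0.36 and k = 0.37. The grid spacing of 0.001 and the name `a_l4` are arbitrary choices made here.

```python
import math

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
BOX_TOP = math.exp(-GAMMA)
BOX_BOTTOM = math.log(2.0) - 1.0 / 3.0


def a_l4(k):
    """A_L4(k) = Gamma(k + 1)^(1/k) - k, computed through log-gamma."""
    return math.exp(math.lgamma(k + 1.0) / k) - k


# Sample 0 < k <= 0.37 and check that each value is below the previous one.
ks = [0.001 * i for i in range(1, 371)]
values = [a_l4(k) for k in ks]
decreasing = all(later < earlier for earlier, later in zip(values, values[1:]))

print("A_L4(0.001) =", round(values[0], 4), " top of box =", round(BOX_TOP, 4))
print("A_L4(0.36) in box:", a_l4(0.36) > BOX_BOTTOM)
print("A_L4(0.37) in box:", a_l4(0.37) > BOX_BOTTOM)
print("decreasing on the sampled grid:", decreasing)
```

A sampled grid cannot rule out a non-monotonic wiggle between grid points; that is exactly what the bound h″(k) < 0 in the proof supplies.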

Lemma 5

A_{L 7 i} (k) = 2^{1 / k} L_{7 i} (k) - k > log (2) - 1 / 3 for k_{i - 1} \leq k \leq k_{i}, k_{0} = 0.36 .

Proof: The lower bounds L_7i(k) = ν_Li + k − k_i, lines of slope 1 that are above L₁(k), are of the form k − a for a = k_i − ν_Li, and map under f to a function with series similar to the one we saw for Lemma 3 where a = 1/3, but with a values in the range k₁ − ν_L1 = 0.255 ≤ a ≤ k₈ − ν_L8 = 0.328.

A_{L 7} = 2^{1 / k} (k - a) - k = \sum_{n \geq 0} \frac{{log}^{n} (2) (log (2) - a - a n)}{k^{n} (n + 1)!} .

In Lemma 3 we saw that the coefficient of k⁻¹ was positive, as it is here, since a < log(2)/2 ≈ 0.346, and that all the higher-order coefficients are negative, since for n ≥ 2, (1 + n)a > log(2) for a > log(2)/3 ≈ 0.231. Thus, as with A_L1(k) in Lemma 3, lines of slope 1 are not monotonic when mapped into the box, but are in the box at some k and only leave the bottom of the box once if a ≤ 1/3; that is, if L_7i(k)>L₁(k). Therefore, to verify that they are in the box over the domains specified, all that is needed is to evaluate them at both ends of their respective domains, as in this array:

\begin{matrix}
i & a & k_{i} & ν_{i} = L_{i} (k_{i}) & A_{L i} (k_{i}) & k_{i - 1} & L_{i} (k_{i - 1}) & A_{L i} (k_{i - 1}) \\
1 & 0.255 & 0.40 & 0.145 & 0.4202 & 0.36 & 0.105 & 0.3601 \\
2 & 0.263 & 0.44 & 0.177 & 0.4153 & 0.40 & 0.137 & 0.3750 \\
3 & 0.273 & 0.50 & 0.227 & 0.4080 & 0.44 & 0.167 & 0.3670 \\
4 & 0.285 & 0.60 & 0.315 & 0.4001 & 0.50 & 0.215 & 0.3600 \\
5 & 0.296 & 0.75 & 0.454 & 0.3940 & 0.60 & 0.304 & 0.3651 \\
6 & 0.307 & 1.00 & 0.693 & 0.3860 & 0.75 & 0.443 & 0.3663 \\
7 & 0.318 & 1.50 & 1.182 & 0.3763 & 1.00 & 0.682 & 0.3640 \\
8 & 0.328 & 3.50 & 3.172 & 0.3667 & 1.50 & 1.172 & 0.3604
\end{matrix}

The A_Li(k_i) and A_Li(k_{i−1}) columns in this array are all greater than log(2) − 1/3 ≈ 0.3598, which proves that the lower bounds are “in the box” over the respective domains k_{i−1} ≤ k ≤ k_i.
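The two A columns of the array can be regenerated from the a and k columns alone. This is a minimal sketch under the assumption that each segment is the line k − a mapped by A = 2^{1/k}(k − a) − k; the names `SEGMENTS` and `a_line` are introduced here for illustration.

```python
import math

BOX_BOTTOM = math.log(2.0) - 1.0 / 3.0  # about 0.3598

# (a, k_i, k_{i-1}) for the eight line segments, copied from the array.
SEGMENTS = [
    (0.255, 0.40, 0.36),
    (0.263, 0.44, 0.40),
    (0.273, 0.50, 0.44),
    (0.285, 0.60, 0.50),
    (0.296, 0.75, 0.60),
    (0.307, 1.00, 0.75),
    (0.318, 1.50, 1.00),
    (0.328, 3.50, 1.50),
]


def a_line(k, a):
    """Map the slope-1 line L(k) = k - a into the A domain."""
    return 2.0 ** (1.0 / k) * (k - a) - k


rows = []
for i, (a, k_hi, k_lo) in enumerate(SEGMENTS, start=1):
    at_hi, at_lo = a_line(k_hi, a), a_line(k_lo, a)
    rows.append((i, at_hi, at_lo))
    print(f"i={i}  A(k_i)={at_hi:.4f}  A(k_(i-1))={at_lo:.4f}  "
          f"in box: {min(at_hi, at_lo) > BOX_BOTTOM}")
```

The tightest margins are at the lower ends of segments 1, 4, and 8, where the mapped value exceeds log(2) − 1/3 only in the fourth decimal place.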

Theorem U6

The median of the standard gamma distribution is bounded above by U₆(k):

ν (k) < U_{6} (k) = 2^{- 1 / k} (e^{- γ} + k) for all k > 0 .

Proof: Relative to the top of the box A_U6(k) = e^−γ ≈ 0.5615, from Lemma 1 and Lemma 2, A_U1(k) < e^−γ for k ≥ 0.5 and A_U5(k) < e^−γ for k ≤ 1, which imply U₆(k) > U₁(k) > ν(k) and U₆(k) > U₅(k) > ν(k) over the respective same domains of k, so we conclude U₆(k) > ν(k) for all k > 0.

It is left as an exercise for the reader to prove the theorem using U₂(k) and/or U₄(k) instead, leading to U₆(k) > U₂(k) ≥ ν(k) for k ≥ 1 and/or U₆(k) > U₄(k) > ν(k) for 0 < k ≤ 1; or some other combination of bounds in the box.

Theorem L8

The median of the standard gamma distribution is bounded below by L₈(k):

ν (k) > L_{8} (k) = 2^{- 1 / k} (log (2) - 1 / 3 + k) for all k > 0 .

Proof: Relative to the bottom of the box A_L8(k) = log(2) − 1/3 ≈ 0.3598, Lemmas 3, 4, and 5 show other bounds “in the box” over regions covering all k > 0: A_L4(k) > log(2) − 1/3 for 0 < k ≤ 0.36, eight versions of A_L7i(k) > log(2) − 1/3 for eight adjacent domains spanning 0.36 ≤ k ≤ 3.5, and A_L1(k) > log(2) − 1/3 for k ≥ 3.1. These imply L₈(k) < L₄(k) < ν(k), L₈(k) < L_7i(k) < ν(k), and L₈(k) < L₁(k) < ν(k) over the respective same k domains, so we conclude L₈(k) < ν(k) for all k > 0.

It is left as an exercise for the reader to prove the theorem via some other combination of lower bounds in the box. Theorem L6, or its simpler corollary, provides a bound in the box until L3 takes over. The Theorem L7 bound with just the one segment at i = 8 can fill in between L3 and L1.

Better bounds

From the previous figures it is clear that there is room for smooth functions bounding A(k) above and below that are tighter (in some regions, especially in the middle) than the bounds we constructed to prove theorems U6 and L8. Lyon [1] explored rational-function and arctan interpolators to approximate or bound A(k), writing A(k) as a monotonic interpolation between its upper and lower limits via a function g(k):

A (k) = g (k) A_{L} + (1 - g (k)) A_{U} .

The simplest rational-function interpolator used was

{\tilde{g}}_{1} (k) = \frac{k}{b_{0} + k} .

Based on slopes at 0 and infinity, Lyon showed that the b₀ value for a lower bound could not exceed $b_{L} = (\frac{8}{405} + e^{- γ} log 2 - \frac{{log}^{2} 2}{2}) / (e^{- γ} - log 2 + \frac{1}{3}) - log 2 \approx 0.143472$ and the value for an upper bound could not be less than $b_{U} = (e^{- γ} - log 2 + \frac{1}{3}) / (1 - \frac{e^{- γ} π^{2}}{12}) \approx 0.374654$ , but that these values lead to actual lower and upper bounds, respectively, has not been proved.

See Fig 4 for the relation between these interpolated approximations to A(k) and the actual (numerical) A(k). Conjectured bounds from arctan interpolators are also shown, using this formula to approximate or bound g(k):

{\tilde{g}}_{a} (k) = \frac{2}{π} {tan}^{- 1} \frac{k}{b},

where for an upper bound the b cannot be less than $b_{U} = (24 / π) (e^{- γ} - log 2 + \frac{1}{3}) / (12 - e^{- γ} π^{2}) \approx 0.238512$ , and for a lower bound not more than b_L ≈ 0.205282 (no closed form is known for this limit). The plots also show that the CDF, near 0.5, is surprisingly insensitive to the value of A(k) near both k → 0 and k → ∞.

Fig 4 — Conjectured better upper (blue) and lower (red) bounds mapped for comparison to A(k) (black dotted). Solid curves are from first-order rational-function interpolators; dashed curves are from arctan interpolators. For a sense of how close these bounds come to the median (50th percentile), curves of selected nearby percentiles are included (thin black curves, computed with Matlab’s `gammaincinv` function).

Again, that these interpolated functions produce actual bounds has not been proved. Presumably the method of line bounds from point bounds could be extended to construct proofs by finding tighter bounds numerically, though it might take thousands of points. These conjectures are left for others to consider.
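The limiting interpolator parameters quoted in this section can be evaluated directly from their closed forms. The sketch below does that and builds the corresponding conjectured approximations to A(k); the constant and function names are illustrative, and the arctan lower limit is entered as the numerical value given in the text because no closed form is known.

```python
import math

GAMMA = 0.5772156649015329        # Euler-Mascheroni constant
LOG2 = math.log(2.0)
A_U = math.exp(-GAMMA)            # limit of A(k) as k -> 0
A_L = LOG2 - 1.0 / 3.0            # limit of A(k) as k -> infinity

# Limiting b0 values for the rational interpolator g(k) = k / (b0 + k).
B_L_RATIONAL = (8.0 / 405.0 + A_U * LOG2 - LOG2 ** 2 / 2.0) / (A_U - A_L) - LOG2
B_U_RATIONAL = (A_U - A_L) / (1.0 - A_U * math.pi ** 2 / 12.0)

# Limiting b values for the arctan interpolator g(k) = (2/pi) atan(k / b).
B_U_ARCTAN = (24.0 / math.pi) * (A_U - A_L) / (12.0 - A_U * math.pi ** 2)
B_L_ARCTAN = 0.205282  # numerical value quoted in the text


def a_rational(k, b0):
    """Interpolated A(k) using the first-order rational g(k)."""
    g = k / (b0 + k)
    return g * A_L + (1.0 - g) * A_U


def a_arctan(k, b):
    """Interpolated A(k) using the arctan g(k)."""
    g = (2.0 / math.pi) * math.atan(k / b)
    return g * A_L + (1.0 - g) * A_U


def median_from_a(k, a):
    """Convert a value in the A domain back to a median estimate."""
    return 2.0 ** (-1.0 / k) * (a + k)


print("rational b_L, b_U:", round(B_L_RATIONAL, 6), round(B_U_RATIONAL, 6))
print("arctan   b_U     :", round(B_U_ARCTAN, 6))
print("exact median at k = 1:", LOG2)
print("rational estimates   :", median_from_a(1.0, a_rational(1.0, B_L_RATIONAL)),
      median_from_a(1.0, a_rational(1.0, B_U_RATIONAL)))
```

At k = 1 the exact median is log(2), so A(1) = 2 log(2) − 1, which gives one convenient point at which to see the conjectured lower and upper curves bracket the true value.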

Conclusions

We give upper and lower bounds for the median of the gamma distribution that are tighter at low k than previously known bounds, and are proved valid over all k > 0: U₆(k) and L₈(k). They are simple and in closed form. Though their validity seemed obvious from the numerical and asymptotic methods by which they were discovered, they were originally presented as conjectures [1] because they were not easy to prove analytically; the proofs here follow the outline of the proof the “hard way” as proposed there.

In summary, the median ν(k) of the standard gamma distribution satisfies

log (2) - \frac{1}{3} < 2^{1 / k} ν (k) - k < e^{- γ} for all k > 0

2^{- 1 / k} (log (2) - \frac{1}{3} + k) < ν (k) < 2^{- 1 / k} (e^{- γ} + k) .

This formulation for the median

ν (k) = 2^{- 1 / k} (A (k) + k)

defines the function A(k) = 2^{1/k} ν(k) − k, with the remaining conjecture that A(k) is monotonically decreasing between the limits lim_{k→0} A(k) = e^−γ and lim_{k→∞} A(k) = log(2) − 1/3.
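The summary inequalities can be spot-checked numerically. The following sketch assumes a power-series evaluation of the CDF and plain bisection for the median; the function names, the stopping tolerance, the iteration count, and the sample values of k are choices made here, not taken from the paper.

```python
import math

GAMMA = 0.5772156649015329
A_UPPER = math.exp(-GAMMA)
A_LOWER = math.log(2.0) - 1.0 / 3.0


def gamma_cdf(k, x):
    """CDF of the standard gamma distribution at x > 0, by power series.

    Sums x^(k+n) e^(-x) / Gamma(k+n+1) over n >= 0.
    """
    term = math.exp(k * math.log(x) - x - math.lgamma(k + 1.0))
    total = term
    n = 1
    while term > 1e-17 * total and n < 100000:
        term *= x / (k + n)
        total += term
        n += 1
    return total


def gamma_median(k, iterations=200):
    """Bisection for the point where the CDF equals one-half."""
    lo, hi = 0.0, k  # the median of a gamma distribution lies below its mean k
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bounds(k):
    """Lower bound L8(k) and upper bound U6(k) on the median."""
    scale = 2.0 ** (-1.0 / k)
    return scale * (A_LOWER + k), scale * (A_UPPER + k)


for k in (0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0, 50.0, 100.0):
    nu = gamma_median(k)
    lower, upper = bounds(k)
    a_of_k = 2.0 ** (1.0 / k) * nu - k
    print(f"k={k:6.2f}  L8={lower:.6g}  median={nu:.6g}  U6={upper:.6g}  "
          f"A(k)={a_of_k:.5f}")
```

The printed A(k) column should fall from near e^−γ toward log(2) − 1/3 as k grows, consistent with the monotonicity conjecture, though a finite sample of k values proves nothing about it.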

As we showed before [1], monotonic approximations to A(k) that interpolate between these limits can make excellent approximations to the median, in closed form, with controlled good properties such as being exact at k = 1.

Supporting information

S1 Appendix

(PDF)

Acknowledgments

Help and ideas from discussions with my mathematically talented colleagues at Google are gratefully acknowledged: Pascal Getreuer, Srinivas Vasudevan, Dan Piponi, Michael Keselman, Yuan Li, Thomas Fischbacher, Daniel Parry, Lizao Li, Fred Akalin, and John Vogler.

Data Availability

All relevant data are within the paper.

Funding Statement

The author(s) received no specific funding for this work. Author RFL is employed by Google. The funder provided support in the form of salary for RFL and the publication fee for this article, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study. The specific roles of these authors are articulated in the ‘author contributions’ section. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Lyon RF. On closed-form tight bounds and approximations for the median of a gamma distribution. PLOS One. 2021;16(5):e0251626. doi: 10.1371/journal.pone.0251626
2. Choi KP. On the medians of gamma distributions and an equation of Ramanujan. Proceedings of the American Mathematical Society. 1994;121(1):245–251. doi: 10.1090/S0002-9939-1994-1195477-8
3. Gaunt RE, Merkle M. On bounds for the mode and median of the generalized hyperbolic and related distributions. Journal of Mathematical Analysis and Applications. 2021;493(1):124508. doi: 10.1016/j.jmaa.2020.124508
4. Chen J, Rubin H. Bounds for the difference between median and mean of gamma and Poisson distributions. Statistics & Probability Letters. 1986;4(6):281–283. doi: 10.1016/0167-7152(86)90044-1
5. Wilson EB, Hilferty MM. The distribution of chi-square. Proceedings of the National Academy of Sciences. 1931;17(12):684–688. doi: 10.1073/pnas.17.12.684
6. Berg C, Pedersen HL. The Chen–Rubin conjecture in a continuous setting. Methods and Applications of Analysis. 2006;13(1):63–88. doi: 10.4310/MAA.2006.v13.n1.a4
7. Berg C, Pedersen HL. Convexity of the median in the gamma distribution. Arkiv för Matematik. 2008;46(1):1–6. doi: 10.1007/s11512-006-0037-2
8. Priore S, Petersen C, Oishi M. Approximate stochastic optimal control for linear time-invariant systems with heavy-tailed disturbances. arXiv preprint arXiv:2210.09479. 2022.
9. Ouimet F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika. 2023; p. 1–23.
10. Pinelis I. Monotonicity properties of the gamma family of distributions. Statistics & Probability Letters. 2021;171:109027. doi: 10.1016/j.spl.2020.109027
11. Groeneveld RA, Meeden G. The mode, median, and mean inequality. The American Statistician. 1977;31(3):120–121. doi: 10.1080/00031305.1977.10479215
12. Doodson AT. Relation of the mode, median and mean in frequency curves. Biometrika. 1917;11(4):425–429. doi: 10.1093/biomet/11.4.425
13. Batir N. Inequalities for the gamma function. Archiv der Mathematik. 2008;91(6):554–563. doi: 10.1007/s00013-008-2856-9

PLoS One. doi: 10.1371/journal.pone.0288601.r001

Decision Letter 0

Pablo Martin Rodriguez

22 May 2023

PONE-D-22-34782

Tight bounds for the median of a gamma distribution

PLOS ONE

Dear Dr. Lyon,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

You will see that the two reviewers are advising that you revise your manuscript. Note that it is a minor revision so please consider making the suggested changes.

Please submit your revised manuscript by Jun 22 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Pablo Martin Rodriguez

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

3. Thank you for stating in your Funding Statement:

“The author(s) received no specific funding for this work. Author RFL is employed and partially funded by Google. The funder provided support in the form of salary for RFL and the publication fee for this article, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

4. Thank you for stating the following in the Competing Interests section:

“Author RFL is employed by Google. This does not alter RFL’s adherence to PLOS ONE policies on sharing data and materials. Google has no restrictions on this work.”

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

5. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, the author investigates various bounds (including tight bounds) for the median of the scaled Gamma distribution. The results and proofs are quite technical and the author is nonetheless precise in their treatment. The weaknesses I see are that the review of the literature is quite lacking (there are many works on very close topics such as asymptotic bounds for the gamma, Poisson and negative binomial distributions that are not mentioned, even very recent ones). Including them could help motivate the results of the paper a bit more and help the reader put the results in perspective. Also, the figures could benefit from a legend or a clearer labelling of the different curves. Other than that, the conclusions and the proofs look accurate. I found the paper to be well-written, easy to follow, and of interest to researchers working on asymptotic bounds for the gamma, Poisson and negative binomial distributions.

My recommendation is to accept the paper if the author can address the weaknesses mentioned above.

Reviewer #2: I appreciated reading this paper which presents clearly new bounds for the median of a standard gamma distribution. The paper is well written and clear. The problem addressed by the article is interesting and the results seem to be relevant to probability theory and could be potentially useful. However, the paper needs minor revisions before it can be accepted for publication. Thus, should the author answer adequately to my comments and suggestions below, then I would recommend the paper for publication in Plos One.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Report - PONE- - D - 22 - 34782.pdf

PLoS One. 2023 Sep 8;18(9):e0288601. doi: 10.1371/journal.pone.0288601.r002

Author response to Decision Letter 0

25 Jun 2023

Rebuttal Letter / Response to Reviewers

by Richard F. Lyon

PONE-D-22-34782

24 June 2023

I appreciate the positive reviews from both reviewers.

I mostly made changes in line with their suggestions, and added a few more bits along the way to help clarify (such as the new section "Toward a proof" that might help others complete the work, and an example of a third-order rational function interpolation with low relative error).

Reviewer #1:

…The weaknesses I see are that the review of the literature is quite lacking (there are many works on very close topics such as asymptotic bounds for the gamma, Poisson and negative binomial distributions that are not mentioned, even very recent ones). Including them could help motivate the results of the paper a bit more and help the reader put the results in perspective.

I have added three references to very recent related works. But I don’t understand them well enough to say much about how related they are, so I just said, "In addition to the works mentioned above, there have been several more recent works on asymptotic properties and bounds for medians and other quantiles of gamma distributions and of the closely related Poisson and negative binomial (or Polya or Pascal) distributions, with a variety of interesting approaches \cite{priore2022approximate, ouimet2023refined, pinelis2021monotonicity}."

… Also, the figures could benefit from a legend or a clearer labelling of the different curves.

I found that Figure 2 had an extra curve (from a corollary) that should not have been there, and that confused the interpretation of some of the labels on the curves, so I simply removed that (it was a thin solid red curve in the original figure). Now Figure 2 seems more clear.

For Figures 1 and 3 the labels on the curves seem clear and unambiguous.

That leaves Figure 4, which I agree was a bit confusing and crowded, with the (conjectured) interpolated bounds being described in the caption but not in the figure itself. So I managed to squeeze in a legend, and removed the 53% and 54% quantile curves that were in that area, leaving the presentation more symmetric between 48% and 52% and leaving more room for the legend.

Reviewer #2

…the paper needs minor revisions … my comments and suggestions below …

Most of the suggestions were of the form “might be better to add a full stop after the equation” or “might be better to add a comma after the equation”. I followed all of those, but I think I also added another comma or two. In addition, many were “might be better to add the symbol ⇒ before the equation and a full stop after the equation”, which I also agreed with and did. In each case I followed the typical LaTeX styling advice of separating the end punctuation from the math by a thin-space. Where I already had full stops after a few equations, I moved them away by a thin-space. Most of these changes are hard to spot in the diff, as the full stop or comma are too short for underlining to show up, and the blue color is easy to miss. But they’re there.

I enumerate responses to the numbered suggestions that were more than these.

These 3 I group and treat out of order:

1. p.1-9: γ should be defined.

5. p.2-10: Again, γ should be defined.

12. p.4-13: Again, γ should be defined.

In the abstract I added “(with $\gamma$ being the Euler--Mascheroni constant)”. In the text, on p.2, I added “(with $\gamma \approx 0.5772157$ being the Euler--Mascheroni constant)” which includes its approximate value. Thank you for noticing this was needed. I think it does not need to be said again on p.4.

2. p.1-17: might be “...probability density function (PDF)...” instead of “...PDF...”.

I took the liberty to add a possessive, this way: “The gamma distribution's probability density function (PDF) is …”

3. p.1-19: might be x>0 instead of x≥0. Note that if k=1 and x=0, we have p1(0) = 0/0, which is an indeterminate form.

Yes, x>0. Done.

4. p.1-23: might be “... cumulative distribution function (CDF)...” instead of “...CDF... ”.

Done

9. p.4-2: the author uses p(k,x) to denote the PDF, but in p.1 the notation used is pk(x).

I switched to using the subscript k consistently for both the PDF p_k(x) and the CDF P_k(x).

17. p.5-18: might be helpful to clarify that ν(k) > L4(k) for all k > 0.

I added “for all k > 0” and also added some clarifying words to the start of the paragraph, “Since the denominator and the exponent are positive…”. And I removed the redundant comment that followed: “The derivation holds for all k > 0.”

31. p.7-27: might be “when k < 1” (as in the theorem statement) instead of for all positive k.

I fixed it the other way. The theorem statement now says for all positive k, as in the table of theorems to prove. I re-checked that the proof is valid over that full domain.

50. p.11-22: avoid acronym like RHS and LHS without say its respective meaning.

Reworded as “Dividing by the positive right-hand side gives the equivalent condition…”.

Other changes I made, which you may notice in the diffs:

I changed “The positivity constraint is met” to “The constraint is met” in one place where the constraint inequality is not a comparison to 0.

Where I had “rather a” I changed to “rather than a”.

I changed “which is completes the proof” to “this completes the proof” in the proof of Lemma 4.

In the Fig. 4 caption I used a code font (\texttt) for the Matlab function name gammaincinv.

I think that’s all. I tried to restrain my tendency to keep tweaking.

Attachment

Submitted filename: Rebuttal Letter.pdf

PLoS One. doi: 10.1371/journal.pone.0288601.r003

Decision Letter 1

Pablo Martin Rodriguez

2 Jul 2023

Tight bounds for the median of a gamma distribution

PONE-D-22-34782R1

Dear Dr. Lyon,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Pablo Martin Rodriguez

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: After thorough consideration and addressing all mentioned points, I find this version acceptable. The paper can be accepted.

Reviewer #2: The former report pointed out some minor revisions. These have been corrected in the revised version of the authors. More generally, all of the modifications made by the authors (including those that were not requested) are satisfactory. Thus, I recommend the paper for publication in Plos One.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

PLoS One. doi: 10.1371/journal.pone.0288601.r004

Acceptance letter

Pablo Martin Rodriguez

29 Aug 2023

PONE-D-22-34782R1

Tight bounds for the median of a gamma distribution

Dear Dr. Lyon:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Pablo Martin Rodriguez

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Appendix (PDF, 126.8 KB)

Attachment

Submitted filename: Report - PONE- - D - 22 - 34782.pdf (PDF, 93.8 KB)

Attachment

Submitted filename: Rebuttal Letter.pdf (PDF, 75.2 KB)

Data Availability Statement

All relevant data are within the paper.

References

1. Lyon RF. On closed-form tight bounds and approximations for the median of a gamma distribution. PLOS One. 2021;16(5):e0251626. doi: 10.1371/journal.pone.0251626

2. Choi KP. On the medians of gamma distributions and an equation of Ramanujan. Proceedings of the American Mathematical Society. 1994;121(1):245–251. doi: 10.1090/S0002-9939-1994-1195477-8

3. Gaunt RE, Merkle M. On bounds for the mode and median of the generalized hyperbolic and related distributions. Journal of Mathematical Analysis and Applications. 2021;493(1):124508. doi: 10.1016/j.jmaa.2020.124508

4. Chen J, Rubin H. Bounds for the difference between median and mean of gamma and Poisson distributions. Statistics & Probability Letters. 1986;4(6):281–283. doi: 10.1016/0167-7152(86)90044-1

5. Wilson EB, Hilferty MM. The distribution of chi-square. Proceedings of the National Academy of Sciences. 1931;17(12):684–688. doi: 10.1073/pnas.17.12.684

6. Berg C, Pedersen HL. The Chen–Rubin conjecture in a continuous setting. Methods and Applications of Analysis. 2006;13(1):63–88. doi: 10.4310/MAA.2006.v13.n1.a4

7. Berg C, Pedersen HL. Convexity of the median in the gamma distribution. Arkiv för Matematik. 2008;46(1):1–6. doi: 10.1007/s11512-006-0037-2

8. Priore S, Petersen C, Oishi M. Approximate stochastic optimal control for linear time-invariant systems with heavy-tailed disturbances. arXiv preprint arXiv:2210.09479. 2022.

9. Ouimet F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika. 2023; p. 1–23. PMID: 37360276

10. Pinelis I. Monotonicity properties of the gamma family of distributions. Statistics & Probability Letters. 2021;171:109027. doi: 10.1016/j.spl.2020.109027

11. Groeneveld RA, Meeden G. The mode, median, and mean inequality. The American Statistician. 1977;31(3):120–121. doi: 10.1080/00031305.1977.10479215

12. Doodson AT. Relation of the mode, median and mean in frequency curves. Biometrika. 1917;11(4):425–429. doi: 10.1093/biomet/11.4.425

13. Batir N. Inequalities for the gamma function. Archiv der Mathematik. 2008;91(6):554–563. doi: 10.1007/s00013-008-2856-9
