Relationship between concentration ratio and Herfindahl-Hirschman index: A re-examination based on majorization theory

Tarald O Kvålseth

doi:10.1016/j.heliyon.2018.e00846

Heliyon. 2018 Oct 11;4(10):e00846. doi: 10.1016/j.heliyon.2018.e00846

PMCID: PMC6190613 PMID: 30338305

Abstract

While the two most widely used measures of market (industrial) concentration, the m-firm concentration ratio $C R_{m}$ and the Herfindahl-Hirschman index H, have no precise functional relationship, they can be related by means of boundary formulations. Such bounds and potential relationships, which have been considered in some earlier reported studies, are being re-examined, corrected, and reformulated in this paper. The underlying analysis uses a different approach based on majorization theory and the results are supported by computer simulation. Such boundary relationships make it possible to determine approximate values of H from those of $C R_{m}$ and vice versa for any given set of market shares. Much more accurate predictions of H-values can be obtained with knowledge of the individual market shares of the m largest firms within a market (industry), with or without knowledge of the total number of firms.

Keyword: Economics

1. Introduction

Market concentration, also often referred to as industry concentration, refers to the extent to which the market shares of the largest firms within a market (industry) account for a large proportion of economic activity such as sales, assets, or employment. As stated by OECD [1]:

“The rationale underlying the measurement of industry or market concentration is the industrial organization economic theory which suggests that, other things being equal, high levels of market concentration are more conducive to firms engaging in monopolistic practices which leads to misallocation of resources and poor economic performance. Market concentration in this context is used as one possible indicator of market power.”

Increasing market concentration causes decreasing competition and efficiency and increasing market power. Any such trends are being monitored by the business community and by government antitrust authorities such as the U.S. Department of Justice (DOJ) and the Federal Trade Commission (FTC) [2].

As measures of market concentration, the best-known candidates are the m-firm concentration ratio $C R_{m}$ , especially the 4-firm $C R_{4}$ , and the Herfindahl-Hirschman index H after Herfindahl [3] and Hirschman [4] (see, e.g., [5], pp. 116–118; [6], pp. 97–101; [7], Ch. 8). The $C R_{m}$ is defined as the combined market share of the m largest firms within the market whereas H equals the sum of all the squared market shares. Of the two measures, H appears to have become the generally preferred one in terms of its properties (e.g., [5], pp. 116–118; [7], Ch. 8, pp. 610–615). In terms of the merger guidelines of the U.S. DOJ and FTC [2], the earliest 1968 Merger Guidelines utilized the $C R_{4}$ while the later guidelines, the most recent being the 2010 Horizontal Merger Guidelines, have been using the H index as a screening tool for potential antitrust concerns raised by a proposed merger.

These two measures $C R_{m}$ and H are sufficiently different that no precise functional relationship can exist between them. Nevertheless, it would be informative and potentially lead to approximate relationships if bounds and inequalities between the measures can be derived. Such work was done by Pautler [8], Kwoka [9], and Sleuwaegen et al. [10, 11], obtaining bounds on H in terms of $C R_{m}$ . Their work was done, at least in part, in response to the change in the U.S. merger guidelines, replacing the four-firm concentration ratio $C R_{4}$ with the H index. Results from actual market-share data showed that the absolute variation in values of H increased greatly with increasing $C R_{4}$ .

Since these early explorations of potential H- $C R_{4}$ relationships, there appears to have been no reported attempt to verify, correct, or expand on these results. It is the purpose of the present paper to take another critical look at those earlier findings using a more rigorous and transparent approach, resulting in some corrections or modifications and alternative formulations. The analytic approach used is that of majorization theory [12, 13] supported by data from computer simulation generating random market-share distributions. Some real market-share data are also being used.

If the objective were to simply determine the “best” function to describe the relationship between H and $C R_{m}$ (or vice versa), then some statistical model could be explored using regression analysis. Such analysis could be performed for real or simulated market share data. Kwoka [9] reported one such effort by relating the logarithm $\log C R_{m}$ linearly to $\log H$ for m = 2 and m = 4 and obtained quite a good fit to real market-share data. More recently, Pavic et al. [14] fitted real data to a model in which $C R_{4}$ is expressed as a power function of H. Those authors fitted market-share data at different levels of aggregation and obtained good model fits. By contrast, instead of using a function that aims to relate each value of H to an approximate single value of $C R_{m}$ or vice versa, the approach used in the present paper uses majorization theory to develop bounds that can in turn be used to approximately relate one measure to another. This approach also provides tolerance or error limits within which the value of H has to lie given any particular value of $C R_{m}$ and vice versa.

Majorization theory as used in this paper has become a well-established approach to a wide variety of problems and fields of study. Since the celebrated book by Marshall and Olkin [13], there has been a surge of interest in potential applications of majorization theory. The more recent edition [12] provides a more up-to-date account of the broad spectrum of applications. One of the early applications was by economists interested in the measurement of income inequality, notably in the work of Dalton and Lorenz (see [12], Ch. 1). In fact, the notion of economic inequality is closely linked to majorization or the Lorenz order [15]. Other applications have been reviewed by Arnold [16] and some relate to fields as diverse as quantum mechanics [17] and statistical variation [18]. The theory is particularly useful for establishing inequalities and extreme values of functions of discrete distributions. This is precisely the reason for using majorization theory in the present paper involving market-share distributions.

Since $C R_{m}$ , particularly $C R_{4}$ , and H are by far the most popular measures of market concentration and since market concentration is an indicator of competition, efficiency, and market power of firms within a market or industry, it is important to economists, policy makers and anyone with interests in such issues that any relationships between the two measures are accurate and reliable and based on an approach that is rigorous, complete, and explained in sufficient detail for verification. This has been the objective of the present paper.

While any relationship between H and $C R_{m}$ can only be approximate, a more accurate estimation or prediction for H may be possible from some of the largest individual market shares such as those on which $C R_{m}$ is based. Such a formulation would be important since market-share data are frequently reported for some of the largest firms with the smaller firms being combined into an “other” category. This type of formulation is also being considered in this paper.

2. Theory

2.1. Some introductory definitions

In order to appreciate the logic behind majorization theory, some definitions and properties are needed. In particular and conceptually, a vector or distribution $X_{n} = (x_{1}, ..., x_{n})$ is said to be majorized by a second vector (distribution) $Y_{n} = (y_{1}, ..., y_{n})$ if the components of $X_{n}$ are “more nearly equal”, “more evenly distributed”, or “less concentrated” than are the components of $Y_{n}$ . Formally, with $X_{n}$ and $Y_{n}$ ordered decreasingly as

x_{1} \geq x_{2} \geq \dots \geq x_{n}, y_{1} \geq y_{2} \geq \dots \geq y_{n}

(1)

$X_{n}$ is majorized by $Y_{n}$ , denoted by $X_{n} ≺ Y_{n}$ , under the following conditions:

X_{n} ≺ Y_{n} \text{ if } \begin{cases} \sum_{i = 1}^{j} x_{i} \leq \sum_{i = 1}^{j} y_{i}, & j = 1, ..., n - 1 \\ \text{and} \\ \sum_{i = 1}^{n} x_{i} = \sum_{i = 1}^{n} y_{i} . \end{cases}

(2)

For example, if $X_{n}$ is a vector (distribution) such that $x_{i} \geq 0$ for all i and $\sum_{i = 1}^{n} x_{i} = 1$ , then the following majorization applies:

(\frac{1}{n}, ..., \frac{1}{n}) ≺ (x_{1}, ..., x_{n}) ≺ (1,0, ..., 0)

([12], p. 9).

The concept of Schur-convexity of a function is a property that preserves the order of majorization. That is, a function f is Schur-convex if its value increases as its arguments become increasingly uneven or concentrated. Formally, f is Schur-convex if

X_{n} ≺ Y_{n} implies f (X_{n}) \leq f (Y_{n})

(3)

and f is strictly Schur-convex if the inequality in (3) is strict when $X_{n}$ is not a permutation of $Y_{n}$ .

As a simple example, consider the two vectors or distributions $X_{4} = (0.40, 0.30, 0.20, 0.10)$ and $Y_{4} = (0.50, 0.30, 0.19, 0.01)$ . It is readily seen from the definition in (2) that $X_{4}$ is majorized by $Y_{4}$ . Thus, if $X_{4}$ and $Y_{4}$ were to represent two market-share distributions and if a measure of market concentration is Schur-convex, then it follows from (3) that $Y_{4}$ shows a higher degree of concentration than does $X_{4}$ . This result would seem entirely reasonable from simply looking at the forms of the two distributions.
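The example above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper names `is_majorized_by` and `sum_of_squares` are illustrative assumptions, and the sum of squares is used as the Schur-convex function because it is the form taken by H in (6).

```python
from itertools import accumulate


def is_majorized_by(x, y, tol=1e-12):
    """Return True if x is majorized by y in the sense of (2).

    Both vectors are first sorted decreasingly as in (1); `tol` is an
    illustrative tolerance for floating-point comparisons.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    px = list(accumulate(sorted(x, reverse=True)))  # partial sums of x
    py = list(accumulate(sorted(y, reverse=True)))  # partial sums of y
    partial_ok = all(a <= b + tol for a, b in zip(px[:-1], py[:-1]))
    totals_equal = abs(px[-1] - py[-1]) <= tol
    return partial_ok and totals_equal


def sum_of_squares(v):
    """A strictly Schur-convex function (the form of H in (6))."""
    return sum(t * t for t in v)


X4 = (0.40, 0.30, 0.20, 0.10)
Y4 = (0.50, 0.30, 0.19, 0.01)

print(is_majorized_by(X4, Y4))   # True: X4 is majorized by Y4
print(is_majorized_by(Y4, X4))   # False: the reverse does not hold
print(sum_of_squares(X4) < sum_of_squares(Y4))  # True, as (3) requires
```

The sums of squares are 0.3000 for $X_{4}$ and 0.3762 for $Y_{4}$, consistent with $Y_{4}$ being the more concentrated distribution.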

2.2. Bounds on H

Let $S_{n} = (s_{1}, ..., s_{n})$ denote the set or distribution of market shares for some particular industry or market with n firms where $\sum_{i = 1}^{n} s_{i} = 1$ (or 100%). With the market shares decreasingly ordered as in (1), i.e.,

s_{1} \geq s_{2} \geq \dots \geq s_{n}

(4)

the m-firm concentration ratio is defined as

C R_{m} = \sum_{i = 1}^{m} s_{i}

(5)

and the Herfindahl-Hirschman index as

H = \sum_{i = 1}^{n} s_{i}^{2} .

(6)

An objective is then to determine bounds on H in terms of $C R_{m}$ , m, and n.

A lower bound on H is relatively simple to derive. In the expression

H = \sum_{i = 1}^{n} s_{i}^{2} = \sum_{i = 1}^{m} s_{i}^{2} + \sum_{i = m + 1}^{n} s_{i}^{2}

(7)

each of the terms is strictly Schur-convex in their respective arguments ([12], pp. 138–139). Furthermore, the following majorizations exist:

(\frac{C R_{m}}{m}, ..., \frac{C R_{m}}{m}) ≺ (s_{1}, ..., s_{m}), (\frac{1 - C R_{m}}{n - m}, ..., \frac{1 - C R_{m}}{n - m}) ≺ (s_{m + 1}, ..., s_{n})

(8)

which are rather apparent from (2) (see also [12], pp. 9, 21). It then follows from (3), (8), and the strict Schur-convexity of the terms in (7) that

H \geq \frac{C R_{m}^{2}}{m} + \frac{{(1 - C R_{m})}^{2}}{n - m}

(9)

which is a lower bound on H.

Derivation of an upper bound on H will be based in part on the following theorem given by Marshall et al. ([12], pp. 192–193) and attributed to Kemperman [19]: For a set of real-valued data $X_{n} = (x_{1}, ..., x_{n})$ with $L \leq x_{i} \leq U$, $i = 1, ..., n$, there exists a unique integer $k \in \{0, 1, ..., n\}$ and a unique $θ \in [L, U)$ where

K - 1 \leq k < K, K = \frac{\sum_{i = 1}^{n} x_{i} - n L}{U - L}

(10)

and

θ = \sum_{i = 1}^{n} x_{i} - (n - k - 1) L - k U

(11)

such that

X_{n} ≺ (\underbrace{U, ..., U}_{k}, θ, \underbrace{L, ..., L}_{n - k - 1}) .

(12)

It is readily apparent from the proof given by Marshall et al. ([12], pp. 192–193) that this theorem also holds when $L = \min \{x_{1}, ..., x_{n}\}$ and $U = \max \{x_{1}, ..., x_{n}\}$ .

Using this theorem and the fact that

s_{m + 1} \leq s_{i} \leq C R_{m} - (m - 1) s_{m + 1}, i = 1, ..., m

it follows from (10) and (11) that K = 1 and hence k = 0 and $θ = C R_{m} - (m - 1) s_{m + 1}$ so that

(s_{1}, ..., s_{m}) ≺ (C R_{m} - (m - 1) s_{m + 1}, s_{m + 1}, ..., s_{m + 1})

(13)

which is also given in Marshall and Olkin ([13], p. 133). From (13) and the (strict) Schur-convexity of $\sum_{i = 1}^{m} s_{i}^{2}$ , it follows that

\sum_{i = 1}^{m} s_{i}^{2} \leq {[C R_{m} - (m - 1) s_{m + 1}]}^{2} + (m - 1) s_{m + 1}^{2} .

(14)

Furthermore, since $0 \leq s_{i} \leq s_{m + 1}, i = m + 1, ..., n,$ it is seen from (10), (11) and (12) that $θ = 1 - C R_{m} - k s_{m + 1}$ and

(s_{m + 1}, ..., s_{n}) ≺ (\underbrace{s_{m + 1}, ..., s_{m + 1}}_{k}, 1 - C R_{m} - k s_{m + 1}, 0, ..., 0)

(15)

which, together with the (strict) Schur-convexity of $\sum_{i = m + 1}^{n} s_{i}^{2}$ , results in

\sum_{i = m + 1}^{n} s_{i}^{2} \leq k s_{m + 1}^{2} + (1 - C R_{m} - k s_{m + 1})^{2} .

(16)

Treating k as a continuous variable, it is clear that the bound in (16) is strictly convex in $k \in [K - 1, K)$, where $K = (1 - C R_{m}) / s_{m + 1}$ from (10) with L = 0 and $U = s_{m + 1}$ . For this K, the bound in (16) is seen to equal $(1 - C R_{m}) s_{m + 1}$ at both k = K – 1 and k = K (although k is strictly less than K). Since a convex function on an interval cannot exceed the larger of its endpoint values, it follows that

\sum_{i = m + 1}^{n} s_{i}^{2} \leq (1 - C R_{m}) s_{m + 1} .

(17)

Thus, from (7), (14), and (17),

H = \sum_{i = 1}^{n} s_{i}^{2} \leq U_{H} = [C R_{m} - (m - 1) s_{m + 1}]^{2} + (m - 1) s_{m + 1}^{2} + (1 - C R_{m}) s_{m + 1} .

(18)

From (18), the second-order partial derivative is $\partial^{2} U_{H} / \partial s_{m + 1}^{2} = 2 m (m - 1)$ so that $U_{H}$ is convex in $s_{m + 1}$ (strictly so if m > 1). Because of this convexity and since

\frac{1 - C R_{m}}{n - m} \leq s_{m + 1} \leq \min \{\frac{C R_{m}}{m}, 1 - C R_{m}\}

(19)

the largest value of $U_{H}$ occurs at an endpoint of the interval in (19) so that, from (18) and (19),

U_{H} = \begin{cases} U_{H 1} = [C R_{m} - (m - 1) (\frac{1 - C R_{m}}{n - m})]^{2} + (n - 1) (\frac{1 - C R_{m}}{n - m})^{2} & \text{for } s_{m + 1} = \frac{1 - C R_{m}}{n - m} \\ U_{H 2} = \frac{C R_{m}}{m} & \text{for } s_{m + 1} = \frac{C R_{m}}{m} \\ U_{H 3} = [1 - m (1 - C R_{m})]^{2} + m (1 - C R_{m})^{2} & \text{for } s_{m + 1} = 1 - C R_{m} \end{cases}

(20)

and hence

H \leq \max \{U_{H 1}, \min \{U_{H 2}, U_{H 3}\}\}

(21)

which is then an upper bound on H in terms of any given m, n, and $C R_{m}$ .
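The bounds in (9) and (21) are straightforward to evaluate. The following sketch shows one way to do so; the function names and the five market shares are hypothetical and chosen only for illustration, and the code assumes n > m.

```python
def h_lower_bound(cr_m, m, n):
    """Lower bound on H from (9); assumes n > m."""
    return cr_m ** 2 / m + (1.0 - cr_m) ** 2 / (n - m)


def h_upper_bound(cr_m, m, n):
    """Upper bound on H from (21), built from U_H1, U_H2, U_H3 in (20)."""
    t = (1.0 - cr_m) / (n - m)            # smallest admissible s_{m+1} in (19)
    u_h1 = (cr_m - (m - 1) * t) ** 2 + (n - 1) * t ** 2
    u_h2 = cr_m / m
    u_h3 = (1.0 - m * (1.0 - cr_m)) ** 2 + m * (1.0 - cr_m) ** 2
    return max(u_h1, min(u_h2, u_h3))


# Hypothetical market shares, decreasingly ordered as in (4).
shares = [0.30, 0.25, 0.20, 0.15, 0.10]
m, n = 2, len(shares)
cr_m = sum(shares[:m])                    # CR_2 = 0.55
h = sum(s * s for s in shares)            # H = 0.225

# Approximately 0.21875 <= 0.225 <= 0.275
print(h_lower_bound(cr_m, m, n), h, h_upper_bound(cr_m, m, n))
```

For these hypothetical shares, $U_{H 1} = 0.25$, $U_{H 2} = 0.275$, and $U_{H 3} = 0.415$, so the bound in (21) is determined by $U_{H 2}$.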

The bound in (21) is equivalent to one given by [10, 11] with two exceptions. First, their bound did not incorporate the term $U_{H 3}$ in (20). Second, their bound involved a potential fraction term that was indeterminable, but presumably sufficiently small that it could be ignored. Sleuwaegen et al. ([11], p. 628) presented an upper bound as being equivalent to the largest of $U_{H 1}$ and $U_{H 2}$ in (20), “except for a fraction if the maximum is $C_{k} / k$ and $k (1 - C_{k}) / C_{k}$ is non-integer”, with their $C_{k}$ and k being equivalent to $C R_{m}$ and m used in the present paper. See also ([10], pp. 206–207).

For a large number of firms n (strictly for $n \to \infty$ ), it is clear from (9) and (20), (21) that

\frac{C R_{m}^{2}}{m} \leq H \leq \max \{C R_{m}^{2}, \min \{U_{H 2}, U_{H 3}\}\} .

(22)

By (a) defining $V = U_{H 2} - U_{H 3}$ , which is strictly concave in $C R_{m}$ for any given m since $\partial^{2} V / \partial C R_{m}^{2} = - 2 m (m + 1) < 0$ , (b) setting V = 0, and (c) solving the resulting second-order equation for $C R_{m}$ , it is found that

U_{H 2} \geq U_{H 3} if, and only if, \frac{m^{3}}{m^{2} (m + 1)} \leq C R_{m} \leq \frac{m^{3} + 1}{m^{2} (m + 1)} .

(23)

Similarly, for $n \to \infty$ and $W = C R_{m}^{2} - U_{H 3}$, with $\partial^{2} W / \partial C R_{m}^{2} = 2 (1 - m^{2} - m) < 0$,

C R_{m}^{2} \geq U_{H 3} if, and only if, \frac{m (m - 1) + 1}{m (m + 1) - 1} \leq C R_{m} \leq 1 .

(24)

It is apparent from (22), (23) and (24) that the upper bound in (22) equals $U_{H 3}$ only for m = 1 when, from (20), $U_{H 3} = C R_{1}^{2} + (1 - C R_{1})^{2}$ . This also becomes the bound in (21) for m = 1 and for n not large since then, from (20), $U_{H 1} = C R_{1}^{2} + (1 - C R_{1})^{2} / (n - 1)$ . Also, note that, from (20), $U_{H 1} = U_{H 3}$ when n = m + 1 for any $m \geq 1$ . Consequently, it follows from (22), (23) and (24) that

\frac{C R_{m}^{2}}{m} \leq H \leq \max \{C R_{m}^{2}, \frac{C R_{m}}{m}\} = \begin{cases} C R_{m}^{2} & \text{for } C R_{m} \geq \frac{1}{m}, m > 1, n \to \infty \\ \frac{C R_{m}}{m} & \text{for } C R_{m} \leq \frac{1}{m}, m > 1, n \to \infty . \end{cases}

(25)

These (asymptotic) bounds are equivalent to those given by [10, 11] with one exception: the upper bound in (25) when $C R_{m} < 1 / m$ contains no indeterminate fraction term mentioned by those authors. Furthermore, the upper bound given by [8, 9] for m = 4 is $C R_{4} / 4$ irrespective of the value of $C R_{4}$ and whether n is finite or not. See also Martin ([20], p. 337).

2.3. Some comments

It may perhaps be tempting to assume that for any given $C R_{m}$ , m, and n, the upper bound on H corresponds to the market-share distribution

(1 - (n - 1) (\frac{1 - C R_{m}}{n - m}), \frac{1 - C R_{m}}{n - m}, ..., \frac{1 - C R_{m}}{n - m})

(26)

(see, e.g., [10, 11]). However, this is not necessarily true because of (21) and the fact that, as can easily be verified, the value of H for the distribution in (26) equals the $U_{H 1}$ in (20).

As a counterexample, consider the market-share distribution $S_{5} = (0.26, 0.25, 0.25, 0.23, 0.01)$ for which H = 0.2456. For m = 2, $C R_{2} = 0.51$ , and $n = 5,$ the value of H for the corresponding distribution in (26) becomes 0.2269, which equals $U_{H 1}$ , but which is less than the H = 0.2456. Rather, the upper bound from (21) becomes $U_{H 2} = 0.51 / 2 = 0.2550 .$

While in the preceding counterexample, $C R_{m} > 1 / m$ , consider next the following example with $C R_{m} < 1 / m$ :

S_{27} = (0.10, 0.05, \dots, 0.05, 0.01, \dots, 0.01)

for which H = 0.0510. For m = 3, $C R_{3} = 0.2000$ , and n = 27, the value of H for the corresponding distribution in (26) becomes 0.0467, which equals $U_{H 1}$ in (20), but which is less than the H = 0.0510. In fact, the correct upper bound on H from (21) is $U_{H 2} = 0.2000 / 3 = 0.0667 .$
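Both counterexamples can be reproduced with a few lines of code. The sketch below is illustrative only and its function names are assumptions. The composition of $S_{27}$ (sixteen shares of 0.05 and ten shares of 0.01) is not stated explicitly above; it is inferred here from the requirements that the 27 shares sum to 1 and that H = 0.0510.

```python
def herfindahl(shares):
    """H from (6)."""
    return sum(s * s for s in shares)


def h_of_distribution_26(cr_m, m, n):
    """Value of H for the distribution in (26), which equals U_H1 in (20)."""
    t = (1.0 - cr_m) / (n - m)
    return (1.0 - (n - 1) * t) ** 2 + (n - 1) * t ** 2


def upper_bound_21(cr_m, m, n):
    """Upper bound on H from (21)."""
    u_h1 = h_of_distribution_26(cr_m, m, n)
    u_h2 = cr_m / m
    u_h3 = (1.0 - m * (1.0 - cr_m)) ** 2 + m * (1.0 - cr_m) ** 2
    return max(u_h1, min(u_h2, u_h3))


S5 = [0.26, 0.25, 0.25, 0.23, 0.01]
# Inferred composition: 1 + 16 + 10 = 27 firms whose shares sum to 1.
S27 = [0.10] + [0.05] * 16 + [0.01] * 10

for shares, m in ((S5, 2), (S27, 3)):
    n = len(shares)
    cr_m = sum(shares[:m])
    print(
        f"n={n} m={m} CR={cr_m:.4f} "
        f"H={herfindahl(shares):.4f} "
        f"H(26)={h_of_distribution_26(cr_m, m, n):.4f} "
        f"bound(21)={upper_bound_21(cr_m, m, n):.4f}"
    )
```

In both cases the value of H for the distribution in (26) falls below the actual H, whereas the bound from (21) does not.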

For the case when all market shares are equal, i.e., $s_{i} = 1 / n$ for i = 1,…,n, it follows from (6) and (20) that $H = U_{H 1} = U_{H 2} = 1 / n$ . However, in this equal-share case, $U_{H 3}$ in (20) equals 1/n if and only if m = n – 1; otherwise, $U_{H 3} > 1 / n$ .

For the particular case when m = 1 so that the concentration ratio is simply equal to the market share $s_{1}$ of the largest firm (see the order in (4)), it is readily seen from (20) that $U_{H 1} \leq U_{H 2}$ , $U_{H 1} < U_{H 3}$ , and $U_{H 2} \leq U_{H 3}$ for $s_{1} \leq 1 / 2$ and $U_{H 2} > U_{H 3}$ for $s_{1} > 1 / 2$ . Therefore, in the m = 1 case, the upper bound on H from (21) is either $U_{H 2}$ or $U_{H 3},$ depending upon which one is the smaller.

3. Results

3.1. Simulation results

Besides the above boundary comparisons between the H and $C R_{m}$ in (5) and (6), an analysis has also been performed in terms of a scatter diagram of H versus $C R_{4}$ . The four-firm concentration ratio $C R_{4}$ was chosen since it is by far the most widely used one (e.g., [6], p. 97; [21], p. 255). Rather than using real data of limited sample size as done by [9, 10, 11], computer simulation was used to randomly generate a very large number of market-share distributions $S_{n} = (s_{1}, ..., s_{n}),$ with each $s_{i}$ and n based on random number generation.

The algorithm developed was based on the following steps:

(1) Generate n as a random integer such that $5 \leq n \leq 100$ .
(2) For each n-value generated in Step 1, generate $s_{1}, ..., s_{n - 1}$ (to the desired number of decimal places) as random numbers within the following intervals:
$\begin{array}{l} \frac{1}{n} \leq s_{1} \leq 1 \\ \frac{1 - s_{1}}{n - 1} \leq s_{2} \leq \min \{s_{1}, 1 - s_{1}\} \\ \vdots \\ \frac{1 - \sum_{j = 1}^{n - 2} s_{j}}{n - (n - 2)} \leq s_{n - 1} \leq \min \{s_{n - 2}, 1 - \sum_{j = 1}^{n - 2} s_{j}\} \end{array}$
that is,
$\frac{1 - \sum_{j = 1}^{i - 1} s_{j}}{n - (i - 1)} \leq s_{i} \leq \min \{s_{i - 1}, 1 - \sum_{j = 1}^{i - 1} s_{j}\}$ for $i = 2, ..., n - 1$ .
(3) Compute $s_{n} = 1 - \sum_{j = 1}^{n - 1} s_{j}$ .
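The three steps can be sketched in code as follows. This is an illustrative reading of the algorithm rather than the program actually used: it assumes that each $s_{i}$ is drawn uniformly within its interval (the distribution is not specified above), it omits rounding to a fixed number of decimal places, and the function name and seed are arbitrary.

```python
import random


def random_market_shares(rng, n_min=5, n_max=100):
    """Generate one decreasingly ordered market-share distribution S_n."""
    n = rng.randint(n_min, n_max)             # Step 1: random n in [5, 100]
    shares = [rng.uniform(1.0 / n, 1.0)]      # Step 2: 1/n <= s_1 <= 1
    remaining = max(0.0, 1.0 - shares[0])     # 1 minus the shares drawn so far
    for i in range(2, n):                     # Step 2: s_2, ..., s_{n-1}
        lower = remaining / (n - (i - 1))
        upper = min(shares[-1], remaining)
        s_i = rng.uniform(lower, upper)       # uniform draw is an assumption
        shares.append(s_i)
        remaining = max(0.0, remaining - s_i)
    shares.append(remaining)                  # Step 3: s_n takes the remainder
    return shares


rng = random.Random(2018)                     # arbitrary seed for repeatability
sample = [random_market_shares(rng) for _ in range(10_000)]

# (CR_4, H) pairs of the kind plotted in a scatter diagram of H versus CR_4.
points = [(sum(s[:4]), sum(x * x for x in s)) for s in sample]
print(len(points), min(len(s) for s in sample), max(len(s) for s in sample))
```

The lower limit of each interval ensures that the remaining firms can still be assigned shares no larger than $s_{i}$, which is what keeps the generated shares in decreasing order.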

A total of 10,000 such $S_{n}$ distributions were generated and the corresponding values for H and $C R_{4}$ were computed. The results are summarized in the scatter diagram in Fig. 1.

The solid curves in Fig. 1 represent the boundary conditions in (25) and appear to be entirely appropriate even though the number of firms n is finite $(5 \leq n \leq 100)$ . Thus, the lower solid curve in Fig. 1 represents the lower bound on H as

L_{H} = \frac{C R_{4}^{2}}{4}

(27)

and upper solid curve represents the upper bound as

U_{H} = \begin{cases} C R_{4}^{2} & \text{for } C R_{4} \geq \frac{1}{4} \\ \frac{C R_{4}}{4} & \text{for } C R_{4} \leq \frac{1}{4} . \end{cases}

(28)

It is clear from Fig. 1 that the variation in potential values of H for any given $C R_{m}$ (with m = 4 in Fig. 1) tends to increase dramatically with increasing $C R_{m}$ . Most interesting is perhaps the systematic nature of the variation in H-values as restricted by the bounds $L_{H}$ and $U_{H}$ in (27) and (28). The form of the scatter graph in Fig. 1 is generally similar to results given by [8, 9, 10, 11], although their results are based on a limited number of data points. See also Pavic et al. [14].

Another interesting observation can be made with respect to the relative variation of H-values for varying $C R_{m}$ . Such variation can be measured in terms of the range $U_{H} - L_{H}$ relative to the midrange defined by the estimated (predicted) H as

H 1 = \frac{1}{2} (L_{H} + U_{H})

(29)

with $L_{H}$ and $U_{H}$ defined in (27) and (28). Thus, the relative variation $R V = (U_{H} - L_{H}) / H 1$ becomes 6/5 for $C R_{4} \geq \frac{1}{4}$ whereas $R V = 2 (1 - C R_{4}) / (1 + C R_{4})$ for $C R_{4} \leq \frac{1}{4}$ . That is, in spite of the considerable absolute variation in H for each value of $C R_{4}$ as seen from Fig. 1 when $C R_{4}$ is not small $(C R_{4} > \frac{1}{4})$ , the relative variation remains constant. While these results are based on m = 4, there seems no reason to suspect that the results would not hold generally for all m.
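The two expressions for RV follow directly from (27), (28) and (29) and can be confirmed numerically. The short sketch below is illustrative; the function name and the trial values of $C R_{4}$ are arbitrary.

```python
from math import isclose


def relative_variation(cr4):
    """RV = (U_H - L_H) / H1 with L_H, U_H, H1 from (27), (28), (29)."""
    lower = cr4 ** 2 / 4                               # L_H in (27)
    upper = cr4 ** 2 if cr4 >= 0.25 else cr4 / 4       # U_H in (28)
    h1 = 0.5 * (lower + upper)                         # midrange H1 in (29)
    return (upper - lower) / h1


for cr4 in (0.10, 0.25, 0.50, 0.90):
    print(cr4, round(relative_variation(cr4), 4))

# Constant 6/5 at or above 1/4; 2(1 - CR_4)/(1 + CR_4) below 1/4.
print(isclose(relative_variation(0.50), 6 / 5))
print(isclose(relative_variation(0.10), 2 * (1 - 0.10) / (1 + 0.10)))
```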

3.2. H as function of $C R_{m}$

From the different expressions for H and $C R_{m}$ in (5) and (6) and from the scatter graph in Fig. 1 and those in [8, 9, 10, 11], there cannot be any precise functional relationship between H and $C R_{m}$ in general. However, a very approximate relationship can be formulated in terms of the lower bound $L_{H}$ in (9) and the upper bound $U_{H}$ in (21). The fact that the true value of H for any given market-share distribution $S_{n} = (s_{1}, ..., s_{n})$ falls within the interval $[L_{H}, U_{H}]$ can be expressed as

H = \frac{1}{2} (L_{H} + U_{H}) \pm \frac{1}{2} (U_{H} - L_{H}) .

(30)

Note that the first term in (30) is the same as H1 in (29) (for m = 4) whereas the second one is a tolerance or error term.

If the total number of firms n in a market or industry is known, then $L_{H}, U_{H},$ and (30) can be computed from (9) and (21). For very large n and for $m > 1$ , those computations can be done from (25) or from (27) and (28) for $m = 4 .$ Furthermore, when n is not large, the bounds in (25) (and (27) and (28)) still apply. That is,

L_{H} = \frac{C R_{m}^{2}}{m} \leq H \leq U_{H} = \begin{cases} C R_{m}^{2} & \text{for } C R_{m} \geq \frac{1}{m} \\ \frac{C R_{m}}{m} & \text{for } C R_{m} \leq \frac{1}{m} \end{cases}

(31)

for any n and $m > 1$ . This lower bound follows immediately from (9) by ignoring the term involving n whereas the upper bound can be proved as follows. First, the inequality in (18) can be expressed as

U_{H} = C R_{m}^{2} - s_{m + 1} [(2 m - 1) C R_{m} - m (m - 1) s_{m + 1} - 1]

= C R_{m}^{2} - s_{m + 1} A

(32)

where A is strictly decreasing in $s_{m + 1}$ . Second, for any given m and $C R_{m}$ , the maximum possible value of $s_{m + 1}$ will necessarily occur when $s_{i} = C R_{m} / m$ for $i = 1, ..., m + 1$ in which case

A = m C R_{m} - 1 .

(33)

Then, from (33), $A \geq 0$ for $C R_{m} \geq 1 / m$ which, together with (32), gives $U_{H} \leq C R_{m}^{2}$ for $C R_{m} \geq 1 / m .$ Finally, since the $U_{H 2}$ in (20) is independent of n, this completes the proof of $U_{H}$ in (31).

Equivalent bounds can also be derived for $C R_{m}$ in terms of H. In fact, it follows immediately from (31) that

U_{C R m} = \sqrt{m H} \geq C R_{m} \geq L_{C R m} = \begin{cases} \sqrt{H} & \text{for } H \geq 1 / m^{2} \\ m H & \text{for } H \leq 1 / m^{2} . \end{cases}

(34)

As in the case of (31), the bounds in (34) hold for any n and $m > 1$ (and not just when $n \to \infty$ ). Then, as in the case of H1 in (29) (for m = 4) and H in (30), the following formulation applies to $C R_{m}$ :

C R_{m} = C R_{m}^{1} \pm \frac{1}{2} (U_{C R m} - L_{C R m}), C R_{m}^{1} = \frac{1}{2} (L_{C R m} + U_{C R m}) .

(35)

In order to explore these and other formulations discussed subsequently, various market-share distributions were randomly generated using the computer algorithm described in Subsection 3.1 and with m = 4 and $n \in [5, 100]$ . The results are summarized in Table 1 for 25 different distributions. As an example of the computation involved for (29), (30) and (35), consider Data Set 1 in Table 1 with $C R_{4} = 0.2881$ and $H = 0.0459$ . From (27), (28) and (29), $L_{H} = 0.0208$ , $U_{H} = 0.0830$ , and $H 1 = 0.0519$ so that, from (30), $H = 0.0519 \pm 0.0311$ or [0.0208, 0.0830]. That is, the value of H is roughly equal to 0.0519, but it has to lie in the interval between 0.0208 and 0.0830. The true value of H is 0.0459 in Table 1. The computations with respect to $C R_{4}$ are similarly done for (34) and (35) with $m = 4$ .
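The computation for Data Set 1 can be retraced with the following sketch, which evaluates (31) and (34) and the midpoint forms in (30) and (35). The function names are illustrative assumptions; only the values $C R_{4} = 0.2881$ and H = 0.0459 are taken from Table 1.

```python
from math import sqrt


def h_interval(cr_m, m):
    """Bounds (L_H, U_H) on H from (31)."""
    lower = cr_m ** 2 / m
    upper = cr_m ** 2 if cr_m >= 1 / m else cr_m / m
    return lower, upper


def cr_interval(h, m):
    """Bounds (L_CRm, U_CRm) on CR_m from (34)."""
    upper = sqrt(m * h)
    lower = sqrt(h) if h >= 1 / m ** 2 else m * h
    return lower, upper


def midpoint_and_tolerance(lower, upper):
    """Midpoint estimate and error term as in (30) and (35)."""
    return 0.5 * (lower + upper), 0.5 * (upper - lower)


m = 4
cr4, h = 0.2881, 0.0459                      # Data Set 1 in Table 1

l_h, u_h = h_interval(cr4, m)
h1, h_tol = midpoint_and_tolerance(l_h, u_h)
print(f"L_H={l_h:.4f} U_H={u_h:.4f} H1={h1:.4f} +/- {h_tol:.4f}")

l_cr, u_cr = cr_interval(h, m)
cr1, cr_tol = midpoint_and_tolerance(l_cr, u_cr)
print(f"L_CR={l_cr:.4f} U_CR={u_cr:.4f} CR1={cr1:.4f} +/- {cr_tol:.4f}")
```

The first line reproduces $H = 0.0519 \pm 0.0311$, and the second gives the midpoint 0.3060 listed as $C R_{4}^{1}$ for Data Set 1 in Table 1.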

Table 1.

Values of CR₄, H, H1, CR¹₄, H2, H3, and H4 defined in (5), (6), (29), (35), (36), (42), and (43) for 25 randomly generated market-share distributions (s₁, ..., s_n) with random n between 5 and 100.

Data Set	n	$C R_{4}$	H	H1	$C R_{4}^{1}$	H2	H3	H4
1	27	0.2881	0.0459	0.0519	0.3060	0.0723	0.0459	0.0402
2	80	0.0535	0.0125	0.0070	0.1368	0.0070	0.0125	0.0068
3	66	0.3628	0.0678	0.0823	0.3906	0.0995	0.0679	0.0689
4	60	0.1210	0.0181	0.0170	0.1707	0.0217	0.0182	0.0120
5	81	0.0652	0.0126	0.0092	0.1374	0.0092	0.0125	0.0069
6	16	0.9666	0.5764	0.5839	1.1388	0.3868	0.5765	0.5767
7	31	0.3670	0.0805	0.0842	0.4256	0.1011	0.0809	0.0736
8	28	0.2584	0.0520	0.0417	0.3320	0.0621	0.0523	0.0406
9	65	0.5115	0.1708	0.1635	0.6199	0.1601	0.1709	0.1693
10	75	0.0566	0.0133	0.0075	0.1419	0.0076	0.0133	0.0073
11	22	0.1898	0.0455	0.0282	0.3043	0.0405	0.0455	0.0273
12	52	0.1051	0.0196	0.0176	0.1792	0.0179	0.0196	0.0118
13	30	0.4724	0.1553	0.1395	0.5911	0.1434	0.1558	0.1509
14	84	0.1085	0.0158	0.0150	0.1573	0.0187	0.0157	0.0109
15	38	0.8513	0.4773	0.4529	1.0363	0.3244	0.4775	0.4779
16	74	0.1045	0.0146	0.0144	0.1500	0.0177	0.0147	0.0094
17	25	0.2131	0.0418	0.0323	0.2881	0.0476	0.0419	0.0289
18	93	0.8406	0.6755	0.4416	1.2328	0.3188	0.6756	0.6754
19	85	0.0968	0.0140	0.0133	0.1463	0.0159	0.0140	0.0091
20	98	0.1340	0.0145	0.0190	0.1494	0.0250	0.0146	0.0114
21	59	0.0751	0.0170	0.0101	0.1644	0.0112	0.0170	0.0093
22	62	0.1324	0.0184	0.0187	0.1724	0.0246	0.0185	0.0123
23	91	0.0604	0.0112	0.0080	0.1282	0.0083	0.0112	0.0066
24	79	0.1646	0.0174	0.0240	0.1667	0.0333	0.0175	0.0141
25	62	0.2096	0.0271	0.0317	0.2188	0.0465	0.0272	0.0218

It is evident from the results in Table 1 that H1 as a function of $C R_{4}$ in (29) and $C R_{4}^{1}$ as a function of H in (35) do provide some respectable indications of the values of the indices H and $C R_{4}$ . Thus, knowing the value of one of these two indices, one can make a rough estimation (prediction) about the corresponding value of the other index.

A statistical approach to determining the relationship between H and $C R_{m}$ would be the use of regression analysis using simulated data or real market-share data. Such real-data analysis has been reported by Kwoka [9] and Pavic et al. [14]. Based on the values of H and $C R_{4}$ in Table 1, the following regression model is obtained:

H 2 = a C R_{4}^{b}, a = 0.4055, b = 1.3861 .

(36)

However, the fit of this model to the data is unimpressive as seen from the values of H2 based on (36) as given in Table 1. The coefficient of determination, when properly computed [22], becomes $R^{2} = 1 - \sum {(H - H 2)}^{2} / \sum {(H - \bar{H})}^{2} = 0.77$ (where $\bar{H}$ denotes the mean of the H-values). That is, only 77% of the variation of H (about its mean) is explained by the fitted model in (36).
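As a rough check, the model in (36) and the coefficient of determination defined above can be evaluated from the rounded $C R_{4}$ and H values in Table 1. The sketch below is illustrative: it does not repeat the fitting step, it simply takes a and b as given in (36), and the function names are assumptions.

```python
# (CR_4, H) pairs for Data Sets 1-25, copied from Table 1.
TABLE_1 = [
    (0.2881, 0.0459), (0.0535, 0.0125), (0.3628, 0.0678), (0.1210, 0.0181),
    (0.0652, 0.0126), (0.9666, 0.5764), (0.3670, 0.0805), (0.2584, 0.0520),
    (0.5115, 0.1708), (0.0566, 0.0133), (0.1898, 0.0455), (0.1051, 0.0196),
    (0.4724, 0.1553), (0.1085, 0.0158), (0.8513, 0.4773), (0.1045, 0.0146),
    (0.2131, 0.0418), (0.8406, 0.6755), (0.0968, 0.0140), (0.1340, 0.0145),
    (0.0751, 0.0170), (0.1324, 0.0184), (0.0604, 0.0112), (0.1646, 0.0174),
    (0.2096, 0.0271),
]


def h2_power_model(cr4, a=0.4055, b=1.3861):
    """Predicted H from the power function in (36)."""
    return a * cr4 ** b


def r_squared(observed, predicted):
    """R^2 = 1 - sum (H - H_pred)^2 / sum (H - mean(H))^2."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


h_values = [h for _, h in TABLE_1]
h2_values = [h2_power_model(cr4) for cr4, _ in TABLE_1]

print(round(h2_values[0], 4))    # H2 for Data Set 1
# Should come out close to the 0.77 reported in the text.
print(round(r_squared(h_values, h2_values), 2))
```

Most of the unexplained variation comes from the few highly concentrated distributions (Data Sets 6, 15, and 18), where the fitted curve falls well below the observed H.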

By comparison with the results in (36), Kwoka [9] obtained the parameter estimates a = 0.315 and b = 1.724 for some real market-share data. Similarly, Pavic et al. [14] fitted a power function to real market data for different levels of aggregation or degrees of specificity of the commodity. Their results corresponded to a-values ranging from about 0.70 to 0.93 and b-values ranging from 1.74 to 1.84 for the function in (36).

3.3. H as function of $s_{1}, ..., s_{m}$ and n

The developments so far have been concerned with potential relationships between the Herfindahl-Hirschman Index H and the m-firm concentration ratio $C R_{m}$ in (5) and (6). Consider next the case when the individual market shares $s_{1}, ..., s_{m}$ for the m largest firms are known (and not just their sum $C R_{m}$ ) as well as the total number of firms n in the market or industry. In this case, m may be rather small, most typically m = 4, but m could also be 3, 8, or 10, for example. Another situation may be one in which some of the smaller market shares are simply excluded from the computation of H since their effect on H is relatively low. Whatever the case may be, it would be of interest to determine if $H = \sum_{i = 1}^{n} s_{i}^{2}$ could reasonably be approximated by some function of $s_{1}, ..., s_{m}$ and n. This was briefly considered by [11].

From the Schur-convexity of $\sum_{i = m + 1}^{n} s_{i}^{2}$ and the majorization $(\frac{1 - C R_{m}}{n - m}, ..., \frac{1 - C R_{m}}{n - m}) ≺ (s_{m + 1}, ..., s_{n})$ ([12], pp. 21, 138–139) and from (17), it follows that

A = \frac{{(1 - C R_{m})}^{2}}{n - m} \leq \sum_{i = m + 1}^{n} s_{i}^{2} \leq (1 - C R_{m}) s_{m} = B .

(37)

Then, from (7) and (37), it would seem reasonable to consider a measure approximating H as being $\sum_{i = 1}^{m} s_{i}^{2}$ plus a mean of the bounds A and B. The most obvious mean would be the simple arithmetic mean, resulting in the measure

H^{'} = \sum_{i = 1}^{m} s_{i}^{2} + \frac{1}{2} (A + B)

(38)

as being one whose values would approximate those of H. However, this measure is not Schur-convex as seen from the fact that

\frac{\partial H^{'}}{\partial s_{m - 1}} - \frac{\partial H^{'}}{\partial s_{m}} = 2 (s_{m - 1} - s_{m}) - \frac{1}{2} (1 - C R_{m})

which is not necessarily nonnegative.

An alternative formulation can be considered in terms of $\sum_{i = 1}^{m} s_{i}^{2}$ plus a weighted arithmetic mean of the bounds A and B in (37) with $s_{m} \leq C R_{m} / m$ . That is,

H_{w} = \sum_{i = 1}^{m} s_{i}^{2} + w \frac{{(1 - C R_{m})}^{2}}{n - m} + (1 - w) \frac{(1 - C R_{m}) C R_{m}}{m}, w \in [0, 1].

(39)

In order to explore the Schur-convexity of $H_{w}$ , it follows from (39) that

Δ_{i} = \frac{\partial H_{w}}{\partial s_{i}} - \frac{\partial H_{w}}{\partial s_{i + 1}} = \begin{cases} 2 (s_{i} - s_{i + 1}), & i = 1, ..., m - 1 \\ Δ_{m}, & i = m \\ 0, & i = m + 1, ..., n - 1 \end{cases}

(40)

where

Δ_{m} = 2 s_{m} + (\frac{2 (1 - C R_{m})}{m (n - m)}) [n (1 - w) - m] - \frac{1 - w}{m} .

(41)

Since this $Δ_{m}$ in (40) and (41) is not necessarily nonnegative, the $H_{w}$ cannot be Schur-convex ([12], p. 84).

From exploratory data analysis, it becomes readily apparent from various market-share distributions that values of $\sum_{i = 1}^{m} s_{i}^{2} + {(1 - C R_{m})}^{2} / (n - m)$ are generally much closer to the corresponding values of H than are the values of $\sum_{i = 1}^{m} s_{i}^{2} + (1 - C R_{m}) C R_{m} / m$ . This observation can be accounted for by choosing a large value of w in (39) such as $w = 1 - 1 / m (n - m)$ , resulting in

H 3 = \sum_{i = 1}^{m} s_{i}^{2} + \frac{{(1 - C R_{m})}^{2}}{n - m} + \frac{(1 - C R_{m}) (n C R_{m} - m)}{m^{2} {(n - m)}^{2}} .

(42)
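The step from (39) to (42) can be checked numerically. The sketch below assumes a small, hypothetical market-share vector (not taken from the paper) and illustrative function names; it computes H, the weighted form $H_{w}$ with $w = 1 - 1 / [m (n - m)]$, and the closed form H3, which should coincide with $H_{w}$ up to rounding. It requires $n > m$.

```python
def hhi(shares):
    """Herfindahl-Hirschman index H: sum of squared market shares."""
    return sum(s * s for s in shares)


def h_w(shares, m, w):
    """H_w as in (39): top-m squares plus a weighted mean of the two bounds."""
    n = len(shares)
    top = sorted(shares, reverse=True)[:m]
    cr = sum(top)                      # m-firm concentration ratio CR_m
    lower = (1 - cr) ** 2 / (n - m)    # bound A in (37)
    upper = (1 - cr) * cr / m          # bound B with s_m replaced by CR_m / m
    return sum(s * s for s in top) + w * lower + (1 - w) * upper


def h3(shares, m=4):
    """H3 as in (42), written in closed form."""
    n = len(shares)
    top = sorted(shares, reverse=True)[:m]
    cr = sum(top)
    return (sum(s * s for s in top)
            + (1 - cr) ** 2 / (n - m)
            + (1 - cr) * (n * cr - m) / (m ** 2 * (n - m) ** 2))


# Hypothetical market with n = 8 firms (shares sum to 1).
shares = [0.30, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05]
m, n = 4, len(shares)
w = 1 - 1 / (m * (n - m))

print(hhi(shares), h_w(shares, m, w), h3(shares, m))
```

For this hypothetical vector, H3 and H differ by less than 0.001, in line with the close agreement described for the simulated data.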

In order to explore how accurately values of H3 from (42) approximate those of H, the randomly generated market-share distributions described in Subsection 3.1 were used to compute H, $C R_{4}$ , and H3 from (42) with m = 4. Again, the results are given in Table 1.

It is rather striking from the results in Table 1 how closely the values of H3 agree with those of H. Their slight differences occur only in the fourth decimal place, with 0.0005 for Data Set 13 being the largest difference. In fact, if values of H are predicted to equal those of H3, it is found from the results in Table 1 that the coefficient of determination, when properly computed [22], becomes $R^{2} = 1 - \sum {(H - H 3)}^{2} / \sum {(H - \bar{H})}^{2} = 1.0000$ (rounded to four decimal places). That is, nearly all of the variation of H (about its mean $\bar{H}$ ) is explained (accounted for) by the fitted model stating that the predicted H equals H3.
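The coefficient of determination used above is one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the observed values. A minimal sketch follows; the function name and the sample numbers are illustrative assumptions, not values from Table 1.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - sum((y - y_hat)^2) / sum((y - y_bar)^2)."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    sst = sum((a - mean) ** 2 for a in actual)                  # total sum of squares
    return 1 - sse / sst


# Hypothetical H values and approximations, for illustration only.
h_true = [0.050, 0.082, 0.120, 0.175]
h_pred = [0.051, 0.081, 0.121, 0.174]
print(round(r_squared(h_true, h_pred), 4))
```

With this definition, a model that always predicts the mean of the observed values scores 0 and a perfect prediction scores 1.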

3.4. H as function of $s_{1}, ..., s_{m}$

The H3 in (42) requires knowledge of the market shares of the m largest firms within a market or industry as well as the total number of firms n. There may, of course, be cases when $s_{1}, ..., s_{m}$ are reported, but n is not reported or the true value of n is not available since the market shares of an unknown number of firms have been combined into “others.” It would therefore be desirable to have an alternative measure that approximates H, but that depends only on $s_{1}, ..., s_{m}$ .

Such a potential measure could again be considered in terms of the bounds in (37), except for setting the lower bound A = 0 to eliminate the dependence on n. When approximating $\sum_{i = m + 1}^{n} s_{i}^{2}$ with the arithmetic mean of A and B in (37), the elimination of A is partly compensated for by using $s_{m}$ instead of $s_{m + 1}$ in the bound in (17). Therefore, a new measure can be defined as

H 4 = \sum_{i = 1}^{m} s_{i}^{2} + \frac{1}{2} (1 - C R_{m}) s_{m}

(43)

which is only a function of $s_{1}, ..., s_{m}$ . Furthermore, in terms of tolerance (error) intervals, the true value of H can be expressed as

H = H 4 \pm \frac{1}{2} (1 - C R_{m}) s_{m} .

(44)

Thus, for example, for the market-share distribution $(0.20, 0.15, 0.15, 0.10,...)$ and for $m = 4$ , $C R_{4} = 0.60$ (so that $1 - C R_{4} = 0.40$ ) and $s_{4} = 0.10$ , so that, from (43), $H 4 = 0.095 + 0.020 = 0.115$ . Then, from (44), $H = 0.115 \pm 0.020$ , so that the true value of H has to fall within the interval $[0.095, 0.135]$ .
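The worked example can be reproduced with a few lines of code. This is a sketch under the assumption that only the m largest shares are supplied; the function name is illustrative. The four listed shares sum to 0.60, so $1 - C R_{4} = 0.40$ enters the second term of (43).

```python
def h4_with_tolerance(top_shares):
    """Return (H4, half-width) from (43) and (44), given the m largest shares."""
    top = sorted(top_shares, reverse=True)
    cr = sum(top)                            # CR_m
    half_width = 0.5 * (1 - cr) * top[-1]    # (1/2)(1 - CR_m) s_m
    h4 = sum(s * s for s in top) + half_width
    return h4, half_width


h4, tol = h4_with_tolerance([0.20, 0.15, 0.15, 0.10])
print(round(h4, 3), round(tol, 3))                 # 0.115 0.02
print(round(h4 - tol, 3), round(h4 + tol, 3))      # 0.095 0.135
```

Note that the lower end of the interval is simply $\sum_{i = 1}^{m} s_{i}^{2}$ and the upper end is that sum plus the bound B in (37).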

The H4 is not, however, Schur-convex as seen from the following differences between partial derivatives (using the descending order in (1)):

Δ_{i} = \frac{\partial H 4}{\partial s_{i - 1}} - \frac{\partial H 4}{\partial s_{i}} = \begin{cases} 2 (s_{i - 1} - s_{i}), & i = 2, ..., m - 1 \\ 2 (s_{i - 1} - s_{i}) - (1 - C R_{m}) / 2, & i = m \\ (3 / 2) s_{m} + (1 - C R_{m}) / 2, & i = m + 1 \\ 0, & i = m + 2, ..., n . \end{cases}

(45)

The terms in (45) are nonnegative, as required for Schur-convexity, with one exception: for $i = m$ , $Δ_{m}$ can potentially take on negative values. To place this relatively minor limitation of H4 in context, consider the following implication of the above results: among the $n (n - 1) / 2$ possible transfers of a small amount of market share from a larger to a smaller firm, only the $m - 1$ transfers to the m-th largest firm could potentially lead to a slight increase in the value of H4. All other possible transfers could only lead to a decreasing or nonincreasing H4. In most real situations, cases with $Δ_{m} < 0$ in (45) would probably be rather exceptional, so that H4 can still be considered a reasonable measure for all practical purposes.
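The exceptional case $i = m$ can be illustrated with a small numerical check. The sketch below uses hypothetical top-four shares (not from the paper) for which the $i = m$ term in (45) is negative, and shows that moving a small amount of share from the third to the fourth largest firm raises H4 even though it lowers the sum of squares of those shares.

```python
def h4(top_shares):
    """H4 from (43), given the m largest market shares."""
    top = sorted(top_shares, reverse=True)
    cr = sum(top)
    return sum(s * s for s in top) + 0.5 * (1 - cr) * top[-1]


def delta_m(top_shares):
    """The i = m term of (45): 2 (s_{m-1} - s_m) - (1 - CR_m) / 2."""
    top = sorted(top_shares, reverse=True)
    cr = sum(top)
    return 2 * (top[-2] - top[-1]) - (1 - cr) / 2


before = [0.15, 0.12, 0.11, 0.09]      # hypothetical shares, CR_4 = 0.47
after = [0.15, 0.12, 0.105, 0.095]     # 0.005 moved from firm 3 to firm 4

print(delta_m(before))                 # negative for these shares
print(h4(before), h4(after))           # H4 increases after the transfer
```

The transfer keeps $C R_{4}$ and the descending order unchanged, so the increase comes entirely from the larger $s_{m}$ in the second term of (43).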

For the randomly generated market-share distributions behind the data in Table 1, the values of H4 were also computed as shown in the table. The results show that the values of H4 tend to approximate quite closely those of H and nearly as closely as those of H3. From the data in Table 1, the coefficient of determination is $R^{2} = 1 - \sum {(H - H 4)}^{2} / \sum {(H - \bar{H})}^{2} = 0.9985$ for the fitted model $\hat{H} = H 4 .$ There would seem to be little disadvantage in using H4 instead of H3 in (42), which also incorporates the number of firms n in a market (industry). When using two decimal places, which is clearly adequate for practical purposes, the values of H3 and H4 may differ by only about $\pm 0.01 .$ These results are based on the most common choice of $m = 4$ and would probably differ somewhat for different m.

3.5. Results from real market data

The numerical results have so far been based on market-share distributions that have been randomly generated. Such results have the broadest possible implication without any particular bias or restrictions. By comparison, results from using real market-share data may depend on factors such as the market classification system used and the level of aggregation.

Nevertheless, it would be of interest to subject the above developments to some real data and compare the results with those from computer generated random samples. Therefore, readily accessible market-share data from a wide diversity of markets were used to determine the true values of $C R_{4}$ and H in (5) and (6) and their approximate values from H1 in (29), $C R_{4}^{1}$ in (35), and H3 and H4 in (42) and (43). The results are given in Table 2. While the results in Table 1 are given with four decimal places, primarily to illustrate the accuracy with which H3 estimates (predicts) H from (42), two decimal places are used in Table 2 since this is adequate for all practical purposes.

Table 2.

Values of $C R_{4}$ , H, H1, $H 1^{•}$ , $C R_{4}^{1}$ , $C R_{4}^{1 •}$ , H3, and H4 from (5), (6), (29), (47), (35), (48), (42), and (43), respectively, for some real market-share data.

n	$C R_{4}$	H	H1	$H 1^{•}$	$C R_{4}^{1}$	$C R_{4}^{1 •}$	H3	H4	Source (Market type)
16	0.50	0.10	0.16	0.08	0.47	0.55	0.09	0.09	[23] (Airline travel)
16	0.60	0.12	0.23	0.12	0.52	0.61	0.12	0.12	[23] (Airline travel)
8	0.75	0.16	0.35	0.18	0.60	0.70	0.17	0.17	[24] (U.S. distilled liquor)
10	0.64	0.14	0.26	0.13	0.56	0.65	0.14	0.13	[25] (Paints, coatings)
10	0.52	0.11	0.17	0.09	0.50	0.58	0.11	0.10	[26] (Pharmaceuticals)
15	0.54	0.10	0.18	0.09	0.47	0.55	0.10	0.10	[27] (Insurance companies)
12	0.69	0.18	0.30	0.15	0.64	0.74	0.19	0.18	[28] (Weapons exporters)
30	0.34	0.05	0.07	0.04	0.32	0.39	0.05	0.05	[29] (Car sales, Britain)
12	0.60	0.12	0.23	0.12	0.52	0.61	0.11	0.11	[30] (Auto manufacturers, US)
8	0.77	0.18	0.37	0.19	0.64	0.74	0.18	0.17	[25] (Craft beer, US)
9	0.75	0.16	0.35	0.18	0.60	0.70	0.16	0.16	[25] (Running shoe sales)
10	0.72	0.17	0.32	0.17	0.62	0.72	0.16	0.17	[25] (Top charter airlines)
10	0.68	0.18	0.29	0.15	0.64	0.74	0.16	0.17	[25] (Farm machinery, equip.)
20	0.48	0.08	0.14	0.07	0.42	0.49	0.07	0.09	[31] (Global car sales)
10	0.60	0.12	0.23	0.12	0.52	0.61	0.12	0.12	[25] (Top airlines worldwide)


It is clear from the values of $C R_{4}$ , H, H1, and $C R_{4}^{1}$ in Table 2 that the estimation (prediction) of H from any given $C R_{4}$ and vice versa is subject to substantial inaccuracy since the values of $C R_{4}^{1}$ differ considerably from those of $C R_{4}$ and similarly for H1 versus H (by up to a multiplicative factor of 2). These results differ significantly from those of Table 1 probably because the values of $C R_{4}$ and H are generally much larger in Table 2 than in Table 1. The relatively small values of $C R_{4}$ and H in Table 1 are partly due to the rather large values of n that affect the generated market-share distributions.

The estimated (predicted) values of H based on $C R_{m}$ , and those of $C R_{m}$ based on H, given in Table 2 could probably be improved upon by considering different weights for the two pairs of bounds. Thus, instead of H1 in (29), one could consider the bounds in (31) for $C R_{m} \geq 1 / m$ and define

H 1^{•} = w L_{H} + (1 - w) U_{H} = C R_{m}^{2} [1 - w (1 - \frac{1}{m})], 0 \leq w \leq 1 .

(46)

By exploring different values of w for the data in Table 2 with $m = 4$ , the value $w = 0.9$ appears appropriate. Since $1 - 0.9 (1 - 1 / 4) = 0.325 = 1.30 / 4$ , it follows from (46) that

H 1^{•} = \left( \frac{1.30}{4} \right) C R_{4}^{2} \quad \text{for } C R_{4} \geq \frac{1}{4} .

(47)

Then, setting $H 1^{•} = H$ and solving for $C R_{4}$ gives, since $\sqrt{4 / 1.30} \approx 1.75$ ,

C R_{4}^{1 •} = 1.75 \sqrt{H} \quad \text{for } H \geq 0.02 .

(48)

From the results in Table 2, it is seen that the values of $H 1^{•}$ and $C R_{4}^{1 •}$ from (47) and (48) are considerably closer to the values of H and $C R_{4}$ than are those of H1 and $C R_{4}^{1}$ .
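The conversions in (46), (47), and (48) amount to multiplying or dividing by a single coefficient. The following sketch uses illustrative function names and keeps the paper's choices $m = 4$ and $w = 0.9$ as defaults; the domain checks mirror the stated conditions $C R_{m} \geq 1 / m$ and the corresponding lower limit on H.

```python
import math


def mixing_coefficient(m=4, w=0.9):
    """Coefficient 1 - w (1 - 1/m) from (46); equals 0.325 for m = 4, w = 0.9."""
    return 1 - w * (1 - 1 / m)


def h1_dot(cr_m, m=4, w=0.9):
    """Approximate H from CR_m as in (46) and (47)."""
    if cr_m < 1 / m:
        raise ValueError("(46) is stated for CR_m >= 1/m")
    return mixing_coefficient(m, w) * cr_m ** 2


def cr_from_h1_dot(h, m=4, w=0.9):
    """Approximate CR_m from H by inverting (46), as in (48)."""
    c = mixing_coefficient(m, w)
    if h < c / m ** 2:
        raise ValueError("H is below the range implied by CR_m >= 1/m")
    return math.sqrt(h / c)


# First row of Table 2: CR_4 = 0.50 and H = 0.10.
print(round(h1_dot(0.50), 2))          # 0.08
print(round(cr_from_h1_dot(0.10), 2))  # 0.55
```

The two printed values match the $H 1^{•}$ and $C R_{4}^{1 •}$ entries in the first row of Table 2.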

However, with respect to the measures H3 in (42) and H4 in (43) and the index H, the results in the two tables are quite comparable. Thus, from the results in Table 2, it is seen that the values of H3 and H4 provide close approximations to those of H, just as is the case with Table 1. Even though H3 incorporates the additional variable n (the total number of firms in a market), the values of H4 are about as close to those of H as are the H3-values.

4. Conclusions

This careful and detailed re-examination of the boundary relationships between the Herfindahl-Hirschman index H and the m-firm concentration ratio $C R_{m}$ is important in a number of regards. The various derivations and intermediate steps are verifiable and based on a rigorous approach using majorization theory, resulting in some new findings and revisions of earlier results. Since (a) market (industrial) concentration is generally considered to be an indicator of the competitiveness, efficiency, or power of a market (or industry), (b) it is convenient and useful to be able to measure this market property, and (c) the H and $C R_{m}$ (especially $C R_{4}$ ) are the most widely used measures, it is essential that the formulations of any H- $C R_{m}$ relationships be correct, clear, and reliable. This requirement is clearly important to economists, policy planners, and others using such summary measures for any purpose.

Although there is clearly a relationship between the two indices $C R_{m}$ and H, this paper emphasizes the very approximate nature of the relationship based on bounds using majorization theory. If only the values of $C R_{m}$ or H are known, without knowledge of any of the individual market shares $s_{1}, ..., s_{n}$ or the total number of firms n in the market or industry, then their approximate relationships are given by (27), (28) and (29) for $m = 4$ and more generally by (30) and (31) as well as by (34) and (35), depending upon which is considered the dependent and which the independent (explanatory) variable.

The relationship in (34) and (35) is based on the bounds on $C R_{m}$ for any $m > 1$ . Alternatively, one could derive $C R_{m}$ as the inverse function of $H = (L_{H} + U_{H}) / 2$ in (31), with (27), (28) and (29) being the particular, but most common, case of $m = 4$ . This inverse procedure yields

C R_{m} = \begin{cases} \sqrt{\frac{2 m H}{m + 1}} & \text{for } H \geq \frac{m + 1}{2 m^{3}} \\ \frac{\sqrt{1 + 8 m H} - 1}{2} & \text{for } H \leq \frac{m + 1}{2 m^{3}} \end{cases}

(49)

However, this relationship does not necessarily produce the same results as (34) and (35), and it differs from (48).
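The piecewise inverse in (49) is straightforward to evaluate. The sketch below (function name assumed for illustration) implements both branches and can be used to confirm that they meet at the switching point $H = (m + 1) / (2 m^{3})$ , where $C R_{m} = 1 / m$ .

```python
import math


def cr_from_h_midpoint(h, m=4):
    """CR_m as the inverse of H = (L_H + U_H) / 2, following (49)."""
    threshold = (m + 1) / (2 * m ** 3)
    if h >= threshold:
        return math.sqrt(2 * m * h / (m + 1))
    return (math.sqrt(1 + 8 * m * h) - 1) / 2


# Values of H used as benchmarks in the 2010 Horizontal Merger Guidelines.
for h in (0.15, 0.25):
    print(h, round(cr_from_h_midpoint(h), 2))
```

For $m = 4$ this gives roughly 0.49 for $H = 0.15$ and 0.63 for $H = 0.25$ , which are lower than the corresponding values of about 0.68 and 0.88 obtained from (48).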

As an example of the approximate nature of the conversion from one of the indices H and $C R_{m}$ to the other, it is of interest to consider the 2010 Horizontal Merger Guidelines [2], which use H as the concentration index. Those guidelines use $H = 0.15$ and $H = 0.25$ as two of the benchmarks. From (35) with $m = 4,$ the equivalent values of $C R_{4}$ would be, respectively, $0.58 \pm 0.19$ and $0.75 \pm 0.25 .$ If individual values of $C R_{4}$ are computed from (35) and (36) and (48) and (49) for $H = 0.15$ and $H = 0.25$ , those values of $C R_{4}$ , which differ considerably, are found to have mean values of about 0.55 for $H = 0.15$ and 0.75 for $H = 0.25$ . Thus, in terms of those mean values, and from the 2010 Guidelines, the following equivalence applies:

Unconcentrated Markets: $H < 0.15$ or $C R_{4} < 0.55$
Moderately Concentrated Markets: $0.15 \leq H \leq 0.25$ or $0.55 \leq C R_{4} \leq 0.75$
Highly Concentrated Markets: $H > 0.25$ or $C R_{4} > 0.75$ .
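The three categories above translate directly into threshold checks. The sketch below uses illustrative function names and applies both criteria to three (CR₄, H) pairs taken from Table 2, which shows that the two criteria need not assign a market to the same category.

```python
def classify_by_h(h):
    """Category based on H, using the 0.15 and 0.25 benchmarks."""
    if h < 0.15:
        return "unconcentrated"
    if h <= 0.25:
        return "moderately concentrated"
    return "highly concentrated"


def classify_by_cr4(cr4):
    """Category based on CR_4, using the approximate 0.55 and 0.75 equivalents."""
    if cr4 < 0.55:
        return "unconcentrated"
    if cr4 <= 0.75:
        return "moderately concentrated"
    return "highly concentrated"


# (CR_4, H) pairs from Table 2: airline travel, distilled liquor, craft beer.
for cr4, h in [(0.50, 0.10), (0.75, 0.16), (0.77, 0.18)]:
    print(cr4, h, classify_by_cr4(cr4), "|", classify_by_h(h))
```

For the craft beer row ( $C R_{4} = 0.77$ , $H = 0.18$ ), the $C R_{4}$ criterion gives "highly concentrated" while the H criterion gives "moderately concentrated", which reflects the rough nature of the equivalence.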

While any relationship between H and $C R_{m}$ can only provide a rough approximation, the estimation (prediction) of H becomes rapidly more accurate when given some of the largest market shares $s_{1}, ..., s_{m}$ . An interesting observation is the fact that knowledge of the size of the market (industry) n does not generally have any substantial effect on such estimation accuracy. That is, the measure H4 in (43) tends to be about as close to the real measure H as is H3 in (42) as exemplified by the data in Tables 1 and 2 for m = 4.

Such support for the H4 in (43) is important since market-share data are often reported with the smaller market shares grouped into an “other” category without any specification of n. In some cases, an “other” category may account for more than 50% of the market shares without any indication of the number of firms included in that category. Furthermore, H4 together with its tolerance (error) limits in (44) provides an interval that must contain the true value of the index H. The H4 also has the advantage of having the zero-indifference property, i.e., introducing a firm with zero market share does not affect H4. In spite of the fact that H4 is not Schur-convex, it does indeed appear to provide a good approximation to H.

All the proofs and derivations in this paper involving the m-firm concentration ratio $C R_{m}$ are done for any m rather than for some specific m-value. Whenever a particular m is used, the discussion involves m = 4 since, as pointed out earlier, the 4-firm concentration ratio is the one used most frequently. If any other m were to be of particular interest, such as m = 2 or m = 8, the analysis would simply require such substitution into the appropriate equations. Some of the numerical results may, of course, differ depending upon m.

A concluding comment is also warranted about the comparison between $C R_{m}$ and H. From majorization theory as commented on in Subsection 2.1, it follows that if two market-share distributions $S_{n} = (s_{1}, ..., s_{n})$ and $R_{n} = (r_{1}, ..., r_{n})$ are comparable with respect to majorization (i.e., $S_{n}$ and $R_{n}$ can be compared as in (2)), then H and $C R_{m}$ provide the same size (order) comparison. That is, if $H (S_{n}) < H (R_{n})$ , it is implied that $C R_{m} (S_{n}) \leq C R_{m} (R_{n})$ and vice versa. This follows from the fact that $C R_{m}$ is Schur-convex and H is strictly Schur-convex. If one has to choose between the two concentration measures, and if the market shares are known for all the firms within a particular market (industry), it should be noted that H has the advantage of strict Schur-convexity and of incorporating all available information about the market shares. If the market shares $s_{1}, ..., s_{m}$ of the m largest firms, but not the market size n, are known, a good compromise would seem to be H4 in (43) and (44), bearing in mind that, as shown in (45), H4 satisfies the Schur-convexity condition except for the single case $i = m$ .
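The order-preserving property can be illustrated with a short check. The sketch below assumes the standard definition of majorization (partial sums of the descending-ordered components of one vector dominate those of the other, with equal totals); the function names and the two four-firm distributions are hypothetical.

```python
from itertools import accumulate


def hhi(shares):
    """Herfindahl-Hirschman index H."""
    return sum(s * s for s in shares)


def concentration_ratio(shares, m):
    """m-firm concentration ratio CR_m."""
    return sum(sorted(shares, reverse=True)[:m])


def majorizes(r, s, tol=1e-12):
    """True if r majorizes s (equal totals, dominating descending partial sums)."""
    if len(r) != len(s):
        raise ValueError("vectors must have the same length")
    pr = list(accumulate(sorted(r, reverse=True)))
    ps = list(accumulate(sorted(s, reverse=True)))
    return abs(pr[-1] - ps[-1]) <= tol and all(a >= b - tol for a, b in zip(pr, ps))


S = [0.3, 0.3, 0.2, 0.2]   # hypothetical, more even distribution
R = [0.5, 0.3, 0.1, 0.1]   # hypothetical, more concentrated distribution

print(majorizes(R, S))                                        # True
print(hhi(S), hhi(R))                                         # H(S) < H(R)
print(concentration_ratio(S, 2), concentration_ratio(R, 2))   # CR_2(S) <= CR_2(R)
```

When neither vector majorizes the other, the two measures are not guaranteed to rank the distributions in the same order.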

Declarations

Author contribution statement

Tarald O. Kvålseth: Conceived and designed the analysis; Analyzed and interpreted the data; Contributed analysis tools or data; Wrote the paper.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

Acknowledgements

The author wants to thank the three reviewers for their helpful and constructive comments.

References

1. OECD (Organization for Economic Co-operation and Development) 2003. OECD Glossary of Statistical Terms: Concentration. URL https://stats.oecd.org/glossary/detail.asp?ID=3165.
2. U.S. Department of Justice and the Federal Trade Commission. 2010. Horizontal Merger Guidelines.
3. Herfindahl O.C. Columbia University, U.S.A.; 1950. Concentration in the Steel Industry. Unpublished Ph.D. dissertation.
4. Hirschman A.O. University of California Press; Berkeley, CA: 1945. National Power and the Structure of Foreign Trade.
5. Gaughan P.A. fifth ed. Wiley; Hoboken, NJ: 2011. Mergers, Acquisitions, and Corporate Restructuring.
6. Andreosso B., Jacobson D. second ed. McGraw-Hill; London: 2005. Industrial Economics and Organization: a European Perspective.
7. Tremblay V.J., Horton Tremblay C. Springer; New York: 2012. New Perspectives on Industrial Organization.
8. Pautler P.A. A guide to the Herfindahl index for antitrust attorneys. Res. Law Econ. 1983;5:167–190.
9. Kwoka J.E., Jr. The Herfindahl in theory and practice. Antitrust Bull. 1985;30(winter):915–947.
10. Sleuwaegen L., Dehandschutter W. The critical choice between the concentration ratio and the H-index in assessing performance. J. Ind. Econ. 1986;XXXV:193–208.
11. Sleuwaegen L.E., DeBondt R.R., Dehandschutter W.V. The Herfindahl index and concentration ratio revisited. Antitrust Bull. 1989;34(fall):625–640.
12. Marshall A.W., Olkin I., Arnold B.C. second ed. Springer; New York: 2011. Inequalities: Theory of Majorization and its Applications.
13. Marshall A.W., Olkin I. Academic Press; San Diego, CA: 1979. Inequalities: Theory of Majorization and its Applications.
14. Pavic I., Galetic F., Piplica D. Similarities and differences between the CR and HHI as an indicator of market concentration and market power. Br. J. Econ. Manag. Trade. 2016;13(1):1–8.
15. Ibragimov M., Ibragimov R. Market demand elasticity and income inequality. Econ. Theor. 2007;32:579–587.
16. Arnold B.C. Majorization: here, there and everywhere. Stat. Sci. 2007;22:407–413.
17. Nielsen M.A., Vidal G. Majorization and the interconversion of bipartite states. Quant. Inf. Comput. 2001;1:76–93.
18. Kvålseth T.O. Bounds on sample variation measures based on majorization. Commun. Stat. Theor. Methods. 2015;44:3375–3386.
19. Kemperman J.H.B. Moment problems for sampling without replacement. I, II, III. Nederl. Akad. Wetensch. Proc. Ser. A 76 (=Indag. Math. 35) 1973;149–164, 165–180, and 181–188. [MR 49(1975)9997a,b,c; Zbl. 266(1974)62006, 62007, 62008]
20. Martin S. second ed. Blackwell; Oxford, U.K: 2002. Advanced Industrial Economics.
21. Carlton D.W., Perloff J.M. fourth ed. Addison-Wesley; Boston, MA: 2005. Modern Industrial Organization.
22. Kvålseth T.O. Cautionary note about R2. Am. Statistician. 1985;39:279–285.
23. Lijesen M.G., Nijkamp P., Rietveld P. Measuring competition in civil aviation. J. Air Transport. Manag. 2002;8:189–197.
24. Bain J.S. Wiley; New York: 1959. Industrial Organization.
25. Market Share Reporter. twenty seventh ed. Gale; 2017.
26. Editorial. New 2016 data and statistics for global pharmaceutical products and projections through 2017. ACS Chem. Neurosci. 2017;8:1635–1636. doi: 10.1021/acschemneuro.7b00253.
27. Statista. 2018. Market Share of the Leading Insurance Companies in Belgium as of 2016. https://www.statista.com/statistics/780454/market-share-leading-insurance-companies-belgium/
28. Statista. 2018. Market Share of the Leading Exporters of Major Weapons between 2013 and 2017, by Country. https://www.statista.com/statistics/267131/market-share-of-the-leadings-exporters-of-conventional-weapons/
29. SMMT - The Society of Motor Manufacturers and Traders. 2018. Best-selling Car Marques in Britain in 2018 (Q1). https://www.best-selling-cars.com/britain-uk/2018-q1-britain-best-selling-car-brands-and-models/
30. Statista. 2014. U.S. Market Share of Selected Automobile Manufacturers 2013. https://www.statista.com/statistics/249375/us-market-share-of-selected-automobile-manufacturers/
31. Carsalesbase.com. 2017. Global Car Sales Analysis 2017-Q1. http://carsalesbase.com/global-car-sales-2017-q1/

