Abstract
While the two most widely used measures of market (industrial) concentration, the m-firm concentration ratio $CR_m$ and the Herfindahl-Hirschman index H, have no precise functional relationship, they can be related by means of bounds. Such bounds and potential relationships, which have been considered in some earlier studies, are re-examined, corrected, and reformulated in this paper. The analysis uses a different approach based on majorization theory, and the results are supported by computer simulation. The bounds make it possible to determine approximate values of H from those of $CR_m$, and vice versa, for any given set of market shares. Much more accurate predictions of H-values can be obtained with knowledge of the individual market shares of the m largest firms within a market (industry), with or without knowledge of the total number of firms.
Keyword: Economics
1. Introduction
Market concentration, also often referred to as industry concentration, refers to the extent to which the market shares of the largest firms within a market (industry) account for a large proportion of economic activity such as sales, assets, or employment. As stated by OECD [1]:
“The rationale underlying the measurement of industry or market concentration is the industrial organization economic theory which suggests that, other things being equal, high levels of market concentration are more conducive to firms engaging in monopolistic practices which leads to misallocation of resources and poor economic performance. Market concentration in this context is used as one possible indicator of market power.”
Increasing market concentration causes decreasing competition and efficiency and increasing market power. Any such trends are being monitored by the business community and by government antitrust authorities such as the U.S. Department of Justice (DOJ) and the Federal Trade Commission (FTC) [2].
As measures of market concentration, the best-known candidates are the m-firm concentration ratio $CR_m$, especially the 4-firm $CR_4$, and the Herfindahl-Hirschman index H after Herfindahl [3] and Hirschman [4] (see, e.g., [5], pp. 116–118; [6], pp. 97–101; [7], Ch. 8). The $CR_m$ is defined as the combined market share of the m largest firms within the market whereas H equals the sum of all the squared market shares. Of the two measures, H appears to have become the generally preferred one in terms of its properties (e.g., [5], pp. 116–118; [7], Ch. 8, pp. 610–615). In terms of the merger guidelines of the U.S. DOJ and FTC [2], the earliest 1968 Merger Guidelines utilized $CR_4$ while the later guidelines, the most recent being the 2010 Horizontal Merger Guidelines, have used the H index as a screening tool for potential antitrust concerns raised by a proposed merger.
These two measures $CR_m$ and H are sufficiently different that no precise functional relationship can exist between them. Nevertheless, it would be informative, and could lead to approximate relationships, if bounds and inequalities between the measures could be derived. Such work was done by Pautler [8], Kwoka [9], and Sleuwaegen et al. [10, 11], obtaining bounds on H in terms of $CR_m$. Their work was done, at least in part, in response to the change in the U.S. merger guidelines, replacing the four-firm concentration ratio with the H index. Results from actual market-share data showed that the absolute variation in values of H increased greatly with increasing $CR_m$.
Since these early explorations of potential H-$CR_m$ relationships, there appears to have been no reported attempt to verify, correct, or expand on these results. The purpose of the present paper is to take another critical look at those earlier findings using a more rigorous and transparent approach, resulting in some corrections or modifications and alternative formulations. The analytic approach used is that of majorization theory [12, 13], supported by data from computer simulation generating random market-share distributions. Some real market-share data are also used.
If the objective were simply to determine the “best” function to describe the relationship between H and $CR_m$ (or vice versa), then some statistical model could be explored using regression analysis. Such analysis could be performed for real or simulated market-share data. Kwoka [9] reported one such effort by relating $\log H$ linearly to $\log CR_m$ for m = 2 and m = 4 and obtained quite a good fit to real market-share data. More recently, Pavic et al. [14] fitted real data to a model in which $CR_m$ is expressed as a power function of H. Those authors fitted market-share data at different levels of aggregation and obtained good model fits. By contrast, instead of using a function that aims to relate each value of H to an approximate single value of $CR_m$ or vice versa, the approach used in the present paper uses majorization theory to develop bounds that can in turn be used to approximately relate one measure to the other. This approach also provides tolerance or error limits within which the value of H has to lie given any particular value of $CR_m$ and vice versa.
Majorization theory as used in this paper has become a well-established approach to a wide variety of problems and fields of study. Since the celebrated book by Marshall and Olkin [13], there has been a surge of interest in potential applications of majorization theory. The more recent edition [12] provides a more up-to-date account of the broad spectrum of applications. One of the early applications was by economists interested in the measurement of income inequality, notably in the work of Dalton and Lorenz (see [12], Ch. 1). In fact, the notion of economic inequality is closely linked to majorization or the Lorenz order [15]. Other applications have been reviewed by Arnold [16], and some relate to fields as diverse as quantum mechanics [17] and statistical variation [18]. The theory is particularly useful for establishing inequalities and extreme values of functions of discrete distributions. This is precisely the reason for using majorization theory in the present paper involving market-share distributions.
Since $CR_m$, particularly $CR_4$, and H are by far the most popular measures of market concentration, and since market concentration is an indicator of competition, efficiency, and market power of firms within a market or industry, it is important to economists, policy makers, and anyone with interests in such issues that any relationships between the two measures be accurate, reliable, and based on an approach that is rigorous, complete, and explained in sufficient detail for verification. This has been the objective of the present paper.
While any relationship between H and $CR_m$ can only be approximate, a more accurate estimation or prediction for H may be possible from some of the largest individual market shares such as those on which $CR_m$ is based. Such a formulation would be important since market-share data are frequently reported for some of the largest firms with the smaller firms being combined into an “other” category. This type of formulation is also considered in this paper.
2. Theory
2.1. Some introductory definitions
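In order to appreciate the logic behind majorization theory, some definitions and properties are needed. Conceptually, a vector or distribution $x = (x_1, \ldots, x_n)$ is said to be majorized by a second vector (distribution) $y = (y_1, \ldots, y_n)$ if the components of $x$ are “more nearly equal”, “more evenly distributed”, or “less concentrated” than are the components of $y$. Formally, with $x$ and $y$ ordered decreasingly as

$$x_1 \ge x_2 \ge \cdots \ge x_n, \qquad y_1 \ge y_2 \ge \cdots \ge y_n \tag{1}$$

$x$ is majorized by $y$, denoted by $x \prec y$, under the following conditions:

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \quad k = 1, \ldots, n-1; \qquad \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \tag{2}$$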
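For example, if $x = (x_1, \ldots, x_n)$ is a vector (distribution) such that $x_i \ge 0$ for all i and $\sum_{i=1}^{n} x_i = 1$, then the following majorization applies:

$$\left(\tfrac{1}{n}, \ldots, \tfrac{1}{n}\right) \prec (x_1, \ldots, x_n) \prec (1, 0, \ldots, 0)$$

([12], p. 9).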
Schur-convexity is a property of a function that preserves the order of majorization. That is, a function f is Schur-convex if its value increases as its arguments become increasingly uneven or concentrated. Formally, f is Schur-convex if

$$x \prec y \;\Rightarrow\; f(x) \le f(y) \tag{3}$$

and f is strictly Schur-convex if the inequality in (3) is strict when $x$ is not a permutation of $y$.
As a simple example, consider the two vectors or distributions and . It is readily seen from the definition in (2) that is majorized by . Thus, if and were to represent two market-share distributions and if a measure of market concentration is Schur-convex, then it follows from (3) that shows a higher degree of concentration than does . This result would seem entirely reasonable from simply looking at the forms of the two distributions.
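The definition in (2) and the order-preserving property in (3) can be checked numerically. The following is a minimal sketch in Python; the helper names `is_majorized_by` and `herfindahl` and the two four-firm distributions are hypothetical illustrations and are not the vectors used in the example above.

```python
from typing import Sequence


def is_majorized_by(x: Sequence[float], y: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if x is majorized by y, i.e. x ≺ y in the sense of (2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    xs = sorted(x, reverse=True)  # decreasing order, as in (1)
    ys = sorted(y, reverse=True)
    if abs(sum(xs) - sum(ys)) > tol:  # the totals must agree
        return False
    partial_x = partial_y = 0.0
    for xi, yi in zip(xs[:-1], ys[:-1]):  # partial sums for k = 1, ..., n - 1
        partial_x += xi
        partial_y += yi
        if partial_x > partial_y + tol:
            return False
    return True


def herfindahl(shares: Sequence[float]) -> float:
    """Sum of squared shares, a standard example of a Schur-convex function."""
    return sum(s * s for s in shares)


# Hypothetical four-firm distributions (illustrative values only).
x = [0.4, 0.3, 0.2, 0.1]
y = [0.6, 0.2, 0.1, 0.1]

print(is_majorized_by(x, y))         # x is the more even distribution
print(herfindahl(x), herfindahl(y))  # the Schur-convex index is larger for y
```

Sorting inside the function means the inputs need not be supplied in decreasing order, and the small tolerance guards against floating-point rounding in the partial sums.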
2.2. Bounds on H
Let $s = (s_1, \ldots, s_n)$ denote the set or distribution of market shares for some particular industry or market with n firms where $\sum_{i=1}^{n} s_i = 1$ (or 100%). With the market shares decreasingly ordered as in (1), i.e.,

$$s_1 \ge s_2 \ge \cdots \ge s_n \tag{4}$$

the m-firm concentration ratio is defined as

$$CR_m = \sum_{i=1}^{m} s_i \tag{5}$$

and the Herfindahl-Hirschman index as

$$H = \sum_{i=1}^{n} s_i^2 \tag{6}$$

An objective is then to determine bounds on H in terms of $CR_m$, m, and n.
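A lower bound on H is relatively simple to derive. In the expression

$$H = \sum_{i=1}^{m} s_i^2 + \sum_{i=m+1}^{n} s_i^2 \tag{7}$$

each of the two sums is strictly Schur-convex in its respective arguments ([12], pp. 138–139). Furthermore, the following majorizations exist:

$$\left(\frac{CR_m}{m}, \ldots, \frac{CR_m}{m}\right) \prec (s_1, \ldots, s_m), \qquad \left(\frac{1 - CR_m}{n - m}, \ldots, \frac{1 - CR_m}{n - m}\right) \prec (s_{m+1}, \ldots, s_n) \tag{8}$$

which are rather apparent from (2) (see also [12], pp. 9, 21). It then follows from (3), (8), and the strict Schur-convexity of the terms in (7) that

$$H \ge \frac{CR_m^2}{m} + \frac{(1 - CR_m)^2}{n - m} \tag{9}$$

which is a lower bound on H.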
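As a numerical illustration of the definitions in (5) and (6) and of the lower bound, the sketch below assumes that the bound takes the form $CR_m^2/m + (1 - CR_m)^2/(n - m)$, which is what replacing each block of shares by its equal-share counterpart yields. The function names and the six-firm market are hypothetical.

```python
def concentration_ratio(shares, m):
    """Combined share of the m largest firms, as in (5)."""
    return sum(sorted(shares, reverse=True)[:m])


def herfindahl(shares):
    """Sum of squared market shares, as in (6)."""
    return sum(s * s for s in shares)


def h_lower_bound(cr_m, m, n):
    """Lower bound on H given CR_m, m and n (assumed form; requires n > m)."""
    if n <= m:
        raise ValueError("the bound requires n > m")
    return cr_m ** 2 / m + (1.0 - cr_m) ** 2 / (n - m)


# Hypothetical six-firm market (shares sum to 1).
shares = [0.35, 0.25, 0.15, 0.10, 0.08, 0.07]
m, n = 4, len(shares)

cr4 = concentration_ratio(shares, m)
h = herfindahl(shares)
lower = h_lower_bound(cr4, m, n)

print(f"CR4 = {cr4:.4f}, H = {h:.4f}, lower bound = {lower:.4f}")
```

The bound is attained only when the m largest shares are all equal and the remaining n - m shares are all equal, so for any other distribution H lies strictly above it.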
Derivation of an upper bound on H will be based in part on the following theorem given by Marshall et al. ([12], pp. 192–193) and attributed to Kemperman [19]: For a set of real-valued data , there exists a unique integer and a unique where
| (10) |
and
| (11) |
such that
| (12) |
It is readily apparent from the proof given by Marshall et al. ([12], pp. 192–193) that this theorem also holds when and .
Using this theorem and the fact that
it follows from (10) and (11) that K = 1 and hence k = 0 and so that
| (13) |
which is also given in Marshall and Olkin ([13], p. 133). From (13) and the (strict) Schur-convexity of , it follows that
| (14) |
Furthermore, since it is seen from (10), (11) and (12) that and
| (15) |
which, together with the (strict) Schur-convexity of , result in
| (16) |
Treating k as a continuous variable, it is clear that the bound in (16) is strictly convex in from (10) with L = 0 and . For this K and for both k = K – 1 and k = K (although k is strictly less than K), the bound in (16) is seen to equal for both k - values, i.e.,
| (17) |
Thus, from (7), (14), and (17),
| (18) |
The second - order partial derivative from (18) so that is convex in (strictly so if m > 1). From this convexity and since
| (19) |
| (20) |
and hence
| (21) |
which is then an upper bound on H in terms of any given m, n, and .
The bound in (21) is equivalent to one given by [10, 11] with two exceptions. First, their bound did not incorporate the term in (20). Second, their bound involved a potential fraction term that was indeterminable, but presumably sufficiently small that it could be ignored. Sleuwaegen et al. ([11], p. 628) presented an upper bound as being equivalent to the largest of and in (20), “except for a fraction if the maximum is and is non-integer”, with their and k being equivalent to and m used in the present paper. See also ([10], pp. 206–207).
For a large number of firms n (strictly for ), it is clear from (9) and (20), (21) that
| (22) |
By (a) defining , which is strictly concave in for any given m since , (b) setting V = 0, and (c) solving the resulting second-order equation for , it is found that
| (23) |
Similarly, for
| (24) |
It is apparent from (22), (23) and (24) that the upper bound in (22) equals only for m = 1 when, from (20), . This also becomes the bound in (21) for m = 1 and for n not large since then, from (20), . Also, note that, from (20), when n = m + 1 for any . Consequently, it follows from (22), (23) and (24) that
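$$\frac{CR_m^2}{m} \le H \le \begin{cases} CR_m/m & \text{for } CR_m \le 1/m \\ CR_m^2 & \text{for } CR_m \ge 1/m \end{cases} \tag{25}$$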
These (asymptotic) bounds are equivalent to those given by [10, 11] with one exception: the upper bound in (25) when contains no indeterminate fraction term mentioned by those authors. Furthermore, the upper bound given by [8, 9] for m = 4 is irrespective of the value of and whether n is finite or not. See also Martin ([20], p.337).
2.3. Some comments
It may perhaps be tempting to assume that for any given , m, and n, the upper bound on H corresponds to the market-share distribution
| (26) |
(see, e.g., [10, 11]). However, this is not necessarily true because of (21) and the fact that, as can easily be verified, the value of H for the distribution in (26) equals the in (20).
As a counterexample, consider the market-share distribution for which H = 0.2456. For m = 2, , and the value of H for the corresponding distribution in (26) becomes 0.2269, which equals , but which is less than the H = 0.2456. Rather, the upper bound from (21) becomes
While in the preceding counterexample, , consider next the following example with :
for which H = 0.0510. For m = 3, , and n = 27, the value of H for the corresponding distribution in (26) becomes 0.0467, which equals in (20), but which is less than the H = .0510. In fact, the correct upper bound on H from (21) is
For the case when all market shares are equal, i.e., for i = 1,…,n, it follows from (6) and (20) that . However, in this equal-share case, in (20) equals 1/n if and only if m = n – 1; otherwise, .
For the particular case when m = 1 so that the concentration ratio is simply equal to market share of the largest firm (see the order in (4)), it is readily seen from (20) that , , and for and for . Therefore, in the m = 1 case, the upper bound on H from (21) is either or depending upon which one is the smaller.
3. Results
3.1. Simulation results
Besides the above boundary comparisons between H and $CR_m$ in (5) and (6), an analysis has also been performed in terms of a scatter diagram of H versus $CR_4$. The four-firm concentration ratio $CR_4$ was chosen since it is by far the most widely used one (e.g., [6], p. 97; [21], p. 255). Rather than using real data of limited sample size as done by [9, 10, 11], computer simulation was used to randomly generate a very large number of market-share distributions, with each $s_i$ and n based on random number generation.
The algorithm developed was based on the following steps:
(1) Generate n as a random integer such that $5 \le n \le 100$.
(2) For each n-value generated in Step 1, generate (to the desired number of decimal places) as random numbers within the following intervals:
that is,
(3) Compute .
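A simulation of this general kind can be sketched as follows. This is not the interval-based sampling scheme of Step 2, whose details are not reproduced here; as a stand-in assumption, shares are drawn uniformly on the simplex by normalizing exponential weights and are then sorted in decreasing order. The function names and the seed are hypothetical, and the final check uses the large-n bounds for m = 4, namely $CR_4^2/4 \le H \le \max\{CR_4/4, CR_4^2\}$.

```python
import random


def random_market_shares(rng, n_min=5, n_max=100):
    """Return decreasingly ordered market shares that sum to 1."""
    n = rng.randint(n_min, n_max)  # random number of firms, bounds inclusive
    # Assumption: normalized exponential weights (uniform on the simplex)
    # replace the interval-based draws described in the text.
    weights = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(weights)
    return sorted((w / total for w in weights), reverse=True)


def cr_and_h(shares, m=4):
    """Return (CR_m, H) for shares already sorted in decreasing order."""
    return sum(shares[:m]), sum(s * s for s in shares)


def simulate(num_distributions=10_000, seed=0, m=4):
    """Generate (CR_m, H) pairs for randomly drawn market-share distributions."""
    rng = random.Random(seed)
    return [cr_and_h(random_market_shares(rng), m) for _ in range(num_distributions)]


def within_bounds(cr4, h, eps=1e-12):
    """Check CR4^2 / 4 <= H <= max(CR4 / 4, CR4^2), up to rounding error."""
    return cr4 ** 2 / 4 - eps <= h <= max(cr4 / 4, cr4 ** 2) + eps


points = simulate()
print(len(points), all(within_bounds(cr4, h) for cr4, h in points))
```

Because the sampling scheme differs from the one in the text, the resulting scatter would not reproduce Fig. 1 point for point; the sketch only illustrates how generated distributions can be compared against the bounds.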
A total of 10,000 such distributions were generated and the corresponding values for H and $CR_4$ were computed. The results are summarized in the scatter diagram in Fig. 1.
Fig. 1.
Comparison of values of $CR_4$ and H for 10,000 randomly generated market-share distributions with the number of firms n varying as a random integer between 5 and 100. The solid curves are the bounds in (25).
The solid curves in Fig. 1 represent the bounds in (25) and appear to be entirely appropriate even though the number of firms n is finite ($5 \le n \le 100$). Thus, the lower solid curve in Fig. 1 represents the lower bound on H as

$$H \ge \frac{CR_4^2}{4} \tag{27}$$

and the upper solid curve represents the upper bound as

$$H \le \begin{cases} CR_4/4 & \text{for } CR_4 \le 1/4 \\ CR_4^2 & \text{for } CR_4 \ge 1/4 \end{cases} \tag{28}$$
It is clear from Fig. 1 that the variation in potential values of H for any given $CR_m$ (with m = 4 in Fig. 1) tends to increase dramatically with increasing $CR_m$. Most interesting is perhaps the systematic nature of the variation in H-values as restricted by the lower and upper bounds in (27) and (28). The form of the scatter graph in Fig. 1 is generally similar to results given by [8, 9, 10, 11], although their results are based on a limited number of data points. See also Pavic et al. [14].
Another interesting observation can be made with respect to the relative variation of H-values for varying $CR_4$. Such variation can be measured in terms of the range between the bounds relative to their midrange, with the midrange serving as the estimated (predicted) H:

$$H1 = \frac{1}{2}\left[\frac{CR_4^2}{4} + \max\left\{\frac{CR_4}{4},\, CR_4^2\right\}\right] \tag{29}$$

where the two terms in brackets are the lower and upper bounds in (27) and (28). Thus, the relative variation becomes 6/5 for $CR_4 \ge 1/4$ whereas it equals $2(1 - CR_4)/(1 + CR_4)$ for $CR_4 \le 1/4$. That is, in spite of the considerable absolute variation in H for each value of $CR_4$ as seen from Fig. 1 when $CR_4$ is not small ($CR_4 \ge 1/4$), the relative variation remains constant. While these results are based on m = 4, there seems no reason to suspect that the results would not hold generally for all m.
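The constancy of the relative variation can be verified directly. The sketch below assumes the m = 4 bounds $CR_4^2/4$ and $\max\{CR_4/4, CR_4^2\}$ and takes H1 as their midrange; the function names are hypothetical.

```python
def h_bounds_cr4(cr4):
    """Lower and upper bounds on H for a given CR4 (assumed large-n forms)."""
    lower = cr4 ** 2 / 4
    upper = cr4 / 4 if cr4 <= 0.25 else cr4 ** 2
    return lower, upper


def h1(cr4):
    """Midrange of the two bounds, used as the predicted value of H."""
    lower, upper = h_bounds_cr4(cr4)
    return (lower + upper) / 2


def relative_variation(cr4):
    """Range of the bounds divided by their midrange."""
    lower, upper = h_bounds_cr4(cr4)
    return (upper - lower) / h1(cr4)


for cr4 in (0.05, 0.10, 0.25, 0.50, 0.90):
    lower, upper = h_bounds_cr4(cr4)
    print(f"CR4={cr4:.2f}  bounds=[{lower:.4f}, {upper:.4f}]  "
          f"H1={h1(cr4):.4f}  relative variation={relative_variation(cr4):.4f}")
```

For $CR_4 \ge 1/4$ both bounds are proportional to $CR_4^2$, which is why the ratio no longer depends on $CR_4$.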
3.2. H as function of $CR_m$
From the different expressions for H and $CR_m$ in (5) and (6) and from the scatter graph in Fig. 1 and those in [8, 9, 10, 11], it is clear that there cannot be any precise functional relationship between H and $CR_m$ in general. However, a very approximate relationship can be formulated in terms of the lower bound in (9) and the upper bound in (21). Writing $H_{\min}$ and $H_{\max}$ for these two bounds, the fact that the true value of H for any given market-share distribution falls within the interval $[H_{\min}, H_{\max}]$ can be expressed as

$$H = \frac{H_{\min} + H_{\max}}{2} \pm \frac{H_{\max} - H_{\min}}{2} \tag{30}$$

Note that the first term in (30) is the same as H1 in (29) (for m = 4) whereas the second one is a tolerance or error term.
If the total number of firms n in a market or industry is known, then and (30) can be computed from (9) and (21). For very large n and for , those computations can be done from (25) or from (27) and (28) for Furthermore, when n is not large, the bounds in (25) (and (27) and (28)) still apply. That is,
$$\frac{CR_m^2}{m} \le H \le \max\left\{\frac{CR_m}{m},\, CR_m^2\right\} \tag{31}$$

for any n and $CR_m$. This lower bound follows immediately from (9) by ignoring the term involving n, whereas the upper bound can be proved as follows. First, the inequality in (18) can be expressed as
| (32) |
where A is strictly decreasing in . Second, for any given m and , the maximum possible value of will necessarily occur when for in which case
| (33) |
Then, from (33), for which, together with (32), gives for Finally, since the in (20) is independent of n, this completes the proof of in (31).
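For readers who want to check the n-free upper bound independently, one elementary route is sketched below. It is not necessarily the argument made through (32) and (33); it assumes only the ordering in (4), with $t = s_m$ denoting the m-th largest share, and the function $g$ is notation introduced here for illustration.

```latex
\begin{align*}
\sum_{i=1}^{m} s_i^2 &\le \bigl[CR_m - (m-1)t\bigr]^2 + (m-1)t^2
  && \text{(top $m$ shares are each at least $t$ and sum to $CR_m$)} \\
\sum_{i=m+1}^{n} s_i^2 &\le t \sum_{i=m+1}^{n} s_i = t\,(1 - CR_m)
  && \text{(remaining shares are each at most $t$)} \\
H &\le g(t) = \bigl[CR_m - (m-1)t\bigr]^2 + (m-1)t^2 + t\,(1 - CR_m),
  && 0 \le t \le CR_m/m \\
g''(t) &= 2(m-1)^2 + 2(m-1) = 2m(m-1) \ge 0
  && \text{(so $g$ is convex and is maximized at an endpoint)} \\
g(0) &= CR_m^2, \qquad g\!\left(CR_m/m\right) = CR_m/m \\
H &\le \max\bigl\{CR_m^2,\; CR_m/m\bigr\}
\end{align*}
```

The two endpoint values coincide at $CR_m = 1/m$, which is where the upper bound switches from $CR_m/m$ to $CR_m^2$.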
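Equivalent bounds can also be derived for $CR_m$ in terms of H. In fact, it follows immediately from (31) that

$$\min\left\{mH,\, \sqrt{H}\right\} \le CR_m \le \sqrt{mH} \tag{34}$$

As in the case of (31), the bounds in (34) hold for any n and H (and not just for large n). Then, as in the case of H1 in (29) (for m = 4) and H in (30), the following formulation applies to $CR_m$:

$$CR_m = CR1_m \pm \frac{\sqrt{mH} - \min\{mH, \sqrt{H}\}}{2}, \qquad CR1_m = \frac{\min\{mH, \sqrt{H}\} + \sqrt{mH}}{2} \tag{35}$$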
In order to explore these and other formulations discussed subsequently, various market-share distributions were randomly generated using the computer algorithm described in Subsection 3.1 with m = 4 and $5 \le n \le 100$. The results are summarized in Table 1 for 25 different distributions. As an example of the computation involved for (29), (30) and (35), consider Data Set 1 in Table 1 with n = 27 and $CR_4 = 0.2881$. From (27), (28) and (29), the lower bound is $0.2881^2/4 = 0.0208$, the upper bound is $0.2881^2 = 0.0830$, and H1 = 0.0519 so that, from (30), $H = 0.0519 \pm 0.0311$ or [0.0208, 0.0830]. That is, the value of H is roughly equal to 0.0519, but it has to lie in the interval between 0.0208 and 0.0830. The true value of H is 0.0459 in Table 1. The computations with respect to $CR_4$ are similarly done for (34) and (35) with H = 0.0459.
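The Data Set 1 computation can be reproduced in both directions with a few lines of Python. This is a sketch under the assumption that the bounds are $CR_m^2/m \le H \le \max\{CR_m/m, CR_m^2\}$ and, inverted, $\min\{mH, \sqrt{H}\} \le CR_m \le \sqrt{mH}$; the function names are hypothetical, and the inputs are the $CR_4$ and H values listed for Data Set 1 in Table 1.

```python
import math


def h_bounds_from_cr(cr_m, m=4):
    """Bounds on H for a given CR_m (assumed n-free forms)."""
    return cr_m ** 2 / m, max(cr_m / m, cr_m ** 2)


def cr_bounds_from_h(h, m=4):
    """Bounds on CR_m for a given H, obtained by inverting the bounds on H."""
    return min(m * h, math.sqrt(h)), math.sqrt(m * h)


def midrange_and_tolerance(lower, upper):
    """Point estimate (midrange) and half-width of the interval."""
    return (lower + upper) / 2, (upper - lower) / 2


# Data Set 1 in Table 1: CR4 = 0.2881 and H = 0.0459.
h_est, h_tol = midrange_and_tolerance(*h_bounds_from_cr(0.2881))
cr_est, cr_tol = midrange_and_tolerance(*cr_bounds_from_h(0.0459))

print(f"H   = {h_est:.4f} ± {h_tol:.4f}")
print(f"CR4 = {cr_est:.4f} ± {cr_tol:.4f}")
```

The estimate of $CR_4$ obtained this way is not capped at 1, so for highly concentrated markets the midrange can exceed the largest value that a concentration ratio can actually take.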
Table 1.
Values of $CR_4$, H, H1, $CR1_4$, H2, H3, and H4 defined in (5), (6), (29), (35), (36), (42), and (43) for 25 randomly generated market-share distributions $(s_1, \ldots, s_n)$ with random n between 5 and 100.
| Data Set | n | $CR_4$ | H | H1 | $CR1_4$ | H2 | H3 | H4 |
|---|---|---|---|---|---|---|---|---|
| 1 | 27 | 0.2881 | 0.0459 | 0.0519 | 0.3060 | 0.0723 | 0.0459 | 0.0402 |
| 2 | 80 | 0.0535 | 0.0125 | 0.0070 | 0.1368 | 0.0070 | 0.0125 | 0.0068 |
| 3 | 66 | 0.3628 | 0.0678 | 0.0823 | 0.3906 | 0.0995 | 0.0679 | 0.0689 |
| 4 | 60 | 0.1210 | 0.0181 | 0.0170 | 0.1707 | 0.0217 | 0.0182 | 0.0120 |
| 5 | 81 | 0.0652 | 0.0126 | 0.0092 | 0.1374 | 0.0092 | 0.0125 | 0.0069 |
| 6 | 16 | 0.9666 | 0.5764 | 0.5839 | 1.1388 | 0.3868 | 0.5765 | 0.5767 |
| 7 | 31 | 0.3670 | 0.0805 | 0.0842 | 0.4256 | 0.1011 | 0.0809 | 0.0736 |
| 8 | 28 | 0.2584 | 0.0520 | 0.0417 | 0.3320 | 0.0621 | 0.0523 | 0.0406 |
| 9 | 65 | 0.5115 | 0.1708 | 0.1635 | 0.6199 | 0.1601 | 0.1709 | 0.1693 |
| 10 | 75 | 0.0566 | 0.0133 | 0.0075 | 0.1419 | 0.0076 | 0.0133 | 0.0073 |
| 11 | 22 | 0.1898 | 0.0455 | 0.0282 | 0.3043 | 0.0405 | 0.0455 | 0.0273 |
| 12 | 52 | 0.1051 | 0.0196 | 0.0176 | 0.1792 | 0.0179 | 0.0196 | 0.0118 |
| 13 | 30 | 0.4724 | 0.1553 | 0.1395 | 0.5911 | 0.1434 | 0.1558 | 0.1509 |
| 14 | 84 | 0.1085 | 0.0158 | 0.0150 | 0.1573 | 0.0187 | 0.0157 | 0.0109 |
| 15 | 38 | 0.8513 | 0.4773 | 0.4529 | 1.0363 | 0.3244 | 0.4775 | 0.4779 |
| 16 | 74 | 0.1045 | 0.0146 | 0.0144 | 0.1500 | 0.0177 | 0.0147 | 0.0094 |
| 17 | 25 | 0.2131 | 0.0418 | 0.0323 | 0.2881 | 0.0476 | 0.0419 | 0.0289 |
| 18 | 93 | 0.8406 | 0.6755 | 0.4416 | 1.2328 | 0.3188 | 0.6756 | 0.6754 |
| 19 | 85 | 0.0968 | 0.0140 | 0.0133 | 0.1463 | 0.0159 | 0.0140 | 0.0091 |
| 20 | 98 | 0.1340 | 0.0145 | 0.0190 | 0.1494 | 0.0250 | 0.0146 | 0.0114 |
| 21 | 59 | 0.0751 | 0.0170 | 0.0101 | 0.1644 | 0.0112 | 0.0170 | 0.0093 |
| 22 | 62 | 0.1324 | 0.0184 | 0.0187 | 0.1724 | 0.0246 | 0.0185 | 0.0123 |
| 23 | 91 | 0.0604 | 0.0112 | 0.0080 | 0.1282 | 0.0083 | 0.0112 | 0.0066 |
| 24 | 79 | 0.1646 | 0.0174 | 0.0240 | 0.1667 | 0.0333 | 0.0175 | 0.0141 |
| 25 | 62 | 0.2096 | 0.0271 | 0.0317 | 0.2188 | 0.0465 | 0.0272 | 0.0218 |
It is evident from the results in Table 1 that H1 as a function of $CR_4$ in (29) and $CR1_4$ as a function of H in (35) do provide some respectable indications of the values of the indices H and $CR_4$. Thus, knowing the value of one of these two indices, one can make a rough estimation (prediction) of the corresponding value of the other index.
A statistical approach to determining the relationship between H and $CR_4$ would be the use of regression analysis using simulated data or real market-share data. Such real-data analysis has been reported by Kwoka [9] and Pavic et al. [14]. Based on the values of H and $CR_4$ in Table 1, the following regression model is obtained:
| (36) |
However, the fit of this model to the data is unimpressive, as seen from the values of H2 based on (36) as given in Table 1. The coefficient of determination, when properly computed [22], becomes $R^2 = 1 - \sum (H - H2)^2 / \sum (H - \bar{H})^2 = 0.77$ (where $\bar{H}$ denotes the mean of the H-values). That is, only 77% of the variation of H (about its mean) is explained by the fitted model in (36).
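The coefficient of determination can be recomputed from the H and H2 columns of Table 1. The sketch below assumes that "properly computed" refers to the form $R^2 = 1 - \sum (y - \hat{y})^2 / \sum (y - \bar{y})^2$, with the fitted values taken directly from the table rather than refitted; the function name is hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SSE / SST, with SST taken about the mean of the observed values."""
    mean = sum(observed) / len(observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    sst = sum((y - mean) ** 2 for y in observed)
    return 1.0 - sse / sst


# H and H2 columns of Table 1 (Data Sets 1 to 25).
H = [0.0459, 0.0125, 0.0678, 0.0181, 0.0126, 0.5764, 0.0805, 0.0520, 0.1708,
     0.0133, 0.0455, 0.0196, 0.1553, 0.0158, 0.4773, 0.0146, 0.0418, 0.6755,
     0.0140, 0.0145, 0.0170, 0.0184, 0.0112, 0.0174, 0.0271]
H2 = [0.0723, 0.0070, 0.0995, 0.0217, 0.0092, 0.3868, 0.1011, 0.0621, 0.1601,
      0.0076, 0.0405, 0.0179, 0.1434, 0.0187, 0.3244, 0.0177, 0.0476, 0.3188,
      0.0159, 0.0250, 0.0112, 0.0246, 0.0083, 0.0333, 0.0465]

r2_h2 = r_squared(H, H2)
print(f"R^2 for H2: {r2_h2:.2f}")
```

Most of the unexplained variation comes from the few highly concentrated data sets (6, 15, and 18), where H2 falls well short of H.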
By comparison with the results in (36), Kwoka [9] obtained the parameter estimates a = 0.315 and b = 1.724 for some real market-share data. Similarly, Pavic et al. [14] fitted a power function to real market data for different levels of aggregation or degrees of specificity of the commodity. Their results corresponded to a-values ranging from about 0.70 to 0.93 and b-values ranging from 1.74 to 1.84 for the function in (36).
3.3. H as function of $s_1, \ldots, s_m$ and n
The developments so far have been concerned with potential relationships between the Herfindahl-Hirschman index H and the m-firm concentration ratio $CR_m$ in (5) and (6). Consider next the case when the individual market shares $s_1, \ldots, s_m$ for the m largest firms are known (and not just their sum $CR_m$) as well as the total number of firms n in the market or industry. In this case, m may be rather small, most typically m = 4, but m could also be 3, 8, or 10, for example. Another situation may be one in which some of the smaller market shares are simply excluded from the computation of H since their effect on H is relatively small. Whatever the case may be, it would be of interest to determine whether H could reasonably be approximated by some function of $s_1, \ldots, s_m$ and n. This was briefly considered by [11].
From the Schur-convexity of and the majorization ([12], pp. 21, 138–139) and from (17), it follows that
| (37) |
Then, from (7) and (37), it would seem reasonable to consider a measure approximating H as being $\sum_{i=1}^{m} s_i^2$ plus a mean of the bounds A and B. The most obvious mean would be the simple arithmetic mean, resulting in the measure
| (38) |
as being one whose values would approximate those of H. However, this measure is not Schur-convex as seen from the fact that
which is not necessarily nonnegative.
An alternative formulation can be considered in terms of plus a weighted arithmetic mean of the bounds A and B in (37) with . That is,
| (39) |
In order to explore the Schur-convexity of , it follows from (39) that
| (40) |
where
| (41) |
Since this in (40) and (41) is not necessarily nonnegative, the cannot be Schur-convex ([12], p. 84).
From exploratory data analysis, it becomes readily apparent from various market-share distributions that values of are generally much closer to the corresponding values of H than are the values of . This observation can be accounted for by choosing a large value of w in (39) such as , resulting in
| (42) |
In order to explore how accurately values of H3 from (42) approximate those of H, the randomly generated market-share distributions described in Subsection 3.1 were used to compute H, , and H3 from (42) with m = 4. Again, the results are given in Table 1.
It is rather striking from the results in Table 1 how closely the values of H3 agree with those of H. Their slight differences occur only in the fourth decimal place, with 0.0005 for Data Set 13 being the largest difference. In fact, if values of H are predicted to equal those of H3, it is found from the results in Table 1 that the coefficient of determination, when properly computed [22], becomes $R^2 = 1.0000$ (rounded off to four decimal places). That is, nearly all of the variation of H (about its mean $\bar{H}$) is explained (accounted for) by the fitted model stating that the predicted H equals H3.
3.4. H as function of $s_1, \ldots, s_m$
The H3 in (42) requires knowledge of the market shares of the m largest firms within a market or industry as well as the total number of firms n. There may, of course, be cases when $s_1, \ldots, s_m$ are reported, but n is not reported or the true value of n is not available since the market shares of an unknown number of firms have been combined into “others.” It would therefore be desirable to have an alternative measure that approximates H, but that depends only on $s_1, \ldots, s_m$.
Such a potential measure could again be considered in terms of the bounds in (37) except for setting the lower bound A = 0 to eliminate the dependence on n. When approximating with the arithmetic mean of A and B in (37), the elimination of A is partly compensated for by using instead of in the bound in (17). Therefore, a new measure can be defined as
| (43) |
which is only a function of . Furthermore, in terms of tolerance (error) intervals, the true value of H can be expressed as
| (44) |
Thus, for example, for the market-share distribution and for , and so that, from (43), . Then, from (44), so that the true value of H has to fall within the interval .
The H4 is not, however, Schur-convex as seen from the following differences between partial derivatives (using the descending order in (1)):
| (45) |
The terms in (45) are nonnegative, indicating Schur-convexity, but with one exception: . In this case, can potentially take on negative values. To place this relatively minor limitation of H4 within some context, consider the following implication from the above results: among the possible transfers of small amount of market shares from larger to smaller firms, only such transfers to the m-th largest firm could potentially lead to a slight increase in the value of H4. All other possible transfers could only lead to decreasing or nonincreasing H4. In most real situations, cases with in (45) would probably be rather exceptional so that H4 can still be considered as a reasonable measure for all practical purposes.
For the randomly generated market-share distributions behind the data in Table 1, the values of H4 were also computed as shown in the table. The results show that the values of H4 tend to approximate quite closely those of H and nearly as closely as those of H3. From the data in Table 1, the coefficient of determination is for the fitted model There would seem to be little disadvantage in using H4 instead of H3 in (42), which also incorporates the number of firms n in a market (industry). When using two decimal places, which is clearly adequate for practical purposes, the values of H3 and H4 may differ by only about These results are based on the most common choice of and would probably differ somewhat for different m.
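The agreement of H3 and H4 with H can be quantified from the rounded values in Table 1. The sketch below uses the same assumed form $R^2 = 1 - \sum (y - \hat{y})^2 / \sum (y - \bar{y})^2$, treating H3 and H4 in turn as the predicted values of H; the function name is hypothetical, and the results reflect the four-decimal rounding of the table.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SSE / SST, with SST taken about the mean of the observed values."""
    mean = sum(observed) / len(observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    sst = sum((y - mean) ** 2 for y in observed)
    return 1.0 - sse / sst


# H, H3 and H4 columns of Table 1 (Data Sets 1 to 25).
H = [0.0459, 0.0125, 0.0678, 0.0181, 0.0126, 0.5764, 0.0805, 0.0520, 0.1708,
     0.0133, 0.0455, 0.0196, 0.1553, 0.0158, 0.4773, 0.0146, 0.0418, 0.6755,
     0.0140, 0.0145, 0.0170, 0.0184, 0.0112, 0.0174, 0.0271]
H3 = [0.0459, 0.0125, 0.0679, 0.0182, 0.0125, 0.5765, 0.0809, 0.0523, 0.1709,
      0.0133, 0.0455, 0.0196, 0.1558, 0.0157, 0.4775, 0.0147, 0.0419, 0.6756,
      0.0140, 0.0146, 0.0170, 0.0185, 0.0112, 0.0175, 0.0272]
H4 = [0.0402, 0.0068, 0.0689, 0.0120, 0.0069, 0.5767, 0.0736, 0.0406, 0.1693,
      0.0073, 0.0273, 0.0118, 0.1509, 0.0109, 0.4779, 0.0094, 0.0289, 0.6754,
      0.0091, 0.0114, 0.0093, 0.0123, 0.0066, 0.0141, 0.0218]

r2_h3 = r_squared(H, H3)
r2_h4 = r_squared(H, H4)

# Data set (1-based) with the largest absolute difference between H and H3.
worst_set = max(range(len(H)), key=lambda i: abs(H[i] - H3[i])) + 1

print(f"R^2 for H3: {r2_h3:.4f}, R^2 for H4: {r2_h4:.4f}")
print(f"Largest |H - H3| occurs for Data Set {worst_set}")
```

In the table, H4 falls below H for most of the data sets with small H, which is consistent with its somewhat lower coefficient of determination relative to H3.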
3.5. Results from real market data
The numerical results have so far been based on market-share distributions that have been randomly generated. Such results have the broadest possible implication without any particular bias or restrictions. By comparison, results from using real market-share data may depend on factors such as the market classification system used and the level of aggregation.
Nevertheless, it would be of interest to subject the above developments to some real data and compare the results with those from computer-generated random samples. Therefore, readily accessible market-share data from a wide diversity of markets were used to determine the true values of $CR_4$ and H in (5) and (6) and their approximate values from H1 in (29), $CR1_4$ in (35), and H3 and H4 in (42) and (43). The results are given in Table 2. While the results in Table 1 are given with four decimal places, primarily to illustrate the accuracy with which H3 estimates (predicts) H from (42), two decimal places are used in Table 2 since this is adequate for all practical purposes.
Table 2.
Values of $CR_4$, H, H1, H1·, $CR1_4$, $CR1^{\cdot}_4$, H3, and H4 in (5), (6), (29), (47), (35), (48), (42), and (43) for some real market-share data.
| n | $CR_4$ | H | H1 | H1· | $CR1_4$ | $CR1^{\cdot}_4$ | H3 | H4 | Source (Market type) |
|---|---|---|---|---|---|---|---|---|---|
| 16 | 0.50 | 0.10 | 0.16 | 0.08 | 0.47 | 0.55 | 0.09 | 0.09 | [23] (Airline travel) |
| 16 | 0.60 | 0.12 | 0.23 | 0.12 | 0.52 | 0.61 | 0.12 | 0.12 | [23] (Airline travel) |
| 8 | 0.75 | 0.16 | 0.35 | 0.18 | 0.60 | 0.70 | 0.17 | 0.17 | [24] (U.S. distilled liquor) |
| 10 | 0.64 | 0.14 | 0.26 | 0.13 | 0.56 | 0.65 | 0.14 | 0.13 | [25] (Paints, coatings) |
| 10 | 0.52 | 0.11 | 0.17 | 0.09 | 0.50 | 0.58 | 0.11 | 0.10 | [26] (Pharmaceuticals) |
| 15 | 0.54 | 0.10 | 0.18 | 0.09 | 0.47 | 0.55 | 0.10 | 0.10 | [27] (Insurance companies) |
| 12 | 0.69 | 0.18 | 0.30 | 0.15 | 0.64 | 0.74 | 0.19 | 0.18 | [28] (Weapons exporters) |
| 30 | 0.34 | 0.05 | 0.07 | 0.04 | 0.32 | 0.39 | 0.05 | 0.05 | [29] (Car sales, Britain) |
| 12 | 0.60 | 0.12 | 0.23 | 0.12 | 0.52 | 0.61 | 0.11 | 0.11 | [30] (Auto manufacturers, US) |
| 8 | 0.77 | 0.18 | 0.37 | 0.19 | 0.64 | 0.74 | 0.18 | 0.17 | [25] (Craft beer, US) |
| 9 | 0.75 | 0.16 | 0.35 | 0.18 | 0.60 | 0.70 | 0.16 | 0.16 | [25] (Running shoe sales) |
| 10 | 0.72 | 0.17 | 0.32 | 0.17 | 0.62 | 0.72 | 0.16 | 0.17 | [25] (Top charter airlines) |
| 10 | 0.68 | 0.18 | 0.29 | 0.15 | 0.64 | 0.74 | 0.16 | 0.17 | [25] (Farm machinery, equip.) |
| 20 | 0.48 | 0.08 | 0.14 | 0.07 | 0.42 | 0.49 | 0.07 | 0.09 | [31] (Global car sales) |
| 10 | 0.60 | 0.12 | 0.23 | 0.12 | 0.52 | 0.61 | 0.12 | 0.12 | [25] (Top airlines worldwide) |
It is clear from the values of $CR_4$, H, H1, and $CR1_4$ in Table 2 that the estimation (prediction) of H from any given $CR_4$ and vice versa is subject to substantial inaccuracy since the values of $CR1_4$ differ considerably from those of $CR_4$ and similarly for H1 versus H (by up to a multiplicative factor of 2). These results differ significantly from those of Table 1, probably because the values of $CR_4$ and H are generally much larger in Table 2 than in Table 1. The relatively small values of $CR_4$ and H in Table 1 are partly due to the rather large values of n that affect the generated market-share distributions.
The estimated (predicted) values of H based on , and those of based on H, given in Table 2 could probably be improved upon by considering different weights for the two pairs of bounds. Thus, instead of H1 in (29), one could consider the bounds in (31) for and define
| (46) |
By exploring different values of w from the data in Table 2 with , becomes appropriate so that, from (46),
| (47) |
Then, setting and solving for gives
| (48) |
From the results in Table 2, it is seen that the values of the reweighted estimates from (47) and (48) are considerably closer to the values of H and $CR_4$ than are those of H1 and $CR1_4$.
However, with respect to the measures H3 in (42) and H4 in (43) and the index H, the results in the two tables are quite comparable. Thus, from the results in Table 2, it is seen that the values of H3 and H4 provide close approximation to those of H just as is the case with Table 1. Even though H3 incorporates the additional variable n (the total number of firms in a market), the values of H4 are about as close to those of H as are the H3-values.
4. Conclusions
This careful and detailed re-examination of the boundary relationships between the Herfindahl-Hirschman index H and the m-firm concentration ratio is important in a number of regards. The various derivations and intermediate steps are verifiable and based on a rigorous approach using majorization theory, resulting in some new findings and revisions of earlier results. Since (a) market (industrial) concentration is generally considered to be an indicator of the competitiveness, efficiency, or power of a market (or industry), (b) it is convenient and useful to be able to measure this market property, and (c) H and the m-firm concentration ratio (especially the 4-firm ratio) are the most widely used measures, it is essential that the formulations of any relationships between H and the concentration ratio be correct, clear, and reliable. This requirement is clearly important to economists, policy planners, and others using such summary measures for any purpose.
Although there is clearly a relationship between the concentration ratio and H, this paper emphasizes the very approximate nature of that relationship, which is based on bounds derived using majorization theory. If only the value of the concentration ratio or of H is known, without knowledge of any of the individual market shares or of the total number of firms n in the market or industry, then their approximate relationships are given by (27), (28) and (29) for m = 4 and more generally by (30) and (31) as well as by (34) and (35), depending upon which is considered the dependent and which the independent (explanatory) variable.
The relationship in (34) and (35) is based on the bounds on for any . Alternatively, one could derive as the inverse function of in (31), with (27), (28) and (29) being the particular, but most common, case of . This inverse procedure yields
| (49) |
However, this relationship does not necessarily produce the same results as (34) and (35), and it differs from (48).
As an example of the approximate nature of the conversion from one of the two indices to the other, it is of interest to consider the most recent 2010 Horizontal Merger Guidelines [2], which use H as the concentration index. Those guidelines use H = 0.15 and H = 0.25 as two of the benchmarks. From (35) with the equivalent values of would be, respectively, and If individual values of the 4-firm concentration ratio are computed from (35) and (36) and (48) and (49) for H = 0.15 and H = 0.25, those values, which differ considerably, are found to have mean values of about 0.55 for H = 0.15 and 0.75 for H = 0.25. Thus, in terms of those mean values, and from the 2010 Guidelines, the following equivalence applies:
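Unconcentrated Markets: H < 0.15 or 4-firm concentration ratio < 0.55
Moderately Concentrated Markets: 0.15 ≤ H ≤ 0.25 or 0.55 ≤ 4-firm concentration ratio ≤ 0.75
Highly Concentrated Markets: H > 0.25 or 4-firm concentration ratio > 0.75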
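The classification above can be illustrated with a minimal sketch. It uses only the definitions given in the Introduction (H as the sum of squared market shares, the m-firm concentration ratio as the combined share of the m largest firms) and the H benchmarks of 0.15 and 0.25. The function names and the market shares are hypothetical, chosen purely for illustration, and none of the paper's numbered equations is reproduced.

```python
def herfindahl(shares):
    """Herfindahl-Hirschman index: sum of squared market shares (shares sum to 1)."""
    return sum(s * s for s in shares)


def concentration_ratio(shares, m=4):
    """m-firm concentration ratio: combined share of the m largest firms."""
    return sum(sorted(shares, reverse=True)[:m])


def classify_by_h(h):
    """Market category from H (0-1 scale), using the benchmarks 0.15 and 0.25."""
    if h < 0.15:
        return "unconcentrated"
    if h <= 0.25:
        return "moderately concentrated"
    return "highly concentrated"


# Hypothetical market shares, invented for illustration only.
shares = [0.30, 0.25, 0.20, 0.10, 0.05, 0.05, 0.05]
h = herfindahl(shares)
cr4 = concentration_ratio(shares, m=4)
print(round(h, 4), round(cr4, 2), classify_by_h(h))
```

For these invented shares, H is 0.21 (moderately concentrated) while the 4-firm concentration ratio is 0.85, which lies above the mean-value benchmark of 0.75. Such a disagreement between the two criteria is what one would expect given the rough nature of the equivalence.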
While any relationship between H and the concentration ratio can only provide a rough approximation, the estimation (prediction) of H rapidly becomes more accurate when some of the largest individual market shares are given. Interestingly, knowledge of the size of the market (industry) n does not generally have any substantial effect on such estimation accuracy. That is, the measure H4 in (43) tends to be about as close to the true index H as is H3 in (42), as exemplified by the data in Tables 1 and 2 for m = 4.
Such support for H4 in (43) is important since market-share data are often reported with the smaller market shares grouped into an “other” category without any specification of n. In some cases, an “other” category may account for more than 50% of the market without any indication of the number of firms included in that category. Furthermore, H4 together with its tolerance (error) limits in (44) provides a complete description of the true value of the index H. H4 also has the advantage of the zero-indifference property, i.e., introducing a firm with zero market share does not affect H4. Although H4 is not Schur-convex, it does appear to provide a good approximation to H.
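To see why the m largest shares alone pin H down fairly tightly, the following sketch computes elementary lower and upper limits on H when the remaining firms are lumped into an "other" category of unknown size. These are not the paper's H4 in (43) or its limits in (44), which are not reproduced here. They rest only on the assumption that no firm in the "other" category is larger than the smallest of the m reported firms. The function name and the example shares are hypothetical.

```python
import math


def h_bounds_from_top_shares(top_shares):
    """Elementary bounds on H given only the m largest market shares.

    Assumes shares are proportions summing to at most 1 and that every
    unreported firm is no larger than the smallest reported share.
    """
    top = sorted(top_shares, reverse=True)
    rest = 1.0 - sum(top)      # combined share of the "other" category
    smallest = top[-1]         # cap on any unreported firm's share
    if rest < -1e-12 or smallest <= 0:
        raise ValueError("shares must be positive and sum to at most 1")
    rest = max(rest, 0.0)

    known = sum(s * s for s in top)

    # Lower limit: "other" split among very many tiny firms adds almost nothing.
    lower = known

    # Upper limit: "other" packed into as few firms as possible, each capped
    # at `smallest`, plus one firm holding whatever is left over.
    k = math.floor(rest / smallest)
    leftover = rest - k * smallest
    upper = known + k * smallest ** 2 + leftover ** 2
    return lower, upper


# Hypothetical four largest shares; the other 15% of the market is unspecified.
low, high = h_bounds_from_top_shares([0.30, 0.25, 0.20, 0.10])
print(round(low, 4), round(high, 4))
```

With these invented shares the limits are 0.2025 and 0.215, a narrow interval obtained without knowing n. This is in line with the observation that n adds little once the largest shares are known.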
All the proofs and derivations in this paper involving the m-firm concentration ratio are done for any m rather than for some specific m-value. Whenever a particular m is used, the discussion involves m = 4 since, as pointed out earlier, the 4-firm concentration ratio is the one used most frequently. If any other m were to be of particular interest, such as m = 2 or m = 8, the analysis would simply require such substitution into the appropriate equations. Some of the numerical results may, of course, differ depending upon m.
A concluding comment is also warranted about the comparison between and H. From majorization theory as commented on in Subsection 2.1, it follows that if two market-share distributions and are comparable with respect to majorization (i.e., and can be compared as in (2)), then H and provide the same size (order) comparison. That is, if , it is implied that and vice versa. This follows from the fact that is Schur-convex and H is strictly Schur-convex. If one has to choose between the two concentration measures, and if the market shares are known for all the firms within a particular market (industry), it should be noted that H has the advantage of strict Schur-convexity and of incorporating all available information about the market shares. If the market shares of the m largest firms, but not the market size n, are known, a good compromise would seem to be H4 in (43) and (44), with H4 being Schur-convex and strictly Schur-convex in .
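The order-preserving property described above can be checked numerically with a short sketch. It uses the standard textbook definition of majorization (equal totals, and every partial sum of the descending-ordered shares of one distribution at least as large as that of the other), which is assumed here to match (2). The function names and the three distributions are hypothetical.

```python
from itertools import accumulate


def majorizes(p, q, tol=1e-12):
    """True if p majorizes q under the standard partial-sum definition."""
    n = max(len(p), len(q))
    # Sort in descending order and pad the shorter vector with zero shares.
    ps = sorted(p, reverse=True) + [0.0] * (n - len(p))
    qs = sorted(q, reverse=True) + [0.0] * (n - len(q))
    cum_p = list(accumulate(ps))
    cum_q = list(accumulate(qs))
    if abs(cum_p[-1] - cum_q[-1]) > tol:
        return False  # totals must agree
    return all(a >= b - tol for a, b in zip(cum_p, cum_q))


def herfindahl(shares):
    """Sum of squared market shares."""
    return sum(s * s for s in shares)


def concentration_ratio(shares, m):
    """Combined share of the m largest firms."""
    return sum(sorted(shares, reverse=True)[:m])


# Hypothetical distributions, invented for illustration only.
p = [0.5, 0.3, 0.1, 0.1]
q = [0.4, 0.3, 0.2, 0.1]
r = [0.45, 0.45, 0.05, 0.05]

print(majorizes(p, q))                    # p is the more concentrated one
print(herfindahl(p) > herfindahl(q))      # H orders the pair strictly
print(concentration_ratio(p, 2) >= concentration_ratio(q, 2))
print(majorizes(p, r) or majorizes(r, p)) # p and r are not comparable
```

In this example the 3-firm concentration ratios of p and q are both 0.9 even though H differs, which reflects the difference between Schur-convexity and strict Schur-convexity. For the non-comparable pair p and r, no common ordering is guaranteed.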
Declarations
Author contribution statement
Tarald O. Kvålseth: Conceived and designed the analysis; Analyzed and interpreted the data; Contributed analysis tools or data; Wrote the paper.
Funding statement
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Competing interest statement
The author declares no conflict of interest.
Additional information
No additional information is available for this paper.
Acknowledgements
The author wants to thank the three reviewers for their helpful and constructive comments.
References
- 1. OECD (Organization for Economic Co-operation and Development) 2003. OECD Glossary of Statistical Terms: Concentration. URL https://stats.oecd.org/glossary/detail.asp?ID=3165.
- 2. U.S. Department of Justice and the Federal Trade Commission. 2010. Horizontal Merger Guidelines.
- 3. Herfindahl O.C. Columbia University, U.S.A.; 1950. Concentration in the Steel Industry. Unpublished Ph.D. dissertation.
- 4. Hirschman A.O. University of California Press; Berkeley, CA: 1945. National Power and the Structure of Foreign Trade.
- 5. Gaughan P.A. fifth ed. Wiley; Hoboken, NJ: 2011. Mergers, Acquisitions, and Corporate Restructuring.
- 6. Andreosso B., Jacobson D. second ed. McGraw-Hill; London: 2005. Industrial Economics and Organization: a European Perspective.
- 7. Tremblay V.J., Horton Tremblay C. Springer; New York: 2012. New Perspectives on Industrial Organization.
- 8. Pautler P.A. A guide to the Herfindahl index for antitrust attorneys. Res. Law Econ. 1983;5:167–190.
- 9. Kwoka J.E., Jr. The Herfindahl in theory and practice. Antitrust Bull. 1985;30(winter):915–947.
- 10. Sleuwaegen L., Dehandschutter W. The critical choice between the concentration ratio and the H-index in assessing performance. J. Ind. Econ. 1986;XXXV:193–208.
- 11. Sleuwaegen L.E., DeBondt R.R., Dehandschutter W.V. The Herfindahl index and concentration ratio revisited. Antitrust Bull. 1989;34(fall):625–640.
- 12. Marshall A.W., Olkin I., Arnold B.C. second ed. Springer; New York: 2011. Inequalities: Theory of Majorization and its Applications.
- 13. Marshall A.W., Olkin I. Academic Press; San Diego, CA: 1979. Inequalities: Theory of Majorization and its Applications.
- 14. Pavic I., Galetic F., Piplica D. Similarities and differences between the CR and HHI as an indicator of market concentration and market power. Br. J. Econ. Manag. Trade. 2016;13(1):1–8.
- 15. Ibragimov M., Ibragimov R. Market demand elasticity and income inequality. Econ. Theor. 2007;32:579–587.
- 16. Arnold B.C. Majorization: here, there and everywhere. Stat. Sci. 2007;22:407–413.
- 17. Nielsen M.A., Vidal G. Majorization and the interconversion of bipartite states. Quant. Inf. Comput. 2001;1:76–93.
- 18. Kvålseth T.O. Bounds on sample variation measures based on majorization. Commun. Stat. Theor. Methods. 2015;44:3375–3386.
- 19. Kemperman J.H.B. Moment problems for sampling without replacement. I, II, III. Nederl. Akad. Wetensch. Proc. Ser. A 76 (= Indag. Math. 35) 1973;149–164, 165–180, 181–188. [MR 49 (1975) 9997a,b,c; Zbl. 266 (1974) 62006, 62007, 62008]
- 20. Martin S. second ed. Blackwell; Oxford, U.K: 2002. Advanced Industrial Economics.
- 21. Carlton D.W., Perloff J.M. fourth ed. Addison-Wesley; Boston, MA: 2005. Modern Industrial Organization.
- 22. Kvålseth T.O. Cautionary note about R2. Am. Statistician. 1985;39:279–285.
- 23. Lijesen M.G., Nijkamp P., Rietveld P. Measuring competition in civil aviation. J. Air Transport. Manag. 2002;8:189–197.
- 24. Bain J.S. Wiley; New York: 1959. Industrial Organization.
- 25. Market Share Reporter. twenty seventh ed. Gale; 2017.
- 26. Editorial New 2016 data and statistics for global pharmaceutical products and projections through 2017. ACS Chem. Neurosci. 2017;8:1635–1636. doi: 10.1021/acschemneuro.7b00253.
- 27. Statista. 2018. Market Share of the Leading Insurance Companies in Belgium as of 2016. https://www.statista.com/statistics/780454/market-share-leading-insurance-companies-belgium/
- 28. Statista. 2018. Market Share of the Leading Exporters of Major Weapons between 2013 and 2017, by Country. https://www.statista.com/statistics/267131/market-share-of-the-leadings-exporters-of-conventional-weapons/
- 29. SMMT (The Society of Motor Manufacturers and Traders). 2018. Best-selling Car Marques in Britain in 2018 (Q1). https://www.best-selling-cars.com/britain-uk/2018-q1-britain-best-selling-car-brands-and-models/
- 30. Statista. 2014. U.S. Market Share of Selected Automobile Manufacturers 2013. https://www.statista.com/statistics/249375/us-market-share-of-selected-automobile-manufacturers/
- 31. Carsalesbase.com. 2017. Global Car Sales Analysis 2017-Q1. http://carsalesbase.com/global-car-sales-2017-q1/

