Author manuscript; available in PMC: 2022 Jan 1.
Published in final edited form as: Demogr Res. 2021 Feb 16;44:363–378. doi: 10.4054/demres.2021.44.15

How do populations aggregate?

Dennis M Feehan, Elizabeth Wrigley-Field
PMCID: PMC8315034  NIHMSID: NIHMS1651581  PMID: 34326681

Abstract

BACKGROUND

Understanding the relationship between populations at different scales plays an important role in many demographic analyses.

OBJECTIVE

We show that when a population can be partitioned into subgroups, the death rate for the entire population can be written as the weighted harmonic mean of the death rates in each subgroup, where the weights are given by the numbers of deaths in each subgroup. This decomposition can be generalized to other types of occurrence-exposure rate. Using different weights, the death rate for the entire population can also be expressed as an arithmetic mean of the death rates in each subgroup.

CONTRIBUTION

We use these relationships as a starting point for investigating how demographers can correctly aggregate rates across non-overlapping subgroups. Our analysis reveals conceptual links between classic demographic models and length-biased sampling. To illustrate how the harmonic mean can suggest new interpretations of demographic relationships, we present as an application a new expression for the frailty of the dying, given a standard demographic frailty model.

1. Relationship

Given a population and a partition of the population into subgroups, what is the relationship between the death rate in the subgroups and the death rate for the aggregate population? We show that occurrence-exposure rates can be understood to aggregate across scales according to an elegant and well-understood mathematical relationship: the weighted harmonic mean, which is the inverse of a weighted integral or sum of inverses.

Definition.

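Let f: [α, β] → ℝ be a continuous, positive function, f(x) > 0, and let w: [α, β] → ℝ be a continuous, non-negative function, w(x) ≥ 0. Then the Weighted Harmonic Mean of f with weights given by w is

$$
\mathrm{AH}[f(x); w(x)] = \frac{\int_\alpha^\beta w(x)\,dx}{\int_\alpha^\beta \frac{w(x)}{f(x)}\,dx} = \left(\frac{\int_\alpha^\beta w(x)\,(f(x))^{-1}\,dx}{\int_\alpha^\beta w(x)\,dx}\right)^{-1}. \tag{1}
$$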

Result.

Suppose that a population is a mixture of people with different values of some continuous trait u ∈ [α, β]. Let d(a, u) be the number of deaths at exact age a to people with mixing trait value u, and let μ(a, u) be the hazard faced at exact age a by people with trait value u. Assume d(a, u) and μ(a, u) are positive and continuous when viewed as functions of u. Then the aggregate hazard at age a, $\bar{\mu}(a)$, is the weighted harmonic mean of μ(a, u), with weights given by the number of deaths, d(a, u):

$$
\bar{\mu}(a) = \mathrm{AH}[\mu(a,u); d(a,u)] = \frac{\int_\alpha^\beta d(a,u)\,du}{\int_\alpha^\beta \frac{d(a,u)}{\mu(a,u)}\,du} = \left(\frac{\int_\alpha^\beta d(a,u)\,(\mu(a,u))^{-1}\,du}{\int_\alpha^\beta d(a,u)\,du}\right)^{-1}. \tag{2}
$$

The relationship holds whenever a population can be partitioned into subgroups – i.e., when the population can be divided into a set of mutually exclusive and collectively exhaustive subgroups. There are many potentially interesting ways that populations can be understood to be continuous mixtures of non-overlapping subpopulations: for example, at any moment, a population is a mixture of people with different blood pressure, cholesterol, BMI, height, and so forth. The decomposition does not depend upon any particular feature of death rates; it will also hold for any other type of occurrence-exposure rate.

2. Proof

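Proof. Let ℓ(a, u) be the number of survivors to exact age a with trait u, where the values of u partition the population. Since d(a, u) = μ(a, u) ℓ(a, u), we have

$$
\bar{\mu}(a) = \frac{\int_\alpha^\beta d(a,u)\,du}{\int_\alpha^\beta \ell(a,u)\,du} = \frac{\int_\alpha^\beta d(a,u)\,du}{\int_\alpha^\beta \frac{\mu(a,u)\,\ell(a,u)}{\mu(a,u)}\,du} = \frac{\int_\alpha^\beta d(a,u)\,du}{\int_\alpha^\beta \frac{d(a,u)}{\mu(a,u)}\,du}. \tag{3}
$$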
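The identity in Equation 3 can be checked numerically. The sketch below is illustrative only: the survivor function `ell` and the hazard `mu` are hypothetical functions of the trait u on [0, 1] at one fixed age, and the integrals are approximated with a simple midpoint rule.

```python
import math

ALPHA, BETA = 0.0, 1.0  # hypothetical trait range [alpha, beta]


def ell(u):
    """Hypothetical number of survivors with trait u at a fixed age a."""
    return 1000.0 * math.exp(-u)


def mu(u):
    """Hypothetical (strictly positive) hazard at the same age for trait u."""
    return 0.01 * (1.0 + 2.0 * u)


def deaths(u):
    """Deaths by trait value: d(a, u) = mu(a, u) * l(a, u)."""
    return mu(u) * ell(u)


def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


total_deaths = integrate(deaths, ALPHA, BETA)
total_survivors = integrate(ell, ALPHA, BETA)

# First expression in Equation 3: total deaths over total survivors.
aggregate_hazard = total_deaths / total_survivors

# Last expression in Equation 3: harmonic mean of mu weighted by deaths.
harmonic_mean = total_deaths / integrate(lambda u: deaths(u) / mu(u), ALPHA, BETA)

print(aggregate_hazard, harmonic_mean)
```

The two printed values agree up to floating-point rounding, and the aggregate hazard lies between the smallest and largest subgroup hazards, as any weighted mean must.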

3. Related relationships

Discrete version

A harmonic mean can also be defined for discrete quantities, which leads to an analogous result for a population that has been partitioned into a discrete set of subgroups. Suppose that a population has been partitioned into a countable set of subgroups, indexed by i, and let the death count, exposure, and death rate in subgroup i between ages a and a + n be denoted ${}_nD_a^i$, ${}_nL_a^i$, and ${}_nM_a^i = {}_nD_a^i / {}_nL_a^i > 0$, respectively. Then

$$
{}_nM_a = \frac{\sum_i {}_nD_a^i}{\sum_i {}_nL_a^i} = \frac{\sum_i {}_nD_a^i}{\sum_i \frac{{}_nL_a^i \, {}_nM_a^i}{{}_nM_a^i}} = \frac{\sum_i {}_nD_a^i}{\sum_i \frac{{}_nD_a^i}{{}_nM_a^i}}, \tag{4}
$$

where ${}_nM_a$ is the aggregate death rate between ages a and a + n. In words, the aggregate death rate is the discrete harmonic mean of the subgroup death rates, with weights given by the number of deaths in each subgroup.
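As a small worked instance of Equation 4, the following sketch uses made-up death counts and exposures for three subgroups (the numbers are assumptions for illustration, not data from the paper) and confirms that pooling deaths and exposure gives the same rate as the death-weighted harmonic mean of the subgroup rates.

```python
# Hypothetical subgroup data for a single age interval (illustrative only).
subgroup_deaths = [30.0, 50.0, 20.0]             # nD_a^i
subgroup_exposure = [10_000.0, 5_000.0, 1_000.0]  # nL_a^i, in person-years

# Subgroup death rates nM_a^i = nD_a^i / nL_a^i.
subgroup_rates = [d / l for d, l in zip(subgroup_deaths, subgroup_exposure)]

# Aggregate death rate: pooled deaths over pooled exposure.
pooled_rate = sum(subgroup_deaths) / sum(subgroup_exposure)

# Harmonic mean of the subgroup rates, weighted by deaths (Equation 4).
harmonic_by_deaths = sum(subgroup_deaths) / sum(
    d / m for d, m in zip(subgroup_deaths, subgroup_rates)
)

print(pooled_rate, harmonic_by_deaths)
```

Both quantities equal 100 / 16,000 = 0.00625, up to floating-point rounding.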

Length-biased sampling

The harmonic mean often arises in applications of length-biased sampling (see, e.g., de Carvalho (2016) and Patil (2014) for reviews). Under length-biased sampling, the probability of observing a characteristic is proportional to the value of that characteristic.

Formally, suppose that the population-level distribution of some characteristic x > 0 is given by a probability density function f(x). (We use x to denote the general case of any single characteristic, as opposed to the earlier example of a population stratified simultaneously by age a and some non-age trait u.) Now consider y > 0, a length-biased observation from f. Then the probability density function describing the distribution of the length-biased observation y is given by f*(y):

$$
f^*(y) = \frac{y\,f(y)}{E_f[x]}, \tag{5}
$$

where $E_f[x] = \int_0^\infty x f(x)\,dx$ is the usual arithmetic mean of x taken with respect to the non-length-biased distribution f. From this observation mechanism, de Carvalho (2016) shows that

$$
E_{f^*}\!\left(\frac{1}{y}\right) = \int_0^\infty \frac{1}{u} \times \frac{u f(u)}{E_f[x]}\,du = \frac{1}{E_f[x]} \int_0^\infty f(u)\,du = \frac{1}{E_f[x]}, \tag{6}
$$

where $E_{f^*}$ is the expectation taken with respect to the length-biased distribution f*. Thus, Equation 6 shows that $E_f[x] = 1 / E_{f^*}[1/y]$. Since the harmonic mean of the length-biased distribution is $\mathrm{AH}[y; f^*] = \frac{\int_0^\infty f^*(u)\,du}{\int_0^\infty \frac{1}{u} f^*(u)\,du} = \frac{1}{E_{f^*}[1/y]}$, the relationship in Equation 6 says that the arithmetic mean of the non-length-biased distribution is equal to the harmonic mean of the length-biased distribution. Thus, when we wish to recover the population average from length-biased samples, we should use the harmonic mean.
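A discrete analogue makes Equations 5 and 6 concrete. In this sketch the three-point distribution (`values`, `probs`) is a hypothetical stand-in for the density f; probabilities replace densities and sums replace integrals.

```python
# Hypothetical discrete population distribution of a positive characteristic x.
values = [1.0, 2.0, 5.0]
probs = [0.5, 0.3, 0.2]   # f(x), sums to 1

# Arithmetic mean under the non-length-biased distribution, E_f[x].
mean_x = sum(p * x for x, p in zip(values, probs))

# Length-biased probabilities, f*(y) = y f(y) / E_f[x] (Equation 5).
lb_probs = [x * p / mean_x for x, p in zip(values, probs)]

# Harmonic mean under f*, i.e. 1 / E_{f*}[1/y] (Equation 6).
harmonic_mean_lb = 1.0 / sum(q / y for y, q in zip(values, lb_probs))

print(mean_x, harmonic_mean_lb)
```

Here the population mean is 2.1, and the harmonic mean taken under the length-biased probabilities returns the same value up to rounding.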

What happens if, instead, we take the arithmetic mean of length-biased samples? Recall that, when y > 0,

$$
E_{f^*}[y] = \int_0^\infty u \times \frac{u f(u)}{E_f[x]}\,du = \frac{1}{E_f[x]} \int_0^\infty u^2 f(u)\,du = \frac{E_f[x^2]}{E_f[x]}. \tag{7}
$$

Equation 7 is the second moment of the non-length-biased distribution, $E_f[x^2]$, divided by the first moment of the non-length-biased distribution, $E_f[x]$. We can use the definition of variance to help make Equation 7 more interpretable: by definition, $\mathrm{var}_f[x] = E_f[x^2] - E_f[x]^2$; so, rearranging, we have $E_f[x^2] = \mathrm{var}_f[x] + E_f[x]^2$. Plugging this relationship into Equation 7, we obtain

$$
E_{f^*}[y] = \frac{1}{E_f[x]}\left[E_f[x]^2 + \mathrm{var}_f[x]\right] = E_f[x]\left[1 + cv_f^2[x]\right], \tag{8}
$$

where $cv_f^2[x] = \mathrm{var}_f[x] / E_f^2[x]$ is the squared coefficient of variation of x taken with respect to the distribution f (Sen, 1987). In words, Equation 8 shows that the arithmetic mean of length-biased samples will differ from the underlying population mean by a factor that increases in the squared coefficient of variation of samples from the non-length-biased distribution.
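Equation 8 can also be illustrated by simulation. The sketch below assumes a hypothetical three-point population distribution, draws length-biased samples with Python's standard `random.choices`, and compares the sample arithmetic mean with the value predicted by Equation 8. The seed and sample size are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so that the sketch is reproducible

# Hypothetical discrete population distribution of x (illustrative only).
values = [1.0, 2.0, 5.0]
probs = [0.5, 0.3, 0.2]

mean_x = sum(p * x for x, p in zip(values, probs))
var_x = sum(p * (x - mean_x) ** 2 for x, p in zip(values, probs))
cv2_x = var_x / mean_x ** 2  # squared coefficient of variation

# Length-biased sampling: each value is drawn with probability
# proportional to x * f(x).
lb_weights = [x * p for x, p in zip(values, probs)]
sample = random.choices(values, weights=lb_weights, k=100_000)

sample_mean = sum(sample) / len(sample)
predicted_mean = mean_x * (1.0 + cv2_x)  # right-hand side of Equation 8

print(round(mean_x, 3), round(predicted_mean, 3), round(sample_mean, 3))
```

The arithmetic mean of the length-biased draws should sit close to the predicted value of about 3.19, well above the population mean of 2.1.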

In the case of death rates, Equation 7 can be used to show that the arithmetic mean of subpopulation mortality weighted by the number of deaths is affected by length bias:

$$
E_{d(x)}[\mu(x)] = \frac{\int_0^\infty \mu(x)\,d(x)\,dx}{\int_0^\infty d(x)\,dx} = \frac{\int_0^\infty \mu(x)^2\,\ell(x)\,dx}{\int_0^\infty \mu(x)\,\ell(x)\,dx} = \frac{E_{\ell(x)}[\mu(x)^2]}{E_{\ell(x)}[\mu(x)]}, \tag{9}
$$

where $E_{d(x)}$ is the expectation taken with respect to the distribution of deaths at each age, d(x), and $E_{\ell(x)}$ is the expectation taken with respect to the distribution of survivors at each age, ℓ(x).

Equation 9 has the form of the arithmetic mean of length-biased samples (Equation 7). We illustrate these relationships in the appendix with an example based on aggregating subnational life tables. We also show a different example of how this length-biasing relationship arises in demography in Section 5.

Two ways of aggregating rates

Closer inspection of Equation 4 reveals that the arithmetic mean can also be used to aggregate across subgroups, now using the exposure as the weights, rather than the deaths. So, using AA[X; W] to refer to the arithmetic mean of X with weights given by W, we have AH[M; D] = AA[M; L], or

$$
\underbrace{\frac{{}_nD_a}{{}_nL_a}}_{\text{Aggregate death rate}} = \underbrace{\frac{\sum_i {}_nD_a^i}{\sum_i \frac{{}_nD_a^i}{{}_nM_a^i}}}_{\text{Harmonic mean weighted by deaths}} = \underbrace{\frac{\sum_i {}_nM_a^i\,{}_nL_a^i}{\sum_i {}_nL_a^i}}_{\text{Arithmetic mean weighted by exposure}}. \tag{10}
$$

The two decompositions in Equation 10 distinguish between weighting subgroup death rates by exposure (arithmetic mean) or by the number of deaths (harmonic mean). The conceptual key is that the harmonic mean is a natural average to use for quantities whose reciprocals aggregate according to the usual (arithmetic) mean. This explains why the harmonic mean is relevant in length-biased sampling: Equation 6 shows how taking the reciprocal ‘adjusts’ for the length bias, because $\frac{1}{u} \cdot u f(u) = f(u)$. (We discuss this in greater detail below.)

For death rates, the arithmetic mean decomposition in Equation 10 says that, fixing exposure, deaths will be observed in direct proportion to the subgroups’ death rates. For example, suppose that subgroup 1 has twice the death rate of subgroup 2, (i.e., M1 = 2M2), and that we observe a fixed amount of exposure, say L, from each subgroup. Then subgroup 1 will be expected to contribute twice as many deaths to the aggregate as subgroup 2 (i.e., D1 = M1 L = 2M2L = 2D2). Thus, when weighted by exposure, the death rates aggregate arithmetically.

However, the harmonic mean decomposition in Equation 10 says that, fixing deaths, exposure will not be observed in proportion to subgroups’ death rates; instead, exposure will be inversely proportional to subgroups’ death rates. For example, suppose again that subgroup 1 has twice the death rate of subgroup 2 (i.e., M1 = 2M2), but we now observe a fixed number of deaths, say D, in each subgroup. Then we expect that subgroup 1 will contribute half as much exposure as subgroup 2 to the aggregate (i.e., L1 = D/M1 = D/(2M2) = L2/2). Thus, each subgroup’s contribution to the aggregate exposure is proportional to the reciprocal of its death rate, and the death rates aggregate according to the harmonic mean. (Note that this also implies that the expected waiting time to each death – i.e., the reciprocal of the death rate – aggregates in the usual way, according to the arithmetic mean.)
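The two-subgroup example can be worked through in a few lines. In this sketch the rate `m2`, the fixed exposure `L`, and the fixed death count `D` are hypothetical numbers chosen for illustration, and the helper functions are simple discrete weighted means.

```python
def arithmetic_mean(values, weights):
    """Weighted arithmetic mean, AA[values; weights]."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def harmonic_mean(values, weights):
    """Weighted harmonic mean, AH[values; weights]."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


m2 = 0.01      # hypothetical death rate in subgroup 2
m1 = 2 * m2    # subgroup 1 has twice the death rate
rates = [m1, m2]

# Scenario 1: the same exposure L is observed in each subgroup.
L = 1_000.0
exposure_1 = [L, L]
deaths_1 = [m * L for m in rates]          # D1 = 2 * D2
rate_1 = sum(deaths_1) / sum(exposure_1)   # aggregate death rate

# Scenario 2: the same number of deaths D is observed in each subgroup.
D = 10.0
deaths_2 = [D, D]
exposure_2 = [D / m for m in rates]        # L1 = L2 / 2
rate_2 = sum(deaths_2) / sum(exposure_2)   # aggregate death rate

print(rate_1, arithmetic_mean(rates, exposure_1), harmonic_mean(rates, deaths_1))
print(rate_2, arithmetic_mean(rates, exposure_2), harmonic_mean(rates, deaths_2))
```

In each scenario the three printed values coincide: the aggregate rate is 0.015 with equal exposure and about 0.0133 with equal deaths, and in both cases it equals the exposure-weighted arithmetic mean and the death-weighted harmonic mean.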

4. History

There is statistical literature on the properties and uses of the harmonic mean (e.g., de Carvalho, 2016; Sen, 1987). Demographers originally discussed the harmonic mean explicitly in the context of selection problems and heterogeneity in fertility (Sheps, 1964; Sheps, Menken, and Radick, 1973) and mortality (Keyfitz and Littman, 1979), or implicitly in the context of understanding family size, viewed from the perspective of mothers and children (Preston, 1976). However, after these papers, demographers have made relatively little use of the harmonic mean, in spite of the field’s focus on aggregate rates and their relationship to underlying individual rates. (An exception is Schoen (2013), who investigated the harmonic mean as it relates to two-sex population models.)

5. Application

Aggregation is a central focus of the literature on heterogeneity in mortality (e.g., Vaupel, Manton, and Stallard (1979); Vaupel and Missov (2014)). We illustrate the harmonic mean by applying it to derive a new formula for a classic result in that literature.

In what Vaupel and Missov (2014) call the ‘relative risks and fixed frailty’ model, heterogeneity in death rates is captured by associating each individual in the population with a frailty parameter z > 0. People with frailty parameter z face hazards that are related to a baseline via

$$
\mu(a, z) = z\,\mu_s(a), \tag{11}
$$

where $\mu_s(a) \equiv \mu(a, 1)$ is the hazard faced by a ‘standard individual’, with frailty z = 1, at age a.¹ Note that this model describes a situation in which our result applies: the population can be understood to be a continuous mixture of people who have different frailty parameters.

We will now see that, at age a, the frailties of people who die are length-biased samples of the frailties of people who are alive. Following Vaupel and Missov (2014, p. 663), let s(a, z) be the survivorship to age a among those with frailty parameter z and let $\bar{s}(a)$ be the aggregate survivorship to age a. Let π(a, z) be the density of people with frailty parameter z among survivors to age a, and let $\bar{z}(a)$ be the average frailty of survivors at age a. Finally, let d(a, z) be the density of deaths at age a to people with frailty parameter z and let d(a) be the density of deaths at age a. Then the probability that a death aged a has frailty z is given by:

$$
\begin{aligned}
\frac{d(a,z)}{d(a)} &= \frac{\mu(a,z)\,s(a,z)\,\pi(0,z)}{\int_0^\infty \mu(a,z)\,s(a,z)\,\pi(0,z)\,dz} \\
&= \frac{\mu(a,z)\,\pi(a,z)\,\bar{s}(a)}{\int_0^\infty \mu(a,z)\,\pi(a,z)\,\bar{s}(a)\,dz} \\
&= \frac{\mu(a,z)\,\pi(a,z)}{\int_0^\infty \mu(a,z)\,\pi(a,z)\,dz} \\
&= \frac{z\,\mu_s(a)\,\pi(a,z)}{\int_0^\infty z\,\mu_s(a)\,\pi(a,z)\,dz} \\
&= \frac{z\,\pi(a,z)}{\int_0^\infty z\,\pi(a,z)\,dz} \\
&= \frac{z\,\pi(a,z)}{\bar{z}(a)},
\end{aligned} \tag{12}
$$

where the second line uses the fact that $s(a,z)\,\pi(0,z) = \pi(a,z)\,\bar{s}(a)$ (Vaupel and Missov, 2014, p. 663). Equation 12 shows that the probability that a death aged a has frailty z equals z times the density of frailties among survivors aged a, divided by the average frailty among survivors aged a. Thus, Equation 12 has exactly the same form as a length-biased sampling density (Equation 5). In words, the frailty of someone who dies at age a can be understood as a length-biased sample from the distribution of frailties among people who survive to age a.

Because the frailties of the dying are length-biased samples, two results follow immediately. First, Equation 8 showed (as is well-known) that the arithmetic mean of length-biased samples can be written in terms of the mean and squared coefficient of variation in the non-length-biased population distribution; in the relative risks and fixed frailty model, Equation 8 provides an expression for z(a), the average frailty of the dead:

$$
z(a) = \bar{z}(a)\left[1 + cv_z^2(a)\right], \tag{13}
$$

where $cv_z^2(a)$ is the squared coefficient of variation in the frailty parameter of people at age a. Equation 13 has been previously derived by Vaupel, Manton, and Stallard (1979, p. 442); a proof is also briefly discussed in Vaupel and Missov (2014, p. 667–668).
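A short numerical check of Equation 13 follows. It is a sketch under stated assumptions: frailty takes only three hypothetical values, `pi_a` is a made-up distribution of frailty among survivors at one age, and `mu_s` is an arbitrary baseline hazard.

```python
# Hypothetical frailty distribution among survivors at some age a.
frailty = [0.5, 1.0, 2.0]   # z
pi_a = [0.3, 0.5, 0.2]      # pi(a, z), sums to 1
mu_s = 0.01                 # hypothetical baseline hazard mu_s(a)

# Mean and squared coefficient of variation of frailty among survivors.
mean_z = sum(p * z for z, p in zip(frailty, pi_a))
var_z = sum(p * (z - mean_z) ** 2 for z, p in zip(frailty, pi_a))
cv2_z = var_z / mean_z ** 2

# Deaths by frailty are proportional to mu(a, z) * pi(a, z) = z * mu_s * pi(a, z).
death_mass = [z * mu_s * p for z, p in zip(frailty, pi_a)]
death_share = [m / sum(death_mass) for m in death_mass]

# Average frailty among those who die at age a.
mean_z_dead = sum(q * z for z, q in zip(frailty, death_share))

print(mean_z, mean_z_dead, mean_z * (1.0 + cv2_z))
```

With these numbers the survivors have mean frailty 1.05, while the dying have mean frailty of about 1.31, which matches the right-hand side of Equation 13.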

Second, Equation 6 showed that the harmonic mean of length-biased samples will recover the population (non-length-biased) mean. In the relative risks and fixed frailty model, Equation 6 provides an expression for the average frailty of people who survive to age a, in terms of the frailties of people who die at age a:

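$$
\begin{aligned}
\mathrm{AH}[z; d(a,z)] &= \frac{\int_0^\infty d(a,z)\,dz}{\int_0^\infty \frac{d(a,z)}{z}\,dz} = \frac{\int_0^\infty \mu(a,z)\,\pi(a,z)\,dz}{\int_0^\infty \frac{\mu(a,z)\,\pi(a,z)}{z}\,dz} = \frac{\int_0^\infty z\,\mu_s(a)\,\pi(a,z)\,dz}{\int_0^\infty \frac{z\,\mu_s(a)\,\pi(a,z)}{z}\,dz} \\
&= \frac{\int_0^\infty z\,\pi(a,z)\,dz}{\int_0^\infty \pi(a,z)\,dz} = \int_0^\infty z\,\pi(a,z)\,dz = \bar{z}(a).
\end{aligned} \tag{14}
$$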

Equation 14 reveals that, under the relative risks and fixed frailty model, the average frailty of people who survive to age a is equal to the harmonic mean of the frailties of people who die at age a. So, if the relative risks and fixed frailty model held in a real population, and z could be measured or estimated only from a sample of deaths at age a, then Equation 14 says that we could still recover the population average frailty by taking the harmonic mean of the frailties among the deaths. (Similarly, any way of estimating the frailty of the individuals alive at some moment would also produce an estimate of the frailty of those dying at the same moment.)
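Equation 14 can be illustrated by following a small hypothetical cohort. The sketch assumes three frailty groups with a made-up initial distribution `pi_0`, and uses the standard consequence of Equation 11 that survivorship is s(a, z) = exp(−z·H), where H is the cumulative baseline hazard up to age a. The values of H and `mu_s` are arbitrary.

```python
import math

frailty = [0.5, 1.0, 2.0]   # z
pi_0 = [0.25, 0.5, 0.25]    # hypothetical pi(0, z): frailty distribution at birth
cum_hazard = 1.2            # hypothetical cumulative baseline hazard up to age a
mu_s = 0.05                 # hypothetical baseline hazard at age a

# Survivorship to age a for each frailty group: s(a, z) = exp(-z * H).
surv = [math.exp(-z * cum_hazard) for z in frailty]

# Aggregate survivorship and frailty distribution among survivors, pi(a, z).
s_bar = sum(p * s for p, s in zip(pi_0, surv))
pi_a = [p * s / s_bar for p, s in zip(pi_0, surv)]
mean_z_survivors = sum(p * z for z, p in zip(frailty, pi_a))

# Deaths at age a by frailty: d(a, z) = mu(a, z) * s(a, z) * pi(0, z).
deaths_by_z = [z * mu_s * s * p for z, s, p in zip(frailty, surv, pi_0)]

# Harmonic mean of frailty among the dying, weighted by deaths (Equation 14).
harmonic_mean_dead = sum(deaths_by_z) / sum(
    d / z for z, d in zip(frailty, deaths_by_z)
)

print(mean_z_survivors, harmonic_mean_dead)
```

The harmonic mean of the frailties of those dying at age a reproduces the average frailty of the survivors, which is itself lower than the average frailty at birth because frailer groups are depleted faster.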

Finally, we note that the relationship between the harmonic mean and reciprocals suggests a new interpretation of the frailty parameter, z, by focusing attention on the reciprocal of frailty, 1/z. This frailty reciprocal, 1/z, represents the number of deaths to ‘standard individuals’ that are expected before a death to a single individual of frailty z occurs. As such, it can be conceptualized as a form of waiting time, or exposure, needed to produce a death at frailty z, expressed in a relative scale.

As this example illustrates, bringing harmonic means more widely into demographic analysis can create analogies and conceptual linkages between demographic quantities that are not otherwise obvious. This is particularly true for questions of aggregation, in which a length-biased selection process is often at play. More generally, the harmonic and arithmetic mean relationships discussed above illustrate how demographers can correctly aggregate rates across non-overlapping subgroups.

Acknowledgments

The authors gratefully acknowledge support from the Berkeley Population Center (P2C HD 073964) and the Minnesota Population Center (P2C HD041023). Emma Zhang provided helpful comments on an early draft of this manuscript at the PAA 2019 meetings in Austin, Texas.

Appendix: Empirical example

In this appendix, we use the example of combining subnational life tables to show that appreciable errors can arise when death rates are not aggregated correctly. To do so, we create a pseudo-population based on the life tables found in the US Mortality database.² The US Mortality database has life tables for each US state and for Washington, D.C. We treat these life tables as cohorts, each of which is initially the same size. We then compare three strategies for aggregating death rates for each single year of age, for each sex, and for both sexes combined.³

The aggregate death rate across these pseudo-cohorts is equal to

$$
{}_nM_a = \frac{\sum_i {}_nD_a^i}{\sum_i {}_nL_a^i}, \tag{15}
$$

where i indexes the pseudo-cohorts. In the main text, we show that this true aggregate is equal to AH[M, D] or, alternatively, AA[M, L].

In order to illustrate how much variation there is between the death rates across the sub-populations, we calculate the coefficient of variation:

$$
cv_L({}_nM_a^i) = \frac{sd_L({}_nM_a^i)}{{}_nM_a}, \tag{16}
$$

where $sd_L({}_nM_a^i)$ is the exposure-weighted standard deviation of the state life table death rates. The coefficient of variation quantifies the amount of spread in death rates across the subnational units, when compared to the true aggregate death rate.

Figure 1 shows the aggregated age-specific death rates for the pseudo-population created from the 51 state life tables (left panel); and the coefficient of variation across the 51 state life tables (right panel).

Figure 1: A) Aggregate death rates across sub-national pseudo-units; B) Relative variation in death rates across sub-national pseudo-units.

For each age-sex group, we compare the true aggregate death rate to two alternate strategies for aggregating death rates. The first alternate strategy is the simple (unweighted) arithmetic mean of the state death rates:

$$
{}_nM_a^{I} = \frac{1}{N} \sum_i {}_nM_a^i, \tag{17}
$$

where N is the number of states.

The second alternate strategy is the arithmetic mean weighted by the number of deaths, rather than the exposure:

$$
{}_nM_a^{II} = \frac{\sum_i {}_nM_a^i\,{}_nD_a^i}{\sum_i {}_nD_a^i}. \tag{18}
$$

Figure 2 shows the relative errors, $re({}_nM_a^{I}) = \frac{{}_nM_a^{I} - {}_nM_a}{{}_nM_a}$ and $re({}_nM_a^{II}) = \frac{{}_nM_a^{II} - {}_nM_a}{{}_nM_a}$, that result from aggregating death rates under each strategy. The figure reveals that relative errors can be of considerable magnitude – up to 10 percent or more in some scenarios.

Figure 2: Relative errors in aggregating using the unweighted arithmetic mean (Panel A) or in using the incorrectly weighted arithmetic mean (Panel B).

For the death-weighted arithmetic mean, comparing Panel B of Figure 2 to Panel B of Figure 1 illustrates how relative errors in ${}_nM_a^{II}$ are greater when there is greater variation in the 51 underlying subpopulation death rates. To better understand this relationship between the relative errors in ${}_nM_a^{II}$ and the variation in the death rates, note that ${}_nM_a^{II}$ weights by deaths but incorrectly uses the arithmetic mean instead of the harmonic mean. When weighting by the number of deaths, the arithmetic mean will be affected by length bias (Equation 9). To see this, note that the main text shows that the true aggregate death rate can be written as AA[M, L], or:

$$
{}_nM_a = \frac{\sum_i {}_nM_a^i\,{}_nL_a^i}{\sum_i {}_nL_a^i} = \sum_i {}_nM_a^i \times \left(\frac{{}_nL_a^i}{{}_nL_a}\right), \tag{19}
$$

where ${}_nL_a = \sum_i {}_nL_a^i$ is the aggregate exposure, and the term in parentheses is the correct weight to use with the arithmetic mean. As we saw in the main text, taking the arithmetic mean of a length-biased sample would mean using weights that have the form $f^*(y) = y \cdot f(y) \cdot (1/E_f[x])$. In this case, these length-biased weights are

$$
{}_nM_a^i \cdot \frac{{}_nL_a^i}{{}_nL_a} \cdot \frac{1}{{}_nM_a} = \frac{{}_nM_a^i\,{}_nL_a^i}{{}_nL_a\,{}_nM_a} = \frac{{}_nD_a^i}{{}_nD_a}. \tag{20}
$$

Thus, ${}_nM_a^{II}$, the death-weighted arithmetic mean of the subpopulation death rates, is a length-biased average. In this case, Equation 8 shows that we should expect larger errors from ${}_nM_a^{II}$ when there is more variation in the death rates across the sub-populations. In fact, Equation 8 can be used to derive an exact expression for its relative error. In the main text, Equation 8 revealed that $E_{f^*}[y] = E_f[x]\left[1 + cv_f^2[x]\right]$. In our empirical setting, this relationship implies that ${}_nM_a^{II} = {}_nM_a\left[1 + cv_L^2({}_nM_a^i)\right]$. Thus, we have

$$
re({}_nM_a^{II}) = \frac{{}_nM_a^{II} - {}_nM_a}{{}_nM_a} = \frac{{}_nM_a\left[1 + cv_L^2({}_nM_a^i)\right] - {}_nM_a}{{}_nM_a} = cv_L^2({}_nM_a^i). \tag{21}
$$

Thus, the relative error in a length-biased aggregation of subpopulation death rates, ${}_nM_a^{II}$, is exactly equal to $cv_L^2({}_nM_a^i)$, the square of the coefficient of variation (Equation 16). Figure 3 shows this directly, confirming that in our empirical example the relative errors in ${}_nM_a^{II}$ (Figure 2, Panel B) are exactly equal to the squared coefficients of variation in the subpopulation death rates (Figure 1, Panel B).
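The comparison of aggregation strategies, and the identity in Equation 21, can be reproduced on toy data. The sketch below does not use the US Mortality database; the four pseudo-units, their deaths, and their exposures are hypothetical numbers, and the coefficient of variation is computed with exposure weights around the true aggregate rate.

```python
# Hypothetical deaths and exposure for four pseudo-units at one age.
unit_deaths = [12.0, 30.0, 45.0, 8.0]
unit_exposure = [4_000.0, 6_000.0, 5_000.0, 1_000.0]
unit_rates = [d / l for d, l in zip(unit_deaths, unit_exposure)]

total_exposure = sum(unit_exposure)

# True aggregate death rate (Equation 15).
true_rate = sum(unit_deaths) / total_exposure

# Strategy I: unweighted arithmetic mean of the unit rates (Equation 17).
rate_unweighted = sum(unit_rates) / len(unit_rates)

# Strategy II: arithmetic mean weighted by deaths (Equation 18).
rate_death_weighted = sum(
    m * d for m, d in zip(unit_rates, unit_deaths)
) / sum(unit_deaths)

# Squared exposure-weighted coefficient of variation (Equation 16, squared).
var_l = sum(
    l * (m - true_rate) ** 2 for m, l in zip(unit_rates, unit_exposure)
) / total_exposure
cv2_l = var_l / true_rate ** 2

# Relative errors, using the sign convention of Equation 21.
rel_err_unweighted = (rate_unweighted - true_rate) / true_rate
rel_err_death_weighted = (rate_death_weighted - true_rate) / true_rate

print(rel_err_unweighted, rel_err_death_weighted, cv2_l)
```

On these toy numbers the relative error of the death-weighted arithmetic mean coincides with the squared coefficient of variation, while the unweighted mean has a different error that depends on how exposure is distributed across units.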

Figure 3: The relative errors resulting from incorrectly aggregating using the death-weighted arithmetic mean are exactly equal to the squared coefficient of variation in the subpopulation death rates. All points lie on the diagonal y = x line, confirming that these two quantities are equal.

Footnotes

1

To be consistent, we refer to the age variable as a here; note that Vaupel and Missov (2014) use x for age in their paper.

3

For each age and sex, we restrict our sample to the states that have at least one death; this is typically all 51 states, but some small states have no deaths at low-mortality ages – Vermont, for example, has no deaths at age 2 in the 2015 USMD life table. The harmonic mean is undefined when the outcome includes values of zero, since the reciprocal of the outcome appears in the denominator of the harmonic mean.

References

  1. de Carvalho M (2016). Mean, What do You Mean? The American Statistician 70(3): 270–274. doi: 10.1080/00031305.2016.1148632.
  2. Keyfitz N and Littman G (1979). Mortality in a heterogeneous population. Population Studies 33(2): 333–342.
  3. Patil GP (2014). Weighted distributions. Wiley StatsRef: Statistics Reference Online.
  4. Preston SH (1976). Family sizes of children and family sizes of women. Demography 13(1): 105–114.
  5. Schoen R (2013). Modeling Multigroup Populations. Springer Science & Business Media.
  6. Sen PK (1987). What do the arithmetic, geometric and harmonic means tell us in length-biased sampling? Statistics & Probability Letters 5(2): 95–98.
  7. Sheps MC (1964). On the time required for conception. Population Studies 18(1): 85–97.
  8. Sheps MC, Menken J, and Radick AP (1973). Mathematical Models of Conception and Birth. Chicago: University of Chicago Press.
  9. Vaupel JW, Manton KG, and Stallard E (1979). The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography 16(3): 439–454.
  10. Vaupel JW and Missov TI (2014). Unobserved population heterogeneity: A review of formal relationships. Demographic Research 31: 659–686.
