International Journal of Environmental Research and Public Health. 2024 Feb 10;21(2):208. doi: 10.3390/ijerph21020208

A Comparison of Statistical Methods to Construct Confidence Intervals and Fiducial Intervals for Measures of Health Disparities

Tengfei Li 1, Anca D Dragomir 2, George Luta 1,*
Editors: Antonio G Oliveira, Jimmy T Efird
PMCID: PMC10887721  PMID: 38397697

Abstract

Health disparities are differences in health status across different socioeconomic groups. Classical methods, e.g., the Delta method, have been used to estimate the standard errors of estimated measures of health disparities and to construct confidence intervals for these measures. However, the confidence intervals constructed using the classical methods do not have good coverage properties for situations involving sparse data. In this article, we introduce three new methods to construct fiducial intervals for measures of health disparities based on approximate fiducial quantities. Through a comprehensive simulation study, we compare the empirical coverage properties of the proposed fiducial intervals against two Monte Carlo simulation-based methods (utilizing either a truncated Normal distribution or the Gamma distribution) as well as the classical method. The findings of the simulation study support the adoption of the Monte Carlo simulation-based method with the Gamma distribution when a unified approach is sought for all health disparity measures.

Keywords: delta method, Monte Carlo simulation, fiducial inference, confidence interval, fiducial interval, measures of health disparities

1. Introduction

In recent years, more and more attention has been given to health equity, one of the goals of the Healthy People 2020 [1]. The World Health Organization (WHO) has pointed out the “gap” in health between segments of the population [2]. A health disparity is defined as “a particular type of health difference that is closely linked with social, economic, and/or environmental disadvantage” [1]. The US National Institute on Minority Health and Health Disparities has raised national awareness about the prevalence and impact of health disparities that would adversely affect groups of people who are more vulnerable to health-related issues. Last but not least, the US Centers for Disease Control and Prevention (CDC) has played an important role in identifying the factors that lead to health disparities among racial, ethnic, geographic, and other socioeconomic groups, an example being the 2011 CDC Health Disparities and Inequalities Report [3].

There are multiple measures available to quantify the presence of health disparities across socioeconomic groups. The Health Disparity Calculator (HD*Calc), version 2.0.0, is free statistical software that calculates estimates of commonly used measures of health disparities and constructs corresponding confidence intervals (CIs), using both classical and Monte Carlo simulation (MCS)-based methods [4,5,6]. The measures implemented in HD*Calc belong to three categories: absolute measures, relative measures and pairwise comparison measures. The absolute measures include range difference (RD), between-group variance (BGV), extended absolute concentration index (eACI) and the slope index of inequality (SII). The relative measures include range ratio (RR), index of disparity (IDisp), mean log deviation (MLD), Theil’s index (T), extended relative concentration index (eRCI), relative index of inequality (RII) and the Kunst–Mackenbach relative index (KMI). The pairwise comparison measures include pair difference (PD) and pair ratio (PR). Although HD*Calc was designed to analyze data from the Surveillance, Epidemiology, and End Results (SEER) Program, the software can also be used for other population-based health data.

Two articles have formally evaluated the empirical coverage properties of the methods to construct CIs implemented in HD*Calc. The first article has compared the classical method and the MCS-based method using the truncated Normal distribution [7]. The authors concluded that the two methods work well except for situations when the data are sparse. As a general solution to dealing with sparse data, the second article has proposed the use of the MCS-based method with the Gamma distribution [8]. The MCS-based method with the Gamma distribution is currently the recommended approach to construct CIs for the measures of health disparities implemented in HD*Calc. By extending the work from Krishnamoorthy and Lee 2010 [9] to the case of measures of health disparities, the aims of the current article are to introduce three new methods to construct fiducial intervals for measures of health disparities, based on approximate fiducial quantities, and to compare their frequentist properties, i.e., their empirical coverage performance, with those of existing methods by using a simulation study involving nine different scenarios that allow different combinations of sample sizes and true rates per cell (where the cells are all cross-classifications of age groups and socioeconomic groups).

This paper is organized as follows. We review the measures of health disparities implemented in HD*Calc and describe the statistical methods used to construct confidence intervals and fiducial intervals, including the classical method, the MCS-based methods and the proposed new fiducial methods. We describe the simulation study used to evaluate the empirical coverage performance of the intervals constructed using these methods, and, at the end, report and discuss the results of the simulation study. We provide all the results of the simulation study in Appendix A.

2. Materials and Methods

2.1. Background and Notation

In what follows, in contrast to previous work [6,10], we clearly distinguish between functions of parameters and their estimates. We denote by $\lambda_{jk}$ the true rate (e.g., cancer rate) of the $k$-th age group within the $j$-th socioeconomic group, $j=1,\ldots,J$, $k=1,\ldots,K$. The true age-adjusted rate of the $j$-th socioeconomic group, $\mu_j$, is defined as

$$\mu_j=\sum_{k=1}^{K} w_k\lambda_{jk}, \tag{1}$$

where $w_k$ is the weight for the $k$-th age group within the $j$-th socioeconomic group.

To estimate the true rate $\lambda_{jk}$ and the true age-adjusted rate $\mu_j$, we use the unbiased estimators $R_{jk}$ and $Y_j$, respectively. We denote the estimated rate of the $k$-th age group within the $j$-th socioeconomic group as

$$R_{jk}=\frac{D_{jk}}{n_{jk}},\quad j=1,\ldots,J,\ k=1,\ldots,K,$$

where $D_{jk}$ denotes the number of events and $n_{jk}$ denotes the number of persons (or person-years). The estimated age-adjusted rate is

$$Y_j=\sum_{k=1}^{K} w_k R_{jk}. \tag{2}$$

We assume that $D_{jk}\sim\mathrm{Poisson}(n_{jk}\lambda_{jk})$. The variance of the estimator $Y_j$ is

$$\sigma_j^2=\mathrm{Var}(Y_j)=\sum_{k=1}^{K} w_k^2\,\mathrm{Var}(R_{jk})=\sum_{k=1}^{K}\frac{w_k^2}{n_{jk}^2}\,\mathrm{Var}(D_{jk})=\sum_{k=1}^{K}\frac{w_k^2}{n_{jk}^2}\,n_{jk}\lambda_{jk}=\sum_{k=1}^{K}\frac{w_k^2}{n_{jk}}\,\lambda_{jk}. \tag{3}$$

An unbiased estimate of this variance is

$$\hat{\sigma}_j^2=\sum_{k=1}^{K}\frac{w_k^2}{n_{jk}^2}\,D_{jk}. \tag{4}$$

We define the vector of $J$ estimators/estimates as $Y=(Y_1,\ldots,Y_J)$ and the vector of the $J$ true age-adjusted rates as $\mu=(\mu_1,\ldots,\mu_J)$, where $E(Y_j)=\mu_j$ for $j=1,\ldots,J$. In what follows, we assume that the $J$ estimators are independent and the estimated variances of these estimators are given by $\hat{\sigma}_j^2$ for $j=1,\ldots,J$.
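To make the notation concrete, the following minimal Python sketch computes the estimated age-adjusted rate in (2) and the variance estimate in (4) for a single socioeconomic group. The function name and the input numbers are hypothetical and chosen only for illustration; they do not come from the study.

```python
def age_adjusted_rate(events, person_years, weights):
    """Return (Y_j, sigma_hat_j^2) for one socioeconomic group.

    events[k]       -> D_jk, observed number of events in age group k
    person_years[k] -> n_jk, persons (or person-years) in age group k
    weights[k]      -> w_k, standard-population weight of age group k
    """
    rate = 0.0
    variance = 0.0
    for d, n, w in zip(events, person_years, weights):
        rate += w * d / n               # w_k * R_jk, Equation (2)
        variance += w ** 2 * d / n ** 2  # w_k^2 * D_jk / n_jk^2, Equation (4)
    return rate, variance


# Hypothetical data: one group, four age groups.
events = [2, 5, 9, 14]
person_years = [1000, 2000, 3000, 4000]
weights = [0.4, 0.3, 0.2, 0.1]

y, var = age_adjusted_rate(events, person_years, weights)
print(y, var)
```

The same function would be applied once per socioeconomic group to obtain the vectors of estimated rates and estimated variances used by the interval methods.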

2.2. Measures of Health Disparities

The (true) measures of health disparities are functions of parameters, F(μ), although they are often not clearly distinguished from their estimates, F(Y), which may lead to confusion. Depending on the function F(·), we obtain different measures of health disparities. In what follows, we will present the measures implemented in HD*Calc. For simplicity, we will refer to the (true) age-adjusted rates simply as (true) rates.

2.2.1. Range Difference (RD) and Pair Difference (PD)

The range difference is the difference between the true rates of the best and the worst socioeconomic groups

$$\mathrm{RD}=\mu_{\max}-\mu_{\min}, \tag{5}$$

where $\mu_{\max}=\max_j \mu_j$ and $\mu_{\min}=\min_j \mu_j$. It is estimated by

$$\widehat{\mathrm{RD}}=Y_{(J)}-Y_{(1)}, \tag{6}$$

where $Y_{(j)}$ is the $j$-th order statistic of the observed values of $Y$. This may cause problems as $Y_{(J)}$ and $Y_{(1)}$ may not necessarily be unbiased estimators of $\mu_{\max}$ and $\mu_{\min}$, respectively. To address this issue, we may fix in advance the groups to be compared and consider instead the pair difference

$$\mathrm{PD}=\mu_1-\mu_2, \tag{7}$$

which has as its estimator

$$\widehat{\mathrm{PD}}=Y_1-Y_2, \tag{8}$$

where $Y_1$ and $Y_2$ are estimators of $\mu_1$ and $\mu_2$, respectively.

2.2.2. Between-Group Variance (BGV)

BGV is calculated using the squares of the differences between the socioeconomic groups’ rates and the population mean rate, with weighting by the corresponding population share

$$\mathrm{BGV}=\sum_{j=1}^{J}p_j(\mu_j-\bar{\mu})^2, \tag{9}$$

where

$$p_j=\frac{n_j}{\sum_{s=1}^{J}n_s} \tag{10}$$

is the population share of the $j$-th socioeconomic group (treated as essentially known, i.e., estimated with negligible sampling error), and

$$\bar{\mu}=\sum_{j=1}^{J}p_j\mu_j \tag{11}$$

is the population mean rate. The estimator of BGV is given by

$$\widehat{\mathrm{BGV}}=\sum_{j=1}^{J}p_j(Y_j-\bar{Y})^2, \tag{12}$$

where $\bar{Y}=\sum_{j=1}^{J}p_jY_j$.
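As an illustration of (10) and (12), the sketch below computes the population shares and the BGV estimate from group-level rates and group sizes. The function name and the numbers are hypothetical.

```python
def between_group_variance(rates, group_sizes):
    """Estimate BGV from age-adjusted rates Y_j and group sizes n_j."""
    total = sum(group_sizes)
    shares = [n / total for n in group_sizes]               # p_j, Equation (10)
    mean_rate = sum(p * y for p, y in zip(shares, rates))   # Y-bar
    # Population-share-weighted squared deviations, Equation (12).
    return sum(p * (y - mean_rate) ** 2 for p, y in zip(shares, rates))


# Hypothetical rates and sizes for three socioeconomic groups.
rates = [0.002, 0.003, 0.005]
group_sizes = [5000, 3000, 2000]

bgv = between_group_variance(rates, group_sizes)
print(bgv)
```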

2.2.3. Range Ratio (RR) and Pair Ratio (PR)

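The RR is similar to the RD, where we replace the subtraction with division. It is defined as

$$\mathrm{RR}=\frac{\mu_{\max}}{\mu_{\min}}, \tag{13}$$

where $\mu_{\max}$ and $\mu_{\min}$ are defined in (5). It is estimated by

$$\widehat{\mathrm{RR}}=\frac{Y_{(J)}}{Y_{(1)}}, \tag{14}$$

where $Y_{(1)}$ and $Y_{(J)}$ are defined in (6).

Similarly to PD, PR is defined as

$$\mathrm{PR}=\frac{\mu_1}{\mu_2}, \tag{15}$$

and is estimated by

$$\widehat{\mathrm{PR}}=\frac{Y_1}{Y_2}, \tag{16}$$

where $Y_1$ and $Y_2$ are estimators of $\mu_1$ and $\mu_2$, respectively.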

2.2.4. Relative Concentration Index (RCI) and Extended Relative Concentration Index (eRCI)

RCI is a measure that can be used only with ordinal socioeconomic groups. It is defined by Kakwani et al., 1997 [11] as

$$\mathrm{RCI}=\frac{2}{\bar{\mu}}\sum_{j=1}^{J}p_jz_j\mu_j-1, \tag{17}$$

where $\bar{\mu}$ and $p_j$ are defined in (11) and (10), respectively. Here, $z_j$ is the relative rank of the $j$-th ordinal socioeconomic group, defined as

$$z_j=\sum_{k=1}^{j}p_k-\frac{1}{2}p_j. \tag{18}$$

RCI is estimated by

$$\widehat{\mathrm{RCI}}=\frac{2}{\bar{Y}}\sum_{j=1}^{J}p_jz_jY_j-1, \tag{19}$$

where $\bar{Y}$, $p_j$ and $z_j$ are defined in (12), (10) and (18), respectively.

Yu et al., 2019 [12] used eRCI as a measure of health disparities. It can be calculated as

$$\mathrm{eRCI}=\nu\sum_{j=1}^{J}p_j(1-z_j)^{\nu-1}-\frac{\nu}{\bar{\mu}}\sum_{j=1}^{J}p_j\mu_j(1-z_j)^{\nu-1}, \tag{20}$$

where $\nu>0$ is the aversion parameter, and $\mu_j$, $\bar{\mu}$, $p_j$ and $z_j$ are the same as in (17). The estimator is

$$\widehat{\mathrm{eRCI}}=\nu\sum_{j=1}^{J}p_j(1-z_j)^{\nu-1}-\frac{\nu}{\bar{Y}}\sum_{j=1}^{J}p_jY_j(1-z_j)^{\nu-1}. \tag{21}$$

If $\nu=2$ we obtain RCI. In this article, we use $\nu=3$ for eRCI.
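The relationship between the two indices can be checked numerically. The sketch below implements the relative ranks in (18) and the estimators in (19) and (21) as read from the formulas above, and confirms that the extended index with ν = 2 agrees with RCI. Function names and input values are hypothetical.

```python
def relative_ranks(shares):
    """Relative rank z_j: cumulative share up to group j minus half of p_j."""
    ranks, cumulative = [], 0.0
    for p in shares:
        cumulative += p
        ranks.append(cumulative - 0.5 * p)
    return ranks


def rci(rates, shares):
    """Relative concentration index, Equation (19)."""
    z = relative_ranks(shares)
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    weighted = sum(p * zj * y for p, zj, y in zip(shares, z, rates))
    return 2.0 / mean_rate * weighted - 1.0


def erci(rates, shares, nu=3.0):
    """Extended relative concentration index, Equation (21)."""
    z = relative_ranks(shares)
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    first = nu * sum(p * (1.0 - zj) ** (nu - 1.0) for p, zj in zip(shares, z))
    second = nu / mean_rate * sum(
        p * y * (1.0 - zj) ** (nu - 1.0) for p, zj, y in zip(shares, z, rates)
    )
    return first - second


# Hypothetical ordered groups (lowest to highest socioeconomic position).
shares = [0.5, 0.3, 0.2]
rates = [0.002, 0.003, 0.005]

print(rci(rates, shares), erci(rates, shares, nu=2.0), erci(rates, shares, nu=3.0))
```

The agreement at ν = 2 relies on the share-weighted mean of the relative ranks being 1/2, which holds whenever the shares sum to one.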

2.2.5. Absolute Concentration Index (ACI) and Extended Absolute Concentration Index (eACI)

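ACI is the absolute version of RCI. It has the following formula

$$\mathrm{ACI}=\bar{\mu}\,\mathrm{RCI}=\sum_{j=1}^{J}p_j(2z_j-1)\mu_j, \tag{22}$$

which can be estimated by

$$\widehat{\mathrm{ACI}}=\sum_{j=1}^{J}p_j(2z_j-1)Y_j, \tag{23}$$

where $p_j$ and $z_j$ are defined in (10) and (18), respectively.

Yu et al. 2019 [12] used eACI as a measure of health disparities. It can be calculated as

$$\mathrm{eACI}=\bar{\mu}\,\mathrm{eRCI}=\nu\bar{\mu}\sum_{j=1}^{J}p_j(1-z_j)^{\nu-1}-\nu\sum_{j=1}^{J}p_j\mu_j(1-z_j)^{\nu-1}, \tag{24}$$

where $\nu>0$ is the aversion parameter, and $\mu_j$, $\bar{\mu}$, $p_j$ and $z_j$ are the same as in (22). The estimator is

$$\widehat{\mathrm{eACI}}=\nu\bar{Y}\sum_{j=1}^{J}p_j(1-z_j)^{\nu-1}-\nu\sum_{j=1}^{J}p_jY_j(1-z_j)^{\nu-1}. \tag{25}$$

If $\nu=2$ we obtain ACI. In this article, we use $\nu=3$ for eACI.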

2.2.6. Slope Index of Inequality (SII)

SII measures the difference in rates between a hypothetical person with $z_j=1$ and a hypothetical person with $z_j=0$. It was introduced by Preston, Haines and Pamuk, 1981 [13] using a simple linear regression model

$$E(Y_j\mid z_j)=\beta_0+\beta_1z_j, \tag{26}$$

where $z_j$ is defined in (18) and SII = $\beta_1$.

Since the regression is run on grouped data, SII is estimated using weighted least squares, with weights given by the population shares $p_j$

$$\widehat{\mathrm{SII}}=\frac{\sum_j(p_jz_jY_j)-\sum_j(p_jz_j)\sum_j(p_jY_j)}{\sum_j(p_jz_j^2)-\left(\sum_j(p_jz_j)\right)^2}, \tag{27}$$

where $p_j$ is defined in (10).
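A short sketch of the weighted least-squares slope in (27) follows. To check the implementation, the hypothetical rates are constructed to lie exactly on the line 0.001 + 0.004 z, so the estimated SII should recover the slope 0.004. All names and numbers are illustrative.

```python
def relative_ranks(shares):
    """Relative rank z_j, Equation (18)."""
    ranks, cumulative = [], 0.0
    for p in shares:
        cumulative += p
        ranks.append(cumulative - 0.5 * p)
    return ranks


def slope_index_of_inequality(rates, shares):
    """Weighted least-squares slope of Y_j on z_j, Equation (27)."""
    z = relative_ranks(shares)
    sum_pz = sum(p * zj for p, zj in zip(shares, z))
    sum_py = sum(p * y for p, y in zip(shares, rates))
    sum_pzy = sum(p * zj * y for p, zj, y in zip(shares, z, rates))
    sum_pz2 = sum(p * zj ** 2 for p, zj in zip(shares, z))
    return (sum_pzy - sum_pz * sum_py) / (sum_pz2 - sum_pz ** 2)


shares = [0.5, 0.3, 0.2]
# Hypothetical rates lying exactly on a line with intercept 0.001 and slope 0.004.
rates = [0.001 + 0.004 * zj for zj in relative_ranks(shares)]

sii = slope_index_of_inequality(rates, shares)
print(sii)
```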

2.2.7. Index of Disparity (IDisp)

The index of disparity (IDisp) measures the relative difference between the rates of the socioeconomic groups and a reference rate as a proportion of the reference rate. It was first introduced by Pearcy and Keppel, 2002 [14] as

$$\mathrm{IDisp}_{PK}=\frac{1}{J}\sum_{j=1}^{J}\frac{|\mu_j-\bar{\mu}|}{\bar{\mu}}\times 100. \tag{28}$$

A variant of IDisp replaces the population mean rate, $\bar{\mu}$, with the rate of a reference group, $\mu_{ref}$:

$$\mathrm{IDisp}=\frac{1}{J-1}\sum_{j=1,\,j\neq ref}^{J}\frac{|\mu_j-\mu_{ref}|}{\mu_{ref}}\times 100. \tag{29}$$

The corresponding estimator is

$$\widehat{\mathrm{IDisp}}=\frac{1}{J-1}\sum_{j=1,\,j\neq ref}^{J}\frac{|Y_j-Y_{ref}|}{Y_{ref}}\times 100, \tag{30}$$

where $Y_{ref}$ is the estimator of $\mu_{ref}$. To eliminate the absolute values from the formula, HD*Calc recommends the use of the group with the smallest rate as the reference group.

2.2.8. Mean Log Deviation (MLD)

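MLD is defined as

$$\mathrm{MLD}=-\sum_{j=1}^{J}p_j\log\gamma_j=\log(\bar{\mu})-\sum_{j=1}^{J}p_j\log(\mu_j), \tag{31}$$

where $\bar{\mu}$ and $p_j$ are defined in (11) and (10), respectively, and

$$\gamma_j=\frac{\mu_j}{\bar{\mu}} \tag{32}$$

is the ratio of the rate of the $j$-th socioeconomic group and the population mean rate. It is estimated by

$$\widehat{\mathrm{MLD}}=\log(\bar{Y})-\sum_{j=1}^{J}p_j\log(Y_j). \tag{33}$$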

2.2.9. Theil’s Index (T)

T is similar to MLD but it uses a different disproportionality function. It is defined as

$$T=\sum_{j=1}^{J}p_j\gamma_j\log(\gamma_j), \tag{34}$$

where $p_j$ and $\gamma_j$ are defined in (10) and (32), respectively. It is estimated by

$$\hat{T}=\sum_{j=1}^{J}p_j\frac{Y_j}{\bar{Y}}\log\frac{Y_j}{\bar{Y}}. \tag{35}$$
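The two entropy-type estimators, MLD in (33) and T in (35), can be sketched side by side as below. Both are zero when all group rates are equal and positive otherwise; both require strictly positive estimated rates because of the logarithm. Function names and inputs are hypothetical.

```python
import math


def mean_log_deviation(rates, shares):
    """MLD estimator: log of the mean rate minus the share-weighted mean log rate."""
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return math.log(mean_rate) - sum(p * math.log(y) for p, y in zip(shares, rates))


def theil_index(rates, shares):
    """Theil's index estimator, using the rate ratios Y_j / Y-bar."""
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return sum(
        p * (y / mean_rate) * math.log(y / mean_rate) for p, y in zip(shares, rates)
    )


# Hypothetical population shares and strictly positive rates.
shares = [0.5, 0.3, 0.2]
rates = [0.002, 0.003, 0.005]

mld = mean_log_deviation(rates, shares)
theil = theil_index(rates, shares)
print(mld, theil)
```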

2.2.10. Relative Index of Inequality (RII)

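RII is obtained by dividing SII by the population mean rate [15]

$$\mathrm{RII}=\frac{\mathrm{SII}}{\bar{\mu}}=\frac{\beta_1}{\bar{\mu}}, \tag{36}$$

where $\bar{\mu}$ and $\beta_1$ are defined in (11) and (26), respectively. It is estimated by

$$\widehat{\mathrm{RII}}=\frac{1}{\sum_j(p_jz_j^2)-\left(\sum_j(p_jz_j)\right)^2}\left[\frac{\sum_j(p_jz_jY_j)}{\bar{Y}}-\sum_j(p_jz_j)\right]. \tag{37}$$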

2.2.11. Kunst–Mackenbach Relative Index (KMI)

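Mackenbach and Kunst, 1997 [16] proposed an alternative to RII by dividing the rate of a hypothetical person with $z_j=0$ by the rate of a hypothetical person with $z_j=1$

$$\mathrm{KMI}=\frac{\beta_0}{\beta_0+\beta_1}, \tag{38}$$

where $\beta_0$ and $\beta_1$ are defined in (26). It is estimated by

$$\widehat{\mathrm{KMI}}=\frac{\hat{\beta}_0}{\hat{\beta}_0+\widehat{\mathrm{SII}}}, \tag{39}$$

where $\widehat{\mathrm{SII}}$ is calculated in (27) and $\hat{\beta}_0$ can be obtained as

$$\hat{\beta}_0=\bar{Y}-\widehat{\mathrm{SII}}\times\bar{z}, \tag{40}$$

where $\bar{Y}$ is defined in (12) and

$$\bar{z}=\sum_{j=1}^{J}p_jz_j. \tag{41}$$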
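The regression-based relative measures can be computed together, since both reuse the SII estimate. The sketch below follows (36) and (38) to (41); the hypothetical rates again lie exactly on the line 0.001 + 0.004 z, so the expected values are RII = 0.004 / 0.003 and KMI = 0.001 / 0.005. Names and numbers are illustrative only.

```python
def relative_ranks(shares):
    """Relative rank z_j, Equation (18)."""
    ranks, cumulative = [], 0.0
    for p in shares:
        cumulative += p
        ranks.append(cumulative - 0.5 * p)
    return ranks


def rii_and_kmi(rates, shares):
    """Return (RII-hat, KMI-hat) from group rates and population shares."""
    z = relative_ranks(shares)
    z_bar = sum(p * zj for p, zj in zip(shares, z))          # Equation (41)
    y_bar = sum(p * y for p, y in zip(shares, rates))        # Y-bar
    sum_pzy = sum(p * zj * y for p, zj, y in zip(shares, z, rates))
    sum_pz2 = sum(p * zj ** 2 for p, zj in zip(shares, z))
    sii = (sum_pzy - z_bar * y_bar) / (sum_pz2 - z_bar ** 2)  # Equation (27)
    rii = sii / y_bar                                         # Equation (36)
    beta0 = y_bar - sii * z_bar                               # Equation (40)
    kmi = beta0 / (beta0 + sii)                               # Equation (39)
    return rii, kmi


shares = [0.5, 0.3, 0.2]
rates = [0.001 + 0.004 * zj for zj in relative_ranks(shares)]

rii, kmi = rii_and_kmi(rates, shares)
print(rii, kmi)
```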

2.3. Confidence Intervals Based on the Classical Method

The classical method used for variance estimation for the majority of the measures of health disparities implemented in HD*Calc is the Delta method. If $\hat{\theta}=F(Y)$ is an estimator of the true measure of health disparities $\theta$, we approximate $F$ by using a first-order Taylor series approximation around $\mu$ and then

$$\mathrm{Var}(\hat{\theta})\approx\mathrm{Var}\left(\sum_{j=1}^{J}\left.\frac{\partial F}{\partial y_j}\right|_{y_j=\mu_j}(Y_j-\mu_j)\right),$$

where $\mu=(\mu_1,\ldots,\mu_J)$ is the mean of $Y$. Assuming that the $J$ socioeconomic groups are independent, we obtain

$$\mathrm{Var}(\hat{\theta})\approx\sum_{j=1}^{J}\left(\left.\frac{\partial F}{\partial y_j}\right|_{y_j=\mu_j}\right)^2\sigma_j^2, \tag{42}$$

where $\sigma^2=(\sigma_1^2,\ldots,\sigma_J^2)$ is the main diagonal of the variance-covariance matrix of $Y$. We substitute the unknown parameters $\mu_j$ and $\sigma_j^2$ with their estimates to obtain $\widehat{\mathrm{Var}}(\hat{\theta})$, and then construct corresponding Wald confidence intervals for $\theta$. Detailed derivations of the formulas for the estimated variances may be found in Ahn et al., 2018 [7] for 11 of the 15 measures of health disparities implemented in HD*Calc (all measures except eACI, eRCI, PD and PR) and on the HD*Calc website [4] for all 15 measures.
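As a worked illustration of (42), consider the two pairwise measures, for which the partial derivatives are simple. This is a sketch derived directly from the formula above. The Wald form with the standard Normal quantile $z_{1-\alpha/2}$ is an assumption about how the interval is assembled, and any transformation (e.g., a log scale for ratios) that HD*Calc may apply is not covered here.

```latex
\begin{aligned}
\text{PD:}\quad & F(y_1,y_2)=y_1-y_2, \quad
  \frac{\partial F}{\partial y_1}=1,\ \frac{\partial F}{\partial y_2}=-1
  \ \Rightarrow\ \mathrm{Var}(\widehat{\mathrm{PD}})\approx\sigma_1^2+\sigma_2^2,\\[4pt]
\text{PR:}\quad & F(y_1,y_2)=\frac{y_1}{y_2}, \quad
  \frac{\partial F}{\partial y_1}=\frac{1}{y_2},\ \frac{\partial F}{\partial y_2}=-\frac{y_1}{y_2^2}
  \ \Rightarrow\ \mathrm{Var}(\widehat{\mathrm{PR}})\approx\frac{\sigma_1^2}{\mu_2^2}+\frac{\mu_1^2\,\sigma_2^2}{\mu_2^4},\\[4pt]
\text{Wald interval:}\quad & \hat{\theta}\ \pm\ z_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{\theta})},
  \quad\text{with } \mu_j,\ \sigma_j^2 \text{ replaced by } Y_j,\ \hat{\sigma}_j^2 .
\end{aligned}
```

For PD the function is linear, so the first-order approximation is exact; for PR the approximation degrades when $Y_2$ is close to zero, which is one way sparse data can harm the coverage of the classical intervals.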

2.4. Fiducial Intervals

In this section, we describe new methods to construct fiducial intervals for measures of health disparities based on the use of approximate fiducial quantities. Fiducial inference is an approach to inference introduced by Fisher that has good frequentist properties [17,18].

2.4.1. Fiducial Quantities (FQs)

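Following Krishnamoorthy and Lee, 2010 [9], for an observed value $m_{jk}$ of the number of events $D_{jk}$, we have the equalities

$$\Pr(D_{jk}\ge m_{jk}\mid\lambda_{jk})=\Pr\left(\frac{\chi^2_{2m_{jk}}}{2n_{jk}}<\lambda_{jk}\,\middle|\,m_{jk}\right), \tag{43}$$

and

$$\Pr(D_{jk}\le m_{jk}\mid\lambda_{jk})=\Pr\left(\frac{\chi^2_{2m_{jk}+2}}{2n_{jk}}>\lambda_{jk}\,\middle|\,m_{jk}\right), \tag{44}$$

where $\chi^2_d$ is a random variable following a chi-squared distribution with $d$ degrees of freedom. Garwood 1936 [19] proposed a related exact confidence interval for a Poisson mean

$$\left(\frac{1}{2n_{jk}}\chi^2_{2m_{jk};\,\alpha/2},\ \frac{1}{2n_{jk}}\chi^2_{2m_{jk}+2;\,1-\alpha/2}\right). \tag{45}$$

Cox 1953 [20] introduced an approximate FQ for $\lambda_{jk}$, $\chi^2_{2m_{jk}+1}/(2n_{jk})$. A related approximate fiducial interval is

$$\left(\frac{1}{2n_{jk}}\chi^2_{2m_{jk}+1;\,\alpha/2},\ \frac{1}{2n_{jk}}\chi^2_{2m_{jk}+1;\,1-\alpha/2}\right). \tag{46}$$

Dempster 2008 [21] proposed another approximate FQ for $\lambda_{jk}$, a 50-50 mixture of $\chi^2_{2m_{jk}}/(2n_{jk})$ and $\chi^2_{2m_{jk}+2}/(2n_{jk})$.

An approximate FQ for a function of $\lambda$s may be obtained by replacing the $\lambda$s with their FQs in the function [18]. In our case, each measure of health disparities can be expressed as a function $h(\cdot)$ of the $\lambda_{jk}$s, and an approximate FQ for

$$h(\lambda_{11},\ldots,\lambda_{1K};\ldots;\lambda_{J1},\ldots,\lambda_{JK}), \tag{47}$$

is obtained as

$$h(\hat{\lambda}_{11},\ldots,\hat{\lambda}_{1K};\ldots;\hat{\lambda}_{J1},\ldots,\hat{\lambda}_{JK}), \tag{48}$$

where $\hat{\lambda}_{jk}$ is an approximate FQ of $\lambda_{jk}$.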

2.4.2. Simulation-Based Methods to Construct Fiducial Intervals

We use the above approximate FQs to construct three different fiducial intervals (FIs):

  • FI1.

    Simulate $\hat{\lambda}_{jk}$ from $\chi^2_{2m_{jk}+1}/(2n_{jk})$;

  • FI2.

    Simulate $\hat{\lambda}_{jk}$ from either $\chi^2_{2m_{jk}}/(2n_{jk})$ or $\chi^2_{2m_{jk}+2}/(2n_{jk})$, each with a 50% probability;

  • FI3.

    Simulate $\hat{\lambda}_{jk}$ from both $\chi^2_{2m_{jk}}/(2n_{jk})$ and $\chi^2_{2m_{jk}+2}/(2n_{jk})$.

For each method, we plug in the simulated $\hat{\lambda}_{jk}$s into the function $h(\cdot)$ to obtain the simulated values of the measures of health disparities $h(\hat{\lambda}_{11},\ldots,\hat{\lambda}_{1K};\ldots;\hat{\lambda}_{J1},\ldots,\hat{\lambda}_{JK})$. After performing $B$ simulations, a 95% FI is constructed using the 2.5 and 97.5 percentiles of the set of simulated values for the measures of health disparities. For cells where no event is observed, i.e., $m_{jk}=0$, we follow Zhang et al. 2014 [22] and use

$$\hat{\lambda}_{jk}=\frac{1/n_{jk}}{n_{jk}+1}. \tag{49}$$
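The simulation steps for FI1 and FI2 can be sketched in Python using only the standard library, relying on the fact that a chi-squared variable with d degrees of freedom divided by 2n follows a Gamma distribution with shape d/2 and scale 1/n. This is an illustrative sketch, not the authors' code: the function names, the simple order-statistic percentiles, and the data are assumptions. FI3 is omitted because the description above does not specify how the two draws are combined, and zero-count cells are rejected rather than handled, since they require the adjustment in (49).

```python
import random


def draw_rate(m, n, method, rng):
    """One approximate fiducial draw for a cell rate with m > 0 events."""
    if m <= 0:
        raise ValueError("zero-count cells need the adjustment in Equation (49)")
    if method == "FI1":
        df = 2 * m + 1                                    # Cox-type quantity
    elif method == "FI2":
        df = 2 * m if rng.random() < 0.5 else 2 * m + 2   # 50-50 mixture
    else:
        raise ValueError("only FI1 and FI2 are sketched here")
    # chi-squared(df) / (2n) is Gamma(shape=df/2, scale=1/n).
    return rng.gammavariate(df / 2.0, 1.0 / n)


def fiducial_interval(counts, person_years, weights, measure,
                      method="FI1", B=5000, seed=0):
    """95% fiducial interval for measure(mu), mu being age-adjusted rates."""
    rng = random.Random(seed)
    sims = []
    for _ in range(B):
        mu = [
            sum(w * draw_rate(m, n, method, rng)
                for m, n, w in zip(m_row, n_row, weights))
            for m_row, n_row in zip(counts, person_years)
        ]
        sims.append(measure(mu))
    sims.sort()
    return sims[int(0.025 * B)], sims[int(0.975 * B) - 1]


# Hypothetical data: two groups, four age groups; measure = pair difference.
counts = [[20, 30, 40, 50], [10, 15, 20, 25]]
person_years = [[1000, 1000, 1000, 1000], [1000, 1000, 1000, 1000]]
weights = [0.4, 0.3, 0.2, 0.1]

lo1, hi1 = fiducial_interval(counts, person_years, weights,
                             lambda mu: mu[0] - mu[1], method="FI1", B=2000)
lo2, hi2 = fiducial_interval(counts, person_years, weights,
                             lambda mu: mu[0] - mu[1], method="FI2", B=2000)
print((lo1, hi1), (lo2, hi2))
```

Because every measure in Section 2.2 is a function of the age-adjusted rates, the sketch first aggregates the simulated cell rates with the weights and then applies the measure; any other measure can be passed in place of the pair difference.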

2.5. Monte Carlo Simulation-Based Methods (MCS)

For the Monte Carlo simulation-based methods, we simulate values for the age-adjusted rates, μj, instead of values for the cell rates λjk, as performed for the previously described fiducial methods, either from a truncated Normal distribution (MCS-N) or a Gamma distribution (MCS-G). The mean and the variance of the distribution from which we simulate values are the estimated mean and the estimated variance of the estimator of μj. The use of these two distributions ensures that all simulated values are non-negative. When using the truncated Normal distribution, we simulate from a Normal distribution and discard the negative simulated values, i.e., keep only the non-negative simulated values. The adjustment (49) for zero counts is also applied. After we simulate values for μ^j, we use them to calculate the simulated values for the measures of health disparities. After performing B simulations, the 95% CI is constructed using the 2.5 and 97.5 percentiles of the set of simulated values for the measures of health disparities.
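A minimal sketch of the two MCS variants is given below, using only the Python standard library. The Gamma draw is parameterized by moment matching (shape = mean²/variance, scale = variance/mean), and the truncated Normal draw uses simple rejection of negative values, as described above. Function names, percentile indexing, and the input values are hypothetical; the sketch assumes strictly positive estimated rates and variances, so zero-count adjustments are not shown.

```python
import random


def simulate_rate_gamma(y, var, rng):
    """Draw from a Gamma distribution with mean y and variance var."""
    return rng.gammavariate(y * y / var, var / y)   # (shape, scale)


def simulate_rate_truncated_normal(y, var, rng):
    """Draw from Normal(y, var), discarding negative values."""
    sd = var ** 0.5
    while True:
        value = rng.gauss(y, sd)
        if value >= 0:
            return value


def mcs_interval(rates, variances, measure, simulate, B=5000, seed=0):
    """95% MCS-based interval from B simulated values of the measure."""
    rng = random.Random(seed)
    sims = sorted(
        measure([simulate(y, v, rng) for y, v in zip(rates, variances)])
        for _ in range(B)
    )
    return sims[int(0.025 * B)], sims[int(0.975 * B) - 1]


# Hypothetical estimated rates and variances for two groups; measure = pair ratio.
rates = [0.03, 0.015]
variances = [8e-6, 4e-6]


def pair_ratio(mu):
    return mu[0] / mu[1]


lo_g, hi_g = mcs_interval(rates, variances, pair_ratio, simulate_rate_gamma, B=2000)
lo_n, hi_n = mcs_interval(rates, variances, pair_ratio,
                          simulate_rate_truncated_normal, B=2000)
print((lo_g, hi_g), (lo_n, hi_n))
```

Note that after truncation the Normal draws no longer have mean exactly equal to the estimated rate, whereas the moment-matched Gamma draws do; this difference grows as the estimated rate approaches zero relative to its standard error.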

2.6. Simulation Study

We simulated data under nine different scenarios to allow different combinations of sample sizes and true rates per cell (where the cells are all cross-classifications of age groups and socioeconomic groups). For each scenario, we simulated data for the 12 cells that correspond to the combinations of three ordered socioeconomic groups and four age groups. Fixed weights, according to the WHO World Standard, were applied to each age group. Table 1 describes the characteristics of the nine scenarios, with the means and standard deviations (SDs) being calculated across the 12 cells. The combinations of sample sizes and true rates per cell resulted in five categories for the magnitude of the expected count per cell, i.e., <1, 1–9, 10–99, 100–999, and 1000–9999.

Table 1.

Characteristics of the nine scenarios.

| Scenario | Sample Size, Mean (SD) | True Rate, Mean (SD) | Expected Event Count, Mean (SD) | Magnitude of Expected Event Count |
|---|---|---|---|---|
| 1 | 2417 (1084) | 0.0003 (0.0002) | 0.8 (0.784) | <1 |
| 2 | 2417 (1084) | 0.003 (0.002) | 8 (7.84) | 1–9 |
| 3 | 24,167 (10,836) | 0.0003 (0.0002) | 8 (7.84) | 1–9 |
| 4 | 2417 (1084) | 0.03 (0.02) | 80 (78.4) | 10–99 |
| 5 | 24,167 (10,836) | 0.003 (0.002) | 80 (78.4) | 10–99 |
| 6 | 241,667 (108,363) | 0.0003 (0.0002) | 80 (78.4) | 10–99 |
| 7 | 24,167 (10,836) | 0.03 (0.02) | 800 (784) | 100–999 |
| 8 | 241,667 (108,363) | 0.003 (0.002) | 800 (784) | 100–999 |
| 9 | 241,667 (108,363) | 0.03 (0.02) | 8000 (7839) | 1000–9999 |

For each scenario, we generated 5000 datasets, and for each dataset, we used 5000 simulations to construct the 95% MCS-based CIs and the 95% FIs. The empirical coverage was defined as the frequency of the true value of the measure of health disparities being covered by the nominal 95% CIs or FIs.
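The coverage definition above amounts to a simple proportion, sketched below. The function name and the toy intervals are hypothetical, and treating the interval endpoints as inclusive is an assumption, since the text does not say how ties at the boundary are handled.

```python
def empirical_coverage(intervals, true_value):
    """Percentage of intervals (lower, upper) that contain the true value."""
    covered = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return 100.0 * covered / len(intervals)


# Toy example: four intervals for a measure whose true value is 1.0.
intervals = [(0.9, 1.1), (1.05, 1.2), (0.8, 1.0), (0.95, 1.3)]
coverage = empirical_coverage(intervals, 1.0)
print(coverage)
```

In the study design described here, the list would hold one interval per simulated dataset (5000 per scenario), computed by a single method for a single measure.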

3. Results

We start with the results for scenario 1, which corresponds to a situation involving extremely sparse data, i.e., where the expected count per cell is below 1. The empirical coverage results are presented in Table 2 and Figure 1. For eACI, the Classical method, FI1 and FI2 had empirical coverages considerably below the nominal 95% level; FI1 and FI2 had the same problem for eRCI. The MCS-N method had only about 91% empirical coverage for PD, while FI3 had very large empirical coverage ranging from 99% to 100% for 11 of the 15 measures. By contrast, the MCS-G method performed reasonably well for all 15 measures for this scenario.

Table 2.

Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 1.

Measure Classical MCS-N MCS-G FI1 FI2 FI3
RD 97.46 97.00 97.74 99.30 99.38 99.26
PD 93.28 90.86 94.12 98.30 99.32 98.60
BGV 94.26 96.74 97.94 99.42 99.46 99.38
ACI 99.64 98.34 97.72 99.98 100.0 99.96
eACI 82.82 95.04 95.08 52.20 64.10 94.52
SII 99.64 98.34 97.72 99.98 100.0 99.96
RR 100.0 99.36 97.76 100.0 100.0 100.0
PR 96.62 99.04 97.66 91.12 95.56 95.64
IDisp 99.40 99.04 97.50 100.0 100.0 100.0
MLD 93.72 99.32 97.88 99.98 100.0 100.0
RCI 97.50 98.30 97.54 99.98 100.0 100.0
eRCI 100.0 97.16 97.22 71.58 80.94 95.20
T 93.14 99.60 98.54 99.78 99.98 99.98
RII 97.50 98.30 97.54 99.98 100.0 100.0
KMI 95.54 99.76 99.62 99.98 100.0 100.0

Figure 1.

Empirical coverage results for scenario 1. The dashed line shows 95% coverage.

The results for scenarios 2 and 3 were very similar to each other. They both correspond to situations involving sparse (but not extremely sparse) data, where the expected count per cell is between 1 and 10. The empirical coverage results are shown in Table 3 and Figure 2 for scenario 2, and in Table 4 and Figure 3 for scenario 3, respectively. For both scenarios, the Classical method still had empirical coverages considerably below the nominal 95% level for eACI. FI1 and FI2 had the same problem for eACI, but to a much lesser extent, with empirical coverages of about 92%. Overall, FI3 performed best for the 15 measures, followed closely by MCS-G and MCS-N.

Table 3.

Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 2.

Measure Classical MCS-N MCS-G FI1 FI2 FI3
RD 95.70 95.76 95.26 96.08 96.24 96.06
PD 95.28 95.38 95.26 95.92 96.10 95.90
BGV 94.08 95.60 95.18 95.98 96.28 95.98
ACI 94.68 94.76 94.06 95.68 96.48 95.74
eACI 78.94 94.60 94.82 91.80 92.36 95.68
SII 94.68 94.76 94.06 95.68 96.48 95.74
RR 97.08 93.34 93.80 97.04 97.64 98.36
PR 95.52 94.14 94.30 94.72 95.42 96.10
IDisp 98.36 93.32 94.58 98.10 98.40 98.78
MLD 94.80 92.98 93.00 95.98 96.54 97.44
RCI 94.14 94.76 94.06 95.62 96.34 95.64
eRCI 100.0 94.50 94.84 93.22 93.86 95.58
T 93.48 93.64 93.36 95.16 95.90 97.10
RII 94.14 94.76 94.06 95.62 96.34 95.64
KMI 94.90 94.76 94.06 95.62 96.34 95.64

Figure 2.

Empirical coverage results for scenario 2. The dashed line shows 95% coverage.

Table 4.

Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 3.

Measure Classical MCS-N MCS-G FI1 FI2 FI3
RD 95.70 95.76 95.26 96.08 96.24 96.06
PD 95.28 95.38 95.26 95.92 96.10 95.90
BGV 94.08 95.60 95.18 95.98 96.28 95.98
ACI 94.68 94.76 94.06 95.68 96.48 95.74
eACI 78.70 94.58 94.84 91.80 92.34 95.66
SII 94.68 94.76 94.06 95.68 96.48 95.74
RR 97.08 93.34 93.80 97.04 97.64 98.36
PR 95.52 94.14 94.30 94.72 95.42 96.10
IDisp 98.36 93.32 94.58 98.10 98.40 98.78
MLD 94.80 92.98 93.00 95.98 96.54 97.44
RCI 94.14 94.76 94.06 95.62 96.34 95.64
eRCI 100.0 94.50 94.84 93.22 93.86 95.58
T 93.48 93.64 93.36 95.16 95.90 97.10
RII 94.14 94.76 94.06 95.62 96.34 95.64
KMI 94.90 94.76 94.06 95.62 96.34 95.64

Figure 3.

Empirical coverage results for scenario 3. The dashed line shows 95% coverage.

For scenarios 4 to 9, where the data are not considered sparse because the expected count per cell is 10 or more, the empirical coverages were approximately between 94% and 96% for all methods and all 15 measures, except for the Classical method for eACI (where they ranged from 79% to 83%) and eRCI (where they were all 100%). For these scenarios, all methods except the Classical method performed well. The complete results regarding the empirical coverage are shown in Appendix A.

4. Discussion

We compared six methods to construct confidence intervals and fiducial intervals for 15 measures of health disparities with regard to their empirical coverage under nine different scenarios. Overall, two methods performed well: the MCS-G method to construct confidence intervals and the FI3 method to construct fiducial intervals. It is important to note that the documentation for HD*Calc version 2.0.0 also recommends the use of the Monte Carlo simulation-based method with the Gamma distribution based on the results from Ahn et al., 2019 [8] regarding 11 measures of health disparities. Compared to the Normal distribution, the Gamma distribution is a better choice to use for simulating rates due to its positivity. Moreover, its flexibility in accommodating asymmetry surpasses that of a truncated Normal distribution.

The strengths of the current study include the addition of four measures (eACI, eRCI, PD and PR) to the list of 11 measures of health disparities previously investigated, and the consideration of different scenarios corresponding to different combinations of sample sizes and true rates per cell. The limitations, due to feasibility reasons, include the consideration of eACI and eRCI only when the aversion parameter ν=3, the use of only one value for the number of simulations used for the MCS-based methods and the fiducial methods, i.e., 5000 simulations, the use of an ordinal socioeconomic group variable with only three levels, and the use of an age group variable with only four levels.

Future research work should consider eACI and eRCI with other values of the aversion parameter, a larger number of socioeconomic groups and age groups, and different numbers of simulations for the MCS-based methods and the fiducial methods. Building upon the work from Talih et al., 2020 [23], related future research should also investigate if it is possible to reduce a large number of measures of health disparities to a smaller set of measures that satisfy a set of desirable properties and are easier to interpret. With a smaller number of recommended measures of health disparities, it would be easier to thoroughly compare the performance of statistical methods to construct confidence intervals and fiducial intervals for these selected measures.

5. Conclusions

Given that the MCS-G method is much simpler to understand and implement than the FI3 method, and the lack of familiarity of statisticians and (more importantly) practitioners with fiducial methods and fiducial intervals, we recommend the use of the Monte Carlo simulation-based method with the Gamma distribution.

Abbreviations

The following abbreviations are used in this manuscript:

ACI absolute concentration index
BGV between-group variance
CDC Centers for Disease Control and Prevention
CI confidence interval
eACI extended absolute concentration index
eRCI extended relative concentration index
FI fiducial interval
FQ fiducial quantity
HD*Calc Health Disparity Calculator
IDisp index of disparity
KMI Kunst-Mackenbach relative index
MCS Monte Carlo simulation
MCS-N Monte Carlo simulation-based using a truncated Normal distribution
MCS-G Monte Carlo simulation-based using the Gamma distribution
MLD mean log deviation
PD pair difference
PR pair ratio
RCI relative concentration index
RD range difference
RII relative index of inequality
RR range ratio
SEER Surveillance, Epidemiology, and End Results
SII slope index of inequality
T Theil’s index
WHO World Health Organization

Appendix A. Results of the Simulation Study

Table A1.

Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for all scenarios.

Measure Scenario Classical MCS-N MCS-G FI1 FI2 FI3
PD 1 93.28 90.86 94.12 98.30 99.32 98.60
2 95.28 95.38 95.26 95.92 96.10 95.90
3 95.28 95.38 95.26 95.92 96.10 95.90
4 95.28 95.32 95.40 95.40 95.20 95.30
5 95.28 95.32 95.40 95.40 95.20 95.30
6 95.28 95.32 95.40 95.40 95.20 95.30
7 94.66 94.74 94.78 94.74 94.76 94.72
8 94.66 94.74 94.78 94.74 94.76 94.72
9 94.80 94.78 94.80 94.78 94.78 94.92
RD 1 97.46 97.00 97.74 99.30 99.38 99.26
2 95.70 95.76 95.26 96.08 96.24 96.06
3 95.70 95.76 95.26 96.08 96.24 96.06
4 95.28 95.36 95.40 95.40 95.18 95.28
5 95.28 95.36 95.40 95.40 95.18 95.28
6 95.28 95.36 95.40 95.40 95.18 95.28
7 94.66 94.74 94.78 94.74 94.76 94.72
8 94.66 94.74 94.78 94.74 94.76 94.72
9 94.80 94.78 94.80 94.78 94.78 94.92
BGV 1 94.26 96.74 97.94 99.42 99.46 99.38
2 94.08 95.60 95.18 95.98 96.28 95.98
3 94.08 95.60 95.18 95.98 96.28 95.98
4 95.06 95.06 95.12 95.10 95.12 95.08
5 95.06 95.06 95.12 95.10 95.12 95.08
6 95.06 95.06 95.12 95.10 95.12 95.08
7 95.24 95.40 95.18 95.36 95.30 95.34
8 95.24 95.40 95.18 95.36 95.30 95.34
9 94.80 94.76 94.88 94.86 94.84 94.82
ACI 1 99.64 98.34 97.72 99.98 100.0 99.96
2 94.68 94.76 94.06 95.68 96.48 95.74
3 94.68 94.76 94.06 95.68 96.48 95.74
4 95.40 95.48 95.12 95.40 95.56 95.40
5 95.40 95.48 95.12 95.40 95.56 95.40
6 95.40 95.48 95.12 95.40 95.56 95.40
7 94.54 94.48 94.54 94.68 94.42 94.54
8 94.54 94.48 94.54 94.68 94.42 94.54
9 94.62 94.70 94.72 94.60 94.62 94.64
eACI 1 82.82 95.04 95.08 52.20 64.10 94.52
2 78.94 94.60 94.82 91.80 92.36 95.68
3 78.70 94.58 94.84 91.80 92.34 95.66
4 81.46 94.42 94.60 94.10 94.20 94.40
5 79.38 94.50 94.62 94.06 94.10 94.40
6 79.24 94.52 94.62 94.04 94.12 94.40
7 81.78 95.28 95.20 95.24 95.36 95.26
8 80.14 95.22 95.20 95.24 95.32 95.30
9 83.10 95.54 95.54 95.56 95.52 95.46
SII 1 99.64 98.34 97.72 99.98 100.0 99.96
2 94.68 94.76 94.06 95.68 96.48 95.74
3 94.68 94.76 94.06 95.68 96.48 95.74
4 95.40 95.48 95.12 95.40 95.56 95.40
5 95.40 95.48 95.12 95.40 95.56 95.40
6 95.40 95.48 95.12 95.40 95.56 95.40
7 94.54 94.48 94.54 94.68 94.42 94.54
8 94.54 94.48 94.54 94.68 94.42 94.54
9 94.62 94.70 94.72 94.60 94.62 94.64
PR 1 96.62 99.04 97.66 91.12 95.56 95.64
2 95.52 94.14 94.30 94.72 95.42 96.10
3 95.52 94.14 94.30 94.72 95.42 96.10
4 95.66 95.64 95.44 95.36 95.62 95.62
5 95.66 95.64 95.44 95.36 95.62 95.62
6 95.66 95.64 95.44 95.36 95.62 95.62
7 94.80 94.88 94.80 94.68 94.80 94.84
8 94.80 94.88 94.80 94.68 94.80 94.84
9 94.94 94.84 94.82 94.92 94.98 94.98
RR 1 100.0 99.36 97.76 100.0 100.0 100.0
2 97.08 93.34 93.80 97.04 97.64 98.36
3 97.08 93.34 93.80 97.04 97.64 98.36
4 95.68 95.70 95.54 95.44 95.70 95.68
5 95.68 95.70 95.54 95.44 95.70 95.68
6 95.68 95.70 95.54 95.44 95.70 95.68
7 94.80 94.88 94.80 94.68 94.80 94.84
8 94.80 94.88 94.80 94.68 94.80 94.84
9 94.94 94.84 94.82 94.92 94.98 94.98
IDisp 1 99.40 99.04 97.50 100.0 100.0 100.0
2 98.36 93.32 94.58 98.10 98.40 98.78
3 98.36 93.32 94.58 98.10 98.40 98.78
4 95.14 95.58 95.42 95.32 95.48 95.56
5 95.14 95.58 95.42 95.32 95.48 95.56
6 95.14 95.58 95.42 95.32 95.48 95.56
7 94.80 94.44 94.64 94.70 94.58 94.68
8 94.80 94.44 94.64 94.70 94.58 94.68
9 94.92 94.88 94.82 94.86 94.88 94.96
MLD 1 93.72 99.32 97.88 99.98 100.0 100.0
2 94.80 92.98 93.00 95.98 96.54 97.44
3 94.80 92.98 93.00 95.98 96.54 97.44
4 95.70 95.06 95.10 95.74 95.82 96.06
5 95.70 95.06 95.10 95.74 95.82 96.06
6 95.70 95.06 95.10 95.74 95.82 96.06
7 95.02 95.14 95.02 94.98 95.00 95.06
8 95.02 95.14 95.02 94.98 95.00 95.06
9 94.84 94.78 94.92 94.70 94.84 94.80
RCI 1 97.50 98.30 97.54 99.98 100.0 100.0
2 94.14 94.76 94.06 95.62 96.34 95.64
3 94.14 94.76 94.06 95.62 96.34 95.64
4 95.28 95.46 95.04 95.22 95.56 95.34
5 95.28 95.46 95.04 95.22 95.56 95.34
6 95.28 95.46 95.04 95.22 95.56 95.34
7 94.60 94.62 94.64 94.62 94.64 94.70
8 94.60 94.62 94.64 94.62 94.64 94.70
9 94.90 94.86 94.88 94.78 94.84 94.82
eRCI 1 100.0 97.16 97.22 71.58 80.94 95.20
2 100.0 94.50 94.84 93.22 93.86 95.58
3 100.0 94.50 94.84 93.22 93.86 95.58
4 100.0 95.02 95.14 95.08 95.10 95.46
5 100.0 95.02 95.14 95.08 95.10 95.46
6 100.0 95.02 95.14 95.08 95.10 95.46
7 100.0 94.82 94.76 94.80 94.86 94.86
8 100.0 94.82 94.76 94.80 94.86 94.86
9 100.0 95.38 95.40 95.38 95.12 95.28
T 1 93.14 99.60 98.54 99.78 99.98 99.98
2 93.48 93.64 93.36 95.16 95.90 97.10
3 93.48 93.64 93.36 95.16 95.90 97.10
4 95.52 95.12 95.20 95.64 95.70 95.92
5 95.52 95.12 95.20 95.64 95.70 95.92
6 95.52 95.12 95.20 95.64 95.70 95.92
7 95.20 95.06 95.06 95.16 95.16 95.16
8 95.20 95.06 95.06 95.16 95.16 95.16
9 94.88 94.82 94.92 94.78 94.76 94.84
RII 1 97.50 98.30 97.54 99.98 100.0 100.0
2 94.14 94.76 94.06 95.62 96.34 95.64
3 94.14 94.76 94.06 95.62 96.34 95.64
4 95.28 95.46 95.04 95.22 95.56 95.34
5 95.28 95.46 95.04 95.22 95.56 95.34
6 95.28 95.46 95.04 95.22 95.56 95.34
7 94.60 94.62 94.64 94.62 94.64 94.70
8 94.60 94.62 94.64 94.62 94.64 94.70
9 94.90 94.86 94.88 94.78 94.84 94.82
KMI 1 95.54 99.76 99.62 99.98 100.0 100.0
2 94.90 94.76 94.06 95.62 96.34 95.64
3 94.90 94.76 94.06 95.62 96.34 95.64
4 95.18 95.46 95.04 95.22 95.56 95.34
5 95.18 95.46 95.04 95.22 95.56 95.34
6 95.18 95.46 95.04 95.22 95.56 95.34
7 94.66 94.62 94.64 94.62 94.64 94.70
8 94.66 94.62 94.64 94.62 94.64 94.70
9 94.84 94.86 94.88 94.78 94.84 94.82

Author Contributions

Conceptualization, T.L., A.D.D. and G.L.; methodology, T.L. and G.L.; software, T.L.; formal analysis, T.L.; investigation, T.L., A.D.D. and G.L.; writing—original draft preparation, T.L.; writing—review and editing, A.D.D. and G.L.; visualization, T.L. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code used to simulate the data for this study is available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1. U.S. Department of Health and Human Services, Office of Disease Prevention and Health Promotion. Healthy People 2020. 2010. Available online: https://www.cdc.gov/nchs/healthy_people/hp2020.htm (accessed on 1 September 2023).
  • 2. Marmot M., Friel S., Bell R., Houweling T.A., Taylor S., on behalf of the Commission on Social Determinants of Health. Closing the gap in a generation: Health equity through action on the social determinants of health. Lancet. 2008;372:1661–1669. doi: 10.1016/S0140-6736(08)61690-6.
  • 3. Truman B.I., Centers for Disease Control and Prevention. CDC health disparities and inequalities report-United States, 2011. Mickey Leland Cent. Inf. Portal. 2011;4. Available online: https://digitalscholarship.tsu.edu/mlcejs_info/4/ (accessed on 1 September 2023).
  • 4. Division of Cancer Control and Population Sciences, National Cancer Institute. Health Disparities Calculator (HD*Calc), Version 2.0.0. 2019. Available online: https://seer.cancer.gov/hdcalc/ (accessed on 1 September 2023).
  • 5. Breen N., Scott S., Percy-Laurry A., Lewis D., Glasgow R. Health disparities calculator: A methodologically rigorous tool for analyzing inequalities in population health. Am. J. Public Health. 2014;104:1589–1591. doi: 10.2105/AJPH.2014.301982.
  • 6. Harper S., Lynch J. Methods for Measuring Cancer Disparities: Using Data Relevant to Healthy People 2010 Cancer-Related Objectives. National Cancer Institute; Bethesda, MD, USA: 2005. (NCI Cancer Surveillance Monograph Series, No. 6). Technical Report.
  • 7. Ahn J., Harper S., Yu M., Feuer E.J., Liu B., Luta G. Variance Estimation and Confidence Intervals for 11 Commonly Used Health Disparity Measures. JCO Clin. Cancer Inform. 2018;2:2. doi: 10.1200/CCI.18.00031.
  • 8. Ahn J., Harper S., Yu M., Feuer E.J., Liu B. Improved Monte Carlo methods for estimating confidence intervals for eleven commonly used health disparity measures. PLoS ONE. 2019;14:e0219542. doi: 10.1371/journal.pone.0219542.
  • 9. Krishnamoorthy K., Lee M. Inference for functions of parameters in discrete distributions based on fiducial approach: Binomial and Poisson cases. J. Stat. Plan. Inference. 2010;140:1182–1192. doi: 10.1016/j.jspi.2009.11.004.
  • 10. Harper S., Lynch J. Selected Comparisons of Measures of Health Disparities: A Review Using Databases Relevant to Healthy People 2010 Cancer-Related Objectives. National Cancer Institute; Bethesda, MD, USA: 2007. (NCI Cancer Surveillance Monograph Series, No. 7). Technical Report.
  • 11. Kakwani N., Wagstaff A., Van Doorslaer E. Socioeconomic inequalities in health: Measurement, computation, and statistical inference. J. Econom. 1997;77:87–103. doi: 10.1016/S0304-4076(96)01807-6.
  • 12. Yu M., Liu B., Li Y., Zou Z., Breen N. Statistical inferences of extended concentration indices for directly standardized rates. Stat. Med. 2019;38:62–73. doi: 10.1002/sim.7952.
  • 13. Preston S.H., Haines M.R., Pamuk E. Effects of Industrialization and Urbanization on Mortality in Developed Countries. Department of Economics, Wayne State University; Detroit, MI, USA: 1981.
  • 14. Pearcy J.N., Keppel K.G. A Summary Measure of Health Disparity. Public Health Rep. 2002;117:273–280. doi: 10.1016/S0033-3549(04)50161-9.
  • 15. Pamuk E.R. Social-class inequality in infant mortality in England and Wales from 1921 to 1980. Eur. J. Popul./Revue Européenne Démographie. 1988;4:1–21. doi: 10.1007/BF01797104.
  • 16. Mackenbach J.P., Kunst A.E. Measuring the magnitude of socio-economic inequalities in health: An overview of available measures illustrated with two examples from Europe. Soc. Sci. Med. 1997;44:757–771. doi: 10.1016/S0277-9536(96)00073-1.
  • 17. Fisher R.A. Inverse probability. Math. Proc. Camb. Philos. Soc. 1930;26:528–535. doi: 10.1017/S0305004100016297.
  • 18. Fisher R.A. The fiducial argument in statistical inference. Ann. Eugen. 1935;6:391–398. doi: 10.1111/j.1469-1809.1935.tb02120.x.
  • 19. Garwood F. Fiducial limits for the Poisson distribution. Biometrika. 1936;28:437–442.
  • 20. Cox D. Some simple approximate tests for Poisson variates. Biometrika. 1953;40:354–360. doi: 10.1093/biomet/40.3-4.354.
  • 21. Dempster A.P. The Dempster–Shafer calculus for statisticians. Int. J. Approx. Reason. 2008;48:365–377. doi: 10.1016/j.ijar.2007.03.004.
  • 22. Zhang S., Luo J., Zhu L., Stinchcomb D.G., Campbell D., Carter G., Gilkeson S., Feuer E.J. Confidence intervals for ranks of age-adjusted rates across states or counties. Stat. Med. 2014;33:1853–1866. doi: 10.1002/sim.6071.
  • 23. Talih M., Moonesinghe R., Huang D.T. Measuring the Magnitude of Health Inequality between 2 Population Subgroup Proportions. Am. J. Epidemiol. 2020;189:987–996. doi: 10.1093/aje/kwaa050.

Articles from International Journal of Environmental Research and Public Health are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
