On the Efficiency of Score Tests for Homogeneity in Two-Component Parametric Models for Discrete Data

David Todem; Wei-Wen Hsu; Kyung Mann Kim

doi:10.1111/j.1541-0420.2011.01737.x

Author manuscript; available in PMC: 2014 Jan 27.

Published in final edited form as: Biometrics. 2012 Feb 20;68(3):975–982. doi: 10.1111/j.1541-0420.2011.01737.x


PMCID: PMC3902182 NIHMSID: NIHMS518874 PMID: 22348298

Summary

In many applications of two-component mixture models for discrete data such as zero-inflated models, it is often of interest to conduct inferences for the mixing weights. Score tests derived from the marginal model that allows for negative mixing weights have been particularly useful for this purpose. But the existing testing procedures often rely on restrictive assumptions such as the constancy of the mixing weights and typically ignore the structural constraints of the marginal model. In this article, we develop a score test of homogeneity that overcomes the limitations of existing procedures. The technique is based on a decomposition of the mixing weights into terms that have an obvious statistical interpretation. We exploit this decomposition to lay the foundation of the test. Simulation results show that the proposed covariate-adjusted test statistic can greatly improve the efficiency over test statistics based on constant mixing weights. A real-life example in dental caries research is used to illustrate the methodology.

Keywords: Binary regression, Finite dimensional parameter, Goodness-of-fit test, Identifiable models, Likelihood function, Negative binomial models, Overdispersion, Score function

1. Introduction

In discrete data, when an observation exhibits an excessive frequency, it is a common practice to use two-component mixture models that combine an atom distribution with a proper distribution. Well known members of this class of models include the so-called zero-inflated models for independent count data first studied by Mullahy (1986), Farewell and Sprott (1988), and Lambert (1992) and their extensions to correlated count data by Hall (2000). Zero-inflated models have been extensively studied in statistical research and applied to data from various disciplines including agriculture, econometrics, medicine, engineering, sociology, and behavioral sciences. Ridout, Demetrio, and Hinde (1998) provide an extensive review of this literature.

Two-component mixture models provide an interesting parametric framework to accommodate heterogeneity in a population. A prevailing concern, however, is whether the inherent heterogeneity is consistent with observed data. Many authors have examined this important issue using two-sided score tests. In a Biometrics paper, van den Broek (1995, pp. 738–748) developed a two-sided score test for homogeneity in zero-inflated models for count data. Others have extended this test to situations where the nondegenerate distribution is a member of the exponential family (Deng and Paul, 2000) and to clustered data (Xiang et al., 2006). The original test by van den Broek (1995) and its extensions were constructed under alternatives to homogeneity that allow for negative mixing weights. Specifically, they were derived under the marginal mixture model that ignores its hierarchical representation. Although this approach uses score functions that are well defined at the null value, it is often unclear how constraints generated from this marginal model are accommodated under the alternative. Most importantly, existing testing procedures typically do not fully exploit the general structure of the mixing weights. A common restriction is that constant mixing weights are often assumed under heterogeneity. This is an important limitation as covariate-dependent mixing weights are typical in real applications of these models. From a technical perspective, tests that rely on constant mixing weights may lack power to detect heterogeneity if both inflation and deflation are present in the population.

The problem of inflation and deflation, typically at zero, arises quite naturally in various applications. A good example is provided by a unique survey designed to collect oral health information on low-income African American children (0–5 years) and their main caregivers (14+ years) living in the city of Detroit (see Tellez et al., 2006). This survey aimed at promoting oral health and reducing its disparities within this community through the understanding of determinants of dental caries. Dental caries were measured using scores which represent the cumulative severity of the disease for each surveyed participant. These scores have well-documented shortcomings regarding their ability to describe dental caries experiences, but they continue to be instrumental in evaluating and comparing the risks of dental caries across population groups (Lewsey and Thomson, 2004). Most importantly, they remain popular in dental caries research for their ability to conduct historical comparisons in population-based studies. Homogeneous models when applied to these data typically reveal an inflation of zeros (fewer children with no dental caries predicted than observed) for younger children and a deflation of zeros (more children with no dental caries predicted than observed) for older children. For these data, an analysis based on constant mixing weights under the alternative may fail to detect heterogeneity in this population.

Health services research is another area of application where inflation and deflation are often encountered. For example, Hur et al. (2002) presented a study on patients undergoing a partial colectomy operation from 123 Veterans Affairs (VA) medical centers in the National Surgical Quality Improvement Program. The primary goal of this program was to improve surgical care for veteran patients by developing performance measures for surgery in the VA system. One of the primary study outcomes was the number of postoperative complications. It was reported that patients who are relatively healthier before the surgical operation, as measured by the classification from the American Society of Anesthesiologists, will likely have no complications, whereas their sicker counterparts will be associated with higher numbers of complications. For such data, a homogeneous model will exhibit an inflation at zero for relatively healthier patients and a deflation at zero for the sickest patients.

The examples above then draw into question the efficiency of testing procedures which assume constancy of the mixing weights. In this article, we suggest an extension of the score test of homogeneity to covariates. A recent illustration of this approach for zero-inflated models is given by Jansakul and Hinde (2009). These authors related the mixing weights to covariates using an identity link function. A limitation, however, is that the identity link function is seldom used in practice for this class of models. Moreover, the structure of the marginal model is not properly integrated into the testing procedure. As a solution, we formulate and develop a score test of homogeneity based on an intuitive approach that accommodates the features of the marginal model. Specifically, we embed the structural constraints of the marginal model into the procedure. The technique is based on decomposing the mixing weights into terms that have an obvious statistical interpretation. One appealing feature of this decomposition is that it naturally incorporates covariates into the mixing weights. We exploit this decomposition to lay the foundation of the test. We show that this is a natural strategy to adopt when testing is carried out under the marginal model that ignores its hierarchical representation. The proposed test in its current formulation can be naturally used to study heterogeneity at any point of the population and for any nondegenerate parametric distributions that are not necessarily members of the exponential family.

The remainder of this article is organized as follows. In Section 2, we specify the homogeneity hypothesis using composite link functions and develop a score test that accommodates the features of the marginal model. In Section 3, the empirical performance of the proposed test is studied and its real-life applications are illustrated using dental caries counts in young children. Some remaining issues are discussed in Section 4. Additional technical details are contained in a Web supplementary materials file.

2. The Method

2.1 Hypothesis Formulation and Reparameterization

We consider a sample of n independent realizations $\{y_{i}\}_{i = 1}^{n}$ of random variables $\{Y_{i}\}_{i = 1}^{n}$, each drawn from a mixture of an atom distribution at a known point y^* and an unknown nondegenerate discrete distribution G governed by a finite dimensional parameter vector θ,

$$\Pr(Y_{i} = y_{i}) = \begin{cases} ω_{i} + (1 - ω_{i}) g_{i}(y^{*}) & \text{if } y_{i} = y^{*}, \\ (1 - ω_{i}) g_{i}(y_{i}) & \text{if } y_{i} \neq y^{*}. \end{cases} \tag{1}$$

Here ω_i is an unknown mixing weight and g_i(·) is the probability mass function associated with G. The mixture model in equation (1) allows for two types of representations, depending on the support of ω_i. Under its hierarchical representation, the mixing weight is a probability mass, requiring the distributional constraints 0 ≤ ω_i ≤ 1, i = 1, …, n. Under the marginal representation, however, the only distributional constraints are that 0 ≤ Pr(Y_i = y_i) ≤ 1, for all i = 1, …, n, under the mixture model in equation (1), which results in the following constraints on the mixing weights,

$$- \frac{g_{i}(y^{*})}{1 - g_{i}(y^{*})} \leq ω_{i} \leq 1, \quad i = 1, \dots, n. \tag{2}$$

The marginal model can then be used to accommodate both inflation (positive ω_i) and deflation (negative ω_i) at y^*, a property not shared by its hierarchical counterpart. The marginal model, however, does not allow any hierarchical interpretation of the mixture model when the mixing weight is negative. For further discussions on the interpretation of these models when y^* = 0, see Heilbron (1994).
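As a rough illustration of equations (1) and (2), the sketch below evaluates the marginal probability mass when G is Poisson and y^* = 0 by default. The function names (`poisson_pmf`, `omega_lower_bound`, `marginal_pmf`) and the numerical values are assumptions made for this example only.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability mass g(y) with mean mu > 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def omega_lower_bound(mu, y_star=0):
    """Lower bound on the mixing weight implied by equation (2)."""
    g_star = poisson_pmf(y_star, mu)
    return -g_star / (1.0 - g_star)

def marginal_pmf(y, omega, mu, y_star=0):
    """Pr(Y = y) under equation (1); omega may be negative (deflation)."""
    if not omega_lower_bound(mu, y_star) <= omega <= 1.0:
        raise ValueError("omega violates the constraints in equation (2)")
    mass = (1.0 - omega) * poisson_pmf(y, mu)
    return omega + mass if y == y_star else mass

# Deflation at zero: a negative weight removes probability mass from y* = 0.
mu = 1.5
probs = [marginal_pmf(y, -0.2, mu) for y in range(60)]
```

With a negative weight inside the admissible range the probabilities remain nonnegative and sum to one, although the two-stage (hierarchical) reading of the model is no longer available.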

Suppose we are interested in evaluating the hypothesis of zero mixing weights. Under the marginal model that allows for negative mixing weights, one is typically interested in the two-sided hypotheses,

$$H_{0} : ω_{i} = 0 \text{ for all } i \quad \text{vs.} \quad H_{1} : ω_{i} \neq 0 \text{ for some } i,$$

where ω_i satisfies the constraints in equation (2). To test these hypotheses, we consider a suitable transformation of ω_i that incorporates the constraints in equation (2) into the testing scheme. A natural transformation in light of these constraints is given by,

$$ω_{i} = \frac{π_{i} - g_{i}(y^{*})}{1 - g_{i}(y^{*})}, \quad 0 \leq π_{i} \leq 1. \tag{3}$$

Evidently, the lower and upper bounds of ω_i in equation (2) are attained at points 0 and 1 of π_i, respectively. The quantity π_i has an obvious statistical interpretation. To see this, consider a binary version of Y_i denoted δ(Y_i) taking value 1 if Y_i = y^* and 0 otherwise. The term π_i is simply the first moment E{δ(Y_i)} = Pr(Y_i = y^*) under the marginal mixture model equation (1), which can be estimated very well if many independent copies δ(y_i) of δ(Y_i) are available. The transformation as specified in equation (3) then arises from equating the probability mass at y^* as predicted by the mixture model in equation (1) to the first moment of δ(Y_i). That is, E{δ(Y_i)} = ω_i + (1 − ω_i)g_i (y^*), where g_i (y^*) is the probability Pr(Y_i = y^*) under the distribution G.
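The mapping in equation (3) and its inverse can be sketched in a few lines. The helper names `omega_from_pi` and `pi_from_omega`, and the Poisson value used for g(y^*), are illustrative assumptions rather than part of the method as published.

```python
import math

def omega_from_pi(pi, g_star):
    """Equation (3): mixing weight implied by pi = Pr(Y = y*) and g(y*)."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    return (pi - g_star) / (1.0 - g_star)

def pi_from_omega(omega, g_star):
    """Inverse map: pi = omega + (1 - omega) * g(y*)."""
    return omega + (1.0 - omega) * g_star

g_star = math.exp(-2.0)                # Poisson mass at zero with mean 2 (illustrative)
lower = omega_from_pi(0.0, g_star)     # equals -g*/(1 - g*), the lower bound in (2)
upper = omega_from_pi(1.0, g_star)     # equals 1, the upper bound in (2)
null = omega_from_pi(g_star, g_star)   # pi = g* corresponds to omega = 0
```

The three evaluations mirror the text: the end points of π_i reproduce the bounds in equation (2), and π_i = g_i(y^*) recovers the homogeneous model.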

Based on the parameterization equation (3), we formally state the hypotheses as,

$$H_{0} : π_{i} = g_{i}(y^{*}) \text{ for all } i \quad \text{vs.} \quad H_{1} : π_{i} \neq g_{i}(y^{*}) \text{ for some } i.$$

We exploit this reformulation to lay the foundation of the test. We consider a suitable parameterization of π_i and g_i (y^*) that reduces the homogeneity hypothesis above to a problem involving a small number of parameters. For this, we assume that μ_i, the mean of Y_i under the nondegenerate parametric distribution G, is finite. We express the probability mass at y^* with respect to the distribution G as a function of μ_i, that is, g_i (y^*) = g̃(μ_i), where g̃(.) is a function bounded between 0 and 1 that depends on θ and y^*. If the data can be arranged to form strata with a finite, possibly small, number of terms π_i and μ_i, the homogeneity hypothesis naturally reduces to a problem involving a small number of parameters. This stratified approach has its limitations, however, when the number of strata or, more generally, the number of covariates increases. Alternatively, regression techniques relating μ_i and π_i to some observed covariates can be used to address the curse of dimensionality. For this, consider two column-vector covariates x_i and z_i, of dimensions p and q, respectively, that are observed alongside the responses y_i, i = 1, 2, …, n. We relate μ_i to covariates x_i as follows, $h (μ_{i}) = x_{i}^{'} β$ , where β is a parameter vector of dimension p × 1 and a subset of θ, and h(.) is a monotone, differentiable, and invertible function. Likewise, we relate π_i to covariates z_i as follows, $π_{i} = f (z_{i}^{'} γ)$ , where γ is a parameter vector of dimension q × 1 and f (.) is a differentiable function. We find a suitable function f (.) that translates the homogeneity hypothesis into an equality involving only parameters β and γ. This is summarized in the proposition below.

Proposition

Assuming the model $h (μ_{i}) = x_{i}^{'} β$ under the nondegenerate distribution G, a suitable choice for f(.) is given by $f (z_{i}^{'} γ) = \tilde{g} \circ h^{- 1} (z_{i}^{'} γ)$ , where ∘ represents the composite function operator, and the homogeneity (null) hypothesis can be reformulated as,

$$H_{0} : \tilde{g} \circ h^{-1}(z_{i}^{'} γ) = \tilde{g} \circ h^{-1}(x_{i}^{'} β) \text{ for all } i, \tag{4}$$

which further reduces to a linear contrast involving only parameters β and γ.

The proof of this proposition is provided in the Web supplementary file. The function f (.) falls in the class of composite link functions which are not always invertible (Thompson and Baker, 1981). Below, we give an explicit expression of this function for commonly used nondegenerate parametric distributions G and link functions h(.). Specific results when y^* = 0 are given in Table 1.
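To see the reduction at work in the simplest case, take G to be Poisson with the log link and y^* = 0. The step below is only an illustration under those assumptions; the general argument is the one given in the Web supplementary file.

$$\tilde{g} \circ h^{-1}(η) = \exp\{-\exp(η)\}, \qquad \frac{d}{dη} \exp\{-\exp(η)\} = -\exp(η) \exp\{-\exp(η)\} < 0.$$

Because this map is strictly decreasing in η, equality (4) holds for subject i exactly when $z_{i}^{'} γ = x_{i}^{'} β$. With x_i = z_i this reads $x_{i}^{'} (γ - β) = 0$ for all i, which gives γ = β provided the vectors x_i span the p-dimensional space.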

Table 1.

Natural parameterizations relating π_i to linear predictors $z_{i}^{'} γ$ for commonly used two-component models and link functions relating μ_i to linear predictors $x_{i}^{'} β$ when y^* = 0

Nondegenerate distribution G	Link function h(·) relating μ_i to $x_{i}^{'} β$	Probability g_i(0) = g̃(μ_i)	Parameterization relating π_i to $z_{i}^{'} γ$
Poisson with mean μ_i	log{μ_i}	$\exp\{-\exp\{x_{i}^{'} β\}\}$	$\exp\{-\exp\{z_{i}^{'} γ\}\}$
Binomial with success probability μ_i and m_i trials	logit{μ_i}	$\{1 + \exp\{x_{i}^{'} β\}\}^{-m_{i}}$	$\{1 + \exp\{z_{i}^{'} γ\}\}^{-m_{i}}$
	Φ⁻¹(μ_i)	$\{1 - Φ(x_{i}^{'} β)\}^{m_{i}}$	$\{1 - Φ(z_{i}^{'} γ)\}^{m_{i}}$
	log{−log{μ_i}}	$\{1 - \exp\{-\exp\{x_{i}^{'} β\}\}\}^{m_{i}}$	$\{1 - \exp\{-\exp\{z_{i}^{'} γ\}\}\}^{m_{i}}$
Negative binomial with mean μ_i and overdispersion parameter κ	log{μ_i}	$\{1 + κ \exp\{x_{i}^{'} β\}\}^{-1/κ}$	$\{1 + κ \exp\{z_{i}^{'} γ\}\}^{-1/κ}$

Note: θ = β for the Poisson and the binomial processes; θ = {β, κ} for the negative binomial process.

The two-component model when G is a Poisson distribution with mean μ_i. Here θ = β and $\tilde{g}(μ_{i}) = \exp\{-μ_{i}\} μ_{i}^{y^{*}} Γ^{-1}(y^{*} + 1)$. Assuming the log link function $\log\{μ_{i}\} = x_{i}^{'} β$, a natural parameterization relating π_i to $z_{i}^{'} γ$ is $π_{i} = \exp\{-\exp\{z_{i}^{'} γ\}\} \{\exp\{z_{i}^{'} γ\}\}^{y^{*}} Γ^{-1}(y^{*} + 1)$.
The two-component model when G is a binomial distribution with success probability μ_i and planned number of trials m_i. Here θ = β and $\tilde{g}(μ_{i}) = {}_{m_{i}}C_{y^{*}} μ_{i}^{y^{*}} \{1 - μ_{i}\}^{m_{i} - y^{*}}$ with ${}_{m_{i}}C_{y^{*}} = \frac{Γ(m_{i} + 1)}{Γ(y^{*} + 1) Γ(m_{i} - y^{*} + 1)}$. Assuming the logit link function $\log\{μ_{i} / (1 - μ_{i})\} = x_{i}^{'} β$, a natural parameterization for π_i is then $π_{i} = {}_{m_{i}}C_{y^{*}} \{1 + \exp\{-z_{i}^{'} γ\}\}^{-y^{*}} \{1 + \exp\{z_{i}^{'} γ\}\}^{y^{*} - m_{i}}$.
The two-component model when G is a negative binomial distribution with mean μ_i and overdispersion parameter κ. Here θ = {β, κ} and $\tilde{g}(μ_{i}) = \frac{Γ(y^{*} + κ^{-1})}{Γ(κ^{-1}) Γ(y^{*} + 1)} \{1 + κ μ_{i}\}^{-1/κ} \{1 + κ^{-1} μ_{i}^{-1}\}^{-y^{*}}$. For the log link function $\log\{μ_{i}\} = x_{i}^{'} β$, a natural parameterization for π_i is then $π_{i} = \frac{Γ(y^{*} + κ^{-1})}{Γ(κ^{-1}) Γ(y^{*} + 1)} \{1 + κ \exp\{z_{i}^{'} γ\}\}^{-1/κ} \{1 + κ^{-1} \exp\{-z_{i}^{'} γ\}\}^{-y^{*}}$.
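The three parameterizations above translate directly into code. The following sketch evaluates π_i on the log scale for numerical stability; the function names and the argument `eta` (standing for the linear predictor $z_{i}^{'} γ$) are naming assumptions introduced for this example.

```python
import math

def pi_poisson(eta, y_star=0):
    """pi_i for a Poisson component with log link; eta = z_i' gamma."""
    return math.exp(-math.exp(eta) + y_star * eta - math.lgamma(y_star + 1))

def pi_binomial_logit(eta, m, y_star=0):
    """pi_i for a binomial component (m trials) with logit link."""
    log_choose = (math.lgamma(m + 1) - math.lgamma(y_star + 1)
                  - math.lgamma(m - y_star + 1))
    return math.exp(log_choose
                    - y_star * math.log1p(math.exp(-eta))
                    + (y_star - m) * math.log1p(math.exp(eta)))

def pi_negbin(eta, kappa, y_star=0):
    """pi_i for a negative binomial component (dispersion kappa), log link."""
    log_const = (math.lgamma(y_star + 1.0 / kappa) - math.lgamma(1.0 / kappa)
                 - math.lgamma(y_star + 1))
    return math.exp(log_const
                    - math.log1p(kappa * math.exp(eta)) / kappa
                    - y_star * math.log1p(math.exp(-eta) / kappa))
```

Setting `y_star=0` recovers the last column of Table 1; for instance, `pi_poisson(eta)` reduces to exp{−exp{eta}}.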

2.2 A Covariate-Adjusted Score Test for Homogeneity

Without any loss of generality, we will develop the test for the case where x_i = z_i, although our method generalizes quite naturally to any x_i and z_i. Under this assumption, our homogeneity hypothesis becomes H₀ : β = γ. Let C be a selector matrix such that Cθ = β; the homogeneity hypothesis is then equivalent to H₀ : Cθ = γ. Under the alternative model, the log-likelihood function, given observations $\{y_{i}\}_{i = 1}^{n}$, as a function of θ and γ is given by,

$$ℓ(θ, γ) = \sum_{i = 1}^{n} \left( δ(y_{i}) \log π_{i} + \{1 - δ(y_{i})\} \log \left[ \frac{(1 - π_{i}) g_{i}(y_{i})}{1 - g_{i}(y^{*})} \right] \right).$$
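A minimal sketch of this log-likelihood, assuming a Poisson component with log link, y^* = 0, and a single covariate plus intercept, is given below. The function names and the (intercept, slope) parameter layout are assumptions of the example. At γ = β the expression collapses to the ordinary Poisson log-likelihood, which gives a convenient check.

```python
import math

def loglik(beta, gamma, x, y):
    """Alternative-model log-likelihood for a Poisson G with y* = 0.

    beta and gamma are (intercept, slope) pairs; x and y are equal-length lists.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        mu = math.exp(beta[0] + beta[1] * xi)                # Poisson mean
        pi = math.exp(-math.exp(gamma[0] + gamma[1] * xi))   # Pr(Y_i = 0)
        if yi == 0:
            total += math.log(pi)
        else:
            log_g = -mu + yi * math.log(mu) - math.lgamma(yi + 1)
            total += math.log1p(-pi) + log_g - math.log1p(-math.exp(-mu))
    return total

def poisson_loglik(beta, x, y):
    """Homogeneous Poisson log-likelihood, for comparison."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = math.exp(beta[0] + beta[1] * xi)
        total += -mu + yi * math.log(mu) - math.lgamma(yi + 1)
    return total
```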

We assume the following change of variables, α = Cθ − γ, with α = 0 under the null hypothesis. We denote by u(θ, α) the score function ∂ℓ(θ, Cθ − α)/∂(θ, α). To construct the score test statistic and derive its limiting null distribution, we set the following conditions for the arguments in the sequel to be valid.

C1. The support set of (θ, α) is compact.
C2. There exists a nonsingular matrix I(θ, α) such that −n⁻¹∂u(θ, α)/∂(θ, α) = I(θ, α) + o_p(1).
C3. For all a > 0, $\sup_{\|ζ\| \leq a} \{n^{-1/2} [u((θ, α) + n^{-1/2} ζ) - u(θ, α)] + I(θ, α) ζ\} = o_{p}(1)$, where o_p(1) denotes a term that converges to zero in probability as n → ∞.

Under Conditions C1–C3, standard asymptotic results give $n^{-1/2} u(θ, α) \to_{d} N(0, I(θ, α))$, where →_d represents convergence in distribution as n → ∞ (see Cox and Hinkley, 1974, pp. 311–343, for this large-sample result). Assume the following decomposition, $u (θ, α) = {(u_{θ}^{'} (θ, α), u_{α}^{'} (θ, α))}^{'}$ , where the respective entries are the first-order derivatives of the log-likelihood function with respect to θ and α. Let $I_{θ θ}(θ, α)$, $I_{θ α}(θ, α)$, and $I_{α α}(θ, α)$ be the corresponding blocks in I(θ, α). The main building block of the test statistic is given by,

$$u_{α}(θ, α) = \sum_{i = 1}^{n} \left\{ \frac{δ(y_{i}) - π_{i}}{π_{i}(1 - π_{i})} \frac{\partial π_{i}}{\partial α} \right\}.$$

Note that π_i in this expression depends on both θ and α because of the change of variables. The other building block I(θ, α) is in general tedious to compute. In the Web supplementary file, we give details of these calculations when the component G of the mixture model is a Poisson or a binomial distribution (two well known exponential family members) and assuming y^* = 0.
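For a Poisson component with log link, y^* = 0, and x_i = z_i (so that C is the identity and γ = β − α), the building block can be written out directly. The lines below are a sketch of that calculation under these assumptions and are not a substitute for the expressions in the Web supplementary file.

$$π_{i} = \exp\{-\exp\{x_{i}^{'} (β - α)\}\}, \qquad \frac{\partial π_{i}}{\partial α} = π_{i} \exp\{x_{i}^{'} (β - α)\} x_{i},$$

$$u_{α}(θ, 0) = \sum_{i = 1}^{n} \frac{δ(y_{i}) - \exp\{-μ_{i}\}}{1 - \exp\{-μ_{i}\}} μ_{i} x_{i}, \qquad μ_{i} = \exp\{x_{i}^{'} β\}.$$

The second line follows by substituting the derivative into the general expression for u_α and setting α = 0, at which point π_i = exp{−μ_i}.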

Let b(θ) define the score function with respect to θ under the null distribution G. We assume the existence of a root-n consistent estimator θ̂ of θ^*, the true value of θ under the null distribution G, such that $n^{1 / 2} (\hat{θ} - θ^{*}) = n^{- 1 / 2} I_{θ θ}^{- 1} (θ^{*}, 0) b (θ^{*}) + o_{p} (1)$ . The core of the score statistic to evaluate the hypothesis α = 0 is given by û_α = u_α(θ̂, 0). In the Web supplementary file, we give an explicit form for this quantity when the component G is a Poisson or a binomial distribution.

The asymptotic distribution of û_α can be derived by applying a standard Taylor series expansion coupled with the law of large numbers,

$$n^{-1/2} \hat{u}_{α} = n^{-1/2} \sum_{i = 1}^{n} \left\{ u_{i, α}(θ^{*}, 0) - I_{θ α}^{'}(θ^{*}, 0) I_{θ θ}^{-1}(θ^{*}, 0) b_{i}(θ^{*}) \right\} + o_{p}(1),$$

where $u_{i, α}(θ^{*}, 0)$ and b_i(θ^*) are, respectively, the random contributions of subject i to the score functions u_α(θ^*, 0) and b(θ^*). The right-hand side of this equation consists of sums of independent random vectors $v_{i} (θ^{*}) = u_{i, α} (θ^{*}, 0) - I_{θ α}^{'} (θ^{*}, 0) I_{θ θ}^{- 1} (θ^{*}, 0) b_{i} (θ^{*})$ , i = 1, …, n, for which the central limit theorem applies. That is, $n^{-1/2} \hat{u}_{α} \to_{d} N(0, Λ(θ^{*}))$, where $Λ (θ^{*}) = \lim_{n \to \infty} n^{- 1} \sum_{i = 1}^{n} E (v_{i}^{\otimes 2} (θ^{*}))$ , with $v_{i}^{\otimes 2} (θ^{*}) = v_{i} (θ^{*}) v_{i}^{'} (θ^{*})$ . A straightforward calculation gives $Λ (θ^{*}) = I_{α α} (θ^{*}, 0) - I_{θ α}^{'} (θ^{*}, 0) I_{θ θ}^{- 1} (θ^{*}, 0) I_{θ α} (θ^{*}, 0)$ , for which a consistent estimator is obtained by replacing θ^* by its estimator θ̂. We then construct a score test statistic as,

$$s_{n} = n^{-1} \hat{u}_{α}^{'} Λ^{-1}(\hat{θ}) \hat{u}_{α} \to_{d} χ_{r}^{2} \quad \text{under } H_{0} \text{ as } n \to \infty,$$

where r = p = q. The null hypothesis is rejected for large values of s_n.
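The construction can be sketched end to end for the simplest setting: a Poisson component with log link, y^* = 0, and x_i = z_i consisting of an intercept and one covariate, so that r = 2. The information blocks used below were worked out for this sketch from Poisson moments (with w_i = μ_i² exp{−μ_i}/(1 − exp{−μ_i}), the sums n·I_αα = Σ w_i x_i x_i', n·I_θα = −Σ w_i x_i x_i', and n·I_θθ = Σ μ_i x_i x_i'); they are an assumption of the example and should be checked against the Web supplementary file. All function names and the toy data are hypothetical.

```python
import math

def gram(x, w):
    """Return sum_i w_i * (1, x_i)(1, x_i)' as a 2x2 list."""
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return [[s0, s1], [s1, s2]]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def mat2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def fit_poisson(x, y, iters=100, tol=1e-10):
    """Newton-Raphson MLE of (beta0, beta1) under the homogeneous Poisson model."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        h = inv2(gram(x, mu))
        d0 = h[0][0] * g0 + h[0][1] * g1
        d1 = h[1][0] * g0 + h[1][1] * g1
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def score_test(x, y):
    """Covariate-adjusted score statistic s_n and its chi-square(2) p-value."""
    b0, b1 = fit_poisson(x, y)
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    p0 = [math.exp(-m) for m in mu]
    # Scalar part of each subject's contribution to u_alpha at alpha = 0.
    r = [((yi == 0) - p) * m / (1.0 - p) for yi, p, m in zip(y, p0, mu)]
    u = [sum(r), sum(ri * xi for ri, xi in zip(r, x))]
    w = [m * m * p / (1.0 - p) for m, p in zip(mu, p0)]
    a = gram(x, w)     # n * I_alpha_alpha; n * I_theta_alpha equals -a
    b = gram(x, mu)    # n * I_theta_theta
    aba = mat2(mat2(a, inv2(b)), a)
    lam = [[a[i][j] - aba[i][j] for j in range(2)] for i in range(2)]  # n * Lambda
    li = inv2(lam)
    s = sum(u[i] * li[i][j] * u[j] for i in range(2) for j in range(2))
    return s, math.exp(-s / 2.0)   # chi-square survival function with 2 df

# Toy data (hypothetical): counts that tend to decrease with the covariate.
x = [i / 19.0 for i in range(20)]
y = [3, 0, 2, 1, 4, 0, 1, 2, 0, 1, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1]
s, p = score_test(x, y)
```

Working with the unnormalized sums is equivalent to the displayed statistic because the factor n cancels between n⁻¹ and Λ⁻¹. With an intercept only, the same algebra reduces to the constant mixing weight statistic of van den Broek (1995).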

3. Numerical Studies

3.1 Simulation

The two representations of the mixture model in equation (1) have some technical implications for the data generating mechanism. When ω_i is positive, the mixture model maintains its hierarchical representation and data can then be generated using the usual two-stage process. For negative ω_i, however, the mixture model loses its hierarchical representation and the data cannot be generated from the two-stage process. Instead, data are generated directly from the marginal distribution in equation (1) by applying the inverse of its cumulative distribution function (CDF) to a draw from a uniform distribution on the interval (0, 1).
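The inverse-CDF step can be sketched as follows for a Poisson component with y^* = 0. The function names, the seed, and the truncation guard `max_y` are assumptions of this example and do not describe the simulation code actually used.

```python
import math
import random

def marginal_pmf(y, omega, mu):
    """Pr(Y = y) from equation (1) with a Poisson G and y* = 0."""
    mass = (1.0 - omega) * math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    return omega + mass if y == 0 else mass

def draw_marginal(omega, mu, rng, max_y=1000):
    """Inverse-CDF draw: smallest y whose cumulative probability exceeds u."""
    g0 = math.exp(-mu)
    if not -g0 / (1.0 - g0) <= omega <= 1.0:
        raise ValueError("omega violates the constraints in equation (2)")
    u = rng.random()           # Uniform(0, 1) variate
    cum = 0.0
    for y in range(max_y):
        cum += marginal_pmf(y, omega, mu)
        if u < cum:
            return y
    return max_y               # guard against rounding in the far tail

rng = random.Random(2012)      # seed chosen arbitrarily for reproducibility
mu = 1.0
omega_min = -math.exp(-mu) / (1.0 - math.exp(-mu))   # strongest deflation allowed
draws = [draw_marginal(omega_min, mu, rng) for _ in range(2000)]
```

At the lower bound of ω_i all mass at zero is removed, so the draws follow a zero-truncated Poisson distribution, a case that no two-stage mechanism can produce.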

We conducted a simulation study to evaluate the empirical performances of the proposed covariate-adjusted score test in small to moderate sample sizes. We compared these performances to those of the tests proposed by van den Broek (1995), and Jansakul and Hinde (2009). Throughout our simulations, we generated data from a two-component model with true mixing weights $ω_{i}^{*}$ , i = 1, …, n, at y^* = 0, and a nondegenerate Poisson distribution with a simple mean $μ_{i}^{*} = exp {β_{0}^{*} - 1.45 x_{i}}$ , where the intercept $β_{0}^{*}$ takes values in the set {−0.75, 0, 0.75} and x_i is a covariate generated from a uniform distribution on the interval (0, 1). Throughout our simulations, we performed the proposed covariate-adjusted score test assuming the working mixing weight model ω_i = {π_i − exp{−μ_i}}{1 − exp{−μ_i}}⁻¹ under the alternative, where π_i = exp{−exp{γ₀ + γ₁x_i}} and μ_i = exp{β₀ + β₁x_i}. The test of van den Broek (1995) was performed assuming ω_i = γ₀, and that of Jansakul and Hinde (2009), assuming ω_i = γ₀ + γ₁x_i. With these parameterizations, the null hypotheses to be evaluated were then given by: H₀ : γ₀ = β₀, γ₁ = β₁, for our formulation; H₀ : γ₀ = 0, for van den Broek’s test; and H₀ : γ₀ = 0, γ₁ = 0, for Jansakul and Hinde’s test. The maximum likelihood estimate β̂ of the true value of β = (β₀, β₁)′ under the null was obtained from a homogeneous Poisson model with mean μ_i = exp{β₀ + β₁x_i}. Finally, all simulations were replicated 1,000 times and for sample sizes 50, 100, and 200.

To investigate the empirical type I error rates of the tests, we generated data from the homogeneous Poisson model. The empirical type I error rates at the 5% nominal level are reported in Table 2. All considered score tests have well controlled type I error rates even for a sample size as small as 50 when $β_{0}^{*}$ is large. However, for smaller values of $β_{0}^{*}$ (large values of $- β_{0}^{*}$ ) these tests tend to be slightly conservative. In other words, when the true mean $μ_{i}^{*}$ nears zero they tend to reject the null hypothesis less often than anticipated and this conservativeness does not diminish with increasing sample sizes.

Table 2.

Empirical size of score test statistics when the null model is a nondegenerate Poisson model with mean $μ_{i}^{*} = \exp\{β_{0}^{*} - 1.45 x_{i}\}$ , x_i ~ Uniform(0, 1), at 5% significance level

	n = 50			n = 100			n = 200		
$β_{0}^{*}$	−0.75	0	0.75	−0.75	0	0.75	−0.75	0	0.75
ω^* = 0
vdB test	0.031	0.053	0.051	0.048	0.063	0.059	0.042	0.050	0.066
J&H test	0.033	0.051	0.054	0.038	0.044	0.061	0.036	0.049	0.052
Prop. c-a test	0.036, 0.048, 0.040, 0.046, 0.057, 0.033, 0.045, 0.060

Note: vdB, van den Broek, test with df = 1; J&H, Jansakul and Hinde, test with df = 2; Prop. c-a, Proposed covariate-adjusted, test with df = 2.

We investigated the empirical power of the tests to detect various forms of heterogeneity in the population. First, we generated data from a two-component mixture model with a constant mixing weight fixed at 0.25. For this constant mixing weight model, increasing sample sizes and separation of mixture components improve the power of detecting the alternatives under consideration for all tests considered (see Table 3). The test proposed by van den Broek (1995) appears to be more powerful than the two covariate-adjusted tests, especially when the mixture components are well separated. This is expected as data were generated from a positive and constant mixing weight model, for which the associated marginal model maintains its hierarchical representation. The loss of power was fairly minor for covariate-adjusted tests, however.

Table 3.

Empirical power of score test statistics to detect various forms $ω_{i}^{*}$ of heterogeneity coupled with a nondegenerate Poisson model with mean $μ_{i}^{*} = \exp\{β_{0}^{*} - 1.45 x_{i}\}$ , x_i ~ Uniform(0, 1), at 5% significance level

	n = 50			n = 100			n = 200		
$β_{0}^{*}$	−0.75	0	0.75	−0.75	0	0.75	−0.75	0	0.75
$ω_{i}^{*} = 0.25$
vdB test	0.060	0.096	0.313	0.086	0.147	0.578	0.122	0.294	0.895
J&H test	0.069	0.087	0.249	0.094	0.118	0.481	0.123	0.238	0.851
Prop. c-a test	0.071	0.086	0.247	0.094	0.122	0.478	0.124	0.241	0.849
$ω_{i}^{*} = 0.45 + 0.15 x_{i}$
vdB test	0.100	0.221	0.698	0.156	0.388	0.960	0.219	0.684	1.000
J&H test	0.102	0.190	0.623	0.178	0.362	0.927	0.237	0.608	0.999
Prop. c-a test	0.105	0.191	0.617	0.179	0.363	0.933	0.235	0.603	0.999
$ω_{i}^{*} = \frac{\exp\{-1.2 + 1.1 x_{i}\}}{1 + \exp\{-1.2 + 1.1 x_{i}\}}$
vdB test	0.059	0.109	0.373	0.090	0.213	0.704	0.124	0.343	0.953
J&H test	0.080	0.112	0.318	0.103	0.192	0.662	0.136	0.297	0.929
Prop. c-a test	0.087	0.112	0.317	0.105	0.194	0.655	0.135	0.292	0.924
$ω_{i}^{*} = \frac{π_{i}^{*} - \exp\{-μ_{i}^{*}\}}{1 - \exp\{-μ_{i}^{*}\}}$, $π_{i}^{*} = \exp\{-\exp\{1.3 - 2.4 x_{i}\}\}$
vdB test	0.166	0.293	0.257	0.295	0.395	0.413	0.392	0.612	0.590
J&H test	0.236	0.454	0.386	0.383	0.628	0.544	0.506	0.847	0.731
Prop. c-a test	0.710, 0.845, 0.957, 0.944, 0.981, 0.997, 1.000
$ω_{i}^{*} = \frac{\exp\{-15 + 30 x_{i}\}}{1 + \exp\{-15 + 30 x_{i}\}}$
vdB test	0.026	0.067	0.095	0.039	0.079	0.131	0.051	0.090	0.189
J&H test	0.054	0.107	0.166	0.082	0.141	0.271	0.090	0.180	0.448
Prop. c-a test	0.068	0.151	0.430	0.122	0.238	0.758	0.131	0.431	0.967
$ω_{i}^{*} = Φ(-15 + 30 x_{i})$
vdB test	0.042	0.070	0.099	0.045	0.069	0.153	0.057	0.083	0.221
J&H test	0.051	0.097	0.160	0.075	0.114	0.278	0.104	0.167	0.465
Prop. c-a test	0.062	0.142	0.485	0.095	0.222	0.739	0.142	0.409	0.962

Note: vdB, van den Broek, test with df = 1; J&H, Jansakul & Hinde, test with df = 2; Prop. c-a, Proposed covariate-adjusted, test with df = 2.

Second, we allowed the true mixing weight to depend on covariates using the linear, the logistic, and the proposed transformation (mixing weight models in rows 2, 3, and 4 of Table 3). Overall, the power of the three score tests improves with the sample size, regardless of the true mixing weight. When the true mixing weight is a linear or a logistic function of the covariates, all considered tests appear to have comparable powers. But the power deteriorates as the mean of the non-degenerate Poisson model nears zero. This is not surprising as the true mixing weights in these schemes are bounded between 0 and 1, a condition for the marginal mixture model to maintain its hierarchical representation. When the true mixing weights allow for negative values (mixing weight model in row 4 of Table 3), the proposed score test statistic appears to be more powerful than the other tests. The deterioration of power appears to be greater for the test proposed by van den Broek (1995). This is expected as the constant mixing weight test averages the mixing weights over the space of covariates, which may greatly affect power if both deflation and inflation are present in the population.

From the previous simulation results, one referee was concerned that the proposed covariate-adjusted test outperforms the other competitors only when the simulated data are generated from a model that allows for negative mixing weights. Following this referee’s suggestion, we conducted further simulation studies in which data were generated from a two-component mixture model with positive covariate-dependent weights $ω_{i}^{*} = \{1 + \exp\{15 - 30 x_{i}\}\}^{-1}$ and $ω_{i}^{*} = Φ (- 15 + 30 x_{i})$ , i = 1, …, n, where Φ(·) is the CDF of a standard normal variable. For these simulation schemes, the proposed test outperforms the other two competitors, especially when the two mixture components are well separated (see the last two simulation schemes in Table 3). Here the test proposed by Jansakul and Hinde (2009) adjusts for the covariate x_i in modeling the mixing weight ω_i, but fails to provide an adequate functional relationship between ω_i and x_i. In general, we expect this test to perform well when linearity holds and poorly otherwise. Following another referee’s suggestion, another simulation study was conducted to compare the three tests when covariates x_i and z_i are not equal. Findings from this study (results where x_i is a subset of z_i are shown in Table 5, Web supplementary file) were similar to those that assume x_i = z_i.

In sum, incorporating covariates in the score tests for homogeneity can greatly improve efficiency. But this efficiency gain depends heavily on the prespecified working model of the mixing weight and the behavior of the tests under model misspecification. When the true mixing weight does not depend on covariates, our test may not be as efficient as the constant mixing weight test. However, given that the true model is usually unknown to the analyst, a general approach that assumes covariate-dependent mixing weights appears to be the most conservative data analytic strategy.

3.2 Dental Caries Data

To illustrate our methodology, we considered children’s dental caries data from the Detroit study. We focused on scores representing the number of tooth surfaces that show signs of clinically detectable enamel lesions comprising both noncavitated and cavitated lesions. Using this stringent dental caries definition, three different outcomes were derived. These outcomes represent the number of decayed surfaces (DS), the number of decayed and filled surfaces (DFS), and the number of decayed, missing, and filled surfaces (DMFS). Although this survey is longitudinal in nature, our numerical computations are based on cross-sectional data of 897 children surveyed in the first wave of examinations conducted between 2002 and 2003. Covariates considered include Age (the child’s age in years), SI (the child’s sugar intake), and their multiplicative interaction Age * SI. A more detailed description of the study can be found elsewhere (see, for example, Tellez et al., 2006).

It is a common practice to use zero-inflated models to characterize dental caries scores (see, for example, Böhning et al., 1999). In addition, it is well known that dental caries data often exhibit overdispersion in addition to zero inflation. For low-income children, however, it is possible that both zero inflation and deflation may be present, even when overdispersion is accounted for. To investigate this, as a simple analysis we discretized the age variable (Age < 3, 3 ≤ Age ≤ 4, and Age > 4) and fit a homogeneous negative binomial model with age group specific means and a common overdispersion parameter to DS. Figure 1 presents an informal assessment of this model by comparing the observed proportions to the predicted probabilities for each age group. In this population, there is an inflation of zeros for children under the age of 3 and a deflation for children above the age of 4 (bottom panel of Figure 1). A constant mixing weight test, by averaging out the mixing weights over the age groups, may fail to detect heterogeneity in this population.

Figure 1. Observed and predicted proportions (top panel) and observed minus predicted proportions (bottom panel) of the number of decayed surfaces (DS) for each age group. The fitted model is a negative binomial distribution with age group specific means.

We evaluated the homogeneity hypothesis using the proposed covariate-adjusted score test. Specifically, we considered a two-component mixture model for which the nondegenerate distribution is a negative binomial model with mean μ_i = exp{β₀ + β₁Age_i + β₂SI_i + β₃Age_i * SI_i} and overdispersion parameter κ, and the mixing weight ω_i is given by equation (3) with π_i = (1 + κ exp{γ₀ + γ₁Age_i + γ₂SI_i + γ₃Age_i * SI_i})^{−1/κ}. For comparison, the score test of van den Broek (1995) with constant mixing weight ω_i = γ₀ and that proposed by Jansakul and Hinde (2009) with mixing weight ω_i = γ₀ + γ₁Age_i + γ₂SI_i + γ₃Age_i * SI_i were also performed. With these parameterizations, the null hypotheses to be evaluated were then given by: H₀ : γ_j = β_j, j = 0, 1, 2, 3, for our formulation; H₀ : γ₀ = 0, for van den Broek’s test; and H₀ : γ_j = 0, j = 0, 1, 2, 3, for Jansakul and Hinde’s test. All these tests were conducted by replacing the nuisance parameter β by its maximum likelihood estimate under the null distribution. Results of this analysis are presented in Table 4.
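To see why the null hypothesis for the proposed formulation takes the form γ_j = β_j, it helps to evaluate the quantities numerically. The sketch below computes μ_i, the negative binomial zero probability (1 + κμ_i)^{−1/κ}, and π_i for a single child. Equation (3) is not reproduced in this section, so the code assumes, as a labeled assumption, that π_i plays the role of the marginal zero probability in the usual zero-modified relation π_i = ω_i + (1 − ω_i) f_i(0), which gives ω_i = (π_i − f_i(0)) / (1 − f_i(0)). All coefficient values and function names are hypothetical.

```python
import math


def linear_predictor(coef, age, si):
    """coef = (c0, c1, c2, c3) applied to 1, Age, SI, and Age * SI."""
    c0, c1, c2, c3 = coef
    return c0 + c1 * age + c2 * si + c3 * age * si


def nb_zero_prob(mu, kappa):
    """P(Y = 0) for a negative binomial with mean mu and overdispersion kappa."""
    return (1.0 + kappa * mu) ** (-1.0 / kappa)


def mixing_weight(beta, gamma, kappa, age, si):
    """Mixing weight under the ASSUMED relation pi = omega + (1 - omega) * f(0)."""
    mu = math.exp(linear_predictor(beta, age, si))
    f0 = nb_zero_prob(mu, kappa)
    pi = (1.0 + kappa * math.exp(linear_predictor(gamma, age, si))) ** (-1.0 / kappa)
    return (pi - f0) / (1.0 - f0)


# Hypothetical parameter values, for illustration only.
beta = (0.2, 0.3, 0.1, -0.02)
kappa = 1.5
age, si = 3.0, 2.0

omega_null = mixing_weight(beta, beta, kappa, age, si)                     # gamma = beta
omega_infl = mixing_weight(beta, (-1.0, 0.3, 0.1, -0.02), kappa, age, si)  # pi above f(0)
omega_defl = mixing_weight(beta, (1.0, 0.3, 0.1, -0.02), kappa, age, si)   # pi below f(0)
print(omega_null, omega_infl, omega_defl)
```

Under this assumed relation, setting γ = β makes π_i coincide with the negative binomial zero probability, so the mixing weight vanishes for every covariate value, while departures of γ from β in either direction yield positive (inflation) or negative (deflation) weights.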

Table 4.

Score test statistics, degrees of freedom, and associated p-values for heterogeneity in dental caries data when the null model is a nondegenerate negative binomial distribution

Response	van den Broek df	van den Broek statistic	van den Broek p-value	Jansakul & Hinde df	Jansakul & Hinde statistic	Jansakul & Hinde p-value	Proposed cov.-adj. df	Proposed cov.-adj. statistic	Proposed cov.-adj. p-value
DS	1	0.0164	0.8980	4	128.1512	<0.001	4	151.8566	<0.001
DFS	1	0.0083	0.9274	4	123.2241	<0.001	4	148.8391	<0.001
DMFS	1	0.0009	0.9759	4	126.7810	<0.001	4	151.8784	<0.001


Our test statistic and that of Jansakul and Hinde (2009) reject the homogeneity hypothesis for all outcomes at the 5% significance level, supporting the hypothesis of heterogeneity. Our test, however, provides stronger evidence for heterogeneity in this population, in view of its larger test statistics, and hence smaller p-values, on the same degrees of freedom. The constant mixing weight test, in contrast, fails to reject the null for all outcomes. The observed value of the constant mixing weight test statistic is almost zero, suggesting that inflation and deflation at zero are of about the same magnitude once random variation is taken into account. This is a good example where the score test based on constant mixing weights is not powerful enough to capture heterogeneity in the data. When adjusted for covariates, the score test is far more efficient at detecting heterogeneity.
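As a quick arithmetic check of Table 4, the p-values can be recomputed from the reported statistics, assuming the usual asymptotic chi-square reference distribution with the stated degrees of freedom. The sketch below uses closed-form upper tail probabilities for 1 and 4 degrees of freedom so that it needs only the Python standard library; the function name `chi2_sf` is introduced here for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability P(X > x) of a chi-square variable, for df 1 or 4."""
    if df == 1:
        # A chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 4:
        # For even df = 2m the tail is exp(-x/2) * sum_{k < m} (x/2)**k / k!.
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("closed form implemented only for df = 1 and df = 4")


# Statistics as reported in Table 4:
# (van den Broek, Jansakul & Hinde, proposed covariate-adjusted test)
table4 = {
    "DS": (0.0164, 128.1512, 151.8566),
    "DFS": (0.0083, 123.2241, 148.8391),
    "DMFS": (0.0009, 126.7810, 151.8784),
}

for outcome, (vdb, jh, proposed) in table4.items():
    print(outcome,
          round(chi2_sf(vdb, 1), 4),
          chi2_sf(jh, 4) < 0.001,
          chi2_sf(proposed, 4) < 0.001)
```

The 1-degree-of-freedom tail probabilities agree with the tabulated p-values up to the rounding of the published statistics, and both 4-degree-of-freedom tests give tail probabilities far below 0.001.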

It is worth noting that rejecting homogeneity does not by itself give evidence that the zero-inflated negative binomial model provides the best fit for the data. Instead, such rejection only gives grounds for the zero-inflated negative binomial model to be further evaluated. As a final investigation, we therefore compared the zero-inflated negative binomial model to the zero-inflated Poisson model. A one-sided test (Silvapulle and Silvapulle, 1995; Ridout, Hinde, and Demetrio, 2001) of the overdispersion parameter of the negative binomial model was highly significant at the 1% significance level, revealing strong evidence for overdispersion in addition to zero inflation. This confounding of mixtures with overdispersion is not uncommon in practice, as recognized by Lindsay and Roeder (1992) and Böhning et al. (1999).
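The one-sided nature of the overdispersion test matters because the overdispersion parameter lies on the boundary of its parameter space under the Poisson null. As a minimal sketch for a single boundary parameter, and not a reproduction of the cited procedures, the p-value can be taken from the upper tail of a standard normal for the signed root of the score statistic, which is equivalent to halving the chi-square tail with 1 degree of freedom when the score is positive. The inputs `score` and `information` are hypothetical quantities assumed to be already adjusted for nuisance parameters.

```python
import math


def one_sided_score_pvalue(score, information):
    """One-sided p-value for a scalar parameter tested at its lower boundary.

    score: score function evaluated at the null value (assumed given).
    information: corresponding variance of the score, assumed to be
        already adjusted for any nuisance parameters.
    """
    z = score / math.sqrt(information)
    # Upper tail of the standard normal: 1 - Phi(z).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical inputs, for illustration only.
print(one_sided_score_pvalue(score=3.2, information=1.0))
```

A negative score, which points away from overdispersion, yields a p-value above 0.5, so the test never rejects in the direction of underdispersion.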

4. Discussion

The goal of this article was not to argue for or against the marginal model, but rather to show how to conduct inferences for the mixing weights in this class of models. Although score tests for evaluating homogeneity in two-component models for discrete data have been well discussed in the literature, existing methodologies have relied primarily on restrictive assumptions. One common restriction is the assumption of constant mixing weights. From a practical perspective, this is an important limitation, as covariate-dependent mixing weights are typical in real applications of two-component regression models. More generally, limitations of existing methodologies result from the structure of the marginal model being ignored or, at best, not properly integrated into the testing procedures. In this article, we formulated and developed a score test of homogeneity based on an intuitive approach that accommodates the features of the marginal model. Specifically, our proposed test adopts a novel parameterization that allows the structural constraints of the mixing weight to be embedded in the testing scheme. Most importantly, this parameterization naturally incorporates covariates in the mixing weights. We showed that this is a natural strategy to adopt when testing is carried out under the marginal model that ignores its hierarchical representation.

The proposed test can be extended to refine the model specification. We generally conduct the test of homogeneity because it is easy to control its type I error. However, if the null hypothesis is rejected, one is typically interested in evaluating composite hypotheses. For example, one may be interested in testing for equality of some coefficients of the binary regression model and the corresponding coefficients of the nondegenerate regression model. Although the details of evaluating such composite hypotheses warrant a separate investigation, the results presented here can be used to evaluate these specific hypotheses with a proper adjustment of the type I error. It is noteworthy that rejecting the homogeneity hypothesis against a specific two-component model does not necessarily imply that the latter model is appropriate. Instead, we advocate that such rejection only gives grounds for the two-component model to be further investigated. It is therefore crucial to assess, in light of the observed data, how plausible the inferences of major substantive interest are under the candidate model being considered.

Supplementary Material

Supplementary File

NIHMS518874-supplement-Supplementary_File.pdf (58KB, pdf)

Acknowledgments

The authors are grateful to Amid Ismail and Woosung Sohn for their permission to use the Detroit dental caries data. This work was supported by the first author’s NCI/NIH K-award, 5K01CA131259-04 and its supplement 3K01CA131259-02S1 from the 2009 American Recovery and Reinvestment Act funding mechanism.

Footnotes

5. Supplementary Materials

Web Appendices referenced in Sections 2 and 3 are available with this article at the Biometrics website on Wiley Online Library.

References

Böhning D, Dietz E, Schlattmann P, Mendonca L, Kirchner U. The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology. Journal of the Royal Statistical Society: Series A (Statistics in Society). 1999;162:195–209.
Cox DR, Hinkley DV. Theoretical Statistics. London: Chapman & Hall Ltd; 1974.
Deng D, Paul SR. Score tests for zero inflation in generalized linear models. The Canadian Journal of Statistics/La Revue Canadienne de Statistique. 2000;28:563–570.
Farewell V, Sprott D. The use of a mixture model in the analysis of count data. Biometrics. 1988;44:1191–1194.
Hall DB. Zero-inflated Poisson and binomial regression with random effects: A case study. Biometrics. 2000;56:1030–1039. doi: 10.1111/j.0006-341x.2000.01030.x.
Heilbron D. Zero-altered and other regression models for count data with added zeros. Biometrical Journal. 1994;36:531–547.
Hur K, Hedeker D, Henderson W, Khuri S, Daley J. Modeling clustered count data with excess zeros in health care outcomes research. Health Services and Outcomes Research Methodology. 2002;3:5–20.
Jansakul N, Hinde J. Score tests for extra-zero models in zero-inflated negative binomial models. Communications in Statistics-Simulation and Computation. 2009;38:92–108.
Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992;34:1–14.
Lewsey JD, Thomson WM. The utility of the zero-inflated Poisson and zero-inflated negative binomial models: A case study of cross-sectional and longitudinal DMF data examining the effect of socio-economic status. Community Dentistry and Oral Epidemiology. 2004;32:183–189. doi: 10.1111/j.1600-0528.2004.00155.x.
Lindsay B, Roeder K. Residual diagnostics for mixture models. Journal of the American Statistical Association. 1992;87:785–794.
Mullahy J. Specification and testing of some modified count data models. Journal of Econometrics. 1986;33:341–365.
Ridout M, Demetrio CGB, Hinde J. Proceedings of International Biometric Conference. Cape Town, South Africa: International Biometric Society; 1998. Models for count data with many zeros; pp. 179–192.
Ridout M, Hinde J, Demetrio CGB. A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics. 2001;57:219–223. doi: 10.1111/j.0006-341x.2001.00219.x.
Silvapulle MJ, Silvapulle P. A score test against one-sided alternatives. Journal of the American Statistical Association. 1995;90:342–349.
Tellez M, Sohn W, Burt B, Ismail A. Assessment of the relationship between neighborhood characteristics and dental caries severity among low-income African-Americans: A multilevel approach. Journal of Public Health Dentistry. 2006;66:30–36. doi: 10.1111/j.1752-7325.2006.tb02548.x.
Thompson R, Baker RJ. Composite link functions in generalized linear models. Applied Statistics. 1981;30:125–131.
van den Broek J. A score test for zero inflation in a Poisson distribution. Biometrics. 1995;51:738–743.
Xiang L, Lee A, Yau K, McLachlan G. A score test for zero-inflation in correlated count data. Statistics in Medicine. 2006;25:1660–1671. doi: 10.1002/sim.2308.
