Summary
Group testing, where subjects are tested in pools rather than individually, has a long history of successful application in infectious disease screening. In this paper, we develop group testing regression models to include covariate effects which are best regarded as random. We present approaches to fit mixed effects models using maximum likelihood, investigate likelihood ratio and score tests for variance components, and evaluate small sample performance using simulation. We illustrate our methods using chlamydia and gonorrhea data collected by the state of Nebraska as part of the Infertility Prevention Project.
Keywords: Generalized linear mixed model, Latent binary response, Likelihood ratio test, Monte Carlo EM algorithm, Pooled testing, Score test
1. Introduction
The Infertility Prevention Project (IPP) is a national program, funded by the Centers for Disease Control and Prevention, aimed at providing routine screening, prevention strategies, and treatment for individuals with chlamydia and/or gonorrhea infection. Chlamydia and gonorrhea are the two most common sexually transmitted diseases (STDs) in the United States, and each year the US spends approximately $4 billion on assessment, testing, and treatment for these infections (Screening and Treatment Guidelines, IPP, Region VII, 2003). Unlike most viral STDs, individuals with chlamydia and gonorrhea, both bacterial infections, are usually asymptomatic, and left untreated, both infections can result in a variety of long-term sequelae, such as pelvic inflammatory disease, ectopic pregnancy, and infertility (Kacena et al., 1998a, 1998b).
This research arises from our interaction with medical collaborators in Nebraska (Region VII of the IPP), where more than 30,000 individuals are tested for chlamydia and gonorrhea each year. With the current testing cost of about $15 per individual expected to rise as public health costs increase nationally, our colleagues have expressed an interest in implementing group testing (also known as pooled testing) as a means of surveillance in the state. Group testing, where subjects are tested in pools rather than individually, has a long history of successful application in infectious disease screening, dating back to the work of Dorfman (1943). Today, pooling individual samples (e.g., blood, urine, etc.) is a common strategy to reduce testing costs and has been implemented in a variety of infectious disease applications involving HIV, hepatitis B/C (Cardoso, Koerner, and Kubanek, 1998; Pilcher et al., 2005), and elsewhere for chlamydia and gonorrhea screening (Kacena et al., 1998a, 1998b; Lindan et al., 2005; Rours et al., 2005).
Traditionally, statistical research in group testing has focused on homogeneous populations; that is, the positive/negative statuses of individuals tested are assumed to be independent and identically distributed (iid) random variables. However, in most infectious disease studies, there is covariate information available for each individual, and a major thrust of the analysis is to determine which covariates are associated with individual positivity. To address this issue, Vansteelandt, Goetghebeur, and Verstraeten (2000) and Xie (2001) have each proposed fixed effects regression approaches to account for individual covariate information with pooled binary responses. Vansteelandt et al. (2000) use a method by which maximum likelihood estimates are computed directly using the pooled responses, whereas Xie (2001) treats the individual responses as unobserved and uses the EM algorithm.
Unifying the approaches of Vansteelandt et al. (2000) and Xie (2001) is the assumption that individual (latent) statuses are independent. However, one aspect of the Nebraska study conspicuously calls this assumption into question. Because testing takes place at different clinic site locations throughout the state, individuals are inherently clustered by design and may share common characteristics. For example, the level of gonorrhea and chlamydia infection in Douglas County (which includes Omaha) has recently been described as at “epidemic” levels (Zagurski, 2006), whereas infection levels in other parts of the state are relatively lower. If the prevalence of chlamydia and gonorrhea does vary geographically, it would be more natural to regard the individual latent infection statuses as correlated and to conceptualize clinic locations as random effects. Taking this perspective, the analysis of group testing samples from the Nebraska IPP requires the development of new regression methods that allow for within-cluster correlation among the individual latent binary responses.
In this paper, we generalize the group testing regression modeling approach of Vansteelandt et al. (2000) to include random effects. In Section 2, we obtain maximum likelihood estimates and discuss large sample inference for fixed effects parameters. In Section 3, we present likelihood ratio and score tests for variance components which allow one to assess heterogeneity among clusters. In Section 4, we provide simulation evidence to show that our model fitting approaches behave well and that tests for variance components have very good size and power properties in finite samples. In Section 5, we illustrate the use of our new analytical methods with the Nebraska IPP data. In Section 6, we summarize and discuss extensions of this work.
2. Estimation and Inference
2.1. Notation and Assumptions
We are interested in developing mixed effects regression techniques to model the probability of a single infection (e.g., chlamydia) using pooled testing results. In doing so, we view individuals from a given clinic site as a cluster. Within the ith site, suppose that each individual is randomly assigned to one of ni pools and let Yijk = 1 if the kth individual in the jth pool at site i is positive, Yijk = 0 otherwise, for i = 1, 2, ..., l, j = 1, 2, ..., ni, and k = 1, 2, ..., cij. We henceforth refer to cij as the pool size. We let pijk = P(Yijk = 1 | ui), where ui denotes a q × 1 random effect vector for site i, and relate the latent status Yijk to the covariate information through the generalized linear mixed model
$$g(p_{ijk}) = x_{ijk}'\beta + z_{ijk}'u_i, \qquad (1)$$
where β is a p × 1 vector of fixed effects parameters, xijk is a p × 1 covariate vector associated with the fixed effects, zijk is a q × 1 covariate vector associated with the site-specific random effects, and g(·) is a known, monotonic, differentiable link function. It is worth emphasizing that Yijk is not observed, unless cij = 1, which corresponds to testing subjects individually.
Because identifying positive individuals is not our goal, similarly to Vansteelandt et al. (2000), we consider the case where only the initial pooled responses are observed; that is, subsets of positive pools are not retested further. Define Tij = 1 if the jth pool at site i tests positive, Tij = 0 otherwise, for i = 1, 2, ..., l and j = 1, 2, ..., ni. We assume that the statuses of individuals and pools from different sites are independent and that, conditional on ui, the statuses of individuals and pools within site i are independent so that
$$P(T_{ij} = 1 \mid u_i) = \gamma_1 + (1 - \gamma_1 - \gamma_2) \prod_{k=1}^{c_{ij}} (1 - p_{ijk}),$$
where γ1 and γ2 denote assay sensitivity and specificity, respectively. It is assumed that γ1 and γ2 are constants close to 1 and do not depend on cij, a reasonable assumption with modern diagnostic assays based on nucleic acid technology (NAT). For a number of chlamydia studies, NATs have been shown to have high sensitivity and specificity for pools of up to size 10 when pooling urine or cervical swabs (see, e.g., Kacena et al., 1998b). Similar results have been observed in gonorrhea studies (Kacena et al., 1998a).
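The pool-level probability can be illustrated numerically. The sketch below assumes only the definitions of sensitivity and specificity given here: a pool that truly contains at least one positive individual tests positive with probability γ1, and a truly negative pool tests positive with probability 1 − γ2. The function name `pool_positive_prob` and the example prevalence of 0.02 are illustrative choices, not part of the original analysis.

```python
import math

def pool_positive_prob(p, sens=1.0, spec=1.0):
    """Probability that a pool tests positive, given the individual
    probabilities p (one per pool member) and the assay accuracy."""
    # Probability that every individual in the pool is truly negative.
    prob_all_negative = math.prod(1.0 - pk for pk in p)
    # Truly positive pool detected with prob. sens; truly negative pool
    # gives a false positive with prob. 1 - spec.
    return sens * (1.0 - prob_all_negative) + (1.0 - spec) * prob_all_negative

# Pool of size 5 with individual probability 0.02 (hypothetical values).
print(pool_positive_prob([0.02] * 5))                        # perfect assay
print(pool_positive_prob([0.02] * 5, sens=0.95, spec=0.98))  # imperfect assay
```

With cij = 1 and a perfect assay the function returns the individual probability itself, which matches the remark that individual testing is the special case of pools of size one.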
We treat the site-specific u1, u2, ..., ul as iid multivariate normal random vectors with mean 0 and covariance matrix D ≡ D(φ), where φ is an m × 1 vector of variance components. Denote by θ = (β’, φ’)’ the (p + m) × 1 vector of parameters. Under these assumptions, the log-likelihood of θ based on the observed pool responses T = (Tij) is
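$$l(\theta \mid T) = \sum_{i=1}^{l} \log \int \prod_{j=1}^{n_i} f_{ij}(c_{ij}, x_{ij}, u_i, \beta)\, f_u(u_i; D)\, du_i, \qquad (2)$$
where $x_{ij} = (x_{ij1}', x_{ij2}', \ldots, x_{ijc_{ij}}')'$,
$$f_{ij}(c_{ij}, x_{ij}, u_i, \beta) = \Big\{\gamma_1 + (1 - \gamma_1 - \gamma_2) \prod_{k=1}^{c_{ij}} (1 - p_{ijk})\Big\}^{T_{ij}} \Big\{1 - \gamma_1 - (1 - \gamma_1 - \gamma_2) \prod_{k=1}^{c_{ij}} (1 - p_{ijk})\Big\}^{1 - T_{ij}}, \qquad (3)$$
and $f_u(\cdot\,; D)$ is the $N_q(0, D)$ probability density function.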
2.2. Maximum Likelihood Estimation
We consider two approaches to find the MLE θ˄ = (β˄’, φ˄’)’ and evaluate the relative merits of each. First, we use adaptive Gauss-Hermite quadrature (Pinheiro and Bates, 1995) to approximate the integral in (2) and then maximize the approximated log-likelihood with respect to θ using a Newton-Raphson procedure. This approach is straightforward to implement when q is small; however, its use may be ill-advised for large q because the number of quadrature points needed to confer reasonable precision increases exponentially (Booth, Hobert, and Jank, 2001). As a second approach, we formulate a general Monte Carlo expectation-maximization (MCEM) algorithm (McCulloch, 1997). While this approach is computationally more expensive, it could be modified to allow for other random effects distributions (Chen, Zhang, and Davidian, 2002), and it has the potential to be applicable with other pooling strategies not considered here; see Section 6.
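To make the quadrature idea concrete, the following is a minimal sketch for the q = 1 random intercept case with a logit link and a single covariate. It uses a fixed five-point Gauss-Hermite rule with closed-form nodes and weights rather than the adaptive rule of Pinheiro and Bates (1995), so it illustrates the principle only; the function names and the data layout (a site is a list of `(t, xs)` pairs, where `t` is the pool response and `xs` the covariates of the pool members) are assumptions made for this example.

```python
import math

_S10 = math.sqrt(10.0)
_RPI = math.sqrt(math.pi)
_A = math.sqrt((5.0 - _S10) / 2.0)
_B = math.sqrt((5.0 + _S10) / 2.0)
# Five-point Gauss-Hermite rule for the weight function exp(-x^2).
GH_NODES = [0.0, _A, -_A, _B, -_B]
GH_WEIGHTS = [8.0 * _RPI / 15.0,
              _RPI * (7.0 + 2.0 * _S10) / 60.0, _RPI * (7.0 + 2.0 * _S10) / 60.0,
              _RPI * (7.0 - 2.0 * _S10) / 60.0, _RPI * (7.0 - 2.0 * _S10) / 60.0]

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def pool_lik(t, xs, u, beta, sens=1.0, spec=1.0):
    """Conditional likelihood of one pooled response given the site effect u."""
    all_neg = math.prod(1.0 - expit(beta[0] + beta[1] * x + u) for x in xs)
    p_pos = sens * (1.0 - all_neg) + (1.0 - spec) * all_neg
    return p_pos if t == 1 else 1.0 - p_pos

def site_loglik(pools, beta, sigma, sens=1.0, spec=1.0):
    """Approximate log of the integral over u ~ N(0, sigma^2) for one site."""
    total = 0.0
    for node, weight in zip(GH_NODES, GH_WEIGHTS):
        u = math.sqrt(2.0) * sigma * node  # change of variable u = sqrt(2)*sigma*x
        total += weight * math.prod(
            pool_lik(t, xs, u, beta, sens, spec) for t, xs in pools)
    return math.log(total / _RPI)

def loglik(sites, beta, sigma, sens=1.0, spec=1.0):
    """Approximate observed-data log-likelihood: sum over independent sites."""
    return sum(site_loglik(pools, beta, sigma, sens, spec) for pools in sites)
```

In practice the objective `loglik` would be handed to a numerical optimizer, and many more (adaptively centered and scaled) quadrature points would be used; five fixed points are chosen here only to keep the example short and dependency-free.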
The MCEM approach is now described in our hierarchical setting. Treating the random effect u as missing, the complete data log-likelihood can be written as
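$$l_c(\theta \mid T, u) = \log P(T, u \mid \beta, \phi) = \log P(T \mid u; \beta) + \sum_{i=1}^{l} \Big\{ c_0 - \tfrac{1}{2} \log|D| - \tfrac{1}{2} u_i' D^{-1} u_i \Big\}, \quad \log P(T \mid u; \beta) = \sum_{i=1}^{l} \sum_{j=1}^{n_i} \log f_{ij}(c_{ij}, x_{ij}, u_i, \beta), \qquad (4)$$
where P (T, u|β, φ) is the density function of (T, u), P (T|u; β) is the conditional density of T given u, fij(cij, xij, ui, β) is defined as in (3), and c0 = -q log(2π)/2 is free of θ. We define
$$I_1 = \sum_{i=1}^{l} \sum_{j=1}^{n_i} \log f_{ij}(c_{ij}, x_{ij}, u_i, \beta), \qquad I_2 = -\frac{l}{2} \log|D| - \frac{1}{2} \sum_{i=1}^{l} u_i' D^{-1} u_i, \qquad (5)$$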
noting that I1 depends only on β and that I2 depends only on φ through D = D(φ). With the observed data T, the E-step involves calculating E(I1|T) and E(I2|T) for a given estimate θ(b) ≡ (β(b), φ(b)). These expectations cannot be calculated in closed form, so we approximate them using Monte Carlo simulation; that is, we generate a large number (M) of random draws from the conditional distribution f(u|T;θ(b)) and use the sample means of I1 and I2 to estimate E(I1|T) and E(I2|T), respectively. Because f(u|T; θ(b)) is not available in closed form, we use the Metropolis-Hastings (MH) algorithm to sample from it. Since pools from different sites are independent, it suffices to generate samples from f(ui|T; θ(b)), for each i = 1, 2, ..., l. We choose the distribution as the proposal function for the MH algorithm. Summarizing, the MCEM algorithm is implemented as follows:
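1. Choose a starting value θ(0) = (β(0), φ(0)).
2. (E-step). For a given b = 0, 1, 2, ..., approximate E(I1|T) and E(I2|T) by
$$\hat{E}(I_1 \mid T) = \frac{1}{M} \sum_{h=1}^{M} \sum_{i=1}^{l} \sum_{j=1}^{n_i} \log f_{ij}(c_{ij}, x_{ij}, u_i^{(h)}, \beta) \quad \text{and} \quad \hat{E}(I_2 \mid T) = \frac{1}{M} \sum_{h=1}^{M} \sum_{i=1}^{l} \Big\{ -\tfrac{1}{2} \log|D| - \tfrac{1}{2} u_i^{(h)\prime} D^{-1} u_i^{(h)} \Big\},$$
respectively, where $u_i^{(h)}$, h = 1, 2, ..., M, are M draws from the conditional distribution f(ui|T; θ(b)), i = 1, 2, ..., l, using the MH algorithm (see Web Appendix A).
3. (M-step). Maximize $\hat{E}(I_1 \mid T)$ with respect to β, obtaining a new estimate β(b+1). Maximize $\hat{E}(I_2 \mid T)$ with respect to φ, obtaining a new estimate φ(b+1). Set b = b + 1.
4. Repeat Steps 2 and 3 until ∥β(b+1) - β(b)∥ and ∥φ(b+1) - φ(b)∥ are sufficiently small.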
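The sampling part of the E-step and the variance update of the M-step can be sketched as follows for a scalar random intercept (q = 1, D = σ²) with a logit link. Two assumptions are made purely for illustration and should not be read as the implementation in Web Appendix A: the proposal is taken to be the N(0, σ²) random effect distribution, in which case the MH acceptance ratio reduces to a ratio of conditional likelihoods, and burn-in is handled by discarding extra initial draws. All function names and the `(t, xs)` data layout are hypothetical.

```python
import math
import random

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def site_lik(pools, u, beta, sens=1.0, spec=1.0):
    """Product over pools of the conditional pool likelihood given u."""
    lik = 1.0
    for t, xs in pools:
        all_neg = math.prod(1.0 - expit(beta[0] + beta[1] * x + u) for x in xs)
        p_pos = sens * (1.0 - all_neg) + (1.0 - spec) * all_neg
        lik *= p_pos if t == 1 else 1.0 - p_pos
    return lik

def mh_draws(pools, beta, sigma, M, rng, sens=1.0, spec=1.0, burn_frac=0.1):
    """Independence MH sampler for f(u_i | T; theta), proposal N(0, sigma^2)."""
    n_burn = int(burn_frac * M)
    u, lik_u = 0.0, site_lik(pools, 0.0, beta, sens, spec)
    draws = []
    for h in range(n_burn + M):
        u_new = rng.gauss(0.0, sigma)
        lik_new = site_lik(pools, u_new, beta, sens, spec)
        # Accept with probability min(1, lik_new / lik_u).
        if rng.random() * lik_u < lik_new:
            u, lik_u = u_new, lik_new
        if h >= n_burn:
            draws.append(u)
    return draws

def update_sigma(draws_by_site):
    """M-step for sigma when D = sigma^2: root mean square of all draws."""
    total = sum(u * u for draws in draws_by_site for u in draws)
    count = sum(len(draws) for draws in draws_by_site)
    return math.sqrt(total / count)
```

For the scalar case the maximizer of the Monte Carlo average of I2 is available in closed form, which is what `update_sigma` computes; the update for β has no closed form and would need a numerical maximizer applied to the Monte Carlo average of I1.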
2.3. Covariance Matrix Estimation
When one uses quadratures to find θ˄, the negative inverse Hessian at the last iteration of the Newton-Raphson maximization procedure provides an estimate of the covariance matrix of θ˄. With MCEM, an appeal to the missing information principle and the method of Louis (1982) gives the observed information matrix
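$$I(\theta) = -\frac{\partial^2 l(\theta \mid T)}{\partial\theta\, \partial\theta'} = E\Big\{ -\frac{\partial^2 l_c(\theta \mid T, u)}{\partial\theta\, \partial\theta'} \,\Big|\, T \Big\} - \mathrm{var}\Big\{ \frac{\partial l_c(\theta \mid T, u)}{\partial\theta} \,\Big|\, T \Big\}, \qquad \frac{\partial l_c}{\partial\theta} = \Big( \frac{\partial I_1}{\partial\beta'}, \frac{\partial I_2}{\partial\phi'} \Big)', \quad \frac{\partial^2 l_c}{\partial\theta\, \partial\theta'} = \mathrm{blockdiag}\Big( \frac{\partial^2 I_1}{\partial\beta\, \partial\beta'}, \frac{\partial^2 I_2}{\partial\phi\, \partial\phi'} \Big),$$
where l(θ|T) is the observed data log-likelihood defined in (2), lc(θ|T, u) is the complete data log-likelihood in (4), and the functions I1 and I2 are as given in (5). We estimate I(θ) using
$$\hat{I}(\theta) = -\frac{1}{M} \sum_{h=1}^{M} H^{(h)} - \Big\{ \frac{1}{M} \sum_{h=1}^{M} s^{(h)} s^{(h)\prime} - \bar{s}\, \bar{s}' \Big\}, \qquad \bar{s} = \frac{1}{M} \sum_{h=1}^{M} s^{(h)},$$
where $s^{(h)}$ and $H^{(h)}$ denote $\partial l_c / \partial\theta$ and $\partial^2 l_c / \partial\theta\, \partial\theta'$ evaluated at the hth set of draws, and the matrix expression is evaluated at θ = θ˄ using random samples $u_i^{(h)}$, i = 1, 2, ..., l, h = 1, 2, ..., M, generated from the conditional density P(u|T; θ˄) via the MH algorithm described in Web Appendix A. Regardless of which model fitting technique is used, standard errors are obtained from I(θ˄)^{-1}, making the construction of large sample Wald confidence intervals possible. Based on the asymptotic results from Nie (2007), one would expect such confidence intervals to be adequate for l large.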
3. Tests for Variance Components
We now consider the problem of testing whether individual random effects are present using the pooled responses. This is of practical interest because if there is no additional variation among sites, the analyst may wish to collapse over the sites and use the simpler regression model of Vansteelandt et al. (2000), which regards all effects as fixed. Herein, we concentrate on the q = 1 case and take zijk = 1, for all i, j, and k, so that the test of interest is
$$H_0: \sigma^2 = 0 \quad \text{versus} \quad H_1: \sigma^2 > 0,$$
where σ2 = var(ui). This is sufficient for our purposes as the Nebraska IPP data to be analyzed in Section 5 involves only potential site effects and no site-specific covariates. Generalizations for q > 1 are discussed in Section 6. In the literature, the test of H0 versus H1 has been described as “nonstandard” (Self and Liang, 1987), because the value of σ2 under H0 corresponds to a boundary point of the parameter space. The implementation of one-sided tests in a constrained parameter space has recently garnered attention from Molenberghs and Verbeke (2007). Based on their recommendations, we investigate likelihood ratio and score tests.
3.1. Likelihood ratio test
The likelihood ratio test statistic for H0 versus H1 is given by
$$T_{LR} = -2\{ l(\tilde{\beta}, 0) - l(\hat{\beta}, \hat{\sigma}^2) \},$$
where l(β, σ2) is the log-likelihood defined in (2) with φ = σ2 and θ = (β’, σ2)’, $(\hat{\beta}, \hat{\sigma}^2)$ is the MLE under the mixed model, and $\tilde{\beta}$ is the MLE of β under H0. Computing TLR requires one to fit both the mixed model in Section 2 and the fixed effects model of Vansteelandt et al. (2000). For the scalar variance case, the results of Self and Liang (1987) apply so that the asymptotic distribution of TLR is the two-component mixture $0.5\chi^2_0 + 0.5\chi^2_1$, where $\chi^2_0$ is a point mass distribution at 0 and $\chi^2_1$ denotes the χ2 distribution with 1 degree of freedom. Large values of TLR are evidence against H0.
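A p-value under the boundary mixture can be computed without special libraries. The sketch below assumes the equal-weight mixture of a point mass at 0 and a χ2 distribution with 1 degree of freedom, as in the scalar variance case of Self and Liang (1987), and uses the identity P(χ2 with 1 df > t) = erfc(√(t/2)). The function names are illustrative.

```python
import math

def chi2_1_sf(t):
    """Upper tail probability of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(t / 2.0))

def boundary_pvalue(stat):
    """P-value for a variance component test statistic under the
    0.5 * (point mass at 0) + 0.5 * chi-square(1) reference distribution."""
    if stat <= 0.0:
        return 1.0  # the statistic sits on the point mass at zero
    return 0.5 * chi2_1_sf(stat)

print(boundary_pvalue(2.7055))  # close to 0.05
print(boundary_pvalue(3.8415))  # close to 0.025
```

The α = 0.05 critical value under the mixture is therefore about 2.71 rather than the usual 3.84, so referring the statistic to a plain χ2 distribution with 1 degree of freedom would give a conservative test.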
3.2. Score test
We also propose a score test analogous to that of Liang (1987), who considers individual response data, but we adjust the test to acknowledge the one-sided nature of H1. Specifically, we reparameterize the random intercept for the ith site in (1) as ui ≡ αi = τ^{1/2}vi and rewrite (1) as
$$g(p_{ijk}) = x_{ijk}'\beta + \tau^{1/2} v_i,$$
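where v1, v2, ..., vl are assumed to be iid random variables with unspecified distribution F(·), E(vi) = 0, and var(vi) = 1. The parameter τ is interpreted as the variance of ui in (1) so that under this parametrization, the test for homogeneity among sites becomes H0 : τ = σ2 = 0. The log-likelihood contribution from the ith site is
$$l_i(\tau, \beta) = \log \int f_i(x_i, \tau^{1/2} v_i, \beta)\, dF(v_i),$$
where $x_i = (x_{i1}', x_{i2}', \ldots, x_{in_i}')'$,
$$f_i(x_i, \tau^{1/2} v_i, \beta) = \prod_{j=1}^{n_i} f_{ij}(c_{ij}, x_{ij}, \tau^{1/2} v_i, \beta),$$
and fij(cij, xij, τ^{1/2}vi, β) is defined as in (3) with ui = τ^{1/2}vi and zijk = 1. Define
$$S(\beta) = \sum_{i=1}^{l} \frac{\partial l_i}{\partial \tau} \Big|_{\tau = 0} = \frac{1}{2} \sum_{i=1}^{l} \Big[ \Big\{ \frac{\partial}{\partial \alpha_i} \log f_i(x_i, 0, \beta) \Big\}^2 + \frac{\partial^2}{\partial \alpha_i^2} \log f_i(x_i, 0, \beta) \Big].$$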
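Expressions for (∂/∂αi) log fi(xi, 0, β) and (∂²/∂αi²) log fi(xi, 0, β) are provided in Web Appendix B. The asymptotic variance of S(β) is equal to $\tilde{I}_{\tau\tau} = I_{\tau\tau} - I_{\tau\beta}' I_{\beta\beta}^{-1} I_{\tau\beta}$, where
$$I_{\tau\tau} = \sum_{i=1}^{l} E\Big\{ \Big( \frac{\partial l_i}{\partial \tau} \Big)^2 \Big\}, \quad I_{\tau\beta} = \sum_{i=1}^{l} E\Big\{ \frac{\partial l_i}{\partial \tau} \frac{\partial l_i}{\partial \beta} \Big\}, \quad I_{\beta\beta} = \sum_{i=1}^{l} E\Big\{ \frac{\partial l_i}{\partial \beta} \frac{\partial l_i}{\partial \beta'} \Big\}. \qquad (6)$$
A score statistic for H0 : σ2 = 0 versus H1 : σ2 > 0 is thus given by
$$T_S = \begin{cases} S(\hat{\beta})^2 / \tilde{I}_{\tau\tau}, & S(\hat{\beta}) > 0, \\ 0, & \text{otherwise}, \end{cases}$$
where β˄ is the MLE computed under H0 and $\tilde{I}_{\tau\tau}$ is computed by evaluating ∂li/∂τ, ∂li/∂β, and the expectations in (6) at τ = 0 and β = β˄. Closed form expressions for ∂li/∂τ and ∂li/∂β are provided in Web Appendix B. Closed form expressions for Iττ and Iτβ do not exist, so we approximate them using Monte Carlo simulation to sample pooled responses under H0 while preserving the original covariate values among all pools. Applying the results in Silvapulle and Silvapulle (1995), the asymptotic distribution of TS is the same two-component mixture $0.5\chi^2_0 + 0.5\chi^2_1$, with large values of TS being evidence against H0.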
One small technical detail warrants a remark. In our formulation of the score statistic, we calculate the estimated variance using expected information, whereas Molenberghs and Verbeke (2007) use observed information. We use expected information because calculating the second derivative of the log-likelihood l(θ) numerically can produce unstable results; in addition, the observed information matrix can be non-positive definite.
4. Simulation Evidence
4.1. Maximum Likelihood Estimation
We have performed simulations to assess the characteristics of the maximum likelihood estimates in a variety of situations encountered in large scale infectious disease studies. We first generated individual binary responses Yijk according to the simple logit model
$$\mathrm{logit}(p_{ijk}) = \beta_0 + \beta_1 x_{ijk} + u_i, \qquad (7)$$
where , θ = (β0, β1, σ)’ = (-5, 1, 1)’, and . These configurations provide a mean prevalence of about 1.4 percent (range: 0.1 to 5.0 percent). We generated N = 2000 (10000) individuals with l = 10 (40) sites and randomly assigned individual responses to pools within site to create simulated group responses Tij. For simplicity, we take cij = c and ni = n, for all i and j, so that the overall number of individuals is N = lnc. We assume throughout this section that γ1 = γ2 = 1 only because the results obtained from imperfect testing (γ1 < 1, γ2 < 1) were similar. Imperfect tests are considered in Section 5. To fit the model in (7), both adaptive quadratures and MCEM were used. Differences between the fits were negligible so we present the results from quadratures. For comparative purposes, we also fit the model in (7) to the individual simulated data (i.e., c = 1) using the glmm package in R. For each (N, l, n, c) combination, 500 data sets were simulated.
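The data-generating steps of this simulation can be sketched as below for a random intercept logit model with one covariate. The covariate distribution used here (standard normal) is an assumption made only so that the example runs; it is not taken from the text. Because individuals within a site are generated as iid draws given the site effect, filling pools sequentially is equivalent to random assignment. Function names and the `(t, xs)` layout are hypothetical.

```python
import math
import random

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_site(n_pools, pool_size, beta0, beta1, sigma, rng,
                  sens=1.0, spec=1.0):
    """Simulate pooled responses for one site; returns a list of (t, xs)."""
    u = rng.gauss(0.0, sigma)  # site-specific random intercept
    pools = []
    for _ in range(n_pools):
        # ASSUMPTION: covariates drawn from N(0, 1) for illustration only.
        xs = [rng.gauss(0.0, 1.0) for _ in range(pool_size)]
        ys = [1 if rng.random() < expit(beta0 + beta1 * x + u) else 0
              for x in xs]
        truly_positive = any(ys)  # pool is positive if any member is
        p_test_positive = sens if truly_positive else 1.0 - spec
        t = 1 if rng.random() < p_test_positive else 0
        pools.append((t, xs))
    return pools

rng = random.Random(2024)
# l = 10 sites, n = 20 pools per site, pool size c = 10, so N = 2000.
data = [simulate_site(20, 10, -5.0, 1.0, 1.0, rng) for _ in range(10)]
print(sum(t for site in data for t, _ in site), "positive pools out of 200")
```

Only the pooled responses and the covariates are kept, mirroring the setting in which the individual statuses are latent.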
The results from Table 1 demonstrate that, in terms of bias, the fixed effects estimates β˄0 and β˄1 from group testing do about as well as those from individual testing. The variance component estimate σ˄ from group testing tends to slightly underestimate the true σ, but this occurs for individual testing as well. We compare group testing and individual testing on the basis of N fixed, so it is not surprising that the relative mean squared error (RMSE); that is, the ratio of the MSE from individual testing to the MSE from group testing, is often less than unity. However, it is interesting to note that the RMSE for σ˄ is not largely affected by the pool size c. This is likely explained by the fact that the variance component σ is a site-level parameter, and the loss of information (if any) is small when pooling individuals within site. The estimated 95 percent Wald coverage probabilities are also given based on the 500 Monte Carlo data sets. These are mostly within the margin of error, suggesting that the variance estimates from Section 2.3 are adequate.
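The relative mean squared error can be approximated from the reported Monte Carlo means and standard deviations, taking MSE as squared bias plus variance. The sketch below does this for the intercept in the (N, l, n, c) = (2000, 10, 100, 2) configuration, with true value β0 = -5; because it works from rounded table entries, agreement with the tabulated RMSE is only approximate. The function name is illustrative.

```python
def relative_mse(mean_ind, sd_ind, mean_pool, sd_pool, truth):
    """Ratio of the individual-testing MSE to the group-testing MSE,
    with each MSE approximated by (bias)^2 + (std dev)^2."""
    mse_ind = (mean_ind - truth) ** 2 + sd_ind ** 2
    mse_pool = (mean_pool - truth) ** 2 + sd_pool ** 2
    return mse_ind / mse_pool

# Intercept, N = 2000, l = 10, n = 100, c = 2 (rounded entries from Table 1).
print(round(relative_mse(-5.04, 0.48, -5.06, 0.49, -5.0), 2))  # 0.95
```

A value below one indicates that, for fixed N, the pooled estimator is less precise than the individual-testing estimator, which is the pattern seen for the fixed effects.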
Table 1.
| N | l | n | c | | β˄0 IND | β˄0 POOL | β˄1 IND | β˄1 POOL | σ˄ IND | σ˄ POOL |
|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | 10 | 100 | 2 | Mean | -5.04 | -5.06 | 1.02 | 1.02 | 0.91 | 0.92 |
| | | | | Std dev | 0.48 | 0.49 | 0.26 | 0.32 | 0.37 | 0.36 |
| | | | | RMSE/Cov | | 0.95/0.94 | | 0.64/0.96 | | 1.07/0.97 |
| 2000 | 10 | 40 | 5 | Mean | -5.04 | -5.07 | 1.02 | 0.97 | 0.91 | 0.93 |
| | | | | Std dev | 0.48 | 0.54 | 0.26 | 0.52 | 0.37 | 0.36 |
| | | | | RMSE/Cov | | 0.76/0.94 | | 0.26/0.95 | | 1.07/0.97 |
| 2000 | 10 | 20 | 10 | Mean | -5.04 | -5.11 | 1.02 | 0.89 | 0.91 | 0.94 |
| | | | | Std dev | 0.48 | 0.59 | 0.26 | 0.74 | 0.37 | 0.37 |
| | | | | RMSE/Cov | | 0.65/0.96 | | 0.12/0.91 | | 1.01/0.98 |
| 10000 | 40 | 125 | 2 | Mean | -4.99 | -5.00 | 1.00 | 1.00 | 0.96 | 0.96 |
| | | | | Std dev | 0.22 | 0.23 | 0.11 | 0.14 | 0.17 | 0.18 |
| | | | | RMSE/Cov | | 0.94/0.94 | | 0.62/0.93 | | 0.99/0.92 |
| 10000 | 40 | 50 | 5 | Mean | -4.99 | -4.99 | 1.00 | 0.99 | 0.96 | 0.96 |
| | | | | Std dev | 0.22 | 0.25 | 0.11 | 0.20 | 0.17 | 0.18 |
| | | | | RMSE/Cov | | 0.82/0.94 | | 0.30/0.96 | | 0.98/0.93 |
| 10000 | 40 | 25 | 10 | Mean | -4.99 | -4.99 | 1.00 | 0.97 | 0.96 | 0.96 |
| | | | | Std dev | 0.22 | 0.26 | 0.11 | 0.30 | 0.18 | 0.18 |
| | | | | RMSE/Cov | | 0.71/0.94 | | 0.14/0.95 | | 1.00/0.93 |
We now present the results from a second simulation illustrating the performance of the MCEM algorithm with the model
$$\mathrm{logit}(p_{ijk}) = \beta_0 + \beta_1 x_{ijk} + u_{i1} + u_{i2} z_{ijk}, \qquad (8)$$
where ui = (ui1, ui2)’ follows a bivariate normal distribution with mean 0 = (0, 0)’, var(ui1) = σ1², var(ui2) = σ2², and corr(ui1, ui2) = ρ. Additionally, we take , zijk ∼ , β = (β0, β1)’ = (-4, 1)’, φ = (σ1, σ2, ρ)’ = (0.7, 0.5, 0)’, and θ = (β’, φ’)’. These configurations provide a mean prevalence of about 4.3 percent (range: 0.1 to 17.4 percent). We generated individual responses according to the model in (8) and created pools by random assignment as before; 200 simulated data sets were used for each (N, l, n, c) combination. For each data set, we initially took the number of Monte Carlo draws to be M = 1500 and increased M incrementally based on the recommendations in Booth and Hobert (1999). To prevent premature stopping, we used the convergence criteria
for 5 consecutive iterations with ε = 0.0001. For each iteration, 10 percent of the simulated random effects were discarded for “burn in” purposes. When convergence was reached, 100,000 random draws from P (u|T; θ˄) were used to estimate I(θ); see Web Appendix A. For a given simulated data set, we terminated the algorithm if our convergence criteria were not met after 80 iterations. Non-convergence mainly occurs when the pool size c is large and the number of sites l is small. For the worst case, presented in Table 2 with (l, n, c) = (20, 10, 10), we observed non-convergence and/or a non-positive definite information matrix for about 8 percent of the data sets and removed these from consideration. This behavior is unfortunate, but not unexpected, given the stochastic nature of the MCEM approach. Furthermore, additional investigation reveals that non-convergence is fueled by the relatively low prevalence settings used when simulating individual data. Of course, group testing is most applicable in low prevalence settings.
Table 2.
| N | l | n | c | | β˄0 | β˄1 | σ˄1 | σ˄2 | ρ˄ |
|---|---|---|---|---|---|---|---|---|---|
| 2000 | 20 | 50 | 2 | Mean | -3.99 | 1.01 | 0.65 | 0.48 | 0.03 |
| | | | | SD/SE | 0.34/0.27 | 0.17/0.16 | 0.24/0.23 | 0.13/0.13 | 0.42/0.41 |
| | | | | Cov | 0.92 | 0.93 | 0.98 | 0.97 | 0.91 |
| 2000 | 20 | 20 | 5 | Mean | -4.04 | 1.02 | 0.64 | 0.48 | -0.05 |
| | | | | SD/SE | 0.31/0.33 | 0.24/0.23 | 0.22/0.25 | 0.17/0.18 | 0.49/0.50 |
| | | | | Cov | 0.95 | 0.95 | 0.98 | 0.98 | 0.91 |
| 2000 | 20 | 10 | 10 | Mean | -4.14 | 1.04 | 0.63 | 0.50 | 0.09 |
| | | | | SD/SE | 0.43/0.45 | 0.40/0.34 | 0.23/0.28 | 0.21/0.27 | 0.54/0.69 |
| | | | | Cov | 0.97 | 0.92 | 0.96 | 0.96 | 0.93 |
| 4000 | 40 | 50 | 2 | Mean | -4.01 | 1.01 | 0.69 | 0.49 | 0.00 |
| | | | | SD/SE | 0.19/0.19 | 0.12/0.11 | 0.15/0.16 | 0.09/0.09 | 0.33/0.27 |
| | | | | Cov | 0.95 | 0.93 | 0.96 | 0.96 | 0.88 |
| 4000 | 40 | 20 | 5 | Mean | -3.99 | 0.99 | 0.67 | 0.48 | 0.03 |
| | | | | SD/SE | 0.23/0.23 | 0.17/0.16 | 0.18/0.17 | 0.12/0.13 | 0.32/0.34 |
| | | | | Cov | 0.92 | 0.93 | 0.95 | 0.94 | 0.97 |
| 4000 | 40 | 10 | 10 | Mean | -3.99 | 0.99 | 0.66 | 0.47 | -0.04 |
| | | | | SD/SE | 0.30/0.30 | 0.25/0.23 | 0.19/0.19 | 0.16/0.20 | 0.48/0.54 |
| | | | | Cov | 0.93 | 0.94 | 0.97 | 0.94 | 0.90 |
The results from Table 2 show that the fixed effects estimates β˄0 and β˄1 from MCEM generally have small bias. The random effects standard deviations σ˄1 and σ˄2 do tend to underestimate the true standard deviations, but not by much. The correlation estimate ρ˄ is generally on target but is somewhat more variable than the fixed effect estimates. We also report in Table 2 numerical evidence that the information matrix estimate I(θ˄) is adequate. This can be seen by the relative closeness of “SD” and “SE,” the sample standard deviation of the 200 maximum likelihood estimates θ˄ and the average estimated standard error from I(θ˄)^{-1}, respectively. With few exceptions, the estimated 95 percent Wald interval confidence coefficients for all parameters are within the margin of Monte Carlo error.
4.2. Tests for Homogeneity
We next characterize the performance of the likelihood ratio and score tests for homogeneity discussed in Section 3. Using the model in (7) with β = (β0, β1)’ = (-4, 1)’, , and , Table 3 shows the estimated size and power of the α = 0.05 tests for σ = 0, 0.2, 0.3, ..., 0.6. For each (N, l, n, c, σ) combination, we generated 500 data sets and created pools using random assignment as before; the estimated power in Table 3 is the proportion of times H0 : σ2 = 0 is rejected out of 500. For the score test, the expectations Iττ and Iτβ in (6) were approximated using 2000 replicates as described in Section 3.2. The results suggest that both tests confer the correct size; estimated nominal α = 0.05 sizes are all within the margin of Monte Carlo error. Powers for both tests increase in nearly the same manner, although the score test has a slight advantage for σ close to 0. This finding is congruous with the locally most powerful result described in Lin (1997) for individual testing when q = 1.
Table 3.
| N | l | n | c | Test | σ = 0 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | 20 | 50 | 2 | TLR | 0.03 | 0.10 | 0.30 | 0.55 | 0.78 | 0.89 |
| | | | | TS | 0.04 | 0.13 | 0.36 | 0.63 | 0.83 | 0.92 |
| 2000 | 20 | 20 | 5 | TLR | 0.04 | 0.08 | 0.24 | 0.45 | 0.75 | 0.88 |
| | | | | TS | 0.05 | 0.12 | 0.28 | 0.51 | 0.80 | 0.88 |
| 2000 | 20 | 10 | 10 | TLR | 0.03 | 0.08 | 0.19 | 0.38 | 0.61 | 0.74 |
| | | | | TS | 0.05 | 0.11 | 0.26 | 0.45 | 0.69 | 0.80 |
| 4000 | 40 | 50 | 2 | TLR | 0.04 | 0.22 | 0.53 | 0.83 | 0.94 | 0.99 |
| | | | | TS | 0.06 | 0.27 | 0.59 | 0.88 | 0.97 | 1.00 |
| 4000 | 40 | 20 | 5 | TLR | 0.05 | 0.14 | 0.38 | 0.75 | 0.93 | 0.98 |
| | | | | TS | 0.05 | 0.18 | 0.44 | 0.80 | 0.95 | 0.99 |
| 4000 | 40 | 10 | 10 | TLR | 0.03 | 0.11 | 0.32 | 0.57 | 0.80 | 0.92 |
| | | | | TS | 0.05 | 0.16 | 0.38 | 0.67 | 0.88 | 0.97 |
5. Nebraska IPP Data
The state of Nebraska takes part in the nationwide IPP through its Sexually Transmitted Diseases and Infertility Control Program. At l = 78 clinic sites throughout the state, urine or swab (cervical or male urethra) specimens are collected on each individual and are transported to the Nebraska Public Health Laboratory (NPHL) in Omaha for chlamydia and gonorrhea testing. More than 30,000 individual tests are performed annually by the NPHL; we use the individual testing results from the first quarter of 2006 to test our group testing methods. The data set consists of chlamydia and gonorrhea infection statuses (infected/not) for 6,138 subjects, as well as several risk covariates. The number of subjects within site varies from 1 to 540. The sample prevalence for chlamydia and gonorrhea is 7.8 percent and 1.7 percent, respectively, making group testing potentially attractive as a means of surveillance.
We consider chlamydia and gonorrhea infections separately. While group testing is not used currently in Nebraska, it is being used elsewhere for chlamydia and gonorrhea screening, as noted in Section 1. In fact, Lindan et al. (2005) estimate that approximately 12 percent of laboratories in the US use group testing for detecting chlamydia. When compared to individual testing, Kacena et al. (1998b) and Kacena et al. (1998a) report that pooling can reduce testing costs by 39 and 60 percent for chlamydia and gonorrhea, respectively.
We use pool sizes c = 2, 5, 8 and construct pools in the following way. If site i has ni subjects, we first create [ni/c] pools by assigning subjects to pools at random. Any remaining subjects are assigned to a smaller-sized pool. For example, if there are 23 subjects available for one site and c = 5, 4 pools of size 5 and 1 pool of size 3 are created for that site. To incorporate the effects of imperfect testing, pooled responses are recorded assuming that assay sensitivity and specificity are γ1 = 0.95 and γ2 = 0.98, respectively. These values are reasonable for both infections based on the empirical findings of Kacena et al. (1998a) and Kacena et al. (1998b). To cover a large number of the possible arrangements of individuals within site, we simulate 100 sets of pools. Among the available covariates from the Nebraska data set, we chose age, gender, urethritis status, and infection symptoms status, based on a variable selection process with the individual data. For each data set and for each infection, we fit the model
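The pool construction rule can be sketched in a few lines: shuffle the subjects at a site, cut the shuffled list into consecutive blocks of size c, and let the final block hold any remainder. The function name `make_pools` and the use of integer subject identifiers are illustrative choices.

```python
import random

def make_pools(subject_ids, c, rng):
    """Randomly assign a site's subjects to pools of size c; any
    remaining subjects form one smaller pool."""
    ids = list(subject_ids)
    rng.shuffle(ids)  # random assignment of subjects to pools
    return [ids[start:start + c] for start in range(0, len(ids), c)]

pools = make_pools(range(23), 5, random.Random(1))
print([len(pool) for pool in pools])  # [5, 5, 5, 5, 3]
```

Repeating the call with different seeds gives different arrangements of individuals within a site, which is the role played by the 100 simulated sets of pools.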
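$$\mathrm{logit}(p_{ijk}) = \beta_0 + \beta_1 \mathrm{Age}_{ijk} + \beta_2 \mathrm{Gender}_{ijk} + \beta_3 \mathrm{Urethritis}_{ijk} + \beta_4 \mathrm{Symptoms}_{ijk} + u_i, \qquad (9)$$
where $u_i \sim N(0, \sigma^2)$ is the random clinic site effect. We do not consider variable selection issues in this paper, although this is a good topic for future research with group testing models.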
Table 4 displays the results from fitting the model in (9). For each infection, we provide the parameter estimates based on the raw individual data. Probability values for the score and likelihood ratio tests of H0 : σ2 = 0 were highly significant for both infections (p < 0.01) based on the individual data. Estimates and standard errors from pooled testing in Table 4 are obtained by averaging over the 100 simulated sets of pools. There is a general agreement between the estimates from individual testing and pooled testing; those from pooled testing are more variable, but this is expected. For each set of pools, we performed the α = 0.05 level likelihood ratio and score tests as described in Section 3. For each c > 1, the proportion of times H0 is rejected is also given in Table 4 for the score test (likelihood ratio test results were similar). Above all, geographic heterogeneity is apparent for both infections using the pooled responses.
Table 4.
Entries are estimates with standard errors in parentheses.

| Infection | c | β˄0 | β˄1 | β˄2 | β˄3 | β˄4 | σ˄ | Reject |
|---|---|---|---|---|---|---|---|---|
| Chlamydia | 1 | -2.45 (0.26) | -0.05 (0.01) | 0.68 (0.14) | 0.87 (0.25) | 0.46 (0.12) | 0.46 (0.09) | - |
| Chlamydia | 2 | -2.46 (0.35) | -0.05 (0.01) | 0.65 (0.18) | 0.86 (0.31) | 0.50 (0.16) | 0.46 (0.09) | 1.00 |
| Chlamydia | 5 | -2.65 (0.44) | -0.03 (0.02) | 0.47 (0.24) | 0.93 (0.46) | 0.57 (0.24) | 0.39 (0.10) | 0.88 |
| Chlamydia | 8 | -2.89 (0.45) | -0.01 (0.02) | 0.31 (0.26) | 0.96 (0.60) | 0.58 (0.29) | 0.32 (0.11) | 0.77 |
| Gonorrhea | 1 | -4.88 (0.50) | -0.02 (0.01) | 0.25 (0.26) | 1.57 (0.34) | 0.95 (0.23) | 0.99 (0.24) | - |
| Gonorrhea | 2 | -5.42 (0.62) | -0.01 (0.02) | 0.48 (0.34) | 1.39 (0.39) | 1.04 (0.30) | 0.90 (0.27) | 1.00 |
| Gonorrhea | 5 | -6.21 (0.73) | 0.00 (0.02) | 0.79 (0.45) | 1.20 (0.51) | 1.19 (0.42) | 0.83 (0.27) | 1.00 |
| Gonorrhea | 8 | -6.62 (0.81) | -0.01 (0.03) | 0.93 (0.49) | 1.01 (0.65) | 1.42 (0.50) | 0.83 (0.27) | 0.96 |
6. Discussion
We have generalized the work of Vansteelandt et al. (2000) to incorporate random effects into group testing regression models and have illustrated the usefulness of this extension using chlamydia and gonorrhea data collected by the state of Nebraska. We have presented methods for maximum likelihood estimation, for large sample inference, and for assessing homogeneity in the latent binary responses that arise with group testing. For future research, it may be of interest to investigate different pool composition strategies, as was done by Vansteelandt et al. (2000) and Bilder and Tebbs (2008) in fixed effects models. Vansteelandt et al. (2000) showed that forming x-homogeneous pools (i.e., pools whose covariate values are as similar as possible) minimizes the amount of lost information due to pooling and provides the best fixed effects parameter estimates. In models with random effects, it is not clear how x-homogeneous and/or z-homogeneous compositions would affect the parameter estimates.
Tests for variance components in Section 3 have been examined in the scalar variance case, but more general tests of variance components could be formulated using the results from Self and Liang (1987) and Silvapulle and Silvapulle (1995). For example, in (8), one might wish to test or possibly , while leaving unspecified. Such examples fall into a larger class of tests involving the unique components of D. In higher dimensional settings, likelihood ratio and score statistics continue to follow mixture χ2 distributions, but with possibly a larger number of components.
Our hierarchical formulation requires that individuals are pooled within site, but this assumption may not be practical in some applications. A useful extension of this work would be to allow for subjects from different sites to be pooled together. In this setting, fitting the model using quadratures would most likely be computationally infeasible, but the MCEM approach may prove fruitful. Even so, when compared to pooling subjects within site, we would expect both the loss in precision of the random effects estimates and the loss in power of the homogeneity tests to be significant. Although we have introduced random effects models for group testing using only the initial pool results, our work can be extended to handle the multistage pooling designs presented in Brookmeyer (1999). When γ1 = γ2 = 1 (perfect testing), this extension is almost automatic, but is perhaps not practical in infectious disease contexts. When imperfect tests are used, conceptually our methodology is applicable, although the likelihood function is far more complicated. We would expect this to be also true for other pooling strategies such as array-based pooling (see, e.g., Kim et al., 2007).
Finally, we are currently pursuing the development of techniques to simultaneously model the prevalence of multiple infections within a group testing regression framework, thereby extending the work of Hughes-Oliver and Rosenberger (2000) to incorporate covariate information. Such regression models have the potential to find widespread use, because many epidemiological investigations using group testing do involve multiple infections and disease statuses are often correlated.
Supplementary Material
Acknowledgements
The authors are grateful to the Editor, the Associate Editor, and the two anonymous referees for their helpful comments. We especially thank Dr. Peter Iwen, Dr. Steven Hinrichs, and Philip Medina for their consultation on the IPP. We also thank Drs. Xianzheng Huang, Kerrie Nelson, and Geert Verbeke for their comments on an earlier draft of this manuscript. This research was supported by Grant R01 AI067373 from the National Institutes of Health.
References
- Bilder C, Tebbs J. Bias, efficiency, and agreement for group-testing regression models. Journal of Statistical Computation and Simulation. 2008; in press. doi: 10.1080/00949650701608990.
- Booth J, Hobert J. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society, Series B. 1999;61:265–285.
- Booth J, Hobert J, Jank W. A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model. Statistical Modelling. 2001;1:333–349.
- Brookmeyer R. Analysis of multistage pooling studies of biological specimens for estimating disease incidence and prevalence. Biometrics. 1999;55:608–612. doi: 10.1111/j.0006-341x.1999.00608.x.
- Cardoso M, Koerner K, Kubanek B. Mini-pool screening by nucleic acid testing for hepatitis B virus, hepatitis C virus, and HIV: Preliminary results. Transfusion. 1998;38:905–907. doi: 10.1046/j.1537-2995.1998.381098440853.x.
- Chen J, Zhang D, Davidian M. A Monte Carlo EM algorithm for generalized linear mixed models with flexible random effects distribution. Biostatistics. 2002;3:347–360. doi: 10.1093/biostatistics/3.3.347.
- Dorfman R. The detection of defective members of large populations. Annals of Mathematical Statistics. 1943;14:436–440.
- Hughes-Oliver J, Rosenberger W. Efficient estimation of the prevalence of multiple rare traits. Biometrika. 2000;87:315–327.
- Kacena K, Quinn S, Hartman S, Quinn T, Gaydos C. Pooling of urine samples for screening for Neisseria gonorrhoeae by ligase chain reaction: Accuracy and application. Journal of Clinical Microbiology. 1998a;36:3624–3628. doi: 10.1128/jcm.36.12.3624-3628.1998.
- Kacena K, Quinn S, Howell M, Madico G, Quinn T, Gaydos C. Pooling urine samples for ligase chain reaction screening for genital Chlamydia trachomatis infection in asymptomatic women. Journal of Clinical Microbiology. 1998b;36:481–485. doi: 10.1128/jcm.36.2.481-485.1998.
- Kim H, Hudgens M, Dreyfuss J, Westreich D, Pilcher C. Comparison of group testing algorithms for case identification in the presence of testing error. Biometrics. 2007;63:1152–1163. doi: 10.1111/j.1541-0420.2007.00817.x.
- Liang K. A locally most powerful test for homogeneity with many strata. Biometrika. 1987;74:259–264.
- Lin X. Variance component testing in generalised linear models with random effects. Biometrika. 1997;84:309–326.
- Lindan C, Mathur M, Kumta S, Jerajani H, Gogate A, Schachter J, Moncada J. Utility of pooled urine specimens for detection of Chlamydia trachomatis and Neisseria gonorrhoeae in men attending public sexually transmitted infection clinics in Mumbai, India, by PCR. Journal of Clinical Microbiology. 2005;43:1674–1677. doi: 10.1128/JCM.43.4.1674-1677.2005.
- Louis T. Finding observed information using the EM algorithm. Journal of the Royal Statistical Society, Series B. 1982;44:98–130.
- McCulloch E. Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association. 1997;92:162–170.
- Molenberghs G, Verbeke G. Likelihood ratio, score, and Wald tests in a constrained parameter space. The American Statistician. 2007;61:22–27.
- Nie L. Convergence rate of MLE in generalized linear and nonlinear mixed-effects models: Theory and applications. Journal of Statistical Planning and Inference. 2007;137:1787–1804.
- Pilcher C, Fiscus S, Nguyen T, Foust E, Wolf L, Williams D, Ashby R, O’Dowd J, McPherson J, Stalzer B, Hightow L, Miller W, Eron J, Cohen M, Leone P. Detection of acute infections during HIV testing in North Carolina. New England Journal of Medicine. 2005;352:1873–1883. doi: 10.1056/NEJMoa042291.
- Pinheiro J, Bates D. Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics. 1995;4:12–35.
- Rours G, Verkooyen R, Willemse H, van der Zwaan E, van Belkum A, de Groot R, Verbrugh H, Ossewaarde J. Use of pooled urine samples and automated DNA isolation to achieve improved sensitivity and cost-effectiveness of large-scale testing for Chlamydia trachomatis in pregnant women. Journal of Clinical Microbiology. 2005;43:4684–4690. doi: 10.1128/JCM.43.9.4684-4690.2005.
- Screening and Treatment Guidelines, Infertility Prevention Project, Region VII. 2003. Available at http://www.devsys.org/html/ipp/index.html.
- Self S, Liang K. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association. 1987;82:605–610.
- Silvapulle M, Silvapulle P. A score test against one-sided alternatives. Journal of the American Statistical Association. 1995;90:342–349.
- Vansteelandt S, Goetghebeur E, Verstraeten T. Regression models for disease prevalence with diagnostic tests on pools of serum samples. Biometrics. 2000;56:1126–1133. doi: 10.1111/j.0006-341x.2000.01126.x.
- Xie M. Regression analysis of group testing samples. Statistics in Medicine. 2001;20:1957–1969. doi: 10.1002/sim.817.
- Zagurski K. Omaha World Herald. 2006 Feb 2;:08B.