Abstract
Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of $n/c$ groups that include $n$ independent individuals in total. If the unknown prevalence is low and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all $n$ samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable $T$ is less than an observed screening time $C$, a data structure known as current status data. Given sufficient variation in the observed $C$ values, it is possible to estimate the distribution function $F$ of $T$ nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of $F$ based on group-tested current status data for groups of size $c$, where the group tests positive if and only if any individual's unobserved $T$ is less than the corresponding observed $C$. We investigate the performance of the group-based estimator as compared to the individual test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of $F$. Potential applications include testing for the presence of various diseases in pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done both at random and based on maternal age. We discuss connections to other work in the literature, as well as potential extensions.
Keywords: Current status data, Expectation-maximization algorithm, Group testing, Pool-adjacent-violators algorithm
1. Introduction
In the past decade, group testing of a binary response has once again become a topic of great interest (Remlinger et al., 2006; Wahed et al., 2006; Dhand et al., 2010). The idea was first introduced in 1943 as a potential cost-saving measure for the detection of syphilis in U.S. army recruits (Dorfman, 1943). Group testing reduces the number of tests by allocating, randomly or otherwise, $n$ individuals into $n/c$ groups of equal size $c$ and testing each pooled group only once, in order to provide an estimate of the prevalence of a binary characteristic in a population.
More recent work has considered potential issues with group testing, such as dilution effects, non-random group assignment, and misclassification (Hwang, 1976; Wein & Zenios, 1996; Delaigle & Hall, 2012; Liu et al., 2012). Tu et al. (1995) suggested that if the unknown prevalence of a binary characteristic is sufficiently low and the screening test suffers from misclassification, more precise estimates of the prevalence can be obtained from $n/c$ group tests than by testing all $n$ individuals separately. The intuition behind this finding is as follows. When a test has a rate of misclassification independent of the number of individuals in the pooled sample, performing fewer tests introduces less misclassification noise into the observations, which can increase the precision of the prevalence estimate. This is particularly the case when the prevalence is sufficiently small, so that it is uncommon for two positives to occur in the same group.
The data structure in which an individual's binary response corresponds to an underlying time-to-event variable $T$ occurring before an observed screening time $C$ is known as current status data, or interval censoring of type I (Jewell & van der Laan, 2003; Jewell & Emerson, 2013). The nonparametric maximum likelihood estimator of the distribution function, $F$, of $T$ for current status data is computed by the pool-adjacent-violators algorithm, although this estimator is only useful if there is sufficient variation in the observed screening times (Ayer et al., 1955).
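As an illustration of this classical estimator, the pool-adjacent-violators algorithm can be implemented in a few lines. The following Python sketch, with function and variable names of our own choosing and purely illustrative toy data, computes the current status nonparametric maximum likelihood estimator by isotonic regression of the binary indicators on the sorted screening times.

```python
import numpy as np

def pava(values, weights):
    """Weighted pool-adjacent-violators: non-decreasing fit to `values`."""
    # Maintain blocks of (weighted mean, total weight, size); merge adjacent
    # blocks while the monotonicity constraint is violated.
    means, wts, sizes = [], [], []
    for v, w in zip(values, weights):
        means.append(v); wts.append(w); sizes.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, s2 = means.pop(), wts.pop(), sizes.pop()
            means[-1] = (wts[-1] * means[-1] + w2 * m2) / (wts[-1] + w2)
            wts[-1] += w2
            sizes[-1] += s2
    out = []
    for m, s in zip(means, sizes):
        out.extend([m] * s)
    return np.array(out)

# Current status NPMLE: isotonic regression of y_i = 1(t_i <= c_i)
# on the screening times c_i, taken in increasing order.
c = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
order = np.argsort(c)
F_hat = pava(y[order], np.ones(len(y)))
print(F_hat)  # non-decreasing step estimates of F at the sorted c values
```

The adjacent violating pair $(1, 0)$ at the second and third screening times is pooled to its average, yielding a monotone step function.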
In this paper, we develop a simple algorithm to compute a nonparametric maximum likelihood estimator of $F$ for group-tested current status data, and extend it to settings where the test is subject to misclassification. When misclassification is present, we hypothesize that there will sometimes be substantial gains in precision at values of $t$ for which $F(t)$ is sufficiently small, as described by Tu et al. (1995) in the case of estimating a single fixed prevalence.
2. Notation and likelihood function
We assume that the underlying data, prior to grouping, arise from $n$ independent realizations of a bivariate random variable, $(T, C)$, where the survival random variable $T$ and the screening random variable $C$ follow distribution functions $F$ and $G$, respectively. Throughout, we assume that $T$ and $C$ are independent. The observed data are based on grouping these realizations at random into blocks of size $c$, where for convenience we assume that $n/c$ is an integer. It is trivial to extend all the results below to situations where the block sizes may vary. Thus each original unit corresponds to the $i$th individual in the $j$th group, where $i = 1, \ldots, c$ and $j = 1, \ldots, n/c$. The group-tested result from the $j$th group, $Z_j$, is the only test result available, whereas individual screening times, $C_{ij}$, are observed for all participants. Specifically, $Z_j = 0$ if and only if $T_{ij} > C_{ij}$ for all $i = 1, \ldots, c$, and $Z_j = 1$ otherwise. The group test detects the presence of one or more positives in the group, but cannot distinguish between a single, or several, positive individuals. The immediate goal is to estimate the distribution function $F$.
Owing to the assumed independence of $T$ and $C$, we can focus on the conditional likelihood of the data given the observed screening times $c_{ij}$. Since $\mathrm{pr}(Z_j = 0) = \prod_{i=1}^{c} S(c_{ij})$, this conditional likelihood is

$$L(F) = \prod_{j=1}^{n/c} \Bigl\{ 1 - \prod_{i=1}^{c} S(c_{ij}) \Bigr\}^{z_j} \Bigl\{ \prod_{i=1}^{c} S(c_{ij}) \Bigr\}^{1 - z_j}, \tag{1}$$

where $S = 1 - F$ is the survival function of $T$. This conditional likelihood applies to various methods of selecting the screening times $C$ and assigning the observations to groups for testing. At one extreme, the $C$ values in each group are selected completely at random; at the other end of the spectrum, individuals with a common value of $C$ are assigned to the same group. The latter sampling scheme is only fully feasible if the distribution function $G$ is discrete. While the estimation strategy pursued here applies generally, estimation is much simpler with a common $C$ value in each group, and asymptotic properties of the estimator are more easily derived in that case. For example, with a common value of $C$ in each grouping of fixed group size $c$, the likelihood (1) simplifies to that for the standard current status data problem with underlying survival function $S^c$. Estimates and inference regarding $S^c$ can then be immediately translated to corresponding statements regarding $S$, and hence $F$, itself. In practice, with a continuous $G$, it may be advantageous to group together individuals with approximately the same value of $C$.
This development assumes a perfect screening test of whether or not the true group test result, $Z_j$, was positive. We can extend these ideas to permit misclassification of the test results, and we now use the notation $Z_j^*$ to distinguish the potentially misclassified test result from the true result $Z_j$. Assume that the test has known sensitivity and specificity, independent of both the screening time $C$ and the group size, given by $\alpha = \mathrm{pr}(Z_j^* = 1 \mid Z_j = 1)$ and $\beta = \mathrm{pr}(Z_j^* = 0 \mid Z_j = 0)$, with the assumption that $\alpha + \beta > 1$. Then the conditional likelihood of the potentially misclassified data, given the observed screening times $c_{ij}$, can be written as

$$L^*(F) = \prod_{j=1}^{n/c} \bigl\{ \alpha (1 - \gamma_j) + (1 - \beta) \gamma_j \bigr\}^{z_j^*} \bigl\{ (1 - \alpha)(1 - \gamma_j) + \beta \gamma_j \bigr\}^{1 - z_j^*},$$

where $\gamma_j = \prod_{i=1}^{c} S(c_{ij}) = \mathrm{pr}(Z_j = 0)$.
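This conditional likelihood is straightforward to evaluate numerically. The following Python sketch, with function and variable names of our own and purely illustrative toy values, computes its logarithm; setting $\alpha = \beta = 1$ recovers the perfectly classified likelihood (1).

```python
import numpy as np

def group_loglik(F_vals, z_star, groups, alpha=1.0, beta=1.0):
    """Log-likelihood of (possibly misclassified) group test results.

    F_vals : array of F(c_ij), one entry per individual
    z_star : observed group results, one per group
    groups : list of index arrays giving the individuals in each group
    """
    ll = 0.0
    for z, idx in zip(z_star, groups):
        gamma = np.prod(1.0 - F_vals[idx])               # pr(true group result = 0)
        p_pos = alpha * (1 - gamma) + (1 - beta) * gamma  # pr(observed positive)
        ll += np.log(p_pos) if z == 1 else np.log(1 - p_pos)
    return ll

# Two groups of size 2; with the default alpha = beta = 1 this is likelihood (1).
F_vals = np.array([0.3, 0.3, 0.1, 0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
print(group_loglik(F_vals, [1, 0], groups))
```

In an applied setting this function would be evaluated at the candidate solutions returned from different starting values, in order to compare their achieved likelihoods.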
3. An expectation-maximization pool-adjacent-violators algorithm
3.1. Development of the algorithm
Group-tested current status data can be formulated as a missing data problem. First, consider the setting without misclassification of test results. While the full set of screening times is observed, only group-tested results $z_j$ are available, whereas a complete dataset would include all individual test results, $y_{ij}$. This missing information setting naturally allows use of the expectation-maximization algorithm (Dempster et al., 1977).
To implement the expectation-maximization algorithm, we calculate the expected value of the true individual test result, $Y_{ij} = 1(T_{ij} \le c_{ij})$, given the observed value of the group-tested result, $Z_j$, based on a current estimate of $F$. These calculations are straightforward when there is no misclassification:

$$E(Y_{ij} \mid Z_j = 0) = 0, \tag{2}$$

$$E(Y_{ij} \mid Z_j = 1) = \frac{F(c_{ij})}{1 - \prod_{k=1}^{c} S(c_{kj})}. \tag{3}$$

For misclassified data with sensitivity $\alpha$ and specificity $\beta$, computing the expected value of an individual true disease status $Y_{ij}$ given the potentially misclassified observed group-test result $Z_j^*$ becomes slightly more complicated; see the Supplementary Material. Letting $\gamma_j = \prod_{i=1}^{c} S(c_{ij})$, this step becomes

$$E(Y_{ij} \mid Z_j^* = z_j^*) = \frac{\alpha^{z_j^*} (1 - \alpha)^{1 - z_j^*} F(c_{ij})}{\{\alpha (1 - \gamma_j) + (1 - \beta) \gamma_j\}^{z_j^*} \{(1 - \alpha)(1 - \gamma_j) + \beta \gamma_j\}^{1 - z_j^*}}.$$
For the maximization step, we simply use a weighted version of the pool-adjacent-violators algorithm on the full dataset of individual observations. An individual contributes the observation $y_{ij} = 0$ with weight 1 if $z_j = 0$, per (2). On the other hand, according to (3), if $z_j = 1$, then the individual contributes the observation $y_{ij} = 1$ with weight given by the right-hand side of (3), together with an additional observation $y_{ij} = 0$ whose weight is 1 minus the right-hand side of (3). The complete algorithm is thus described as follows.
Step 1 (Initialization). Initialize values of $\hat F(c_{ij})$ for each individual and set a threshold $\epsilon$ for convergence.

Step 2 (Expectation). For each individual $i$ in group $j$, calculate the probability $p_{ij}$ that the individual tested positive, given their group's test result. For perfectly classified results, $Z_j$, use

$$p_{ij} = z_j \frac{\hat F(c_{ij})}{1 - \prod_{k=1}^{c} \{1 - \hat F(c_{kj})\}}. \tag{4}$$

For group-tested results subject to misclassification, $Z_j^*$, with sensitivity $\alpha$ and specificity $\beta$ such that $\alpha + \beta > 1$, use

$$p_{ij} = \frac{\alpha^{z_j^*} (1 - \alpha)^{1 - z_j^*} \hat F(c_{ij})}{\{\alpha (1 - \hat\gamma_j) + (1 - \beta) \hat\gamma_j\}^{z_j^*} \{(1 - \alpha)(1 - \hat\gamma_j) + \beta \hat\gamma_j\}^{1 - z_j^*}}, \tag{5}$$

where $\hat\gamma_j = \prod_{i=1}^{c} \{1 - \hat F(c_{ij})\}$.
Step 3 (Maximization). Use the group-tested results, $z_j$ or $z_j^*$, as the observations for each individual, and the probabilities from Step 2 as the weights in the weighted pool-adjacent-violators algorithm to calculate updated estimates of $\hat F(c_{ij})$.
Step 4. Repeat Steps 2 and 3, using the estimate of $\hat F$ from Step 3 as the initial value for Step 2, until convergence, for example until the sum of the squared differences between successive estimates falls below $\epsilon$.

It is important to run the algorithm with several choices of starting values, not only to reduce the possibility of converging to a local extremum, but also to discover possibly different nonunique versions of the nonparametric maximum likelihood estimator. We recommend choosing a large set of random starting values of $\hat F$ at the observed screening times by generating random Un$[0, 1]$ values ordered so that the starting values are monotonically increasing with $c_{ij}$.
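The algorithm described in Steps 1–4 above can be sketched in Python as follows. This is an illustrative implementation under our notation, with function names of our own; it is not production code, and the grouping is assumed to be encoded by integer labels $0, \ldots, n/c - 1$.

```python
import numpy as np

def pava(values, weights):
    """Weighted pool-adjacent-violators: non-decreasing fit to `values`."""
    means, wts, sizes = [], [], []
    for v, w in zip(values, weights):
        means.append(v); wts.append(w); sizes.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, s2 = means.pop(), wts.pop(), sizes.pop()
            means[-1] = (wts[-1] * means[-1] + w2 * m2) / (wts[-1] + w2)
            wts[-1] += w2
            sizes[-1] += s2
    return np.concatenate([np.full(s, m) for m, s in zip(means, sizes)])

def em_pava(c, z_star, group, alpha=1.0, beta=1.0, eps=1e-8, max_iter=5000):
    """EM-PAVA for group-tested current status data (one random start).

    c      : screening times, one per individual
    z_star : observed (possibly misclassified) group results, indexed by label
    group  : integer group label of each individual
    """
    order = np.argsort(c)
    F = np.empty(len(c))
    F[order] = np.sort(np.random.uniform(size=len(c)))  # monotone random start
    for _ in range(max_iter):
        # E-step: pr(individual positive | group result), per (4)-(5).
        p = np.empty_like(F)
        for j in np.unique(group):
            idx = group == j
            gamma = np.prod(1.0 - F[idx])
            if z_star[j] == 1:
                num, den = alpha * F[idx], alpha * (1 - gamma) + (1 - beta) * gamma
            else:
                num, den = (1 - alpha) * F[idx], (1 - alpha) * (1 - gamma) + beta * gamma
            p[idx] = num / den
        # M-step: each individual contributes y=1 with weight p and y=0 with
        # weight 1-p, so the unrestricted mean at c_i is p_i with unit weight.
        F_new = np.empty_like(F)
        F_new[order] = pava(p[order], np.ones(len(p)))
        if np.sum((F_new - F) ** 2) < eps:
            return F_new
        F = F_new
    return F
```

Running `em_pava` repeatedly from different random starts, and comparing the achieved likelihoods, addresses the nonuniqueness discussed above.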
3.2. Comments regarding asymptotics
Asymptotic results for standard current status data are nonstandard. The nonparametric maximum likelihood estimator is known to be consistent, although converging only at the rate $n^{1/3}$, and has a non-Gaussian limiting distribution related to Chernoff's distribution (Groeneboom & Wellner, 1992) in situations where the monitoring time distribution, $G$, is continuous; Banerjee (2012) provides a concise discussion of this result. Rather than using Wald-type pointwise confidence intervals derived from this limit, Banerjee & Wellner (2001, 2005) suggest using a likelihood ratio approach to construct confidence bands.
On the other hand, when $G$ has finite support, the likelihood is parametric, since $F$ can then be estimated only at this finite number of support points, namely the observed censoring times. As expected from this observation, the nonparametric maximum likelihood estimator now converges to a Gaussian limit at rate $n^{1/2}$, with the asymptotic variance at a specific monitoring time $t$ given simply by $F(t)\{1 - F(t)\}$ divided by the expected number of individuals screened at $t$, which is straightforward to estimate using the obvious plug-in estimators (Yu et al., 1998; Maathuis & Hudgens, 2011). The hybrid problem where the number of support points grows with the sample size is discussed beautifully in Tang et al. (2012). Sal y Rosas & Hughes (2010) proposed the inversion of a likelihood ratio test to obtain pointwise confidence intervals for $F$ when the data are subject to misclassification.
These results can be applied directly to the group-testing scenario only in the simplest situations. For the extreme situation of only one monitoring time, estimation of $F$ reduces to the simple estimation of prevalence. This scenario has been studied extensively in the literature on group testing with misclassification; for example, Tu et al. (1994) provided asymptotically normal confidence intervals with convergence rate $n^{1/2}$. Generalizing slightly, the situation with finite support for $G$, and with no misclassification, simplifies to the case considered by Yu et al. (1998) if individuals within a group all share a common value of $C$. In this case, $\mathrm{pr}(Z_j = 0) = S(c_j)^c$, so that asymptotic results for the nonparametric maximum likelihood estimator applied to the group-tested data immediately follow through for the plug-in estimator of $S$, or $F = 1 - S$, at the finite number of screening times by using the delta method. We anticipate that this will extend straightforwardly in the presence of misclassification, and we also suggest that use of the bootstrap will be effective here.
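To make this plug-in calculation concrete, the following Python sketch (the function name is ours, and the inputs are illustrative) estimates $F(t)$ for groups of size $c$ that share a common screening time $t$, with no misclassification: such a group tests negative with probability $S(t)^c$, so $\hat S(t)$ is the $c$th root of the observed proportion of negative groups, and the delta method gives a standard error.

```python
import numpy as np

def f_hat_common_time(neg_groups, n_groups, group_size):
    """Plug-in estimate of F(t), with delta-method standard error, when all
    members of each group share the screening time t (no misclassification)."""
    q = neg_groups / n_groups          # estimate of pr(Z = 0) = S(t)^c
    s = q ** (1.0 / group_size)        # estimate of S(t)
    # delta method: dS/dq = q^(1/c - 1) / c, and var(q_hat) = q(1 - q)/m
    var_s = (q ** (2.0 / group_size - 2.0) / group_size ** 2) * q * (1 - q) / n_groups
    return 1.0 - s, np.sqrt(var_s)

# Illustration: 180 of 200 groups of size 5 test negative at time t.
F_t, se = f_hat_common_time(neg_groups=180, n_groups=200, group_size=5)
print(F_t, se)
```

A Wald-type 95% interval is then $\hat F(t) \pm 1.96\,\mathrm{se}$, consistent with the Gaussian limit described above for a finite number of screening times.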
Even with a finite number of monitoring times, the situation becomes more complex when screening times are randomly assigned to the groups. This is clear even in the case of only two monitoring times and with pair groupings done at random. Further, there are as yet no known asymptotic results for the nonparametric maximum likelihood estimator of §3.1 with a continuous screening time distribution, although we anticipate that convergence will remain at an $n^{1/3}$ rate.
4. Elementary example
4.1. An analytic solution
For illustration, consider a simple example in a setting without misclassified test results, where there are two groups each containing two individuals; that is, $n = 4$ and $c = 2$. There are twelve possible combinations of group assignments and test results, corresponding to three different possible pair assignments with each pair having two possible test outcomes. Consideration of the conditional likelihood (1) reveals a simple solution in all but one of these cases; we focus on the remaining case, which has the grouping shown in Fig. 1: with ordered screening times $c_1 < c_2 < c_3 < c_4$, the first group contains the individuals screened at $c_1$ and $c_3$ and has tested positive, $z_1 = 1$, while the second group contains the individuals screened at $c_2$ and $c_4$ and has tested negative, $z_2 = 0$.
Fig. 1.

Elementary example of data configuration with two groups, each of size 2, where the first group has tested positive and the second group has tested negative.
The conditional likelihood (1) in this setting is

$$L(F) = \bigl[ 1 - \{1 - F(c_1)\}\{1 - F(c_3)\} \bigr] \{1 - F(c_2)\}\{1 - F(c_4)\}.$$

It is immediate that the nonparametric maximum likelihood estimator must have $\hat F(c_1) = \hat F(c_2)$ and $\hat F(c_3) = \hat F(c_4)$. Hence, the nonparametric maximum likelihood estimator is not unique but is achieved by any set of values with $\hat F(c_1) = \hat F(c_2)$, $\hat F(c_3) = \hat F(c_4)$ and $\{1 - \hat F(c_1)\}\{1 - \hat F(c_3)\} = 1/2$. We show how the expectation-maximization pool-adjacent-violators algorithm converges to one such solution, with the specific value depending directly on the starting values for $\hat F$.
Given an initial set of values $\hat F(c_1)$, $\hat F(c_2)$, $\hat F(c_3)$ and $\hat F(c_4)$ such that $\hat F(c_1) \le \hat F(c_2) \le \hat F(c_3) \le \hat F(c_4)$, the first step of the algorithm calculates each of the conditional probabilities, $p_{ij}$ ($i = 1, 2$; $j = 1, 2$), as given in (4) and (5), i.e., the probability that each individual was positive given the known group-tested result. For two of these probabilities, in a setting without misclassification, this calculation is trivial: the second pair tested negative, so neither of its individuals was positive. Hence we can set $p_{12} = p_{22} = 0$. For the pair that tested positive, this calculation follows directly from (4):

$$p_{11} = \frac{\hat F(c_1)}{1 - \{1 - \hat F(c_1)\}\{1 - \hat F(c_3)\}}, \qquad p_{21} = \frac{\hat F(c_3)}{1 - \{1 - \hat F(c_1)\}\{1 - \hat F(c_3)\}}.$$
The next step of the algorithm applies the weighted pool-adjacent-violators algorithm to make the resulting unrestricted estimates, $(p_{11}, 0, p_{21}, 0)$ at the ordered times $c_1 < c_2 < c_3 < c_4$, monotonic. This yields the following updated estimates of $F$:

$$\hat F(c_1) = \hat F(c_2) = p_{11}/2, \tag{6}$$

$$\hat F(c_3) = \hat F(c_4) = p_{21}/2. \tag{7}$$

These steps are then iterated until a determination of convergence based on comparing, say, the sum of the squared differences between successive estimates of $F$ at each observed screening time to a prespecified threshold $\epsilon$.
4.2. Multiple convergence values
As we demonstrated in §4.1, the initial values for the pair that tested negative, $\hat F(c_2)$ and $\hat F(c_4)$, are not relevant to the update step in our expectation-maximization pool-adjacent-violators algorithm. Therefore, when discussing convergence of the algorithm, we will only consider initial values for $\hat F(c_1)$ and $\hat F(c_3)$.
In all settings where $\hat F(c_1) = \hat F(c_3)$, the update step given by (6) and (7) becomes

$$\hat F(c_1) \leftarrow \frac{\hat F(c_1)}{2 [1 - \{1 - \hat F(c_1)\}^2]} = \frac{1}{2 \{2 - \hat F(c_1)\}}.$$

Therefore, at convergence, $2 \hat F(c_1) \{2 - \hat F(c_1)\} = 1$, so that the algorithm converges to $\hat F(c_1) = \hat F(c_3) = 1 - 2^{-1/2}$, the only solution in $[0, 1]$. This can, of course, also be expressed as $\hat F(c_k) = 1 - 2^{-1/2}$ for $k = 1, \ldots, 4$.
For any other set of starting values, the ratio $\hat F(c_3) / \hat F(c_1)$ remains unchanged by the iterations. We can therefore write $\hat F(c_3) = \rho \hat F(c_1)$, where $\rho > 1$ and stays fixed, as determined by the starting values for $\hat F(c_1)$ and $\hat F(c_3)$. At convergence, (6) then simplifies to

$$\hat F(c_1) = \frac{\hat F(c_1)}{2 [1 - \{1 - \hat F(c_1)\}\{1 - \rho \hat F(c_1)\}]}.$$

Thus, convergence occurs when $2 [1 - \{1 - \hat F(c_1)\}\{1 - \rho \hat F(c_1)\}] = 1$. After an application of the quadratic formula, this simplifies to

$$\hat F(c_1) = \frac{(1 + \rho) - \{(1 + \rho)^2 - 2\rho\}^{1/2}}{2\rho},$$

the only feasible solution. It immediately follows that at convergence, $\{1 - \hat F(c_1)\}\{1 - \hat F(c_3)\} = 1/2$, so that the condition noted in §4.1 holds.
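This fixed-point calculation is easy to verify numerically. The following Python sketch (variable names are ours, and the starting values are arbitrary illustrative choices) iterates the updates (6) and (7) and compares the limit with the closed-form solution.

```python
# Numerical check of the fixed point: f1 and f3 denote the current
# estimates of F(c1) and F(c3) for the pair that tested positive.
f1, f3 = 0.10, 0.40              # arbitrary starting values with f1 <= f3
rho = f3 / f1                    # ratio fixed by the starting values
for _ in range(200):
    D = 1 - (1 - f1) * (1 - f3)          # pr(group tests positive)
    f1, f3 = f1 / (2 * D), f3 / (2 * D)  # update steps (6) and (7)

closed_form = ((1 + rho) - ((1 + rho) ** 2 - 2 * rho) ** 0.5) / (2 * rho)
print(f1, closed_form)             # the iterates agree with the formula
print((1 - f1) * (1 - f3))         # the product condition from Section 4.1
```

Different starting ratios $\rho$ lead to different limits, all satisfying $\{1 - \hat F(c_1)\}\{1 - \hat F(c_3)\} = 1/2$, illustrating the nonuniqueness numerically.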
This simple example demonstrates the nonuniqueness of the nonparametric maximum likelihood estimator, with the algorithm converging to a specific solution for $\hat F$ determined by the ratio of the starting values of $\hat F$ at $c_1$ and $c_3$. When using this algorithm in an applied setting, we suggest repeating it many times, using a different set of randomly drawn starting values each time, and then computing the likelihood function to identify as many different unique solutions to the optimization as possible.
5. Simulations
5.1. Design of simulations
We carry out two series of simulations to examine the behaviour of the expectation-maximization pool-adjacent-violators algorithm for group-tested data, as compared to the pool-adjacent-violators algorithm, which is the nonparametric maximum likelihood estimator for individual-level current status data (Barlow et al., 1972). We consider two scenarios, one where the tests are subject to no misclassification, and another where the test is subject to misclassification with known, constant error rates. In the latter case, the comparative estimator for misclassified individual-level current status data was derived by McKeown & Jewell (2010). We consider both continuous and discrete independent screening times. The former are described and discussed below, and the latter in the Supplementary Material.
Each simulation is characterized by a set of fixed parameters: $n$, the number of individuals; $c$, the group size; and $\alpha$ and $\beta$, the sensitivity and specificity of the screening test, respectively. We set $\alpha = \beta = 1$ in scenarios without misclassification. We first simulate traditional current status data for each individual from the distribution of the true event times, $F$, and the censoring distribution, $G$. Each run of the simulations begins with simulating data of sample size $n$ at the individual level, followed by assigning individuals to groups randomly.
The distribution $F$ of the event times $T$ is Weibull with shape and scale parameters 4 and 25, respectively; here $T$ has mean 22.7 and variance 40.4. For the perfectly classified test simulations, the screening distribution $G$ for $C$ is uniform over a range wide enough that almost all of the distribution $F$ can be identified. The necessary binary datum $y_{ij} = 1(t_{ij} \le c_{ij})$ is then determined from the generated individual values of $t_{ij}$ and $c_{ij}$. The values of $z_j$, the group-tested results, follow immediately from the values of $y_{ij}$ for each individual in the group, as described in §2. Each simulation is performed 1000 times in six different settings, given by two choices of the sample size $n$ and groupings of sizes $c = 2$, 5 and 10.
For misclassified test results, we are most interested in examining performance of the expectation-maximization pool-adjacent-violators estimator in the left tail of $F$, where false positive test results could have the largest effect on the estimate of $F$ (Tu et al., 1994). Hence, while $F$ remains the same Weibull distribution, we now take $G$ to be uniform over a range restricted to the left tail of $F$, to ensure that $F$ remains small over the observed screening times. Here we select a single sample size, $n = 5000$, in 12 different settings with group sizes $c = 2$, 5, 10 and four common misclassification rates $1 - \alpha = 1 - \beta$, the largest being 10%. In these simulations, the observed misclassified data are obtained by, first, subjecting each individual test result $y_{ij}$ to misclassification under the specified test characteristics and, second, generating the group-tested outcome $z_j^*$ separately by misclassifying the corresponding true group-test result $z_j$. Here we have used the same test classification probabilities for both, assuming independence between the group size and the error rates of the testing procedure.
In each run of the two sets of simulations, for perfectly classified data and misclassified data, we compute both the appropriate expectation-maximization pool-adjacent-violators algorithm for the group-tested data and the appropriate pool-adjacent-violators algorithm for individual data. To select initial values for the expectation-maximization pool-adjacent-violators algorithm, we first draw $n$ values uniformly from the range $[0, 1]$ and sort them from smallest to largest; we then order the observations so that the screening times $c_{ij}$ are monotonically increasing, and match the ordered initial probabilities to the ordered data. Although, as noted earlier, for a specific application we recommend choosing multiple starting values, here we opt to randomly select only one set of initial values for each simulated dataset, thereby achieving only one of potentially many possible nonparametric maximum likelihood estimates.
The averages of the estimates of $F$ obtained from each algorithm over the 1000 runs are calculated for each $t$ in the support of $G$. To calculate the estimate of $F$ at a value of $t$ not observed in a specific simulation, we assume left-continuity of both estimators in situations where this is not imposed by monotonicity. To provide a sense of the variability of each estimator, we also calculate the 2.5th and 97.5th quantiles of the estimates over the 1000 simulations. For the second set of simulations, we use these quantities to compute a measure of pseudo-relative efficiency: the ratio of the widths of these 95% Monte Carlo quantile intervals for the group-based estimator relative to the individual-based estimator. The variances of the simulated estimates are less relevant, since we hypothesize that this estimator does not converge to a Gaussian distribution, nor at an $n^{1/2}$ rate.
The Supplementary Material contains results from two further simulations, with 10 fixed, equal-frequency screening times and with the true event probabilities at each screening time fixed in advance. In the first simulation, we randomly group individuals by values of $C$ to allow for the presentation of asymptotically normal confidence intervals, as described in §3.2; in the second, we group across screening times and again present the widths of the 95% Monte Carlo quantile intervals.
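The data-generating mechanism for the misclassified simulations can be sketched as follows. The Weibull parameters are those stated above, while the screening-time range, error rates, seed, and all function names are illustrative choices of our own.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_group_tests(n, group_size, alpha, beta, c_max=50.0):
    """One simulated dataset: Weibull(shape 4, scale 25) event times, uniform
    screening times, random grouping, and a misclassified result per group.
    (c_max is an illustrative choice of screening-time range.)"""
    t = 25.0 * rng.weibull(4.0, size=n)             # event times T ~ F
    c = rng.uniform(0.0, c_max, size=n)             # screening times C ~ G
    y = (t <= c).astype(int)                        # individual current status
    n_groups = n // group_size
    group = rng.permutation(n) % n_groups           # random equal-size groups
    z = np.array([y[group == j].max() for j in range(n_groups)])  # true results
    false_pos = rng.uniform(size=n_groups) > beta   # specificity errors
    false_neg = rng.uniform(size=n_groups) > alpha  # sensitivity errors
    z_star = np.where(z == 1, 1 - false_neg.astype(int), false_pos.astype(int))
    return c, group, z_star

c, group, z_star = simulate_group_tests(n=1000, group_size=5, alpha=0.95, beta=0.95)
print(len(z_star))  # one observed result per group
```

Each run of the simulation study would feed `c`, `group` and `z_star` to the expectation-maximization pool-adjacent-violators algorithm, and `c` together with misclassified individual results to the individual-level comparator.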
5.2. Results: perfectly classified data
Figure 2 displays the results from applying the expectation-maximization pool-adjacent-violators algorithm and the pool-adjacent-violators algorithm to data generated in the six simulations where there is no misclassification of the test results. These simulations show that the finite-sample bias is small, except perhaps when the group size is large, e.g., $c = 10$, and the sample size $n$ is small. Even then, this bias declines systematically as the sample size increases. As anticipated, in all situations, the bias is also smaller for the estimator based on individual test results. Similarly, and also to be expected, the latter is more precise, although the gain in precision decreases for larger sample sizes and smaller group sizes $c$. This being said, the group-tested estimator stands up remarkably well given that the screening costs are reduced by 50%, 80% and 90% when $c = 2$, 5 and 10, respectively, assuming that costs are proportional to the number of tests.
Fig. 2.
Results from six simulations of the estimation of $F$, with 1000 runs each, for different sample sizes $n$ and group sizes $c$. In each panel, the black lines are the average estimates of $F$ over the 1000 simulations, with the solid line representing the true cumulative distribution function, Weibull(4, 25), and the dashed and dotted lines representing, respectively, the estimates from the pool-adjacent-violators algorithm and the expectation-maximization pool-adjacent-violators algorithm; the grey lines are the 2.5th and 97.5th quantiles from the simulation runs for each estimator, using the same line types.
Because the asymptotic properties of the expectation-maximization pool-adjacent-violators algorithm are currently unknown, to demonstrate variability in the estimates we delineate the 95% Monte Carlo quantile intervals by dashed and dotted grey lines in Fig. 2. The width of this interval for the pool-adjacent-violators algorithm applied to individual data is always smaller than that for the expectation-maximization pool-adjacent-violators algorithm applied to group-tested data. This is to be expected, as there is no misclassification in these simulations. Smaller group sizes $c$ in the expectation-maximization pool-adjacent-violators algorithm provide 95% quantile intervals more similar to those estimated from individual data, and as $n$ increases for fixed $c$, the width of the 95% quantile interval decreases. Overall, Fig. 2 demonstrates that the expectation-maximization pool-adjacent-violators algorithm provides an approximately unbiased estimate of the true underlying distribution, $F$.
5.3. Results: misclassified data
Figures 3 and 4 present results from the twelve simulations in settings with $n = 5000$ individuals and varying group sizes and misclassification rates. Figure 3 shows that the percentage relative bias of both estimators in these finite samples is large, e.g., greater than 100%, for estimates of $F$ that are very small, e.g., less than 0.002, and is very close to zero for estimates of $F$ that are greater than 0.02, even at large group sizes with high misclassification rates. Although the individual-based estimator is less biased at small group sizes and low misclassification rates, we do see similar or lower amounts of bias from the group-testing estimator at the highest misclassification rates considered, particularly with the larger grouping sizes $c = 5$ and 10 and at lower values of $F$. Ultimately, the shapes of the finite-sample relative bias curves for these two estimators are very similar, so, at the very least, grouping does not introduce substantial amounts of additional bias.
Fig. 3.
Graphical representation of the finite-sample percentage relative bias from 12 simulations repeated 1000 times with 5000 individuals each, based on different group sizes $c$ and misclassification rates $1 - \alpha = 1 - \beta$, with values of the latter noted along the right-hand side. In each plot, the solid black line represents results obtained from the expectation-maximization pool-adjacent-violators algorithm for group tests, and the short-dashed black line represents results from the pool-adjacent-violators algorithm for misclassified individual test data; the long-dashed black line represents the reference level of 0% bias.
Fig. 4.
Logarithm of the pseudo-relative efficiency of the expectation-maximization pool-adjacent-violators algorithm relative to the adjusted pool-adjacent-violators algorithm from 12 simulations of 1000 runs with 5000 individuals each, based on different group sizes $c$ and misclassification rates $1 - \alpha = 1 - \beta$, with values of the latter noted along the right-hand side. In each plot, the solid black line is a lowess curve showing the overall trend in pseudo-relative efficiency as $F$ increases; the dashed black line represents equal-width 95% Monte Carlo quantile intervals for reference: if the solid black line is below zero, the width of the expectation-maximization pool-adjacent-violators 95% Monte Carlo quantile interval is smaller than that obtained from the individual test pool-adjacent-violators algorithm.
With regard to variability, a comparison of the widths of the 95% Monte Carlo quantile intervals associated with both estimators, as shown in Fig. 4, demonstrates a considerable advantage of our estimator from group-tested data at low values of $F$ and high levels of misclassification. For example, $F(t) = 0.025$ corresponds to a true prevalence of 2.5%. If a test is subject to 10% misclassification, i.e., $\alpha = \beta = 0.9$, then test results from data grouped into pools of size 10 will provide a more or equally precise estimate of $F(t)$ for $F(t) \le 0.025$ than data from individual tests. This implies that if the cumulative failure rate in question is less than 2.5%, a testing procedure that involves groups of size 10 will cost 90% less than testing everyone individually, and will result in a less biased and more precise estimate of $F$ in this range. In general, the specific threshold on $F$ below which such precision gains can be expected depends on both the group size and the misclassification rate, as suggested by Tu et al. (1994) for estimation of a single fixed prevalence.
The Supplementary Material includes results from simulations of group-tested current status data on a grid, with grouping done solely according to common observation times, which more easily ensures a sufficiently small maximum value of $F$ within groups. As seen in Tu et al. (1994), we observe a reduction in the width of the 95% confidence intervals as the group size increases, and separately a reduction in their width as the misclassification rates decrease. Additionally, there appears to be no substantial increase in bias as group size increases.
6. Application to hepatitis C data
To investigate the performance of our estimator in a practical setting, we use publicly available data from the 2014 U.S. Birth Data File, created by the National Center for Health Statistics, to investigate the age-at-incidence distribution for hepatitis C in non-Hispanic white women of child-bearing age. The dataset includes all such women of ages 13–40 who gave birth in 2014. We are therefore making the tacit assumption that women who gave birth are a representative sample of women of the same ages that could have given birth in terms of their risk of infection with hepatitis C. This is not exactly correct but seems to be a reasonable approximation, at least for sexually active women. Of the 1 981 521 eligible women, we randomly sampled 10%, creating a sample of
observations, for greater ease of illustration and computation. The data include the mother’s age in years and her hepatitis C status at the birth of her child. Of the
women in our investigation, only 901 tested positive for hepatitis C, a cumulative incidence of 0
46%. When accounting for potential misclassification of these test results, we used the sensitivity,
, and specificity,
, associated with the most commonly used test for hepatitis C: an enzyme immunoassay test. Although hepatitis C can be spread via sexual contact, it is primarily transmitted through blood, and an increase in the incidence of hepatitis C after age 25 would imply that people are beginning or continuing to engage in risky drug behaviour.
These data are based on individual blood testing for each mother separately. To illustrate our proposed methods, we consider group testing of pooled blood samples, representing potentially enormous savings in test costs depending on the size of the grouping used. These savings persist even if specific infected individuals need to be identified. As discussed above, given the low misclassification rates, we anticipate some loss of accuracy in estimating the prevalence, but this may nonetheless be worth the considerable cost reduction. We created artificial group-test results in two ways: (i) by assigning the data to groups of sizes 2, 5 and 10 according to age (gridded group assignment), and (ii) by randomly assigning the data to groups of sizes 2, 5 and 10. Then, each group test was assigned a positive result if at least one individual test was positive. For gridded group assignments, we computed point estimates and 95% confidence intervals adjusted for misclassification using the method described in §3.2. For random group assignments, we applied the adjusted pool-adjacent-violators algorithm to the individual test results and, for comparison, the expectation-maximization pool-adjacent-violators algorithm to the group-tested results.
Figure 5 displays estimates obtained from individual and group-tested results with groups of sizes 2, 5 and 10 in a setting where group assignment is done by common age. The results are satisfying, as they lead to the same public health implications. Although the estimates differ slightly, increasing somewhat with group size, the major jumps in the estimates occur at ages 19 and 21 for each of the group sizes considered. From these results, we can be fairly certain that any intervention to potentially reduce the public health burden due to hepatitis C infection would best occur during adolescence, ideally before risky behaviours such as drug use and unprotected sexual activity begin. In this example, major cost reductions could be achieved by decreasing the number of tests performed, assuming costs are proportional to the number of tests, without changing the conclusions of the analysis.
Fig. 5.
Four estimates of the cumulative incidence of hepatitis C in non-Hispanic white child-bearing U.S. women of ages 13–40 in 2014 when grouping is assigned according to common values of age. Group sizes considered were 2, 5 and 10. In each panel, the solid line is the estimate from the individual or group-tested results, and the dashed lines represent the upper and lower bounds of 95% confidence intervals.
Figure 6 displays estimates obtained from individual and group-tested results with groups of sizes 2, 5 and 10 in a setting where group assignment is done completely at random. Unlike the estimates in Fig. 5 obtained from data grouped according to the women’s age, here the estimates from data in groups of different sizes yield different implications. The results from the individual tests suggest an essentially flat cumulative incidence of hepatitis C after age 21, having reached a cumulative incidence of approximately 0.38%. This has significant implications for a public health intervention: it potentially indicates, for example, that any future hepatitis C vaccination would be most effective if implemented during late adolescence. No vaccine currently exists, although several candidates are under development. The group-tested results from groups of size 2 support the same conclusion, although they suggest that the cumulative incidence does not increase after age 19. However, the results from groups of sizes 5 and 10 tell a slightly different story: while these estimates increase to a cumulative incidence of roughly 0.4% before age 20, they then both continue to increase with age to somewhere in the range of 0.45–0.55% by age 40, suggesting that a substantial fraction of hepatitis C infections occur post-adolescence.
Fig. 6.
Four estimates of the cumulative incidence of hepatitis C in non-Hispanic white child-bearing U.S. women of ages 13–40 in 2014 when groups are assigned completely at random. The solid line is the pool-adjacent-violators estimate from the individual test results, and the dotted, short-dashed and long-dashed lines are the estimates obtained from the expectation-maximization pool-adjacent-violators algorithm with the individual test results artificially assigned to groups of sizes 2, 5 and 10, respectively.
Because these estimates seem to imply public health interventions at different times in life, it is important to consider which estimate is most reliable in this particular setting. As noted earlier, there is very little misclassification in the testing procedure, so we would expect the results from the adjusted pool-adjacent-violators algorithm based on individual data to be more accurate, albeit obtained at significantly higher cost. However, the pool-adjacent-violators algorithm adjusted for misclassification has a limitation: it automatically sets to 0 any estimated cumulative incidence below the assumed false-positive probability. Because the cumulative incidences at the early ages are less than 0.5%, had we assumed a false-positive probability above that level in this application, our estimate from the individual data adjusted for misclassification would have been zero at all ages. This suggests a potential issue with individual test results that may not be as much of a problem with group-tested results.
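The truncation just described can be seen in a minimal sketch of a misclassification-adjusted pool-adjacent-violators estimate. With assumed false-positive rate alpha and false-negative rate beta, the probability of a positive test at time t is alpha + (1 − alpha − beta)F(t); inverting the isotonic fit of the test indicators and clipping to [0, 1] therefore maps any fitted proportion below alpha to an estimated incidence of exactly 0. This is an illustration of the general form of such adjustments (cf. McKeown & Jewell, 2010), not the authors' exact procedure, and all function names are hypothetical.

```python
import numpy as np

def pava(y, w):
    """Weighted pool-adjacent-violators: isotonic (nondecreasing) fit."""
    blocks = []  # each block: [mean, weight, count]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fit = []
    for v, _, c in blocks:
        fit.extend([v] * c)
    return np.array(fit)

def adjusted_estimate(t, delta, alpha, beta):
    """Estimate F at the sorted screening times t, adjusting for
    misclassification via P(positive at t) = alpha + (1-alpha-beta)F(t).
    Fitted proportions below alpha are clipped to 0: the truncation
    discussed in the text.
    """
    order = np.argsort(t, kind="stable")
    p = pava(np.asarray(delta, float)[order], np.ones(len(t)))
    F = np.clip((p - alpha) / (1.0 - alpha - beta), 0.0, 1.0)
    return np.asarray(t)[order], F
```

For instance, with alpha = 0.1, any isotonic fitted proportion at or below 0.1 yields an estimated cumulative incidence of exactly 0.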
7. Discussion
In this paper we have proposed a modified expectation-maximization algorithm to estimate a distribution function from data obtained by group-tested current status screening with test misclassification. Simulations show that the estimator based on group-tested data adds relatively little extra small-sample bias compared to an estimator based on individual data, at a far lower cost, although this conclusion necessarily requires a larger sample size as the group size increases. Additionally, when substantial misclassification is present and the cumulative incidence is low, estimates obtained from the expectation-maximization pool-adjacent-violators algorithm with groups of size 5 or larger may be less biased and have improved precision, although inferential properties for this procedure need further development. This offers the possibility that a significantly less expensive testing procedure might yield a less biased and more precise estimate for the left tail of the distribution function.
In the presence of misclassification, these observations suggest possible hybrid grouping strategies that may improve precision at low values of the distribution function and maintain performance at higher levels, all in comparison to individual tests whose costs are far greater. That is, where possible, if the screening times are known in advance of pooling, it will likely be advantageous to first group individuals according to the observed screening times, using larger group sizes at the smaller screening times and decreasing the group size as the screening time increases, even down to individual tests. Simulations to examine variations of these possibilities are currently under way. As noted earlier, when individuals in a group have similar screening times, it is also possible to use an approximate individual group-tested current status estimator by treating all screening times in the group as being the same.
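The approximate estimator just mentioned can be sketched as follows. If every member of a group of size c shares (approximately) a common screening time t, the group tests negative only when all c event times exceed t, so P(group positive at t) = 1 − (1 − F(t))^c; isotonizing the group indicators and inverting this map gives an estimate of F. This is a hedged illustration under that common-time assumption, with hypothetical names; `pava` is the standard pool-adjacent-violators routine.

```python
import numpy as np

def pava(y, w):
    """Weighted pool-adjacent-violators: isotonic (nondecreasing) fit."""
    blocks = []  # each block: [mean, weight, count]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.array([v for v, _, c in blocks for _ in range(c)])

def group_cdf_estimate(t_group, z_group, c):
    """Approximate F from group-tested current status data, treating all
    screening times within each group of size c as equal: since
    P(group positive at t) = 1 - (1 - F(t))^c, isotonize the group
    indicators and invert this map.
    """
    order = np.argsort(t_group, kind="stable")
    p = pava(np.asarray(z_group, float)[order], np.ones(len(z_group)))
    return np.asarray(t_group)[order], 1.0 - (1.0 - p) ** (1.0 / c)
```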
There are a number of important extensions of these results. As noted, the pool-adjacent-violators estimator for classic current status data converges at a rate of n^{1/3} with a nonstandard asymptotic limit; see a 1987 technical report by P. Groeneboom from the University of Amsterdam. We conjecture that the same asymptotics will hold for the group-tested estimator, although this remains to be established. In practice, in a setting with misclassified individual current status data, the m-out-of-n bootstrap (McKeown & Jewell, 2010) has been shown to provide one method of obtaining valid inference procedures. We look forward to further theoretical progress in this area.
It is natural to anticipate that misclassification rates may depend on group size. This may occur, for example, if the screening test is more sensitive to detecting a positive group when there are more individual positives in the pool, related to the so-called dilution effect (Hwang, 1976; McMahan et al., 2013). Separately, covariate-adjusted regression analysis has been a primary focus of the statistical literature on group testing (Vansteelandt et al., 2000; Xie, 2001; Chen et al., 2009; Delaigle & Meister, 2011). In addition, in many applications, interest is focused on regression effects or group comparisons of time-to-event properties rather than on estimation of the underlying distribution function itself, often through use of standard multiplicative or additive regression models. Such regression models have been widely studied for individual current status data (Jewell & Emerson, 2013). Future work will investigate the use of additive hazard regression models for group-tested current status data.
Acknowledgement
The authors thank the editor, associate editor and reviewers for their insightful feedback. This work was supported by the National Heart, Lung, and Blood Institute, U.S. National Institutes of Health.
Supplementary material
Supplementary material available at Biometrika online contains a derivation of the expectation step of our expectation-maximization pool-adjacent-violators algorithm in the presence of misclassification, results from both sets of simulations with fixed censoring times, and code needed to replicate the simulations outlined in § 5.1.
References
- Ayer, M., Brunk, H. D., Ewing, G. M., Reid, W. T. & Silverman, E. (1955). An empirical distribution function for sampling with incomplete information. Ann. Math. Statist. 26, 641–7.
- Banerjee, M. (2012). Current status data in the 21st century: Some interesting developments. In Interval-Censored Time-to-Event Data: Methods and Applications, Chen, D. G., Sun, J. & Peace, K. E., eds. Boca Raton, Florida: Chapman & Hall/CRC, pp. 45–90.
- Banerjee, M. & Wellner, J. A. (2001). Likelihood ratio test for monotone functions. Ann. Statist. 29, 1699–731.
- Banerjee, M. & Wellner, J. A. (2005). Confidence intervals for current status data. Scand. J. Statist. 32, 405–24.
- Barlow, R. E., Bartholomew, D. J., Bremner, J. M. & Brunk, H. D. (1972). Statistical Inference Under Order Restrictions. New York: Wiley.
- Chen, P., Tebbs, J. M. & Bilder, C. R. (2009). Group testing regression models with fixed and random effects. Biometrics 65, 1270–8.
- Delaigle, A. & Meister, A. (2011). Nonparametric regression analysis for group testing data. J. Am. Statist. Assoc. 106, 640–50.
- Delaigle, A. & Hall, P. (2012). Nonparametric regression with homogeneous group testing data. Ann. Statist. 40, 131–58.
- Dempster, A. P., Laird, N. M. & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B 39, 1–38.
- Dhand, N. K., Johnson, W. O. & Toribio, J. A. L. (2010). A Bayesian approach to estimate OJD prevalence from pooled fecal samples of variable pool size. J. Agric. Biol. Envir. Statist. 15, 452–73.
- Dorfman, R. (1943). The detection of defective members of large populations. Ann. Math. Statist. 14, 436–40.
- Groeneboom, P. & Wellner, J. A. (1992). Nonparametric Maximum Likelihood Estimators for Interval Censoring and Deconvolution. Boston: Birkhäuser.
- Hwang, F. K. (1976). Group testing with a dilution effect. Biometrika 63, 671–80.
- Jewell, N. P. & Emerson, R. (2013). Current status data: An illustration with data on avalanche victims. In Handbook of Survival Analysis. Boca Raton, Florida: Chapman & Hall/CRC, pp. 391–412.
- Jewell, N. P. & van der Laan, M. (2003). Current status data: Review, recent developments and open problems. In Handbook of Statistics, vol. 23. Amsterdam: Elsevier, pp. 625–42.
- Liu, A., Liu, C., Zhang, Z. & Albert, P. S. (2012). Optimality of group testing in the presence of misclassification. Biometrika 99, 245–51.
- Maathuis, M. & Hudgens, M. G. (2011). Nonparametric inference for competing risks current status data with continuous, discrete or grouped observation times. Biometrika 98, 325–40.
- McKeown, K. & Jewell, N. P. (2010). Misclassification of current status data. Lifetime Data Anal. 16, 215–30.
- McMahan, C. S., Tebbs, J. M. & Bilder, C. R. (2013). Regression models for group testing data with pool dilution effects. Biostatistics 14, 284–98.
- Remlinger, K. S., Hughes-Oliver, J. M., Young, S. S. & Lam, R. L. (2006). Statistical design of pools using optimal coverage and minimal collision. Technometrics 48, 133–43.
- Sal y Rosas, V. G. & Hughes, J. P. (2010). Nonparametric and semiparametric analysis of current status data subject to outcome misclassification. Statist. Commun. Inf. Dis. 2010, article no. 364.
- Tang, R., Banerjee, M. & Kosorok, M. R. (2012). Likelihood based inference for current status data on a grid: A boundary phenomenon and an adaptive inference procedure. Ann. Statist. 40, 45–72.
- Tu, X. M., Litvak, E. & Pagano, M. (1994). Screening tests: Can we get more by doing less? Statist. Med. 13, 1905–19.
- Tu, X. M., Litvak, E. & Pagano, M. (1995). On the informativeness and accuracy of pooled testing in estimating prevalence of a rare disease: Application to HIV screening. Biometrika 82, 287–97.
- Vansteelandt, S., Goetghebeur, E. & Verstraeten, T. (2000). Regression models for disease prevalence with diagnostic tests on pools of serum samples. Biometrics 56, 1126–33.
- Wahed, M. A., Chowdhury, D., Nermell, B., Khan, S. I., Ilias, M., Rahman, M., Persson, L. A. & Vahter, M. (2006). A modified routine analysis of arsenic content in drinking-water in Bangladesh by hydride generation-atomic absorption spectrophotometry. J. Health Pop. Nutr. 24, 36–41.
- Wein, L. M. & Zenios, S. A. (1996). Pooled testing for HIV screening: Capturing the dilution effect. Oper. Res. 44, 543–69.
- Xie, M. (2001). Regression analysis of group testing samples. Statist. Med. 20, 1957–69.
- Yu, G., Schick, A., Li, L. & Wong, G. Y. C. (1998). Asymptotic properties of the GMLE in the case 1 interval-censorship model with discrete inspection times. Can. J. Statist. 26, 619–27.