2021 Sep 20;83(4):35. doi: 10.1007/s00285-021-01663-6

A binary search scheme for determining all contaminated specimens

Vassilis G. Papanicolaou
PMCID: PMC8450752  PMID: 34542723

Abstract

Specimens are collected from N different sources. Each specimen has probability p of being contaminated (in the case of a disease, e.g., p is the prevalence rate), independently of the other specimens. Suppose we can apply group testing, namely take small portions from several specimens, mix them together, and test the mixture for contamination, so that if the test turns out positive, then at least one of the specimens in the mixture is contaminated. In this paper we give a detailed probabilistic analysis of a binary search scheme that we propose for determining all contaminated specimens. More precisely, we study the number T(N) of tests required in order to find all the contaminated specimens, if this search scheme is applied. We derive recursive and, in some cases, explicit formulas for the expectation, the variance, and the characteristic function of T(N). Also, we determine the asymptotic behavior of the moments of T(N) as N → ∞, and from that we obtain the limiting distribution of T(N) (appropriately normalized), which turns out to be normal.

Keywords: Prevalence (rate), Adaptive group testing, Binary search scheme, Linear regime, Probabilistic testing, Average-case aspect ratio, Characteristic function, Moments, Limiting distribution, Normal distribution

Introduction

Consider N containers containing samples from N different sources (e.g., clinical specimens from N different patients, water samples from N different lakes, etc.). For each sample we assume that the probability of being contaminated (by, say, a virus, a toxic substance, etc.) is p and the probability that it is not contaminated is q:=1-p, independently of the other samples. All N samples must undergo a screening procedure (say a molecular or antibody test in the case of a viral contamination, a radiation measurement in the case of a radioactive contamination, etc.) in order to determine exactly which are the contaminated ones.

One obvious way to identify all contaminated specimens is the individual testing, namely to test the specimens one by one individually (of course, this approach requires N tests). In this work we analyze a group testing (or pool testing) approach, which can be called “binary search scheme” and requires a random number of tests. In particular, we will see that if p is not too big, typically if p<0.224, then by following this scheme, the expected number of tests required in order to determine all contaminated samples can be made strictly less than N. There is one requirement, though, for implementing this scheme, namely that each sample can undergo many tests (or that the sample quantity contained in each container suffices for many tests).

Group testing procedures have broad applications in areas ranging from medicine and engineering to airport security control, and they have attracted the attention of researchers for over seventy-five years (see, e.g., Dorfman’s seminal paper (Dorfman 1943), as well as Hwang 1972; Sobel and Groll 1959; Ungar 1960, etc.). In fact, recently, group testing has received some special attention, partly due to the COVID-19 crisis (see, e.g., Aldridge 2019, 2020; Armendáriz et al. 2020; Gollier and Gossner 2020; Malinovsky 2019; Mallapaty 2020, and the references therein).

The contribution of the present work is a detailed quantitative analysis of a specific group testing procedure which is quite natural and easy to implement. The scheme goes as follows: First we take samples from each of the N containers, mix them together and test the mixture. If the mixture is not contaminated, we know that none of the samples is contaminated. If the mixture is contaminated, we split the containers into two groups, where the first group consists of the first ⌊N/2⌋ containers and the second group consists of the remaining ⌈N/2⌉ containers. Then we take samples from the containers of the first group, mix them together and test the mixture. If the mixture is not contaminated, then we know that none of the samples of the first group is contaminated. If the mixture is contaminated, we split the containers of the first group into two subgroups (in the same way we split the N containers) and continue the procedure. We also apply the same procedure to the second group (consisting of ⌈N/2⌉ containers).

The main quantity of interest in the present paper is the number T(N) of tests required in order to determine all contaminated samples, if the above binary search scheme is applied.

Following the terminology of Aldridge (2019) we can characterize the aforementioned procedure as adaptive probabilistic group testing in the linear regime (“linear” since the average number of contaminated specimens is pN, hence grows linearly with N). “Adaptive” refers to the fact that the way we perform a stage of the testing procedure depends on the outcome of the previous stages. “Probabilistic” refers to the fact that each specimen has probability p of being contaminated, independently of the other specimens, and, furthermore, that the required number of tests, T(N), is a random variable. Let us be a bit more specific about the latter. Suppose ξ_1, ξ_2, … is a sequence of independent Bernoulli random variables on a probability space (Ω, F, P), with P{ξ_k = 1} = p and P{ξ_k = 0} = q, k = 1, 2, …, so that {ξ_k = 1} is the event that the k-th specimen is contaminated. Then, for each N, the quantity T(N) of our binary search scheme is a specific function of ξ_1, ξ_2, …, ξ_N, say,

T(N) = T_N(ξ_1, ξ_2, …, ξ_N).   (1.1)

In particular, it is not hard to see that

1 = T_N(0, 0, …, 0) ≤ T_N(ξ_1, ξ_2, …, ξ_N) ≤ T_N(1, 1, …, 1) = 2N - 1,   (1.2)

i.e. if none of the contents of the N containers is contaminated, then T(N) = 1, whereas if all N containers contain contaminated samples, then by easy induction on N we can see that T(N) = 2N - 1 (observe that if ξ_1 = ⋯ = ξ_N = 1, then T(1) = 1, while for N ≥ 2 we have T(N) = T(⌊N/2⌋) + T(⌈N/2⌉) + 1). Thus, T(N) can become bigger than N, whereas checking the samples one by one always requires exactly N tests (i.e. in the case of individual testing we would have T_N ≡ N).

Let us mention that, for a given p, the optimal testing procedure, namely the procedure which minimizes the expectation of the number of tests required in order to determine (with zero-error) all contaminated specimens in an adaptive probabilistic group testing, remains unknown (Malinovsky 2019). However, Ungar (1960) has shown that if

p ≥ p* := (3 - √5)/2 ≈ 0.382,   (1.3)

then the optimal procedure is individual testing. In practice, though, p is usually quite small, much smaller than p*.

We do not claim that our proposed procedure is optimal. However, for small values of p the numerical evidence we have (see Tables 1 and 2 in Sects. 2.1 and 3.1 respectively) suggests that, in comparison with the optimal results obtained via simulation by Aldridge (2019, 2020), our binary scheme is near-optimal.

Table 1.

Values of E[W_n]/N for certain choices of N = 2^n and p

p=.005 p=.01 p=.05 p=.1 p=.15 p=.2
N=1 1 1 1 1 1 1
N=2 .505 .515 .575 .645 .715 .780
N=4 .265 .280 .395 .528 .653 .768
N=8 .148 .169 .335 .518 .679 .820
N=16 .092 .121 .328 .542 .719 .871
N=32 .068 .103 .340 .566 .748 .901
N=64 .059 .099 .352 .581 .763 .917
N=128 .057 .101 .359 .589 .771 .924
N=256 .058 .103 .363 .593 .775 .928
N=512 .059 .105 .365 .595 .777 .930
N=1024 .060 .106 .366 .596 .778 .931

Table 2.

Values of E[T(N)]/N for certain choices of N ≠ 2^n and p

p=.005 p=.01 p=.05 p=.1 p=.15 p=.2
N=3 .345 .357 .447 .554 .654 .749
N=5 .217 .234 .362 .510 .645 .768
N=6 .186 .205 .348 .510 .656 .787
N=7 .163 .184 .337 .510 .663 .800
N=12 .110 .136 .325 .526 .696 .843
N=48 .061 .099 .345 .571 .751 .902
N=96 .057 .099 .354 .582 .762 .912
N=200 .057 .101 .358 .585 .765 .917
N=389 .058 .104 .362 .589 .769 .920
N=768 .059 .105 .363 .591 .771 .922
N=1000 .060 .106 .365 .594 .776 .929

Let us also mention that our “binary search scheme” is different from the “binary splitting” procedures of Sobel and Groll (1959) (see, also, Aldridge (2019) and Hwang (1972)), where at each stage a group is split into two subgroups whose sizes may not be equal or even nearly equal. This feature makes those binary splitting procedures very hard to analyze theoretically, while the advantage of our proposed procedure is that, due to its simplicity, it allows us to calculate certain relevant quantities (e.g., the expectation E[T(N)] and the variance V[T(N)]) explicitly. We even managed to determine the limiting distribution of T(N) (appropriately normalized), which turns out to be normal.

Finally, let us mention that in Armendáriz et al. (2020) the authors consider optimality in the case where the pool sizes are powers of 3 (except possibly for an extra factor of 4 for the first pool). In the case where N = 100 and p = 0.02 their average number of tests in order to determine the contaminated specimens is 20, thus their average-case aspect ratio is 0.2. Tables 1 and 2 indicate that our proposed strategy gives comparable, if not better, results.

In Sect. 2 we study the special case where N = 2^n. Here, the random variable T(2^n) is denoted by W_n. We present explicit formulas for the expectation (Theorem 1) and the variance (Corollary 1) of W_n. In addition, in formulas (2.12) and (2.42) we give the asymptotic behavior of E[W_n] and V[W_n] respectively, as n → ∞. Then, we determine the leading asymptotics of all the moments of W_n (Theorem 3) and from that we conclude (Corollary 2) that, under an appropriate normalization, W_n converges in distribution to a normal random variable. Finally, at the end of the section, in Corollary 3 and in formula (2.75) we give a couple of Law-of-Large-Numbers-type results for W_n.

Section 3 studies the case of an arbitrary N. In Theorem 4, Theorem 5, and Corollary 5 respectively we derive recursive formulas for μ(N) := E[T(N)], g(z; N) := E[z^{T(N)}], and σ²(N) := V[T(N)], while Corollary 4 gives an explicit, albeit rather messy, formula for E[T(N)]. We also demonstrate (see Remark 3) the nonexistence of the limit of E[T(N)]/N as N → ∞, which is in contrast to the special case N = 2^n, where the limit exists. In Sect. 3.2 we show that the moments of Y(N) := [T(N) - μ(N)]/σ(N) converge to the moments of the standard normal variable Z as N → ∞. An immediate consequence is (Corollary 7) that Y(N) converges to Z in distribution.

Here, let us point out that the determination of the limiting behavior of a random quantity is a fundamental issue in many probabilistic models. Thus, our result that [T(N) - μ(N)]/σ(N) converges in distribution to the standard normal variable Z, and that the convergence is good to the point that we also have convergence of all moments to the corresponding moments of Z, gives much more information about the testing procedure than the information we obtain from the behavior of E[T(N)]. And since the number N can be of the order of 10² or even 10³, and the explicit results can get very messy for such big values of N, the asymptotic behavior of T(N) enables us to obtain valuable information without much effort.

At the end of the paper we have included a brief appendix (Sect. 4) containing a lemma and two corollaries, which are used in the proofs of some of the results of Sects. 2 and 3.

The case N = 2^n

We first consider the case where the number of containers is a power of 2, namely

N = 2^n,  where n = 0, 1, ….   (2.1)

As we have said in the introduction, the first step is to test a pool containing samples from all N = 2^n containers. If this pool is not contaminated, then none of the contents of the N containers is contaminated and we are done. If the pool is contaminated, we make two subpools, one containing samples of the first 2^{n-1} containers and the other containing samples of the remaining 2^{n-1} containers. We continue by testing the first of those subpools. If it is contaminated we split it again into two subpools of 2^{n-2} samples each and keep going. We also repeat the same procedure for the second subpool of the 2^{n-1} samples.

One important detail here is that if the first subpool of the 2^{n-1} samples turns out not contaminated, then we are sure that the second subpool of the remaining 2^{n-1} samples must be contaminated, hence we can save one test and immediately proceed with the splitting of the second subpool into two subpools of 2^{n-2} samples each. Likewise, suppose that at some step of the procedure a subpool of 2^{n-k} samples is found contaminated. Then this subpool is split further into two other subpools, each containing 2^{n-k-1} samples. If the first of these subpools is found not contaminated, then we automatically know that the other is contaminated and, consequently, we can save one test.

Let Wn be the number of tests required to find all contaminated samples by following the above procedure. Thus by (1.2) we have

1 ≤ W_n ≤ 2^{n+1} - 1 = 2N - 1   (2.2)

(in the trivial case n = 0, i.e., N = 1, we, of course, have W_0 = 1).

Example 1

Suppose that N=4 (hence n=2). In this case W2 can take any value between 1 and 7, except for 2. For instance in the case where the samples of the first three containers are not contaminated, while the content of the fourth container is contaminated, we have W2=T4(0,0,0,1)=3. On the other hand, in the case where the samples of the first and the third container are contaminated, while the samples of the second and fourth container are not contaminated, we have W2=T4(1,0,1,0)=7.
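To make the test-counting concrete, here is a small Python sketch of the scheme (our own illustration; the function names are hypothetical, not from the paper). A pooled test is modeled simply as asking whether the 0/1 contamination pattern contains a 1, and the test-saving rule described above appears in the `else` branch; the code reproduces the values of Example 1.

```python
def num_tests(xs):
    """Total number of tests T used by the binary search scheme on the
    contamination pattern xs (1 = contaminated, 0 = clean)."""
    if not any(xs):
        return 1             # the first pooled test comes back clean
    return 1 + _resolve(xs)  # first pooled test is positive

def _resolve(xs):
    """Tests needed to resolve a group already known to be contaminated."""
    n = len(xs)
    if n == 1:
        return 0             # a singleton known to be contaminated needs no test
    first, second = xs[:n // 2], xs[n // 2:]
    t = 1                    # test the first subpool
    if any(first):
        t += _resolve(first)
        t += 1               # the second subpool must still be tested
        if any(second):
            t += _resolve(second)
    else:
        t += _resolve(second)  # first subpool clean: second is surely contaminated
    return t

# Example 1 (N = 4):
assert num_tests([0, 0, 0, 1]) == 3
assert num_tests([1, 0, 1, 0]) == 7
# The extreme cases of (1.2):
assert num_tests([0] * 8) == 1 and num_tests([1] * 8) == 2 * 8 - 1
```

For N = 2^n the two subpools always have equal size; for general N the split into ⌊N/2⌋ and ⌈N/2⌉ matches the description in Sect. 3.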

The expectation of Wn

Theorem 1

Let N = 2^n and let W_n be as above. Then

E[W_n] = 2^{n+1} - 1 - 2^n ∑_{k=1}^{n} (q^{2^k} + q^{2^{k-1}})/2^k,  n = 0, 1, …   (2.3)

(in the trivial case n = 0 the sum is empty, i.e. 0), where, as mentioned in the introduction, q is the probability that a sample is not contaminated.

Proof

Assume n ≥ 1 and let D_n be the event that none of the 2^n samples is contaminated. Then

E[W_n] = E[W_n | D_n] P(D_n) + E[W_n | D_n^c] P(D_n^c),

and hence

E[W_n] = q^{2^n} + u_n (1 - q^{2^n}),   (2.4)

where for typographical convenience we have set

u_n := E[W_n | D_n^c].   (2.5)

In order to find a recursive formula for u_n let us first consider the event A_n that in the group of the 2^n containers none of the first 2^{n-1} contains contaminated samples. Clearly, D_n ⊆ A_n and

P(A_n | D_n^c) = P(A_n ∩ D_n^c)/P(D_n^c) = [P(A_n) - P(D_n)]/P(D_n^c) = (q^{2^{n-1}} - q^{2^n})/(1 - q^{2^n}) = q^{2^{n-1}}/(1 + q^{2^{n-1}}).   (2.6)

Likewise, if B_n is the event that in the group of the 2^n containers none of the last 2^{n-1} contains contaminated samples, then

P(B_n | D_n^c) = q^{2^{n-1}}/(1 + q^{2^{n-1}}).   (2.7)

Let us also notice that A_n ∩ B_n = D_n, hence P(A_n ∩ B_n | D_n^c) = 0.

Now, (i) given A_n and D_n^c we have that W_n =_d 1 + W_{n-1}, where the notation X =_d Y signifies that the random variables X and Y have the same distribution. To better justify this distributional equality we observe that, in the case N = 2^n, formula (1.1) takes the form W_n = T(N) = T_N(ξ_1, …, ξ_N) and W_{n-1} = T(N/2) = T_{N/2}(ξ_1, …, ξ_{N/2}). Thus, W_{n-1} =_d T_{N/2}(ξ_{(N/2)+1}, …, ξ_N). Now, D_n = {ξ_1 = ξ_2 = ⋯ = ξ_N = 0} and A_n = {ξ_1 = ξ_2 = ⋯ = ξ_{N/2} = 0}. Hence, the given event A_n ∩ D_n^c means that only some of the variables ξ_{(N/2)+1}, …, ξ_N may differ from 0. Consequently, given A_n ∩ D_n^c we have that W_n =_d 1 + W_{n-1}.

(ii) Similarly, given B_n and D_n^c we have that W_n =_d 2 + W_{n-1}.

Finally, (iii) given (A_n ∪ B_n)^c and D_n^c we have that W_n =_d 1 + W_{n-1} + W̃_{n-1}, where W̃_{n-1} is an independent copy of W_{n-1}. Thus (given D_n^c) by conditioning on the events A_n, B_n, and (A_n ∪ B_n)^c we get, in view of (2.5), (2.6), and (2.7),

u_n = (1 + u_{n-1}) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + (2 + u_{n-1}) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + (1 + 2u_{n-1}) (1 - q^{2^{n-1}})/(1 + q^{2^{n-1}})

or

u_n = [2/(1 + q^{2^{n-1}})] u_{n-1} + (1 + 2q^{2^{n-1}})/(1 + q^{2^{n-1}}).   (2.8)

By replacing n by n-1 in (2.4) we get

E[W_{n-1}] = q^{2^{n-1}} + u_{n-1} (1 - q^{2^{n-1}}).   (2.9)

Therefore, by using (2.8) in (2.4) and (2.9) we can eliminate un, un-1 and obtain

E[W_n] = 2E[W_{n-1}] - q^{2^n} - q^{2^{n-1}} + 1.   (2.10)

Finally, (2.3) follows easily from (2.10) and the fact that E[W0]=1.

For instance, in the cases n=1, n=2, and n=10 formula (2.3) becomes

E[W_1] = 3 - q - q²,  E[W_2] = 7 - 2q - 3q² - q⁴,  and
E[W_{10}] = 2047 - 512q - 768q² - 384q⁴ - 192q⁸ - 96q^{16} - 48q^{32} - 24q^{64} - 12q^{128} - 6q^{256} - 3q^{512} - q^{1024}   (2.11)

respectively. In the extreme case q = 0 we know that E[W_n] = 2^{n+1} - 1, while in the other extreme case q = 1 we know that E[W_n] = 1, and these values agree with the ones given by (2.3).
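As a quick numerical check (our own sketch, not part of the paper), one can evaluate (2.3) directly and verify both the recursion (2.10) and the entries of Table 1; for instance, for N = 16 and p = .05 the average-case aspect ratio E[W_4]/16 rounds to .328.

```python
def mean_W(n, q):
    """E[W_n] via the explicit formula (2.3)."""
    s = sum((q ** (2 ** k) + q ** (2 ** (k - 1))) / 2 ** k for k in range(1, n + 1))
    return 2 ** (n + 1) - 1 - 2 ** n * s

# The recursion (2.10): E[W_n] = 2 E[W_{n-1}] - q^{2^n} - q^{2^{n-1}} + 1
q = 0.35
for n in range(1, 11):
    lhs = mean_W(n, q)
    rhs = 2 * mean_W(n - 1, q) - q ** (2 ** n) - q ** (2 ** (n - 1)) + 1
    assert abs(lhs - rhs) < 1e-9

# Table 1, column p = .05 (q = .95), row N = 16:
assert abs(mean_W(4, 0.95) / 16 - 0.328) < 5e-4
```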

Let us also notice that formula (2.3) implies that for any given q ∈ [0, 1) we have

E[W_n] = 2^n α_1(q) - 1 + O(q^{2^n}),  n → ∞,   (2.12)

where

α_1(q) := 2 - ∑_{k=1}^{∞} (q^{2^k} + q^{2^{k-1}})/2^k = 2 - q/2 - (3/2) ∑_{k=1}^{∞} q^{2^k}/2^k.   (2.13)

It is clear that α1(q) is a power series about q=0 with radius of convergence equal to 1 (actually, it is well known that the unit circle is the natural boundary of this power series), which is strictly decreasing on [0, 1] with α1(0)=2 and α1(1)=0. Furthermore, it is not hard to see that it satisfies the functional equation

α_1(q²) = 2α_1(q) + q² + q - 2.   (2.14)

Also, by using the inequality

∫_1^∞ q^{2^x} dx < ∑_{k=1}^{∞} q^{2^k} < 1 + ∫_1^∞ q^{2^x} dx,  q ∈ (0, 1),

and by estimating the integral as q → 1^- one can show that

α_1′(q) = [3/(2 ln 2)] ln(1 - q) + O(1),  q → 1^-.   (2.15)

Remark 1

By dividing both sides of (2.3) by 2^n and comparing with (2.13) we can see that, as n → ∞,

E[W_n]/2^n → α_1(q)  uniformly in q ∈ [0, 1].   (2.16)

Actually, the convergence is much stronger, since for every m = 1, 2, … the m-th derivative of E[W_n]/2^n with respect to q converges to the m-th derivative of α_1(q) uniformly on compact subsets of [0, 1).

The expectation E[W_n] and, consequently, the average-case aspect ratio E[W_n]/2^n decrease as q approaches 1 (equivalently, as p approaches 0). As an illustration, by employing (2.3) we have created Table 1.

(by comparing the values of Table 1 with the graphs, given in Aldridge (2019) and Aldridge (2020), of the optimal average-case aspect ratio as a function of p, we can see that, if p ≤ 0.15, our procedure is near-optimal).

If q is such that

E[W_n] < 2^n = N,   (2.17)

then, by applying the aforementioned testing strategy, the number of tests required to determine all containers with contaminated content is, on the average, less than N, namely less than the number of tests required to check the containers one by one.

Let us set

μ_n = μ_n(q) := E[W_n].   (2.18)

It is clear from (2.3) that for n ≥ 1 the quantity μ_n(q) is a polynomial in q of degree 2^n, strictly decreasing on [0, 1], with μ_n(0) = 2^{n+1} - 1 = 2N - 1 and μ_n(1) = 1. Thus, there is a unique q_n ∈ (0, 1) such that

μ_n(q_n) = 2^n   (2.19)

and (2.17) holds if q ∈ (q_n, 1] or, equivalently, if p ∈ [0, 1 - q_n). Therefore, it only makes sense to consider the procedure if q > q_n. Alternatively, if q is given, one may try to find the optimal value, say m, of n which minimizes μ_n(q)/2^n and, then, split the N specimens into subgroups of size 2^m.

As an example, let us give the (approximate) values of q_n for n = 0, 1, …, 10:

q_0 = 0,  q_1 = (√5 - 1)/2 ≈ .618,  q_2 ≈ .685,  q_3 ≈ .727,  q_4 ≈ .751,  q_5 ≈ .763,  q_6 ≈ .769,  q_7 ≈ .772,  q_8 ≈ .774,  q_9 ≈ .775,  q_{10} ≈ .775,

where q_0 = 0 is a convention. Notice that q_1 = 1 - p*, where p* is as in (1.3). Thus, q_1 is the threshold found by Ungar (1960) below which group testing does not improve on individual testing.

In view of (2.3) and (2.18), formula (2.19) becomes

∑_{k=1}^{n} (q_n^{2^k} + q_n^{2^{k-1}})/2^k = 1 - 1/2^n,   (2.20)

which by letting n → ∞ yields

q_n → q_∞,  where α_1(q_∞) = 1,   (2.21)

α_1(q) being the function defined in (2.13). In fact, .775 < q_∞ < .776. Furthermore, by formula (2.20) we have

∑_{k=1}^{n} (q_n^{2^k} + q_n^{2^{k-1}})/2^k = ∑_{k=1}^{n} (q_{n+1}^{2^k} + q_{n+1}^{2^{k-1}})/2^k - (1 - q_{n+1}^{2^n} - q_{n+1}^{2^{n+1}})/2^{n+1},

which implies that the sequence q_n is strictly increasing. Therefore, if q > .776, equivalently if p < .224, then E[W_n] < 2^n = N for every n ≥ 1.
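The threshold q_∞ can be located numerically; the sketch below (ours, not from the paper) truncates the series (2.13) — the terms q^{2^k} decay doubly exponentially, so a few dozen terms suffice — and bisects α_1(q) = 1, confirming .775 < q_∞ < .776.

```python
def alpha1(q, terms=60):
    """alpha_1(q) = 2 - q/2 - (3/2) * sum_{k>=1} q^{2^k}/2^k, truncated (2.13)."""
    s = 0.0
    for k in range(1, terms + 1):
        t = q ** (2 ** k) / 2 ** k
        s += t
        if t < 1e-18:        # doubly exponential decay: safe to stop
            break
    return 2 - q / 2 - 1.5 * s

# Bisection for alpha_1(q) = 1 on (0, 1); alpha_1 is strictly decreasing.
lo, hi = 0.5, 0.999
for _ in range(60):
    mid = (lo + hi) / 2
    if alpha1(mid) > 1:
        lo = mid
    else:
        hi = mid
q_inf = (lo + hi) / 2
assert 0.775 < q_inf < 0.776
```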

The behavior of W_n as n → ∞

We start with a recursive formula for the generating function of Wn.

Theorem 2

Let

g_n(z) := E[z^{W_n}].   (2.22)

Then

g_n(z) = z g_{n-1}(z)² + (z - z²) q^{2^{n-1}} g_{n-1}(z) + (z - z²) q^{2^n},  n = 1, 2, …   (2.23)

(clearly, g0(z)=z).

Proof

With Dn as in the proof of Theorem 1 we have

g_n(z) = E[z^{W_n}] = E[z^{W_n} | D_n] P(D_n) + E[z^{W_n} | D_n^c] P(D_n^c),

thus

g_n(z) = z q^{2^n} + h_n(z)(1 - q^{2^n}),   (2.24)

where

h_n(z) := E[z^{W_n} | D_n^c].   (2.25)

Let A_n and B_n be the events introduced in the proof of Theorem 1. Then, given A_n and D_n^c we have that z^{W_n} =_d z·z^{W_{n-1}}; given B_n and D_n^c we have that z^{W_n} =_d z²·z^{W_{n-1}}; finally, given (A_n ∪ B_n)^c and D_n^c we have that z^{W_n} =_d z·z^{W_{n-1}}·z^{W̃_{n-1}}, where W̃_{n-1} is an independent copy of W_{n-1}. Thus (given D_n^c) by conditioning on the events A_n, B_n, and (A_n ∪ B_n)^c we get, in view of (2.25), (2.6), and (2.7),

h_n(z) = (z + z²) h_{n-1}(z) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + z h_{n-1}(z)² (1 - q^{2^{n-1}})/(1 + q^{2^{n-1}}).   (2.26)

By replacing n by n-1 in (2.24) we get

g_{n-1}(z) = z q^{2^{n-1}} + h_{n-1}(z)(1 - q^{2^{n-1}}),   (2.27)

and (2.23) follows by eliminating hn(z) and hn-1(z) from (2.24), (2.26), and (2.27).

By setting z = e^{it} in (2.23) it follows that the characteristic function

ϕ_n(t) := E[e^{itW_n}]   (2.28)

of Wn satisfies the recursion

ϕ_n(t) = e^{it} ϕ_{n-1}(t)² + (e^{it} - e^{2it}) q^{2^{n-1}} ϕ_{n-1}(t) + (e^{it} - e^{2it}) q^{2^n},  n = 1, 2, …,   (2.29)

with, of course, ϕ_0(t) = e^{it}.

For instance, for n=1 formula (2.29) yields

ϕ_1(t) = e^{3it} + q(e^{it} - e^{2it}) e^{it} + q²(e^{it} - e^{2it}).   (2.30)

Corollary 1

Let

σ_n² = σ_n²(q) := V[W_n].   (2.31)

Then

σ_n²(q) = 2^{n+1} ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) + 2^n ∑_{k=1}^{n} (q^{2^{k+1}} + q^{3·2^{k-1}} - 5q^{2^k} - 3q^{2^{k-1}})/2^k - 2^n ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k} (q^{2^j} + q^{2^{j-1}})/2^j,  n = 0, 1, …   (2.32)

(in the trivial case n=0 all the above sums are empty, i.e. 0).

Proof

Differentiating (2.23) twice with respect to z and then setting z=1 yields

g_n″(1) = 2g_{n-1}″(1) + 2g_{n-1}′(1)² - 2q^{2^{n-1}} g_{n-1}′(1) + 4g_{n-1}′(1) - 2q^{2^n} - 2q^{2^{n-1}},   (2.33)

where, in view of (2.22),

g_n″(1) = E[W_n(W_n - 1)]  and  g_n′(1) = E[W_n].   (2.34)

Thus,

σ_n² = g_n″(1) + g_n′(1) - g_n′(1)²   (2.35)

and by using (2.10) and (2.35) in (2.33) we obtain

σ_n² = 2σ_{n-1}² + (2q^{2^n} + q^{2^{n-1}}) E[W_n] + q^{2^{n+1}} + q^{3·2^{n-1}} - 3q^{2^n} - 2q^{2^{n-1}}.   (2.36)

Finally, formula (2.36) together with the fact that σ_0² = V[W_0] = 0 imply

σ_n² = 2^n ∑_{k=1}^{n} [(2q^{2^k} + q^{2^{k-1}}) E[W_k] + q^{2^{k+1}} + q^{3·2^{k-1}} - 3q^{2^k} - 2q^{2^{k-1}}]/2^k,   (2.37)

from which (2.32) follows by invoking (2.3).

As we have mentioned, in the extreme cases where q = 0 or q = 1 the variable W_n becomes deterministic and, consequently, σ_n²(0) = σ_n²(1) = 0, which is in agreement with (2.32). For example, setting n = 1 and n = 2 in formula (2.32) yields

σ_1²(q) = V[W_1] = -q⁴ - 2q³ + 2q² + q = q(1 - q)(q² + 3q + 1)   (2.38)

and

σ_2²(q) = V[W_2] = q(1 - q)(q⁶ + q⁵ + 7q⁴ + 11q³ + 5q² + 11q + 2).   (2.39)

Notice that the maximum of σ_1²(q) on [0, 1] is attained at q ≈ .6462. Actually, both dσ_1²(q)/dq and d²σ_1²(q)/dq² have one (simple) zero in [0, 1]. The maximum of σ_2²(q) on [0, 1] is attained at q ≈ .744, while both dσ_2²(q)/dq and d²σ_2²(q)/dq² have one simple zero in [0, 1].
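The variance formulas can be cross-checked numerically (our own sketch): starting from σ_1² in (2.38) and E[W_2] = 7 - 2q - 3q² - q⁴ from (2.11), one step of the recursion (2.36) with n = 2 must reproduce the closed form (2.39), and it does, for every q.

```python
def var_W1(q):
    # sigma_1^2, formula (2.38)
    return q * (1 - q) * (q ** 2 + 3 * q + 1)

def var_W2(q):
    # sigma_2^2, formula (2.39)
    return q * (1 - q) * (q**6 + q**5 + 7*q**4 + 11*q**3 + 5*q**2 + 11*q + 2)

def var_W2_from_recursion(q):
    # one step of (2.36) with n = 2: exponents 2^n = 4, 2^{n-1} = 2, 2^{n+1} = 8, 3*2^{n-1} = 6
    EW2 = 7 - 2*q - 3*q**2 - q**4                      # formula (2.11)
    return (2 * var_W1(q) + (2*q**4 + q**2) * EW2
            + q**8 + q**6 - 3*q**4 - 2*q**2)

for q in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    assert abs(var_W2(q) - var_W2_from_recursion(q)) < 1e-12
```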

Let us also notice that, since

∑_{j=1}^{k} (q^{2^j} + q^{2^{j-1}})/2^j = q/2 + q^{2^k}/2^k + (3/2) ∑_{j=1}^{k-1} q^{2^j}/2^j,   (2.40)

formula (2.32) can be also written as

σ_n²(q) = 2^n (2 - q/2) ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) - 2^n ∑_{k=1}^{n} (q^{2^{k+1}} + 5q^{2^k} + 3q^{2^{k-1}})/2^k - 2^n (3/2) ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j,  n = 0, 1, …   (2.41)

From formula (2.41) it follows that for any given q ∈ [0, 1) we have

σ_n² = V[W_n] = 2^n β(q) + O(2^n q^{2^n}),  n → ∞,   (2.42)

where

β(q) := (2 - q/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) - ∑_{k=1}^{∞} (q^{2^{k+1}} + 5q^{2^k} + 3q^{2^{k-1}})/2^k - (3/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j.   (2.43)

An equivalent way to write β(q) is

β(q) = q(q + 1)/2 + (6 - 3q/2) ∑_{k=1}^{∞} q^{2^k} - (17/2) ∑_{k=1}^{∞} q^{2^k}/2^k - (3/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j   (2.44)

(notice that β(0) = 0, 0 < β(q) < ∞ for q ∈ (0, 1), and β(1) = 0; furthermore, β(q) is a power series about q = 0 with radius of convergence equal to 1).
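Both series representations of β(q) can be evaluated by truncation — again the terms q^{2^k} vanish doubly exponentially — and, as a consistency check of (2.43) against (2.44), they agree numerically (a sketch of ours; function names are not from the paper):

```python
def beta_2_43(q, K=40):
    """beta(q) via (2.43), truncated at K terms."""
    A = lambda k: 2 * q ** (2 ** k) + q ** (2 ** (k - 1))
    inner = 0.0   # running value of sum_{j=1}^{k-1} q^{2^j}/2^j
    total = 0.0
    for k in range(1, K + 1):
        total += ((2 - q / 2) * A(k)
                  - (q ** (2 ** (k + 1)) + 5 * q ** (2 ** k) + 3 * q ** (2 ** (k - 1))) / 2 ** k
                  - 1.5 * A(k) * inner)
        inner += q ** (2 ** k) / 2 ** k
    return total

def beta_2_44(q, K=40):
    """beta(q) via (2.44), truncated at K terms."""
    S1 = sum(q ** (2 ** k) for k in range(1, K + 1))
    S2 = sum(q ** (2 ** k) / 2 ** k for k in range(1, K + 1))
    A = lambda k: 2 * q ** (2 ** k) + q ** (2 ** (k - 1))
    inner, D = 0.0, 0.0
    for k in range(1, K + 1):
        D += A(k) * inner
        inner += q ** (2 ** k) / 2 ** k
    return q * (q + 1) / 2 + (6 - 1.5 * q) * S1 - 8.5 * S2 - 1.5 * D

for q in (0.0, 0.3, 0.5, 0.9):
    assert abs(beta_2_43(q) - beta_2_44(q)) < 1e-9
assert beta_2_43(0.0) == 0.0
```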

We would like to understand the limiting behavior of W_n, as n → ∞, for any fixed q ∈ (0, 1). Formulas (2.12) and (2.42) suggest that we should work with the normalized variables

X_n := (W_n - 2^n α_1(q))/2^{n/2},  n = 0, 1, ….   (2.45)

Let us consider the characteristic functions

ψ_n(t) := E[e^{itX_n}],  n = 0, 1, ….   (2.46)

In view of (2.28) we have

ψ_n(t) = ϕ_n(2^{-n/2} t) e^{-i 2^{n/2} α_1(q) t}   (2.47)

and using (2.47) in (2.29) yields

ψ_n(t) = e^{2^{-n/2} it} ψ_{n-1}(t/√2)² + (e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} q^{2^{n-1}} ψ_{n-1}(t/√2) + (e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it} q^{2^n},  n = 1, 2, …,   (2.48)

with ψ_0(t) = e^{[1 - α_1(q)] it}.

From formula (2.48) one can guess the limiting distribution of X_n: Assuming that for each t ∈ R the sequence ψ_n(t) has a limit, say ψ(t), we can let n → ∞ in (2.48) and conclude that ψ(t) = ψ(t/√2)². Actually, if the limit ψ(t) exists, then formulas (2.49) and (2.50) below imply that ψ′(0) = 0 and that ψ″(0) exists (in C). From the existence of these derivatives and the functional equation of ψ(t) we get that the function h(t) := t^{-2} ln ψ(t) is continuous at 0 and satisfies the equation h(t) = h(t/√2). Therefore, h(t) ≡ h(0) for all t ∈ R and, hence, ψ(t) = e^{-at²} for some a ≥ 0. Consequently, X_n converges in distribution to a normal random variable, say X, with zero mean. As for the variance of X, formulas (2.42) and (2.45) suggest that V[X] = β(q), where β(q) is given by (2.43)–(2.44).

The above argument is not rigorous, since it assumes the existence of the (pointwise) limit lim_{n→∞} ψ_n(t) for every t ∈ R. It is worth presenting, however, since it points out what we should look for.

From (2.12), (2.42), and (2.45) we get that for any fixed q ∈ [0, 1)

E[X_n] = -1/2^{n/2} + O(q^{2^n})  and  V[X_n] = β(q) + O(q^{2^n}),  n → ∞,   (2.49)

thus

E[X_n²] = β(q) + 1/2^n + O(q^{2^n}),  n → ∞.   (2.50)

One important consequence of (2.50) is (see Durrett 2005) that the sequence F_n, n = 0, 1, …, where F_n is the distribution function of X_n, is tight.

Theorem 3

For m = 1, 2, … and q ∈ [0, 1) we have

lim_{n→∞} E[X_n^m] = β(q)^{m/2} E[Z^m],   (2.51)

where β(q) is given by (2.43)–(2.44) and Z is a standard normal random variable. In other words

lim_{n→∞} E[X_n^{2ℓ-1}] = 0  and  lim_{n→∞} E[X_n^{2ℓ}] = [(2ℓ)!/(2^ℓ ℓ!)] β(q)^ℓ,  ℓ ≥ 1.   (2.52)

Proof

By differentiating both sides of (2.48) with respect to t we get

ψ_n′(t) = √2 e^{2^{-n/2} it} ψ_{n-1}(t/√2) ψ_{n-1}′(t/√2) + 2^{-n/2} i e^{2^{-n/2} it} ψ_{n-1}(t/√2)² + q^{2^{n-1}} (d/dt)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} ψ_{n-1}(t/√2)] + q^{2^n} (d/dt)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it}].   (2.53)

We continue by differentiating both sides of (2.53) k - 1 times with respect to t, where k ≥ 2. This yields

ψ_n^{(k)}(0) = Q_{1n}(0) + Q_{2n}(0) + Q_{3n}(0) + Q_{4n}(0),   (2.54)

where

Q_{1n}(t) := √2 (d^{k-1}/dt^{k-1})[e^{2^{-n/2} it} ψ_{n-1}(t/√2) ψ_{n-1}′(t/√2)],   (2.55)
Q_{2n}(t) := 2^{-n/2} i (d^{k-1}/dt^{k-1})[e^{2^{-n/2} it} ψ_{n-1}(t/√2)²],   (2.56)
Q_{3n}(t) := q^{2^{n-1}} (d^k/dt^k)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} ψ_{n-1}(t/√2)],   (2.57)
and  Q_{4n}(t) := q^{2^n} (d^k/dt^k)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it}].   (2.58)

We will prove the theorem by induction. For m = 1 and m = 2 the truth of (2.51) follows immediately from (2.49) and (2.50). The inductive hypothesis is that the limit lim_{n→∞} E[X_n^k] = i^{-k} lim_{n→∞} ψ_n^{(k)}(0) satisfies (2.51) or, equivalently, (2.52) for k = 1, …, m - 1. Then, for k = m (where m ≥ 3):

  • (i)
    Formula (2.55) implies
    Q_{1n}(0) = (1/(√2)^{m-2}) ∑_{j=0}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + O(1/2^{n/2}),  n → ∞;   (2.59)
  • (ii)
    formula (2.56) implies
    Q_{2n}(0) = O(1/2^{n/2}),  n → ∞;   (2.60)
  • (iii)
    since [e^{2^{-n/2} it} - e^{2·2^{-n/2} it}]_{t=0} = 0, formula (2.57) implies
    Q_{3n}(0) = O(q^{2^{n-1}} 2^{mn/2}),  n → ∞;   (2.61)
  • (iv)
    finally, formula (2.58) implies
    Q_{4n}(0) = O(q^{2^n} 2^{mn/2}),  n → ∞.   (2.62)
    Thus, for k = m ≥ 3, by using (2.59), (2.60), (2.61), and (2.62) in (2.54) we can conclude that for every n ≥ 1 (recall that ψ_n(0) = 1)
    ψ_n^{(m)}(0) = ψ_{n-1}^{(m)}(0)/(√2)^{m-2} + (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + δ_n,   (2.63)
    where ψ_0^{(m)}(0) = i^m [1 - α_1(q)]^m and
    δ_n = O(1/2^{n/2}),  n → ∞.   (2.64)

Let us set

b_n := (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + δ_n.   (2.65)

Then, the inductive hypothesis together with (2.64) imply that the limit lim_{n→∞} b_n exists in C. We can, therefore, apply Corollary A2 of the Appendix to (2.63) with z_n = ψ_n^{(m)}(0), ρ = (√2)^{-(m-2)}, and b_n as in (2.65) to conclude that the limit lim_{n→∞} ψ_n^{(m)}(0) exists in C. This allows us to take limits in (2.63) and obtain

λ^{(m)} = λ^{(m)}/(√2)^{m-2} + (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) λ^{(m-j)} λ^{(j)}

or

[(√2)^{m-2} - 1] λ^{(m)} = ∑_{j=1}^{m-1} C(m-1, j) λ^{(m-j)} λ^{(j)},   (2.66)

where for typographical convenience we have set

λ^{(m)} := lim_{n→∞} ψ_n^{(m)}(0) = i^m lim_{n→∞} E[X_n^m],  m ≥ 1.   (2.67)

We have, thus, shown that formula (2.66) holds for all m ≥ 1. If m = 1, the sum in the right-hand side of (2.66) is empty, i.e. zero, and, hence, we get what we already knew, namely that λ^{(1)} = 0. If m = 2, then (2.66) becomes vacuous (0 = 0), but we know that λ^{(2)} = -β(q). Thus, formula (2.66) gives recursively the values of λ^{(m)} for every m ≥ 3, and by using the induction hypothesis again, namely that λ^{(k)} = i^k β(q)^{k/2} E[Z^k] for k = 1, …, m - 1, it is straightforward to show that λ^{(m)}, as given by (2.66), equals i^m β(q)^{m/2} E[Z^m] for every m.

A remarkable consequence of Theorem 3 is the following corollary whose proof uses standard arguments, but we decided to include it for the sake of completeness.

Corollary 2

Let β(q) be the quantity given in (2.43)–(2.44). Then, for any fixed q ∈ [0, 1) we have

X_n →_d √(β(q)) Z  as n → ∞,   (2.68)

where Z is a standard normal random variable and the symbol →_d denotes convergence in distribution.

Proof

As we have seen, the sequence of the distribution functions F_n(x) = P{X_n ≤ x}, n = 0, 1, …, is tight.

Suppose that X_{n_k} →_d X, where X_{n_k}, k = 1, 2, …, is a subsequence of X_n. Then (Durrett 2005) there is a sequence of random variables Y_k, k = 1, 2, …, converging a.s. to a random variable Y, such that for each k ≥ 1 the variables Y_k and X_{n_k} have the same distribution (hence the limits Y and X also have the same distribution, since a.s. convergence implies convergence in distribution). Thus, by Theorem 3 we have

lim_{k→∞} E[Y_k^m] = lim_{k→∞} E[X_{n_k}^m] = β(q)^{m/2} E[Z^m]  for all integers m ≥ 1.   (2.69)

In particular, for m = 2ℓ we get from (2.69) that sup_k E[Y_k^{2ℓ}] < ∞ for all ℓ = 1, 2, …, and, consequently (Chung 2001), that the sequence Y_k^m, k = 1, 2, …, is uniformly integrable for every m ≥ 1. Therefore Y_k → Y in L^r(Ω, F, P) for all r > 0 (i.e. E[|Y_k - Y|^r] → 0 as k → ∞, where by L^r(Ω, F, P) we denote the space of all real-valued random variables of our probability space (Ω, F, P), mentioned in the introduction, with a finite ‖·‖_r-norm), and (2.69) yields

E[X^m] = E[Y^m] = lim_{k→∞} E[Y_k^m] = β(q)^{m/2} E[Z^m]  for all integers m ≥ 1.   (2.70)

It is well known (and not hard to check) that the moments of the normal distribution satisfy Carleman’s condition and, consequently, a normal distribution is uniquely determined from its moments (Chung 2001; Durrett 2005) (alternatively, since the characteristic function of a normal variable is entire, it is uniquely determined by the moments). Therefore, it follows from (2.70) that X and √(β(q)) Z have the same distribution. Hence, every subsequence of X_n which converges in distribution converges to √(β(q)) Z, and (2.68) is established.

From formula (2.45) we have

X_n/2^{n/2} = W_n/2^n - α_1(q),   (2.71)

hence Theorem 3 has the following immediate corollary:

Corollary 3

For any r > 0 and q ∈ [0, 1] we have

W_n/2^n → α_1(q)  in L^r(Ω, F, P),  n → ∞.   (2.72)

Finally, let us observe that for any given ϵ>0 we have (in view of (2.71) and Chebyshev’s inequality)

P{|W_n/2^n - α_1(q)| ≥ ϵ} = P{|X_n| ≥ 2^{n/2} ϵ} ≤ (1/(2^n ϵ²)) E[X_n²],   (2.73)

hence (2.50) yields

P{|W_n/2^n - α_1(q)| ≥ ϵ} ≤ (1/(2^n ϵ²)) [β(q) + o(1)],  n → ∞,   (2.74)

from which it follows, by a standard application of the first Borel–Cantelli lemma, that if W_n, n = 1, 2, …, are considered random variables of the space (Ω, F, P), then, for any q ∈ [0, 1],

W_n/2^n → α_1(q)  a.s.,  n → ∞.   (2.75)
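The law-of-large-numbers statement (2.75) is easy to visualize by simulation (our own sketch; the sampling code and the fixed seed are our assumptions, not from the paper): for N = 2^10 = 1024 and p = 0.1 the empirical aspect ratio concentrates near α_1(0.9) ≈ 0.597, in agreement with the N = 1024, p = .1 entry of Table 1.

```python
import random

def num_tests(xs):
    """Tests used by the binary search scheme on pattern xs (1 = contaminated)."""
    if not any(xs):
        return 1
    return 1 + _resolve(xs)

def _resolve(xs):
    n = len(xs)
    if n == 1:
        return 0
    first, second = xs[:n // 2], xs[n // 2:]
    t = 1                          # test the first subpool
    if any(first):
        t += _resolve(first) + 1   # resolve first; second must still be tested
        if any(second):
            t += _resolve(second)
    else:
        t += _resolve(second)      # saved test: second is surely contaminated
    return t

def alpha1(q, terms=30):
    return 2 - q / 2 - 1.5 * sum(q ** (2 ** k) / 2 ** k for k in range(1, terms + 1))

random.seed(1)
n, p, trials = 10, 0.1, 200
N = 2 ** n
ratios = [num_tests([1 if random.random() < p else 0 for _ in range(N)]) / N
          for _ in range(trials)]
empirical = sum(ratios) / trials
assert abs(empirical - alpha1(1 - p)) < 0.02   # concentrates near 0.597
```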

The case of a general N

Let us now discuss the case of a general N, namely the case where N is not necessarily a power of 2. As we have described in the introduction, the first step of the binary search scheme is to test a pool containing samples from all N containers. If this pool is not contaminated, then none of the contents of the N containers is contaminated and we are done. If the pool is contaminated, we form two subpools, one containing samples of the first ⌊N/2⌋ containers and the other containing samples of the remaining ⌈N/2⌉ containers (recall that ⌊N/2⌋ + ⌈N/2⌉ = N). We continue by testing the first of those subpools. If it is contaminated we split it again into two subpools of ⌊⌊N/2⌋/2⌋ and ⌈⌊N/2⌋/2⌉ samples respectively and keep going. We also apply the same procedure to the second subpool of the ⌈N/2⌉ samples.

Suppose T(N) = T(N; q) is the number of tests required to find all contaminated samples by following the above procedure (thus T(2^n) = W_n, where W_n is the random variable studied in the previous section). Then, as in formula (1.2),

1 ≤ T(N) ≤ 2N - 1.   (3.1)

In the extreme cases q=1 and q=0 the quantity T(N) becomes deterministic and we have respectively

T(N; 1) = 1  and  T(N; 0) = 2N - 1.   (3.2)

Evidently, T(N) ≤_st T(N + 1), where ≤_st denotes the usual stochastic ordering (recall that X ≤_st Y means that P{X > x} ≤ P{Y > x} for all x ∈ R). In other words, T(N) is stochastically increasing in N. In particular

WνstT(N)stWν,ν:=log2N, 3.3

where log_2 is the logarithm to the base 2. Also, it follows easily from a coupling argument that if q_1 > q_2, then T(N; q_1) ≤_st T(N; q_2).

The expectation, the generating function, and the variance of T(N)

Theorem 4

Let us set

μ(N) = μ(N; q) := E[T(N; q)].   (3.4)

Then μ(N) satisfies the recursion

μ(N) = μ(⌈N/2⌉) + μ(⌊N/2⌋) - q^N - q^{⌊N/2⌋} + 1,  N ≥ 2   (3.5)

(of course, μ(1)=1).

Proof

We adapt the proof of Theorem 1. Assume N ≥ 2 and let D_N be the event that none of the N samples is contaminated. Then

μ(N) = E[T(N)] = E[T(N) | D_N] P(D_N) + E[T(N) | D_N^c] P(D_N^c),

and hence

μ(N) = q^N + u(N)(1 - q^N),   (3.6)

where for typographical convenience we have set

u(N) := E[T(N) | D_N^c].   (3.7)

In order to find a recursive formula for u(N) let us first consider the event A_N that in the group of the N containers none of the first ⌊N/2⌋ contains contaminated samples. Clearly D_N ⊆ A_N and

P(A_N | D_N^c) = [P(A_N) - P(D_N)]/P(D_N^c) = (q^{⌊N/2⌋} - q^N)/(1 - q^N) = q^{⌊N/2⌋}(1 - q^{⌈N/2⌉})/(1 - q^N).   (3.8)

Likewise, if B_N is the event that in the group of the N containers none of the last ⌈N/2⌉ contains contaminated samples, then

P(B_N | D_N^c) = (q^{⌈N/2⌉} - q^N)/(1 - q^N) = q^{⌈N/2⌉}(1 - q^{⌊N/2⌋})/(1 - q^N).   (3.9)

Let us also notice that A_N ∩ B_N = D_N, hence P(A_N ∩ B_N | D_N^c) = 0.

Now, (i) given A_N and D_N^c we have that T(N) =_d 1 + T̃(⌈N/2⌉), where T̃(⌈N/2⌉) has the same distribution as T(⌈N/2⌉); (ii) given B_N and D_N^c we have that T(N) =_d 2 + T(⌊N/2⌋); finally, (iii) given (A_N ∪ B_N)^c and D_N^c we have that T(N) =_d 1 + T(⌊N/2⌋) + T̃(⌈N/2⌉), where T̃(⌈N/2⌉) is independent of T(⌊N/2⌋) and has the same distribution as T(⌈N/2⌉). Thus (given D_N^c) by conditioning on the events A_N, B_N, and (A_N ∪ B_N)^c we get, in view of (3.7), (3.8), and (3.9),

$u(N) = \big[1 + u(\lceil N/2 \rceil)\big] \dfrac{q^{\lfloor N/2 \rfloor} - q^N}{1-q^N} + \big[2 + u(\lfloor N/2 \rfloor)\big] \dfrac{q^{\lceil N/2 \rceil} - q^N}{1-q^N} + \big[1 + u(\lfloor N/2 \rfloor) + u(\lceil N/2 \rceil)\big] \left[1 - \dfrac{q^{\lfloor N/2 \rfloor} - q^N}{1-q^N} - \dfrac{q^{\lceil N/2 \rceil} - q^N}{1-q^N}\right]$

or

$\big(1-q^N\big)\, u(N) = \big(1-q^{\lceil N/2 \rceil}\big)\, u(\lceil N/2 \rceil) + \big(1-q^{\lfloor N/2 \rfloor}\big)\, u(\lfloor N/2 \rfloor) - 2q^N + q^{\lceil N/2 \rceil} + 1$  (3.10)

from which, in view of (3.6), formula (3.5) follows immediately.

In the case where N=2n formula (3.5) reduces to (2.10).

Remark 2

By easy induction, formula (3.5) implies that, for $N \ge 2$, the quantity $\mu(N;q) = E[T(N;q)]$ is a polynomial in $q$ of degree $N$ whose coefficients are $\le 0$, except for the constant term, which is equal to $2N-1$. The leading term of $\mu(N;q)$ is $-q^N$. In the special case where $N = 2^n$ these properties of $\mu(N;q)$ follow trivially from (2.3).

For instance,

$\mu(3) = 5 - 2q - q^2 - q^3, \qquad \mu(5) = 9 - 3q - 3q^2 - q^3 - q^5$, and
$\mu(1000) = 1999 - 512q - 720q^2 - 48q^3 - 336q^4 - 48q^7 - 144q^8 - 48q^{15} - 48q^{16} - 40q^{31} - 8q^{32} - 16q^{62} - 8q^{63} - 12q^{125} - 6q^{250} - 3q^{500} - q^{1000}$,  (3.11)

while μ(2), μ(4), and μ(1024) are given by (2.11), since W1=T(2), W2=T(4), and W10=T(1024).
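The recursion (3.5) is straightforward to evaluate numerically. A minimal Python sketch (ours; the name `mean_tests` is not from the paper) memoizes the recursion and reproduces the polynomials in (3.11) at any numerical value of $q$:

```python
from functools import lru_cache

def mean_tests(N, q):
    """mu(N;q) = E[T(N;q)] via the recursion (3.5):
    mu(N) = mu(ceil(N/2)) + mu(floor(N/2)) - q**N - q**floor(N/2) + 1."""
    @lru_cache(maxsize=None)
    def mu(n):
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0
    return mu(N)
```

For instance, `mean_tests(3, q)` matches the polynomial $5 - 2q - q^2 - q^3$ of (3.11), and `mean_tests(N, 0.0)` returns $2N - 1$, in accordance with (3.2).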

With the help of (3.5) one can obtain an extension of formula (2.3) valid for any N. We first need to introduce some convenient notation:

$\iota(N) := \lfloor N/2 \rfloor, \qquad \iota^k(N) := (\underbrace{\iota \circ \cdots \circ \iota}_{k\ \text{iterates}})(N)$,  (3.12)

so that $\iota^1(N) = \iota(N)$, while we also have the standard convention $\iota^0(N) = N$ (we furthermore have $\iota^{-1}(N) = \{2N, 2N+1\}$). For example, if $N \ge 2$ and $\nu = \lfloor \log_2 N \rfloor$ (with $\nu$ as in (3.3)), then

$\iota^{\nu}(N) = 1, \quad\text{while}\quad \iota^{\nu-1}(N) = 2 \text{ or } 3 \quad\text{and}\quad \iota^{\nu+1}(N) = 0$.  (3.13)

Corollary 4

Let μ(N) be as in (3.4) and

$\varepsilon_\mu(N) = \varepsilon_\mu(N;q) := q^{\lfloor N/2 \rfloor} + q^N - q^{\lfloor (N+1)/2 \rfloor} - q^{N+1} = q^{\lfloor N/2 \rfloor} - q^{\lceil N/2 \rceil} + q^N - q^{N+1}$  (3.14)

(the last equality follows from the fact that $\lfloor (N+1)/2 \rfloor = \lceil N/2 \rceil$). Then

$\mu(N) = 2N - 1 - (N-1)q - (N-1)q^2 + \sum_{n=2}^{N-1} \sum_{k=1}^{\lfloor \log_2 n \rfloor} \varepsilon_\mu\big(\iota^{k-1}(n)\big), \qquad N \ge 1$  (3.15)

(if $N = 1$ or $N = 2$, then the double sum on the right-hand side is 0).

Proof

By setting

$\Delta_\mu(N) = \Delta_\mu(N;q) := \mu(N+1;q) - \mu(N;q)$  (3.16)

and by recalling that $\lfloor (N+1)/2 \rfloor = \lceil N/2 \rceil$ and $\lceil (N+1)/2 \rceil = \lfloor N/2 \rfloor + 1$, formula (3.5) implies (in view of (3.12) and (3.14))

$\Delta_\mu(N) = \Delta_\mu\big(\lfloor N/2 \rfloor\big) + \varepsilon_\mu(N) = \Delta_\mu(\iota(N)) + \varepsilon_\mu(N), \qquad N \ge 2$.  (3.17)

From (3.17) we obtain

$\sum_{k=1}^{\lfloor \log_2 N \rfloor} \Big[ \Delta_\mu\big(\iota^{k-1}(N)\big) - \Delta_\mu\big(\iota^{k}(N)\big) \Big] = \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 2$,

or, in view of (3.13),

$\Delta_\mu(N) - \Delta_\mu(1) = \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 1$,  (3.18)

where $\Delta_\mu(1) = \mu(2) - \mu(1) = E[T(2)] - E[T(1)] = E[W_1] - E[W_0] = 2 - q - q^2$ (in the case $N = 1$, formula (3.18) is trivially true, since the sum on the right-hand side is empty). Consequently, (3.18) becomes

$\Delta_\mu(N) = \mu(N+1) - \mu(N) = 2 - q - q^2 + \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 1$,  (3.19)

from which (3.15) follows immediately.

With the help of (3.15) we have tabulated the average-case aspect ratio $E[T(N;q)]/N$ for several values of $N$ and $p$. As in Table 1, if we compare these values with the graphs, given in Aldridge (2019) and Aldridge (2020), of the optimal average-case aspect ratio as a function of $p$, we can see that, if $p \le 0.15$, our procedure is near-optimal.

Remark 3

Let us look at the case $N = 2^n M$, where $M$ is a fixed odd integer. We set

τn:=ET(N)=ET(2nM), 3.20

so that

$\tau_0 = E[T(M)] = \mu(M)$.  (3.21)

Then (3.5) becomes

$\tau_n = 2\tau_{n-1} - q^{2^n M} - q^{2^{n-1} M} + 1, \qquad n \ge 1$.  (3.22)

From (3.21) and (3.22) it follows that

$\dfrac{\tau_n}{2^n} = \mu(M) + \sum_{k=1}^{n} \dfrac{1 - q^{2^k M} - q^{2^{k-1} M}}{2^k} = \mu(M) + 1 - \dfrac{1}{2^n} - \sum_{k=1}^{n} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k}, \qquad n \ge 0$,  (3.23)

thus

$\dfrac{E[T(2^n M)]}{2^n M} = \dfrac{\tau_n}{2^n M} = \dfrac{\mu(M) + 1}{M} - \dfrac{1}{2^n M} - \sum_{k=1}^{n} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k M}$  (3.24)

for all $n \ge 0$. It follows that, as $n \to \infty$,

$\dfrac{E[T(2^n M)]}{2^n M} \longrightarrow \alpha_M(q) \quad\text{uniformly in } q \in [0,1]$,  (3.25)

where

$\alpha_M(q) := \dfrac{\mu(M) + 1}{M} - \sum_{k=1}^{\infty} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k M} = \dfrac{2\mu(M) + 2 - q^M}{2M} - \dfrac{3}{2} \sum_{k=1}^{\infty} \dfrac{q^{2^k M}}{2^k M}$  (3.26)

(as in the case of Remark 1, the convergence in (3.25) is much stronger, since for every $m = 1, 2, \ldots$ the $m$-th derivative of $E[T(2^n M)]/(2^n M)$ with respect to $q$ converges to the $m$-th derivative of $\alpha_M(q)$ uniformly on $[0, 1-\epsilon]$ for any $\epsilon > 0$).

From (3.26) it is obvious that if $M_1 \ne M_2$, then $\alpha_{M_1}(q) \ne \alpha_{M_2}(q)$, except for at most countably many values of $q$ (notice, e.g., that $\alpha_M(0) = 2$ and $\alpha_M(1) = 0$ for all $M$). Therefore, it follows from (3.25) that for $q \in (0,1)$

the limit $\lim_{N \to \infty} \dfrac{E[T(N;q)]}{N}$ does not exist.  (3.27)

Comment

The reason we consider the case N=2nM, where M is odd, is mainly theoretical: We use these values of N and the corresponding subsequences in order to demonstrate the non-convergence of T(N)/N. We do not claim any practical value of this choice of N.
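The oscillation can be observed numerically by truncating the series in (3.26); the sketch below (ours, with hypothetical function names) reuses the recursion (3.5) for $\mu(M)$ and shows that $\alpha_1(1/2)$ and $\alpha_3(1/2)$ differ, i.e. the subsequences $N_n = 2^n$ and $N_n = 3 \cdot 2^n$ give different limits of $E[T(N)]/N$:

```python
from functools import lru_cache

def mean_tests(N, q):
    # mu(N;q) = E[T(N;q)] via the recursion (3.5)
    @lru_cache(maxsize=None)
    def mu(n):
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0
    return mu(N)

def alpha(M, q, terms=60):
    """alpha_M(q) of (3.26), with the series truncated after `terms`
    summands (the neglected tail is geometrically small)."""
    s = sum((q**(2**k * M) + q**(2**(k - 1) * M)) / (2**k * M)
            for k in range(1, terms))
    return (mean_tests(M, q) + 1) / M - s
```

At the endpoints the values $\alpha_M(0) = 2$ and $\alpha_M(1) = 0$ noted above are recovered for every $M$.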

Open Question

Is it true that $\alpha_1(q) = \limsup_{N \to \infty} E[T(N;q)]/N$?

Next, we consider the generating function of T(N). Using the approach of the proofs of Theorems 2 and 4 we can derive the following result:

Theorem 5

Let

$g(z;N) = g(z;N;q) := E\big[z^{T(N;q)}\big]$.  (3.28)

Then $g(z;N)$ satisfies the recursion

$g(z;N) = z\, g\big(z; \lceil N/2 \rceil\big)\, g\big(z; \lfloor N/2 \rfloor\big) + (z - z^2)\, q^{\lfloor N/2 \rfloor}\, g\big(z; \lceil N/2 \rceil\big) + (z - z^2)\, q^N$  (3.29)

for $N \ge 2$ (of course, $g(z;1) = z$).

Notice that in the case where N=2n, formula (3.29) reduces to (2.23).

By setting z=eit in (3.29) it follows that the characteristic function

$\phi(t;N) = \phi(t;N;q) := E\big[e^{itT(N;q)}\big]$  (3.30)

of T(N)=T(N;q) satisfies the recursion

$\phi(t;N) = e^{it}\, \phi\big(t; \lceil N/2 \rceil\big)\, \phi\big(t; \lfloor N/2 \rfloor\big) + \big(e^{it} - e^{2it}\big)\, q^{\lfloor N/2 \rfloor}\, \phi\big(t; \lceil N/2 \rceil\big) + \big(e^{it} - e^{2it}\big)\, q^N, \qquad N \ge 2$,  (3.31)

with, of course, $\phi(t;1) = e^{it}$. For example, for $N = 2$ formula (3.31) confirms the value of $\phi(t;2)$ given in (2.30).

By differentiating (3.29) twice with respect to z and then setting z=1 and using (3.5) we get the following corollary:

Corollary 5

Let

$\sigma^2(N) = \sigma^2(N;q) := V[T(N;q)]$.  (3.32)

Then σ2(N) satisfies the recursion

$\sigma^2(N) = \sigma^2\big(\lceil N/2 \rceil\big) + \sigma^2\big(\lfloor N/2 \rfloor\big) + 2\mu(N)\, q^N + 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor}, \qquad N \ge 2$  (3.33)

(of course, σ2(1)=0).

In the case N=2n formula (3.33) reduces to (2.36) (e.g., if N=2, then (3.33) agrees with (2.38)).
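Combining the recursions (3.5) and (3.33) gives a quick numerical evaluation of the variance; the following Python sketch is our illustration (the name `var_tests` is not from the paper):

```python
from functools import lru_cache

def var_tests(N, q):
    """sigma^2(N;q) = V[T(N;q)] via the recursions (3.5) and (3.33)."""
    @lru_cache(maxsize=None)
    def mu(n):
        # expected number of tests, recursion (3.5)
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0

    @lru_cache(maxsize=None)
    def var(n):
        # variance, recursion (3.33)
        if n == 1:
            return 0.0
        a, b = (n + 1) // 2, n // 2          # ceil(n/2) and floor(n/2)
        return (var(a) + var(b)
                + 2 * mu(n) * q**n + 2 * mu(b) * q**b
                + q**(2 * n) - q**(2 * b) - 3 * q**n - q**b)

    return var(N)
```

For $N = 2$ this reproduces $\sigma^2(2;q) = q(1-q)(q^2 + 3q + 1)$ of (2.38), and it vanishes at $q = 0$ and $q = 1$.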

As we have mentioned, in the extreme cases where q=0 or q=1 the variable T(N) becomes deterministic and, consequently, σ2(N;0)=σ2(N;1)=0, which is in agreement with (3.33).

Corollary 6

For $q \in (0,1)$ there are constants $0 < c_1 < c_2$, depending on $q$, such that $\sigma^2(N) = V[T(N)]$ satisfies

$c_1 N \le \sigma^2(N) \le c_2 N \quad\text{for all } N \ge 2$.  (3.34)

Proof

For $N \ge 2$ we have $P\{T(N) = 1\} = q^N$, which implies that $\mu(N) \ge q^N + 2(1 - q^N) = 2 - q^N$. Hence

$2\mu(N)\, q^N + 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor} \ge 2\big(2 - q^N\big) q^N + 2\big(2 - q^{\lfloor N/2 \rfloor}\big) q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor} = q^N - q^{2N} + 3q^{\lfloor N/2 \rfloor} - 3q^{2\lfloor N/2 \rfloor} > 0$.  (3.35)

Using (3.35) in (3.33) implies

$\sigma^2(N) > \sigma^2\big(\lceil N/2 \rceil\big) + \sigma^2\big(\lfloor N/2 \rfloor\big), \qquad N \ge 2$.  (3.36)

Now, as we have already seen, $\sigma^2(1;q) \equiv 0$, while by (2.38) we have $\sigma^2(2;q) = q(1-q)(q^2 + 3q + 1)$. We can, thus, choose

$c_1 = \dfrac{\sigma^2(2;q)}{3} = \dfrac{q(1-q)(q^2 + 3q + 1)}{3} > 0$.  (3.37)

Then, (3.37) implies that $\sigma^2(2) = 3c_1 > 2c_1$, and from (3.36) we have $\sigma^2(3) > \sigma^2(2) = 3c_1$, i.e. the first inequality in (3.34) is valid for $N = 2$ and $N = 3$. Finally, the inequality $c_1 N \le \sigma^2(N)$ for every $N \ge 2$ follows easily from (3.36) by induction.

To establish the second inequality of (3.34) we set

$\Delta_\sigma(N) := \sigma^2(N+1) - \sigma^2(N)$.  (3.38)

Then, formula (3.33) implies (in view of (3.12))

$\Delta_\sigma(N) = \Delta_\sigma\big(\lfloor N/2 \rfloor\big) + \varepsilon_\sigma(N) = \Delta_\sigma(\iota(N)) + \varepsilon_\sigma(N), \qquad N \ge 2$,  (3.39)

where

$\varepsilon_\sigma(N) = \varepsilon_\sigma(N;q) := 2\big[\mu(N+1)\, q^{N+1} - \mu(N)\, q^N\big] + 2\mu\big(\lceil N/2 \rceil\big)\, q^{\lceil N/2 \rceil} - 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{\lfloor N/2 \rfloor} - q^{\lceil N/2 \rceil} + q^{2\lfloor N/2 \rfloor} - q^{2\lceil N/2 \rceil} + \big(3 - q^N - q^{N+1}\big)\, q^N (1-q)$.  (3.40)

Observe that by (3.40) and (3.1) we have that

$\varepsilon_\sigma(N) = O\big(N q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.41)

Now, as in the case of (3.18) we have, in view of (3.39),

$\Delta_\sigma(N) = \Delta_\sigma(1) + \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\sigma\big(\iota^{k-1}(N)\big), \qquad N \ge 2$,  (3.42)

where Δσ(1)=σ2(2)-σ2(1)=q(1-q)(q2+3q+1). From (3.42) and (3.41) we get that Δσ(N) is bounded. Therefore, the second inequality of (3.34) follows immediately from (3.38).

Of course, in the case where N=2n we have formula (2.42), which is much more precise than (3.34).

Let us also notice that with the help of (3.38), (3.42), and (3.15) we can get a messy, yet explicit formula for σ2(N), extending (2.32) to the case of a general N.

The behavior of T(N) as N → ∞

We start with a Lemma.

Lemma 1

Let $\sigma^2(N) = \sigma^2(N;q) = V[T(N;q)]$ be as in the previous subsection. Then for any fixed $q \in (0,1)$ we have

$\dfrac{\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} = \dfrac{1}{2} + O\!\left(\dfrac{1}{N}\right) \quad\text{and}\quad \dfrac{\sigma^2\big(\lceil N/2 \rceil\big)}{\sigma^2(N)} = \dfrac{1}{2} + O\!\left(\dfrac{1}{N}\right), \qquad N \to \infty$.  (3.43)

Proof

As we have seen in the proof of Corollary 6, Δσ(N)=σ2(N+1)-σ2(N) is bounded, hence

$\sigma^2\big(\lceil N/2 \rceil\big) = \sigma^2\big(\lfloor N/2 \rfloor\big) + O(1), \qquad N \to \infty$.  (3.44)

Now, if we divide (3.33) by σ2(N) and invoke (3.34) we get

$1 = \dfrac{\sigma^2\big(\lceil N/2 \rceil\big)}{\sigma^2(N)} + \dfrac{\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} + O\big(q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.45)

Therefore, by using (3.44) in (3.45) and invoking again (3.34) we obtain

$1 = \dfrac{2\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} + O\!\left(\dfrac{1}{N}\right), \qquad N \to \infty$,  (3.46)

which is equivalent to the first equality of (3.43). The second equality follows immediately.

In order to determine the limiting behavior of $T(N)$, as $N \to \infty$, for any fixed $q \in (0,1)$, it is natural to work with the normalized variables

$Y(N) := \dfrac{T(N) - \mu(N)}{\sigma(N)}, \qquad N = 2, 3, \ldots$  (3.47)

(Y(1) is not defined). Obviously,

$E[Y(N)] = 0 \quad\text{and}\quad V[Y(N)] = E\big[Y(N)^2\big] = 1 \quad\text{for all } N \ge 2$,  (3.48)

thus (Durrett 2005) the sequence of the distribution functions of Y(N) is tight.

Let us, also, introduce the characteristic functions

$\psi(t;N) := E\big[e^{itY(N)}\big], \qquad N = 2, 3, \ldots$  (3.49)

In view of (3.47) and (3.30) we have

$\psi(t;N) = \phi\!\left(\dfrac{t}{\sigma(N)}; N\right) e^{-it\mu(N)/\sigma(N)}$  (3.50)

and by using (3.50) in (3.31) and then invoking (3.5) we obtain the recursion

$\psi(t;N) = e^{(q^N + q^{\lfloor N/2 \rfloor})\, it/\sigma(N)}\, \psi\!\left(\dfrac{\sigma(\lceil N/2 \rceil)}{\sigma(N)}\, t; \lceil N/2 \rceil\right) \psi\!\left(\dfrac{\sigma(\lfloor N/2 \rfloor)}{\sigma(N)}\, t; \lfloor N/2 \rfloor\right) + q^{\lfloor N/2 \rfloor} \left(e^{it/\sigma(N)} - e^{2it/\sigma(N)}\right) e^{(\mu(\lceil N/2 \rceil) - \mu(N))\, it/\sigma(N)}\, \psi\!\left(\dfrac{\sigma(\lceil N/2 \rceil)}{\sigma(N)}\, t; \lceil N/2 \rceil\right) + q^N \left(e^{it/\sigma(N)} - e^{2it/\sigma(N)}\right) e^{-\mu(N)\, it/\sigma(N)}, \qquad N \ge 4$,  (3.51)

with $\psi(t;2)$ and $\psi(t;3)$ taken from (3.49). Actually, if we define $\psi(0;1) := 1$, then (3.51) is valid for $N \ge 2$.

We are now ready to present a general result, which can be viewed as an extension of Theorem 3 to the case of an arbitrary N.

Theorem 6

For $m = 1, 2, \ldots$ and $q \in (0,1)$ we have

limNEY(N)m=EZm, 3.52

where Y(N) is given by (3.47) and Z is a standard normal random variable. In other words

$\lim_{N \to \infty} E\big[Y(N)^{2\ell - 1}\big] = 0 \quad\text{and}\quad \lim_{N \to \infty} E\big[Y(N)^{2\ell}\big] = \dfrac{(2\ell)!}{2^{\ell}\, \ell!}, \qquad \ell \ge 1$.  (3.53)

Proof

We will follow the approach of the proof of Theorem 3.

We apply induction. By (3.48) we know that (3.53) is valid in the special cases $m = 1$ and $m = 2$. The inductive hypothesis is that the limit $\lim_{N \to \infty} E\big[Y(N)^k\big] = i^{-k} \lim_{N \to \infty} \psi^{(k)}(0;N)$ (where $\psi^{(k)}(t;N)$ denotes the $k$-th derivative of $\psi(t;N)$ with respect to $t$) satisfies (3.53) for $k = 1, \ldots, m-1$. Then, for $k = m$ (where $m \ge 3$) formula (3.51) implies

$\psi^{(m)}(0;N) = \dfrac{\sigma^m\big(\lceil N/2 \rceil\big)}{\sigma^m(N)}\, \psi^{(m)}\big(0; \lceil N/2 \rceil\big) + \dfrac{\sigma^m\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(m)}\big(0; \lfloor N/2 \rfloor\big) + \sum_{j=1}^{m-1} \binom{m}{j} \dfrac{\sigma^j\big(\lceil N/2 \rceil\big)\, \sigma^{m-j}\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(j)}\big(0; \lceil N/2 \rceil\big)\, \psi^{(m-j)}\big(0; \lfloor N/2 \rfloor\big) + \delta(N)$,  (3.54)

where

$\delta(N) = O\big(q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.55)

Let us set

$b(N) := \sum_{j=1}^{m-1} \binom{m}{j} \dfrac{\sigma^j\big(\lceil N/2 \rceil\big)\, \sigma^{m-j}\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(j)}\big(0; \lceil N/2 \rceil\big)\, \psi^{(m-j)}\big(0; \lfloor N/2 \rfloor\big) + \delta(N)$.  (3.56)

Then, the inductive hypothesis together with Lemma 1 and (3.55) imply that the limit $\lim_{N \to \infty} b(N)$ exists in $\mathbb{C}$. We can, therefore, apply Corollary A1 of the Appendix to (3.54) with $z(N) = \psi^{(m)}(0;N)$, $\rho_1(N) = \sigma^m\big(\lceil N/2 \rceil\big)/\sigma^m(N)$, $\rho_2(N) = \sigma^m\big(\lfloor N/2 \rfloor\big)/\sigma^m(N)$ (so that, in view of Lemma 1 and (4.6), $\rho_1 = \rho_2 = 1/2^{m/2}$), and $b(N)$ as in (3.56), to conclude that

the limit $\Lambda(m) := \lim_{N \to \infty} \psi^{(m)}(0;N)$ exists in $\mathbb{C}$.  (3.57)

Thus by taking limits in (3.54) we obtain (in view of (3.57) and Lemma 1)

$\Lambda(m) = \dfrac{\Lambda(m)}{2^{m/2}} + \dfrac{\Lambda(m)}{2^{m/2}} + \dfrac{1}{2^{m/2}} \sum_{j=1}^{m-1} \binom{m}{j} \Lambda(m-j)\, \Lambda(j)$

or

$\big(2^{m/2} - 2\big)\, \Lambda(m) = \sum_{j=1}^{m-1} \binom{m}{j} \Lambda(m-j)\, \Lambda(j)$.  (3.58)

We have, thus, shown that formula (3.58) holds for all $m \ge 1$. If $m = 1$, the sum on the right-hand side of (3.58) is empty, i.e. zero, and hence we get what we already knew, namely that $\Lambda(1) = 0$. If $m = 2$, then (3.58) becomes vacuous ($0 = 0$), but we know that $\Lambda(2) = -1$. Thus, formula (3.58) gives recursively the values of $\Lambda(m)$ for every $m \ge 3$, and by using the induction hypothesis again, namely that $\Lambda(k) = i^k E[Z^k]$ for $k = 1, \ldots, m-1$, it is straightforward to show that $\Lambda(m)$, as given by (3.58), equals $i^m E[Z^m]$ for every $m$.
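As a numerical check (ours, not part of the paper), the recursion (3.58) with the starting values $\Lambda(1) = 0$ and $\Lambda(2) = -1$ can be iterated directly; it reproduces $\Lambda(2\ell) = (-1)^{\ell} (2\ell)!/(2^{\ell} \ell!) = i^{2\ell} E[Z^{2\ell}]$ and $\Lambda(2\ell - 1) = 0$:

```python
from math import comb

def Lambda(m_max):
    """Solve (2**(m/2) - 2) * Lambda(m) = sum_j C(m, j) Lambda(j) Lambda(m - j),
    i.e. recursion (3.58), starting from Lambda(1) = 0 and Lambda(2) = -1."""
    L = {1: 0.0, 2: -1.0}
    for m in range(3, m_max + 1):
        rhs = sum(comb(m, j) * L[j] * L[m - j] for j in range(1, m))
        L[m] = rhs / (2**(m / 2) - 2)
    return L
```

For example, the recursion gives $\Lambda(4) = 3$, consistent with $i^4 E[Z^4] = 3$.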

Theorem 6 together with the fact that the sequence Y(N) is tight imply the following corollary which can be considered an extension of Corollary 2. Its proof is omitted since it is just a repetition of the proof of Corollary 2.

Corollary 7

For any fixed $q \in (0,1)$ we have

$Y(N) = \dfrac{T(N) - E[T(N)]}{\sqrt{V[T(N)]}} \xrightarrow{\ d\ } Z \quad\text{as } N \to \infty$,  (3.59)

where Z is a standard normal random variable.

Finally, let us mention that, since $\sigma(N) = \sqrt{V[T(N)]} = O\big(\sqrt{N}\big)$, a rather trivial consequence of Theorem 6 is that for any given $\epsilon > 0$ we have that

$\dfrac{T(N) - E[T(N)]}{N^{\frac{1}{2} + \epsilon}} \longrightarrow 0 \quad\text{in } L^r(\Omega, \mathcal{F}, P), \qquad N \to \infty$,  (3.60)

for any $r > 0$ and any $q \in [0,1]$.

It follows that if $N_n$, $n = 1, 2, \ldots$, is a sequence of integers (with $\lim_n N_n = \infty$) such that the limit $\lim_n E[T(N_n)]/N_n$ exists (e.g., $N_n = 2^n$ or $N_n = 3 \cdot 2^n$), then

$\dfrac{T(N_n)}{N_n} \longrightarrow \lim_{n \to \infty} \dfrac{E[T(N_n)]}{N_n}, \qquad n \to \infty$,  (3.61)

in the $L^r(\Omega, \mathcal{F}, P)$-sense, for every $r > 0$ (the case $N_n = 2^n$ is treated by Corollary 3).

Also, by Chebyshev's inequality (applied to a sufficiently high power of $\big(T(N) - E[T(N)]\big)/N^{\frac{1}{2} + \epsilon}$) and the first Borel–Cantelli lemma it follows that

$\dfrac{T(N) - E[T(N)]}{N^{\frac{1}{2} + \epsilon}} \longrightarrow 0 \quad\text{a.s.}, \qquad N \to \infty$,  (3.62)

for any $q \in [0,1]$ and any given $\epsilon > 0$. In this case, again, if $\lim_n E[T(N_n)]/N_n$ exists for some sequence $N_n$, then

$\dfrac{T(N_n)}{N_n} \longrightarrow \lim_{n \to \infty} \dfrac{E[T(N_n)]}{N_n} \quad\text{a.s.}, \qquad n \to \infty$.  (3.63)

Acknowledgements

The author wishes to thank the two anonymous referees, as well as Professor Tom Britton, associate editor of JOMB, for their constructive comments and suggestions.

Appendix

Lemma A1

Let $\varepsilon(N)$, $N = 1, 2, \ldots$, be a sequence of nonnegative real numbers such that there exists an integer $N_0 \ge 2$ for which

$\varepsilon(N) \le r_1\, \varepsilon\big(\lceil N/2 \rceil\big) + r_2\, \varepsilon\big(\lfloor N/2 \rfloor\big) + \delta(N) \quad\text{for all } N \ge N_0$,  (4.1)

where $r_1$, $r_2$ are nonnegative constants satisfying $r := r_1 + r_2 < 1$ and $\lim_{N \to \infty} \delta(N) = 0$. Then,

$\lim_{N \to \infty} \varepsilon(N) = 0$.  (4.2)

Proof

We have that $\limsup_N \varepsilon(N) = \limsup_N \varepsilon\big(\lceil N/2 \rceil\big) = \limsup_N \varepsilon\big(\lfloor N/2 \rfloor\big)$. Thus, by taking $\limsup$ on both sides of (4.1) we get

$0 \le \limsup \varepsilon(N) \le r_1 \limsup \varepsilon(N) + r_2 \limsup \varepsilon(N) + \limsup \delta(N)$,

i.e.

$0 \le \limsup \varepsilon(N) \le r\, \limsup \varepsilon(N)$.  (4.3)

Since $r < 1$, to finish the proof it suffices to show that $\varepsilon(N)$ is bounded, and we can, e.g., see that as follows. Let $\delta^* := \sup_N |\delta(N)|$ (which is finite, since $\delta(N) \to 0$) and

$M := \max\left\{ \dfrac{\delta^*}{1 - r},\ \max_{1 \le N \le N_0} \varepsilon(N) \right\}$.  (4.4)

Then (noticing that (4.4) implies that $rM + \delta^* \le M$), easy induction on $N$ shows that $\varepsilon(N) \le M$ for all $N \ge 1$.

If $r_1 + r_2 = 1$, then (4.2) need not hold. For example, if $\varepsilon(N) = r_1\, \varepsilon\big(\lceil N/2 \rceil\big) + (1 - r_1)\, \varepsilon\big(\lfloor N/2 \rfloor\big)$ for $N \ge 2$, then $\varepsilon(N) = \varepsilon(1)$ for all $N$.

Corollary A1

Let $z(N)$, $N = 1, 2, \ldots$, be a sequence of complex numbers satisfying the recursion

$z(N) = \rho_1(N)\, z\big(\lceil N/2 \rceil\big) + \rho_2(N)\, z\big(\lfloor N/2 \rfloor\big) + b(N), \qquad N \ge 2$,  (4.5)

where ρ1(N), ρ2(N), and b(N) are complex sequences such that

$\lim_{N \to \infty} \rho_1(N) = \rho_1 \in \mathbb{C}, \quad \lim_{N \to \infty} \rho_2(N) = \rho_2 \in \mathbb{C}, \quad\text{and}\quad \lim_{N \to \infty} b(N) = b \in \mathbb{C}$,  (4.6)

with |ρ1|+|ρ2|<1. Then

$\lim_{N \to \infty} z(N) = \dfrac{b}{1 - \rho_1 - \rho_2}$.  (4.7)

Proof

Pick $\epsilon > 0$ so that $|\rho_1| + |\rho_2| + 2\epsilon < 1$ and then choose $N_0 \ge 2$ so that $|\rho_1(N) - \rho_1| < \epsilon$ and $|\rho_2(N) - \rho_2| < \epsilon$ for all $N \ge N_0$. Then (4.5) implies

$|z(N)| \le \big(|\rho_1| + \epsilon\big)\, \big|z\big(\lceil N/2 \rceil\big)\big| + \big(|\rho_2| + \epsilon\big)\, \big|z\big(\lfloor N/2 \rfloor\big)\big| + b^*, \qquad N \ge N_0$,  (4.8)

where $b^* := \sup_N |b(N)| < \infty$, and the argument at the end of the proof of Lemma A1 applies to (4.8) and implies that the sequence $z(N)$ is bounded.

Next, we write (4.5) as

$z(N) - \dfrac{b}{1 - \rho_1 - \rho_2} = \rho_1 \left[ z\big(\lceil N/2 \rceil\big) - \dfrac{b}{1 - \rho_1 - \rho_2} \right] + \rho_2 \left[ z\big(\lfloor N/2 \rfloor\big) - \dfrac{b}{1 - \rho_1 - \rho_2} \right] + \delta(N)$,  (4.9)

where

$\delta(N) := \big[\rho_1(N) - \rho_1\big]\, z\big(\lceil N/2 \rceil\big) + \big[\rho_2(N) - \rho_2\big]\, z\big(\lfloor N/2 \rfloor\big) + \big[b(N) - b\big]$.  (4.10)

Notice that our assumptions for $\rho_1(N)$, $\rho_2(N)$, and $b(N)$, together with the fact that $z(N)$ is bounded, imply that $\lim_{N \to \infty} \delta(N) = 0$. Thus, if we take absolute values in (4.9) and set

$\varepsilon(N) := \left| z(N) - \dfrac{b}{1 - \rho_1 - \rho_2} \right|$,  (4.11)

we obtain

$\varepsilon(N) \le |\rho_1|\, \varepsilon\big(\lceil N/2 \rceil\big) + |\rho_2|\, \varepsilon\big(\lfloor N/2 \rfloor\big) + |\delta(N)|, \qquad N \ge 2$,  (4.12)

hence (since $|\rho_1| + |\rho_2| < 1$) Lemma A1 implies that $\lim_{N \to \infty} \varepsilon(N) = 0$.
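A small numerical illustration of Corollary A1 (ours, with arbitrarily chosen example coefficients): iterating (4.5) with constant $\rho_1(N) = \rho_2(N) = 1/4$ and $b(N) = 1$ drives $z(N)$ toward $b/(1 - \rho_1 - \rho_2) = 2$, regardless of the starting value $z(1)$.

```python
def iterate_recursion(N_max, rho1, rho2, b):
    """Compute z(N_max) from the recursion (4.5):
    z(N) = rho1(N) z(ceil(N/2)) + rho2(N) z(floor(N/2)) + b(N),
    starting from an arbitrary value z(1)."""
    z = [None, 1.0]                        # z(1) = 1, say
    for N in range(2, N_max + 1):
        z.append(rho1(N) * z[(N + 1) // 2] + rho2(N) * z[N // 2] + b(N))
    return z[N_max]
```

Since the error contracts by a factor $r = 1/2$ at each halving level, $z(2^n)$ is within about $2^{-n}$ of the limit.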

Finally, by using the above arguments we can also show the following simpler result:

Corollary A2

Let $z_n$, $n = 0, 1, \ldots$, be a sequence of complex numbers satisfying the recursion

zn=ρzn-1+bn,n=1,2,, 4.13

where $|\rho| < 1$ and $\lim_n b_n$ exists in $\mathbb{C}$. Then,

$\lim_{n \to \infty} z_n = \dfrac{\lim_{n \to \infty} b_n}{1 - \rho}$.  (4.14)


References

  1. Aldridge M (2019) Rates of adaptive group testing in the linear regime. In: IEEE International Symposium on Information Theory (ISIT), Paris, France, pp 236–240. doi:10.1109/ISIT.2019.8849712
  2. Aldridge M (2020) Conservative two-stage group testing. arXiv:2005.06617v1 [stat.AP]
  3. Armendáriz I, Ferrari PA, Fraiman D, Ponce Dawson S (2020) Group testing with nested pools. arXiv:2005.13650v2 [math.ST]
  4. Chung KL (2001) A course in probability theory, 3rd edn. Academic Press, San Diego
  5. Dorfman R (1943) The detection of defective members of large populations. Ann Math Stat 14(4):436–440. doi:10.1214/aoms/1177731363
  6. Durrett R (2005) Probability: theory and examples, 3rd edn. Duxbury Advanced Series, Brooks/Cole–Thomson Learning, Belmont
  7. Gollier C, Gossner O (2020) Group testing against Covid-19. Covid Econ 1(2):32–42
  8. Hwang FK (1972) A method for detecting all defective members in a population by group testing. J Am Stat Assoc 67(339):605–608. doi:10.1080/01621459.1972.10481257
  9. Malinovsky Y (2019) Sterrett procedure for the generalized group testing problem. Methodol Comput Appl Probab 21:829–840. doi:10.1007/s11009-017-9601-4
  10. Mallapaty S (2020) The mathematical strategy that could transform coronavirus testing. Nature 583:504–505. doi:10.1038/d41586-020-02053-6
  11. Sobel M, Groll PA (1959) Group testing to eliminate efficiently all defectives in a binomial sample. Bell Syst Tech J 38(5):1179–1252. doi:10.1002/j.1538-7305.1959.tb03914.x
  12. Ungar P (1960) The cut off point for group testing. Commun Pure Appl Math 13(1):49–54. doi:10.1002/cpa.3160130105

Articles from Journal of Mathematical Biology are provided here courtesy of Nature Publishing Group
