2021 Sep 20;83(4):35. doi: 10.1007/s00285-021-01663-6

A binary search scheme for determining all contaminated specimens

Vassilis G. Papanicolaou
PMCID: PMC8450752  PMID: 34542723

Abstract

Specimens are collected from N different sources. Each specimen has probability p of being contaminated (in the case of a disease, e.g., p is the prevalence rate), independently of the other specimens. Suppose we can apply group testing, namely take small portions from several specimens, mix them together, and test the mixture for contamination, so that if the test turns out positive, then at least one of the specimens in the mixture is contaminated. In this paper we give a detailed probabilistic analysis of a binary search scheme that we propose for determining all contaminated specimens. More precisely, we study the number T(N) of tests required in order to find all the contaminated specimens, if this search scheme is applied. We derive recursive and, in some cases, explicit formulas for the expectation, the variance, and the characteristic function of T(N). Also, we determine the asymptotic behavior of the moments of T(N) as N → ∞, and from that we obtain the limiting distribution of T(N) (appropriately normalized), which turns out to be normal.

Keywords: Prevalence (rate), Adaptive group testing, Binary search scheme, Linear regime, Probabilistic testing, Average-case aspect ratio, Characteristic function, Moments, Limiting distribution, Normal distribution

Introduction

Consider N containers containing samples from N different sources (e.g., clinical specimens from N different patients, water samples from N different lakes, etc.). For each sample we assume that the probability of being contaminated (by, say, a virus, a toxic substance, etc.) is p and the probability that it is not contaminated is q:=1-p, independently of the other samples. All N samples must undergo a screening procedure (say a molecular or antibody test in the case of a viral contamination, a radiation measurement in the case of a radioactive contamination, etc.) in order to determine exactly which are the contaminated ones.

One obvious way to identify all contaminated specimens is the individual testing, namely to test the specimens one by one individually (of course, this approach requires N tests). In this work we analyze a group testing (or pool testing) approach, which can be called “binary search scheme” and requires a random number of tests. In particular, we will see that if p is not too big, typically if p<0.224, then by following this scheme, the expected number of tests required in order to determine all contaminated samples can be made strictly less than N. There is one requirement, though, for implementing this scheme, namely that each sample can undergo many tests (or that the sample quantity contained in each container suffices for many tests).

Group testing procedures have broad applications in areas ranging from medicine and engineering to airport security control, and they have attracted the attention of researchers for over seventy-five years (see, e.g., Dorfman’s seminal paper (Dorfman 1943), as well as Hwang 1972; Sobel and Groll 1959; Ungar 1960, etc.). In fact, recently, group testing has received some special attention, partly due to the COVID-19 crisis (see, e.g., Aldridge 2019, 2020; Armendáriz et al. 2020; Gollier and Gossner 2020; Malinovsky 2019; Mallapaty 2020, and the references therein).

The contribution of the present work is a detailed quantitative analysis of a specific group testing procedure which is quite natural and easy to implement. The scheme goes as follows: First we take samples from each of the N containers, mix them together and test the mixture. If the mixture is not contaminated, we know that none of the samples is contaminated. If the mixture is contaminated, we split the containers into two groups, where the first group consists of the first ⌊N/2⌋ containers and the second group consists of the remaining ⌈N/2⌉ containers. Then we take samples from the containers of the first group, mix them together and test the mixture. If the mixture is not contaminated, then we know that none of the samples of the first group is contaminated. If the mixture is contaminated, we split the containers of the first group into two subgroups (in the same way we split the N containers) and continue the procedure. We also apply the same procedure to the second group (consisting of ⌈N/2⌉ containers).

The main quantity of interest in the present paper is the number T(N) of tests required in order to determine all contaminated samples, if the above binary search scheme is applied.

Following the terminology of Aldridge (2019) we can characterize the aforementioned procedure as adaptive probabilistic group testing in the linear regime (“linear” since the average number of contaminated specimens is pN, hence grows linearly with N). “Adaptive” refers to the fact that the way we perform a stage of the testing procedure depends on the outcome of the previous stages. “Probabilistic” refers to the fact that each specimen has probability p of being contaminated, independently of the other specimens, and, furthermore, that the required number of tests, T(N), is a random variable. Let us be a bit more specific about the latter. Suppose ξ_1, ξ_2, … is a sequence of independent Bernoulli random variables on a probability space (Ω, F, P), with P{ξ_k = 1} = p and P{ξ_k = 0} = q, k = 1, 2, …, so that {ξ_k = 1} is the event that the k-th specimen is contaminated. Then, for each N, the quantity T(N) of our binary search scheme is a specific function of ξ_1, ξ_2, …, ξ_N, say,

T(N) = T_N(ξ_1, ξ_2, …, ξ_N).   (1.1)

In particular, it is not hard to see that

1 = T_N(0, 0, …, 0) ≤ T_N(ξ_1, ξ_2, …, ξ_N) ≤ T_N(1, 1, …, 1) = 2N - 1,   (1.2)

i.e. if none of the contents of the N containers is contaminated, then T(N) = 1, whereas if all N containers contain contaminated samples, then by easy induction on N we can see that T(N) = 2N - 1 (observe that if ξ_1 = ⋯ = ξ_N = 1, then T(1) = 1, while for N ≥ 2 we have T(N) = T(⌊N/2⌋) + T(⌈N/2⌉) + 1). Thus, T(N) can become bigger than N, whereas checking the samples one by one always requires exactly N tests (i.e. in the case of individual testing we would have T_N ≡ N).

Let us mention that, for a given p, the optimal testing procedure, namely the procedure which minimizes the expectation of the number of tests required in order to determine (with zero-error) all contaminated specimens in an adaptive probabilistic group testing, remains unknown (Malinovsky 2019). However, Ungar (1960) has shown that if

p ≥ p* := (3 - √5)/2 ≈ 0.382,   (1.3)

then the optimal procedure is individual testing. In practice, though, p is usually quite small, much smaller than p*.

We do not claim that our proposed procedure is optimal. However, for small values of p the numerical evidence we have (see Tables 1 and 2 in Sects. 2.1 and 3.1 respectively) suggests that, in comparison with the optimal results obtained via simulation by Aldridge (2019, 2020), our binary scheme is near-optimal.

Table 1.

Values of E[W_n]/N for certain choices of N = 2^n and p

p=.005 p=.01 p=.05 p=.1 p=.15 p=.2
N=1 1 1 1 1 1 1
N=2 .505 .515 .575 .645 .715 .780
N=4 .265 .280 .395 .528 .653 .768
N=8 .148 .169 .335 .518 .679 .820
N=16 .092 .121 .328 .542 .719 .871
N=32 .068 .103 .340 .566 .748 .901
N=64 .059 .099 .352 .581 .763 .917
N=128 .057 .101 .359 .589 .771 .924
N=256 .058 .103 .363 .593 .775 .928
N=512 .059 .105 .365 .595 .777 .930
N=1024 .060 .106 .366 .596 .778 .931

Table 2.

Values of E[T(N)]/N for certain choices of N ≠ 2^n and p

p=.005 p=.01 p=.05 p=.1 p=.15 p=.2
N=3 .345 .357 .447 .554 .654 .749
N=5 .217 .234 .362 .510 .645 .768
N=6 .186 .205 .348 .510 .656 .787
N=7 .163 .184 .337 .510 .663 .800
N=12 .110 .136 .325 .526 .696 .843
N=48 .061 .099 .345 .571 .751 .902
N=96 .057 .099 .354 .582 .762 .912
N=200 .057 .101 .358 .585 .765 .917
N=389 .058 .104 .362 .589 .769 .920
N=768 .059 .105 .363 .591 .771 .922
N=1000 .060 .106 .365 .594 .776 .929

Let us also mention that our “binary search scheme” is different from the “binary splitting” procedures of Sobel and Groll (1959) (see, also, Aldridge (2019) and Hwang (1972)), where at each stage a group is split into two subgroups whose sizes may not be equal or even nearly equal. This feature makes those binary splitting procedures very hard to analyze theoretically, while the advantage of our proposed procedure is that, due to its simplicity, it allows us to calculate certain relevant quantities (e.g., the expectation E[T(N)] and the variance V[T(N)]) explicitly. We even managed to determine the limiting distribution of T(N) (appropriately normalized), which turns out to be normal.

Finally, let us mention that in Armendáriz et al. (2020) the authors consider optimality in the case where the pool sizes are powers of 3 (except possibly for an extra factor of 4 for the first pool). In the case where N = 100 and p = 0.02 their average number of tests in order to determine the contaminated specimens is 20, thus their average-case aspect ratio is 0.2. Tables 1 and 2 indicate that our proposed strategy gives comparable, if not better, results.

In Sect. 2 we study the special case where N = 2^n. Here, the random variable T(2^n) is denoted by W_n. We present explicit formulas for the expectation (Theorem 1) and the variance (Corollary 1) of W_n. In addition, in formulas (2.12) and (2.42) we give the asymptotic behavior of E[W_n] and V[W_n] respectively, as n → ∞. Then, we determine the leading asymptotics of all the moments of W_n (Theorem 3) and from that we conclude (Corollary 2) that, under an appropriate normalization, W_n converges in distribution to a normal random variable. Finally, at the end of the section, in Corollary 3 and in formula (2.75) we give a couple of Law-of-Large-Numbers-type results for W_n.

Section 3 studies the case of an arbitrary N. In Theorem 4, Theorem 5, and Corollary 5 respectively we derive recursive formulas for μ(N) := E[T(N)], g(z; N) := E[z^{T(N)}], and σ²(N) := V[T(N)], while Corollary 4 gives an explicit, albeit rather messy, formula for E[T(N)]. We also demonstrate (see Remark 3) the nonexistence of the limit of E[T(N)]/N as N → ∞, which is in contrast to the special case N = 2^n, where the limit exists. In Sect. 3.2 we show that the moments of Y(N) := [T(N) - μ(N)]/σ(N) converge to the moments of the standard normal variable Z as N → ∞. An immediate consequence is (Corollary 7) that Y(N) converges to Z in distribution.

Here, let us point out that the determination of the limiting behavior of a random quantity is a fundamental issue in many probabilistic models. Thus, our result that [T(N) - μ(N)]/σ(N) converges in distribution to the standard normal variable Z, and that the convergence is good to the point that we also have convergence of all moments to the corresponding moments of Z, gives much more information about the testing procedure than the information we obtain from the behavior of E[T(N)]. And since the number N can be of the order of 10² or even 10³, and the explicit results can get very messy for such big values of N, the asymptotic behavior of T(N) enables us to obtain valuable information without much effort.

At the end of the paper we have included a brief appendix (Sect. 4) containing a lemma and two corollaries, which are used in the proofs of some of the results of Sects. 2 and 3.

The case N = 2^n

We first consider the case where the number of containers is a power of 2, namely

N = 2^n,  where n = 0, 1, ….   (2.1)

As we have said in the introduction, the first step is to test a pool containing samples from all N = 2^n containers. If this pool is not contaminated, then none of the contents of the N containers is contaminated and we are done. If the pool is contaminated, we make two subpools, one containing samples of the first 2^{n-1} containers and the other containing samples of the remaining 2^{n-1} containers. We continue by testing the first of those subpools. If it is contaminated we split it again into two subpools of 2^{n-2} samples each and keep going. We also repeat the same procedure for the second subpool of the 2^{n-1} samples.

One important detail here is that if the first subpool of the 2^{n-1} samples turns out not contaminated, then we are sure that the second subpool of the remaining 2^{n-1} samples must be contaminated, hence we can save one test and immediately proceed with the splitting of the second subpool into two subpools of 2^{n-2} samples each. Likewise, suppose that at some step of the procedure a subpool of 2^{n-k} samples is found contaminated. Then this subpool is split further into two other subpools, each containing 2^{n-k-1} samples. If the first of these subpools is found not contaminated, then we automatically know that the other is contaminated and, consequently, we can save one test.

Let Wn be the number of tests required to find all contaminated samples by following the above procedure. Thus by (1.2) we have

1 ≤ W_n ≤ 2^{n+1} - 1 = 2N - 1   (2.2)

(in the trivial case n = 0, i.e., N = 1, we, of course, have W_0 = 1).

Example 1

Suppose that N=4 (hence n=2). In this case W2 can take any value between 1 and 7, except for 2. For instance in the case where the samples of the first three containers are not contaminated, while the content of the fourth container is contaminated, we have W2=T4(0,0,0,1)=3. On the other hand, in the case where the samples of the first and the third container are contaminated, while the samples of the second and fourth container are not contaminated, we have W2=T4(1,0,1,0)=7.
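To make the test-counting concrete, here is a small Python sketch of the scheme (our own illustration; the function names are hypothetical, not from the paper). A pooled test is modeled simply as asking whether the 0/1 contamination pattern contains a 1, and the test-saving rule described above appears in the `else` branch; the code reproduces the values of Example 1.

```python
def num_tests(xs):
    """Total number of tests T used by the binary search scheme on the
    contamination pattern xs (1 = contaminated, 0 = clean)."""
    if not any(xs):
        return 1             # the first pooled test comes back clean
    return 1 + _resolve(xs)  # first pooled test is positive

def _resolve(xs):
    """Tests needed to resolve a group already known to be contaminated."""
    n = len(xs)
    if n == 1:
        return 0             # a singleton known to be contaminated needs no test
    first, second = xs[:n // 2], xs[n // 2:]
    t = 1                    # test the first subpool
    if any(first):
        t += _resolve(first)
        t += 1               # the second subpool must still be tested
        if any(second):
            t += _resolve(second)
    else:
        t += _resolve(second)  # first subpool clean: second is surely contaminated
    return t

# Example 1 (N = 4):
assert num_tests([0, 0, 0, 1]) == 3
assert num_tests([1, 0, 1, 0]) == 7
# The extreme cases of (1.2):
assert num_tests([0] * 8) == 1 and num_tests([1] * 8) == 2 * 8 - 1
```

For N = 2^n the two subpools always have equal size; for general N the split into ⌊N/2⌋ and ⌈N/2⌉ matches the description in Sect. 3.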

The expectation of Wn

Theorem 1

Let N = 2^n and let W_n be as above. Then

E[W_n] = 2^{n+1} - 1 - 2^n ∑_{k=1}^{n} (q^{2^k} + q^{2^{k-1}})/2^k,  n = 0, 1, …   (2.3)

(in the trivial case n = 0 the sum is empty, i.e. 0), where, as mentioned in the introduction, q is the probability that a sample is not contaminated.

Proof

Assume n ≥ 1 and let D_n be the event that none of the 2^n samples is contaminated. Then

E[W_n] = E[W_n | D_n] P(D_n) + E[W_n | D_n^c] P(D_n^c),

and hence

E[W_n] = q^{2^n} + u_n (1 - q^{2^n}),   (2.4)

where for typographical convenience we have set

u_n := E[W_n | D_n^c].   (2.5)

In order to find a recursive formula for u_n let us first consider the event A_n that in the group of the 2^n containers none of the first 2^{n-1} contains contaminated samples. Clearly, D_n ⊆ A_n and

P(A_n | D_n^c) = P(A_n ∩ D_n^c)/P(D_n^c) = [P(A_n) - P(D_n)]/P(D_n^c) = (q^{2^{n-1}} - q^{2^n})/(1 - q^{2^n}) = q^{2^{n-1}}/(1 + q^{2^{n-1}}).   (2.6)

Likewise, if B_n is the event that in the group of the 2^n containers none of the last 2^{n-1} contains contaminated samples, then

P(B_n | D_n^c) = q^{2^{n-1}}/(1 + q^{2^{n-1}}).   (2.7)

Let us also notice that A_n ∩ B_n = D_n, hence P(A_n ∩ B_n | D_n^c) = 0.

Now, (i) given A_n and D_n^c we have that W_n =_d 1 + W_{n-1}, where the notation X =_d Y signifies that the random variables X and Y have the same distribution. To better justify this distributional equality we observe that, in the case N = 2^n, formula (1.1) takes the form W_n = T(N) = T_N(ξ_1, …, ξ_N) and W_{n-1} = T(N/2) = T_{N/2}(ξ_1, …, ξ_{N/2}). Thus, W_{n-1} =_d T_{N/2}(ξ_{(N/2)+1}, …, ξ_N). Now, D_n = {ξ_1 = ξ_2 = ⋯ = ξ_N = 0} and A_n = {ξ_1 = ξ_2 = ⋯ = ξ_{N/2} = 0}. Hence, the given event A_n ∩ D_n^c means that only some of the variables ξ_{(N/2)+1}, …, ξ_N may differ from 0. Consequently, given A_n ∩ D_n^c we have that W_n =_d 1 + W_{n-1}.

(ii) Similarly, given B_n and D_n^c we have that W_n =_d 2 + W_{n-1}.

Finally, (iii) given (A_n ∪ B_n)^c and D_n^c we have that W_n =_d 1 + W_{n-1} + W̃_{n-1}, where W̃_{n-1} is an independent copy of W_{n-1}. Thus (given D_n^c) by conditioning on the events A_n, B_n, and (A_n ∪ B_n)^c we get, in view of (2.5), (2.6), and (2.7),

u_n = (1 + u_{n-1}) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + (2 + u_{n-1}) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + (1 + 2u_{n-1}) (1 - q^{2^{n-1}})/(1 + q^{2^{n-1}})

or

u_n = [2/(1 + q^{2^{n-1}})] u_{n-1} + (1 + 2q^{2^{n-1}})/(1 + q^{2^{n-1}}).   (2.8)

By replacing n by n-1 in (2.4) we get

E[W_{n-1}] = q^{2^{n-1}} + u_{n-1} (1 - q^{2^{n-1}}).   (2.9)

Therefore, by using (2.8) in (2.4) and (2.9) we can eliminate un, un-1 and obtain

E[W_n] = 2E[W_{n-1}] - q^{2^n} - q^{2^{n-1}} + 1.   (2.10)

Finally, (2.3) follows easily from (2.10) and the fact that E[W0]=1.

For instance, in the cases n=1, n=2, and n=10 formula (2.3) becomes

E[W_1] = 3 - q - q²,  E[W_2] = 7 - 2q - 3q² - q⁴,  and
E[W_{10}] = 2047 - 512q - 768q² - 384q⁴ - 192q⁸ - 96q^{16} - 48q^{32} - 24q^{64} - 12q^{128} - 6q^{256} - 3q^{512} - q^{1024}   (2.11)

respectively. In the extreme case q = 0 we know that E[W_n] = 2^{n+1} - 1, while in the other extreme case q = 1 we know that E[W_n] = 1, and these values agree with the ones given by (2.3).
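As a quick numerical check (our own sketch, not part of the paper), one can evaluate (2.3) directly and verify both the recursion (2.10) and the entries of Table 1; for instance, for N = 16 and p = .05 the average-case aspect ratio E[W_4]/16 rounds to .328.

```python
def mean_W(n, q):
    """E[W_n] via the explicit formula (2.3)."""
    s = sum((q ** (2 ** k) + q ** (2 ** (k - 1))) / 2 ** k for k in range(1, n + 1))
    return 2 ** (n + 1) - 1 - 2 ** n * s

# The recursion (2.10): E[W_n] = 2 E[W_{n-1}] - q^{2^n} - q^{2^{n-1}} + 1
q = 0.35
for n in range(1, 11):
    lhs = mean_W(n, q)
    rhs = 2 * mean_W(n - 1, q) - q ** (2 ** n) - q ** (2 ** (n - 1)) + 1
    assert abs(lhs - rhs) < 1e-9

# Table 1, column p = .05 (q = .95), row N = 16:
assert abs(mean_W(4, 0.95) / 16 - 0.328) < 5e-4
```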

Let us also notice that formula (2.3) implies that for any given q ∈ [0, 1) we have

E[W_n] = 2^n α_1(q) - 1 + O(q^{2^n}),  n → ∞,   (2.12)

where

α_1(q) := 2 - ∑_{k=1}^{∞} (q^{2^k} + q^{2^{k-1}})/2^k = 2 - q/2 - (3/2) ∑_{k=1}^{∞} q^{2^k}/2^k.   (2.13)

It is clear that α1(q) is a power series about q=0 with radius of convergence equal to 1 (actually, it is well known that the unit circle is the natural boundary of this power series), which is strictly decreasing on [0, 1] with α1(0)=2 and α1(1)=0. Furthermore, it is not hard to see that it satisfies the functional equation

α_1(q²) = 2α_1(q) + q² + q - 2.   (2.14)

Also, by using the inequality

∫_1^∞ q^{2^x} dx < ∑_{k=1}^{∞} q^{2^k} < 1 + ∫_1^∞ q^{2^x} dx,  q ∈ (0, 1),

and by estimating the integral as q → 1^- one can show that

α_1′(q) = [3/(2 ln 2)] ln(1 - q) + O(1),  q → 1^-.   (2.15)

Remark 1

By dividing both sides of (2.3) by 2^n and comparing with (2.13) we can see that, as n → ∞,

E[W_n]/2^n → α_1(q)  uniformly in q ∈ [0, 1].   (2.16)

Actually, the convergence is much stronger, since for every m = 1, 2, … the m-th derivative of E[W_n]/2^n with respect to q converges to the m-th derivative of α_1(q) uniformly on compact subsets of [0, 1).

The expectation E[W_n] and, consequently, the average-case aspect ratio E[W_n]/2^n decrease as q approaches 1 (equivalently, as p approaches 0). As an illustration, by employing (2.3) we have created Table 1.

(by comparing the values of Table 1 with the graphs, given in Aldridge (2019) and Aldridge (2020), of the optimal average-case aspect ratio as a function of p, we can see that, if p ≤ 0.15, our procedure is near-optimal).

If q is such that

E[W_n] < 2^n = N,   (2.17)

then, by applying the aforementioned testing strategy, the number of tests required to determine all containers with contaminated content is, on the average, less than N, namely less than the number of tests required to check the containers one by one.

Let us set

μ_n = μ_n(q) := E[W_n].   (2.18)

It is clear from (2.3) that for n ≥ 1 the quantity μ_n(q) is a polynomial in q of degree 2^n, strictly decreasing on [0, 1], with μ_n(0) = 2^{n+1} - 1 = 2N - 1 and μ_n(1) = 1. Thus, there is a unique q_n ∈ (0, 1) such that

μ_n(q_n) = 2^n   (2.19)

and (2.17) holds if q ∈ (q_n, 1] or, equivalently, if p ∈ [0, 1 - q_n). Therefore, it only makes sense to consider the procedure if q > q_n. Alternatively, if q is given, one may try to find the optimal value, say m, of n which minimizes μ_n(q)/2^n and, then, split the N specimens into subgroups of size 2^m.

As an example, let us give the (approximate) values of q_n for n = 0, 1, …, 10:

q_0 = 0,  q_1 = (√5 - 1)/2 ≈ .618,  q_2 ≈ .685,  q_3 ≈ .727,  q_4 ≈ .751,  q_5 ≈ .763,  q_6 ≈ .769,  q_7 ≈ .772,  q_8 ≈ .774,  q_9 ≈ .775,  q_{10} ≈ .775,

where q_0 = 0 is a convention. Notice that q_1 = 1 - p*, where p* is as in (1.3). Thus, q_1 is the threshold found by Ungar (1960) below which group testing does not improve on individual testing.

In view of (2.3) and (2.18), formula (2.19) becomes

∑_{k=1}^{n} (q_n^{2^k} + q_n^{2^{k-1}})/2^k = 1 - 1/2^n,   (2.20)

which by letting n → ∞ yields

q_n → q_∞,  where α_1(q_∞) = 1,   (2.21)

α_1(q) being the function defined in (2.13). In fact, .775 < q_∞ < .776. Furthermore, by formula (2.20) we have

∑_{k=1}^{n} (q_n^{2^k} + q_n^{2^{k-1}})/2^k = ∑_{k=1}^{n} (q_{n+1}^{2^k} + q_{n+1}^{2^{k-1}})/2^k - (1 - q_{n+1}^{2^n} - q_{n+1}^{2^{n+1}})/2^{n+1},

which implies that the sequence q_n is strictly increasing. Therefore, if q > .776, equivalently if p < .224, then E[W_n] < 2^n = N for every n ≥ 1.
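The threshold q_∞ can be located numerically; the sketch below (ours, not from the paper) truncates the series (2.13) — the terms q^{2^k} decay doubly exponentially, so a few dozen terms suffice — and bisects α_1(q) = 1, confirming .775 < q_∞ < .776.

```python
def alpha1(q, terms=60):
    """alpha_1(q) = 2 - q/2 - (3/2) * sum_{k>=1} q^{2^k}/2^k, truncated (2.13)."""
    s = 0.0
    for k in range(1, terms + 1):
        t = q ** (2 ** k) / 2 ** k
        s += t
        if t < 1e-18:        # doubly exponential decay: safe to stop
            break
    return 2 - q / 2 - 1.5 * s

# Bisection for alpha_1(q) = 1 on (0, 1); alpha_1 is strictly decreasing.
lo, hi = 0.5, 0.999
for _ in range(60):
    mid = (lo + hi) / 2
    if alpha1(mid) > 1:
        lo = mid
    else:
        hi = mid
q_inf = (lo + hi) / 2
assert 0.775 < q_inf < 0.776
```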

The behavior of W_n as n → ∞

We start with a recursive formula for the generating function of Wn.

Theorem 2

Let

g_n(z) := E[z^{W_n}].   (2.22)

Then

g_n(z) = z g_{n-1}(z)² + (z - z²) q^{2^{n-1}} g_{n-1}(z) + (z - z²) q^{2^n},  n = 1, 2, …   (2.23)

(clearly, g0(z)=z).

Proof

With Dn as in the proof of Theorem 1 we have

g_n(z) = E[z^{W_n}] = E[z^{W_n} | D_n] P(D_n) + E[z^{W_n} | D_n^c] P(D_n^c),

thus

g_n(z) = z q^{2^n} + h_n(z)(1 - q^{2^n}),   (2.24)

where

h_n(z) := E[z^{W_n} | D_n^c].   (2.25)

Let A_n and B_n be the events introduced in the proof of Theorem 1. Then, given A_n and D_n^c we have that z^{W_n} =_d z·z^{W_{n-1}}; given B_n and D_n^c we have that z^{W_n} =_d z²·z^{W_{n-1}}; finally, given (A_n ∪ B_n)^c and D_n^c we have that z^{W_n} =_d z·z^{W_{n-1}}·z^{W̃_{n-1}}, where W̃_{n-1} is an independent copy of W_{n-1}. Thus (given D_n^c) by conditioning on the events A_n, B_n, and (A_n ∪ B_n)^c we get, in view of (2.25), (2.6), and (2.7),

h_n(z) = (z + z²) h_{n-1}(z) q^{2^{n-1}}/(1 + q^{2^{n-1}}) + z h_{n-1}(z)² (1 - q^{2^{n-1}})/(1 + q^{2^{n-1}}).   (2.26)

By replacing n by n-1 in (2.24) we get

g_{n-1}(z) = z q^{2^{n-1}} + h_{n-1}(z)(1 - q^{2^{n-1}}),   (2.27)

and (2.23) follows by eliminating hn(z) and hn-1(z) from (2.24), (2.26), and (2.27).

By setting z = e^{it} in (2.23) it follows that the characteristic function

ϕ_n(t) := E[e^{itW_n}]   (2.28)

of Wn satisfies the recursion

ϕ_n(t) = e^{it} ϕ_{n-1}(t)² + (e^{it} - e^{2it}) q^{2^{n-1}} ϕ_{n-1}(t) + (e^{it} - e^{2it}) q^{2^n},  n = 1, 2, …,   (2.29)

with, of course, ϕ_0(t) = e^{it}.

For instance, for n=1 formula (2.29) yields

ϕ_1(t) = e^{3it} + q(e^{it} - e^{2it}) e^{it} + q²(e^{it} - e^{2it}).   (2.30)

Corollary 1

Let

σ_n² = σ_n²(q) := V[W_n].   (2.31)

Then

σ_n²(q) = 2^{n+1} ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) + 2^n ∑_{k=1}^{n} (q^{2^{k+1}} + q^{3·2^{k-1}} - 5q^{2^k} - 3q^{2^{k-1}})/2^k - 2^n ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k} (q^{2^j} + q^{2^{j-1}})/2^j,  n = 0, 1, …   (2.32)

(in the trivial case n=0 all the above sums are empty, i.e. 0).

Proof

Differentiating (2.23) twice with respect to z and then setting z=1 yields

g_n″(1) = 2g_{n-1}″(1) + 2g_{n-1}′(1)² - 2q^{2^{n-1}} g_{n-1}′(1) + 4g_{n-1}′(1) - 2q^{2^n} - 2q^{2^{n-1}},   (2.33)

where, in view of (2.22),

g_n″(1) = E[W_n(W_n - 1)]  and  g_n′(1) = E[W_n].   (2.34)

Thus,

σ_n² = g_n″(1) + g_n′(1) - g_n′(1)²   (2.35)

and by using (2.10) and (2.35) in (2.33) we obtain

σ_n² = 2σ_{n-1}² + (2q^{2^n} + q^{2^{n-1}}) E[W_n] + q^{2^{n+1}} + q^{3·2^{n-1}} - 3q^{2^n} - 2q^{2^{n-1}}.   (2.36)

Finally, formula (2.36) together with the fact that σ_0² = V[W_0] = 0 imply

σ_n² = 2^n ∑_{k=1}^{n} [(2q^{2^k} + q^{2^{k-1}}) E[W_k] + q^{2^{k+1}} + q^{3·2^{k-1}} - 3q^{2^k} - 2q^{2^{k-1}}]/2^k,   (2.37)

from which (2.32) follows by invoking (2.3).

As we have mentioned, in the extreme cases where q = 0 or q = 1 the variable W_n becomes deterministic and, consequently, σ_n²(0) = σ_n²(1) = 0, which is in agreement with (2.32). For example, setting n = 1 and n = 2 in formula (2.32) yields

σ_1²(q) = V[W_1] = -q⁴ - 2q³ + 2q² + q = q(1 - q)(q² + 3q + 1)   (2.38)

and

σ_2²(q) = V[W_2] = q(1 - q)(q⁶ + q⁵ + 7q⁴ + 11q³ + 5q² + 11q + 2).   (2.39)

Notice that the maximum of σ_1²(q) on [0, 1] is attained at q ≈ .6462. Actually, both dσ_1²(q)/dq and d²σ_1²(q)/dq² have one (simple) zero in [0, 1]. The maximum of σ_2²(q) on [0, 1] is attained at q ≈ .744, while both dσ_2²(q)/dq and d²σ_2²(q)/dq² have one simple zero in [0, 1].
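The variance formulas can be cross-checked numerically (our own sketch): starting from σ_1² in (2.38) and E[W_2] = 7 - 2q - 3q² - q⁴ from (2.11), one step of the recursion (2.36) with n = 2 must reproduce the closed form (2.39), and it does, for every q.

```python
def var_W1(q):
    # sigma_1^2, formula (2.38)
    return q * (1 - q) * (q ** 2 + 3 * q + 1)

def var_W2(q):
    # sigma_2^2, formula (2.39)
    return q * (1 - q) * (q**6 + q**5 + 7*q**4 + 11*q**3 + 5*q**2 + 11*q + 2)

def var_W2_from_recursion(q):
    # one step of (2.36) with n = 2: exponents 2^n = 4, 2^{n-1} = 2, 2^{n+1} = 8, 3*2^{n-1} = 6
    EW2 = 7 - 2*q - 3*q**2 - q**4                      # formula (2.11)
    return (2 * var_W1(q) + (2*q**4 + q**2) * EW2
            + q**8 + q**6 - 3*q**4 - 2*q**2)

for q in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    assert abs(var_W2(q) - var_W2_from_recursion(q)) < 1e-12
```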

Let us also notice that, since

∑_{j=1}^{k} (q^{2^j} + q^{2^{j-1}})/2^j = q/2 + q^{2^k}/2^k + (3/2) ∑_{j=1}^{k-1} q^{2^j}/2^j,   (2.40)

formula (2.32) can be also written as

σ_n²(q) = 2^n (2 - q/2) ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) - 2^n ∑_{k=1}^{n} (q^{2^{k+1}} + 5q^{2^k} + 3q^{2^{k-1}})/2^k - 2^n (3/2) ∑_{k=1}^{n} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j,  n = 0, 1, …   (2.41)

From formula (2.41) it follows that for any given q ∈ [0, 1) we have

σ_n² = V[W_n] = 2^n β(q) + O(2^n q^{2^n}),  n → ∞,   (2.42)

where

β(q) := (2 - q/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) - ∑_{k=1}^{∞} (q^{2^{k+1}} + 5q^{2^k} + 3q^{2^{k-1}})/2^k - (3/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j.   (2.43)

An equivalent way to write β(q) is

β(q) = q(q + 1)/2 + (6 - 3q/2) ∑_{k=1}^{∞} q^{2^k} - (17/2) ∑_{k=1}^{∞} q^{2^k}/2^k - (3/2) ∑_{k=1}^{∞} (2q^{2^k} + q^{2^{k-1}}) ∑_{j=1}^{k-1} q^{2^j}/2^j   (2.44)

(notice that β(0) = 0, 0 < β(q) < ∞ for q ∈ (0, 1), and β(1) = 0; furthermore, β(q) is a power series about q = 0 with radius of convergence equal to 1).
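Both series representations of β(q) can be evaluated by truncation — again the terms q^{2^k} vanish doubly exponentially — and, as a consistency check of (2.43) against (2.44), they agree numerically (a sketch of ours; function names are not from the paper):

```python
def beta_2_43(q, K=40):
    """beta(q) via (2.43), truncated at K terms."""
    A = lambda k: 2 * q ** (2 ** k) + q ** (2 ** (k - 1))
    inner = 0.0   # running value of sum_{j=1}^{k-1} q^{2^j}/2^j
    total = 0.0
    for k in range(1, K + 1):
        total += ((2 - q / 2) * A(k)
                  - (q ** (2 ** (k + 1)) + 5 * q ** (2 ** k) + 3 * q ** (2 ** (k - 1))) / 2 ** k
                  - 1.5 * A(k) * inner)
        inner += q ** (2 ** k) / 2 ** k
    return total

def beta_2_44(q, K=40):
    """beta(q) via (2.44), truncated at K terms."""
    S1 = sum(q ** (2 ** k) for k in range(1, K + 1))
    S2 = sum(q ** (2 ** k) / 2 ** k for k in range(1, K + 1))
    A = lambda k: 2 * q ** (2 ** k) + q ** (2 ** (k - 1))
    inner, D = 0.0, 0.0
    for k in range(1, K + 1):
        D += A(k) * inner
        inner += q ** (2 ** k) / 2 ** k
    return q * (q + 1) / 2 + (6 - 1.5 * q) * S1 - 8.5 * S2 - 1.5 * D

for q in (0.0, 0.3, 0.5, 0.9):
    assert abs(beta_2_43(q) - beta_2_44(q)) < 1e-9
assert beta_2_43(0.0) == 0.0
```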

We would like to understand the limiting behavior of W_n, as n → ∞, for any fixed q ∈ (0, 1). Formulas (2.12) and (2.42) suggest that we should work with the normalized variables

X_n := (W_n - 2^n α_1(q))/2^{n/2},  n = 0, 1, ….   (2.45)

Let us consider the characteristic functions

ψ_n(t) := E[e^{itX_n}],  n = 0, 1, ….   (2.46)

In view of (2.28) we have

ψ_n(t) = ϕ_n(2^{-n/2} t) e^{-i 2^{n/2} α_1(q) t}   (2.47)

and using (2.47) in (2.29) yields

ψ_n(t) = e^{2^{-n/2} it} ψ_{n-1}(t/√2)² + (e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} q^{2^{n-1}} ψ_{n-1}(t/√2) + (e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it} q^{2^n},  n = 1, 2, …,   (2.48)

with ψ_0(t) = e^{[1 - α_1(q)] it}.

From formula (2.48) one can guess the limiting distribution of X_n: Assuming that for each t ∈ R the sequence ψ_n(t) has a limit, say ψ(t), we can let n → ∞ in (2.48) and conclude that ψ(t) = ψ(t/√2)². Actually, if the limit ψ(t) exists, then formulas (2.49) and (2.50) below imply that ψ′(0) = 0 and that ψ″(0) exists (in C). From the existence of these derivatives and the functional equation of ψ(t) we get that the function h(t) := t^{-2} ln ψ(t) is continuous at 0 and satisfies the equation h(t) = h(t/√2). Therefore, h(t) ≡ h(0) for all t ∈ R and, hence, ψ(t) = e^{-at²} for some a ≥ 0. Consequently, X_n converges in distribution to a normal random variable, say X, with zero mean. As for the variance of X, formulas (2.42) and (2.45) suggest that V[X] = β(q), where β(q) is given by (2.43)–(2.44).

The above argument is not rigorous, since it assumes the existence of the (pointwise) limit lim_{n→∞} ψ_n(t) for every t ∈ R. It is worth presenting, however, since it points out what we should look for.

From (2.12), (2.42), and (2.45) we get that for any fixed q ∈ [0, 1)

E[X_n] = -1/2^{n/2} + O(q^{2^n})  and  V[X_n] = β(q) + O(q^{2^n}),  n → ∞,   (2.49)

thus

E[X_n²] = β(q) + 1/2^n + O(q^{2^n}),  n → ∞.   (2.50)

One important consequence of (2.50) is (see Durrett 2005) that the sequence F_n, n = 0, 1, …, where F_n is the distribution function of X_n, is tight.

Theorem 3

For m = 1, 2, … and q ∈ [0, 1) we have

lim_{n→∞} E[X_n^m] = β(q)^{m/2} E[Z^m],   (2.51)

where β(q) is given by (2.43)–(2.44) and Z is a standard normal random variable. In other words

lim_{n→∞} E[X_n^{2ℓ-1}] = 0  and  lim_{n→∞} E[X_n^{2ℓ}] = [(2ℓ)!/(2^ℓ ℓ!)] β(q)^ℓ,  ℓ ≥ 1.   (2.52)

Proof

By differentiating both sides of (2.48) with respect to t we get

ψ_n′(t) = √2 e^{2^{-n/2} it} ψ_{n-1}(t/√2) ψ_{n-1}′(t/√2) + 2^{-n/2} i e^{2^{-n/2} it} ψ_{n-1}(t/√2)² + q^{2^{n-1}} (d/dt)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} ψ_{n-1}(t/√2)] + q^{2^n} (d/dt)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it}].   (2.53)

We continue by differentiating both sides of (2.53) k - 1 times with respect to t, where k ≥ 2. This yields

ψ_n^{(k)}(0) = Q_{1n}(0) + Q_{2n}(0) + Q_{3n}(0) + Q_{4n}(0),   (2.54)

where

Q_{1n}(t) := √2 (d^{k-1}/dt^{k-1})[e^{2^{-n/2} it} ψ_{n-1}(t/√2) ψ_{n-1}′(t/√2)],   (2.55)
Q_{2n}(t) := 2^{-n/2} i (d^{k-1}/dt^{k-1})[e^{2^{-n/2} it} ψ_{n-1}(t/√2)²],   (2.56)
Q_{3n}(t) := q^{2^{n-1}} (d^k/dt^k)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it/2} ψ_{n-1}(t/√2)],   (2.57)
and  Q_{4n}(t) := q^{2^n} (d^k/dt^k)[(e^{2^{-n/2} it} - e^{2·2^{-n/2} it}) e^{-2^{n/2} α_1(q) it}].   (2.58)

We will prove the theorem by induction. For m = 1 and m = 2 the truth of (2.51) follows immediately from (2.49) and (2.50). The inductive hypothesis is that the limit lim_{n→∞} E[X_n^k] = i^{-k} lim_{n→∞} ψ_n^{(k)}(0) satisfies (2.51) or, equivalently, (2.52) for k = 1, …, m - 1. Then, for k = m (where m ≥ 3):

  • (i)
    Formula (2.55) implies
    Q_{1n}(0) = (1/(√2)^{m-2}) ∑_{j=0}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + O(1/2^{n/2}),  n → ∞;   (2.59)
  • (ii)
    formula (2.56) implies
    Q_{2n}(0) = O(1/2^{n/2}),  n → ∞;   (2.60)
  • (iii)
    since [e^{2^{-n/2} it} - e^{2·2^{-n/2} it}]_{t=0} = 0, formula (2.57) implies
    Q_{3n}(0) = O(q^{2^{n-1}} 2^{mn/2}),  n → ∞;   (2.61)
  • (iv)
    finally, formula (2.58) implies
    Q_{4n}(0) = O(q^{2^n} 2^{mn/2}),  n → ∞.   (2.62)
    Thus, for k = m ≥ 3, by using (2.59), (2.60), (2.61), and (2.62) in (2.54) we can conclude that for every n ≥ 1 (recall that ψ_n(0) = 1)
    ψ_n^{(m)}(0) = ψ_{n-1}^{(m)}(0)/(√2)^{m-2} + (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + δ_n,   (2.63)
    where ψ_0^{(m)}(0) = i^m [1 - α_1(q)]^m and
    δ_n = O(1/2^{n/2}),  n → ∞.   (2.64)

Let us set

b_n := (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) ψ_{n-1}^{(m-j)}(0) ψ_{n-1}^{(j)}(0) + δ_n.   (2.65)

Then, the inductive hypothesis together with (2.64) imply that the limit lim_{n→∞} b_n exists in C. We can, therefore, apply Corollary A2 of the Appendix to (2.63) with z_n = ψ_n^{(m)}(0), ρ = (√2)^{-(m-2)}, and b_n as in (2.65) to conclude that the limit lim_{n→∞} ψ_n^{(m)}(0) exists in C. This allows us to take limits in (2.63) and obtain

λ^{(m)} = λ^{(m)}/(√2)^{m-2} + (1/(√2)^{m-2}) ∑_{j=1}^{m-1} C(m-1, j) λ^{(m-j)} λ^{(j)}

or

[(√2)^{m-2} - 1] λ^{(m)} = ∑_{j=1}^{m-1} C(m-1, j) λ^{(m-j)} λ^{(j)},   (2.66)

where for typographical convenience we have set

λ^{(m)} := lim_{n→∞} ψ_n^{(m)}(0) = i^m lim_{n→∞} E[X_n^m],  m ≥ 1.   (2.67)

We have, thus, shown that formula (2.66) holds for all m ≥ 1. If m = 1, the sum in the right-hand side of (2.66) is empty, i.e. zero, and, hence, we get what we already knew, namely that λ^{(1)} = 0. If m = 2, then (2.66) becomes vacuous (0 = 0), but we know that λ^{(2)} = -β(q). Thus, formula (2.66) gives recursively the values of λ^{(m)} for every m ≥ 3, and by using the induction hypothesis again, namely that λ^{(k)} = i^k β(q)^{k/2} E[Z^k] for k = 1, …, m - 1, it is straightforward to show that λ^{(m)}, as given by (2.66), equals i^m β(q)^{m/2} E[Z^m] for every m.

A remarkable consequence of Theorem 3 is the following corollary whose proof uses standard arguments, but we decided to include it for the sake of completeness.

Corollary 2

Let β(q) be the quantity given in (2.43)–(2.44). Then, for any fixed q ∈ [0, 1) we have

X_n →_d √(β(q)) Z  as n → ∞,   (2.68)

where Z is a standard normal random variable and the symbol →_d denotes convergence in distribution.

Proof

As we have seen, the sequence of the distribution functions F_n(x) = P{X_n ≤ x}, n = 0, 1, …, is tight.

Suppose that X_{n_k} →_d X, where X_{n_k}, k = 1, 2, …, is a subsequence of X_n. Then (Durrett 2005) there is a sequence of random variables Y_k, k = 1, 2, …, converging a.s. to a random variable Y, such that for each k ≥ 1 the variables Y_k and X_{n_k} have the same distribution (hence the limits Y and X also have the same distribution, since a.s. convergence implies convergence in distribution). Thus, by Theorem 3 we have

lim_{k→∞} E[Y_k^m] = lim_{k→∞} E[X_{n_k}^m] = β(q)^{m/2} E[Z^m]  for all integers m ≥ 1.   (2.69)

In particular, for m = 2ℓ we get from (2.69) that sup_k E[Y_k^{2ℓ}] < ∞ for all ℓ = 1, 2, …, and, consequently (Chung 2001), that the sequence Y_k^m, k = 1, 2, …, is uniformly integrable for every m ≥ 1. Therefore Y_k → Y in L^r(Ω, F, P) for all r > 0 (i.e. E[|Y_k - Y|^r] → 0 as k → ∞, where by L^r(Ω, F, P) we denote the space of all real-valued random variables of our probability space (Ω, F, P), mentioned in the introduction, with a finite ‖·‖_r-norm), and (2.69) yields

E[X^m] = E[Y^m] = lim_{k→∞} E[Y_k^m] = β(q)^{m/2} E[Z^m]  for all integers m ≥ 1.   (2.70)

It is well known (and not hard to check) that the moments of the normal distribution satisfy Carleman’s condition and, consequently, a normal distribution is uniquely determined from its moments (Chung 2001; Durrett 2005) (alternatively, since the characteristic function of a normal variable is entire, it is uniquely determined by the moments). Therefore, it follows from (2.70) that X and √(β(q)) Z have the same distribution. Hence, every subsequence of X_n which converges in distribution converges to √(β(q)) Z, and (2.68) is established.

From formula (2.45) we have

X_n/2^{n/2} = W_n/2^n - α_1(q),   (2.71)

hence Theorem 3 has the following immediate corollary:

Corollary 3

For any r > 0 and q ∈ [0, 1] we have

W_n/2^n → α_1(q)  in L^r(Ω, F, P),  n → ∞.   (2.72)

Finally, let us observe that for any given ϵ>0 we have (in view of (2.71) and Chebyshev’s inequality)

P{|W_n/2^n - α_1(q)| ≥ ϵ} = P{|X_n| ≥ 2^{n/2} ϵ} ≤ (1/(2^n ϵ²)) E[X_n²],   (2.73)

hence (2.50) yields

P{|W_n/2^n - α_1(q)| ≥ ϵ} ≤ (1/(2^n ϵ²)) [β(q) + o(1)],  n → ∞,   (2.74)

from which it follows, by a standard application of the first Borel–Cantelli lemma, that if W_n, n = 1, 2, …, are considered random variables of the space (Ω, F, P), then, for any q ∈ [0, 1],

W_n/2^n → α_1(q)  a.s.,  n → ∞.   (2.75)
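The law-of-large-numbers statement (2.75) is easy to visualize by simulation (our own sketch; the sampling code and the fixed seed are our assumptions, not from the paper): for N = 2^10 = 1024 and p = 0.1 the empirical aspect ratio concentrates near α_1(0.9) ≈ 0.597, in agreement with the N = 1024, p = .1 entry of Table 1.

```python
import random

def num_tests(xs):
    """Tests used by the binary search scheme on pattern xs (1 = contaminated)."""
    if not any(xs):
        return 1
    return 1 + _resolve(xs)

def _resolve(xs):
    n = len(xs)
    if n == 1:
        return 0
    first, second = xs[:n // 2], xs[n // 2:]
    t = 1                          # test the first subpool
    if any(first):
        t += _resolve(first) + 1   # resolve first; second must still be tested
        if any(second):
            t += _resolve(second)
    else:
        t += _resolve(second)      # saved test: second is surely contaminated
    return t

def alpha1(q, terms=30):
    return 2 - q / 2 - 1.5 * sum(q ** (2 ** k) / 2 ** k for k in range(1, terms + 1))

random.seed(1)
n, p, trials = 10, 0.1, 200
N = 2 ** n
ratios = [num_tests([1 if random.random() < p else 0 for _ in range(N)]) / N
          for _ in range(trials)]
empirical = sum(ratios) / trials
assert abs(empirical - alpha1(1 - p)) < 0.02   # concentrates near 0.597
```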

The case of a general N

Let us now discuss the case of a general N, namely the case where N is not necessarily a power of 2. As we have described in the introduction, the first step of the binary search scheme is to test a pool containing samples from all N containers. If this pool is not contaminated, then none of the contents of the N containers is contaminated and we are done. If the pool is contaminated, we form two subpools, one containing samples of the first ⌊N/2⌋ containers and the other containing samples of the remaining ⌈N/2⌉ containers (recall that ⌊N/2⌋ + ⌈N/2⌉ = N). We continue by testing the first of those subpools. If it is contaminated we split it again into two subpools of ⌊⌊N/2⌋/2⌋ and ⌈⌊N/2⌋/2⌉ samples respectively and keep going. We also apply the same procedure to the second subpool of the ⌈N/2⌉ samples.

Suppose T(N) = T(N; q) is the number of tests required to find all contaminated samples by following the above procedure (thus T(2^n) = W_n, where W_n is the random variable studied in the previous section). Then, as in formula (1.2),

1 ≤ T(N) ≤ 2N - 1.   (3.1)

In the extreme cases q=1 and q=0 the quantity T(N) becomes deterministic and we have respectively

T(N; 1) = 1  and  T(N; 0) = 2N - 1.   (3.2)

Evidently, T(N) ≤_st T(N + 1), where ≤_st denotes the usual stochastic ordering (recall that X ≤_st Y means that P{X > x} ≤ P{Y > x} for all x ∈ R). In other words, T(N) is stochastically increasing in N. In particular

WνstT(N)stWν,ν:=log2N, 3.3

where log_2 is the logarithm to the base 2. Also, it follows easily from a coupling argument that if q_1 > q_2, then T(N; q_1) ≤_st T(N; q_2).

The expectation, the generating function, and the variance of T(N)

Theorem 4

Let us set

μ(N) = μ(N; q) := E[T(N; q)].   (3.4)

Then μ(N) satisfies the recursion

μ(N) = μ(⌈N/2⌉) + μ(⌊N/2⌋) - q^N - q^{⌊N/2⌋} + 1,  N ≥ 2   (3.5)

(of course, μ(1)=1).

Proof

We adapt the proof of Theorem 1. Assume N ≥ 2 and let D_N be the event that none of the N samples is contaminated. Then

μ(N) = E[T(N)] = E[T(N) | D_N] P(D_N) + E[T(N) | D_N^c] P(D_N^c),

and hence

μ(N) = q^N + u(N)(1 - q^N),   (3.6)

where for typographical convenience we have set

u(N) := E[T(N) | D_N^c].   (3.7)

In order to find a recursive formula for u(N) let us first consider the event A_N that in the group of the N containers none of the first ⌊N/2⌋ contains contaminated samples. Clearly D_N ⊆ A_N and

P(A_N | D_N^c) = [P(A_N) - P(D_N)]/P(D_N^c) = (q^{⌊N/2⌋} - q^N)/(1 - q^N) = q^{⌊N/2⌋}(1 - q^{⌈N/2⌉})/(1 - q^N).   (3.8)

Likewise, if B_N is the event that in the group of the N containers none of the last ⌈N/2⌉ contains contaminated samples, then

P(B_N | D_N^c) = (q^{⌈N/2⌉} - q^N)/(1 - q^N) = q^{⌈N/2⌉}(1 - q^{⌊N/2⌋})/(1 - q^N).   (3.9)

Let us also notice that A_N ∩ B_N = D_N, hence P(A_N ∩ B_N | D_N^c) = 0.

Now, (i) given A_N and D_N^c we have that T(N) =_d 1 + T̃(⌈N/2⌉), where T̃(⌈N/2⌉) has the same distribution as T(⌈N/2⌉); (ii) given B_N and D_N^c we have that T(N) =_d 2 + T(⌊N/2⌋); finally, (iii) given (A_N ∪ B_N)^c and D_N^c we have that T(N) =_d 1 + T(⌊N/2⌋) + T̃(⌈N/2⌉), where T̃(⌈N/2⌉) is independent of T(⌊N/2⌋) and has the same distribution as T(⌈N/2⌉). Thus (given D_N^c) by conditioning on the events A_N, B_N, and (A_N ∪ B_N)^c we get, in view of (3.7), (3.8), and (3.9),

$u(N) = \big[1 + u(\lceil N/2 \rceil)\big] \dfrac{q^{\lfloor N/2 \rfloor} - q^N}{1-q^N} + \big[2 + u(\lfloor N/2 \rfloor)\big] \dfrac{q^{\lceil N/2 \rceil} - q^N}{1-q^N} + \big[1 + u(\lfloor N/2 \rfloor) + u(\lceil N/2 \rceil)\big] \left[1 - \dfrac{q^{\lfloor N/2 \rfloor} - q^N}{1-q^N} - \dfrac{q^{\lceil N/2 \rceil} - q^N}{1-q^N}\right]$

or

$\big(1-q^N\big)\, u(N) = \big(1-q^{\lceil N/2 \rceil}\big)\, u(\lceil N/2 \rceil) + \big(1-q^{\lfloor N/2 \rfloor}\big)\, u(\lfloor N/2 \rfloor) - 2q^N + q^{\lceil N/2 \rceil} + 1$  (3.10)

from which, in view of (3.6), formula (3.5) follows immediately.

In the case where N=2n formula (3.5) reduces to (2.10).

Remark 2

By easy induction, formula (3.5) implies that, for $N \ge 2$, the quantity $\mu(N;q) = E[T(N;q)]$ is a polynomial in $q$ of degree $N$ whose coefficients are $\le 0$, except for the constant term, which is equal to $2N-1$. The leading term of $\mu(N;q)$ is $-q^N$. In the special case where $N = 2^n$ these properties of $\mu(N;q)$ follow trivially from (2.3).

For instance,

$\mu(3) = 5 - 2q - q^2 - q^3, \qquad \mu(5) = 9 - 3q - 3q^2 - q^3 - q^5$, and
$\mu(1000) = 1999 - 512q - 720q^2 - 48q^3 - 336q^4 - 48q^7 - 144q^8 - 48q^{15} - 48q^{16} - 40q^{31} - 8q^{32} - 16q^{62} - 8q^{63} - 12q^{125} - 6q^{250} - 3q^{500} - q^{1000}$,  (3.11)

while μ(2), μ(4), and μ(1024) are given by (2.11), since W1=T(2), W2=T(4), and W10=T(1024).
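The recursion (3.5) is straightforward to evaluate numerically. A minimal Python sketch (ours; the name `mean_tests` is not from the paper) memoizes the recursion and reproduces the polynomials in (3.11) at any numerical value of $q$:

```python
from functools import lru_cache

def mean_tests(N, q):
    """mu(N;q) = E[T(N;q)] via the recursion (3.5):
    mu(N) = mu(ceil(N/2)) + mu(floor(N/2)) - q**N - q**floor(N/2) + 1."""
    @lru_cache(maxsize=None)
    def mu(n):
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0
    return mu(N)
```

For instance, `mean_tests(3, q)` matches the polynomial $5 - 2q - q^2 - q^3$ of (3.11), and `mean_tests(N, 0.0)` returns $2N - 1$, in accordance with (3.2).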

With the help of (3.5) one can obtain an extension of formula (2.3) valid for any N. We first need to introduce some convenient notation:

$\iota(N) := \lfloor N/2 \rfloor, \qquad \iota^k(N) := (\underbrace{\iota \circ \cdots \circ \iota}_{k\ \text{iterates}})(N)$,  (3.12)

so that $\iota^1(N) = \iota(N)$, while we also have the standard convention $\iota^0(N) = N$ (we furthermore have $\iota^{-1}(N) = \{2N, 2N+1\}$). For example, if $N \ge 2$ and $\nu = \lfloor \log_2 N \rfloor$ (with $\nu$ as in (3.3)), then

$\iota^{\nu}(N) = 1, \quad\text{while}\quad \iota^{\nu-1}(N) = 2 \text{ or } 3 \quad\text{and}\quad \iota^{\nu+1}(N) = 0$.  (3.13)

Corollary 4

Let μ(N) be as in (3.4) and

$\varepsilon_\mu(N) = \varepsilon_\mu(N;q) := q^{\lfloor N/2 \rfloor} + q^N - q^{\lfloor (N+1)/2 \rfloor} - q^{N+1} = q^{\lfloor N/2 \rfloor} - q^{\lceil N/2 \rceil} + q^N - q^{N+1}$  (3.14)

(the last equality follows from the fact that $\lfloor (N+1)/2 \rfloor = \lceil N/2 \rceil$). Then

$\mu(N) = 2N - 1 - (N-1)q - (N-1)q^2 + \sum_{n=2}^{N-1} \sum_{k=1}^{\lfloor \log_2 n \rfloor} \varepsilon_\mu\big(\iota^{k-1}(n)\big), \qquad N \ge 1$  (3.15)

(if $N = 1$ or $N = 2$, then the double sum on the right-hand side is 0).

Proof

By setting

$\Delta_\mu(N) = \Delta_\mu(N;q) := \mu(N+1;q) - \mu(N;q)$  (3.16)

and by recalling that $\lfloor (N+1)/2 \rfloor = \lceil N/2 \rceil$ and $\lceil (N+1)/2 \rceil = \lfloor N/2 \rfloor + 1$, formula (3.5) implies (in view of (3.12) and (3.14))

$\Delta_\mu(N) = \Delta_\mu\big(\lfloor N/2 \rfloor\big) + \varepsilon_\mu(N) = \Delta_\mu(\iota(N)) + \varepsilon_\mu(N), \qquad N \ge 2$.  (3.17)

From (3.17) we obtain

$\sum_{k=1}^{\lfloor \log_2 N \rfloor} \Big[ \Delta_\mu\big(\iota^{k-1}(N)\big) - \Delta_\mu\big(\iota^{k}(N)\big) \Big] = \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 2$,

or, in view of (3.13),

$\Delta_\mu(N) - \Delta_\mu(1) = \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 1$,  (3.18)

where $\Delta_\mu(1) = \mu(2) - \mu(1) = E[T(2)] - E[T(1)] = E[W_1] - E[W_0] = 2 - q - q^2$ (in the case $N = 1$, formula (3.18) is trivially true, since the sum on the right-hand side is empty). Consequently, (3.18) becomes

$\Delta_\mu(N) = \mu(N+1) - \mu(N) = 2 - q - q^2 + \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\mu\big(\iota^{k-1}(N)\big), \qquad N \ge 1$,  (3.19)

from which (3.15) follows immediately.

With the help of (3.15) we have tabulated the average-case aspect ratio $E[T(N;q)]/N$ for several values of $N$ and $p$. As in Table 1, if we compare these values with the graphs, given in Aldridge (2019) and Aldridge (2020), of the optimal average-case aspect ratio as a function of $p$, we can see that, if $p \le 0.15$, our procedure is near-optimal.

Remark 3

Let us look at the case $N = 2^n M$, where $M$ is a fixed odd integer. We set

τn:=ET(N)=ET(2nM), 3.20

so that

$\tau_0 = E[T(M)] = \mu(M)$.  (3.21)

Then (3.5) becomes

$\tau_n = 2\tau_{n-1} - q^{2^n M} - q^{2^{n-1} M} + 1, \qquad n \ge 1$.  (3.22)

From (3.21) and (3.22) it follows that

$\dfrac{\tau_n}{2^n} = \mu(M) + \sum_{k=1}^{n} \dfrac{1 - q^{2^k M} - q^{2^{k-1} M}}{2^k} = \mu(M) + 1 - \dfrac{1}{2^n} - \sum_{k=1}^{n} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k}, \qquad n \ge 0$,  (3.23)

thus

$\dfrac{E[T(2^n M)]}{2^n M} = \dfrac{\tau_n}{2^n M} = \dfrac{\mu(M) + 1}{M} - \dfrac{1}{2^n M} - \sum_{k=1}^{n} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k M}$  (3.24)

for all $n \ge 0$. It follows that, as $n \to \infty$,

$\dfrac{E[T(2^n M)]}{2^n M} \longrightarrow \alpha_M(q) \quad\text{uniformly in } q \in [0,1]$,  (3.25)

where

$\alpha_M(q) := \dfrac{\mu(M) + 1}{M} - \sum_{k=1}^{\infty} \dfrac{q^{2^k M} + q^{2^{k-1} M}}{2^k M} = \dfrac{2\mu(M) + 2 - q^M}{2M} - \dfrac{3}{2} \sum_{k=1}^{\infty} \dfrac{q^{2^k M}}{2^k M}$  (3.26)

(as in the case of Remark 1, the convergence in (3.25) is much stronger, since for every $m = 1, 2, \ldots$ the $m$-th derivative of $E[T(2^n M)]/(2^n M)$ with respect to $q$ converges to the $m$-th derivative of $\alpha_M(q)$ uniformly on $[0, 1-\epsilon]$ for any $\epsilon > 0$).

From (3.26) it is obvious that if $M_1 \ne M_2$, then $\alpha_{M_1}(q) \ne \alpha_{M_2}(q)$, except for at most countably many values of $q$ (notice, e.g., that $\alpha_M(0) = 2$ and $\alpha_M(1) = 0$ for all $M$). Therefore, it follows from (3.25) that for $q \in (0,1)$

the limit $\lim_{N \to \infty} \dfrac{E[T(N;q)]}{N}$ does not exist.  (3.27)

Comment

The reason we consider the case N=2nM, where M is odd, is mainly theoretical: We use these values of N and the corresponding subsequences in order to demonstrate the non-convergence of T(N)/N. We do not claim any practical value of this choice of N.
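The oscillation can be observed numerically by truncating the series in (3.26); the sketch below (ours, with hypothetical function names) reuses the recursion (3.5) for $\mu(M)$ and shows that $\alpha_1(1/2)$ and $\alpha_3(1/2)$ differ, i.e. the subsequences $N_n = 2^n$ and $N_n = 3 \cdot 2^n$ give different limits of $E[T(N)]/N$:

```python
from functools import lru_cache

def mean_tests(N, q):
    # mu(N;q) = E[T(N;q)] via the recursion (3.5)
    @lru_cache(maxsize=None)
    def mu(n):
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0
    return mu(N)

def alpha(M, q, terms=60):
    """alpha_M(q) of (3.26), with the series truncated after `terms`
    summands (the neglected tail is geometrically small)."""
    s = sum((q**(2**k * M) + q**(2**(k - 1) * M)) / (2**k * M)
            for k in range(1, terms))
    return (mean_tests(M, q) + 1) / M - s
```

At the endpoints the values $\alpha_M(0) = 2$ and $\alpha_M(1) = 0$ noted above are recovered for every $M$.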

Open Question

Is it true that $\alpha_1(q) = \limsup_{N \to \infty} E[T(N;q)]/N$?

Next, we consider the generating function of T(N). Using the approach of the proofs of Theorems 2 and 4 we can derive the following result:

Theorem 5

Let

$g(z;N) = g(z;N;q) := E\big[z^{T(N;q)}\big]$.  (3.28)

Then $g(z;N)$ satisfies the recursion

$g(z;N) = z\, g\big(z; \lceil N/2 \rceil\big)\, g\big(z; \lfloor N/2 \rfloor\big) + (z - z^2)\, q^{\lfloor N/2 \rfloor}\, g\big(z; \lceil N/2 \rceil\big) + (z - z^2)\, q^N$  (3.29)

for $N \ge 2$ (of course, $g(z;1) = z$).

Notice that in the case where N=2n, formula (3.29) reduces to (2.23).

By setting z=eit in (3.29) it follows that the characteristic function

$\phi(t;N) = \phi(t;N;q) := E\big[e^{itT(N;q)}\big]$  (3.30)

of T(N)=T(N;q) satisfies the recursion

$\phi(t;N) = e^{it}\, \phi\big(t; \lceil N/2 \rceil\big)\, \phi\big(t; \lfloor N/2 \rfloor\big) + \big(e^{it} - e^{2it}\big)\, q^{\lfloor N/2 \rfloor}\, \phi\big(t; \lceil N/2 \rceil\big) + \big(e^{it} - e^{2it}\big)\, q^N, \qquad N \ge 2$,  (3.31)

with, of course, $\phi(t;1) = e^{it}$. For example, for $N = 2$ formula (3.31) confirms the value of $\phi(t;2)$ given in (2.30).

By differentiating (3.29) twice with respect to z and then setting z=1 and using (3.5) we get the following corollary:

Corollary 5

Let

$\sigma^2(N) = \sigma^2(N;q) := V[T(N;q)]$.  (3.32)

Then σ2(N) satisfies the recursion

$\sigma^2(N) = \sigma^2\big(\lceil N/2 \rceil\big) + \sigma^2\big(\lfloor N/2 \rfloor\big) + 2\mu(N)\, q^N + 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor}, \qquad N \ge 2$  (3.33)

(of course, σ2(1)=0).

In the case N=2n formula (3.33) reduces to (2.36) (e.g., if N=2, then (3.33) agrees with (2.38)).
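Combining the recursions (3.5) and (3.33) gives a quick numerical evaluation of the variance; the following Python sketch is our illustration (the name `var_tests` is not from the paper):

```python
from functools import lru_cache

def var_tests(N, q):
    """sigma^2(N;q) = V[T(N;q)] via the recursions (3.5) and (3.33)."""
    @lru_cache(maxsize=None)
    def mu(n):
        # expected number of tests, recursion (3.5)
        if n == 1:
            return 1.0
        return mu((n + 1) // 2) + mu(n // 2) - q**n - q**(n // 2) + 1.0

    @lru_cache(maxsize=None)
    def var(n):
        # variance, recursion (3.33)
        if n == 1:
            return 0.0
        a, b = (n + 1) // 2, n // 2          # ceil(n/2) and floor(n/2)
        return (var(a) + var(b)
                + 2 * mu(n) * q**n + 2 * mu(b) * q**b
                + q**(2 * n) - q**(2 * b) - 3 * q**n - q**b)

    return var(N)
```

For $N = 2$ this reproduces $\sigma^2(2;q) = q(1-q)(q^2 + 3q + 1)$ of (2.38), and it vanishes at $q = 0$ and $q = 1$.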

As we have mentioned, in the extreme cases where q=0 or q=1 the variable T(N) becomes deterministic and, consequently, σ2(N;0)=σ2(N;1)=0, which is in agreement with (3.33).

Corollary 6

For $q \in (0,1)$ there are constants $0 < c_1 < c_2$, depending on $q$, such that $\sigma^2(N) = V[T(N)]$ satisfies

$c_1 N \le \sigma^2(N) \le c_2 N \quad\text{for all } N \ge 2$.  (3.34)

Proof

For $N \ge 2$ we have $P\{T(N) = 1\} = q^N$, which implies that $\mu(N) \ge q^N + 2(1 - q^N) = 2 - q^N$. Hence

$2\mu(N)\, q^N + 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor} \ge 2\big(2 - q^N\big) q^N + 2\big(2 - q^{\lfloor N/2 \rfloor}\big) q^{\lfloor N/2 \rfloor} + q^{2N} - q^{2\lfloor N/2 \rfloor} - 3q^N - q^{\lfloor N/2 \rfloor} = q^N - q^{2N} + 3q^{\lfloor N/2 \rfloor} - 3q^{2\lfloor N/2 \rfloor} > 0$.  (3.35)

Using (3.35) in (3.33) implies

$\sigma^2(N) > \sigma^2\big(\lceil N/2 \rceil\big) + \sigma^2\big(\lfloor N/2 \rfloor\big), \qquad N \ge 2$.  (3.36)

Now, as we have already seen, $\sigma^2(1;q) \equiv 0$, while by (2.38) we have $\sigma^2(2;q) = q(1-q)(q^2 + 3q + 1)$. We can, thus, choose

$c_1 = \dfrac{\sigma^2(2;q)}{3} = \dfrac{q(1-q)(q^2 + 3q + 1)}{3} > 0$.  (3.37)

Then, (3.37) implies that $\sigma^2(2) = 3c_1 > 2c_1$, and from (3.36) we have $\sigma^2(3) > \sigma^2(2) = 3c_1$, i.e. the first inequality in (3.34) is valid for $N = 2$ and $N = 3$. Finally, the inequality $c_1 N \le \sigma^2(N)$ for every $N \ge 2$ follows easily from (3.36) by induction.

To establish the second inequality of (3.34) we set

$\Delta_\sigma(N) := \sigma^2(N+1) - \sigma^2(N)$.  (3.38)

Then, formula (3.33) implies (in view of (3.12))

$\Delta_\sigma(N) = \Delta_\sigma\big(\lfloor N/2 \rfloor\big) + \varepsilon_\sigma(N) = \Delta_\sigma(\iota(N)) + \varepsilon_\sigma(N), \qquad N \ge 2$,  (3.39)

where

$\varepsilon_\sigma(N) = \varepsilon_\sigma(N;q) := 2\big[\mu(N+1)\, q^{N+1} - \mu(N)\, q^N\big] + 2\mu\big(\lceil N/2 \rceil\big)\, q^{\lceil N/2 \rceil} - 2\mu\big(\lfloor N/2 \rfloor\big)\, q^{\lfloor N/2 \rfloor} + q^{\lfloor N/2 \rfloor} - q^{\lceil N/2 \rceil} + q^{2\lfloor N/2 \rfloor} - q^{2\lceil N/2 \rceil} + \big(3 - q^N - q^{N+1}\big)\, q^N (1-q)$.  (3.40)

Observe that by (3.40) and (3.1) we have that

$\varepsilon_\sigma(N) = O\big(N q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.41)

Now, as in the case of (3.18) we have, in view of (3.39),

$\Delta_\sigma(N) = \Delta_\sigma(1) + \sum_{k=1}^{\lfloor \log_2 N \rfloor} \varepsilon_\sigma\big(\iota^{k-1}(N)\big), \qquad N \ge 2$,  (3.42)

where Δσ(1)=σ2(2)-σ2(1)=q(1-q)(q2+3q+1). From (3.42) and (3.41) we get that Δσ(N) is bounded. Therefore, the second inequality of (3.34) follows immediately from (3.38).

Of course, in the case where N=2n we have formula (2.42), which is much more precise than (3.34).

Let us also notice that with the help of (3.38), (3.42), and (3.15) we can get a messy, yet explicit formula for σ2(N), extending (2.32) to the case of a general N.

The behavior of T(N) as N → ∞

We start with a Lemma.

Lemma 1

Let $\sigma^2(N) = \sigma^2(N;q) = V[T(N;q)]$ be as in the previous subsection. Then for any fixed $q \in (0,1)$ we have

$\dfrac{\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} = \dfrac{1}{2} + O\!\left(\dfrac{1}{N}\right) \quad\text{and}\quad \dfrac{\sigma^2\big(\lceil N/2 \rceil\big)}{\sigma^2(N)} = \dfrac{1}{2} + O\!\left(\dfrac{1}{N}\right), \qquad N \to \infty$.  (3.43)

Proof

As we have seen in the proof of Corollary 6, Δσ(N)=σ2(N+1)-σ2(N) is bounded, hence

$\sigma^2\big(\lceil N/2 \rceil\big) = \sigma^2\big(\lfloor N/2 \rfloor\big) + O(1), \qquad N \to \infty$.  (3.44)

Now, if we divide (3.33) by σ2(N) and invoke (3.34) we get

$1 = \dfrac{\sigma^2\big(\lceil N/2 \rceil\big)}{\sigma^2(N)} + \dfrac{\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} + O\big(q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.45)

Therefore, by using (3.44) in (3.45) and invoking again (3.34) we obtain

$1 = \dfrac{2\sigma^2\big(\lfloor N/2 \rfloor\big)}{\sigma^2(N)} + O\!\left(\dfrac{1}{N}\right), \qquad N \to \infty$,  (3.46)

which is equivalent to the first equality of (3.43). The second equality follows immediately.

In order to determine the limiting behavior of $T(N)$, as $N \to \infty$, for any fixed $q \in (0,1)$, it is natural to work with the normalized variables

$Y(N) := \dfrac{T(N) - \mu(N)}{\sigma(N)}, \qquad N = 2, 3, \ldots$  (3.47)

(Y(1) is not defined). Obviously,

$E[Y(N)] = 0 \quad\text{and}\quad V[Y(N)] = E\big[Y(N)^2\big] = 1 \quad\text{for all } N \ge 2$,  (3.48)

thus (Durrett 2005) the sequence of the distribution functions of Y(N) is tight.

Let us, also, introduce the characteristic functions

$\psi(t;N) := E\big[e^{itY(N)}\big], \qquad N = 2, 3, \ldots$  (3.49)

In view of (3.47) and (3.30) we have

$\psi(t;N) = \phi\!\left(\dfrac{t}{\sigma(N)}; N\right) e^{-it\mu(N)/\sigma(N)}$  (3.50)

and by using (3.50) in (3.31) and then invoking (3.5) we obtain the recursion

$\psi(t;N) = e^{(q^N + q^{\lfloor N/2 \rfloor})\, it/\sigma(N)}\, \psi\!\left(\dfrac{\sigma(\lceil N/2 \rceil)}{\sigma(N)}\, t; \lceil N/2 \rceil\right) \psi\!\left(\dfrac{\sigma(\lfloor N/2 \rfloor)}{\sigma(N)}\, t; \lfloor N/2 \rfloor\right) + q^{\lfloor N/2 \rfloor} \left(e^{it/\sigma(N)} - e^{2it/\sigma(N)}\right) e^{(\mu(\lceil N/2 \rceil) - \mu(N))\, it/\sigma(N)}\, \psi\!\left(\dfrac{\sigma(\lceil N/2 \rceil)}{\sigma(N)}\, t; \lceil N/2 \rceil\right) + q^N \left(e^{it/\sigma(N)} - e^{2it/\sigma(N)}\right) e^{-\mu(N)\, it/\sigma(N)}, \qquad N \ge 4$,  (3.51)

with $\psi(t;2)$ and $\psi(t;3)$ taken from (3.49). Actually, if we define $\psi(0;1) := 1$, then (3.51) is valid for $N \ge 2$.

We are now ready to present a general result, which can be viewed as an extension of Theorem 3 to the case of an arbitrary N.

Theorem 6

For $m = 1, 2, \ldots$ and $q \in (0,1)$ we have

limNEY(N)m=EZm, 3.52

where Y(N) is given by (3.47) and Z is a standard normal random variable. In other words

$\lim_{N \to \infty} E\big[Y(N)^{2\ell - 1}\big] = 0 \quad\text{and}\quad \lim_{N \to \infty} E\big[Y(N)^{2\ell}\big] = \dfrac{(2\ell)!}{2^{\ell}\, \ell!}, \qquad \ell \ge 1$.  (3.53)

Proof

We will follow the approach of the proof of Theorem 3.

We apply induction. By (3.48) we know that (3.53) is valid in the special cases $m = 1$ and $m = 2$. The inductive hypothesis is that the limit $\lim_{N \to \infty} E\big[Y(N)^k\big] = i^{-k} \lim_{N \to \infty} \psi^{(k)}(0;N)$ (where $\psi^{(k)}(t;N)$ denotes the $k$-th derivative of $\psi(t;N)$ with respect to $t$) satisfies (3.53) for $k = 1, \ldots, m-1$. Then, for $k = m$ (where $m \ge 3$) formula (3.51) implies

$\psi^{(m)}(0;N) = \dfrac{\sigma^m\big(\lceil N/2 \rceil\big)}{\sigma^m(N)}\, \psi^{(m)}\big(0; \lceil N/2 \rceil\big) + \dfrac{\sigma^m\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(m)}\big(0; \lfloor N/2 \rfloor\big) + \sum_{j=1}^{m-1} \binom{m}{j} \dfrac{\sigma^j\big(\lceil N/2 \rceil\big)\, \sigma^{m-j}\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(j)}\big(0; \lceil N/2 \rceil\big)\, \psi^{(m-j)}\big(0; \lfloor N/2 \rfloor\big) + \delta(N)$,  (3.54)

where

$\delta(N) = O\big(q^{\lfloor N/2 \rfloor}\big), \qquad N \to \infty$.  (3.55)

Let us set

$b(N) := \sum_{j=1}^{m-1} \binom{m}{j} \dfrac{\sigma^j\big(\lceil N/2 \rceil\big)\, \sigma^{m-j}\big(\lfloor N/2 \rfloor\big)}{\sigma^m(N)}\, \psi^{(j)}\big(0; \lceil N/2 \rceil\big)\, \psi^{(m-j)}\big(0; \lfloor N/2 \rfloor\big) + \delta(N)$.  (3.56)

Then, the inductive hypothesis together with Lemma 1 and (3.55) imply that the limit $\lim_{N \to \infty} b(N)$ exists in $\mathbb{C}$. We can, therefore, apply Corollary A1 of the Appendix to (3.54) with $z(N) = \psi^{(m)}(0;N)$, $\rho_1(N) = \sigma^m\big(\lceil N/2 \rceil\big)/\sigma^m(N)$, $\rho_2(N) = \sigma^m\big(\lfloor N/2 \rfloor\big)/\sigma^m(N)$ (so that, in view of Lemma 1 and (4.6), $\rho_1 = \rho_2 = 1/2^{m/2}$), and $b(N)$ as in (3.56), to conclude that

the limit $\Lambda(m) := \lim_{N \to \infty} \psi^{(m)}(0;N)$ exists in $\mathbb{C}$.  (3.57)

Thus by taking limits in (3.54) we obtain (in view of (3.57) and Lemma 1)

$\Lambda(m) = \dfrac{\Lambda(m)}{2^{m/2}} + \dfrac{\Lambda(m)}{2^{m/2}} + \dfrac{1}{2^{m/2}} \sum_{j=1}^{m-1} \binom{m}{j} \Lambda(m-j)\, \Lambda(j)$

or

$\big(2^{m/2} - 2\big)\, \Lambda(m) = \sum_{j=1}^{m-1} \binom{m}{j} \Lambda(m-j)\, \Lambda(j)$.  (3.58)

We have, thus, shown that formula (3.58) holds for all $m \ge 1$. If $m = 1$, the sum on the right-hand side of (3.58) is empty, i.e. zero, and hence we get what we already knew, namely that $\Lambda(1) = 0$. If $m = 2$, then (3.58) becomes vacuous ($0 = 0$), but we know that $\Lambda(2) = -1$. Thus, formula (3.58) gives recursively the values of $\Lambda(m)$ for every $m \ge 3$, and by using the induction hypothesis again, namely that $\Lambda(k) = i^k E[Z^k]$ for $k = 1, \ldots, m-1$, it is straightforward to show that $\Lambda(m)$, as given by (3.58), equals $i^m E[Z^m]$ for every $m$.
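As a numerical check (ours, not part of the paper), the recursion (3.58) with the starting values $\Lambda(1) = 0$ and $\Lambda(2) = -1$ can be iterated directly; it reproduces $\Lambda(2\ell) = (-1)^{\ell} (2\ell)!/(2^{\ell} \ell!) = i^{2\ell} E[Z^{2\ell}]$ and $\Lambda(2\ell - 1) = 0$:

```python
from math import comb

def Lambda(m_max):
    """Solve (2**(m/2) - 2) * Lambda(m) = sum_j C(m, j) Lambda(j) Lambda(m - j),
    i.e. recursion (3.58), starting from Lambda(1) = 0 and Lambda(2) = -1."""
    L = {1: 0.0, 2: -1.0}
    for m in range(3, m_max + 1):
        rhs = sum(comb(m, j) * L[j] * L[m - j] for j in range(1, m))
        L[m] = rhs / (2**(m / 2) - 2)
    return L
```

For example, the recursion gives $\Lambda(4) = 3$, consistent with $i^4 E[Z^4] = 3$.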

Theorem 6 together with the fact that the sequence Y(N) is tight imply the following corollary which can be considered an extension of Corollary 2. Its proof is omitted since it is just a repetition of the proof of Corollary 2.

Corollary 7

For any fixed $q \in (0,1)$ we have

$Y(N) = \dfrac{T(N) - E[T(N)]}{\sqrt{V[T(N)]}} \xrightarrow{\ d\ } Z \quad\text{as } N \to \infty$,  (3.59)

where Z is a standard normal random variable.

Finally, let us mention that, since $\sigma(N) = \sqrt{V[T(N)]} = O\big(\sqrt{N}\big)$, a rather trivial consequence of Theorem 6 is that for any given $\epsilon > 0$ we have that

$\dfrac{T(N) - E[T(N)]}{N^{\frac{1}{2} + \epsilon}} \longrightarrow 0 \quad\text{in } L^r(\Omega, \mathcal{F}, P), \qquad N \to \infty$,  (3.60)

for any $r > 0$ and any $q \in [0,1]$.

It follows that if $N_n$, $n = 1, 2, \ldots$, is a sequence of integers (with $\lim_n N_n = \infty$) such that the limit $\lim_n E[T(N_n)]/N_n$ exists (e.g., $N_n = 2^n$ or $N_n = 3 \cdot 2^n$), then

$\dfrac{T(N_n)}{N_n} \longrightarrow \lim_{n \to \infty} \dfrac{E[T(N_n)]}{N_n}, \qquad n \to \infty$,  (3.61)

in the $L^r(\Omega, \mathcal{F}, P)$-sense, for every $r > 0$ (the case $N_n = 2^n$ is treated by Corollary 3).

Also, by Chebyshev's inequality (applied to a sufficiently high power of $\big(T(N) - E[T(N)]\big)/N^{\frac{1}{2} + \epsilon}$) and the first Borel–Cantelli lemma it follows that

$\dfrac{T(N) - E[T(N)]}{N^{\frac{1}{2} + \epsilon}} \longrightarrow 0 \quad\text{a.s.}, \qquad N \to \infty$,  (3.62)

for any $q \in [0,1]$ and any given $\epsilon > 0$. In this case, again, if $\lim_n E[T(N_n)]/N_n$ exists for some sequence $N_n$, then

$\dfrac{T(N_n)}{N_n} \longrightarrow \lim_{n \to \infty} \dfrac{E[T(N_n)]}{N_n} \quad\text{a.s.}, \qquad n \to \infty$.  (3.63)

Acknowledgements

The author wishes to thank the two anonymous referees, as well as Professor Tom Britton, associate editor of JOMB, for their constructive comments and suggestions.

Appendix

Lemma A1

Let $\varepsilon(N)$, $N = 1, 2, \ldots$, be a sequence of nonnegative real numbers such that there exists an integer $N_0 \ge 2$ for which

$\varepsilon(N) \le r_1\, \varepsilon\big(\lceil N/2 \rceil\big) + r_2\, \varepsilon\big(\lfloor N/2 \rfloor\big) + \delta(N) \quad\text{for all } N \ge N_0$,  (4.1)

where $r_1$, $r_2$ are nonnegative constants satisfying $r := r_1 + r_2 < 1$ and $\lim_{N \to \infty} \delta(N) = 0$. Then,

$\lim_{N \to \infty} \varepsilon(N) = 0$.  (4.2)

Proof

We have that $\limsup_N \varepsilon(N) = \limsup_N \varepsilon\big(\lceil N/2 \rceil\big) = \limsup_N \varepsilon\big(\lfloor N/2 \rfloor\big)$. Thus, by taking $\limsup$ on both sides of (4.1) we get

$0 \le \limsup \varepsilon(N) \le r_1 \limsup \varepsilon(N) + r_2 \limsup \varepsilon(N) + \limsup \delta(N)$,

i.e.

$0 \le \limsup \varepsilon(N) \le r\, \limsup \varepsilon(N)$.  (4.3)

Since $r < 1$, to finish the proof it suffices to show that $\varepsilon(N)$ is bounded, and we can, e.g., see that as follows. Let $\delta^* := \sup_N |\delta(N)|$ (which is finite, since $\delta(N) \to 0$) and

$M := \max\left\{ \dfrac{\delta^*}{1 - r},\ \max_{1 \le N \le N_0} \varepsilon(N) \right\}$.  (4.4)

Then (noticing that (4.4) implies that $rM + \delta^* \le M$), easy induction on $N$ shows that $\varepsilon(N) \le M$ for all $N \ge 1$.

If $r_1 + r_2 = 1$, then (4.2) need not hold. For example, if $\varepsilon(N) = r_1\, \varepsilon\big(\lceil N/2 \rceil\big) + (1 - r_1)\, \varepsilon\big(\lfloor N/2 \rfloor\big)$ for $N \ge 2$, then $\varepsilon(N) = \varepsilon(1)$ for all $N$.

Corollary A1

Let $z(N)$, $N = 1, 2, \ldots$, be a sequence of complex numbers satisfying the recursion

$z(N) = \rho_1(N)\, z\big(\lceil N/2 \rceil\big) + \rho_2(N)\, z\big(\lfloor N/2 \rfloor\big) + b(N), \qquad N \ge 2$,  (4.5)

where ρ1(N), ρ2(N), and b(N) are complex sequences such that

$\lim_{N \to \infty} \rho_1(N) = \rho_1 \in \mathbb{C}, \quad \lim_{N \to \infty} \rho_2(N) = \rho_2 \in \mathbb{C}, \quad\text{and}\quad \lim_{N \to \infty} b(N) = b \in \mathbb{C}$,  (4.6)

with |ρ1|+|ρ2|<1. Then

$\lim_{N \to \infty} z(N) = \dfrac{b}{1 - \rho_1 - \rho_2}$.  (4.7)

Proof

Pick $\epsilon > 0$ so that $|\rho_1| + |\rho_2| + 2\epsilon < 1$ and then choose $N_0 \ge 2$ so that $|\rho_1(N) - \rho_1| < \epsilon$ and $|\rho_2(N) - \rho_2| < \epsilon$ for all $N \ge N_0$. Then (4.5) implies

$|z(N)| \le \big(|\rho_1| + \epsilon\big)\, \big|z\big(\lceil N/2 \rceil\big)\big| + \big(|\rho_2| + \epsilon\big)\, \big|z\big(\lfloor N/2 \rfloor\big)\big| + b^*, \qquad N \ge N_0$,  (4.8)

where $b^* := \sup_N |b(N)| < \infty$, and the argument at the end of the proof of Lemma A1 applies to (4.8) and implies that the sequence $z(N)$ is bounded.

Next, we write (4.5) as

$z(N) - \dfrac{b}{1 - \rho_1 - \rho_2} = \rho_1 \left[ z\big(\lceil N/2 \rceil\big) - \dfrac{b}{1 - \rho_1 - \rho_2} \right] + \rho_2 \left[ z\big(\lfloor N/2 \rfloor\big) - \dfrac{b}{1 - \rho_1 - \rho_2} \right] + \delta(N)$,  (4.9)

where

$\delta(N) := \big[\rho_1(N) - \rho_1\big]\, z\big(\lceil N/2 \rceil\big) + \big[\rho_2(N) - \rho_2\big]\, z\big(\lfloor N/2 \rfloor\big) + \big[b(N) - b\big]$.  (4.10)

Notice that our assumptions for $\rho_1(N)$, $\rho_2(N)$, and $b(N)$, together with the fact that $z(N)$ is bounded, imply that $\lim_{N \to \infty} \delta(N) = 0$. Thus, if we take absolute values in (4.9) and set

$\varepsilon(N) := \left| z(N) - \dfrac{b}{1 - \rho_1 - \rho_2} \right|$,  (4.11)

we obtain

$\varepsilon(N) \le |\rho_1|\, \varepsilon\big(\lceil N/2 \rceil\big) + |\rho_2|\, \varepsilon\big(\lfloor N/2 \rfloor\big) + |\delta(N)|, \qquad N \ge 2$,  (4.12)

hence (since $|\rho_1| + |\rho_2| < 1$) Lemma A1 implies that $\lim_{N \to \infty} \varepsilon(N) = 0$.
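A small numerical illustration of Corollary A1 (ours, with arbitrarily chosen example coefficients): iterating (4.5) with constant $\rho_1(N) = \rho_2(N) = 1/4$ and $b(N) = 1$ drives $z(N)$ toward $b/(1 - \rho_1 - \rho_2) = 2$, regardless of the starting value $z(1)$.

```python
def iterate_recursion(N_max, rho1, rho2, b):
    """Compute z(N_max) from the recursion (4.5):
    z(N) = rho1(N) z(ceil(N/2)) + rho2(N) z(floor(N/2)) + b(N),
    starting from an arbitrary value z(1)."""
    z = [None, 1.0]                        # z(1) = 1, say
    for N in range(2, N_max + 1):
        z.append(rho1(N) * z[(N + 1) // 2] + rho2(N) * z[N // 2] + b(N))
    return z[N_max]
```

Since the error contracts by a factor $r = 1/2$ at each halving level, $z(2^n)$ is within about $2^{-n}$ of the limit.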

Finally, by using the above arguments we can also show the following simpler result:

Corollary A2

Let $z_n$, $n = 0, 1, \ldots$, be a sequence of complex numbers satisfying the recursion

zn=ρzn-1+bn,n=1,2,, 4.13

where $|\rho| < 1$ and $\lim_n b_n$ exists in $\mathbb{C}$. Then,

$\lim_{n \to \infty} z_n = \dfrac{\lim_{n \to \infty} b_n}{1 - \rho}$.  (4.14)


References

  1. Aldridge M (2019) Rates of adaptive group testing in the linear regime. In: IEEE International Symposium on Information Theory (ISIT), Paris, France, pp 236–240. doi:10.1109/ISIT.2019.8849712
  2. Aldridge M (2020) Conservative two-stage group testing. arXiv:2005.06617v1 [stat.AP]
  3. Armendáriz I, Ferrari PA, Fraiman D, Ponce Dawson S (2020) Group testing with nested pools. arXiv:2005.13650v2 [math.ST]
  4. Chung KL (2001) A course in probability theory, 3rd edn. Academic Press, San Diego
  5. Dorfman R (1943) The detection of defective members of large populations. Ann Math Stat 14(4):436–440. doi:10.1214/aoms/1177731363
  6. Durrett R (2005) Probability: theory and examples, 3rd edn. Duxbury Advanced Series, Brooks/Cole–Thomson Learning, Belmont
  7. Gollier C, Gossner O (2020) Group testing against Covid-19. Covid Econ 1(2):32–42
  8. Hwang FK (1972) A method for detecting all defective members in a population by group testing. J Am Stat Assoc 67(339):605–608. doi:10.1080/01621459.1972.10481257
  9. Malinovsky Y (2019) Sterrett procedure for the generalized group testing problem. Methodol Comput Appl Probab 21:829–840. doi:10.1007/s11009-017-9601-4
  10. Mallapaty S (2020) The mathematical strategy that could transform coronavirus testing. Nature 583:504–505. doi:10.1038/d41586-020-02053-6
  11. Sobel M, Groll PA (1959) Group testing to eliminate efficiently all defectives in a binomial sample. Bell Syst Tech J 38(5):1179–1252. doi:10.1002/j.1538-7305.1959.tb03914.x
  12. Ungar P (1960) The cut off point for group testing. Commun Pure Appl Math 13(1):49–54. doi:10.1002/cpa.3160130105

Articles from Journal of Mathematical Biology are provided here courtesy of Nature Publishing Group
