IEEE Transactions on Network Science and Engineering. 2021 Oct 26;9(2):467–480. doi: 10.1109/TNSE.2021.3121709

Constructions and Comparisons of Pooling Matrices for Pooled Testing of COVID-19

Yi-Jheng Lin 1,, Che-Hao Yu 1, Tzu-Hsuan Liu 1, Cheng-Shang Chang 1, Wen-Tsuen Chen 1
PMCID: PMC9014483  PMID: 35582549

Abstract

In comparison with individual testing, group testing is more efficient in reducing the number of tests, potentially leading to tremendous cost reductions. There are two key elements in a group testing technique: (i) the pooling matrix that directs samples to be pooled into groups, and (ii) the decoding algorithm that uses the group test results to reconstruct the status of each sample. In this paper, we propose a new family of pooling matrices obtained by packing the pencil of lines (PPoL) in a finite projective plane. We compare their performance with various pooling matrices proposed in the literature, including 2D-pooling, P-BEST, and Tapestry, using the two-stage definite defectives (DD) decoding algorithm. Extensive simulations for a range of prevalence rates up to 5% show that no single pooling matrix achieves the lowest relative cost over the whole range of prevalence rates. To optimize the performance, one should choose the right pooling matrix depending on the prevalence rate. The family of PPoL matrices can dynamically adjust their construction parameters according to the prevalence rate and could be a better alternative than a fixed pooling matrix.

Keywords: group testing, perfect difference sets, finite projective planes

I. Introduction

The COVID-19 pandemic has deeply affected the daily life of many people around the world. The current strategy for dealing with COVID-19 is to reduce its transmission rate through preventive measures, such as contact tracing, wearing masks, and social distancing. One problematic characteristic of COVID-19 is the existence of asymptomatic infections [1]. As asymptomatic carriers are unaware that they are contagious, they can infect more people if they are not detected [2]. As shown in the recent paper [3], the massive COVID-19 testing conducted in South Korea on Feb. 24, 2020, greatly reduced the proportion of undetected infected persons and effectively reduced the transmission rate of COVID-19.

Massive testing of a large population is very costly if it is done one sample at a time. For a population with a low prevalence rate, group testing (also known as pool testing, pooled testing, or batch testing), which tests a group by mixing several samples together, can save a great deal of testing resources. As indicated in a recent article posted on the US FDA website [4], the group testing approach has received a lot of interest lately. The US CDC's guidance for the use of pooling procedures for SARS-CoV-2 [5] defines three types of tests: (i) diagnostic testing, which is intended to identify occurrence at the individual level and is performed when there is a reason to suspect that an individual may be infected; (ii) screening testing, which is intended to identify occurrence at the individual level even if there is no reason to suspect an infection; and (iii) surveillance testing, which includes ongoing systematic activities such as the collection, analysis, and interpretation of health-related data. The general guidance for diagnostic or screening testing using a pooling strategy in [5] (quoted below) basically follows the two-stage group testing procedure invented by Dorfman in 1943 [6]:

“If a pooled test result is negative, then all specimens can be presumed negative with the single test. If the test result is positive or indeterminate, then all the specimens in the pool need to be retested individually.”

The Dorfman two-stage algorithm is a very simple group testing strategy. Recently, more sophisticated group testing algorithms have been proposed in the literature, see, e.g., [7]–[10]. Instead of pooling a sample into a single group, these algorithms require diluting a sample and then splitting it into multiple groups (pooled samples). Such a procedure is specified by a pooling matrix that directs each diluted sample to be pooled into a specific group. The test results of the pooled samples are then used for decoding (reconstructing) the status of each sample. In short, there are two key elements in a group testing strategy: (i) the pooling matrix, and (ii) the decoding algorithm.

As COVID-19 is a severe contagious disease, one should be very careful about the decoding algorithm used for reconstructing the testing results of persons. Though decoding algorithms that use soft information for group testing, including various compressed sensing algorithms in [8]–[12], might be more efficient in reducing the number of tests, they are more prone to false positives and false negatives. A false positive might cause a person to be quarantined for 14 days and thus lose 14 days of work. On the other hand, a false negative might leave an infected person wandering around the neighborhood, causing more people to be infected. In view of this, it is important to have group testing results that are as "definite" as individual testing results (in a noiseless setting).

Following the CDC guidance [5], we use a decoding algorithm called the definite defectives (DD) algorithm in the literature (see Algorithm 2.3 of the monograph [13]), which yields definite testing results. The DD algorithm first identifies negative samples from the negative testing result of a group (as advised by the CDC guidance [5]). Such a step is known as the combinatorial orthogonal matching pursuit (COMP) step in the literature [13]. Then the DD algorithm identifies positive samples if they are in a group with only one positive sample. Not every sample can be decoded by the DD algorithm. As in the Dorfman two-stage algorithm, samples that are not decoded by the DD algorithm go through a second stage, where they are tested individually. We call such an algorithm the two-stage DD algorithm.

One of the main objectives of this paper is to compare the performance of various pooling matrices proposed in the literature, including 2D-pooling [7], P-BEST [8], and Tapestry [9], [10], using the two-stage DD decoding algorithm. In addition to these pooling matrices, we also propose a new construction of a family of pooling matrices from packing the pencil of lines (PPoL) in a finite projective plane. The family of PPoL pooling matrices has very nice properties: (i) both the column correlation and the row correlation are bounded by 1, and (ii) there is a freedom to choose the construction parameters to optimize performance. To measure the amount of saving of a group testing method, we adopt the performance measure, called the expected relative cost in [6]. The expected relative cost is defined as the ratio of the expected number of tests required by the group testing technique to the number of tests required by the individual testing. We then measure the expected relative costs of these pooling matrices for a range of prevalence rates up to 5%. Some of the main findings of our numerical results are as follows:

  • (i)

    There is no pooling matrix that has the lowest relative cost in the whole range of the prevalence rates considered in our experiments. To optimize the performance, one should choose the right pooling matrix, depending on the prevalence rate.

  • (ii)

The expected relative costs of the two pooling matrices used in Tapestry [9], [10] are high compared to the other pooling matrices considered in our experiments. Their performance, in terms of the expected relative cost, is even worse than that of the (optimized) Dorfman two-stage algorithm. However, Tapestry is capable of decoding most of the samples in the first stage. In other words, its percentage of samples that need to go through the second stage is the smallest among all the pooling matrices considered in our experiments.

  • (iii)

    P-BEST [8] has a very low expected relative cost when the prevalence rate is below 1%. However, its expected relative cost increases dramatically when the prevalence rate is above 1.3%.

  • (iv)

2D-pooling [7] has a low expected relative cost when the prevalence rate is near 5%. Unlike Tapestry, P-BEST, and PPoL, which rely on robots for pipetting, 2D-pooling is relatively easy to implement by hand.

  • (v)

There is a PPoL pooling matrix with column weight 3 that outperforms the P-BEST pooling matrix over the whole range of prevalence rates considered in our experiments (up to 5%). We suggest using that PPoL pooling matrix up to a prevalence rate of 2% and then switching to other PPoL pooling matrices as the prevalence rate increases. The detailed suggestions are shown in Table IV of Section VI.

TABLE IV. Suboptimal PPoL Pooling Matrices. p: Prevalence Rate; d1 and m: Parameters of the PPoL Pooling Matrices; Cost (14): Costs Computed From the Theoretical Approximations in (14); Cost (sim): Costs Measured From Simulations; Dorfman [6]: Costs of the Dorfman Two-Stage Algorithm.

p d1 m cost (14) cost (sim) Dorfman [6]
1% 3 31 0.1218 0.12 0.20
2% 4 23 0.1973 0.20 0.27
3% 4 23 0.2552 0.25 0.33
4% 3 13 0.3170 0.32 0.38
5% 3 13 0.3685 0.37 0.43
6% 3 13 0.4243 0.42 0.47
7% 2 7 0.4651 0.47 0.50
8% 2 7 0.5035 0.50 0.53
9% 2 7 0.5422 0.54 0.56
10% 2 7 0.5809 0.58 0.59

The paper is organized as follows: in Section II, we briefly review the group testing problem, including the mathematical formulation and the DD decoding algorithm. In Section III, we introduce the related works that are used in our comparison study. We then propose the new family of PPoL pooling matrices in Section IV. In Section V, we consider the noisy pooled testing. In Section VI, we conduct extensive simulations to compare the performance of various pooling matrices using the two-stage DD algorithm. The paper is concluded in Section VII, where we discuss possible extensions for future works.

II. Review of Group Testing

A. The problem statement

Consider the group testing problem with M samples (indexed from 1 to M) and N groups (indexed from 1 to N). The M samples are pooled into the N groups (pooled samples) through an N × M binary matrix H = (h_{n,m}) so that the m-th sample is pooled into the n-th group if h_{n,m} = 1 (see Fig. 1). Such a matrix is called the pooling matrix in this paper. Note that a pooling matrix corresponds to the biadjacency matrix of a bipartite graph with M sample nodes on one side and N group nodes on the other. Let x = (x_1, x_2, …, x_M) be the binary state vector of the M samples and y = (y_1, y_2, …, y_N) be the binary state vector of the N groups. Then

y = H x,

where the matrix operation is under the Boolean algebra (that replaces the usual addition by the OR operation and the usual multiplication by the AND operation). The main objective of group testing is to decode the vector x given the observation vector y under certain assumptions. In this paper, we adopt the following basic assumptions for binary samples:

Fig. 1.

Fig. 1.

Pooled testing represented by a bipartite graph.

  • (i)

    Every sample is binary, i.e., it is either positive (1) or negative (0).

  • (ii)

Every group is binary, and a group is positive (1) if at least one sample in that group is positive. On the other hand, a group is negative (0) if all the samples pooled into that group are negative.

If we test each sample one at a time, then the number of tests for M samples is M, and the average number of tests per sample is 1. The key advantage of using group testing is that the number of tests per sample can be greatly reduced. One important performance measure of group testing, called the expected relative cost in [6], is the ratio of the expected number of tests required by the group testing technique to the number of tests required by individual testing. The main objective of this paper is to compare the expected relative costs of various group testing methods.

B. The definite defectives (DD) decoding algorithm

In this section, we briefly review the definite defectives (DD) algorithm (see Algorithm 2.3 of [13]). The DD algorithm first identifies negative samples from a negative testing result of a group. Such a step is known as the combinatorial orthogonal matching pursuit (COMP) step. Then the DD algorithm identifies positive samples if they are in a group with only one positive sample. The detailed steps of the DD algorithm are outlined in Algorithm 1.

Algorithm 1. The definite defectives (DD) algorithm for binary samples

  • Input An N × M pooling matrix H and the binary N-vector y of the group test results.

  • Output An M-vector for the test results of the M samples.

  • 0: Initially, every sample is marked “un-decoded.”

  • 1: If there is a negative group, then all the samples pooled into that group are decoded to be negative.

  • 2: The edges of samples decoded to be negative in the bipartite graph are removed from the graph.

  • 3: Repeat from Step 1 until there is no negative group.

  • 4: If there is a positive group with exactly one (remaining) sample in that group, then that sample is decoded to be positive.

  • 5: Repeat from Step 4 until no more samples can be decoded.
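The steps above can be sketched in a few lines of Python. This is a minimal illustration of the DD algorithm, not the authors' code; the function name dd_decode and the dense list-of-lists representation of the pooling matrix are our own choices.

```python
# A minimal sketch of the DD algorithm (Algorithm 1) for a 0/1 pooling
# matrix H of shape (N, M) and a group-result vector y of length N.
# Returned states: 1 (positive), 0 (negative), None ("un-decoded").
def dd_decode(H, y):
    N, M = len(H), len(H[0])
    state = [None] * M
    # COMP step: every sample pooled into a negative group is negative.
    for n in range(N):
        if y[n] == 0:
            for m in range(M):
                if H[n][m] == 1:
                    state[m] = 0
    # DD step: a positive group with exactly one remaining (not yet
    # decoded-negative) sample pins that sample as definitely positive.
    changed = True
    while changed:
        changed = False
        for n in range(N):
            if y[n] == 1:
                remaining = [m for m in range(M)
                             if H[n][m] == 1 and state[m] != 0]
                if len(remaining) == 1 and state[remaining[0]] is None:
                    state[remaining[0]] = 1
                    changed = True
    return state
```

For instance, with H = [[1, 1, 0], [0, 1, 1]] and y = [1, 0], the second group clears samples 2 and 3, and the first group then pins sample 1 as positive.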

In Fig. 2, we provide an illustrative example for Algorithm 1. In Fig. 2(a), the test result of one group is negative, and thus the three samples pooled into that group are decoded to be negative. In Fig. 2(b), the edges connected to these three samples are removed from the bipartite graph. In Fig. 2(c), the test results of the two remaining groups are positive. As one of these positive groups now contains exactly one remaining sample, that sample is decoded to be positive.

Fig. 2.

Fig. 2.

An illustration for the DD algorithm.

Note that one might not be able to decode all the samples by the above decoding algorithm. For instance, if a particular sample is pooled only into groups that each contain at least one other positive sample, then there is no way to know whether that sample is positive or negative. As shown in Fig. 3, such a sample cannot be decoded by the DD algorithm, as the test results of the three groups are the same no matter whether that sample is positive or not.

Fig. 3.

Fig. 3.

An un-decoded sample.

As shown in Lemma 2.2 of [13], one important guarantee of the DD algorithm is that there is no false positive.

Proposition 1: —

([13], Lemma 2.2) Assume that all the testing results are correct. Then (i) all the samples that are decoded to be negative in Step 1 of Algorithm 1 are definite negatives, and (ii) all the samples that are decoded to be positive in Step 4 of Algorithm 1 are definite positives. As such, there are no false positives in Algorithm 1.

In order to resolve all the “un-decoded” samples, we add another stage by individually testing each “un-decoded” sample. This leads to the following two-stage DD algorithm in Algorithm 2.

Algorithm 2. The two-stage definite defectives (DD2) algorithm for binary samples

  • Input An N × M pooling matrix H and the binary N-vector y of the group test results.

  • Output An M-vector for the test results of the M samples.

  • 1: Run the DD algorithm in Algorithm 1.

  • 2: For those “un-decoded” samples, test them one at a time.

III. Related Works

In [14]–[16], it was shown that a single positive sample can still be detected even in pools of 5–32 samples with the standard RT-qPCR test for COVID-19. Such experimental results provide supporting evidence for group testing of COVID-19. In the following, we review four group testing strategies proposed in the literature for COVID-19.

The Dorfman two-stage algorithm [17]: For the case where d1 = 1, i.e., every sample is pooled into a single group, the DD2 algorithm is simply the original Dorfman two-stage algorithm [6]: if a group of k samples is tested negative, then all the k samples are ruled out. Otherwise, all the k samples are tested individually. Suppose that the prevalence rate is p. Then the expected number of tests to decode the M samples by the Dorfman two-stage algorithm is M(1/k + 1 − (1 − p)^k). As such, the expected relative cost (defined in [6] as the ratio of the expected number of tests required by the group testing technique to the number of tests required by individual testing) is 1/k + 1 − (1 − p)^k. As shown in Table I of [6], the optimal group size k is 11, with an expected relative cost of 20%, when the prevalence rate p is 1%.
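The Dorfman relative cost and the optimal group size can be checked numerically. The sketch below is ours (the function names are our own); it simply scans group sizes, assuming the cost formula 1/k + 1 − (1 − p)^k and an arbitrary search cap of k = 100.

```python
# Dorfman two-stage relative cost: one pooled test per k samples, plus
# k individual retests when the pool is positive (probability 1-(1-p)^k).
def dorfman_cost(k, p):
    return 1.0 / k + 1.0 - (1.0 - p) ** k

# Scan group sizes 2..100 for the one minimizing the relative cost.
def optimal_group_size(p):
    return min(range(2, 101), key=lambda k: dorfman_cost(k, p))
```

At p = 1%, this scan recovers the optimal group size 11 with a relative cost of about 19.6%, consistent with the 20% figure quoted from Table I of [6].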

TABLE I. Arrangement of the 9 Samples in a 3 × 3 Rectangular Grid.

Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic

2D-pooling [7]: On a 96-well plate, there are 8 rows and 12 columns. The samples in the same row (resp. column) are pooled into a group. This results in 20 groups for 96 samples. One advantage of this simple 2D-pooling strategy is that it minimizes pipetting errors.
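The 2D-pooling matrix is easy to build explicitly. This is a sketch under the plate layout just described (the row-major sample numbering is our own convention): sample (r, c) lands in row pool r and column pool 8 + c.

```python
# 2D-pooling on a 96-well plate: 8 row pools + 12 column pools give a
# 20 x 96 binary pooling matrix in which every sample appears twice.
ROWS, COLS = 8, 12
H = [[0] * (ROWS * COLS) for _ in range(ROWS + COLS)]
for r in range(ROWS):
    for c in range(COLS):
        sample = r * COLS + c     # row-major numbering (our choice)
        H[r][sample] = 1          # row pool
        H[ROWS + c][sample] = 1   # column pool
```

Each row pool contains 12 samples, each column pool 8, and any two samples share at most one pool (they cannot share both a row and a column).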

P-BEST [8]: P-BEST [8] uses a 48 × 384 pooling matrix constructed from the Reed–Solomon code [18] for pooled testing of COVID-19. In this pooling matrix, each sample is pooled into 6 groups, and each group contains 48 samples. In [8], the authors proposed using a compressed sensing algorithm, the Gradient Projection for Sparse Reconstruction (GPSR) algorithm, for decoding. Though it is claimed in [8] that the GPSR algorithm can detect up to 1% of positive carriers, there is no guarantee that every sample decoded by the GPSR algorithm is correct.

Tapestry [9], [10]: The Tapestry scheme [9], [10] uses Kirkman triples to construct its pooling matrices. For the pooling matrices in [9], [10], each sample is pooled into 3 groups (in their experiments, some samples are only pooled into 2 groups). As such, they are sparser than the one used by P-BEST. However, one restriction of pooling matrices constructed from Kirkman triples is that the column weights must be 3. Such a restriction limits the ability to optimize performance according to the prevalence rate. We note that a compressed-sensing-based decoding algorithm was proposed in [9], [10]. That decoding algorithm further exploits the viral load (Ct value) of each pool and reconstructs the Ct value of each positive sample. It is claimed to be viable not only at low prevalence rates but even at moderate prevalence rates (5%–10%).

IV. PPOL Constructions of Pooling Matrices

In this section, we propose a new family of pooling matrices from packing the pencil of lines (PPoL) in a finite projective plane. Our idea of constructing PPoL pooling matrices was inspired by the constructions of channel hopping sequences for the rendezvous search problem in cognitive radio networks and the constructions of grant-free uplink transmission schedules in 5G networks (see, e.g., [19]–[22]), in particular, the channel hopping sequences constructed by the PPoL algorithm in [19].

A pooling matrix is said to be (d1, d2)-regular if there are exactly d1 (resp. d2) nonzero elements in each column (resp. row). In other words, the degree of every left-hand (resp. right-hand) node in the corresponding bipartite graph is d1 (resp. d2). The total number of edges in the bipartite graph is M·d1 = N·d2 for an N × M (d1, d2)-regular pooling matrix H. Define the (compressing) gain

g = M/N = d2/d1.

A. Perfect difference sets and finite projective planes

As our construction of the pooling matrix is from packing the pencil of lines in a finite projective plane, we first briefly review the notions of difference sets and finite projective planes.

Definition 2: —

(Difference sets) Let 2 ≤ k < n. A set D = {a_1, a_2, …, a_k} ⊂ Z_n = {0, 1, …, n − 1} is called an (n, k, λ)-difference set if for every ℓ ≠ 0 (mod n), there exist at least λ ordered pairs (a_i, a_j) such that a_i − a_j ≡ ℓ (mod n), where a_i, a_j ∈ D. An (n, k, λ)-difference set is said to be perfect if there exists exactly one ordered pair (a_i, a_j) such that a_i − a_j ≡ ℓ (mod n) for every ℓ ≠ 0 (mod n).
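The perfect case (λ = 1) is easy to check by exhaustion. Below is a small checker of our own (the function name is an assumption, not from the paper): D is a perfect (n, k, 1)-difference set in Z_n exactly when every nonzero residue arises exactly once as a difference.

```python
from collections import Counter

# D is a perfect (n, k, 1)-difference set in Z_n iff every nonzero
# residue mod n occurs exactly once among the k(k-1) ordered differences.
def is_perfect_difference_set(D, n):
    diffs = Counter((a - b) % n for a in D for b in D if a != b)
    return all(diffs[l] == 1 for l in range(1, n))
```

For example, {0, 1, 3} is a perfect difference set in Z_7, and {0, 1, 3, 9} is one in Z_13; both are used later in this paper.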

Definition 3: —

(Finite projective planes) A finite projective plane of order m, denoted by PG(2, m), is a collection of m² + m + 1 lines and m² + m + 1 points such that

  • (P1)

every line contains m + 1 points,

  • (P2)

every point is on m + 1 lines,

  • (P3)

    any two distinct lines intersect at exactly one point, and

  • (P4)

    any two distinct points lie on exactly one line.

When m is a prime power, Singer [23] established the connection between an (m² + m + 1, m + 1, 1)-perfect difference set and a finite projective plane of order m through a collineation that maps points (resp. lines) to points (resp. lines) in a finite projective plane. Specifically, suppose that D = {a_0, a_1, …, a_m} is an (n, m + 1, 1)-perfect difference set with n = m² + m + 1.

  • (i)

    Let 0, 1, …, n − 1 be the n points.

  • (ii)

    Let B_ℓ = {a_0 + ℓ, a_1 + ℓ, …, a_m + ℓ} (mod n), ℓ = 0, 1, …, n − 1, be the n lines.

Then these n points and n lines form a finite projective plane of order m.
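Singer's construction can be verified numerically for a small case. The sketch below (our own check, under the assumption D = {0, 1, 3} in Z_7) builds the 7 shifted lines and tests the projective-plane axioms (P3) and (P4) directly.

```python
# Singer's construction for m = 2: the shifts of the perfect difference
# set D = {0, 1, 3} in Z_7 are the 7 lines of a projective plane of
# order 2 (the Fano plane).
n, D = 7, [0, 1, 3]
lines = [frozenset((a + l) % n for a in D) for l in range(n)]

# (P3): any two distinct lines intersect at exactly one point.
assert all(len(lines[i] & lines[j]) == 1
           for i in range(n) for j in range(i + 1, n))
# (P4): any two distinct points lie on exactly one line.
assert all(sum(1 for L in lines if p in L and q in L) == 1
           for p in range(n) for q in range(p + 1, n))
```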

B. The construction algorithm

In this section, we propose the PPoL algorithm for constructing pooling matrices. For this, one first constructs an (n, m + 1, 1)-perfect difference set D = {a_0, a_1, …, a_m} with a_0 = 0 and n = m² + m + 1. Let

B_ℓ = {a_0 + ℓ, a_1 + ℓ, …, a_m + ℓ} (mod n), ℓ = 0, 1, …, n − 1,

be the n lines in the corresponding finite projective plane.

It is easy to see that the m + 1 lines in the corresponding finite projective plane that contain point 0 are B_{n−a_0}, B_{n−a_1}, …, B_{n−a_m}. These m + 1 lines are called the pencil of lines that contain point 0 (as the pencil point). As the only intersection of these m + 1 lines is point 0, these m + 1 lines, excluding point 0, are disjoint and thus can be packed into Z_n \ {0}. This is formally proved in the following lemma.

Lemma 4: —

Let A_i = B_{n−a_i} \ {0}, i = 0, 1, …, m. Then {A_0, A_1, …, A_m} is a partition of {1, 2, …, n − 1}.

Proof. First, note that B_{n−a_0}, B_{n−a_1}, …, B_{n−a_m} are the m + 1 lines that contain point 0. As any two distinct lines intersect at exactly one point, we know that for i ≠ j,

B_{n−a_i} ∩ B_{n−a_j} = {0},

and that for i ≠ j,

A_i ∩ A_j = ∅.

Thus, they are disjoint.

As there are m + 1 points in each line B_{n−a_i} and m points in each A_i, the union ∪_{i=0}^{m} A_i contains (m + 1)m = n − 1 points. These n − 1 points, together with point 0, are exactly the set of n points in the finite projective plane of order m. Thus {A_0, A_1, …, A_m} is a partition of {1, 2, …, n − 1}.
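Lemma 4 can be checked numerically for a small case. The sketch below (our own check, assuming the perfect difference set D = {0, 1, 3, 9} in Z_13, i.e., m = 3) builds the pencil through point 0 and verifies the partition property.

```python
# Lemma 4 for m = 3, n = 13, D = {0, 1, 3, 9}: the sets
# A_i = B_{n - a_i} \ {0} partition {1, ..., n - 1}.
n, D = 13, [0, 1, 3, 9]
B = [set((a + l) % n for a in D) for l in range(n)]   # the n lines
A = [B[(n - a) % n] - {0} for a in D]                 # pencil minus point 0

union = set().union(*A)
assert union == set(range(1, n))             # the union covers Z_n \ {0}
assert sum(len(Ai) for Ai in A) == n - 1     # hence pairwise disjoint
```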

In Algorithm 3, we show how one can construct a pooling matrix from a finite projective plane. The idea is to first construct a bipartite graph with the line nodes on the left and the point nodes on the right. There is an edge between a point node and a line node if that point is in that line. Then we start trimming this line-point bipartite graph to achieve the needed compression ratio. Specifically, we select the subgraph with the m² line nodes that do not contain point 0 (on the left) and the d1·m point nodes in the union A_0 ∪ A_1 ∪ … ∪ A_{d1−1} of the first d1 pencil sets (on the right).

Note that in Algorithm 3, the number of samples has to be m² for some prime power m. However, this restriction may not be met in practice. A simple way to tackle this problem is to add dummy samples to ensure that the total number of samples is m². In the literature, there are more sophisticated methods (see, e.g., the recent work [24]) that further consider the "balance" issue, i.e., samples should be pooled into groups as evenly as possible.

Algorithm 3. The PPoL algorithm

  • Input The number of samples M = m², with m being a prime power, and the degree of each sample d1 ≤ m + 1.

  • Output An N × M binary pooling matrix H with N = d1·m and g = M/N = m/d1.

  • 1: Let n = m² + m + 1 and construct a perfect difference set D = {a_0, a_1, …, a_m} in Z_n (with a_0 = 0).

  • 2: For ℓ = 0, 1, …, n − 1, let
    B_ℓ = {a_0 + ℓ, a_1 + ℓ, …, a_m + ℓ} (mod n)

    be the n lines.

  • 3: Construct a bipartite graph with the n lines on the left and the n points on the right. Add an edge between a point node and a line node if that point is in that line.

  • 4: Remove point 0 and line B_0 from the bipartite graph (and the edges attached to these two nodes). Let H_0 be the (n − 1) × (n − 1) biadjacency matrix of the trimmed bipartite graph with (H_0)_{j,ℓ} = 1 if point j is in B_ℓ.

  • 5: Let A_i = B_{n−a_i} \ {0}, i = 0, 1, …, m, be the m + 1 pencil of lines that contain point 0.

  • 6: Remove the (n − a_i)-th column, i = 1, 2, …, m, in H_0 to form an (n − 1) × m² biadjacency matrix H_1. Note that these m columns correspond to the remaining m lines containing point 0.

  • 7: Let C = ∪_{i=0}^{d1−1} A_i (select the first d1 pencil of lines that contain point 0). Remove the rows of H_1 that are not in C to form a (d1·m) × m² biadjacency matrix H.
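The trimming steps above can be condensed into a short sketch. This is our own compact rendering of Algorithm 3, not the authors' code: it assumes a perfect difference set D with a_0 = 0 is supplied, and the row ordering (sorted point indices) and column ordering (increasing line index) are our own conventions.

```python
# A sketch of the PPoL construction (Algorithm 3): build the line-point
# incidence of the projective plane from a perfect difference set D
# (with a_0 = 0), drop point 0 and the m+1 lines through it, and keep
# only the points in the first d1 pencils.
def ppol_matrix(m, d1, D):
    n = m * m + m + 1
    lines = [set((a + l) % n for a in D) for l in range(n)]
    pencils = [lines[(n - a) % n] - {0} for a in D]    # A_0, ..., A_m
    # Columns: the m^2 lines that do not contain point 0.
    cols = [l for l in range(n) if 0 not in lines[l]]
    # Rows: the d1 * m points in A_0 U ... U A_{d1-1}.
    rows = sorted(set().union(*pencils[:d1]))
    return [[1 if p in lines[l] else 0 for l in cols] for p in rows]

# The worked example below (m = 2, d1 = 2, D = {0, 1, 3}) yields a
# 4 x 4 pooling matrix with column weight d1 = 2 and row weight m = 2.
H = ppol_matrix(2, 2, [0, 1, 3])
```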

Example 5: —

(A worked example of the PPoL algorithm in Algorithm 3) Let M = 4 and d1 = 2 be the inputs of Algorithm 3. In Step 1, let n = 7 and construct the perfect difference set D = {0, 1, 3} in Z_7. In Step 2, let B_0, B_1, …, B_6 be the 7 lines, where B_0 = {0, 1, 3}, B_1 = {1, 2, 4}, B_2 = {2, 3, 5}, B_3 = {3, 4, 6}, B_4 = {4, 5, 0}, B_5 = {5, 6, 1}, and B_6 = {6, 0, 2}. In Step 3, construct the bipartite graph with the 7 lines on the left and the 7 points on the right, and add an edge between a point node and a line node if that point is in that line. This bipartite graph is shown in Fig. 4(a). In Step 4, first remove point 0 and line B_0 along with the edges attached to these two nodes from the bipartite graph. The nodes and the edges that need to be removed are marked in red in Fig. 4(b), and the trimmed bipartite graph is shown in Fig. 4(c). Then, let H_0 be the 6 × 6 biadjacency matrix of the trimmed bipartite graph with (H_0)_{j,ℓ} = 1 if point j is in B_ℓ (rows indexed by points 1, …, 6 and columns by lines B_1, …, B_6), i.e.,

H_0 =
[1 0 0 0 1 0]
[1 1 0 0 0 1]
[0 1 1 0 0 0]
[1 0 1 1 0 0]
[0 1 0 1 1 0]
[0 0 1 0 1 1]

In Step 5, let A_0 = {1, 3}, A_1 = {6, 2}, and A_2 = {4, 5} be the 3 pencil of lines that contain point 0. In Step 6, remove the (n − a_1)-th = 6th and the (n − a_2)-th = 4th columns in H_0 to form a 6 × 4 biadjacency matrix H_1, i.e.,

H_1 =
[1 0 0 1]
[1 1 0 0]
[0 1 1 0]
[1 0 1 0]
[0 1 0 1]
[0 0 1 1]

The two lines that need to be removed are marked in red in Fig. 4(d), and the bipartite graph after removing the two lines is shown in Fig. 4(e). In Step 7, let C = A_0 ∪ A_1 = {1, 2, 3, 6}. Then, remove the rows of H_1 that are not in C to form a 4 × 4 biadjacency matrix H, i.e.,

H =
[1 0 0 1]
[1 1 0 0]
[0 1 1 0]
[0 0 1 1]

The points in A_2 = {4, 5}, along with the edges attached to these nodes, are marked in red in Fig. 4(f). The output of Algorithm 3 in this example is the 4 × 4 binary pooling matrix H.

Fig. 4.

Fig. 4.

An example to demonstrate how the PPoL algorithm in Algorithm 3 works.

Proposition 6: —

The degree of a line node is d1 and the degree of a point node is m.

Proof: As the remaining lines are the lines not containing point 0, each such line intersects each line of the pencil through point 0 at exactly one point, and that point is not point 0. Hence each remaining line intersects each A_i at exactly one point, and thus intersects C = ∪_{i=0}^{d1−1} A_i at exactly d1 points. On the other hand, each of the points in C is on m + 1 lines, exactly one of which contains point 0. As the lines that contain point 0 are removed, each point in C is in m lines of the remaining m² lines.

Proposition 7: —

There is at most one common nonzero element in any two rows (resp. columns) of the pooling matrix H from Algorithm 3, i.e., the inner product of two row vectors (resp. column vectors) is at most 1.

Proof: This is because the bipartite graph with the biadjacency matrix H is a subgraph of the line-point bipartite graph corresponding to a finite projective plane. From (P3) and (P4) of Definition 3, any two distinct lines intersect at exactly one point, and any two distinct points lie on exactly one line. Thus, there is at most one common nonzero element in any two rows (resp. columns) of H from Algorithm 3.

Corollary 8: —

The girth (the minimum length of a cycle) of the bipartite graph with biadjacency matrix H is at least 6.

Proof: As the length of a cycle in a bipartite graph must be an even number, it suffices to show that there does not exist a cycle of length 4. We prove this by contradiction. Suppose that there is a cycle of length 4, containing two line nodes and two point nodes. Then the intersection of the two corresponding lines contains two distinct points. This contradicts (P3) in Definition 3.

Theorem 9: —

Consider using the N × M pooling matrix H from Algorithm 3 for a binary state vector x in a noiseless setting. If the number of positive samples in x is not larger than d1 − 1, then every sample can be correctly decoded by the DD algorithm in Algorithm 1.

Proof: Suppose that there are at most d1 − 1 positive samples. We first show that every negative sample can be correctly decoded by the DD algorithm in Algorithm 1. Consider a negative sample. Since at most d1 − 1 positive samples can be pooled into the d1 groups of this negative sample, and two different samples can be in a common group at most once (Proposition 7), there must be at least one group without positive samples among the d1 groups of this negative sample. Thus, this negative sample can be correctly decoded. Now consider a positive sample. Since at most d1 − 2 other positive samples can be pooled into the d1 groups of this positive sample, and two different samples can be in a common group at most once (Proposition 7), there must be at least one group in which this positive sample is the only positive sample. Thus, every positive sample can be correctly decoded.
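The decodability guarantee can be checked exhaustively on a small instance. The sketch below is ours: it hard-codes a 4 × 4 pooling matrix reconstructed by following Algorithm 3 with m = 2, d1 = 2, D = {0, 1, 3} (an assumption of this illustration), and verifies that a one-pass DD decoder resolves every state vector with at most d1 − 1 = 1 positive sample.

```python
from itertools import combinations

# A 4 x 4 PPoL-style pooling matrix (m = 2, d1 = 2): column weight 2,
# row weight 2, any two columns share at most one group.
H = [[1, 0, 0, 1],
     [1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]

def dd(H, x):
    N, M = len(H), len(H[0])
    # Boolean group results: a group is positive iff it has a positive.
    y = [int(any(H[n][m] and x[m] for m in range(M))) for n in range(N)]
    state = [None] * M
    for n in range(N):              # COMP: negative groups clear samples
        if y[n] == 0:
            for m in range(M):
                if H[n][m]:
                    state[m] = 0
    for n in range(N):              # DD: lone remaining sample is positive
        if y[n] == 1:
            rem = [m for m in range(M) if H[n][m] and state[m] != 0]
            if len(rem) == 1:
                state[rem[0]] = 1
    return state

# Every pattern with at most d1 - 1 = 1 positive is fully decoded.
for pos in list(combinations(range(4), 1)) + [()]:
    x = [1 if m in pos else 0 for m in range(4)]
    assert dd(H, x) == x
```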

C. Connection between the PPoL algorithm and the shifted transversal design

We note that there are other methods that can also generate bipartite graphs that satisfy the property in Proposition 7. For instance, in the recent paper [25], Täufer used the shifted transversal design to generate “mutlipools” (in Definition 1 of [25]) that satisfy the property in Proposition 7 when Inline graphic is a prime (in Theorem 3 of [25]). In this section, we establish the connection between the PPoL design and the shift transversal design when Inline graphic is restricted to a prime. We do this by identifying a mapping between these two designs in the following example.

Example 10: —

Consider M = 9 in the PPoL algorithm. Then m = 3 and n = 13, and let D = {0, 1, 3, 9} be a perfect difference set in Z_13. By using the PPoL algorithm in Algorithm 3, we obtain a bipartite graph with 9 samples (lines) and 12 groups (points) in Fig. 5. In the following, we discuss the four cases with d1 = 1, 2, 3, 4, respectively.

Fig. 5.


The bipartite graph obtained by using Algorithm 3 for Example 10.

  • (i)

    If Inline graphic, Inline graphic. Then Inline graphic are in group 1, Inline graphic are in group 4, and Inline graphic are in group 6. Thus, every sample is contained in exactly one group. (See the black points and lines in Fig. 5.)

  • (ii)

    If Inline graphic, Inline graphic and Inline graphic. Then, in addition to the pooling results in (i), Inline graphic are in group 5, Inline graphic are in group 3, and Inline graphic are in group 12. Thus, every sample is contained in two groups. (See the black and green ones in Fig. 5.)

  • (iii)

    If Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Then, in addition to the pooling results in (i) and (ii), Inline graphic are in group 9, Inline graphic are in group 10, and Inline graphic are in group 2. Thus, every sample is contained in three groups. (See the black, green, and red ones in Fig. 5.)

  • (iv)

    If Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Then, in addition to the pooling results in (i), (ii) and (iii), Inline graphic are in group 7, Inline graphic are in group 8, and Inline graphic are in group 11. Thus, every sample is contained in four groups. (See the black, green, red, and orange ones in Fig. 5.)

The above PPoL pooling strategy is the same as the Inline graphic-multipool in the shifted transversal design [25] if we arrange the 9 samples in the 3 × 3 square in Table I. Specifically, pooling along the rows yields the three groups Inline graphic, Inline graphic, and Inline graphic; this corresponds to the case with Inline graphic in the PPoL design. Pooling along the columns yields the three groups Inline graphic, Inline graphic, and Inline graphic; this corresponds to the case with Inline graphic in the PPoL design. Moreover, pooling with slope 1 (resp. 2) corresponds to the case with Inline graphic (resp. Inline graphic).

In fact, these two constructions are closely related to orthogonal Latin squares [26]. For q = 3 (which is a prime power), there are exactly q − 1 = 2 mutually orthogonal Latin squares: Inline graphic, where Inline graphic is in GF(3). Together with the “vertical” and “horizontal” poolings, the maximum multiplicity in the shifted transversal design is q + 1 = 4. Similarly, the maximum d in the PPoL algorithm is q + 1 = 4. Moreover, pooling matrices that satisfy the decoding property in Theorem 9 are known as superimposed codes [27].
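The correspondence can be made concrete in code. The sketch below (our own construction, following the row/column/slope description above) builds the 12 groups for the 9 samples of a 3 × 3 square and checks that every sample lies in 4 groups and that any two samples share exactly one group:

```python
from itertools import combinations, product

Q = 3  # a prime; samples are the Q*Q cells of a Q x Q square
samples = list(product(range(Q), range(Q)))  # (i, j) = (row, column)

groups = []
groups += [[(i, j) for j in range(Q)] for i in range(Q)]            # rows
groups += [[(i, j) for i in range(Q)] for j in range(Q)]            # columns
for a in range(1, Q):                                               # slopes a = 1, 2
    groups += [[(i, j) for (i, j) in samples if (a * i + j) % Q == v]
               for v in range(Q)]

assert len(groups) == 12 and all(len(g) == Q for g in groups)
# Every sample lies in exactly 4 groups (one per "direction"), as in Example 10.
assert all(sum(s in g for g in groups) == 4 for s in samples)
# Any two distinct samples share exactly one group (Proposition 7's property).
for s, t in combinations(samples, 2):
    assert sum(s in g and t in g for g in groups) == 1
print("12 groups of size 3; every pair of samples shares exactly one group")
```

For a larger prime Q, the same loop yields the Q + 1 "directions" (rows, columns, and Q − 1 slopes) of the shifted transversal design.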

D. Probabilistic analysis of the PPoL pooling matrices

In this section, we conduct a probabilistic analysis of the PPoL pooling matrices. We make the following assumption:

  • (A1)

    All the samples are i.i.d. Bernoulli random variables. A sample is positive (resp. negative) with probability p (resp. 1 − p). The probability p is known as the prevalence rate in the literature.

Note that Inline graphic. Also, let q^+ (resp. q^-) be the probability that the group end of a randomly selected edge is positive (resp. negative). Excluding the randomly selected edge, there are k − 1 remaining edges in that group, and thus

q^- = (1 - p)^{k-1},   (9)

q^+ = 1 - (1 - p)^{k-1}.   (10)

Let P_1^- be the conditional probability that a sample cannot be decoded, given that the sample is a negative sample. Note that a negative sample can be decoded if at least one of its edges is in a negative group, excluding its edge (see Fig. 6). Consider a negative sample, called the tagged sample. Since the girth of the bipartite graph of the pooling matrix is at least 6 (as shown in Corollary 8), the samples in the d groups of the subtree of the tagged sample are distinct (see the tree expansion in Fig. 6). Thus,

P_1^- = (q^+)^d = (1 - (1 - p)^{k-1})^d.   (11)

Fig. 6.


Computing the conditional probability P_1^- by the tree evaluation method.

Let P_2^- be the conditional probability that the sample end of a randomly selected edge cannot be decoded, given that the sample end is a negative sample. Note that the excess degree of a sample (excluding the randomly selected edge) is d − 1. Analogous to the argument for (11) (see the bottom subtree of the tree expansion in Fig. 7), we have

P_2^- = (q^+)^{d-1} = (1 - (1 - p)^{k-1})^{d-1}.   (12)

Fig. 7.


Computing the conditional probability P_1^+ by the tree evaluation method.

Let P_1^+ be the conditional probability that a sample cannot be decoded, given that the sample is a positive sample. Note that a positive sample can be decoded if at least one of its edges is in a group in which all the edges are removed except the edge of the positive sample. Since an edge is removed if its sample end is a negative sample and that sample end is decoded to be negative, the probability that an edge is removed is (1 − p)(1 − P_2^-). If the tree expansion in Fig. 7 is actually a tree, then

P_1^+ = (1 - ((1 - p)(1 - P_2^-))^{k-1})^d.   (13)

We note that since the tree expansion in Fig. 7 may not be a tree for a PPoL pooling matrix generated from Algorithm 3, the identity in (13) is only an approximation. A sufficient condition for the tree expansion in Fig. 7 to be a tree of depth 4 is that the girth of the bipartite graph is larger than 8. (If the graph in Fig. 7 is not a tree, i.e., there is a loop in that graph, then the girth of the bipartite graph is less than or equal to 8.) Unfortunately, the girth of a PPoL pooling matrix can only be proved to be at least 6. Since a sample cannot be decoded with probability p P_1^+ + (1 − p) P_1^-, the average number of tests needed for the DD2 algorithm in Algorithm 2 to decode the N samples is M + N(p P_1^+ + (1 − p) P_1^-). The expected relative cost for the DD2 algorithm with an M × N pooling matrix is

cost = (M + N(p P_1^+ + (1 - p) P_1^-)) / N = 1/G + p P_1^+ + (1 - p) P_1^-,   (14)

where G is the (compressing) gain of the pooling matrix in (2). Note that for a (d, k)-regular pooling matrix, we have from (2) that G = N/M = k/d. Thus, we can use (11), (13) and (14) to find the (d, k)-regular pooling matrix that has the lowest expected relative cost (though (13) is only an approximation for the pooling matrices constructed from the PPoL algorithm). In Table II, we use a grid search to find the (d, k)-regular pooling matrix with the lowest expected relative cost for various prevalence rates p up to 10%. The search regions for the grid search are Inline graphic and Inline graphic. In the last column of this table, we also show the expected relative cost of the Dorfman two-stage algorithm (Table I of [6]). As shown in this table, using the DD2 algorithm (with the optimal pooling matrices) yields significant gains over the Dorfman two-stage algorithm. Unfortunately, not every optimal (d, k)-regular pooling matrix in Table II can be constructed by using the PPoL algorithm in Algorithm 3. In Section VI, we will look for suboptimal pooling matrices that incur only a small performance degradation.

TABLE II. The (d, k)-Regular Pooling Matrix With the Lowest Expected Relative Cost From (14).

p d k cost (14) Dorfman [6]
1% 3 31 0.1218 0.20
2% 4 29 0.1881 0.27
3% 4 22 0.2545 0.33
4% 4 17 0.3147 0.38
5% 3 12 0.3678 0.43
6% 3 11 0.4166 0.47
7% 3 10 0.4627 0.50
8% 2 7 0.5035 0.53
9% 2 6 0.5416 0.56
10% 2 6 0.5760 0.59
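The closed forms above are easy to evaluate numerically. The sketch below is our reconstruction of (11)-(14) from the tree-expansion argument (the symbol and function names are ours); it reproduces several rows of Table II:

```python
def relative_cost(p, d, k):
    """Expected relative cost (14) of the DD2 algorithm for a (d,k)-regular matrix.

    Closed forms reconstructed from the tree-expansion argument:
      q_plus = 1 - (1-p)**(k-1)                      # group end positive, as in (9)-(10)
      P1_neg = q_plus**d                             # (11)
      P2_neg = q_plus**(d-1)                         # (12)
      P1_pos ~ (1 - ((1-p)*(1-P2_neg))**(k-1))**d    # (13), tree approximation
    """
    q_plus = 1.0 - (1.0 - p) ** (k - 1)
    P1_neg = q_plus ** d
    P2_neg = q_plus ** (d - 1)
    P1_pos = (1.0 - ((1.0 - p) * (1.0 - P2_neg)) ** (k - 1)) ** d
    # gain G = k/d for a (d,k)-regular matrix; cost = 1/G + Pr[second-stage test]
    return d / k + p * P1_pos + (1 - p) * P1_neg

# Reproduce a few rows of Table II.
assert abs(relative_cost(0.01, 3, 31) - 0.1218) < 5e-4
assert abs(relative_cost(0.05, 3, 12) - 0.3678) < 5e-4
assert abs(relative_cost(0.10, 2, 6) - 0.5760) < 5e-4
print("matches Table II at p = 1%, 5%, 10%")
```

A grid search over d and k with this function recovers the optimal designs listed in the table.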

V. Noisy Decoding

In this section, we consider decoding for noisy binary samples. For this, we introduce the noisy model in [13].

Definition 11: —

Define the probability transition function f^+(k, j) (resp. f^-(k, j)) as the probability that a group containing k samples, j of which are positive, is tested positive (resp. negative).

For the noiseless model discussed in the previous section, we have

f^+(k, j) = 1 for j ≥ 1 and 0 for j = 0,   (15)

f^-(k, j) = 1 - f^+(k, j).   (16)

There are several noisy models proposed in the literature (see, e.g., the monograph [13]). Among them, the dilution noise model might be a suitable one for the RT-PCR test. In the dilution noise model, the number of undiluted positive samples in a group containing j positive samples follows a binomial distribution with parameters j and 1 − δ, and the group is tested positive if at least one of its positive samples is not diluted. Intuitively, a positive sample included in a group can be “diluted” with probability δ. The parameter δ is called the dilution probability. The transition probability functions for the dilution noise model are

f^+(k, j) = 1 - δ^j,   (17)

f^-(k, j) = δ^j,   (18)

for all 0 ≤ j ≤ k. Another way to view the dilution model is to view the bipartite graph (of the pooling matrix) as a random weighted graph, where the edge weights are independent Bernoulli random variables with parameter 1 − δ. In the following analysis, we say an edge is diluted (resp. not diluted) if its edge weight is 0 (resp. 1). When an edge is diluted, the sample end of that edge does not affect the test result of the group end of that edge. On the other hand, when an edge is not diluted and its sample end is positive, then the group end of that edge is tested positive.
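In code, writing delta for the dilution probability, the transition functions of the dilution model are one-liners (a minimal sketch with our own function names):

```python
def f_neg(delta, j):
    """Dilution model: a group with j positive samples tests negative only if
    all j positive edges are independently diluted, each with probability delta."""
    return delta ** j

def f_pos(delta, j):
    return 1.0 - f_neg(delta, j)

# Sanity checks.
assert f_neg(0.1, 0) == 1.0                 # a group with no positives is always negative
assert f_pos(0.0, 3) == 1.0                 # delta = 0 recovers the noiseless model
assert abs(f_pos(0.1, 2) - 0.99) < 1e-12    # both positives diluted with probability 0.01
```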

For the dilution model, there might be false negatives and false positives if we use the DD algorithm for decoding. This is because a positive sample might be diluted during the pooling process and thus mistakenly decoded as a negative sample. In turn, a negative sample might be pooled into a group with such a false negative and thus be mistakenly decoded as a positive sample by the DD algorithm (which assumes the only remaining sample in a positive group is positive). To ensure that there are no false positives, we could run only the COMP step in the DD algorithm and have the un-decoded samples tested one at a time at the second stage. However, there are still false negatives due to dilution. To reduce the false negatives, we propose the k̃-combinatorial orthogonal matching pursuit (k̃-COMP) algorithm (see Algorithm 4), which decodes a sample as negative only if it is pooled in at least k̃ negative groups. When k̃ = 1, this reduces to the original COMP step in the DD algorithm.

Algorithm 4. The k̃-combinatorial orthogonal matching pursuit (k̃-COMP) algorithm for diluted binary samples

  • Input: An M × N pooling matrix and a binary M-vector of the group test results.

  • Output: An N-vector of the test results of the N samples.

  • 0: Initially, every sample is marked “un-decoded.”

  • 1: If a sample is pooled in at least k̃ negative groups, then that sample is decoded to be negative.

  • 2: For those “un-decoded” samples, test them one at a time.
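A minimal sketch of Step 1 of Algorithm 4 is given below. The pooling design and the scenario are our own toy example (the 7-line Fano-plane design with d = 3, not one of the paper's matrices): a single positive sample has one diluted edge, so 1-COMP produces a false negative, while 2-COMP leaves the sample un-decoded and the second-stage individual test catches it.

```python
LINES = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]

def k_comp(group_results, k_tilde):
    """Step 1 of Algorithm 4: decode a sample as negative if it is pooled in at
    least k_tilde negative groups; everything else is retested individually."""
    negative, retest = set(), set()
    for s, pts in enumerate(LINES):
        if sum(not group_results[g] for g in pts) >= k_tilde:
            negative.add(s)
        else:
            retest.add(s)
    return negative, retest

# Sample 0 is the only positive; its edge to group 0 is diluted, so group 0
# (which contains no other positive sample) falsely tests negative.
positives = {0}
diluted_edges = {(0, 0)}  # (sample, group)
results = [any(s in positives and (s, g) not in diluted_edges
               for s, pts in enumerate(LINES) if g in pts)
           for g in range(7)]

neg1, retest1 = k_comp(results, k_tilde=1)
assert 0 in neg1                 # 1-COMP: a false negative
neg2, retest2 = k_comp(results, k_tilde=2)
assert retest2 == {0}            # 2-COMP: sent to the second stage instead
```

Note that with k̃ = 2 all the true negatives are still decoded at the first stage; only the diluted positive is deferred.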

Now we provide a probabilistic analysis of the k̃-COMP algorithm. As in Section IV-D, let q^- be the probability that the group end of a randomly selected edge is negative. Excluding the randomly selected edge, there are k − 1 remaining edges in that group. Conditioning on the event that j of these k − 1 remaining edges are not diluted, the probability that the group end of the randomly selected edge is negative is (1 − p)^j (as in (9)). Thus, we have

q^- = Σ_{j=0}^{k-1} C(k-1, j) (1-δ)^j δ^{k-1-j} (1-p)^j = (1 - (1-δ)p)^{k-1}.   (19)

Following the argument in Section IV-D, let P_1^-(k̃) be the conditional probability that a sample cannot be decoded, given that the sample is a negative sample. Note that a negative sample can be decoded if at least k̃ of its edges are in negative groups (excluding its own edges). Thus,

P_1^-(k̃) = Σ_{i=0}^{k̃-1} C(d, i) (q^-)^i (1 - q^-)^{d-i},   (20)

where q^- is given in (19). We note that (20) reduces to (11) when k̃ = 1 and δ = 0.

Now, we compute the false negative rate, P_FN, which is defined as the conditional probability that a sample is decoded to be negative, given that the sample is a positive sample. Consider a positive sample. Conditioning on the event that i of the d edges of this positive sample are diluted (so that only the corresponding i groups can be tested negative), the probability that this positive sample is decoded to be negative is

Σ_{j=k̃}^{i} C(i, j) (q^-)^j (1 - q^-)^{i-j},   (21)

as shown in (20). Thus,

P_FN = Σ_{i=0}^{d} C(d, i) δ^i (1-δ)^{d-i} Σ_{j=k̃}^{i} C(i, j) (q^-)^j (1 - q^-)^{i-j}.   (22)

In particular, for k̃ = 1, we have

P_FN = 1 - (1 - δ q^-)^d.   (23)

Now, we compute the true positive rate, P_TP, or sensitivity, which is defined as the conditional probability that a sample is decoded to be positive, given that the sample is a positive sample. Since the un-decoded samples are tested individually at the second stage,

P_TP = 1 - P_FN.   (24)

The expected number of un-decoded samples after Step 1 of the k̃-COMP algorithm is N(p(1 - P_FN) + (1 - p)P_1^-(k̃)). Thus, the expected relative cost for the k̃-COMP algorithm is

cost = 1/G + p(1 - P_FN) + (1 - p)P_1^-(k̃).   (25)
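The analysis above can be evaluated numerically. The sketch below is our reconstruction of (19), (20), (22), and (25) (the symbol and function names are ours); as a sanity check, setting the dilution probability to 0 with k̃ = 1 recovers the noiseless expression (11):

```python
from math import comb

def q_neg(p, k, delta):
    # (19): a group end is negative iff each of the other k-1 samples is either
    # negative or diluted: q^- = (1 - (1-delta)*p)**(k-1)
    return (1.0 - (1.0 - delta) * p) ** (k - 1)

def p_undecoded_neg(p, d, k, delta, kt):
    # (20): a negative sample stays un-decoded if fewer than kt of its d groups
    # test negative
    q = q_neg(p, k, delta)
    return sum(comb(d, i) * q**i * (1 - q)**(d - i) for i in range(kt))

def false_negative_rate(p, d, k, delta, kt):
    # (22): condition on the number i of diluted edges of the positive sample;
    # only those i groups can test negative, and the sample is (wrongly) decoded
    # negative if at least kt of them do
    q = q_neg(p, k, delta)
    total = 0.0
    for i in range(d + 1):
        dec_neg = sum(comb(i, j) * q**j * (1 - q)**(i - j) for j in range(kt, i + 1))
        total += comb(d, i) * delta**i * (1 - delta)**(d - i) * dec_neg
    return total

def relative_cost(p, d, k, delta, kt):
    # (25): 1/G plus the expected fraction of samples retested at the second stage
    fn = false_negative_rate(p, d, k, delta, kt)
    return d / k + p * (1 - fn) + (1 - p) * p_undecoded_neg(p, d, k, delta, kt)

# Sanity check: delta = 0 and kt = 1 recover the noiseless expression (11).
p, d, k = 0.02, 3, 13
assert abs(p_undecoded_neg(p, d, k, 0.0, 1) - (1 - (1 - p)**(k - 1))**d) < 1e-12
assert false_negative_rate(p, d, k, 0.0, 1) == 0.0
```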

VI. Numerical Results

A. Noiseless decoding

In this section, we compare the performance of various pooling matrices by using the DD2 algorithm in Algorithm 2. The first four pooling matrices are constructed by using the PPoL algorithm in Algorithm 3 with the parameters (31,3), (23,4), (13,3), and (7,2), respectively. The fifth pooling matrix is the one used in P-BEST [8]. The sixth is the pooling matrix constructed from the Kirkman triples. The next two pooling matrices are used in Tapestry [9], [10]. The last one is the 2D-pooling matrix in [7]. In Table III, we show the basic information of these pooling matrices. The size of an M × N pooling matrix indicates that the number of groups is M and the number of samples is N. The parameter d is the number of groups in which a sample is pooled, and k is the number of samples in a group. Note that some pooling matrices are not (d, k)-regular. For instance, in the 2D-pooling matrix, there are 8 groups with 12 samples and 12 groups with 8 samples. Also, both matrices used in Tapestry are not (d, k)-regular. The column marked row cor. (resp. col. cor.) is the maximum inner product of two rows (resp. columns) of a pooling matrix. The column marked girth is the minimum length of a cycle in the bipartite graph corresponding to the pooling matrix. The column marked (comp.) gain is the compressing gain G of a pooling matrix, which is the ratio of the number of columns (samples) to the number of rows (groups), i.e., G = N/M. As shown in Table III, both the row correlation and the column correlation of the pooling matrices constructed from the PPoL algorithm in Algorithm 3 are 1, and so are those of the pooling matrix constructed from the Kirkman triples.
Such a correlation result is expected from Proposition 7. On the other hand, the row correlation and the column correlation of the pooling matrix in P-BEST [8] are 6 and 2, respectively. Also, the girth of the pooling matrix in P-BEST is only 4, which is smaller than that of the four PPoL matrices. The girth of the first pooling matrix in Tapestry is also 4. This shows that the pooling matrices from the PPoL algorithm are more “spread-out” than the pooling matrix in P-BEST and the first pooling matrix in Tapestry.

TABLE III. Basic Information of Some Pooling Matrices.

matrix size d k row cor. col. cor. girth (comp.) gain
PPoL-(31,3) Inline graphic 3 31 1 1 6 10.33
PPoL-(23,4) Inline graphic 4 23 1 1 6 5.75
PPoL-(13,3) Inline graphic 3 13 1 1 6 4.33
PPoL-(7,2) Inline graphic 2 7 1 1 8 3.5
P-BEST Matrix [8] 48 × 384 6 48 6 2 4 8
Kirkman Matrix Inline graphic 3 7 1 1 6 2.33
Tapestry Matrix Inline graphic [9] Inline graphic 2-3 6-9 3 2 4 2.5
Tapestry Matrix Inline graphic [9] Inline graphic 2-3 6-7 1 1 6 2.5
2D-pooling Matrix [7] 20 × 96 2 12(8) 1 1 8 4.8
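The 2D-pooling row of Table III can be verified directly. The sketch below is a hedged reconstruction of the design in [7] (96 samples on an 8 × 12 grid, pooling every row and every column); it checks the size, the degree d = 2, the compressing gain 4.8, and that any two samples share at most one group:

```python
from itertools import combinations, product

ROWS, COLS = 8, 12
samples = list(product(range(ROWS), range(COLS)))               # 96 samples
groups = [[(i, j) for j in range(COLS)] for i in range(ROWS)]   # 8 groups of 12
groups += [[(i, j) for i in range(ROWS)] for j in range(COLS)]  # 12 groups of 8

M, N = len(groups), len(samples)
assert (M, N) == (20, 96)
assert abs(N / M - 4.8) < 1e-12                                 # compressing gain G = 4.8
assert all(sum(s in g for g in groups) == 2 for s in samples)   # d = 2
# Two samples share a group iff they are in the same row or the same column,
# and then they share exactly one group: column correlation 1.
for s, t in combinations(samples, 2):
    assert sum(s in g and t in g for g in groups) <= 1
print("20x96, d = 2, gain 4.8, pairwise overlap <= 1")
```

The shortest cycle in this bipartite graph is a "rectangle" (two rows and two columns), which is why the girth is 8.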

In practical situations, the prevalence rates of COVID-19 are basically in the range of 0% to 5%. As such, we conduct 10,000 independent experiments for each value of the prevalence rate p in this range to compare the performance of the pooling matrices in Table III. Each numerical result is obtained by averaging over these 10,000 independent experiments, so we believe the simulation results should be applicable to practical situations.

In Fig. 8, we show the (measured) conditional probability P_1^- (that a sample cannot be decoded given it is a negative sample) for these pooling matrices. For the PPoL pooling matrices, the measured values match extremely well with the theoretical results from (11). As shown in this figure, the Kirkman matrix and the two matrices in Tapestry have the best performance. This is because their group sizes k are small (below 9 for these three matrices). As such, the probability that a group tests negative is higher than for the other pooling matrices. Note that these three matrices also have low (compressing) gains, 2.33-2.5. On the other hand, P-BEST has the worst performance for P_1^-, as the number of samples in a group for that matrix is 48, the largest among all these pooling matrices.

Fig. 8.


The conditional probability P_1^- (that a sample cannot be decoded given it is a negative sample) as a function of the prevalence rate p for various pooling matrices.

In Fig. 9, we show the (measured) conditional probability P_1^+ (that a sample cannot be decoded given it is a positive sample) for these pooling matrices. Once again, the Kirkman matrix and the two matrices in Tapestry have the best performance. This is mainly due to the low (compressing) gains of these three matrices. Though not shown in Fig. 9, we note that the measured values are very close to those from (13), and thus the tree expansion in Fig. 7 is indeed tree-like.

Fig. 9.


The conditional probability P_1^+ (that a sample cannot be decoded given it is a positive sample) as a function of the prevalence rate p for various pooling matrices.

As discussed in Section IV-D, the probability that a sample cannot be decoded is p P_1^+ + (1 − p) P_1^-. Such a probability is also the probability that a sample needs to go through the second stage for individual testing. In Fig. 10, we show this probability as a function of the prevalence rate p for various pooling matrices. As shown in this figure, the Kirkman matrix and the two matrices in Tapestry have the best performance. Once again, this is mainly due to the low (compressing) gains of these three matrices. We note that it takes additional time to run the second stage. The numerical results in Fig. 10 thus imply that using the Kirkman matrix (or the two matrices in Tapestry) yields the shortest expected time to obtain a test result.

Fig. 10.


The probability p P_1^+ + (1 − p) P_1^- (that a sample cannot be decoded at the first stage and should be tested individually at the second stage) as a function of the prevalence rate p for various pooling matrices.

A fair comparison of these pooling matrices is to measure their expected relative costs (defined in [6]). Recall that the expected relative cost is the ratio of the expected number of tests required by the group testing technique to the number of tests required by individual testing. In Fig. 11, we show the (measured) expected relative costs for these pooling matrices. In this figure, we also plot the curve for the Dorfman two-stage algorithm (the black curve) with the optimal group size chosen from Table 1 of [6] for each prevalence rate p. To our surprise, the curves for the Kirkman matrix and the two matrices in Tapestry are above the black curve. This means that the expected relative costs of these three matrices are higher than those of the (optimized) Dorfman two-stage algorithm. Thus, if the additional amount of time to go through the second stage is not critical, using other pooling matrices could lead to more cost reduction than using these three matrices. Several pooling matrices have very low relative costs when the prevalence rates are below 1%; the P-BEST pooling matrix is one of them. However, the relative cost of the P-BEST pooling matrix increases dramatically when the prevalence rates are above 1.3%, and it exceeds that of the (optimized) Dorfman two-stage algorithm when the prevalence rate is above 2.5%. On the other hand, 2D-pooling has a very low relative cost when the prevalence rates are above 2.5%. To summarize, no pooling matrix has the lowest relative cost over the whole range of the prevalence rates considered in our experiments.
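For reference, the Dorfman two-stage baseline (the black curve) can be computed directly: with group size n, the expected relative cost is 1/n + 1 − (1 − p)^n, minimized over n. This short sketch (our own implementation) reproduces the last column of Table II:

```python
def dorfman_cost(p, n):
    """Expected tests per sample for Dorfman two-stage testing with group size n:
    one pooled test per group plus n individual tests when the group is positive."""
    return 1.0 / n + 1.0 - (1.0 - p) ** n

def optimal_dorfman(p, n_max=100):
    """Return the cost-minimizing group size and its expected relative cost."""
    best = min(range(2, n_max + 1), key=lambda n: dorfman_cost(p, n))
    return best, dorfman_cost(p, best)

# Reproduce the last column of Table II (rounded to two decimals).
for p, expected in [(0.01, 0.20), (0.05, 0.43), (0.10, 0.59)]:
    _, cost = optimal_dorfman(p)
    assert round(cost, 2) == expected
print("matches the Dorfman column of Table II")
```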

Fig. 11.


The expected relative cost as a function of the prevalence rate Inline graphic for various pooling matrices.

To optimize the performance, one should choose the right pooling matrix, depending on the prevalence rate. However, this might be difficult, as the exact prevalence rate of a new outbreak of COVID-19 in a region might not be known in advance. Our suggestion is to use suboptimal PPoL matrices for a range of prevalence rates, as shown in Table IV. As shown in this table, the costs computed from the theoretical approximations in (14) and the costs measured from simulations are very close, and they are within 2% of the minimum costs for (d, k)-regular pooling matrices in Table II. From our numerical results in Fig. 11, we suggest using the PPoL-(31,3) matrix when the prevalence rate p is below 2%. In this range of prevalence rates, its expected relative cost is even smaller than that of P-BEST. Moreover, it can achieve an 8-fold reduction in test costs when the prevalence rate is near 1% (as shown in Table IV), and most samples can be decoded in the first stage (as shown in Fig. 10). When the prevalence rate p is between 2% and 4%, we suggest using the PPoL-(23,4) matrix. In this range of prevalence rates, such a pooling matrix can still achieve (at least) a 3-fold reduction in test costs; roughly 17% of the samples need to go through the second stage when the prevalence rate is near 4% (as shown in Fig. 10). When the prevalence rate p is between 4% and 7%, we suggest using the PPoL-(13,3) matrix, which can still achieve (at least) a 2-fold reduction in test costs. When the prevalence rate p is between 7% and 10%, we suggest using the PPoL-(7,2) matrix. Though its expected relative cost is still lower than that of the Dorfman two-stage algorithm, the difference is small.

B. Noisy decoding

In this section, we compare the performance of various pooling matrices in the noisy case. The pooling matrices are the same as those in Section VI-A. We consider two dilution probabilities, Inline graphic and Inline graphic, in the dilution noise model. For the noisy decoding, we use the k̃-COMP algorithm in Algorithm 4 with k̃ = 1 and k̃ = 2.

To compare the performance of these pooling matrices in the noisy case, we conduct 10,000 independent experiments for each value of the prevalence rate p, ranging from 0% to 5%. Each numerical result is obtained by averaging over these 10,000 independent experiments. In Fig. 12, we show the sensitivity for these pooling matrices using the 1-COMP decoding algorithm. For the smaller dilution probability, we observe that the performances of all the pooling matrices are comparable. For the larger dilution probability, the PPoL-(7,2) matrix and the 2D-pooling matrix have the best performance, while the P-BEST pooling matrix has the worst result when the prevalence rate p is less than 1.5%. These results can be explained by the value of d, the number of groups in which each sample is pooled. Specifically, if one of the edges of a positive sample is diluted, then such a sample may be decoded as a negative one. Consequently, a larger d results in a worse sensitivity. In this figure, we also observe that the sensitivity increases with p. The reason is that, by using the 1-COMP decoding algorithm, there are more un-decoded samples as p increases. Such un-decoded samples are tested individually at the second stage, which contributes more true positive samples.

Fig. 12.


The sensitivity as a function of the prevalence rate Inline graphic for various pooling matrices under the dilution noise Inline graphic by using the 1-COMP decoding algorithm.

In Fig. 13, we show the expected relative costs for these pooling matrices using the 1-COMP decoding algorithm. When p is below 1.5%, the value of the dilution probability has little effect on the expected relative costs for all pooling matrices. This is because the number of positive samples is small under low prevalence rates, and hence most of the samples can be decoded at the first stage. The same argument also explains why a higher (compressing) gain of the pooling matrix leads to a lower expected relative cost when p is below 2%. Moreover, as p increases, we observe that the expected relative costs of the P-BEST matrix and the PPoL-(31,3) matrix rise dramatically. The reason is that they have larger group sizes k. Specifically, if a single group contains more samples, this group is more likely to be tested positive, and its samples thus cannot be decoded at the first stage.

Fig. 13.


The expected relative cost as a function of the prevalence rate Inline graphic for various pooling matrices under the dilution noise Inline graphic by using the 1-COMP decoding algorithm.

In Fig. 14 and Fig. 15, we show the sensitivity and the expected relative costs, respectively, for the pooling matrices using the 2-COMP decoding algorithm. Compared with the results for 1-COMP, the sensitivity of 2-COMP shows a considerable improvement when the dilution probability is large. The reason is that for k̃ = 2, a sample can be decoded as negative only when it is pooled in at least 2 negative groups. This greatly enhances the sensitivity, but the expected relative costs increase because more samples need to be tested at the second stage.

Fig. 14.


The sensitivity as a function of the prevalence rate Inline graphic for various pooling matrices under the dilution noise Inline graphic by using the 2-COMP decoding algorithm.

Fig. 15.


The expected relative cost as a function of the prevalence rate Inline graphic for various pooling matrices under the dilution noise Inline graphic by using the 2-COMP decoding algorithm.

In Fig. 13 and Fig. 15, we also plot the curve for the Dorfman two-stage algorithm (the black curve) with its optimal group size for each prevalence rate p. We can see that for k̃ = 1, the PPoL-(31,3) matrix, the P-BEST matrix, the 2D-pooling matrix, and the PPoL-(23,4) matrix have lower expected relative costs than the Dorfman two-stage algorithm because of their higher (compressing) gains. For k̃ = 2, none of these matrices outperforms the Dorfman two-stage algorithm in terms of the expected relative cost.

To sum up, in the dilution noise model, the sensitivity of the 1-COMP decoding algorithm in Algorithm 4 decreases significantly as the dilution noise increases. Though using the 2-COMP decoding algorithm in Algorithm 4 yields a considerable improvement in sensitivity, its expected relative costs may be higher than those of the Dorfman two-stage algorithm. Thus, the simple Dorfman method might be a better strategy for pooled testing in a noisy setting.

VII. Conclusion

In this paper, we proposed a new family of PPoL pooling matrices that have maximum column and row correlations of 1 for a wide range of column weights. Using the two-stage definite defectives (DD2) decoding algorithm, we compared their performance with various pooling matrices proposed in the literature, including 2D-pooling [7], P-BEST [8], and Tapestry [9], [10]. Our numerical results showed that no pooling matrix has the lowest expected relative cost over the whole range of prevalence rates. To optimize the performance, one should choose the right pooling matrix, depending on the prevalence rate. As the family of PPoL matrices can dynamically adjust their construction parameters according to the prevalence rate, using such a family of pooling matrices might lead to better cost reduction than using a fixed pooling matrix. We also considered a noisy setting in this paper. Our numerical results show a trade-off between high sensitivity and low expected relative cost. As such, when the dilution noise is not negligible, the simple Dorfman method might be a better strategy for pooled testing.

In this paper, we only considered binary samples. For ternary samples, there are three test outcomes: negative (0), weakly positive (1), and strongly positive (2). It seems possible to extend the DD2 algorithm for binary samples to the setting with ternary samples by using successive cancellations.

Biographies


Yi-Jheng Lin received the B.S. degree in electrical engineering from National Tsing Hua University, Hsinchu, Taiwan, in 2018. He is currently working toward the Ph.D. degree with the Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan. His research interests include wireless communication and cognitive radio networks.


Che-Hao Yu received the B.S. degree in mathematics and the M.S. degree in communications engineering from National Tsing-Hua University, Hsinchu, Taiwan, in 2018 and 2020, respectively. Since 2020, he has been with Phison Electronics Corp., Taiwan. His research interest focuses on 5G wireless communication.


Tzu-Hsuan Liu received the B.S. degree in communication engineering from National Central University, Taoyuan, Taiwan, in 2018 and the M.S. degree from the Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan, in 2020. Since January 2021, she has been with MediaTek Inc., Hsinchu, Taiwan. Her research interest focuses on 5G wireless communication.


Cheng-Shang Chang (Fellow, IEEE) received the B.S. degree from National Taiwan University, Taipei, Taiwan, in 1983, and the M.S. and Ph.D. degrees from Columbia University, New York, NY, USA, in 1986 and 1989, respectively, all in electrical engineering. From 1989 to 1993, he was a Research Staff Member with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. Since 1993, he has been with the Department of Electrical Engineering, National Tsing Hua University, Taiwan, where he is the Tsing Hua Distinguished Chair Professor. He is the author of the book Performance Guarantees in Communication Networks (Springer, 2000) and a coauthor of the book Principles, Architectures and Mathematical Theory of High Performance Packet Switches (Ministry of Education, R.O.C., 2006). His current research interests include network science, big data analytics, mathematical modeling of the Internet, and high-speed switching. Dr. Chang was an Editor of Operations Research from 1992 to 1999, an Editor of the IEEE/ACM Transactions on Networking from 2007 to 2009, and an Editor of the IEEE Transactions on Network Science and Engineering from 2014 to 2017. He is currently the Editor-at-Large for the IEEE/ACM Transactions on Networking. He is a Member of IFIP Working Group 7.3. He was the recipient of the IBM Outstanding Innovation Award in 1992, an IBM Faculty Partnership Award in 2001, and Outstanding Research Awards from the National Science Council, Taiwan, in 1998, 2000, and 2002. He was also the recipient of Outstanding Teaching Awards from both the College of EECS and the university itself in 2003. He was appointed as the first Y. Z. Hsu Scientific Chair Professor in 2002. He was the recipient of the Merit NSC Research Fellow Award from the National Science Council, R.O.C., in 2011. He was also the recipient of the Academic Award in 2011 and the National Chair Professorship in 2017 from the Ministry of Education, R.O.C., and the 2017 IEEE INFOCOM Achievement Award.

Wen-Tsuen Chen (Life Fellow, IEEE) received the B.S. degree in nuclear engineering from National Tsing Hua University, Taiwan, and the M.S. and Ph.D. degrees in electrical engineering and computer sciences, both from the University of California, Berkeley, in 1970, 1973, and 1976, respectively. He has been with the Department of Computer Science of National Tsing Hua University since 1976 and was the Chairman of the Department, the Dean of the College of Electrical Engineering and Computer Science, and the President of National Tsing Hua University. In March 2012, he joined Academia Sinica, Taiwan, as a Distinguished Research Fellow of the Institute of Information Science, serving until June 2018. He is currently the Sun Yun-suan Chair Professor of National Tsing Hua University. His research interests include computer networks, wireless sensor networks, mobile computing, and parallel computing. Dr. Chen was the recipient of numerous awards for his academic accomplishments in computer networking and parallel processing, including the Outstanding Research Award of the National Science Council, the Academic Award in Engineering from the Ministry of Education, and the Technical Achievement Award and Taylor L. Booth Education Award of the IEEE Computer Society, and is currently the lifelong National Chair of the Ministry of Education, Taiwan. Dr. Chen is the Founding General Chair of the IEEE International Conference on Parallel and Distributed Systems and was the General Chair of the IEEE International Conference on Distributed Computing Systems. He is a Fellow of the Chinese Technology Management Association.

Funding Statement

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grants 109-2221-E-007-091-MY2 and 108-2221-E-007-016-MY3, and in part by Qualcomm Technologies under Grant SOW NAT-435533.

Contributor Information

Yi-Jheng Lin, Email: s107064901@m107.nthu.edu.tw.

Che-Hao Yu, Email: chehaoyu@gapp.nthu.edu.tw.

Tzu-Hsuan Liu, Email: carina000314@gmail.com.

Cheng-Shang Chang, Email: cschang@ee.nthu.edu.tw.

Wen-Tsuen Chen, Email: wtchen@cs.nthu.edu.tw.

References

  • [1].“Coronavirus disease (COVID-19) outbreak,” 2020. [Online]. Available: https://www.who.int/emergencies/diseases/novel-coronavirus-2019
  • [2].Nishiura H., et al., “Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19),” Int. J. Infect. Dis., vol. 94, 2020, Art. no. 154.
  • [3].Chen Y.-C., Lu P.-E., Chang C.-S., and Liu T.-H., “A time-dependent SIR model for COVID-19 with undetectable infected persons,” IEEE Trans. Netw. Sci. Eng., vol. 7, no. 4, pp. 3279–3294, Oct.–Dec. 2020, doi: 10.1109/TNSE.2020.3024723.
  • [4].“Pooled sample testing and screening testing for COVID-19,” 2020. [Online]. Available: https://www.fda.gov/medical-devices/coronavirus-covid-19-and-medical-devices/pooled-sample-testing-and-screening-testing-covid-19
  • [5].“Interim guidance for use of pooling procedures in SARS-CoV-2 diagnostic, screening, and surveillance testing,” 2020. [Online]. Available: https://www.cdc.gov/coronavirus/2019-ncov/lab/pooling-procedures.html
  • [6].Dorfman R., “The detection of defective members of large populations,” Ann. Math. Statist., vol. 14, no. 4, pp. 436–440, 1943.
  • [7].Sinnott-Armstrong N., Klein D., and Hickey B., “Evaluation of group testing for SARS-CoV-2 RNA,” medRxiv, 2020, doi: 10.1101/2020.03.27.20043968.
  • [8].Shental N., et al., “Efficient high-throughput SARS-CoV-2 testing to detect asymptomatic carriers,” Sci. Adv., vol. 6, no. 37, 2020, Art. no. eabc5961.
  • [9].Ghosh S., et al., “Tapestry: A single-round smart pooling technique for COVID-19 testing,” medRxiv, 2020.
  • [10].Ghosh S., et al., “A compressed sensing approach to group-testing for COVID-19 detection,” 2020, arXiv:2005.07895.
  • [11].Yi J., Mudumbai R., and Xu W., “Low-cost and high-throughput testing of COVID-19 viruses and antibodies via compressed sensing: System concepts and computational experiments,” 2020, arXiv:2004.05759.
  • [12].Heidarzadeh A. and Narayanan K., “Two-stage adaptive pooling with RT-qPCR for COVID-19 screening,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2021.
  • [13].Aldridge M., Johnson O., and Scarlett J., “Group testing: An information theory perspective,” Found. Trends Commun. Inf. Theory, vol. 15, no. 3-4, pp. 196–392, 2019.
  • [14].Lohse S., et al., “Pooling of samples for testing for SARS-CoV-2 in asymptomatic people,” Lancet Infect. Dis., vol. 20, no. 11, pp. 1231–1232, 2020.
  • [15].Abdalhamid B., Bilder C. R., McCutchen E. L., Hinrichs S. H., Koepsell S. A., and Iwen P. C., “Assessment of specimen pooling to conserve SARS CoV-2 testing resources,” Amer. J. Clin. Pathol., vol. 153, no. 6, pp. 715–718, 2020.
  • [16].Yelin I., et al., “Evaluation of COVID-19 RT-qPCR test in multi-sample pools,” Clin. Infect. Dis., vol. 71, no. 16, pp. 2073–2078, 2020.
  • [17].Gollier C. and Gossner O., “Group testing against COVID-19,” Covid Econ., vol. 2, 2020.
  • [18].Reed I. S. and Solomon G., “Polynomial codes over certain finite fields,” J. Soc. Ind. Appl. Math., vol. 8, no. 2, pp. 300–304, 1960.
  • [19].Lin Y.-J. and Chang C.-S., “PPoL: A periodic channel hopping sequence with nearly full rendezvous diversity,” in Proc. 30th Wireless Opt. Commun. Conf., 2021.
  • [20].Chang C.-S., Liao W., and Lien C.-M., “On the multichannel rendezvous problem: Fundamental limits, optimal hopping sequences, and bounded time-to-rendezvous,” Math. Oper. Res., vol. 40, no. 1, pp. 1–23, 2015.
  • [21].Chang C.-S., Lee D.-S., and Wang C., “Asynchronous grant-free uplink transmissions in multichannel wireless networks with heterogeneous QoS guarantees,” IEEE/ACM Trans. Netw., vol. 27, no. 4, pp. 1584–1597, Aug. 2019.
  • [22].Chang C.-S., Sheu J.-P., and Lin Y.-J., “On the theoretical gap of channel hopping sequences with maximum rendezvous diversity in the multichannel rendezvous problem,” IEEE/ACM Trans. Netw., vol. 29, no. 4, pp. 1620–1633, Aug. 2021.
  • [23].Singer J., “A theorem in finite projective geometry and some applications to number theory,” Trans. Amer. Math. Soc., vol. 43, no. 3, pp. 377–385, 1938.
  • [24].Hong D., Dey R., Lin X., Cleary B., and Dobriban E., “HYPER: Group testing via hypergraph factorization applied to COVID-19,” medRxiv, 2021.
  • [25].Täufer M., “Rapid, large-scale, and effective detection of COVID-19 via non-adaptive testing,” J. Theor. Biol., vol. 506, 2020, Art. no. 110450.
  • [26].Euler L., “Recherches sur une nouvelle espèce de quarrés magiques” [Research on a new kind of magic squares], Verhandelingen uitgegeven door het zeeuwsch Genootschap der Wetenschappen te Vlissingen, vol. 9, pp. 85–239, 1782.
  • [27].Kautz W. and Singleton R., “Nonrandom binary superimposed codes,” IEEE Trans. Inf. Theory, vol. 10, no. 4, pp. 363–377, Oct. 1964.

Articles from IEEE Transactions on Network Science and Engineering are provided here courtesy of the Institute of Electrical and Electronics Engineers
