To the Editor—Pooling samples has been proposed by multiple authors as an efficient way to test for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1–4]. In particular, Yelin et al [1] showed that SARS-CoV-2 can be detected in pools with up to 32 samples and potentially in pools of 64 samples. They concluded that “this pooling method can be applied immediately in current clinical testing laboratories.” However, this research [1] and similar research of others [2, 3] missed answering a very important question: How does one choose the most efficient pool size relative to SARS-CoV-2 prevalence in samples? Without answering this question, laboratories cannot fully benefit from pooling. Here, we provide the answer so that laboratories can increase their testing capacity to its fullest potential.
The efficiencies from pooling samples occur when pools test negative. In general, the probability of a negative pool is given by $(1 - p)^I$ for a prevalence ($p$) and pool size ($I$) [5]. For example, the most efficient pool size is 4 samples when prevalence is 10% (calculation discussed below). This will lead to 66% of the pools testing negative on average, resulting in 3 tests saved for each negative pool. On the other hand, choosing a pool size that is too large can be very inefficient. By changing the size to 32 samples in our example, only 3% of the pools will test negative. We subsequently show that there are no benefits from using this pool size with this prevalence. Similar inefficiencies also occur when selecting pool sizes that are too small.
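The two percentages quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes independent samples and an assay without testing error; the function name `prob_negative_pool` is hypothetical and chosen only for illustration.

```python
def prob_negative_pool(prevalence: float, pool_size: int) -> float:
    """Probability that none of the samples in a pool is positive.

    Assumes independent samples and a test with no error (an assumption
    of this sketch).
    """
    return (1.0 - prevalence) ** pool_size


# Prevalence of 10% with the two pool sizes discussed in the text.
for size in (4, 32):
    share = prob_negative_pool(0.10, size)
    print(f"pool size {size}: {share:.0%} of pools expected to test negative")
```

With a pool size of 4 the expected share of negative pools is about 66%, and with a pool size of 32 it falls to about 3%.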
Yelin et al [1] identified a range of pool sizes that appear not to compromise testing sensitivity. From this range, one needs to determine the optimal pool size to perform testing most efficiently. Statistical research has shown, in general, that this is the pool size that minimizes the average number of tests on a per capita basis ($A$) when testing a continuous series of samples, where $A$ is a mathematical function of prevalence $p$ [5–7]. Separate testing of each sample corresponds to $A = 1$, and pooling is more efficient when $A < 1$. Expressions for $A$ are available [5–7], and the optimal pool size can be approximated by the next integer larger than $1/\sqrt{p}$ [8] or found exactly [9, 10].
Table 1 provides the optimal pool size and the corresponding $A$ for prevalences between 0.001 and 0.20. For example, a prevalence of 2% results in an optimal pool size of 8 and $A$ = 0.27. This corresponds to a 73% average reduction in tests from pooling. Equivalently, this can mean a 264% increase in testing capacity when compared with testing samples separately. Table 1 also includes $A$ for the same pool sizes as investigated by Yelin et al [1]. These additional results illustrate the importance of choosing pool size relative to prevalence. For example, while SARS-CoV-2 can be detected in pools of size 32, this size is optimal only for the smallest prevalence. In fact, for this pool size, $A > 1$ for prevalences larger than 0.10, indicating that pooling results in more tests on average than separate testing.
Table 1.
Prevalence (%) | Optimal Pool Size | A at Optimal Pool Size | A for Pool Size 2 | A for Pool Size 4 | A for Pool Size 8 | A for Pool Size 16 | A for Pool Size 32 | A for Pool Size 64 |
---|---|---|---|---|---|---|---|---|
0.1 | 32 | 0.06 | 0.50 | 0.25 | 0.13 | 0.08 | 0.06 | 0.08 |
0.5 | 15 | 0.14 | 0.51 | 0.27 | 0.16 | 0.14 | 0.18 | 0.29 |
1 | 11 | 0.20 | 0.52 | 0.29 | 0.20 | 0.21 | 0.31 | 0.49 |
2 | 8 | 0.27 | 0.54 | 0.33 | 0.27 | 0.34 | 0.51 | 0.74 |
3 | 6 | 0.33 | 0.56 | 0.36 | 0.34 | 0.45 | 0.65 | 0.87 |
4 | 6 | 0.38 | 0.58 | 0.40 | 0.40 | 0.54 | 0.76 | 0.94 |
5 | 5 | 0.43 | 0.60 | 0.44 | 0.46 | 0.62 | 0.84 | 0.98 |
6 | 5 | 0.47 | 0.62 | 0.47 | 0.52 | 0.69 | 0.89 | 1.00 |
7 | 4 | 0.50 | 0.64 | 0.50 | 0.57 | 0.75 | 0.93 | 1.01 |
8 | 4 | 0.53 | 0.65 | 0.53 | 0.61 | 0.80 | 0.96 | 1.01 |
9 | 4 | 0.56 | 0.67 | 0.56 | 0.65 | 0.84 | 0.98 | 1.01 |
10 | 4 | 0.59 | 0.69 | 0.59 | 0.69 | 0.88 | 1.00 | 1.01 |
11 | 4 | 0.62 | 0.71 | 0.62 | 0.73 | 0.91 | 1.01 | 1.02 |
12 | 4 | 0.65 | 0.73 | 0.65 | 0.77 | 0.93 | 1.01 | 1.02 |
13 | 3 | 0.67 | 0.74 | 0.68 | 0.80 | 0.95 | 1.02 | 1.02 |
14 | 3 | 0.70 | 0.76 | 0.70 | 0.83 | 0.97 | 1.02 | 1.02 |
15 | 3 | 0.72 | 0.78 | 0.73 | 0.85 | 0.99 | 1.03 | 1.02 |
16 | 3 | 0.74 | 0.79 | 0.75 | 0.88 | 1.00 | 1.03 | 1.02 |
17 | 3 | 0.76 | 0.81 | 0.78 | 0.90 | 1.01 | 1.03 | 1.02 |
18 | 3 | 0.78 | 0.83 | 0.80 | 0.92 | 1.02 | 1.03 | 1.02 |
19 | 3 | 0.80 | 0.84 | 0.82 | 0.94 | 1.03 | 1.03 | 1.02 |
20 | 3 | 0.82 | 0.86 | 0.84 | 0.96 | 1.03 | 1.03 | 1.02 |
Calculations are performed using the binGroup2 package [10] of the R statistical software environment. Abbreviation: A, average number of tests per capita.
Notes
Financial support. This work was supported by the National Institute of Allergy and Infectious Diseases at the National Institutes of Health (R01 AI121351).
Potential conflicts of interest. The authors: No reported conflicts of interest. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.
References
- 1. Yelin I, Aharony N, Shaer Tamar E, et al. Evaluation of COVID-19 RT-qPCR test in multi-sample pools. Clin Infect Dis 2020. doi: 10.1093/cid/ciaa531/5828059
- 2. Lohse S, Pfuhl T, Berkó-Göttel B, et al. Pooling of samples for testing for SARS-CoV-2 in asymptomatic people. Lancet Infect Dis 2020. doi: 10.1016/S1473-3099(20)30362-5
- 3. Hogan CA, Sahoo MK, Pinsky BA. Sample pooling as a strategy to detect community transmission of SARS-CoV-2. J Am Med Assoc 2020; 323:1967–9.
- 4. Abdalhamid B, Bilder CR, McCutchen EL, Hinrichs SH, Koepsell SA, Iwen PC. Assessment of specimen pooling to conserve SARS CoV-2 testing resources. Am J Clin Pathol 2020; 153:715–8.
- 5. Kim HY, Hudgens MG, Dreyfuss JM, Westreich DJ, Pilcher CD. Comparison of group testing algorithms for case identification in the presence of test error. Biometrics 2007; 63:1152–63.
- 6. Hitt BD, Bilder CR, Tebbs JM, McMahan CS. The objective function controversy for group testing: much ado about nothing? Stat Med 2019; 38:4912–23.
- 7. Bilder CR. Group testing for identification. Wiley StatsRef: Statistics Reference Online 2019. doi: 10.1002/9781118445112.stat08227
- 8. Finucan HM. The blood testing problem. Appl Stat 1964; 13:43–50.
- 9. Hitt BD, Bilder CR, Tebbs JM, McMahan CS. A Shiny app for pooled testing. 2020. [updated 2020 May 26; cited 2020 Jun 3]. Available at: https://www.chrisbilder.com/shiny. Accessed 3 June 2020.
- 10. Hitt BD, Bilder CR, Schaarschmidt F, Biggerstaff BJ, Tebbs JM, McMahan CS. binGroup2: identification and estimation using group testing. 2020. Available at: https://CRAN.R-project.org/package=binGroup2. Accessed 3 June 2020.