Skip to main content
Cambridge University Press - PMC COVID-19 Collection logoLink to Cambridge University Press - PMC COVID-19 Collection
. 2021 Jan 5;149:e17. doi: 10.1017/S0950268820003155

A discussion on implementing pooling detection tests of novel coronavirus (SARS-CoV-2) for a large population

W K Chow 1,, C L Chow 2
PMCID: PMC7844152  PMID: 33397529

Abstract

A pooled sample analysis strategy for novel coronavirus (severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2)) is proposed for a large population in this paper. The population to be tested is divided into divisions based on earlier observed detection rate of SARS-CoV-2 first. Samples collected are then grouped in appropriate pooled size. The number of tests per person in that population is expressed as a function of two variables: the observed detection rate and the pooled size or number of samples grouped. The minimum number of tests per person can be further shown to be a function of only one of these two variables, because these two parameters are found to be related at this minimum. A management scheme on grouping the samples is proposed in order to reduce the number of tests, to save time, which is of utmost importance in fighting an epidemic. The proposed testing scheme will be useful for supporting the government in making decisions to handle regular routine detection tests for identifying asymptomatic patients and implementing health code system in large population of millions of citizens. Another important point is to use smaller number of test kits, allowing more resources to speed up the mass screening tests, particularly in places not so rich.

Key words: Coronavirus, COVID-19, laboratory tests, mathematical modelling

Introduction

Spreading of the novel coronavirus (severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2)) and the related disease coronavirus disease 2019 (COVID-19) among people is very fast [1]. The origin of such virus is unknown with the transmission route not yet fully understood, though some physical models have been proposed [2, 3] to explain the virus characteristics. There were only 0.75 million confirmed cases in the end of March, 25 million in the end of August, 50 million in the beginning of November and over 70 million in December 2020 all over the world [1]. Lockdown measures were imposed [4] to slow its spread. Most countries have to close their borders with appropriate quarantine containment schemes [5, 6] for weeks, affecting normal life and business activities. The collapse [7] of global health systems caused by COVID-19 could unleash synergistic public health crises (e.g. cholera) caused by both known and unknown opportunistic pathogens.

Millions of detection tests are needed to confirm infection and pick out asymptomatic patients (AP) for each wave of infection, practising universal testing and implementing health codes [8] in large populations. A data-driven approach reveals three key strategies in tackling COVID-19 by Lai and Cheong [9]. The scale of testing has to be suggested by health experts to contain the virus for better effect. Slow testing due to spending a long time on queueing means that AP continue to spread the virus in the community, making the battle against COVID-19 more difficult because of the hidden carriers. A large number of detection tests are required regularly for safely keeping economic activities, travelling, normal school activities and others, particularly on small and medium-sized enterprises (SME). High damages to SME are observed all over the world because the owners have to pay the rent, mortgage, staff salary and other expenses even when the shops are closed. Airlines are closed with aeroplanes parking at dry deserts. Globalisation is seriously affected under such conditions. Different views on the impact of different waves of infection [4, 7] were also reported. Quick identification tests for the SARS-CoV-2 virus is needed regularly for millions of people.

However, a test will normally take over 6 h to complete. The associated testing cost and the cost of the test kit are high, about US$3 to US$5 in Mainland China but over US$30 [10] in many other countries. There are difficulties in getting adequate number of test kits in some countries. The maximum number of tests the government can handle in many places is limited, say only 10 000 a day in the Hong Kong Special Administrative Region (HKSAR) with a population of 7 million in the end of July. With great effort made by the HKSAR government, the number of tests increased to over 30 000 per day in August. This was to be increased to a minimum of 100 000 per day in September with strong support from Mainland China [11].

Pooling tests were reported years ago [12] with application to COVID-19 recently [1316]. Such testing methodology is used in many places [17] by combining the samples to save time (particularly the queueing time) and resources. Different pool sizes [13, 15, 16] are summarised as shown in Table 1.

Table 1.

Brief summary on pool size used in the literature

References Pool sample
Pikovski and Bentele [13] 3, 4 and 5
Yelin et al. [15] 32 and 64
Deckert et al. [16] 3−25, can be up to 27
Adikari et al. [20] 32
Elsevier [27] 10
Majid et al. [28] Up to 64
Mishra et al. [29] 30

Optimal parameters for group testing of pooled specimens for the detection of SARS-CoV-2 were established recently [18]. The most efficient pool size was determined to be five specimens using a web-based application. This is appropriate [19] for testing asymptomatic and return-to-work/school cases, where the prevalence of COVID-19 may be relatively low, large groups of clinical samples can be classified as negative with a single test, with no need to test every sample individually. Group testing can save reagents and personnel time with an overall increase in testing capability of at least 69%. Different pooling strategies [20] can lead to increased process efficiencies for COVID-19 clinical diagnostic testing.

Mathematical strategy of such pooling tests applied to COVID-19 was briefly outlined [14, 21]. There were some criticisms on using the proposal [22] in the HKSAR in reducing the testing time. In fact, quick, reliable and cheap detection testing schemes with smaller number of tests can only be worked out by appropriate safety management scheme. Further works are needed to evaluate whether such mathematical strategy [14, 21] is viable.

Randomised group testing optimised per country could double the number of tested individuals from 1.85 million to 3.7 million using only 671 000 tests [23]. Pooling approaches for SARS-CoV-2 testing allow a drastic increase in throughput while maintaining clinical sensitivity. Successful large-scale pooled screening of asymptomatic populations [24] was reported. As a clinic or hospital is unable to handle a large number of tests, special testing centres are established for testing COVID-19 in many places including Hong Kong.

In this paper, the population to be tested is proposed to be divided into divisions based on the observed recent detection rate of SARS-CoV-2. The number of tests per person for such pooling detection tests can be expressed as a function of two parameters. The parameters are the observed detection rate of getting one positive result in testing a group of m people, and the pooled size or number of samples n in putting together for a pool test in that group of collected samples. The mathematical aspects [25, 26] of the function are discussed in the following sections. A management scheme can be worked out to reduce the number of tests so as to consume a smaller number of test kits and shorten the queuing time so that citizens need not wait for several days to get results. This approach of pooling tests is very appropriate in practising health code system [8] in handling large number of tests regularly.

Mathematical analysis

A population of P people to be tested is divided into (P/m) divisions each of m people first. The total number of tests in this population with a group testing scheme is L. This gives the average number of tests per person, to be expressed as a function f(m, n) of two variables:

graphic file with name S0950268820003155_eqn1.jpg 1

The two variables are:

  • m, the average number of people per 1 positive detection in recent test; and

  • n, pooled size or number of samples grouped together for a pool test.

Samples collected from each person in each division are grouped by n persons for pool testing. There will be about yi positive results in a division i.

In the jth division, the number of positive tests yj is most probably 1. However, there might be some divisions having 0, 2 or 3 in a large population P. The average value of yj is 1. This is because the infected patients are not uniformly distributed in each division. If the government implemented better quarantine containment scheme [5, 6], m can be increased, say instead of having 1 positive result in 100 tests, to 1 positive detection testing in 1000 people. That is, m can be increased from 100 to 1000.

The number of tests Lj in that jth division [26] is

graphic file with name S0950268820003155_eqn2.jpg 2

Summing over all divisions for a large sample of P, say 1 million, with j from 1 to P/m gives

graphic file with name S0950268820003155_eqn3.jpg 3

Summing over all Lj in equation (2) will give L as

graphic file with name S0950268820003155_eqnU1.jpg

or

graphic file with name S0950268820003155_eqn4.jpg 4

by putting equation (3) into equation (4) will give L.

Putting equation (4) into equation (1) gives f(m, n) as

graphic file with name S0950268820003155_eqnU2.jpg

or

graphic file with name S0950268820003155_eqn5.jpg 5

Curves of f(m, n) against the pooled sized n with some values of m = 50, 100 and 150 are shown in Figure 1. As shown in the figure, the minimum value of f(m, n) for m = 100 is 0.2 when n = 10. That means by pooling 10 samples together, only 20% of tests are required in comparing to doing one test for each person in the population.

Fig. 1.

Fig. 1.

Curves of f(m, n) against pooled size n.

In keeping n at some viable values such as n = 5, 10 and 15, variation of f(m, n) against m is shown in Figure 2. Asymptomatic value of f(m, n) is 1/n as shown in the figure.

Fig. 2.

Fig. 2.

f(m, n) against m.

Minimum values

The minimum value of f(m, n) can be found by differentiating equation (5) w.r.t. n, which gives:

graphic file with name S0950268820003155_eqn6.jpg 6

The minimum value of f(m, n) is found for

graphic file with name S0950268820003155_eqn7.jpg 7

The values of n and m are related by equation (7). It is interesting to see the pool size can be determined by the division size m, which is related to the observed detection rate.

The minimum value for f(m, n) is 2/n or 2/Inline graphic, which is reduced to a function of only one variable, say on m to give fmin(m) as

graphic file with name S0950268820003155_eqn8.jpg 8

fmin(m) is plotted against m in Figure 3.

Fig. 3.

Fig. 3.

fmin(m) against m.

Discussion on parameters m and n

Different values of pool size n were suggested and used in the literature [1517]. Pool size of 3−25 was discussed [16] as in Table 1. Other values [2529] had also been reported. Studies [25] have shown that an individual positive sample can still be detected in pools of up to 32 samples, and possibly even 64 samples provided additional polymerase chain reaction (PCR) amplification cycles. A big pool of 64 samples was used [15, 28]. Value of pooling 10 samples [27] is common, but 30 samples were also used [29].

Values of m are different for different risk groups (higher risk for medical professionals or catering services) at different transmission stages. At the beginning stage of virus transmission, the number of confirmed cases is high with patients having obvious symptoms. In testing people in places starting virus transmission [30] at a certain wave, the rate of getting positive results can be high, say one in 100 tests in HKSAR. At a later stage of transmission, the number of confirmed cases decreased to one in 200 tests. The values of n in pooling tests can now be derived from observed m as discussed in the above section.

For populations in places of lower risk, the value of m is larger. For example, 107 AP were identified [31] in testing more than 4 million citizens at Dalian. This gives a low rate of positive results, about one in 40 000, i.e. m is 40 000. As the value of m was changed, the division number can be changed accordingly at different waves of transmission.

The management scheme on detection test

A safety management scheme on identifying infected patients with small number of tests is proposed based on the observed detection rate. For a detection rate of one positive result in m tests the scheme is as follows:

  • Divide the collected population samples P into (P/m) divisions each of m people.

  • The individual testing samples collected are split into two parts with the first part kept first.

  • The second part is grouped with others to give n (n2 = m) samples for pool test.

  • If the test result is negative, the first part of all n samples can be destroyed.

  • If there are positive results in the group, the first part of the n samples will be tested individually.

  • The minimum number of tests per person is only 2/Inline graphic as discussed above.

In a division of 100 people, samples collected can be grouped as Inline graphic or 10 samples in a pool to test. The number of tests can be reduced to only 20% for individual tests of the population. Reducing the number of tests will consume less test kits, shorten the queuing time for doing the test and reduce the total cost. That means instead of testing 300 000 people a day, the maximum number of people tested can be 1 500 000 a day. A population of 7 million takes less than 5 days to test all citizens.

Testing kits usually have a limit of detection sensitivity, which limits the pool size n. Thus the minimum number of tests, or equivalently the maximum value of pool size n as obtained by mathematical analysis in this paper is only a theoretical value. The pool size is constrained by sensitivity limit of the test, and n could be less than the theoretical value. An initial value of n can be estimated first using the curves in Figure 1 under the ideal scenario. Observed detection limit and testing efficiency are then included to tune up the value for optimisation. The maximum value of m depends on the maximum value of n. At the moment, n of 64 is observed to be viable, giving m of n2 as 64 × 64 or 4096. There is a maximum value of n so maximum m is n2. Also, there is asymptomatic value of f in view of Figures 2 and 3 as m increases but with n fixed because the number of re-test becomes small as m increases.

Governments are always under criticisms not to arrange carrying out detection tests fast. They are also challenged to have management strategy without scientific support. Recommending a pooled testing strategy would be interesting to the governments to conduct interventions in a way that is more resource friendly. Mathematical studies to work out appropriate testing management strategy as above can assist them in convincing the citizens.

Further, issuing health codes [8] for all citizens can ensure that they are safe to travel. But a large number of tests are required regularly because the code is only valid for 3 days at most. For people coming from higher risk areas or groups, the health code only lasts for 1 day, depending on the safety management scheme. The government can implement a scheme for citizens on getting health code for travelling, depending on the maximum number of tests that can be arranged per day.

Conclusions

In view of the high transmission rate of COVID-19, pooling detection tests is more efficient for a large population. This paper aims at providing an optimal way for testing samples that could release some manpower and potentially speed up the process of testing. The population to be tested is proposed to be divided into divisions based on the recently observed detection rate. The average number of tests per person can be expressed as a function of the observed detection rate and the pool size. The number of people m having one positive test result is used in determining the number of divisions in the population and pool size n. This scheme can save a lot of time as the number of tests can be greatly reduced. The scheme is therefore very appropriate for identifying AP in a large population, particularly in some places having difficulties in getting test kits.

Health codes are implemented with the code valid for a certain period but it is reduced from 7 days to 3 days in many places. Consequently, a large number of tests are required for millions of people. If the government can handle 0.3 million tests a day, implementing this pooling detection scheme can test 1.5 million samples a day for keeping the health code system viable. This testing strategy is on handling millions of citizens quickly for implementing the health code system which is valid only for a few days. Another important point is to use a smaller number of test kits. This would be useful for government in places with limited resources to implement mass screening for millions of their citizens. The whole idea should be at least not losing any ability in detection, but with acceleration in testing.

This is the first stage of study for implementing universal testing and health code for travelling by using smaller number of test kits. A mass testing scheme proposed to accelerate testing needs support from mathematical principles. Further study should go beyond the mathematics, adding more scenarios or sensitivity analysis to make this scheme more realistic. A more rigorous approach is to compare the pooled method to the unpooled method. The pooled testing should not be significantly inferior to those approved testing. More resources are needed to carry out such studies for comparison. At the moment, based on a summary of pooled testing [14], only a mathematical strategy as mentioned above was reported.

It is important to explore scientific principles on handling virus outbreaks such as developing vaccines and specific medicine. But it takes a long time to get research funding to carry out research and to validate the medicine at various levels of tests, including final clinical tests. Engineering technology hardwares have to be worked out accordingly from selecting scientific principles available and judge whether they are viable. Management has to ensure the hardware systems work as designed and monitored by Authority. Software management needs to control hardware engineering systems. Only with the synergism between science and management could virus outbreaks be under control. Government can only implement regulations effectively with support from citizens. To go one step further, safety can only be achieved on having technology, management and culture.

Financial support

This research received no specific grant from any funding agency, commercial or not-for-profit sectors.

Conflict of interest

None.

Data availability statement

Data are available from the authors upon request.

References

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Data are available from the authors upon request.


Articles from Epidemiology and Infection are provided here courtesy of Cambridge University Press

RESOURCES