Abstract
Background
To maintain control of the coronavirus disease 2019 (COVID-19) epidemic as lockdowns are lifted, it will be crucial to enhance alternative public health measures. For surveillance, it will be necessary to detect a high proportion of any new cases quickly so that they can be isolated, and people who have been exposed to them traced and quarantined. Here we introduce a mathematical approach that can be used to determine how many samples need to be collected per unit area and unit time to detect new clusters of COVID-19 cases at a stage early enough to control an outbreak.
Methods
We present a sample size determination method that uses a relative weighted approach, in which COVID-19 test results from sub-populations are weighted by their contribution to detecting the disease at a threshold prevalence level at which the outbreak can be controlled. The method is used to 1) determine whether the expected number of weekly samples provided by current healthcare-based surveillance for respiratory virus infections is already adequate to detect new clusters of COVID-19 and, if not, 2) determine how many additional weekly samples are needed from volunteer sampling.
Results
In a demonstration of our method at the weekly and Canadian provincial and territorial (P/T) levels, we found that only the more populous P/T have sufficient testing numbers from healthcare visits for respiratory illness to detect COVID-19 at our target prevalence level (assumed to be high enough to identify and control new clusters). Furthermore, detection of COVID-19 is most efficient (fewer samples required) when surveillance focuses on healthcare symptomatic testing demand. In the volunteer populations, the higher the contact rates, the higher the expected prevalence level and the fewer the samples needed to detect COVID-19 at a predetermined threshold level.
Conclusion
This study introduces a targeted surveillance strategy, combining both passive and active surveillance samples, to determine how many samples to collect per unit area and unit time to detect new clusters of COVID-19 cases. The goal of this strategy is to allow for early enough detection to control an outbreak.
Keywords: surveillance, detection, COVID-19, outbreak, mathematical approach
Introduction
As in many countries around the world, Canada has implemented lockdowns to control the transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19). Common lockdown measures include travel restrictions and closure of social gathering locations such as restaurants, bars and other indoor entertainment venues. The decision to lift, reduce or stop lockdown measures is multi-criterial, with social, economic and health considerations, and decisions about the extent and timing of lockdowns are made at the federal, provincial and territorial (P/T) and municipal levels. At the simplest level, lockdowns can be relaxed at a defined prevalence level, a strategy used by Germany during its process of lifting lockdowns after the first wave of COVID-19 cases (1). To maintain control of the epidemic as lockdowns are lifted, it is crucial to enhance alternative public health measures (contact tracing, quarantining). Specifically, a high proportion of new cases must be detected quickly so that they can be isolated, and people who have been exposed to or been in contact with these cases must be traced and quarantined. If there is insufficient capacity to test and trace, then a resurgence of the epidemic that may overwhelm healthcare capacity is likely (2,3).
The ability to detect disease in a population depends on the type of surveillance strategy and the number of samples tested. In large populations (i.e. greater than 1,000), a standard approach assumes random sampling from individuals that have equal risk of testing positive for the disease (4). However, if information is known about characteristics contributing to the probability of testing positive, then a targeted approach can be used to optimize sample size determination by weighting samples according to their ability to detect the disease given their characteristics (5).
In Canada there are currently two main strategies for collecting samples in COVID-19 surveillance: 1) healthcare visits and hospital admissions for respiratory illness (healthcare symptomatic testing demand); and 2) testing of at-risk populations, such as essential workers concerned that they may have been exposed to infection (6–8). However, these methods may not yield a sample size large enough to detect new clusters of transmission early enough (i.e. when infection prevalence in the community is low) to ensure that there is sufficient public health capacity to trace and quarantine contacts to control transmission. To achieve a sufficient sample size, a sampling strategy that tests volunteers may be required. This would likely capture more asymptomatic cases than sampling those seeking health care; nevertheless, the value of including volunteer population sampling would be twofold: first, to improve early warning by testing more broadly in the community and thus increase the probability of detecting new clusters; and second, to trigger a public health response at a determined level of prevalence in the population at which control of the outbreak is possible without the need to re-implement widespread lockdowns.
Targeted surveillance strategies can be used to sample efficiently from a population that contains sub-populations with different probabilities of being infected, when the goal is early detection of disease at a given prevalence level (9,10). This approach requires weighting samples by their probability of detection and thus requires information on characteristics that relate to the probability of a positive test result. This information includes factors affecting exposure and information on the frequency of these factors within the population, such as the proportion of people in each exposure category (11).
The probability of a positive test result may also include factors that are inherent to data from passive surveillance (12). The two main strategies for collecting COVID-19 samples in Canada are passive in the sense that people tested have decided to visit a health centre because they have developed symptoms or are at-risk individuals concerned about exposure. In contrast, a volunteer testing strategy is active surveillance in the sense of seeking out people to test. Other studies discuss strategies for accounting for under-ascertainment bias when not all diseased individuals present for health care, in the context of incorporating both passive and active surveillance data (12–14).
At the onset of an emerging disease, there may be insufficient information to account for the challenges of using data from passive surveillance. The goal of this intervention study is to introduce a targeted surveillance strategy that combines passive and active surveillance samples and uses minimal information to determine how many samples to collect per unit area and unit time to detect new clusters of COVID-19 cases at a stage early enough to allow case isolation, contact tracing and contact quarantine, and thereby to control an outbreak.
Methods
To determine the need for volunteer sampling, the first step is to determine if the expected number of samples obtained from healthcare-based surveillance for respiratory virus infections provides a sample size that is already adequate to detect new clusters of COVID-19 at the desired threshold prevalence of infection in the general population for the time frame of interest. If the sample size is found to be inadequate, the second step is to determine how many additional weekly samples are needed from volunteers to detect new clusters of COVID-19 at the desired threshold prevalence in the general population.
We used a relative weighted approach, in which the expected prevalence level in a particular section of the population defines the weight that a sample from that section would have in detecting COVID-19 at the threshold prevalence level, p0. The approach assumes random sampling from within the sampling group. Every sample receives weight points given the expected prevalence in its population group. Sample collection continues until enough points have been reached to detect COVID-19 at p0. We demonstrated our method at the P/T and weekly levels, though this approach can be adjusted to other regional units or time frames.
Step 1: Determine if enough samples are obtained from symptomatic patients in healthcare settings
Pre-COVID-19 in Canada, testing for respiratory viruses was targeted to inpatients, as well as institutional and outbreak settings, where it would have the most impact on clinical care (15). However, COVID-19 testing is now recommended for all symptomatic individuals in Canada (16). Here, data on pre-COVID-19 healthcare visits for people with symptoms of respiratory infections are used to determine the expected number of weekly healthcare visits at which testing for COVID-19 could take place.
For pre-COVID-19 pandemic healthcare visit data, we needed to choose a recent time period during which there was no other pandemic underway. During the H1N1 influenza pandemic in 2009–2010, there were more healthcare visits for viral infection symptoms than in most years. We assumed that if COVID-19 is being controlled at an acceptable level of risk, the expected number of visits will conform to healthcare visits in years other than those in which the H1N1 influenza pandemic occurred. Therefore, we used the mean annual number of reported visits for the non-pandemic time period of 2016 to 2018 as the mean annual expected healthcare visits for Canada (n=13,310,000) (Table 1; Canadian Institute for Health Information, unpublished analysis for Public Health Agency of Canada, 2020). The expected number of weekly healthcare visits per P/T, E, can then be calculated as a function of the population size of the P/T and time unit:
Table 1. Estimated annual number of ambulatory care visits and admissions for respiratory illness during a non-pandemic time period, Canada, 2016–2018.
Type of visit | Number of visits |
---|---|
Hospital admissions | 220,000^a |
Emergency department visits | 1,900,000^a |
Primary healthcare visits | 11,000,000^a |
Number of residents in long-term care homes^b | 190,000 |
TOTAL | 13,310,000 |
a Canadian Institute for Health Information (CIHI). Annual number, average of FY 2016–2018
b Canadian Institute for Health Information, 2020; refers to publicly funded/subsidized long-term care homes
Equation 1:
$$E = \frac{\text{P/T population size}}{\text{Canadian population size}} \times \frac{\text{Canadian annual number of visits}}{52\ \text{weeks}}$$
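As a minimal sketch of Equation 1, the following Python function computes E for one P/T. The function and constant names are illustrative only. The Canadian population size used here (37,900,000) is an assumed placeholder, because the value used in the calculation is not stated in the text; the annual number of visits is the Table 1 total and the Saskatchewan population is the value listed in Table 5.

```python
# Sketch of Equation 1: expected weekly healthcare visits, E, for one P/T.
CANADIAN_ANNUAL_VISITS = 13_310_000        # Table 1 total
ASSUMED_CANADIAN_POPULATION = 37_900_000   # assumption, not given in the text
WEEKS_PER_YEAR = 52


def expected_weekly_visits(pt_population,
                           canadian_population=ASSUMED_CANADIAN_POPULATION,
                           annual_visits=CANADIAN_ANNUAL_VISITS):
    """Scale national annual visits to the P/T population share and to one week."""
    population_share = pt_population / canadian_population
    weekly_national_visits = annual_visits / WEEKS_PER_YEAR
    return population_share * weekly_national_visits


# Saskatchewan population as listed in Table 5.
e_sk = expected_weekly_visits(1_181_666)
print(round(e_sk))
```

With the assumed national population, the Saskatchewan result is close to, but not identical to, the value of E reported in Table 3 (7,982); the difference depends entirely on the placeholder population.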
To determine if E is sufficient to detect COVID-19 as early as possible at an acceptable level of risk it is necessary to define the threshold prevalence level, p0, in the general population to detect and control the eruption of new cases. For reference, Germany used a level at p0 = 0.05% during their process of lifting lockdown (1). This level corresponds to a 7-day period prevalence of 50/100,000. Here we investigate a more cautious value of p0 = 0.025% to correspond to a 7-day period prevalence of 25/100,000.
Healthcare visits for people with symptoms of respiratory infections are expected to have a higher probability of infection than asymptomatic people. We assume a 0.64% prevalence in the healthcare visits population to be a realistic value that can occur when COVID-19 is acceptably controlled and there is a relaxation of public health measures. This value is in the lower range of weekly mean percent positivity reported in the Canadian Network for Public Health Intelligence (CNPHI) System for Analysis of Laboratory Tests (SALT) for the month of May 2020, completing the spring period when maximal public health measures were in place in Canada. Then the weight of contribution of samples from sample group i, here being the healthcare visits population, with a prevalence of p, to detect COVID-19 at p0 during the time frame of interest t, is:
Equation 2:
$$w(i,t) = \frac{p(i,t)}{p_0}$$
This weight is then used to translate weekly number of healthcare visits E into the number of weight points that go towards detecting COVID-19:
Equation 3:
$$wp(i,t) = \frac{E}{w(i,t)}$$
The result, wp(healthcare, t), is then compared with the number of samples needed to detect at least one positive case of COVID-19, d(i,t), in the healthcare visits population using a standard sample size calculation (4):
Equation 4:
$$d(i,t) = \frac{-\ln(1-\alpha)}{p \times f}$$
where α = 0.95 is the confidence of detecting at least one positive case of COVID-19 at a minimum detection threshold p = p(healthcare, t), and f = 0.79 is the test sensitivity for samples from symptomatic people (17). Sample size increases with increasing levels of α. Typical values range from 0.95 to 0.99, and as more information becomes available, it may become evident that higher levels are needed to detect community transmission early enough to control the outbreak. If wp(healthcare, t) < d(healthcare, t), then more samples from members of the public not visiting healthcare are needed to detect at least one positive case of COVID-19 at p0.
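Step 1 can be sketched in a few lines of Python. The function names below are illustrative and not part of the original method description; the parameter values (p0 = 0.025%, healthcare prevalence of 0.64%, f = 0.79, α = 0.95) are those stated in the text, and the values of E are copied from Table 3 for three P/T. Equation 3 is implemented as written, with E divided by the weight.

```python
import math


def weight(p_group, p0):
    """Equation 2: weight of a sample from a group relative to the threshold p0."""
    return p_group / p0


def weight_points(expected_visits, w):
    """Equation 3 as written in the text: weekly visits E divided by the weight."""
    return expected_visits / w


def samples_to_detect(p, sensitivity, confidence=0.95):
    """Equation 4: samples needed to detect at least one positive case."""
    return -math.log(1 - confidence) / (p * sensitivity)


P0 = 0.025 / 100            # threshold prevalence, 0.025%
P_HEALTHCARE = 0.64 / 100   # assumed prevalence among healthcare visits, 0.64%
F_SYMPTOMATIC = 0.79        # test sensitivity for symptomatic people

# Expected weekly visits E for three P/T, copied from Table 3.
E_BY_PT = {"BC": 34_522, "SK": 7_982, "MB": 9_304}

w_health = weight(P_HEALTHCARE, P0)
d_health = samples_to_detect(P_HEALTHCARE, F_SYMPTOMATIC)

short_of_samples = {}
for pt, e in E_BY_PT.items():
    wp = weight_points(e, w_health)
    short_of_samples[pt] = wp < d_health   # True means Step 2 is needed
    status = "short" if short_of_samples[pt] else "sufficient"
    print(pt, round(wp), round(d_health), status)
```

Under these inputs, British Columbia meets the threshold while Saskatchewan and Manitoba do not, which agrees with the classification in Table 3.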
Step 2: Determine how many additional weekly samples are needed from volunteers
If there are not enough healthcare visit samples, a second step is used to calculate how many additional samples are needed from the general population for early detection during the time frame of interest t. Equation 4 is used again, but this time from the perspective of using volunteer sampling to detect COVID-19 at p0, meaning that in Equation 4, p = p0. Furthermore, volunteers are mostly asymptomatic, so we define a lower test sensitivity, f = 0.70, for asymptomatic people (18,19). The result, d(volunteer,t), is then used to calculate the number of additional tests needed from volunteers, given the sampling effort from healthcare visits, E, as:
Equation 5:
$$a(t) = d(\text{volunteer},t) - E$$
To optimize sample collection from volunteers, we apply the relative weighted approach to target sampling by probability of testing positive. Selection of volunteer groups depends on knowledge of, and data availability for, characteristics influencing the probability of testing positive for COVID-19. In demonstration of our method, we defined volunteer groups by level of contact rates according to occupation data (unpublished data from the Centre for Labour Market Information, Statistics Canada at the request of the Public Health Agency of Canada, 2020), though other data characteristics could also be used (e.g. travel history, age group). The premise is that targeting sampling to higher risk groups reduces the overall sample size needed to detect COVID-19. Here we create three plausible volunteer populations whose expected infection prevalence differs according to the number of contacts (low, medium and high numbers of contacts) they have with other people (co-workers or other members of the public) each day according to their occupation. For this example, we use a prevalence of 0.04% for the medium contact group. This is the mean prevalence observed in Alberta for asymptomatic people who were not close contacts or part of outbreak investigations during the period from February 14 to July 5, 2020 (unpublished data from Government of Alberta, 2020). We assume the prevalence levels of the low and high contact groups are then half and twice, respectively, that of the medium contact group (0.02% and 0.08%).
The prevalence from sample group i is used to calculate its weight of contribution, w(i,t), towards detecting COVID-19 at p0 using Equation 2. Then, the number of tests from each volunteer population, in addition to E, needed to detect at least one positive case of COVID-19 at p0 given w(i,t) is calculated as:
Equation 6:
$$v(i,t) = \frac{a(t)}{w(i,t)}$$
The value of v(i,t) is the total number of samples to test if sampling exclusively from that group. The final consideration is to calculate the optimum number of sample-tests needed from all volunteer sample groups given the probability of sampling from their populations. Data from the March 2020 Labour Force Survey (20) and the O*Net occupational database (unpublished data from the Centre for Labour Market Information, Statistics Canada at the request of the Public Health Agency of Canada. 2020) define the proportion of Canadians having jobs with low, medium and high contact rates, proportion(i), as 0.112, 0.392, and 0.494, respectively. Thus, the probability of a sample-test coming from volunteer sampling group i, in a P/T at t, given they are not part of E is:
Equation 7:
$$\Pr(i,t) = \lambda \times \text{proportion}(i)$$
where λ is the probability of not being in the healthcare visits population: λ = 1 – E/(P/T population size). Therefore, the total number of sample-tests needed from all volunteer populations in a P/T at t to detect at least one positive case of COVID-19 at p0 is:
Equation 8:
$$Z(t) = \sum_{i=1}^{J} v(i,t) \times \Pr(i,t)$$
where i is the volunteer sampling group and J is the total number of sampling groups.
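Equations 7 and 8 can be sketched as below for Saskatchewan. The function name is illustrative; the values of v(i,t) are taken from Table 4, the occupation proportions from the text, and E and the population size from Tables 3 and 5. Because λ and Pr(i,t) are left unrounded here, the result is not expected to reproduce the tabulated Z(t) exactly.

```python
def volunteer_total(v_by_group, proportion_by_group, expected_visits, pt_population):
    """Equations 7 and 8: weight each group's v(i,t) by the probability Pr(i,t)."""
    lam = 1 - expected_visits / pt_population                        # lambda
    pr = {g: lam * proportion_by_group[g] for g in v_by_group}       # Equation 7
    return sum(v_by_group[g] * pr[g] for g in v_by_group)            # Equation 8


# Proportion of Canadians in jobs with low, medium and high contact rates (text).
PROPORTION = {"low": 0.112, "medium": 0.392, "high": 0.494}

# Tests needed if sampling exclusively from each group, Saskatchewan (Table 4).
V_SK = {"low": 11_421, "medium": 5_711, "high": 2_855}

SK_POPULATION = 1_181_666   # Table 5
SK_E = 7_982                # Table 3

z_sk = volunteer_total(V_SK, PROPORTION, SK_E, SK_POPULATION)
pct_sk = 100 * z_sk / SK_POPULATION

print(round(z_sk), round(pct_sk, 2))
```

The percentage of the Saskatchewan population obtained this way rounds to 0.41%, the same as in Table 5, while Z(t) itself differs slightly from the tabulated value because of the rounding of intermediate quantities.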
This method depends on population size given the calculation of E. To assess sensitivity to population size, we also show results for Z(t) when p0 = 0.05%, and compare the proportion of the population that must be surveyed when p0 = 0.05% and when p0 = 0.025%.
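The effect of the threshold on the need for volunteer samples can be illustrated by recomputing Equation 5 at both values of p0. This is a sketch only: the function name is illustrative, the values of E are copied from Table 3, and the reading that a non-positive a(t) means no volunteer samples are required is an interpretation rather than a statement from the text.

```python
import math


def additional_tests(p0, expected_visits, sensitivity=0.70, confidence=0.95):
    """Equations 4 and 5 for volunteers: d(volunteer,t) at p = p0, minus E."""
    d_volunteer = -math.log(1 - confidence) / (p0 * sensitivity)
    return d_volunteer - expected_visits


# Expected weekly visits E for two P/T, copied from Table 3.
E_BY_PT = {"SK": 7_982, "MB": 9_304}

results = {}
for p0_pct in (0.025, 0.05):
    for pt, e in E_BY_PT.items():
        a = additional_tests(p0_pct / 100, e)
        results[(pt, p0_pct)] = a
        print(f"p0 = {p0_pct}%  {pt}: a(t) = {a:.0f}")
```

At p0 = 0.05%, a(t) is negative for Manitoba and positive for Saskatchewan, whereas at p0 = 0.025% it is positive for both.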
Results
Here we present results for sample size determined at the P/T and weekly levels. For all P/T, we assumed the same prevalence levels for the sampling groups. Considering only the weight of contribution to detect COVID-19 at p0 given the assumed prevalence of the sampling groups, samples from the healthcare visits population are at least eight times more likely to result in a positive COVID-19 test result than samples from any volunteer group (i.e. 25.6/3.20) (Table 2).
Table 2. Prevalence levels and weights of the healthcare visits population in comparison with the volunteer sample groups with low, medium and high contact rates.
Sample groups, i | Prevalence, p(i,t) (%) | Weight, w(i,t) |
---|---|---|
Healthcare visits | 0.64 | 25.6 |
Volunteers with high contact rates | 0.08 | 3.20 |
Volunteers with medium contact rates | 0.04 | 1.60 |
p0 | 0.025 | 1.0 |
Volunteers with low contact rates | 0.02 | 0.80 |
As is inherent with the calculation, P/T with higher populations will have a higher number of expected healthcare visits, E. Given the high weight of contribution from this population to detect COVID-19 at p0, larger populations will require fewer additional, if any, samples for early detection. If the goal is to detect COVID-19 at p0 for the P/T level during the time frame of interest t, then only British Columbia, Alberta, Ontario and Québec would have a sufficient number of healthcare visit samples (Table 3). This assumes visits for respiratory illness at the assumed prevalence levels when maximal public health measures were in place from mid-March until just before the period of their relaxation in May 2020.
Table 3. Identification of provinces and territories that are short of samples from the healthcare visits population^a,b.
Province/territory | E^c | wp(healthcare, t) | d(healthcare, t) |
---|---|---|---|
BC | 34,522 | 1,349 | 593 |
AB | 29,809 | 1,164 | 593 |
SK^d | 7,982 | 312 | 593 |
MB^d | 9,304 | 363 | 593 |
ON | 99,371 | 3,882 | 593 |
QC | 57,668 | 2,253 | 593 |
NB^d | 5,268 | 206 | 593 |
NS^d | 6,602 | 258 | 593 |
PE^d | 1,068 | 42 | 593 |
NL^d | 3,522 | 138 | 593 |
YK^d | 277 | 11 | 593 |
NT^d | 303 | 12 | 593 |
NV^d | 264 | 10 | 593 |
Abbreviations: AB, Alberta; BC, British Columbia; MB, Manitoba; NB, New Brunswick; NL, Newfoundland and Labrador; NS, Nova Scotia; NT, Northwest Territories; NV, Nunavut; PE, Prince Edward Island; ON, Ontario; QC, Québec; SK, Saskatchewan; YK, Yukon
a Provinces and territories that are short of samples from the healthcare visits population are identified from the number of expected healthcare visits, E, translated into weight points, wp(healthcare, t), and compared with the number of weighted samples, d(healthcare, t), needed to detect COVID-19 in the healthcare visits population at p0
b Values are rounded up
c Expected number of samples from healthcare visits at the provincial/territorial level, E, and this number translated into weight points towards detecting COVID-19, wp(healthcare, t)
d Province or territory that is short of samples from the healthcare visits population
In step 2, it can be seen that low contact rate sample groups require more samples for early detection (Table 4). In calculation of the optimum number of additional samples needed to detect COVID-19 at p0 when augmenting with volunteer samples, Z(t), the low value of E compared with the total population of the P/T results in Pr(i,t) being very similar to the proportion of people with occupations with low, medium and high contact rates (Table 5). The less populous P/T require more volunteer samples for early detection because their E is lower, and hence the percentage of the population that needs to volunteer is higher. When p0 is increased from 0.025% to 0.05%, Manitoba has a sufficient E for early detection and the percentage of the population requiring volunteer sampling in the other P/T is reduced by approximately half or more (Table 5).
Table 4. Number of samples needed to detect COVID-19^a,b.
Province/territory | d(volunteer,t) | n(volunteer, t) | Low contacts | Medium contacts | High contacts |
---|---|---|---|---|---|
SK | 17,118 | 9,137 | 11,421 | 5,711 | 2,855 |
MB | 17,118 | 7,814 | 9,767 | 4,884 | 2,442 |
NB | 17,118 | 11,850 | 14,812 | 7,406 | 3,703 |
NS | 17,118 | 10,516 | 13,145 | 6,573 | 3,286 |
PE | 17,118 | 16,050 | 20,063 | 10,031 | 5,016 |
NL | 17,118 | 13,597 | 16,996 | 8,498 | 4,249 |
YK | 17,118 | 16,841 | 21,051 | 10,526 | 5,263 |
NT | 17,118 | 16,815 | 21,019 | 10,509 | 5,255 |
NV | 17,118 | 16,854 | 21,068 | 10,534 | 5,267 |
Abbreviations: MB, Manitoba; NB, New Brunswick; NL, Newfoundland and Labrador; NS, Nova Scotia; NT, Northwest Territories; NV, Nunavut; PE, Prince Edward Island; SK, Saskatchewan; YK, Yukon
a Number of samples needed to detect COVID-19 at p0 for asymptomatic test sensitivity, d(volunteer, t); number of tests needed in addition to the healthcare visits samples from all volunteer sample groups, n(volunteer, t), and if sampling exclusively from each group with low, medium and high contacts at work
b Values are rounded up
Table 5. Optimum number of additional samples needed to detect COVID-19.
P/T | i | Population size | E | λ | Proportion(i) | Pr(i,t) | wp(i,t) | Z(t) | % P/T | Z(t) at p0 = 0.05% | % P/T at p0 = 0.05% |
---|---|---|---|---|---|---|---|---|---|---|---|
SK | L | 1,181,666 | 7,982 | 0.99 | 0.112 | 0.111 | 11,420 | 4,867 | 0.41 | 307 | 0.26 |
SK | M | | | 0.99 | 0.392 | 0.386 | 5,710 | | | | |
SK | H | | | 0.99 | 0.494 | 0.490 | 2,855 | | | | |
MB | L | 1,377,517 | 9,304 | 0.99 | 0.112 | 0.111 | 9,766 | 4,162 | 0.30 | N/A | N/A |
MB | M | | | 0.99 | 0.392 | 0.386 | 4,883 | | | | |
MB | H | | | 0.99 | 0.494 | 0.490 | 2,441 | | | | |
NB | L | 779,993 | 5,268 | 0.99 | 0.112 | 0.111 | 14,812 | 6,312 | 0.81 | 1,753 | 0.23 |
NB | M | | | 0.99 | 0.392 | 0.386 | 7,406 | | | | |
NB | H | | | 0.99 | 0.494 | 0.490 | 3,703 | | | | |
NS | L | 977,457 | 6,602 | 0.99 | 0.112 | 0.111 | 13,144 | 5,602 | 0.57 | 1,042 | 0.11 |
NS | M | | | 0.99 | 0.392 | 0.386 | 6,572 | | | | |
NS | H | | | 0.99 | 0.494 | 0.490 | 3,286 | | | | |
PE | L | 158,158 | 1,068 | 0.99 | 0.112 | 0.111 | 20,063 | 8,550 | 5.41 | 3,991 | 2.52 |
PE | M | | | 0.99 | 0.392 | 0.386 | 10,031 | | | | |
PE | H | | | 0.99 | 0.494 | 0.490 | 5,016 | | | | |
NL | L | 515,828 | 3,522 | 0.99 | 0.112 | 0.111 | 17,042 | 7,263 | 1.41 | 2,703 | 0.52 |
NL | M | | | 0.99 | 0.392 | 0.386 | 8,521 | | | | |
NL | H | | | 0.99 | 0.494 | 0.490 | 4,261 | | | | |
YK | L | 41,078 | 277 | 0.99 | 0.112 | 0.111 | 21,051 | 8,972 | 21.8 | 4,412 | 10.7 |
YK | M | | | 0.99 | 0.392 | 0.386 | 10,526 | | | | |
YK | H | | | 0.99 | 0.494 | 0.490 | 5,263 | | | | |
NT | L | 44,904 | 303 | 0.99 | 0.112 | 0.111 | 21,019 | 8,958 | 20.0 | 4,398 | 9.80 |
NT | M | | | 0.99 | 0.392 | 0.386 | 10,509 | | | | |
NT | H | | | 0.99 | 0.494 | 0.490 | 5,255 | | | | |
NV | L | 39,097 | 264 | 0.99 | 0.112 | 0.111 | 21,068 | 8,979 | 23.0 | 4,419 | 11.3 |
NV | M | | | 0.99 | 0.392 | 0.386 | 10,534 | | | | |
NV | H | | | 0.99 | 0.494 | 0.490 | 5,267 | | | | |
Abbreviations: H, high; L, low; M, medium; MB, Manitoba; N/A, not applicable; NB, New Brunswick; NL, Newfoundland and Labrador; NS, Nova Scotia; NT, Northwest Territories; NV, Nunavut; PE, Prince Edward Island; P/T, province/territory; SK, Saskatchewan; YK, Yukon
Note: Z(t) is the optimum number of additional samples needed to detect COVID-19 at p0 when augmenting with volunteer samples from the sample groups i. Also shown are the underlying values for the calculation, including λ, the probability of not being in the expected healthcare visits population, E, and the percentage of the provincial/territorial population that would need to participate in volunteer testing at temporal unit t
Discussion
We present a relative weighted approach for calculating the number of sample-tests required to detect at least one case of COVID-19 at a threshold level for early detection and control of new outbreaks. This approach combines expected numbers of tests from healthcare visits, with additional sampling from the general population. From the sampling groups, the probability of detecting COVID-19 is highest from the healthcare visits population. When insufficient samples are available from this group, sampling the general population using a relative weighted approach can provide the additional samples required.
Our approach is more feasible for large populations because they have higher testing numbers from the healthcare visits population. If additional samples are needed, then the proportion of the population required as volunteers is more achievable than with smaller populations. For example, in our demonstration of sample size determination using P/T as the surveillance population, we find that British Columbia, Alberta, Ontario and Québec already have sufficient sample sizes from the healthcare visits population at the weekly level. Augmenting with samples from volunteers requires testing 0.30%–0.81% of the population for Saskatchewan, Manitoba, New Brunswick and Nova Scotia. However, for Prince Edward Island, Newfoundland and Labrador, Yukon, Northwest Territories and Nunavut, 1.41%–23.0% of the population must be tested (Table 5). It is unlikely that this level of compliance and/or ability to travel to testing sites would be achieved. This range reduces to 0.5%–11.3% of the population if assuming p0 = 0.05%, the level used by Germany following its first wave of COVID-19 infections, instead of p0 = 0.025%.
Strengths and limitations
We did not consider test specificity in our approach. Polymerase chain reaction (PCR) tests for SARS-CoV-2 infection show excellent specificity of at least 98% but are more variable in test sensitivity (21,22). Even at 98%, large sample sizes can result in considerable numbers of false positives; for example, 160 false positive test results would be expected from testing 8,000 people. False positive test results can have significant consequences if the person with a false positive result undergoes unnecessary treatment for COVID-19 that endangers their health, whereas for a healthy person a false positive result will only mean self-isolation for a while, with limited impact on their health.
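The false positive figure above follows from multiplying the number of people tested by one minus the specificity. A minimal sketch, with an illustrative function name and the simplifying assumption that prevalence is low enough for essentially everyone tested to be uninfected:

```python
def expected_false_positives(n_tested, specificity):
    """Expected false positives, assuming nearly all people tested are uninfected."""
    return n_tested * (1 - specificity)


false_positives = expected_false_positives(8_000, 0.98)
print(round(false_positives))
```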
The value of our approach is in guiding surveillance efforts at the onset of an emerging disease, when little is known about factors affecting the probability of a sample testing positive for the disease. At the onset of disease emergence, surveillance systems are developing their capacity to test and to collect information that is pertinent for understanding transmission risk. The collection of information about high-risk factors, such as travel history, lags behind that of socio-demographic information such as sex and age group. Furthermore, the association of socio-demographic information with the test result may not yet have been determined. When information on high-risk factors becomes available, approaches that incorporate this type of information into a relative weighted approach can refine estimates of sample size, as shown by Jennelle et al. (10). Such refinement includes accounting for changes in the transmission risk over time as the disease risk grows, peaks and wanes. It also includes accounting for the passive nature of surveillance systems, which results in violation of the assumption that sampling is random; for example, through barriers to accessing healthcare or testing centres in relation to gender, age, occupation or ethnicity. Consequently, overrepresentation of people with certain socio-demographic profiles may skew the accuracy of prevalence values for the sampling groups. At present, sampling to collect nasopharyngeal swabs from patients visiting primary health care is rarely done, so less invasive sampling methods, such as mouth rinse tests, would facilitate reaching target sample sizes.
At the emergence of a novel disease there is likely insufficient information to accurately define the probability of a positive test result, which could otherwise be used to inform sample size determination for early detection. Here we present a method to estimate sample sizes for early detection using limited information, as we show with prevalence levels (both estimated and assumed) from multiple sampling groups. Weighting the contribution of a sample from a given sampling group by its probability of giving a positive test result enables a more efficient sampling strategy for early detection, helping to target surveillance efforts and resources. Ideally, prevalence levels are updated, when possible, to reduce the error in the sample size estimates as the prevalence levels in sampling groups change over time and space. More specifically, P/T can cover large areas, where cities may be separated by hundreds of kilometres and, thus, may be only weakly connected in terms of drivers of infection. There may be multiple epidemiological units within a P/T, meaning that community transmission patterns are more similar within a unit than among neighbouring units. Hence, prevalence levels in the sampling groups can differ among the units over space and time. Metrics resulting from surveillance, such as sample size determination, are ideally calculated at the spatial level of the epidemiological unit (12). The method presented here can be adapted to the level of an epidemiological unit. This approach would ensure that sample size determination for early detection is reflective of the sampling efforts (i.e. E) and prevalence levels for the sampling groups that are unique to the unit during the time frame of interest.
Conclusion
This intervention study introduces a targeted surveillance strategy, combining both passive and active surveillance samples, to determine how many samples to collect per unit area and unit time to detect new clusters of COVID-19 cases. The goal of this strategy is to allow for early enough detection to control an outbreak.
Acknowledgements
None.
Competing interest: None.
Funding: None.
References
- 1.Federal Government of Germany. A balanced outcome: Angela Merkel reports on meeting with state premiers. BREG; (modified 2020-05-06; accessed 2021-03-24). https://www.bundesregierung.de/breg-en/search/merkel-bund-laender-gespraeche-1751090
- 2.Ogden NH, Fazil A, Arino J, Berthiaume P, Fisman DN, Greer AL, Ludwig A, Ng V, Tuite AR, Turgeon P, Waddell LA, Wu J. Modelling scenarios of the epidemic of COVID-19 in Canada. Can Commun Dis Rep 2020. Jun;46(8):198–204. . 10.14745/ccdr.v46i06a08 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Ng V, Fazil A, Waddell LA, Bancej C, Turgeon P, Otten A, Atchessi N, Ogden NH. Projected effects of nonpharmaceutical public health interventions to prevent resurgence of SARS-CoV-2 transmission in Canada. CMAJ 2020. Sep;192(37):E1053–64. . 10.1503/cmaj.200990 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Fosgate GT. Practical sample size calculations for surveillance and diagnostic investigations. J Vet Diagn Invest 2009. Jan;21(1):3–14. . 10.1177/104063870902100102 [DOI] [PubMed] [Google Scholar]
- 5.Hicks AL, Kissler SM, Mortimer TD, Ma KC, Taiaroa G, Ashcroft M, Williamson DA, Lipsitch M, Grad YH. Targeted surveillance strategies for efficient detection of novel antibiotic resistance variants. eLife 2020. Jun;9:e56367. . 10.7554/eLife.56367 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Public Health Ontario. Enhanced epidemiological summary: COVID-19 in Ontario: A summary of wave 1 transmission patterns and case identification. PHO; 2020. https://www.publichealthontario.ca/-/media/documents/ncov/epi/2020/08/covid-19-wave-1-transmission-patterns-epi-summary.pdf?la=en
- 7.Short DL. Alberta Health no longer recommending asymptomatic testing. Edmonton Journal; (modified 2020-09-17). https://edmontonjournal.com/news/local-news/alberta-health-no-longer-recommending-asymptomatic-testing
- 8.BC Centre for Disease Control. Phases of COVID-19 testing in BC. BCCDC; 2020. http://www.bccdc.ca/health-info/diseases-conditions/covid-19/testing/phases-of-covid-19-testing-in-bc
- 9.Walsh DP, Miller MW. A weighted surveillance approach for detecting chronic wasting disease foci. J Wildl Dis 2010 Jan;46(1):118–35. doi:10.7589/0090-3558-46.1.118
- 10.Jennelle CS, Walsh DP, Samuel MD, Osnas EE, Rolley R, Langenberg J, Powers JG, Monello RJ, Demarest ED, Guber R, Heisey DM. Applying a Bayesian weighted surveillance approach to detect chronic wasting disease in white-tailed deer. J Appl Ecol 2018;55(6):2944–53. doi:10.1111/1365-2664.13178
- 11.Dohoo I, Martin W, Stryhn H. Veterinary Epidemiologic Research. 2nd ed. Charlottetown, Canada: VER Inc.; 2009.
- 12.Cameron AR, Meyer A, Faverjon C, Mackenzie C. Quantification of the sensitivity of early detection surveillance. Transbound Emerg Dis 2020 Nov;67(6):2532–43. doi:10.1111/tbed.13598
- 13.Hadorn DC, Stärk KD. Evaluation and optimization of surveillance systems for rare and emerging infectious diseases. Vet Res 2008 Nov-Dec;39(6):57. doi:10.1051/vetres:2008033
- 14.Li X, Chang HH, Cheng Q, Collender PA, Li T, He J, Waller LA, Lopman BA, Remais JV. A spatial hierarchical model for integrating and bias-correcting data from passive and active disease surveillance systems. Spat Spatio-Temporal Epidemiol 2020 Nov;35:100341. doi:10.1016/j.sste.2020.100341
- 15.Public Health Ontario. LABSTRACT – December 2020: Respiratory Virus Testing Update. PHO; 2020. https://www.publichealthontario.ca/-/media/documents/lab/lab-sd-121-respiratory-viral-testing-algorithm-enhanced-surveillance-update.pdf?la=en
- 16.Government of Canada. National polymerase chain reaction (PCR) testing indication guidance for COVID-19. 2020 (accessed 2020-10). https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection/guidance-documents/national-laboratory-testing-indication.html
- 17.Kojima N, Turner F, Slepnev V, Bacelar A, Deming L, Kodeboyina S, Klausner JD. Self-Collected Oral Fluid and Nasal Swab Specimens Demonstrate Comparable Sensitivity to Clinician-Collected Nasopharyngeal Swab Specimens for the Detection of SARS-CoV-2. Clin Infect Dis 2020;ciaa1589.
- 18.Public Health Ontario. COVID-19 Laboratory Testing Q&As. PHO; 2020. https://www.publichealthontario.ca/-/media/documents/lab/covid-19-lab-testing-faq.pdf?la=en
- 19.BC Centre for Disease Control. Interpreting the results of Nucleic Acid Amplification testing (NAT; or PCR tests) for COVID-19 in the Respiratory Tract. BCCDC; (updated 2020-04-03). https://medicalstaff.islandhealth.ca/sites/default/files/covid-19/testing/covid-interpreting-test-results-nat-pcr-bccdc.pdf
- 20.Statistics Canada. Labour Force Survey, March 2020. Statistics Canada; (updated 2020-04-09). https://www150.statcan.gc.ca/n1/en/daily-quotidien/200409/dq200409a-fra.pdf?st=kEd0dF4B
- 21.Mustafa Hellou M, Górska A, Mazzaferri F, Cremonini E, Gentilotti E, De Nardo P, Poran I, Leeflang MM, Tacconelli E, Paul M. Nucleic acid amplification tests on respiratory samples for the diagnosis of coronavirus infections: a systematic review and meta-analysis. Clin Microbiol Infect 2021 Mar;27(3):341–51. doi:10.1016/j.cmi.2020.11.002
- 22.Mironas A, Jarrom D, Campbell E, Washington J, Ettinger S, Wilbacher I, Endel G, Vrazic H, Myles S, Prettyjohns M. How to best test suspected cases of COVID-19: an analysis of the diagnostic performance of RT-PCR and alternative molecular methods for the detection of SARS-CoV-2. medRxiv. 2021.01.15.21249863.