Abstract
Seroprevalence studies have been used throughout the COVID-19 pandemic to monitor infection and immunity. These studies are often reported in peer-reviewed journals, but the academic writing and publishing process can delay reporting and thereby public health action. Seroprevalence estimates have been reported faster in preprints and media, but with concerns about data quality. We aimed to (i) describe the timeliness of SARS-CoV-2 serosurveillance reporting by publication venue and study characteristics and (ii) identify relationships between timeliness, data validity, and representativeness to guide recommendations for serosurveillance efforts.
We included seroprevalence studies published between January 1, 2020 and December 31, 2021 from the ongoing SeroTracker living systematic review. For each study, we calculated timeliness as the time elapsed between the end of sampling and the first public report. We evaluated data validity based on serological test performance and correction for sampling error, and representativeness based on the use of a representative sample frame and adequate sample coverage. We examined how timeliness varied with study characteristics, representativeness, and data validity using univariate and multivariate Cox regression.
We analyzed 1844 studies. Median time to publication was 154 days (IQR 64–255), varying by publication venue (journal articles: 212 days, preprints: 101 days, institutional reports: 18 days, and media: 12 days). Multivariate analysis confirmed the relationship between timeliness and publication venue and showed that general population studies were published faster than special population or health care worker studies; there was no relationship between timeliness and study geographic scope, geographic region, representativeness, or serological test performance.
Seroprevalence studies in peer-reviewed articles and preprints are published slowly, highlighting the limitations of using the academic literature to report seroprevalence during a health crisis. More timely reporting of seroprevalence estimates can improve their usefulness for surveillance, enabling more effective responses during health emergencies.
Keywords: Public health surveillance, Seroprevalence, Infectious disease, COVID-19, Reporting, Bibliometrics
1. Introduction
Timely information about population immunity can be critical for effective public health decision-making, as emphasized throughout the COVID-19 pandemic. Seroprevalence studies estimate the prevalence of antibodies and are crucial sources of this information. Estimates of seroprevalence can inform scenario modeling, public health planning, and national policies.
During the COVID-19 pandemic, seroprevalence estimates have primarily been generated through a patchwork of standalone research studies rather than systematic, ongoing public health surveillance efforts, raising questions about their public health impact (Arora et al., 2021, Bergeri et al., 2021, Bobrovitz et al., 2021). For these seroprevalence studies to be effective when used for public health surveillance, they must have the attributes of effective surveillance systems, including timeliness, representativeness, and validity, among others (Table 1) (German et al., 2001, Groseclose and Buckeridge, 2017). However, many of these attributes are challenging to realize through one-off study efforts, particularly when findings are shared in research manuscripts published as peer-reviewed articles.
Table 1.
Attributes of effective public health surveillance systems used in this study and the measures used to examine them.
| Attribute^a | Definition | Measures used in this study^b |
|---|---|---|
| Timeliness | Refers to the time between any two steps in a surveillance system, for example, the time between the onset of a health event and the reporting of that event to a public health agency. Timeliness can be evaluated by the availability of information for the control of health-related events and is influenced by surveillance methods and data sources (Groseclose and Buckeridge, 2017). | How rapidly were results released after participants were sampled? Measured as the time elapsed between the last date of participant sample collection (sampling end date) and the date the study's results were first released, irrespective of publication platform. For studies released via multiple platforms (e.g., government report, preprint, and journal article), we used the first date the results were available. |
| Data quality (validity)^c | Refers to the validity of data in the surveillance system, which is influenced by the performance of screening tests (German et al., 2001), statistical methods, and surveillance methods (i.e., study type, geographic scope, etc.). Validity refers to the proportion of data entries that correctly reflect the true value of the data collected (Groseclose and Buckeridge, 2017). | Were valid methods used to identify seropositivity? Did the serology test meet the FDA standards for Emergency Use Authorization for COVID-19 serology tests, with sensitivity ≥ 90 % and specificity ≥ 95 %? Was there a correction for sampling error? If non-probability sampling was used, were statistical adjustments or reweighting of sample demographics performed? |
| Representativeness^c | Refers to the ability of the surveillance system to accurately describe the health event over time and its distribution in the population. Surveillance data should be described in terms of geography, demographics, and clinical manifestations (Groseclose and Buckeridge, 2017). | Was the sample frame appropriate for generalizing the study's findings to the population of interest? Was the sample generally representative of the target population? (For example, a sample of healthcare workers in predominantly administrative roles would not be representative of all healthcare workers working during the pandemic.) Was data analysis conducted with sufficient coverage of the identified sample? To evaluate coverage, we examined whether the demographics of the sample (age, sex, ethnicity) aligned with those expected in the target population. |

^a Other attributes of effective public health surveillance that are not evaluable from seroprevalence study reports, and not examined in detail here, include simplicity, flexibility, acceptability, stability, sensitivity, and positive predictive value (German et al., 2001, Groseclose and Buckeridge, 2017).
^b The response options for each measure were "Yes", "No", or "Unclear".
^c These items were derived from a modified version of the Joanna Briggs Institute Critical Appraisal Checklist for Prevalence Studies.
Peer review can delay the availability of data by months. The scientific publication process has been criticized for these delays (Cornelius, 2012, Graf, 2019, Huisman and Smits, 2017), which affect public health responses and can also hinder secondary analysis, modeling, and global comparisons. At the same time, journals are not designed for the routine reporting of surveillance data and may not consider updated results sufficiently novel for publication.
To expedite dissemination of results, some researchers have turned to more rapid and accessible platforms, such as news media (Rodriguez, 2020), government reports (The Stockholm Region, 2021), and preprints (Ward et al., 2021). However, the generalizability and validity of such non-peer-reviewed evidence have been questioned (Horbach, 2020, Ravinetto et al., 2021). It remains unclear whether these alternative platforms do indeed lead to faster reporting compared to scientific journals, and whether the ability to bypass peer review has resulted in prolific publication of weaker evidence.
We aimed to determine the timeliness of SARS-CoV-2 seroprevalence studies in providing information useful for public health surveillance and further analysis. To do so, we analyzed a global database of SARS-CoV-2 seroprevalence studies, aiming to:
(1) describe the timeliness of SARS-CoV-2 seroprevalence reporting by publication venue, study methods, and populations studied.
(2) identify whether more timely reporting compromises other facets of effective surveillance, by examining relationships between timeliness, data quality, and representativeness.
2. Methods
2.1. Study identification, data extraction, and quality assessment
We identified SARS-CoV-2 seroprevalence studies using a living systematic review registered with PROSPERO (CRD42020183634) (Arora et al., 2021). Data sources and search methods have been described previously (Bobrovitz et al., 2021). In brief, we searched electronic databases, grey literature, and news media for cohort and cross-sectional studies reporting seroprevalence estimates published between January 1, 2020 and December 31, 2021. We also invited submissions of seroprevalence studies through our dashboard at SeroTracker.com (Arora et al., 2021).
Inclusion criteria, screening, data extraction, and quality assessment of seroprevalence studies have also been described in detail previously (Bobrovitz et al., 2021). We included SARS-CoV-2 seroprevalence studies in humans. To be included, studies had to report a sample size, sampling end date, geographic location of sampling, and a seroprevalence estimate. Studies were not required to use an antibody test that met FDA Emergency Use Authorization (EUA) criteria; studies that did not use tests meeting these criteria were noted for further analysis. All records were screened independently and in duplicate. A risk of bias (RoB) assessment was performed by two independent reviewers using a modified nine-item Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Prevalence Studies; based on the checklist results, an overall RoB rating (low, moderate, high, unclear) was generated (Migliavaca et al., 2020a, Migliavaca et al., 2020b, Munn et al., 2015). The JBI checklist assesses the sampling bias and measurement bias of studies, both of which can contribute to higher RoB ratings (Bobrovitz et al., 2022). Sampling biases relate to how representative the seroprevalence estimate in the study sample is of seroprevalence in the target population, whereas measurement biases relate to measurement error (Bobrovitz et al., 2022). Studies were not excluded on the basis of their RoB rating, as we intended to investigate sampling and measurement bias as predictors of timeliness.
For all included studies, we identified the first date on which results were published after data collection ended, irrespective of publication venue. For each study, we categorized the publication venue as a peer-reviewed journal article, a preprint, an institutional report (a report from a government, organization, or institution presenting data in a formal but non-academic manuscript format), presentation or conference materials (abstracts, PDF presentations), or media (media releases and news reports). We categorized the sample frame as (i) household or community samples; (ii) blood donors or residual sera; (iii) healthcare workers; (iv) other special populations, including essential non-healthcare workers, non-essential workers, students and daycares, and non-COVID-19 patients and hospital visitors; and (v) studies that sampled multiple different populations.
2.2. Defining study timeliness, representativeness, and data quality
In this work, we focused on timeliness as a key determinant of the effectiveness of seroprevalence data for public health surveillance. We also examined the relationships between timeliness and two other attributes of effective public health surveillance: representativeness and data quality. Table 1 provides definitions of each attribute and the measures used to operationalize them in the present study. We used the overall RoB rating to assess whether timelier reporting was associated with a higher risk of bias.
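To make the timeliness measure from Table 1 concrete, the sketch below shows how it could be computed in R, the language used for our analyses. It is an illustration rather than our extraction code; the data frame and its column names (`sampling_end`, `preprint_date`, `journal_date`, `report_date`, `venue`) are hypothetical stand-ins.

```r
# Illustrative only: timeliness as days from sampling end to the first
# public release across venues. Dates for venues a study never used are NA;
# pmin(..., na.rm = TRUE) picks the earliest available release date.
library(dplyr)

studies <- tibble::tibble(
  study_id      = c("s1", "s2"),
  venue         = c("Preprint", "Institutional report"),  # venue of first release
  sampling_end  = as.Date(c("2020-05-01", "2020-07-15")),
  preprint_date = as.Date(c("2020-08-10", NA)),
  journal_date  = as.Date(c("2021-01-05", "2020-12-01")),
  report_date   = as.Date(c(NA, "2020-08-01"))
)

timeliness <- studies %>%
  mutate(
    first_release   = pmin(preprint_date, journal_date, report_date, na.rm = TRUE),
    days_to_release = as.numeric(first_release - sampling_end)
  )
```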
2.3. Analysis
We calculated the median publication timeliness of seroprevalence studies, with IQR, in the overall sample and stratified by publication venue (peer-reviewed journal articles, preprints, presentation or conference materials, institutional reports, and media). We compared median timeliness between preprints and institutional reports or media, and between peer-reviewed publications and preprints.
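As a minimal illustration of this summary, still using the hypothetical `timeliness` data frame from the sketch above, the median and IQR can be computed per venue as follows:

```r
# Median timeliness with IQR, stratified by publication venue
library(dplyr)

timeliness %>%
  group_by(venue) %>%
  summarise(
    n           = n(),
    median_days = median(days_to_release),
    iqr_lower   = quantile(days_to_release, 0.25),
    iqr_upper   = quantile(days_to_release, 0.75)
  )
```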
To assess the role of preprints in expediting the release of data, we compared the median time to publication for peer-reviewed journal articles that first appeared as preprints and peer-reviewed journal articles that were not preprinted.
We examined the relationship between timeliness and study characteristics (i.e., publication venue, geographic scope, sample frame, WHO region, and overall RoB), each measure of representativeness, and each measure of validity. To do so, we generated stratified Kaplan-Meier plots and conducted univariate Cox regressions, calculating overall model p values with the Wald test. To directly compare timeliness between publication venues, we conducted pairwise log-rank tests, using the Bonferroni correction to adjust for multiple comparisons. The reference groups chosen for Cox regression were as follows: geographic scope - “national”, WHO region - “Region of the Americas (AMRO)”, sample frame - “household and community samples”, overall RoB - “low”, individual items evaluating data representativeness and validity (i.e., was the sample representative of the target population) - “yes”.
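The sketch below outlines this survival-analysis workflow under the same assumed column names; it is a schematic of the approach described above, not our analysis script.

```r
# Illustrative survival-analysis workflow for timeliness. Every included
# study was eventually released, so every observation is an event
# (no censoring).
library(survival)
library(survminer)

timeliness$event <- 1

# Kaplan-Meier curves stratified by publication venue, with 95 % CIs
km_fit <- survfit(Surv(days_to_release, event) ~ venue, data = timeliness)
ggsurvplot(km_fit, data = timeliness, conf.int = TRUE, risk.table = TRUE)

# Univariate Cox regression; summary() reports the overall Wald test p value
cox_uni <- coxph(Surv(days_to_release, event) ~ venue, data = timeliness)
summary(cox_uni)

# Pairwise log-rank tests between venues with Bonferroni correction
pairwise_survdiff(Surv(days_to_release, event) ~ venue,
                  data = timeliness, p.adjust.method = "bonferroni")
```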
To examine which factors were independently associated with timeliness, we constructed a multivariate Cox model. The predictors in this model were all those that were significant on univariate Cox regression. Overall RoB was excluded from the multivariate model because it is partially determined by the measures of representativeness and data quality we employed. Analyses were performed in R (version 4.0.5), using the survminer library and its ggsurvplot function.
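A sketch of the multivariate model follows; the predictor column names are hypothetical stand-ins for the extracted study characteristics and quality items, not fields from our database.

```r
# Illustrative multivariate Cox model: predictors significant on univariate
# analysis, with overall RoB deliberately omitted (see text). Hazard ratios
# > 1 indicate faster release relative to each reference level.
cox_multi <- coxph(
  Surv(days_to_release, event) ~ venue + sample_frame + who_region +
    sample_representative + sample_coverage + fda_eua_test +
    sampling_or_adjustment,
  data = timeliness
)
summary(cox_multi)
```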
3. Results
Overall, 1844 studies were included in the analysis. The majority (59 %) of studies were first released as peer-reviewed journal articles, followed by preprints (24.2 %), institutional reports (7.81 %), news articles (6.24 %), and presentations or conference abstracts (2.66 %). Most studies (78 %) reported a single time point (cross-sectional studies) rather than repeated measures.
Across all publication venues, median time to publication was 154 days (IQR: 64–255). The shortest time to publication was 0 days for a media report, while the longest was 556 days for a peer-reviewed article.
Timeliness varied significantly across publication venues (Fig. 1). Media reports (median: 12 days; IQR: 3–25) were released significantly faster than institutional reports (median: 18 days; IQR: 2–45) (log-rank p = 0.02). Both media and institutional reports were published significantly faster than studies released in all other publication venues (log-rank p < 2e-16). Preprints (median: 101 days; IQR: 49–180) were released faster than presentation or conference materials (median: 187 days; IQR: 43–295) (log-rank p = 0.003), and both venues released results significantly faster than peer-reviewed journal articles (median: 212 days; IQR: 131–305) (log-rank p < 2e-16 and p = 0.049, respectively).
Fig. 1.
Kaplan-Meier curve and risk table for time-to-publication by publication venue. Pairwise comparisons indicate significant differences in timeliness between publication venues. Media and institutional reports were published significantly faster than all other publication venues (all log-rank p < 2e-16, with Bonferroni correction). Preprints were published significantly faster than journal articles (log-rank p < 2e-16) and presentation or conference materials (log-rank p = 0.003). Presentation and conference materials were also released faster than journal articles (log-rank p = 0.049). Timeliness curves are plotted with 95 % confidence intervals (shaded area).
There were 230 studies first published as preprints that later appeared as peer-reviewed journal articles. There was no significant difference in time to publication between these studies and studies released as peer-reviewed journal articles without prior preprinting (p = 0.3).
Examination of RoB by publication venue showed that presentation or conference materials and media reports had higher risks of bias than other publication venues. Only 5.1 % of presentation or conference materials and 9.6 % of media reports had a low or moderate risk of bias. Larger proportions of low or moderate RoB studies were reported in peer-reviewed journal articles (32 %), preprints (42 %), and institutional reports (51 %). Media reports had the highest number of studies with insufficient information to evaluate bias (40 % unclear RoB). Of the 1844 included studies, 866 (47 %) were confirmed to use tests that met the FDA EUA criteria for sensitivity and specificity, 852 (46 %) either did not report the sensitivity and specificity of the test used or these values were unknown, and 126 (7 %) reported test sensitivity and specificity values below FDA EUA criteria.
There were significant differences in timeliness for studies conducted in different WHO regions (overall p = 0.002). This result was driven by slower study publication in the European Region (EURO) (hazard ratio [HR] 0.85, 95 % confidence interval [0.77–0.95], p = 0.004) and the Eastern Mediterranean Region (EMRO) (HR 0.77 [0.62–0.95], p = 0.02) as compared to AMRO (Fig. 2A). We observed significant differences in timeliness by overall RoB (overall p = 8.0e-14), driven by significantly faster timeliness for studies with unclear RoB (i.e., insufficient information to evaluate) as compared to low RoB (HR 2.22 [1.65–2.99], p = 1.2e-07); there were no differences between studies at moderate (HR 1.13 [0.87–1.47], p = 0.34) or high (HR 1.12 [0.87–1.44], p = 0.38) RoB vs. low RoB (Fig. 2B). There were significant differences in timeliness by sample frame (overall p < 2e-16): studies of blood donors or residual sera (HR 0.76 [0.65–0.88], p = 3.7e-04), multiple populations (HR 0.69 [0.58–0.82], p = 1.7e-05), healthcare workers (HR 0.58 [0.51–0.66], p < 2e-16), and other special populations (HR 0.58 [0.51–0.66], p < 2e-16) took significantly longer to be released than studies of household/community samples (Fig. 2C). There were no differences in timeliness by geographic scope (overall p = 0.3) (Fig. 2D).
Fig. 2.
Kaplan-Meier curves for timeliness across study characteristics. Comparison of study timeliness according to (A) the WHO region in which the study was conducted (reference: AMRO; overall p = 0.002), (B) overall risk of bias (reference: low; overall p = 8e-14), (C) sample frame (reference: household and community samples; overall p < 2e-16), and (D) geographic scope (reference: national; overall p = 0.3). Timeliness curves are plotted with 95 % confidence intervals (shaded area).
Studies without sufficient information to evaluate whether the study sample was representative of the target population (HR 1.30 [1.11–1.53], p = 0.001) (Fig. 3A) and studies that did not report antibody test sensitivity and specificity ("unclear") (HR 1.24 [1.13–1.36], p = 8.5e-06) (Fig. 3C) had a higher probability of earlier publication compared to studies that reported these data and met the criteria for representativeness and high test sensitivity and specificity. Studies that were not conducted with sufficient coverage of the sample (HR 0.83 [0.71–0.98], p = 0.03) (Fig. 3B) or that did not use either appropriate sampling methods or a population adjustment (HR 0.77 [0.67–0.89], p = 3.6e-04) (Fig. 3D) were published more slowly than studies that met these criteria (Table 2).
Fig. 3.
Kaplan-Meier curves for timeliness across measures of study representativeness and data quality. Comparison of timeliness according to (A) whether or not the sample was representative of the general population (overall p = 9e-04), (B) sample coverage (overall p = 0.01), (C) sensitivity and specificity of the antibody test used (overall p = 5e-05) and (D) appropriateness of sampling method and statistical analysis (overall p = 4e-04). The reference group was “yes” for all analyses. Timeliness curves are plotted with 95 % confidence intervals (shaded area).
Table 2.
Cox regression table. Hazard ratios > 1 mean that the comparison group was published faster than the reference group. Univariate models were Cox proportional hazards models with a single predictor of timeliness. Variables that were significant predictors of timeliness in univariate analysis were included in the multivariate Cox regression. Overall risk of bias was not included, to avoid collinearity with the individual items evaluating data representativeness and quality. The overall multivariate p value was < 2e-16 according to the Wald test.
| Reference group | Comparison group | Univariate Cox regression hazard ratio [95 % CI] | Univariate Cox regression p value | Multivariate Cox regression hazard ratio [95 % CI] | Multivariate Cox regression p value |
|---|---|---|---|---|---|
| Publication venue – Journal article (peer-reviewed) | Media | 20.4 [16.4–25.3] | < 2e-16 *** | 17.0 [13.3–21.8] | < 2e-16 *** |
| | Institutional report | 12.2 [10.1–14.8] | < 2e-16 *** | 12.4 [10.1–15.1] | < 2e-16 *** |
| | Preprint | 2.3 [2.0–2.5] | < 2e-16 *** | 2.2 [2.0–2.5] | < 2e-16 *** |
| | Presentation or conference | 1.3 [1.0–1.8] | 0.05 | 1.3 [0.96–1.7] | 0.09 |
| Sample frame – Household and community samples | Blood donors or residual sera | 0.76 [0.65–0.88] | 3.7e-04 *** | 0.88 [0.74–1.05] | 0.15 |
| | Healthcare workers and caregivers | 0.58 [0.51–0.66] | < 2e-16 *** | 0.81 [0.70–0.94] | 0.004 ** |
| | Other special populations | 0.58 [0.51–0.66] | < 2e-16 *** | 0.74 [0.64–0.86] | 6e-05 *** |
| | Multiple populations | 0.69 [0.58–0.82] | 1.7e-05 *** | 0.90 [0.74–1.08] | 0.26 |
| WHO region – AMRO | AFRO | 1.13 [0.89–1.43] | 0.30 | 1.17 [0.92–1.49] | 0.19 |
| | EMRO | 0.77 [0.62–0.95] | 0.02 * | 0.84 [0.68–1.05] | 0.14 |
| | EURO | 0.85 [0.77–0.95] | 0.004 ** | 0.95 [0.85–1.06] | 0.33 |
| | SEARO | 1.09 [0.91–1.31] | 0.35 | 1.03 [0.86–1.25] | 0.73 |
| | WPRO | 0.87 [0.72–1.06] | 0.17 | 0.90 [0.74–1.10] | 0.30 |
| Was the sample representative of the target population? – Yes | No | 0.96 [0.87–1.06] | 0.43 | 1.10 [0.99–1.22] | 0.09 |
| | Unclear | 1.30 [1.11–1.53] | 0.001 ** | 1.07 [0.90–1.27] | 0.45 |
| Was there appropriate sample coverage? – Yes | No | 0.83 [0.71–0.98] | 0.03 * | 0.85 [0.72–1.00] | 0.06 |
| | Unclear | 1.04 [0.93–1.16] | 0.48 | 1.01 [0.90–1.14] | 0.82 |
| Did the antibody test used meet FDA EUA standards? – Yes | No | 1.09 [0.91–1.32] | 0.35 | 1.01 [0.83–1.22] | 0.92 |
| | Unclear | 1.24 [1.13–1.36] | 8.5e-06 *** | 1.08 [0.98–1.19] | 0.14 |
| Was appropriate sampling or a population adjustment performed? – Yes | No | 0.77 [0.67–0.89] | 3.6e-04 *** | 1.20 [1.03–1.41] | 0.02 * |
| Geographic scope – National | Regional | 0.88 [0.75–1.03] | 0.099 | Not included | Not included |
| | Local | 0.93 [0.82–1.05] | 0.23 | Not included | Not included |
| Overall risk of bias – Low | Moderate | 1.13 [0.87–1.47] | 0.34 | Not included | Not included |
| | High | 1.12 [0.87–1.44] | 0.38 | Not included | Not included |
| | Unclear | 2.22 [1.65–2.99] | 1.2e-07 *** | Not included | Not included |
Levels of significance: * p < 0.05; ** p < 0.01; *** p < 0.001.
Predictors included in the multivariate Cox regression were publication venue, sample frame, WHO region, and the individual measures of data validity and representativeness; overall risk of bias was excluded to avoid collinearity with the individual measures. Compared to peer-reviewed journal articles, preprints (HR 2.2 [2.0–2.5], p < 2e-16), media (HR 17.0 [13.3–21.8], p < 2e-16), and institutional reports (HR 12.4 [10.1–15.1], p < 2e-16) were associated with faster publication (release of data). Presentation or conference materials were not associated with more timely dissemination compared to journal articles (p = 0.09). Studies that sampled blood donors/residual sera (p = 0.2) or multiple populations in one study (p = 0.3) did not differ in timeliness from studies of household or community samples; however, studies sampling healthcare workers (HR 0.81 [0.70–0.94], p = 0.004) and other special populations (HR 0.74 [0.64–0.86], p = 6e-05) took significantly longer to publish than studies that sampled the general population. There were no significant associations between timeliness and the WHO region in which a study was conducted, compared to AMRO. Further, there were no significant associations between timeliness and study representativeness or data validity, with the exception of the item evaluating whether appropriate sampling or a population adjustment was performed: adjusting for all other factors, using non-probability sampling methods or omitting a population adjustment was associated with faster publication (HR 1.2 [1.03–1.41], p = 0.02).
4. Discussion
Our analysis shows that many SARS-CoV-2 seroprevalence studies have not reported their findings in a timely fashion: with a median of 154 days between sampling and reporting, there are considerable challenges in using these data for public health decision-making. Studies reported in preprints and peer-reviewed articles were much slower to be released than studies on other publication platforms, emphasizing delays introduced by the academic writing and publishing process that make seroprevalence studies less useful for public health decision-making and impactful secondary analysis. However, we also show that it is possible to quickly release robust seroprevalence results: government and institutional reports were more timely, and had better data validity and representativeness, than academic manuscripts. This suggests there are opportunities to improve the timely reporting of strong seroprevalence studies and thereby improve their value for public health surveillance.
The slow reporting of SARS-CoV-2 seroprevalence studies overall emphasizes limitations in their relevance for public health action. The landscape of infection and immunity can change drastically in the median 154 days from the end of sampling to the release of results, particularly in an era of rapidly spreading SARS-CoV-2 variants and mass vaccination (Grabowski et al., 2022). Notably, some results from these studies are made available to public health agencies directly before being released publicly, for example, many studies of blood donors and residual sera in Canada. While this improves the ability of the agency in question to act on the data, the closed sharing of results hinders interpretation and action by other stakeholders. First, it excludes public health agencies with whom the data have not been shared (e.g., federal authorities, for studies done at a state or province level), limiting the coordination between levels of government that is crucial in a pandemic setting (Fos et al., 2021). Second, it excludes academic research groups, whose secondary analyses and modeling have generated key information during the pandemic ("COVID-19 Response Team 2020–2021 report," 2021). Finally, it hampers global synthesis and comparison initiatives: where these have been carried out for seroprevalence, reporting delays have limited the synthesis that could be done (Bergeri et al., 2021).
We show that peer-reviewed manuscripts are released particularly slowly, with a median time-to-publication of about seven months. While many medical journals have expedited publication processes for COVID-19 research (Horbach, 2020), our study demonstrates continued delay in the publication of seroprevalence findings. This raises the question of whether peer-reviewed journals are fit-for-purpose for reporting surveillance and seroprevalence findings. This is particularly true given that some journals may see routine updates on seroprevalence as not sufficiently novel to be published, potentially introducing publication bias.
The median preprint was released close to four months faster than the median published article, but still nearly three months slower than the median government or institutional report. While preprints enable more rapid dissemination of information than journal publications, their median time to release was still over three months, suggesting that the process of preparing an academic manuscript in the first instance introduces substantial delays. This suggests that preprint platforms may themselves not be suitable for routine surveillance reporting.
In our analysis, studies of healthcare workers and other special populations took longer to publish than studies of the general population. This is in part because studies of the general population were typically regional or national studies conducted by government-affiliated groups and disseminated via institutional or government reports (The Stockholm Region, 2021), whereas studies of special populations were largely done by academic groups and released as research articles. However, these delays are problematic considering the importance of seroprevalence for informing best practices in high-risk settings, such as hospitals. There is a clear need for rapid release of the findings from these studies to enable their use for public health.
Interestingly, our analysis shows that it is possible for a seroprevalence study to be both timely and robust. Institutional reports, which were published rapidly, had a greater proportion of low RoB studies compared to preprints and published articles. Moreover, there were no differences in the timeliness of low, high, and moderate RoB studies on univariate analysis, suggesting that faster publication of valid study results is possible. However, studies with limited information to evaluate bias (unclear RoB) were published significantly faster. Many of these studies were published via news and media reports, which may account for the lack of data needed to evaluate bias. News and media outlets should endeavor to link to extended reports provided by investigators, even if not peer-reviewed or pre-printed on a formal platform, in order to maximize dissemination of crucial study context.
Collectively, these findings point back to a fundamental divide between how research studies are ordinarily conducted and how effective public health surveillance is performed. Surveillance systems generate information for action, which may not be novel; in contrast, research generates information for knowledge and is conventionally communicated via the peer-reviewed literature. The academic literature, whether peer-reviewed or preprint, may not be fit-for-purpose as a way of communicating surveillance information. However, because many public health agencies have limited resources and serosurveillance expertise, they have often had to rely on results from intermittent studies conducted by academics, which are often delayed and not systematic. This approach is problematic, and accelerating peer-review processes alone is unlikely to solve issues of timeliness.
Continuous serosurveillance, where governments perform routine serological sampling with rapid reporting, would address many of the challenges identified in the present study. As an example, the REACT 2 programme involves repeated serological testing within an established analytical framework, enabling streamlined sampling, analysis, reporting, and data sharing ("The REACT 2 programme," 2021; Ward et al., 2021). Continuous serosurveillance provides standardized, up-to-date data and avoids the need to extrapolate over major gaps between sampling periods. To achieve this, public health agencies would need to be resourced to perform ongoing and systematic serosurveillance, either independently or in partnership with academic groups. However, these resources are often lacking, and there are comparatively few examples of mature, well-funded, ongoing serosurveillance systems run by public health agencies (Shu and McCauley, 2017).
Recognizing that different public health systems and academic groups will conduct serosurveillance in different ways, we see a clear need for a centralized system for serosurveillance data.
We propose the development of a global open data repository for population-based seroprevalence studies. This platform would enable robust seroprevalence estimates to be rapidly deposited, allowing expedited dissemination of data for immediate use in public health surveillance, secondary analysis, and synthesis. The repository could serve as a modern analogue of the CDC Morbidity and Mortality Weekly Report (MMWR), which was initially created as a venue for reporting surveillance data to enable public health action (Shaw et al., 2011).
A platform for serosurveillance would include similar features, such as data submission and open-access download portals, as well as dashboards for data visualization and modelling. Submission of metadata and aggregate data, in addition to granular data such as antibody titres, would be encouraged, with the goal of building high-impact global models of immunity. Rapid and flexible approaches to peer review, such as crowdsourcing, could be built into the repository to validate submitted data. Such a repository would be consistent with evidence on the importance of structured (computable) results databases for supporting efficient and, most importantly, continuous surveillance.
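As a purely hypothetical illustration, a minimal structured submission record for such a repository might include the core elements required by our inclusion criteria and emphasized by ROSES-S reporting items; the field names below are ours and not part of any existing standard.

```r
# Hypothetical minimal submission record for the proposed repository
submission <- list(
  study_id         = "example-2021-001",
  country_iso3     = "CAN",
  sample_frame     = "Household and community samples",
  sampling_start   = as.Date("2021-03-01"),
  sampling_end     = as.Date("2021-03-28"),
  sample_size      = 4800,
  seroprevalence   = 0.19,         # adjusted point estimate
  ci_lower         = 0.17,
  ci_upper         = 0.21,
  test_sensitivity = 0.94,
  test_specificity = 0.99,
  deposited_on     = Sys.Date()    # deposit date supports timeliness tracking
)
```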
Databases like SeroTracker, which contains over 3000 seroprevalence studies, could help bootstrap the data. The proposed serosurveillance initiative should encourage adherence to standardized protocols such as the WHO UNITY study protocols, with core data elements based on the ROSES guidelines for reporting seroprevalence studies and on serosurvey evidence synthesis efforts like SeroTracker's (Arora et al., 2021; "Systematic Review Reporting Standards," 2017; Van Kerkhove et al., 2021; World Health Organization, 2021). Learning from past proposals for serological observatories, such as the suggested World Serum Bank (de Lusignan and Correa, 2017) and Global Serological Observatory (Mina et al., 2020), could also inform the design and utility of the platform we propose. Finally, efforts should be made to collaborate with global stakeholders: organizations and key knowledge users can help iterate on and build consensus around core data elements, and could collaborate on features that incentivize data submission and align global efforts and awareness.
This resource would align with the goals of repositories like the Global Initiative on Sharing Avian Influenza Data (GISAID) (Shu and McCauley, 2017), which has promoted rapid sharing of viral genetic sequences. Submission of sequences has allowed for timely data synthesis and response, notably for SARS-CoV-2 variant information, which proved critical in the detection of the Omicron variant (B.1.1.529). The platform would also resemble GISAID in leveraging data that are already being generated globally.
We are currently exploring the feasibility of developing such a platform.
Our analysis had several strengths. We aimed to identify all seroprevalence studies publicly reported in 2020 and 2021, providing complete coverage of seroprevalence studies in the first two years of the COVID-19 pandemic. Our inclusion of studies across all geographic regions, publication venues, populations studied, and study designs enables comprehensive analysis. Moreover, the multivariate Cox regression conducted here enables us to isolate the association between timeliness and the covariates of interest, including study characteristics, measures of data quality, and measures of representativeness.
Some limitations of our approach should also be kept in mind. First, we were not able to determine the duration of key steps in the reporting process for each article, for example, time for sample analysis, data analysis, report preparation, peer review, and copyediting and typesetting. Greater granularity here would help inform tailored suggestions to expedite reporting. Second, we included only the first public report of results for each seroprevalence study in our database. This avoids double-counting but does not capture subsequent publications of the same study in other venues; our finding that time to publication was similar for published articles regardless of whether they were first preprinted suggests this had a limited effect. Third, some news and conference articles did not report an end date for sampling and had to be excluded, and we had to approximate the sampling end date for studies reporting imprecise sampling dates (e.g., "between March and April"). Finally, studies that were reported directly to public health agencies and never published could not be identified or included in this analysis.
Overall, our findings indicate that COVID-19 seroprevalence studies have often released results slowly through venues more suitable for research studies, limiting their utility as surveillance tools. It is crucial to prioritize the principles of surveillance in designing seroprevalence investigations. Well-resourced public health surveillance or close government-academic partnerships could help support continuous serosurveillance systems. At the same time, repositories for rapid and open dissemination of seroprevalence results can improve data comparability and enable secondary analysis. More timely, standardized, and robust reporting of seroprevalence results will increase their usefulness for surveillance, enabling more effective public health responses.
Funding
SeroTracker receives funding for SARS-CoV-2 seroprevalence evidence synthesis from the Public Health Agency of Canada through Canada’s COVID-19 Immunity Task Force, the World Health Organization Health Emergencies Programme, the Canadian Medical Association Joule Innovation Fund, and the Robert Koch Institute.
CRediT authorship contribution statement
This is a secondary analysis of seroprevalence studies from the SeroTracker living systematic review database, maintained by the SeroTracker Consortium. CD and NI were co-first authors on this manuscript. Senior authors were DB, NB and RKA. Claire Donnici: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Writing (Original Draft, Review & Editing), Visualization. Natasha Ilincic: Conceptualization, Methodology, Validation, Investigation, Writing (Original Draft, Review & Editing). Christian Cao: Conceptualization, Methodology, Investigation, Resources, Data Curation, Writing (Review & Editing). Caseng Zhang: Validation, Investigation, Writing (Review & Editing). Gabriel Deveaux: Validation, Investigation, Data Curation, Writing (Review & Editing), Graphical Abstract. David Clifton: Methodology, Writing (Review & Editing). David Buckeridge: Methodology, Writing (Review & Editing). Niklas Bobrovitz: Conceptualization, Methodology, Investigation, Writing (Original Draft, Review & Editing), Supervision, Funding acquisition. Rahul K. Arora: Conceptualization, Methodology, Resources, Writing (Original Draft, Review & Editing), Supervision, Project administration, Funding acquisition.
Acknowledgements
We would like to acknowledge the SeroTracker Consortium, including the many research assistants who screened, reviewed and extracted seroprevalence study data as part of the SeroTracker living systematic review. We are especially grateful to Mercedes Yanes-Lane for her feedback during the conceptualization stage of this manuscript.
Declaration of Interest
Rahul K. Arora was previously a Technical Consultant for the Bill and Melinda Gates Foundation Strategic Investment Fund, is a minority shareholder of Alethea Medical, and was a former Senior Policy Advisor at Health Canada. David Buckeridge consults for the Public Health Agency of Canada and has participated with Medicago. David Clifton consults for Oxford University Innovation, Biobeats, Sensyne Health, and Bristol Myers Squibb. Each of these relationships is entirely unrelated to the present work. No other authors have conflicts of interest to report.
Data Availability
The authors of this research article are contributors to SeroTracker, an interactive dashboard and data platform for SARS-CoV-2 seroprevalence. SeroTracker is a free, open-access tool that allows researchers and policymakers to visualize and analyze seroprevalence data. The data used in this article were collected from the SeroTracker database, which can be found and downloaded online (https://serotracker.com/en/Data).
References
- Arora R.K., Joseph A., Van Wyk J., Rocco S., Atmaja A., May E., Yan T., Bobrovitz N., Chevrier J., Cheng M.P., Williamson T., Buckeridge D.L. SeroTracker: a global SARS-CoV-2 seroprevalence dashboard. Lancet Infect. Dis. 2021;21:e75–e76. doi: 10.1016/S1473-3099(20)30631-9.
- Bergeri I., Whelan M., Ware H., Subissi L., Nardone A., Lewis H.C., Li Z., Ma X., Valenciano M., Cheng B., Al Ariqi L., Rashidian A., Okeibunor J., Azim T., Wijesinghe P., Le L.V., Vaughan A., Pebody R., Vicari A., Yan T., Yanes-Lane M., Cao C., Cheng M.P., Papenburg J., Buckeridge D., Bobrovitz N., Arora R.K., van Kerkhove M.D. (Unity Studies Collaborator Group). Global epidemiology of SARS-CoV-2 infection: a systematic review and meta-analysis of standardized population-based seroprevalence studies, Jan 2020–Oct 2021 (preprint). 2021. doi: 10.1101/2021.12.14.21267791.
- Bobrovitz N., Arora R.K., Cao C., Boucher E., Liu M., Rahim H., Donnici C., Ilincic N., Duarte N., Van Wyk J., Yan T., Penny L., Segal M., Chen J., Whelan M., Atmaja A., Rocco S., Joseph A., Clifton D.A., Williamson T., Yansouni C.P., Evans T.G., Chevrier J., Papenburg J., Cheng M.P. Global seroprevalence of SARS-CoV-2 antibodies: a systematic review and meta-analysis. PLoS ONE. 2021;16. doi: 10.1101/2020.11.17.20233460.
- Bobrovitz N., Noël K., Li Z., Cao C., Deveaux G., Selemon A., Clifton D.A., Yanes-Lane M., Yan T., Arora R.K. SeroTracker-RoB: an approach to automating reproducible risk of bias assessment of seroprevalence studies. 2022. doi: 10.1101/2021.11.17.21266471.
- Cornelius J.L. Reviewing the review process: identifying sources of delay. Australas. Med. J. 2012;5:26–29. doi: 10.4066/AMJ.2012.1165.
- COVID-19 Response Team 2020–2021 report. Imperial College London. 2021. URL: http://www.imperial.ac.uk/medicine/departments/school-public-health/infectious-disease-epidemiology/mrc-global-infectious-disease-analysis/covid-19/covid-19-response-team-2020–2021-report/ (accessed 1.31.22).
- de Lusignan S., Correa A. Opportunities and challenges of a World Serum Bank. Lancet. 2017;389:250–251. doi: 10.1016/S0140-6736(17)30046-6.
- Fos P.J., Honoré P.A., Honoré R.L. Coordination of public health response: the role of leadership in responding to public health emergencies. In: Science-Based Approaches to Respond to COVID and Other Public Health Threats. IntechOpen. 2021. doi: 10.5772/intechopen.96304.
- German R.R., Lee L.M., Horan J.M., Milstein R.L., Pertowski C.A., Waller M.N. (Guidelines Working Group, Centers for Disease Control and Prevention). Updated guidelines for evaluating public health surveillance systems: recommendations from the Guidelines Working Group. MMWR Recomm. Rep. 2001;50:1–35.
- Grabowski F., Kochańczyk M., Lipniacki T. The spread of SARS-CoV-2 variant Omicron with the doubling time of 2.0–3.3 days can be explained by immune evasion. 2022. doi: 10.1101/2021.12.08.21267494.
- Graf C. Timeliness: an essential area for better peer review. 2019. URL: https://www.wiley.com/network/latest-content/timelines-an-essential-area-for-better-peer-review (accessed 1.14.22).
- Groseclose S.L., Buckeridge D.L. Public health surveillance systems: recent advances in their use and evaluation. Annu. Rev. Public Health. 2017;38:57–79. doi: 10.1146/annurev-publhealth-031816-044348.
- Horbach S.P.J.M. Pandemic publishing: medical journals strongly speed up their publication process for COVID-19. Quant. Sci. Stud. 2020;1:1056–1067.
- Huisman J., Smits J. Duration and quality of the peer review process: the author's perspective. Scientometrics. 2017;113:633–650. doi: 10.1007/s11192-017-2310-5.
- Migliavaca C.B., Stein C., Colpani V., Barker T.H., Munn Z., Falavigna M. How are systematic reviews of prevalence conducted? A methodological study. BMC Med. Res. Methodol. 2020;20:96. doi: 10.1186/s12874-020-00975-3.
- Migliavaca C.B., Stein C., Colpani V., Munn Z., Falavigna M. Quality assessment of prevalence studies: a systematic review. J. Clin. Epidemiol. 2020;127:59–68. doi: 10.1016/j.jclinepi.2020.06.039.
- Mina M.J., Metcalf C.J.E., McDermott A.B., Douek D.C., Farrar J., Grenfell B.T. A global immunological observatory to meet a time of pandemics. eLife. 2020;9. doi: 10.7554/eLife.58989.
- Munn Z., Moola S., Lisy K., Riitano D., Tufanaru C. Methodological guidance for systematic reviews of observational epidemiological studies reporting prevalence and cumulative incidence data. Int. J. Evid. Based Healthc. 2015;13:147–153. doi: 10.1097/XEB.0000000000000054.
- Ravinetto R., Caillet C., Zaman M.H., Singh J.A., Guerin P.J., Ahmad A., Durán C.E., Jesani A., Palmero A., Merson L., Horby P.W., Bottieau E., Hoffmann T., Newton P.N. Preprints in times of COVID19: the time is ripe for agreeing on terminology and good practices. BMC Med. Ethics. 2021;22:106. doi: 10.1186/s12910-021-00667-7.
- Rodriguez A. MGH coronavirus case study in Chelsea will help researchers understand immunity. CBS Boston. 2020.
- Shaw F.E., Goodman R.A., Lindegren M.L., Ward J.W., Centers for Disease Control and Prevention (CDC). A history of MMWR. MMWR Suppl. 2011;60(4):7–14.
- Shu Y., McCauley J. GISAID: Global initiative on sharing all influenza data – from vision to reality. Eurosurveillance. 2017;22. doi: 10.2807/1560-7917.ES.2017.22.13.30494.
- Systematic Review Reporting Standards. ROSES. 2017. URL: https://www.roses-reporting.com (accessed 1.26.22).
- The REACT 2 programme. Imperial College London. 2021. URL: http://www.imperial.ac.uk/medicine/research-and-impact/groups/react-study/the-react-2-programme/ (accessed 1.14.22).
- The Stockholm Region. 27 april: Lägesrapport om covid-19 [27 April: situation report on COVID-19]. The Stockholm Region. 2021.
- Van Kerkhove M., Grant R., Subissi L., Valenciano M., Glonti K., Bergeri I., Brangel P., Le Polain O., Lewis H., Nardone A., Pebody R., Azim T., Wijesinghe P., Al Ariqi L., Le L.-V., Okeibunor J., Vicari A., Ben Hamida A., Udhayakumar V., Gallagher K., Richard V., Arora R., Bobrovitz N., Zambon M., Drosten C., Koopmans M., Peiris M. (World Health Organization Seroepidemiology Technical Working Group). ROSES-S: statement from the World Health Organization on the reporting of seroepidemiologic studies for SARS-CoV-2. Influenza Other Respir. Viruses. 2021 (in press). doi: 10.1111/irv.12870.
- Ward H., Whitaker M., Tang S.N., Atchison C., Darzi A., Donnelly C.A., Diggle P.J., Ashby D., Riley S., Barclay W.S., Elliott P., Cooke G. Vaccine uptake and SARS-CoV-2 antibody prevalence among 207,337 adults during May 2021 in England: REACT-2 study. 2021. doi: 10.1101/2021.07.14.21260497.
- World Health Organization. Unity Studies: Early Investigation Protocols. 2021. URL: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance/early-investigations (accessed 11.22.21).