Abstract
Data System. The Spanish National Seroepidemiological Survey of SARS-CoV-2 (or ENE-COVID; SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] is the causative agent of COVID-19) was funded by the Spanish Ministry of Health, the Instituto de Salud Carlos III, and the Spanish National Health System.
Data Collection/Processing. A stratified 2-stage probability sampling was used to select a representative cohort of the noninstitutionalized population of Spain. ENE-COVID collected longitudinal data from epidemiological questionnaires and 2 SARS-CoV-2 IgG antibody tests. From April 27 to June 22, 2020, 68 287 participants (77.0% of contacted persons) received a point-of-care test and 61 095 (68.9%) also underwent a laboratory immunoassay. A second follow-up phase was conducted between November 16 and 30, 2020.
Data Analysis/Dissemination. Analyses use weights to adjust for oversampling and nonresponse and account for design effects of stratification and clustering. ENE-COVID data for research purposes will be available upon request from the official study Web page.
Public Health Implications. ENE-COVID, a nationwide population-based study, allowed monitoring seroprevalence of antibodies against SARS-CoV-2 at the national and regional levels, providing accurate figures by gender, age (from babies to nonagenarians), and selected risk factors; characterizing symptomatic and asymptomatic infections; and estimating the infection fatality risk during the first pandemic wave. (Am J Public Health. 2023;113(5):525–532. https://doi.org/10.2105/AJPH.2022.307167)
The Spanish National Seroepidemiological Survey of SARS-CoV-2 (Estudio Nacional de Sero-Epidemiología de la Infección por SARS-CoV-2, or ENE-COVID; SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] is the causative agent of COVID-19) is a nationwide population-based cohort study to quantify seropositivity for SARS-CoV-2 in the noninstitutionalized population of Spain.
DATA SYSTEM
ENE-COVID was conducted by the Spanish Ministry of Health and the Instituto de Salud Carlos III (ISCIII), in collaboration with the health services of all the Spanish regions (Autonomous Communities).
Purpose
The main aims of this longitudinal seroprevalence study were (1) to quantify the extent of SARS-CoV-2 circulation throughout the country during the first epidemic wave, (2) to monitor its evolution over time, and (3) to evaluate which factors were associated with greater risk of being infected by this virus. Because of the different patterns of evolution of the pandemic among Spanish regions, ENE-COVID’s design was intended to provide accurate and representative estimates of the prevalence of infection at the province level.
Public Health Significance
ENE-COVID was carried out during the first and very severe epidemic wave of COVID-19 in a Spain under lockdown, at a time when diagnostic tests were scarce and the surveillance system was not yet able to give reliable data on the burden of SARS-CoV-2 infection. ENE-COVID tried to answer the 2 main questions associated with a pandemic: how many people were infected, and how many of them died?1 Its results showed that, despite the high impact of COVID-19, the prevalence figures of infection were low, very far from what would be needed to support control measures relying on herd immunity. Also, having information on the intensity of the pandemic in each region contributed to delineating different schedules for recovering activity across the country. In addition, ENE-COVID provided reliable estimates of 2 basic public health indicators needed to evaluate and control the COVID-19 pandemic; namely, the proportion of asymptomatic infections and age-specific infection fatality risks.
DATA COLLECTION AND PROCESSING
ENE-COVID collected data from a sample representative of the overall noninstitutionalized population in Spain, through personal interviews and using 2 immunoassays (as explained in this section). To gather information about the evolution of the epidemic, 3 consecutive data collection rounds were carried out during a first phase of the study (April 27–June 22, 2020); a second phase, consisting of a single round, took place in November 2020. Primary care staff from each of the regional health services carried out fieldwork (recruitment, epidemiological questionnaires, and antibody testing) under a common protocol developed by the National Center for Epidemiology at ISCIII.
The training of all personnel was coordinated via a Web platform established at the National School of Public Health at ISCIII. The National Statistics Institute (INE) provided the name, age, and phone numbers of all residents registered in the selected households. Regional call centers and health care centers tried to contact each household by phone on different days and at different times.
The first person answering the call was informed about the general purpose of the study. If they agreed to participate, an initial phone questionnaire collected and updated the information on the current residents in the house and on the characteristics of the household. All people living in each household were invited either to go to their primary health care center or to allow a home visit, during which they provided written informed consent. At that visit, primary care staff collected epidemiological data through computer-assisted personal interviews; these included a history of symptoms compatible with COVID-19 (fever, chills, severe tiredness, sore throat, cough, shortness of breath, headache, anosmia or ageusia, and nausea, vomiting, or diarrhea), contact with confirmed or suspected cases, and other risk factors. Participants also had a point-of-care rapid test to detect antibodies against SARS-CoV-2 and, optionally, donated a blood sample to assess the presence of IgG antibodies in serum with a more precise technique.
The Ministry of Health designed a secure Web application specifically for this study to save both questionnaire and point-of-care test results. Blood samples were centrifuged to obtain the sera, labeled, stored refrigerated at the primary health care centers, and sent to the laboratory every 2 or 3 days. Serum samples were analyzed either at the National Center for Microbiology (CNM-ISCIII) or in 1 of 28 selected regional microbiology laboratories under CNM-ISCIII coordination.
The study used 2 immunoassays: (1) a point-of-care test applied to fingerprick blood to detect IgG antibodies against the receptor-binding domain of SARS-CoV-2 spike protein (Orient Gene Biotech COVID-19 IgG/IgM Rapid Test Cassette, Zhejiang, China; reference GCCOV-402a), with a sensitivity of 82% to 93% and a specificity of 99% to 100% in preliminary validation studies at CNM-ISCIII and elsewhere2; and (2) a chemiluminescent microparticle immunoassay (CMIA) requiring venipuncture to detect IgG antibodies against SARS-CoV-2 nucleoprotein (SARS-CoV-2 IgG for use with ARCHITECT, Abbott Laboratories, Abbott Park, IL; reference 06R8620), with a sensitivity of 91% and a specificity of 99% in a meta-analysis of 23 diagnostic accuracy studies.3
Ethical Procedures
To allow the design and recruitment of a nationwide representative sample, a specific collaboration agreement was established between the Ministry of Health, the Instituto de Salud Carlos III, and the INE that regulated the access and use of personal data from the Municipal Register of Inhabitants. The ISCIII Committee for Ethical Research approved the study. The Spanish Agency for Personal Data Protection was consulted. All study participants provided written informed consent, with specific forms for adults, teenagers, parents of participating children, and guardians of mentally disabled participants. These documents were available in the 4 official languages of Spain. In addition, witnesses assisted participants who were not able to read any of them.
Population and Geographic Coverage
Target population
The target population was the entire noninstitutionalized household population of Spain according to the Spanish Municipal Register of Inhabitants, updated in January 2020. Residents registered in municipal rolls but without a health card were also included (public health care coverage is universal in Spain, but immigrants without a residence permit are not registered). The study, however, excluded nearly 1.0% of the Spanish population residing in institutional settings, mainly care homes for elderly and disabled persons, health institutions, prisons, and military and religious institutions.
Sampling design
We used a stratified 2-stage sampling to select baseline participants in the ENE-COVID survey. Given the heterogeneous circulation of SARS-CoV-2 by Spanish region, the first level of stratification comprised the 50 Spanish provinces (47 mainland and 3 insular provinces in the Balearic and Canary Islands) and the 2 autonomous cities of Ceuta and Melilla. In addition, because SARS-CoV-2 transmission may be affected by population density, the second level of stratification corresponded to the municipality size, grouped into municipalities with fewer than 5000, 5000 to 20 000, 20 000 to 100 000, and 100 000 or more inhabitants. There were 185 nonempty population strata formed by cross-classifying provinces and municipality size groups. Within each stratum, we selected census tracts as first-stage sampling units and households within census tracts as second-stage sampling units. All residents in the household were invited to participate in the study.
Sample size determination and allocation to strata
We determined a total sample size of 90 000 people to obtain reliable estimates of SARS-CoV-2 seroprevalence in all Spanish provinces, accounting for design effect and potential nonresponse. To achieve a minimum sample size by province while preserving to some extent the population distribution, we assigned half of the total sample uniformly to the 50 provinces and 2 autonomous cities, and the other half proportionally to their population sizes. The sample allocated to each province was distributed proportionally to its population in the 4 municipality size groups. As a result of this sample allocation, persons from less populated provinces were oversampled to increase the precision of seroprevalence estimates in these regions. Thus, we calculated design weights as the inverse of the sampling fractions within each province to restore the actual population proportions when computing multiprovince estimates, either at the national level or by Autonomous Communities (first-level administrative division integrating 1 or several provinces).
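The allocation rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual procedure: the three provinces and their populations are hypothetical, the function names `allocate_sample` and `design_weight` are illustrative, and the sketch ignores rounding to whole households and the further split across municipality size groups.

```python
def allocate_sample(populations, total):
    """Assign half of `total` uniformly and half proportionally to population."""
    uniform_share = (total / 2) / len(populations)
    population_sum = sum(populations.values())
    return {
        unit: uniform_share + (total / 2) * pop / population_sum
        for unit, pop in populations.items()
    }


def design_weight(population, sample_size):
    """Inverse of the sampling fraction: persons represented by each sampled person."""
    return population / sample_size


# Hypothetical populations for three provinces (not real figures).
populations = {"large": 6_000_000, "medium": 1_000_000, "small": 100_000}
allocation = allocate_sample(populations, total=9_000)
weights = {u: design_weight(populations[u], allocation[u]) for u in populations}

for unit in populations:
    print(f"{unit}: n = {allocation[unit]:.0f}, weight = {weights[unit]:.1f}")
```

In this toy example the least populated province is sampled at the highest rate and therefore receives the smallest design weight, which is how the weights restore population proportions in multiprovince estimates.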
Within-stratum sampling
To facilitate fieldwork and reduce sample dispersion, we selected participants within each stratum in 2 sequential stages. First, 1500 census tracts were randomly selected with probability proportional to their size; their geographical distribution is displayed in Figure 1. Afterward, we sampled 24 households within each selected tract by simple random sampling without replacement. All persons residing in the household were invited to participate in the survey (average household size = 2.50 residents),4 yielding the target sample of 90 000 persons. All persons in any given population stratum had the same probability of being selected. Design effects, induced by the clustering of seropositivity among residents of the same household and among households in the same census tract, inflate the variance of SARS-CoV-2 seroprevalence estimates and were accounted for in the statistical analyses. Further details of the ENE-COVID survey design are described elsewhere.5,6
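The 2-stage selection can be illustrated with the following Python sketch. The text does not state which algorithm was used for the probability-proportional-to-size draw, so systematic selection is an assumption here, as are the function names, the hypothetical stratum of 200 tracts, and the use of household counts as the measure of tract size. Under that last assumption, a tract's selection probability is proportional to its number of households while each household's conditional probability is inversely proportional to it, so the product is constant within the stratum, consistent with the equal-probability statement above.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def systematic_pps(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    Units larger than the sampling step could be picked more than once.
    """
    cumulative = list(accumulate(sizes))
    step = cumulative[-1] / n
    start = rng.random() * step  # random start in [0, step)
    picks = []
    for k in range(n):
        point = start + k * step
        # Index of the unit whose cumulative-size interval contains `point`.
        picks.append(min(bisect_right(cumulative, point), len(sizes) - 1))
    return picks


def sample_households(n_households, per_tract, rng):
    """Simple random sample of household indices without replacement."""
    return rng.sample(range(n_households), per_tract)


rng = random.Random(2020)
# Hypothetical stratum: 200 tracts with 300 to 700 households each.
tract_households = [rng.randint(300, 700) for _ in range(200)]
tracts = systematic_pps(tract_households, n=10, rng=rng)
sample = {t: sample_households(tract_households[t], 24, rng) for t in tracts}

# Target sample size implied by the figures in the text.
expected_persons = 1500 * 24 * 2.50
print(expected_persons)
```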
Unit of Data Collection and Sample Size
Analysis unit
The unit of data collection and analysis was the participant. In the ENE-COVID survey, participants who completed the epidemiological questionnaire and received the point-of-care test were included in the point-of-care sample, and those who also donated a blood sample and received the CMIA constituted the CMIA sample.
Sample size
Of 98 886 eligible persons residing in 35 885 selected households, 10 238 could not be contacted and 14 926 declined to participate (Figure 2). The point-of-care sample included 68 287 participants who received the point-of-care test in at least 1 of the 3 rounds during the first survey phase conducted between April 27 and June 22, 2020 (69.1% of eligible persons and 77.0% of contacted persons). Similarly, the CMIA sample comprised 61 095 participants who received this test in at least 1 round of the first survey phase (61.8% of eligible persons and 68.9% of contacted persons).
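The participation percentages above follow directly from the reported counts. The short Python check below uses only figures from the text; the variable names and the helper `pct` are illustrative.

```python
eligible = 98_886
not_contacted = 10_238
contacted = eligible - not_contacted  # persons reached by the study teams

point_of_care = 68_287
cmia = 61_095


def pct(numerator, denominator):
    """Percentage rounded to 1 decimal place."""
    return round(100 * numerator / denominator, 1)


rates = {
    "point-of-care / eligible": pct(point_of_care, eligible),
    "point-of-care / contacted": pct(point_of_care, contacted),
    "CMIA / eligible": pct(cmia, eligible),
    "CMIA / contacted": pct(cmia, contacted),
}
for label, value in rates.items():
    print(f"{label}: {value}%")
```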
Response rates
Based on the sociodemographic characteristics of all eligible persons, response rates to the point-of-care test were lower among persons aged 25 to 29 years (59.3%) and older than 80 years (54.5%), among middle-aged men compared with middle-aged women (66.6% vs 73.3%), and among the lowest census tract income quartile (66.1%). Response rates for the CMIA evolved similarly to those for the point-of-care test by sociodemographic characteristics, except for a sharp decrease among those aged younger than 15 years (Figure 3). We used nonresponse weights to adjust for the different responses to the point-of-care test and the CMIA by gender, age group, and census tract relative income category.
Frequency of Data Collection
The first phase of ENE-COVID started 1 month after the peak of the first COVID-19 pandemic wave in Spain and included 3 successive follow-up rounds of data collection and serological testing between April 27 and June 22, 2020 (Figure 4). Each round was completed in 2 weeks, with a 1-week break between rounds. Half of the cohort was randomly assigned for data collection to the first week of each round and the other half to the second week, so that point-of-care testing and serum specimens were collected in all participants 2 to 4 weeks apart.
We conducted a second survey phase after the second wave of the COVID-19 pandemic, including a fourth round of data collection between November 16 and 30, 2020. All selected persons in the cohort were invited to receive the point-of-care test, whereas, in this round, the CMIA was only offered to a random subcohort of 200 census tracts as well as to participants with a positive result in previous rounds. More details are available in Spanish on the ENE-COVID Web site.6
Key Data Elements and Data Quality and Editing
The widespread geographical coverage and large sample size of ENE-COVID, together with the health situation of the country during the design, planning, and fieldwork, posed substantial challenges to the feasibility, uniformity, and quality of the study. However, the study had strong political support from health authorities at both the national and regional levels. The leadership of the coordination team, the common protocol and Web-based training program, and the work of the very proactive and collaborative regional primary health care teams, in permanent contact with the central node to monitor fieldwork and information systems, favored the homogeneity and quality of the collected information. The epidemiological questionnaire included information on sociodemographic factors, COVID-19 symptoms, history of contacts, presence of chronic diseases, and other risk factors. Primary care staff collected this information before performing any serological exam, to avoid any influence of the test result on the participants’ answers. Another key element is the good performance of the 2 tests used in ENE-COVID to measure SARS-CoV-2 IgG seroprevalence, selected by professionals from the CNM-ISCIII after performing specific validation assays of several candidate tests. Also, the use of a point-of-care test facilitated participation, allowing us to report the result to study participants immediately, whereas the CMIA allowed a more precise estimation, with a low risk of false positive results.
The continuous process of cleaning and curating the data registered during the computer-assisted personal interview started at the same time as the fieldwork. The electronic platform forced the study personnel to fill in all the questions included, thereby generating no missing data. The first phase of ENE-COVID included 3 contact occasions with the participants within a 2-month period (April 27–June 22, 2020), during which lockdown and restriction measures changed very rapidly. Therefore, the epidemiological questionnaire in each round added specific questions to explore risk of exposure and re-collected data on variables that had shown quality problems. The information was checked, data between rounds were compared, and detected errors were corrected. Participants who did not take part in a given round have missing information for that round's data.
DATA ANALYSIS AND DISSEMINATION
In the sections that follow, we review elements of statistical analysis, interpretation issues, linkage ability, data release and accessibility, and provide key references.
Statistical Analysis
Sampling weights
Statistical analyses assigned sampling weights to survey participants to account for the different selection probabilities by province and the diverse response rates to the point-of-care test and the CMIA by sociodemographic characteristics. Design weights were first calculated as the inverse of the selection probabilities within each province and municipality size stratum, which were then adjusted for nonresponse by poststratifying the sample by gender, 5 age groups, and 2 income categories, so that the weighted sum of participants in each stratum matched their population totals. Different sampling weights were calculated for the point-of-care and CMIA samples, and upper extreme weights (0.2% and 0.5%, respectively) were trimmed to prevent highly influential observations.
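The nonresponse adjustment and trimming described above can be sketched as follows. This is a simplified illustration, not the study's code: the cell labels, weights, and population totals are hypothetical, the function names are illustrative, and capping the largest weights at the highest untrimmed value is only one common way to trim (the text does not specify the rule used). The 20% trimming fraction is chosen so the effect is visible in a 5-record example; the study trimmed 0.2% and 0.5% of weights.

```python
from collections import defaultdict


def poststratify(records, population_totals):
    """Rescale design weights so each cell's weighted count matches its population total."""
    weighted = defaultdict(float)
    for r in records:
        weighted[r["cell"]] += r["design_weight"]
    return [
        r["design_weight"] * population_totals[r["cell"]] / weighted[r["cell"]]
        for r in records
    ]


def trim_upper(weights, fraction):
    """Cap the largest `fraction` of weights at the highest untrimmed weight."""
    ordered = sorted(weights)
    k = min(int(len(ordered) * fraction), len(ordered) - 1)
    if k <= 0:
        return list(weights)
    cutoff = ordered[-k - 1]
    return [min(w, cutoff) for w in weights]


# Hypothetical respondents in two poststratification cells.
records = [
    {"cell": "cell_a", "design_weight": 10.0},
    {"cell": "cell_a", "design_weight": 20.0},
    {"cell": "cell_a", "design_weight": 30.0},
    {"cell": "cell_b", "design_weight": 5.0},
    {"cell": "cell_b", "design_weight": 5.0},
]
population_totals = {"cell_a": 120.0, "cell_b": 50.0}

adjusted = poststratify(records, population_totals)
trimmed = trim_upper(adjusted, fraction=0.2)
print(adjusted)
print(trimmed)
```

After trimming, weighted totals no longer match the population totals exactly; trading that small bias for lower variance is the usual motivation for trimming.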
Design effects
Because of the complex survey design, the variance of SARS-CoV-2 seroprevalence estimates should consider stratification by province and municipality size and the clustering of seropositivity by household and census tract. The intratract correlation coefficient was 0.05 to 0.06, which inflated the variance of overall seroprevalence estimates by a factor of 3.93 for the point-of-care test and 3.90 for the CMIA. Finite population corrections could also be applied since the average sampling fractions of census tracts per stratum (6.6%) and households per census tract (4.9%) were not negligible.
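To give a sense of what these variance inflation factors mean in practice, the sketch below converts them into effective sample sizes and a standard-error multiplier. It also applies the textbook approximation deff ≈ 1 + (m − 1)ρ, where m is the mean number of participants per cluster and ρ the intracluster correlation. That approximation and the assumed mean of 68 287 / 1500 participants per tract are not taken from the study; the reported factors come from the full design, which also reflects stratification, household clustering, and unequal weights, so the approximation should only be expected to land in the same range.

```python
import math


def kish_design_effect(mean_cluster_size, icc):
    """Approximate design effect from clustering: 1 + (m - 1) * rho."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n, deff):
    """Size of a simple random sample with the same variance."""
    return n / deff


# Figures reported in the text.
n_poc, deff_poc = 68_287, 3.93
n_cmia, deff_cmia = 61_095, 3.90

ess_poc = effective_sample_size(n_poc, deff_poc)
ess_cmia = effective_sample_size(n_cmia, deff_cmia)
se_inflation_poc = math.sqrt(deff_poc)  # factor applied to standard errors

# Rough check (assumption: about 68 287 / 1500 participants per tract).
approx_deff = kish_design_effect(n_poc / 1500, icc=0.06)

print(f"effective n: {ess_poc:.0f} (point-of-care), {ess_cmia:.0f} (CMIA)")
print(f"standard errors inflated by {se_inflation_poc:.2f}")
print(f"approximate clustering-only design effect: {approx_deff:.2f}")
```

A design effect near 4 roughly doubles standard errors relative to a simple random sample of the same size.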
Interpretation Issues
ENE-COVID had high participation rates, and the information available for nonparticipants allowed us to consider differences by age, gender, and census tract socioeconomic level to provide a reliable picture of the spread of the pandemic among the noninstitutionalized population in Spain. However, ENE-COVID left out institutionalized people, and the pandemic was particularly dramatic in certain nursing homes. No official register of this population was available at that time, and the study of these closed environments would have required a different approach.
As mentioned earlier under Frequency of Data Collection, a fourth round of ENE-COVID was conducted in the fall of 2020, after the second pandemic wave, with a slightly different design. The characteristics and results of this fourth round are described in our companion article in this issue of AJPH (p. 533) and can be consulted at the study Web page.6
Linkage Ability
ENE-COVID holds individual-level data, and its design is based on households in selected census tracts, which have around 2000 to 3000 residents. This could allow researchers to perform spatial analyses based on household location, as well as to combine the data from this survey with contextual information from the Spanish census. We have already made use of this ability to classify households according to the average personal income of the corresponding census tract. Also, the collaboration of the regional health services offers the future possibility of adding information from the clinical records of consenting participants.
Data Release and Accessibility
Very detailed national and regional reports, as well as interactive maps with the results for each phase of ENE-COVID, are available on the official study Web page, maintained by ISCIII (https://portalcne.isciii.es/enecovid19). This portal will also provide information on the official procedure of accessing ENE-COVID basic data for research purposes, which is being defined. Requests, which must include a short scientific protocol, will be assessed by the interinstitutional Collaborative Research Scientific Committee of ENE-COVID, as well as by the ISCIII Committee for Ethical Research, if necessary, to allow data use without compromising due confidentiality.
Key References
PUBLIC HEALTH IMPLICATIONS
The first round of ENE-COVID served to characterize the geographical distribution of the first COVID-19 pandemic wave in Spain. It showed an important heterogeneity, with provinces with seroprevalences close to or greater than 10% mainly in the center of the country, contrasting with coastal regions and the islands, where prevalences were lower than 3%. Even in areas highly affected by the new pandemic, seroprevalence estimates were low, showing the difficulty of achieving herd immunity in the short term.
Contrary to what could be concluded from the information provided by the National Surveillance system,7 seroprevalence was similar in men and women,8 with no great differences observed by age group.5,8 ENE-COVID was the first population-based study providing seroprevalence data from babies to people aged older than 90 years.
It was also the first population-based study estimating the proportion of asymptomatic infections,5 and their distribution according to sociodemographic and epidemiological characteristics.9
The information provided by ENE-COVID and the number of deaths registered in our surveillance systems allowed us to estimate the infection fatality risk for SARS-CoV-2 in Spain.3 The combination of serological and epidemiological information also served to monitor new infections during the 3 rounds of the first phase of the study and to propose a symptom-based risk score to predict COVID-19 among symptomatic people attended by primary care physicians.9
ACKNOWLEDGMENTS
This work was supported by the Spanish Ministry of Health, the Institute of Health Carlos III (Ministry of Science and Innovation), and the National Health System, including the Health Services of all Autonomous Communities and autonomous cities: Servicio Andaluz de Salud, Servicio Aragonés de Salud, Servicio de Salud del Principado de Asturias, Servei de Salut Illes Balears, Servicio Canario de la Salud, Servicio Cántabro de Salud, Servicio de Salud de Castilla-La Mancha, Servicio de Salud de Castilla y León, Servei Català de Salut, Conselleria de Sanitat Universal i Salut Pública de la Generalitat Valenciana, Servicio Extremeño de Salud, Servizo Galego de Saúde, Servicio Riojano de Salud, Servicio Madrileño de Salud, Servicio Murciano de Salud, Servicio Navarro de Salud-Osasunbidea and Instituto de Salud Pública y Laboral de Navarra, Servicio Vasco de Salud-Osakidetza, and Instituto de Gestión Sanitaria. The Spanish Institute of Statistics provided the random selection of households and the information required to contact participants.
We thank all the nurses, general practitioners, administrative personnel, and other health care workers who collaborated in this study and all participants. This study is the result of the efforts of many professionals and the trust and generosity of more than 68 000 participants who have understood the importance of providing time, information, and samples to learn about the COVID-19 epidemic in Spain.
Collaborators of the ENE-COVID Study Group are listed in the Appendix, available as a supplement to the online version of this article at https://www.ajph.org.
CONFLICTS OF INTEREST
The authors declare they have no conflicts of interest.
HUMAN PARTICIPANT PROTECTION
The institutional review board of the Institute of Health Carlos III (ISCIII Committee for Ethical Research) approved the study (register no. PI 39_2020).
REFERENCES
1. Hernán MA, Yotti R. Two pandemics, two surveys in the United States and in Spain. Am J Public Health. 2021;111(3):414–415. doi: 10.2105/AJPH.2020.306105.
2. Hoffman T, Nissen K, Krambrich J, et al. Evaluation of a COVID-19 IgM and IgG rapid test; an efficient tool for assessment of past exposure to SARS-CoV-2. Infect Ecol Epidemiol. 2020;10(1):1754538. doi: 10.1080/20008686.2020.1754538.
3. Pastor-Barriuso R, Pérez-Gómez B, Hernán MA, et al. Infection fatality risk for SARS-CoV-2 in community dwelling population of Spain: nationwide seroepidemiological study. BMJ. 2020;371:m4509. doi: 10.1136/bmj.m4509.
4. Instituto Nacional de Estadística. Encuesta Continua de Hogares. 2020. https://www.ine.es/prensa/ech_2019.pdf
5. Pollán M, Pérez-Gómez B, Pastor-Barriuso R, et al. Prevalence of SARS-CoV-2 in Spain (ENE-COVID): a nationwide, population-based seroepidemiological study. Lancet. 2020;396(10250):535–544. doi: 10.1016/S0140-6736(20)31483-5.
6. Instituto de Salud Carlos III. 2020. https://portalcne.isciii.es/enecovid19
7. Centro Nacional de Epidemiología/Red Nacional de Vigilancia Epidemiológica. 2020. https://www.isciii.es/QueHacemos/Servicios/VigilanciaSaludPublicaRENAVE/EnfermedadesTransmisibles/Documents/INFORMES/Informes%20COVID-19/Informe%20n%c2%ba%2032.%20Situaci%c3%b3n%20de%20COVID-19%20en%20Espa%c3%b1a%20a%2021%20de%20mayo%20de%202020.pdf
8. Pollán M, Pérez-Gómez B, Pastor-Barriuso R, et al. SARS-CoV-2 seroprevalence in Spain—authors’ reply. Lancet. 2020;396(10261):1484–1485. doi: 10.1016/S0140-6736(20)32266-2.
9. Pérez-Gómez B, Pastor-Barriuso R, Pérez-Olmeda M, et al. ENE-COVID nationwide serosurvey served to characterize asymptomatic infections and to develop a symptom-based risk score to predict COVID-19. J Clin Epidemiol. 2021;139:240–254. doi: 10.1016/j.jclinepi.2021.06.005.