eClinicalMedicine. 2021 May 7;35:100880. doi: 10.1016/j.eclinm.2021.100880

Prevalence of SARS-CoV-2 in urban and rural Ethiopia: Randomized household serosurveys reveal level of spread during the first wave of the pandemic

Saro Abdella a,1, Samuel Riou b,⁎,1, Masresha Tessema c, Ashenafi Assefa d, Albab Seifu b, Anna Blachman b, Adugna Abera d, Nicolas Moreno b, Fernando Irarrazaval b, Getachew Tollera a, David Browning b, Geremew Tasew d
PMCID: PMC8176122  PMID: 34124630

Abstract

Background

The spread of SARS-CoV-2 in Sub-Saharan Africa is poorly understood and to date has generally been characterised by a lower number of declared cases and deaths as compared to other regions of the world. Paucity of reliable information, with insights largely derived from limited RT-PCR testing in high-risk and urban populations, has been one of the biggest barriers to understanding the course of the pandemic and informed policy-making. Here we estimate seroprevalence of anti-SARS-CoV-2 antibodies in Ethiopia during the first wave of the pandemic.

Methods

We undertook a population-based household serosurvey of 1856 participants in Ethiopia, in the capital city Addis Ababa and in Jimma, a middle-sized town in the Oromia region, and its rural surroundings (districts of Seka and Mana), between 22 July and 02 September 2020. We tested one random participant per household for anti-SARS-CoV-2 antibodies using a high-specificity rapid diagnostic test (RDT) and evaluated population seroprevalence using a Bayesian logistic regression model taking into account test performance as well as the age and sex of the participants.

Findings

In total, 2304 random households were visited, with 1856 individuals consenting to participate. This produced a sample of 956 participants in Addis Ababa and 900 participants in Jimma. IgG prevalence was estimated at 1.9% (95% CI 0.4–3.7%), and combined IgM/IgG prevalence at 3.5% (95% CI 1.7–5.4%) for Addis Ababa in early August 2020, with higher prevalence in central sub-cities. Prevalence in Jimma town was lower at 0.5% (95% CI 0–1.8%) for IgG and 1.6% (95%CI 0–4.1%) for IgM/IgG, while in rural Jimma IgG prevalence was 0.2% and IgM/IgG 0.4% in early September.

Interpretation

More than four months after the first cases were detected in Ethiopia, Addis Ababa displayed a prevalence under 5% and likely as low as 2%, while rural Jimma displayed a prevalence of 0.2%. A 2% seroprevalence figure for the capital translated to a number of cases at least five times larger than those reported for the country as a whole. At the same time, it contrasts with significantly higher seroprevalence figures in large cities in Europe and America only two to three months after the first cases. This population-based seroepidemiological study thus provides evidence of a slower spread of SARS-CoV-2 in the Ethiopian population during the first wave of the pandemic and does not appear to support the notion that lower case numbers were simply a reflection of limited testing and surveillance.

Funding

Schmidt Family Foundation, Joachim Hertz Foundation, Nespresso, Peet's and Smuckers.

Keywords: SARS-CoV-2, Prevalence, Ethiopia, Africa, Household surveys

1. Introduction

The COVID-19 pandemic swept through China, Europe and the US in the first half of 2020, before spreading to the rest of the world with varying levels of intensity. Brazil was severely affected, with 92,000 deaths reported by August 2020. By then, Mexico had reported 47,000 deaths and Indonesia 5100. In contrast, countries in the Sub-Saharan Africa region, such as Nigeria and Ethiopia, comparable from a population size perspective, reported 880 and 274 deaths respectively [1]. Reported case counts show a similar disparity:  Brazil 2.7 M, Mexico 400,000, Nigeria 43,000, Ethiopia 17,000 (as of August 2020). Only South Africa, with 500,000 cases at that point, and to a much smaller extent Egypt, diverged from the pattern of low caseload on the African continent [2].

Considerable speculation exists amongst health professionals regarding the spread of COVID-19 in Africa [3,4]. Warnings that the spread would be rapid due to high population density and the impact overwhelming due to weaker public health systems have, to date and fortunately, not materialised [3,5,6].

A number of hypotheses have been formulated to explain the lower spread and impact on health of COVID-19 in Africa. Climatic conditions could be less conducive to transmission of SARS-CoV-2, as established in other respiratory viruses such as influenza [7], [8], [9]. The predominance of younger age groups, less prone to developing severe forms of the disease, may have resulted in a larger number of infected individuals displaying no or mild symptoms, and thus passing unnoticed [10,11]. Authors have looked to local conditions that could promote a more effective immune response in individuals infected on the African continent [3]. Containment measures, such as the closure of airports and stay-at-home measures, may have also delayed the onset, but they are unlikely to explain prolonged low prevalence, particularly given that these measures were only partially enforced in countries where the informal business sector is the main source of employment [7]. Finally, significantly lower testing capacity has often been used to explain the comparatively low number of cases in Africa. Indeed Ethiopia's daily PCR testing rate per capita was 30–34 times lower than that of the UK and the US respectively, in the first week of August 2020 [12].

Since the beginning of the pandemic, countries in Sub-Saharan Africa have faced major challenges in attempting to respond effectively to the threat posed by COVID-19. They have generally enacted regulation imposing stringent containment measures but faced considerable challenges to fund the comprehensive testing efforts and stay-at-home subsidies seen in higher income countries. In order to make informed decisions, governments require reliable and unbiased data to estimate the true impact of COVID-19. Testing in Africa has overwhelmingly been limited to RT-PCR in high risk groups, which is problematic for several reasons: cases of COVID-19 that show no or mild symptoms remain unnoticed, RT-PCR detects the virus in a small window of approximately 10 days, and sampling is not representative of the general population [11,13]. To overcome these issues, randomized population-based seroepidemiological surveys have been conducted in several countries worldwide to quantify the proportion of the population exposed to the virus, but these large scale seroepidemiological surveys have mostly remained unavailable in Sub-Saharan Africa [14], [15], [16], [17], [18], [19]. While one study has recently reported important seroprevalence estimates for Kenya, these were based on a population of volunteer blood donors [20].

Population-based seroprevalence studies have remained relatively uncommon due to the cost involved in running laboratory immunoassays on a sufficiently large sample. More recently, rapid diagnostic tests (RDTs) have been shown to produce reliable results and epidemiological information comparable to that of laboratory immunoassays, provided that a test with high and independently verified specificity was used [15,17,21]. In this study, we leveraged the advantages of a high performance RDT for serology testing in challenging field conditions of a randomised household survey to evaluate seroprevalence of SARS-CoV-2 antibodies in Ethiopia's capital city Addis Ababa and in Jimma, a middle-sized town of the Oromia region, and its rural surroundings, the districts of Seka and Mana, hereafter referred to as rural Jimma, between 22 July and 02 September 2020.

2. Methods

2.1. Study area and design

The focal areas were Addis Ababa, capital of Ethiopia, representing a population of more than 4.5 M people and three districts in the zonal administration of Jimma: Jimma town, the largest city in southwest Oromia and two rural districts, Mana and Seka, for a total population of ca. 900,000 people (Fig. 1). 956 households were tested in Addis Ababa between 22 July and 10 August 2020, and 900 households were tested in Jimma region between 19 August and 2 September.

Fig. 1. Location of the areas included in the study.

2.2. Survey protocol and randomization

Household sampling followed a proprietary method of geographical sampling developed by Enveritas for randomization in situations where census tracts listing households are either not available or outdated [22]. Geographical space is divided into 30 m by 30 m tiles using high resolution satellite imagery and a household model is built using an algorithm that detects tiles with households with 95% accuracy [23,24]. Pins representing households are then drawn at random from the household model. The model was modified to account for population density, with tiles weighted by population density data obtained from census data (2017 projections based on the 2007 census) [25].
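The density-weighted draw described above can be illustrated with a minimal sketch. It is not the Enveritas implementation, which is proprietary: the tile records, the `has_household` flag, the weights and the choice of sampling with replacement are all assumptions made for illustration.

```python
import random

def draw_household_tiles(tiles, k, seed=None):
    """Draw k tiles at random, with probability proportional to density weight.

    `tiles` is a list of (tile_id, has_household, density_weight) tuples
    (a hypothetical record layout). Only tiles flagged as containing
    households and carrying a positive weight are eligible.
    """
    rng = random.Random(seed)
    eligible = [(tid, w) for tid, has_hh, w in tiles if has_hh and w > 0]
    ids = [tid for tid, _ in eligible]
    weights = [w for _, w in eligible]
    # Assumption: sampling with replacement, so a dense tile can be drawn twice.
    return rng.choices(ids, weights=weights, k=k)

# Toy household model: t2 has no detected household, t4 has zero weight.
tiles = [("t1", True, 5.0), ("t2", False, 9.0), ("t3", True, 1.0), ("t4", True, 0.0)]
pins = draw_household_tiles(tiles, k=10, seed=42)
```

In this toy model only `t1` and `t3` can be drawn, and `t1` is five times more likely than `t3` on each draw.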

A random sample of households was selected from the model, a randomization routine was followed by health workers to select the household in a geographic location, and one participant per household, aged 18 or above, was chosen at random for antibody testing. If that household or person declined to participate, the health worker sampled the nearest household that provided consent. To prevent over-representation of individuals disproportionately present at home, we ensured randomization was done on all individuals living in the household: if the random draw selected a person not in the household at the time, the health worker contacted them by phone and arranged a test outside of the house (further details are provided in the appendix).

2.3. Data collection and participant consent

Fieldwork was carried out by staff from the Ethiopian Public Health Institute (EPHI). Every household was provided with information about the nature of the study and written informed consent was obtained from all participants. Participants were tested with an RDT and asked to answer a short questionnaire covering demographics, history of social exposure (e.g. time spent outside the household, number of social interactions), symptoms compatible with COVID-19 and any known medical conditions. Answers and test results were collected on a smartphone using CommCare [26]. The study was approved by the Institutional Review Board of EPHI. Every participant was given guidance on interpreting the test result, and interviews and tests were always carried out outside households using appropriate personal protective equipment.

2.4. Serology testing

Serology testing was done using the CE-approved COVID-19 IgM/IgG rapid test (Core Technology Co., Ltd, Beijing, China). This is a lateral-flow immunochromatographic test that uses SARS-CoV-2 antigen coated with colloidal gold to bind with IgG and IgM antibodies targeting both the spike protein (S1 sub-unit) and the nucleocapsid protein, yielding results in 15 min. In a clinical trial on 609 samples reported by the manufacturer, specificity was 98.6% for IgG and 98.4% for IgM (n = 364) while sensitivity was 91.4% and 89.4% respectively (n = 245). An independent trial was run at EPHI on 200 plasma samples collected prior to 2019, with band intensity scored from 0 to 6, where 1 was a barely visible and ambiguous band, in agreement with Whitman et al. [21] (not shown). IgG specificity was 97.0% (95% CI 94.6–99.4%), with no ambiguous bands in the trial sample. IgM specificity was 98.0% (95% CI 96.0–99.9%), treating ambiguous bands of intensity 1 as negatives. In addition, results reported in this study for the two rural districts of Jimma show extremely rare occurrences of seropositives, confirming the high performance of the test. Making the assumption that all positives were false positives, they give a minimum specificity of 99.8% for IgG (1 positive for 617 participants in Mana and Seka) and 99.5% for IgM (3 positives for 617 participants). We combined these datasets in the priors of the Bayesian statistical model of seroprevalence, estimating posterior specificity at ca. 99% for IgG as well as for IgM/IgG combined (see appendix). Sensitivity could only be evaluated independently by EPHI on blood samples of positive individuals 15 days post symptom onset (n = 100 SARS-CoV-2 RT-PCR positive individuals). This is too early in the phase of antibody development to adequately measure sensitivity, but results showed detection of 67% (95% CI 57–76%) for IgG and 69% (95% CI 60–78%) for IgM/IgG, which is consistent with the known pattern of RDTs showing gradually rising antibody detection in RT-PCR positive individuals, and sensitivities above 80% after 20 days [21].
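As a rough check on the interval reported for IgG specificity in the independent trial, the sketch below computes a normal-approximation (Wald) interval from the reported rate and sample size. The article does not state which interval method was used, so the Wald formula is an assumption; the function name is illustrative.

```python
import math

def wald_interval(rate, n, z=1.96):
    """Normal-approximation 95% interval for a proportion, clipped to [0, 1]."""
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    return max(0.0, rate - half_width), min(1.0, rate + half_width)

# IgG specificity of 97.0% on 200 pre-2019 plasma samples, as reported above.
low, high = wald_interval(0.97, 200)
print(f"{low:.1%} to {high:.1%}")
```

Under this assumption the bounds round to the 94.6–99.4% range given in the text.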

Tests were carried out following manufacturer guidance and results were scored by band intensity. A photograph of the test result was taken at the time of the test, recorded in the survey application and confirmed by a second reader. We chose not to use IgM results alone due to the higher heterogeneity of results reported for IgM in RDTs, including the higher occurrence of ambiguous IgM bands [15,21]. We thus built two separate datasets for analysis: one with seroprevalence based only on IgG bands and one with seroprevalence based on the presence of an IgG band or an unambiguous IgM band.

2.5. Statistical analysis

To estimate seroprevalence, we used a Bayesian logistic regression model with post-stratification on the age and sex of the population. The model takes into account uncertainty around the specificity and sensitivity of the test in the priors as well as in the estimation of prevalence, following Gelman and Carpenter [27]. We implemented it in the Stan probabilistic programming language, running 8000 iterations, verified convergence with the R-hat statistic and checked model fit using posterior predictive checks [28]. Further details on the statistical model are available in the appendix. Estimates are means of the posterior samples, with the 2.5th and 97.5th percentiles reported as the 95% credible interval (CI). Credible intervals are easier to interpret than frequentist confidence intervals: they can be read as the probability that the true value falls inside the interval, so the prevalence we report has a 95% chance of being within the 95% CI, given the model and the data. Using this model we estimated that a sample size of 800 to 1000 would allow the calculation of a regional prevalence estimate (e.g. Addis Ababa) with a margin of error of ca. 1% (with 95% confidence) for a population prevalence of 1% and a margin of error of ca. 2% for a population prevalence of 5%.
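The way a posterior mean and 95% credible interval are read off posterior samples can be sketched as follows. The samples here are a stand-in, not output from the Stan model: they are drawn from a Beta(32, 926) distribution, which is the posterior for the raw positive rate of 31 out of 956 under a uniform prior and which ignores test error. The percentile rule (a simple rank-based lookup) is also an assumption.

```python
import random
import statistics

def summarize_posterior(samples):
    """Return (mean, 2.5th percentile, 97.5th percentile) of posterior draws."""
    ordered = sorted(samples)
    n = len(ordered)
    lower = ordered[int(0.025 * (n - 1))]  # rank-based 2.5th percentile
    upper = ordered[int(0.975 * (n - 1))]  # rank-based 97.5th percentile
    return statistics.fmean(ordered), lower, upper

# Stand-in draws: raw positivity only, no correction for sensitivity/specificity.
rng = random.Random(0)
draws = [rng.betavariate(32, 926) for _ in range(8000)]
mean, ci_low, ci_high = summarize_posterior(draws)
```

The same summary applied to the draws of the prevalence parameter from the fitted model would give the estimates and intervals reported in the Results.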

2.6. Role of the funding source

The funding source had no involvement in the collection, analysis, interpretation of the data, realisation of the manuscript or decision to submit for publication. The corresponding author has access to the full dataset and decided to submit the manuscript for publication.

3. Results

In total, 2304 random households were visited between 22 July and 02 September 2020, of which 241 (10.5%) were not accessible and 179 randomly selected participants (9%) were unwilling or unable to participate, or, if outside the household, were not reachable on the phone. This produced a sample of 956 participants in Addis Ababa and 900 in Jimma region. The age distribution in the sample was representative of the age distribution in the census population in both areas (Table 1). Men and women were sampled in equal proportion in Jimma. In Addis Ababa, the sample overrepresented women and this was taken into account by post-stratification in the model.

Table 1.

Demographics of the study participants.

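| Area | Category | Group | Sample size | Sample size (%) | Census data (%)* |
|---|---|---|---|---|---|
| Addis Ababa | Sex | Male | 373 | 39% | 47% |
| Addis Ababa | Sex | Female | 583 | 61% | 53% |
| Addis Ababa | Age | 18–34 | 461 | 48% | 51% |
| Addis Ababa | Age | 35–49 | 268 | 28% | 28% |
| Addis Ababa | Age | 50–64 | 162 | 17% | 11% |
| Addis Ababa | Age | 65–90 | 65 | 7% | 9% |
| Jimma | Sex | Male | 426 | 47% | 50% |
| Jimma | Sex | Female | 474 | 53% | 50% |
| Jimma | Age | 18–34 | 477 | 53% | 50% |
| Jimma | Age | 35–49 | 258 | 29% | 28% |
| Jimma | Age | 50–64 | 117 | 13% | 14% |
| Jimma | Age | 65–90 | 48 | 5% | 8% |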

* Central Statistics Agency of Ethiopia.

Testing results and model outputs by area for IgG and combined IgM/IgG are reported in Table 2. Between 22 July and 10 August 2020, IgG prevalence in Addis Ababa was estimated at 1.9% (95% CI 0.4–3.7%) and combined IgM/IgG prevalence at 3.5% (95% CI 1.7–5.4%). Between 19 August and 2 September 2020, prevalence in Jimma town was 0.5% (95% CI 0–1.8%) for IgG and 1.6% (95% CI 0–4.1%) for IgM/IgG, while the two rural districts of Mana and Seka combined (rural Jimma) returned one IgG positive and two IgM/IgG positives, with a prevalence estimated at 0.2% (95% CI 0–0.8%) for IgG and 0.4% (95% CI 0–1.0%) for IgM/IgG.

Table 2.

Summary of testing results and model outputs by area for IgG and combined IgM/IgG.

| Antibody | Area | Sample size | Positives | Prevalence (95% CI) | Posterior specificity |
|---|---|---|---|---|---|
| IgG only | Addis Ababa | 956 | 31 | 1.9% (0.4–3.7%) | 98.7% |
| IgG only | Jimma region | 900 | 5 | 0.3% (0–0.9%) | 99.1% |
| IgG only | Jimma town | 283 | 4 | 0.5% (0–1.8%) | 99.1% |
| IgG only | Jimma rural (Seka, Mana) | 617 (312 + 305) | 1 | 0.2% (0–0.8%) | 99.1% |
| IgM or IgG | Addis Ababa | 956 | 45 | 3.5% (1.7–5.4%) | 98.8% |
| IgM or IgG | Jimma region | 900 | 11 | 0.6% (0.1–1.6%) | 99.1% |
| IgM or IgG | Jimma town | 283 | 9 | 1.6% (0–4.1%) | 99.1% |
| IgM or IgG | Jimma rural (Seka, Mana) | 617 (312 + 305) | 2 | 0.4% (0–1.0%) | 99.1% |

In Addis Ababa, the means and upper bounds of the 95% CI of IgG seroprevalence were higher in central sub-cities than in peripheral sub-cities, ranging from 3.2% in Kirkos to 1.3% in Kolfe Keraniyo (Fig. 2). Prevalence was slightly higher in males (2.1%) than in females (1.8%), and lower in the 18–34 age group than in the other age groups analysed (35–49, 50–64 and 65–90) (Table 3). In Jimma town and rural Jimma, there were no detectable differences in prevalence between sexes or age groups, with IgG prevalence at 0.3% (Table 3).

Fig. 2. IgG seroprevalence (with 95% CI) in Addis Ababa by sub-city. The number of participants per sub-city was proportional to the sub-city population in census data: Arada 57, Akaki Kaliti 67, Addis Ketema 66, Bole 124, Gulele 99, Kirkos 72, Kolfe Keraniyo 158, Lideta 47, Nefas Silk-Lafto 112, Yeka 154.

Table 3.

Summary of sample sizes and prevalence outputs by age and gender.

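| Area | Category | Group | Sample size | IgG prevalence (95% CI) | IgM/IgG prevalence (95% CI) |
|---|---|---|---|---|---|
| Addis Ababa | Sex | Male | 373 | 2.1% (0.3–4.3%) | 4.0% (1.6–6.6%) |
| Addis Ababa | Sex | Female | 583 | 1.8% (0.3–3.6%) | 3.2% (1.5–5.1%) |
| Addis Ababa | Age | 18–34 | 461 | 1.6% (0.2–3.4%) | 2.8% (1.0–5.0%) |
| Addis Ababa | Age | 35–49 | 268 | 2.4% (0.4–5.2%) | 4.5% (1.8–7.6%) |
| Addis Ababa | Age | 50–64 | 162 | 2.4% (0.3–5.5%) | 4.6% (1.7–8.7%) |
| Addis Ababa | Age | 65–90 | 65 | 2.4% (0.3–5.8%) | 5.1% (1.7–10.5%) |
| Jimma* | Sex | Male | 426 | 0.3% (0–1.0%) | 0.7% (0.1–1.8%) |
| Jimma* | Sex | Female | 474 | 0.3% (0–1.0%) | 0.8% (0.1–2.0%) |
| Jimma* | Age | 18–34 | 477 | 0.3% (0–0.9%) | 0.8% (0.1–1.9%) |
| Jimma* | Age | 35–49 | 258 | 0.3% (0–1.1%) | 0.8% (0.1–2.0%) |
| Jimma* | Age | 50–64 | 117 | 0.3% (0–1.2%) | 0.8% (0.1–2.0%) |
| Jimma* | Age | 65–90 | 48 | 0.3% (0–1.1%) | 0.8% (0.1–2.1%) |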

* Jimma town and rural combined.

4. Discussion

The spread of SARS-CoV-2 has been poorly documented in Africa. Insights have largely been limited to RT-PCR case counts in high-risk populations and death counts reported by health authorities. Both have remained particularly low in Africa, and both provide a poor estimate of the real impact and propagation of SARS-CoV-2 owing to limited diagnostic capacity on the continent [2]. Using a large-scale population-based household serosurvey, we show that the capital Addis Ababa displayed a seroprevalence under 5% (upper bound of the 95% CI, IgM/IgG estimate) and likely as low as 2%, more than four months after the first cases were detected in Ethiopia, while in rural Ethiopia seroprevalence was under 0.5%.

Although rapid diagnostic tests have been shown to produce prevalence estimates that are comparable to those derived from immunoassays [15,29], they have also received some criticism owing to variability in quality between brands, and specifically specificity [21,30,31]. We were able to confirm high specificity of the test used in Ethiopia from three independent sources, and modelled the occurrence and uncertainty around the number of false positives using a Bayesian framework. IgM bands were less reliable, with a number of faint or ambiguous intensity bands that the lab evaluation confirmed were false positives and that were excluded. While IgG data was more reliable, we also reported combined IgG/IgM estimates, which were higher than IgG prevalence estimates, so as to provide a complete picture. The difference observed could possibly be due to increased likelihood of detecting an immune response to SARS-CoV-2 via the detection of IgM antibodies in recently infected participants. Ethiopia reported more than a tenfold expansion in the number of RT-PCR positive cases through the study period, from 3500 cases on 15 July to 53,000 on 1 September [12], making it possible that a number of study participants had been infected days prior to testing, thus explaining IgM seropositivity, without IgG antibodies. However, higher IgM/IgG figures may also be explained by lower specificity or increased difficulty in reading the IgM test. Differentiating between faint band intensities was prone to error and could have resulted in a slight overestimate of prevalence, using the combined dataset.

A seroprevalence of 2% in Addis Ababa corresponds to approximately 90,000 individuals infected in the capital. While this number is five times the cumulative number of cases reported for the entire country of Ethiopia at the time of the serosurvey (18,000 cases on 1 August 2020), the level of seroprevalence is lower than might have been expected had SARS-CoV-2 spread at the same speed as in other metropolises for which seroprevalence data is available. Capitals in Europe reported seroprevalence above 10% two to three months after the first cases, and prevalence was estimated at 12% in New York state and 20% in New York City [18]. Figures in Addis Ababa are more comparable to those reported from Sao Paulo and Rio de Janeiro, in the 2 to 7% range three to four months after the first reported cases [19]. At the time of writing, two population-wide serology studies have reported results for Sub-Saharan Africa. In Kenya, 7% IgG prevalence was reported for Nairobi, based on a sample of volunteer blood donors. Participants actively left their home to donate blood, which could have led to a slight overestimate. Still, it appears that Nairobi displayed seroprevalence levels higher than in Addis Ababa, and lower than those reported in European cities in that first phase of the pandemic. A recent study in Zambia reported 2% IgG prevalence 4 months after the first cases, similar to what we show for Addis Ababa [32]. Addis Ababa airport was not shut down (unlike in Nairobi), flights continued operating throughout the first COVID-19 wave in Europe, and while Ethiopia did encourage social distancing and masks, movement within and between the areas surveyed in this study was not restricted, except for a short period in April 2020. We might thus have expected higher seroprevalence. The authors of the study in Zambia contrasted their findings with higher prevalence that they estimated using RT-PCR on a representative sample. They suggested that lower antibody seroprevalence in Zambia could be explained by recent infections, immediately prior to data collection. This is similar to the contrast we report between IgG and combined IgM/IgG results. It is therefore possible that the low seroprevalence in Addis Ababa is partly a result of a significantly delayed first wave.
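The headline figures in this paragraph follow from simple arithmetic, shown here as a worked check. It assumes the Addis Ababa population of roughly 4.5 M given in the Methods and treats the 2% seroprevalence as applying to the whole population, which is a simplification since only adults were sampled:

$$0.02 \times 4.5 \times 10^{6} = 90{,}000 \ \text{infected individuals}, \qquad \frac{90{,}000}{18{,}000} = 5$$

so the estimated number of infections in the capital alone is about five times the national cumulative case count reported on 1 August 2020.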

The results presented in this study do not align with the notion that the slow spread of the virus during the first wave was an artefact of limited testing capacity, but instead suggest a slower spread of SARS-CoV-2 in Ethiopia. One possible explanation is that transmission was constrained by environmental factors. A significant body of evidence supports a relationship between climatic and environmental conditions (namely increased UV light, temperature and humidity) and reduced spread of SARS-CoV-2 [9,33], consistent with what has been established for other respiratory viruses such as influenza [8]. This could partly explain reduced transmission and growth rates of SARS-CoV-2 in Ethiopia and would also be consistent with the minimal presence of seropositive individuals in rural areas of Ethiopia, a pattern that contrasts with the significant spread from urban to rural areas that occurred in a matter of weeks in, for example, the USA [34]. Population density plays a role in promoting transmission, seen here in the increased seroprevalence in the higher-density sub-cities of central Addis Ababa, but its effect in fuelling the spread in densely populated African cities has, to date, been more subdued than expected [3]. Another factor that may play a key contributing role is the tendency of the virus to spread in clusters through super-spreading events [35]. These events, encouraged for instance by large gatherings, may have occurred where ecological conditions and social circumstances were ideal for transmission, helping to explain the heterogeneity in seroprevalence within and between countries that would otherwise seem to provide a comparable setting for the virus [19].
Finally, while a younger population and differing genetic or pre-existing serological conditions in Sub-Saharan Africa likely have an effect on the quality of the immune response, and thus on fatality rates, it is less clear how these would contribute to lower prevalence [36]. Antibodies produced in response to a SARS-CoV-2 infection may wane in a few months to a year [37]. This is unlikely to impact the results presented here given the length of these timeframes, the timing of data collection and the very slow rate of progression of the disease following the first case in Ethiopia in March 2020 (there were still fewer than 150 cases reported in the country in the first week of May[1]). This could pose a challenge to future serosurveys, which may need to take into account the rate of decline of antibody titers in modelling seroprevalence so as to allow comparisons over time [38].

Funding

This work was supported by the Schmidt Family Foundation, Joachim Hertz Foundation, Nespresso, Peet's and Smuckers.

Data sharing statement

A subset of anonymised raw data collected during the study is available upon request to the corresponding author, after approval of EPHI.

Author Contributions

DB, SR, GTa, SA, MT, AB and AS conceived and designed this study. SA, AS, AAs, AAd and GTo coordinated operations and logistics. SR, NM and AB performed data analysis, FI and NM designed the statistical model. SR drafted the first version of the manuscript. All authors read and approved the final manuscript.

Declaration of Competing Interest

We declare no competing interests.

Acknowledgments

The study would not have been possible without Morgan Minogue, Yohanan Bedasso, Gemechu Gudeta, Lamessa Negeri, and the team of health workers at the Ethiopian Public Health Institute, who contributed their time and energy to data collection efforts in Addis Ababa and Jimma. We also thank the team at Enveritas including but not limited to Erin Collins, Lushuang Xu, Karin Remmelzwaal, Diana Baquero, Jeffrey Potter, Brett Woodward, Maria Burkiewicz, Eugene Wang, Grace Gee and Carl Cervone for providing input or directly helping improve and refine the data collection and storage process.

Appendix 1. Household randomization

Selecting a random household within a household tile

The random draw from the geographic household model produces household tiles (and GPS pins) with known population density. The health worker travels to that tile and can then be faced with several households to choose from. In rural settings, the health worker goes to the household nearest to the GPS pin. If the nearest household is not willing to participate or is not available, the second nearest is picked (and so on and so forth). In urban settings with multistory buildings, several households are present. The health worker follows a randomization routine to select the household. He orders flats by number or surname, and uses a random number provided by a randomizing app to select the flat. Should that first household be empty or not provide consent, the health worker moves on to the nearest household by distance. In the event of a tie, a random number provided by the app is used to make the choice.

Sampling within households

The objective is to sample a random person in the household among household members 18 years of age or above. During the pre-survey discussion, the health worker records the number of people living in the household, as well as their age. The health worker then randomly selects the person to sample who is over 18 years of age. This is done by collecting the ages of everyone in the household, assigning them a number from 1 to 10 based on their age, and using a randomizing app to select a random number between 1 and 10. If the draw selects a household member absent at the time, the health worker contacts him on the phone to set up a test. This was added to the protocol to prevent under-sampling people that had left the house to work outdoors. If this fails, the health worker restarts the routine at the next nearest household.
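The within-household draw can be sketched as below. This is an illustrative simplification rather than the field protocol itself: it picks uniformly among adult household members instead of reproducing the 1 to 10 numbering used in the randomizing app, and it uses the "18 or above" eligibility rule stated in the Methods.

```python
import random

def select_participant(household_ages, rng=random):
    """Pick one household member aged 18 or above uniformly at random.

    Returns the index of the selected member in `household_ages`, or None
    when the household has no eligible adult. Presence at home is
    deliberately ignored: an absent member can still be selected and is
    then contacted by phone, as described above.
    """
    adults = [i for i, age in enumerate(household_ages) if age >= 18]
    if not adults:
        return None
    return rng.choice(adults)

# Hypothetical household: two children and two adults.
chosen = select_participant([5, 17, 34, 60], rng=random.Random(1))
```

Selecting from the full roster of adults, rather than from whoever answers the door, is what guards against over-representing people who are disproportionately at home.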

Appendix 2. Statistical model

The model estimates prevalence p, taking into account specificity and sensitivity of the test as well as the uncertainty around these:

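$$y \sim \mathrm{binomial}(n,\ p_{\mathrm{sample}})$$
$$p_{\mathrm{sample}} = p \cdot se + (1 - p)(1 - sp)$$
$$y_{sp} \sim \mathrm{binomial}(n_{sp},\ sp)$$
$$y_{se} \sim \mathrm{binomial}(n_{se},\ se)$$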

where $p_{\mathrm{sample}}$ is the probability of seeing a positive in the sample, $y$ is the number of positives observed, $n$ is the sample size, and $se$ and $sp$ are the sensitivity and specificity of the test.
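The relationship between true prevalence and the expected positive rate can be inverted to give a quick point correction, known as the Rogan-Gladen estimator. The sketch below is that classical shortcut, not the Bayesian model used in the study, and it ignores uncertainty in sensitivity and specificity. The inputs are taken from the article (31 IgG positives out of 956 in Addis Ababa, manufacturer IgG sensitivity of 91.4%, posterior IgG specificity of 98.7%); combining them this way is an assumption made for illustration.

```python
def expected_positive_rate(p, se, sp):
    """Probability of a positive test: true positives plus false positives."""
    return p * se + (1 - p) * (1 - sp)

def corrected_prevalence(p_sample, se, sp):
    """Invert expected_positive_rate for p, clipped to the [0, 1] range."""
    p = (p_sample + sp - 1) / (se + sp - 1)
    return min(1.0, max(0.0, p))

raw_rate = 31 / 956  # observed IgG positive rate in Addis Ababa
estimate = corrected_prevalence(raw_rate, se=0.914, sp=0.987)
print(f"raw {raw_rate:.2%}, corrected {estimate:.2%}")
```

The corrected value comes out at roughly 2%, in the same range as the model-based IgG estimate, although the full model additionally post-stratifies and propagates test uncertainty.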

Prior information on specificity and sensitivity was taken from the results of the test manufacturer's trials: $y_{sp}/n_{sp} = 359/364$ and $y_{se}/n_{se} = 230/245$. We then added the results from the trial run at EPHI ($y_{sp}/n_{sp} = 193/199$). For Addis Ababa only, we also included the “minimum specificity” obtained from testing in the 2 rural districts of Jimma: $y_{sp}/n_{sp} = 616/617$. These additional results were included by adding a hierarchical structure to the model of the specificity. Since specificity varies across trials due to varying testing conditions, it is not sufficient to simply pool the results. Following Gelman and Carpenter [27], we let the parameter vary according to a hierarchical model where, for any study $j$, the specificity $sp_j$ is drawn from a normal distribution on the logistic scale:

$$\mathrm{logit}(sp_j) \sim \mathrm{normal}(\mu_{sp},\ \sigma_{sp})$$

where the hyperparameters $\mu_{sp}$ and $\sigma_{sp}$ can be estimated from the data. When coding the model we use the convention that $j = 1$ corresponds to the specificity study of interest and the parameter $sp_1$ to the specificity for the site where we performed our prevalence study.

We fit the model to our data assigning a uniform(0, 1) prior to $p$ and $\mu_{sp}$ and a weak normal$^+$(0, 1) prior to $\sigma_{sp}$. This implies that the model will draw $sp_j$ approximately 2/3rds of the time from the interval [0.92, 0.99]. Gelman and Carpenter [27] summarise the role of the hyperparameter: “The normal + (0, 1) hyperpriors weakly pull the specificities and sensitivities from different studies toward each other, while allowing for a large variation if required by the data”.

The full model is a hierarchical multilevel regression model with post-stratification (MRP). Although the data collected was largely representative of the population, MRP allowed us to make minor adjustments to the estimates using census data. The model is based on the above, replacing $p$ with $p_i$ ($i = 1$ to $n$), the result of a logistic regression that models the probability of being positive as a function of age group, sub-city and gender:

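$$p_i = \mathrm{logit}^{-1}\left(b_1 + b_2 \cdot \mathrm{male}_i + b_3 \cdot x^{\mathrm{subcity}}_{\mathrm{subcity}[i]} + a^{\mathrm{age}}_{\mathrm{age}[i]} + a^{\mathrm{subcity}}_{\mathrm{subcity}[i]}\right)$$
$$p_{\mathrm{sample}[i]} = p_i \cdot se + (1 - p_i)(1 - sp)$$
$$y_i \sim \mathrm{bernoulli}(p_{\mathrm{sample}[i]})$$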

where male is a variable that takes the value 1 for men and 0 for women; $x^{\mathrm{subcity}}$ is a predictor that gives more weight to geographical location and can be set to anything relevant at the sub-city level (we defined it as the percentage of people above 60 years of age in each sub-city); $\mathrm{age}[i]$ and $\mathrm{subcity}[i]$ are index variables that correspond to testee $i$; $b_1$, $b_2$, $b_3$ are logistic regression coefficients; and $a^{\mathrm{age}}$, $a^{\mathrm{subcity}}$ are vectors of varying intercepts:

$$a^{\mathrm{age}} \sim \mathrm{normal}(0, \sigma_{\mathrm{age}})$$
$$a^{\mathrm{subcity}} \sim \mathrm{normal}(0, \sigma_{\mathrm{subcity}})$$
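The measurement step of the model, which maps a true prevalence to the probability of observing a positive test, can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function name `p_sample` is an assumption, and the sensitivity and specificity are fixed at the raw manufacturer proportions quoted earlier, whereas the actual model treats them as uncertain parameters.

```python
def p_sample(p: float, se: float, sp: float) -> float:
    """Probability that a randomly chosen participant tests positive.

    p  : true prevalence (probability of having antibodies)
    se : test sensitivity
    sp : test specificity
    """
    # True positives plus false positives.
    return p * se + (1.0 - p) * (1.0 - sp)


# Point values taken from the manufacturer's trial counts quoted in the text.
# Fixing them is a simplification made only for this illustration.
se = 230 / 245
sp = 359 / 364

for p in (0.0, 0.02, 0.05):
    print(f"true prevalence {p:.2%} -> expected positive rate {p_sample(p, se, sp):.2%}")
```

With these point values, a population with no true infections would still be expected to yield positives at a rate of $1 - sp = 5/364$, which is why test specificity has to enter the model when prevalence is low.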

The full model can be accessed on Github at https://github.com/enveritas/ethiopia-covid-model/blob/main/mrpHierPrevalence.stan
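The post-stratification step of MRP can be illustrated with a minimal Python sketch: prevalence estimates for each population cell are averaged using census counts as weights. This is not taken from the linked Stan file; the function name `poststratify`, the cell labels, the prevalence values and the census counts below are all hypothetical and chosen only to show the arithmetic.

```python
def poststratify(cell_prevalence: dict, cell_population: dict) -> float:
    """Population-weighted average of cell-level prevalence estimates."""
    total = sum(cell_population.values())
    weighted = sum(cell_prevalence[cell] * cell_population[cell]
                   for cell in cell_population)
    return weighted / total


# Hypothetical cells (age group, sex) with made-up estimates and census counts.
cell_prevalence = {
    ("18-39", "female"): 0.02,
    ("18-39", "male"): 0.04,
}
cell_population = {
    ("18-39", "female"): 3000,
    ("18-39", "male"): 1000,
}

print(f"post-stratified prevalence: {poststratify(cell_prevalence, cell_population):.3f}")
```

In a Bayesian fit, the same weighted average would typically be computed for every posterior draw, so that the population-level estimate carries a full uncertainty interval.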

Appendix 3. Number of samples by band intensity in positive samples

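| Site | # Positives | Intensity 1 | Intensity 2 | Intensity 3 | Intensity 4 | Intensity 5 | Intensity 6 |
|---|---|---|---|---|---|---|---|
| Addis Ababa | 31 | 6 (19%) | 6 (19%) | 7 (22%) | 3 (10%) | 5 (16%) | 4 (13%) |
| Jimma | 5 | 4 (80%) | 1 (20%) | 0 | 0 | 0 | 0 |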

References

  • 1. ECDC. European Center for Disease Control; 2020. Daily number of new reported cases of COVID-19 by country worldwide. https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide published online Nov 22. (accessed Nov 23, 2020)
  • 2. Nachega J., Seydi M., Zumla A. The late arrival of coronavirus disease 2019 (COVID-19) in Africa: mitigating pan-continental spread. Clin Infect Dis. 2020;71:875–878. doi: 10.1093/cid/ciaa353.
  • 3. Mbow M., Lell B., Jochems S.P. COVID-19 in Africa: dampening the storm? Science. 2020;369:624–626. doi: 10.1126/science.abd3902.
  • 4. Maeda J.M., Nkengasong J.N. The puzzle of the COVID-19 pandemic in Africa. Science. 2021;371:27–28. doi: 10.1126/science.abf8832.
  • 5. Nkengasong J.N., Mankoula W. Looming threat of COVID-19 infection in Africa: act collectively, and fast. Lancet. 2020;395:841–842. doi: 10.1016/S0140-6736(20)30464-5.
  • 6. Gilbert M., Pullano G., Pinotti F. Preparedness and vulnerability of African countries against importations of COVID-19: a modelling study. Lancet. 2020;395:871–877. doi: 10.1016/S0140-6736(20)30411-6.
  • 7. Njenga M.K., Dawa J., Nanyingi M. Why is there low morbidity and mortality of COVID-19 in Africa? Am J Trop Med Hyg. 2020;103:564–569. doi: 10.4269/ajtmh.20-0474.
  • 8. Moriyama M., Hugentobler W.J., Iwasaki A. Seasonality of respiratory viral infections. Annu Rev Virol. 2020;7:83–101. doi: 10.1146/annurev-virology-012420-022445.
  • 9. Mecenas P., Bastos RT da R.M., Vallinoto A.C.R., Normando D. Effects of temperature and humidity on the spread of COVID-19: a systematic review. PLoS One. 2020;15 doi: 10.1371/journal.pone.0238339.
  • 10. Mougeni F., Mangaboula A., Lell B. The potential effect of the African population age structure on COVID-19 mortality. medRxiv. 2020 2020.05.19.20106914.
  • 11. Mizumoto K., Kagaya K., Zarebski A., Chowell G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the diamond princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance. 2020;25 doi: 10.2807/1560-7917.es.2020.25.10.2000180.
  • 12. Roser M., Ritchie H., Ortiz-Ospina E., Hasell J. Our World in Data; 2020. Coronavirus pandemic (COVID-19). https://ourworldindata.org/coronavirus?te=1&nl=nicholas-kristof&emc=edit_nk_20200622
  • 13. Wikramaratna P., Paton R.S., Ghafari M., Lourenco J. Estimating the false-negative test probability of SARS-CoV-2 by RT-PCR. Euro Surveillance. 2020;25(50). doi: 10.2807/1560-7917.ES.2020.25.50.2000568.
  • 14. Stringhini S., Wisniak A., Piumatti G. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Geneva, Switzerland (SEROCoV-POP): a population-based study. Lancet. 2020;396:313–319. doi: 10.1016/S0140-6736(20)31304-0.
  • 15. Pollán M., Pérez-Gómez B., Pastor-Barriuso R. Prevalence of SARS-CoV-2 in Spain (ENE-COVID): a nationwide, population-based seroepidemiological study. Lancet. 2020;396:535–544. doi: 10.1016/S0140-6736(20)31483-5.
  • 16. Bobrovitz N., Arora R.K., Yan T., et al. Lessons from a rapid systematic review of early SARS-CoV-2 serosurveys. doi: 10.1101/2020.05.10.20097451.
  • 17. Silveira M.F., Barros A.J.D., Horta B.L. Population-based surveys of antibodies against SARS-CoV-2 in Southern Brazil. Nat Med. 2020;26:1196–1199. doi: 10.1038/s41591-020-0992-3.
  • 18. Rosenberg E.S., Tesoriero J.M., Rosenthal E.M. Cumulative incidence and diagnosis of SARS-CoV-2 infection in New York. Ann Epidemiol. 2020;48:23–29.e4. doi: 10.1016/j.annepidem.2020.06.004.
  • 19. Hallal P.C., Hartwig F.P., Horta B.L. SARS-CoV-2 antibody prevalence in Brazil: results from two successive nationwide serological household surveys. Lancet Glob Health. 2020;8:e1390–e1398. doi: 10.1016/S2214-109X(20)30387-9.
  • 20. Uyoga S., Adetifa I.M.O., Karanja H.K. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Kenyan blood donors. Science. 2020 doi: 10.1126/science.abe1916. published online Nov 11.
  • 21. Whitman J.D., Hiatt J., Mowery C.T. Evaluation of SARS-CoV-2 serology assays reveals a range of test performance. Nat Biotechnol. 2020;38:1174–1183. doi: 10.1038/s41587-020-0659-0.
  • 22. ArcGIS StoryMaps; 2020. Supply chain verification: finding small-holder coffee farms. https://storymaps.arcgis.com/stories/10ef0006e4d94b86b228748c7082cf85 published online Feb 25. (accessed May 14, 2020)
  • 23. Escamilla V., Emch M., Dandalo L., Miller W.C., Hoffman I. Sampling at community level by using satellite imagery and geographical analysis. Bull World Health Organ. 2014;92:690–694. doi: 10.2471/BLT.14.140756.
  • 24. FPFIS team. Using GIS and Remote Sensing to build Master Sampling Frames for Agricultural Statistics - EU Science Hub - European Commission. EU Science Hub - European Commission. 2015; published online Dec 10. https://ec.europa.eu/jrc/en/publication/using-gis-and-remote-sensing-build-master-sampling-frames-agricultural-statistics (accessed May 13, 2020).
  • 25. CSA. Population projection of Ethiopia for all regions at wereda level from 2014 to 2017. 2013.
  • 26. Chatfield A., Javetski G., Lesh N. CommCare evidence base. Dimagi. 2013 https://www.dimagi.com/commcare/
  • 27. Gelman A., Carpenter B. Bayesian analysis of tests with unknown specificity and sensitivity. J R Stat Soc Ser C Appl Stat. 2020;69:1269–1283. doi: 10.1111/rssc.12435.
  • 28. Stan Development Team. Vol. 2. R package version 2212; 2020. http://mc-stan.org/ (RStan: the R interface to Stan).
  • 29. Lundkvist Å., Hanson S., Olsen B. Pronounced difference in Covid-19 antibody prevalence indicates cluster transmission in Stockholm, Sweden. Infect Ecol Epidemiol. 2020;10 doi: 10.1080/20008686.2020.1806505.
  • 30. Gray C.M., Peter J., Mendelson M., Madhi S., Blackburn J.M. COVID-19 antibody testing: from hype to immunological reality. S Afr Med J. 2020;110:837–841.
  • 31. Tollånes M.C., Bakken Kran A.-M., Abildsnes E., Jenum P.A., Breivik A.C., Sandberg S. Evaluation of eleven rapid tests for detection of antibodies against SARS-CoV-2. Clin Chem Lab Med. 2020;58:1595–1600. doi: 10.1515/cclm-2020-0628.
  • 32. Mulenga L.B., Hines J.Z., Fwoloshi S. Prevalence of SARS-CoV-2 in six districts in Zambia in July 2020: a cross-sectional cluster sample survey. Lancet Glob Health. 2021 doi: 10.1016/S2214-109X(21)00053-X. published online March 9.
  • 33. Merow C., Urban M.C. Seasonality and uncertainty in global COVID-19 growth rates. Proc Natl Acad Sci U S A. 2020;117:27456–27464. doi: 10.1073/pnas.2008590117.
  • 34. Paul R., Arif A.A., Adeyemi O., Ghosh S., Han D. Progression of COVID-19 from urban to rural areas in the United States: a spatiotemporal analysis of prevalence rates. J Rural Health. 2020;36:591–601. doi: 10.1111/jrh.12486.
  • 35. Adam D.C., Wu P., Wong J.Y. Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong. Nat Med. 2020;26:1714–1719. doi: 10.1038/s41591-020-1092-0.
  • 36. Tso F.Y., Lidenge S.J., Peña P.B. High prevalence of pre-existing serological cross-reactivity against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in sub-Saharan Africa. Int J Infect Dis. 2021;102:577–583. doi: 10.1016/j.ijid.2020.10.104.
  • 37. Choe P.G., Kim K.-H., Kang C.K. Antibody responses 8 months after asymptomatic or mild SARS-CoV-2 Infection. Emerg Infect Dis. 2020;27 doi: 10.3201/eid2703.204543.
  • 38. Shioda K., Lau M.S., Kraay A.N. Estimating the cumulative incidence of SARS-CoV-2 infection and the infection fatality ratio in light of waning antibodies. medRxiv. 2020 doi: 10.1101/2020.11.13.20231266. published online Nov 16.

Articles from EClinicalMedicine are provided here courtesy of Elsevier
