PLOS Neglected Tropical Diseases. 2020 Apr 15;14(4):e0008227. doi: 10.1371/journal.pntd.0008227

Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017

John Mwaba 1, Amanda K Debes 2, Patrick Shea 2, Victor Mukonka 3, Orbrie Chewe 3, Caroline Chisenga 1, Michelo Simuyandi 1, Geoffrey Kwenda 4, David Sack 2, Roma Chilengi 1, Mohammad Ali 2,*
Editor: Adam Akullian 5
PMCID: PMC7159183  PMID: 32294084

Abstract

The global burden of cholera is increasing, with the majority (60%) of the cases occurring in sub-Saharan Africa. In Zambia, widespread cholera outbreaks have occurred since 1977, predominantly in the capital city of Lusaka. During both the 2016 and 2018 outbreaks, the Ministry of Health implemented cholera vaccination in addition to other preventative and control measures, to stop the spread and control the outbreak. Given the limitations in vaccine availability and the logistical support required for vaccination, oral cholera vaccine (OCV) is now recommended for use in the high risk areas (“hotspots”) for cholera. Hence, the aim of this study was to identify areas with an increased risk of cholera in Zambia. Retrospective cholera case data from 2008 to 2017 was obtained from the Ministry of Health, Department of Public Health and Disease Surveillance. The Zambian Central Statistical Office provided district-level population data, socioeconomic and water, sanitation and hygiene (WaSH) indicators. To identify districts at high risk, we performed a discrete Poisson-based space-time scan statistic to account for variations in cholera risk across both space and time over a 10-year study period. A zero-inflated negative binomial regression model was employed to identify the district level risk factors for cholera. The risk map was generated by classifying the relative risk of cholera in each district, as obtained from the space-scan test statistic. In total, 34,950 cases of cholera were reported in Zambia between 2008 and 2017. Cholera cases varied spatially by year. During the study period, Lusaka District had the highest burden of cholera, with 29,080 reported cases. The space-time scan statistic identified 16 districts to be at a significantly higher risk of having cholera. The relative risk of having cholera in these districts was significantly higher and ranged from 1.25 to 78.87 times higher when compared to elsewhere in the country. 
Proximity to waterbodies was the only factor associated with the increased risk for cholera (P<0.05). This study provides a basis for the cholera elimination program in Zambia. Outside Lusaka, the majority of high risk districts identified were near the border with the DRC, Tanzania, Mozambique, and Zimbabwe. This suggests that cholera in Zambia may be linked to movement of people from neighboring areas of cholera endemicity. A collaborative intervention program implemented in concert with neighboring countries could be an effective strategy for elimination of cholera in Zambia, while also reducing rates at a regional level.

Author summary

Zambia has experienced cholera outbreaks since 1977. It is a landlocked country bordered by the DRC and Tanzania to the north, Malawi and Mozambique to the east, and Zimbabwe to the south, all of which experience regular cholera outbreaks. To counter a cholera outbreak in 2016, the Zambian Ministry of Health added cholera vaccination to standard cholera control measures such as providing clean water, improving sanitation, and promoting hygiene. The implementation of these control measures is in line with Zambia’s National Cholera Eliminating Plan (NCEP) by 2025 and is also consistent with guidance by the Global Task Force on Cholera Control’s (GTFCC) global roadmap to end cholera by 2030. In both plans, the identification of high risk areas known as cholera “hotspots” is necessary to prioritize OCV deployment and to identify areas where surveillance systems and WASH need improvement. In this study, we retrospectively analyzed district-level cholera data from 2008 to 2017. Sixteen of 72 districts were identified to have an increased risk of cholera using a geostatistical model. Outside of Lusaka district, which is a primary hotspot, the additional hotspot districts share borders with Zambia’s neighboring countries. To achieve cholera elimination in Zambia by 2025, a regional strategy involving each of the bordering countries will be needed.

Introduction

The global burden of cholera is increasing, with current estimates indicating that 1.3 billion people are at risk in endemic countries, resulting in 2.8 million cases and 91,000 deaths annually [1]. Of these cases, the majority (60%) occur in sub-Saharan Africa. In Zambia, widespread cholera outbreaks have occurred since 1977 [2], predominantly in the capital city of Lusaka [3]. The causes have been attributed to poor access to safe water and sanitation facilities in peri-urban areas of the city [2]. To prevent and control cholera outbreaks, the Zambian government has adopted a multi-sectorial approach that engages relevant ministries and cooperating partners, as was the case for the 2017–18 outbreaks [4]. The intervention program includes provision of adequate safe water, improving sanitation facilities, and vaccinating individuals in Lusaka. Provision of adequate safe water and improving sanitation facilities are long-term measures which tend to be extremely expensive and require infrastructural change, skilled personnel for implementation, and management of the infrastructure [5]. As a short-term measure, vaccination programs will be implemented to control the disease [6,7].

A successful vaccination program requires a well-designed implementation plan. The World Health Organization (WHO) has advised the use of the oral cholera vaccine (OCV) in areas that are deemed at high risk or “hotspots” [8,9]. Since cholera has a spatial expression, understanding the geographical distribution of the disease is important for implementation of an effective intervention strategy [10]. Importantly, spatiotemporal clusters of cholera should be identified, i.e., areas where cholera incidence is significantly higher and occurs more frequently than elsewhere in the country. Identification of these areas provides the needed information to allocate resources to intervention programs targeted to these sites [11].

In the past, the OCV programs in Zambia lacked in-depth understanding of areas of high transmission. The aims of this study were to identify areas with an increased risk of cholera in space and time, and to perform an area-based analysis to understand the driving factors for the risk of cholera in these areas. This knowledge will help in developing an effective intervention strategy for controlling cholera in Zambia.

Materials and methods

The study area

Zambia is a landlocked country in Southern Africa, located between latitudes 8° and 18° south of the equator and longitudes 22° and 34° east, covering a total area of 752,612 km² [12]. The country is bordered by Malawi to the east; Mozambique, Zimbabwe, Botswana and Namibia to the south; Angola to the west; the Democratic Republic of Congo (DRC) to the north; and Tanzania to the north-east. The country is divided into 10 administrative provinces encompassing a population of approximately 16.6 million people as of 2016, with an estimated annual growth rate of 3.0 percent. The country is further divided administratively into 114 districts as of August 2018. For the purposes of this analysis, we restricted the analysis to the 72 districts of the 2010 census.

Cholera data

This analysis included cases per WHO criteria: any patient aged 5 years or more presenting with acute watery diarrhea and severe dehydration where cholera is not known to be occurring, or any patient 2 years or older presenting with acute watery diarrhea where cholera is known to be occurring. A suspected case in which Vibrio cholerae O1 or O139 was isolated from stool is considered a confirmed case [13]. This analysis includes cases reported by year and by district from 2008 through 2017; the data were obtained from the Zambian Ministry of Health Department of Public Health and disease surveillance database [14].

Population and socioeconomic data

District level population and urban/rural proportion of population by district were obtained from the 2010 Census of Population and Housing report compiled by the Zambian Central Statistics Office (CSO) (http://www.mcaz.gov.zm/wp-content/uploads/2014/10/2010-Census-of-Population-Summary-Report.pdf). Additional socioeconomic data such as the percentage of the population living below the poverty index were obtained from the CSO Living Conditions Monitoring Survey 2015 [15].

Water, Sanitation, and Hygiene (WASH) data

Data focused on access to improved sanitation and improved water sources was obtained from the Zambia Demographic and Health Survey 2014 (https://dhsprogram.com/what-we-do/survey/survey-display-406.cfm). The percentage of population using improved water source was defined as the population whose main source of drinking water was piped household water, a public tap or standpipe, tube-well or borehole, protected dug well, protected spring, collected rainwater, or bottled water. The percentage of the population with access to improved sanitation was defined as households with flush toilets, ventilated improved pit latrines, pit latrines with slabs, or composting toilets not shared with other households. The data were presented based on percentages for urban and rural population in the report. For this analysis, the data by district were calculated using district level urban/rural population percentages.
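The district-level calculation described above can be read as a population-weighted average of the urban and rural survey percentages. The following is a minimal sketch under that assumption; the function name and the example numbers are hypothetical and are not taken from the study data.

```python
def district_coverage(urban_share_pct, urban_coverage_pct, rural_coverage_pct):
    """Population-weighted coverage (%) for one district.

    urban_share_pct: percent of the district population living in urban areas.
    urban_coverage_pct, rural_coverage_pct: survey coverage (%) reported for
    urban and rural populations (hypothetical inputs in this sketch).
    """
    if not 0.0 <= urban_share_pct <= 100.0:
        raise ValueError("urban_share_pct must be between 0 and 100")
    urban_weight = urban_share_pct / 100.0
    rural_weight = 1.0 - urban_weight
    return urban_weight * urban_coverage_pct + rural_weight * rural_coverage_pct


# Illustrative values only: a district that is 25% urban.
example = district_coverage(25.0, 80.0, 40.0)
```

Because the survey reports only one urban and one rural figure, two districts with the same urban share receive the same estimate under this weighting, which limits how much the indicator can vary between districts.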

GIS data

The digital maps of Zambia were obtained from The Humanitarian Data Exchange (https://data.humdata.org/dataset/zambia-administrative-boundaries-level-1-provinces-and-level-2-districts-with-census-2010-population), which are shared under a CC BY license (https://data.humdata.org/about/license). Until 2013, Zambia was subdivided into 72 districts. Since we have the data based on those 72 districts, we collapsed the 115 districts into those 72 districts for analysis. We compiled the cholera data and the other data sets in this study per district in the GIS database.

Hotspots identification

We used a spatial scan test [16] to identify spatiotemporal hotspots of cholera from 2008 to 2017 in Zambia. A discrete Poisson-based space-time scan statistic was utilized to account for variations in cholera risk across both space (districts) and time (year) during the 10-year study period. Under the Poisson model, it was assumed that the number of cases for each segment of the study area would be proportional to the population; thus the model compared cases against the underlying population at risk. Because the location and size of the scanning window changed in this process, the model created many distinct windows, and a likelihood ratio was calculated for each. Under the Poisson model, the likelihood function for a specific window is:

$$\lambda = \left(\frac{n}{\mu}\right)^{n} \left(\frac{N-n}{N-\mu}\right)^{N-n} I(n > \mu)$$

where, N is the number of cases in the study area, n is the number of cases within the window, μ is the expected number of cases within the window under the null hypothesis, and I() is an indicator function. The likelihood function was maximized over all windows, identifying the window that constituted the most likely cluster. The most likely cluster (hotspots) is the area that is least likely to have occurred by chance. The likelihood ratio for the window was noted and constituted the maximum likelihood ratio test statistic. Its distribution under the null hypothesis and its corresponding p-value was determined by repeating the same procedure on a large number of random replications of the data set generated under the null hypothesis using a Monte Carlo simulation approach.
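The window score and the Monte Carlo p-value described above can be sketched in a few lines. This is an illustration only, not the SaTScan implementation: the function names are hypothetical, windows with n ≤ μ are given a log ratio of 0 as an assumed way of handling the indicator term on the log scale, and the rank-based p-value is a common convention that the text does not spell out.

```python
import math


def log_likelihood_ratio(n, mu, N):
    """Log of the Poisson scan likelihood ratio for one window.

    n: observed cases inside the window
    mu: expected cases inside the window under the null hypothesis
    N: total cases in the study area
    Windows with n <= mu are not candidate high-risk clusters; this
    sketch scores them 0.0 (an assumption, see the lead-in).
    """
    if not (0 < mu < N) or not (0 <= n <= N):
        raise ValueError("require 0 < mu < N and 0 <= n <= N")
    if n <= mu:
        return 0.0
    inside = n * math.log(n / mu)
    # The outside term vanishes when every case falls inside the window.
    outside = (N - n) * math.log((N - n) / (N - mu)) if n < N else 0.0
    return inside + outside


def monte_carlo_p_value(observed_max, simulated_maxima):
    """Rank-based p-value: share of null replicates at least as extreme."""
    exceed = sum(1 for s in simulated_maxima if s >= observed_max)
    return (exceed + 1) / (len(simulated_maxima) + 1)
```

In a full scan, `log_likelihood_ratio` would be evaluated for every candidate cylinder and the maximum retained; the same maximum would then be recomputed on each simulated data set to build `simulated_maxima`.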

In this study, since we were interested in the space-time scan statistic, the approach uses a cylindrical scanning window with a circular spatial base and height corresponding to time [16]. We set the spatial window to 20% of the population at risk, assuming that a larger spatial window would obscure local details. In contrast, a smaller window would make the cluster individualistic in nature. We set the temporal window to 50% of the study period. We sought to identify the high-risk clusters, i.e. the areas where the interior of the scanning window is at a higher risk than the areas surrounding the window. The completion of the scan results in the identification of districts in which the risk of cholera was higher than the rest of the country during the study period. These high-risk districts represent cholera hotspots.

Statistical analysis of the potential risk factors

Zero-inflated negative binomial (ZINB) model

To examine the potential drivers of cholera in Zambia, we first employed a zero-inflated negative binomial (ZINB) model [17] considering that the model would account for overdispersion and zero-inflation in the data set. The model assumes that our dataset contains two groups: a count regression group and an excess zero group. The count regression model fits the count data and the binary regression model fits the excess zero data. For each observation, with probability p, the response of the “excess zero group” is a 0 count, and with probability 1 − p, the response of the count regression group is governed by a negative binomial with mean count of cases λ. If the response Y (cumulative number of cases over the study period) follows a ZINB distribution, then

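$$P(Y = y) = \begin{cases} p + (1-p)\left(\dfrac{k}{\lambda+k}\right)^{k}, & \text{if } y = 0 \\[2ex] (1-p)\,\dfrac{\Gamma(y+k)}{\Gamma(k)\,\Gamma(y+1)}\left(\dfrac{k}{\lambda+k}\right)^{k}\left(1-\dfrac{k}{\lambda+k}\right)^{y}, & \text{if } y > 0 \end{cases}$$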

where 0 ≤ p ≤ 1, k is the overdispersion parameter and Γ is the gamma function. We therefore modelled the ZINB regression as

  • for count model: log λ = β0 + β1x1 + β2x2

  • for excess zero model: logit(p) = γ0 + γ1z1 + γ2z2

where xi and zi are the variables of interest, and βi and γi are the corresponding regression and zero-inflation coefficients, respectively. β0 and γ0 are the intercepts and logit(p) = log(p/(1 − p)).
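To make the ZINB probability function concrete, the sketch below evaluates it for given values of λ, k and p. It is an illustration of the distribution only, not of the SAS model fit used in the study; the function name and the parameter values in the usage line are hypothetical.

```python
import math


def zinb_pmf(y, lam, k, p):
    """P(Y = y) under a zero-inflated negative binomial distribution.

    lam: mean of the negative binomial count component (> 0)
    k: overdispersion parameter (> 0)
    p: probability of belonging to the excess-zero group (0 <= p <= 1)
    """
    if y < 0 or lam <= 0 or k <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("invalid argument")
    q = k / (lam + k)
    # Negative binomial mass, computed on the log scale for stability.
    log_nb = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
              + k * math.log(q) + y * math.log1p(-q))
    nb = math.exp(log_nb)
    if y == 0:
        # Zeros arise from the excess-zero group or from the count group.
        return p + (1.0 - p) * nb
    return (1.0 - p) * nb


# Hypothetical parameters: probability of observing zero cases.
prob_zero = zinb_pmf(0, lam=2.0, k=1.5, p=0.3)
```

In the regression setting, λ and p would be replaced by exp(β0 + β1x1 + β2x2) and the inverse logit of γ0 + γ1z1 + γ2z2 for each district.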

Spatial dependency test

We employed global Moran’s I to test for spatial dependency of cholera in Zambia. The Moran’s I was calculated as

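$$I = \frac{\sum_{i=1}^{m}\sum_{j=1}^{m} w_{ij}\,(r_i-\bar{r})(r_j-\bar{r})}{\left(\sum_{i=1}^{m}\sum_{j=1}^{m} w_{ij}\right)\sum_{i=1}^{m}(r_i-\bar{r})^{2}/m}$$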

where ri is the rate in region i, rj is the rate in region j, r̄ is the mean rate across the m regions, and wij is a measure of adjacency between regions i and j, defined as 1 if i and j are adjacent and 0 otherwise. When rates in nearby areas are similar, the Moran's I will be large and positive, and when rates in nearby areas are dissimilar the Moran's I will be negative.
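The global Moran's I can be computed directly from a list of rates and a binary adjacency matrix. The sketch below assumes the standard form of the statistic (cross-products of deviations divided by the total weight times the variance); the function name and the four-region example are hypothetical and unrelated to the Zambian data.

```python
def morans_i(rates, weights):
    """Global Moran's I for rates r_i and a binary adjacency matrix w_ij."""
    m = len(rates)
    mean = sum(rates) / m
    dev = [r - mean for r in rates]
    # Numerator: adjacency-weighted cross-products of deviations.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(m) for j in range(m))
    w_total = sum(sum(row) for row in weights)
    variance = sum(d * d for d in dev) / m
    if w_total == 0 or variance == 0:
        raise ValueError("need at least one adjacency and non-constant rates")
    return cross / (w_total * variance)


# Hypothetical example: four regions on a line (1-2-3-4), rates rising steadily.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
example_i = morans_i([1.0, 2.0, 3.0, 4.0], line_weights)
```

With steadily rising rates the neighbors resemble each other and the statistic is positive; alternating high and low rates on the same line give a negative value.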

Spatial regression

It is important to note that spatial data may show spatial dependence in the variables and error terms, as the data collection using spatial units may reflect measurement error. This is because the administrative boundaries do not necessarily reflect the underlying process of disease transmission and the spatial dimension of the socioeconomic characteristics is an important aspect of the phenomenon. Therefore, based on the diagnostic tests of the ordinary least squares (OLS) regression, we further fitted a spatial lag model (SLM) and a spatial error model (SEM) to obtain unbiased estimates of the factors for higher risk of cholera after adjusting for spatial heterogeneity of the outcomes and/or residuals. The SLM is defined as

y = ρwy + βx + ε

where ρ is the spatial lag parameter, and wy is the weighted average of the values of y in the neighborhood of each district.

The SEM is defined as

y = βx + ε, with ε = λwε + ζ

Here, λ is the spatial autoregressive parameter and the error ζ is independently and identically distributed. The SLM assumes that the observations are spatially dependent, whereas the SEM assumes that the residuals are correlated across neighboring districts. Both SLM and SEM are estimated by maximizing the corresponding likelihood functions.
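The spatially lagged term wy in the SLM can be illustrated with a small example. The sketch assumes row-standardized contiguity weights, so that wy is the plain average of each district's neighbors; the source does not state which weighting was used, and the function names and three-district values are hypothetical.

```python
def row_standardize(adjacency):
    """Scale each row of a binary adjacency matrix to sum to 1."""
    standardized = []
    for row in adjacency:
        total = sum(row)
        # Districts without neighbors keep an all-zero row.
        standardized.append([v / total if total else 0.0 for v in row])
    return standardized


def spatial_lag(weights, values):
    """Return wy: for each district, the weighted average of its neighbors."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Hypothetical example: three districts in a row (A-B-C).
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
incidence = [10.0, 20.0, 60.0]
lagged = spatial_lag(row_standardize(adjacency), incidence)
```

Here the middle district's lag is the mean of its two neighbors, while each end district inherits the value of its single neighbor. The SEM applies the same operation to the residuals ε rather than to y.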

Software applications

We used SaTScan (https://www.satscan.org/) for identifying hotspots, GeoDa (https://geodacenter.github.io/) for spatial analysis, SAS 9.4 for analyzing the data using the ZINB model, and ArcMap Desktop 10.6 (Esri Inc.) for mapping of the hotspots.

Ethics

The study used secondary data aggregated at the district level, and the data analyzed were anonymized. The Ministry of Health, Zambia gave permission to access the data from the Department of Public Health and disease surveillance database. Therefore, no ethical approval was required for conducting this study.

Results

In total, 34,950 cases of cholera were reported in Zambia between 2008 and 2017. The highest number of cases, 17,348, spanning 33 districts (almost half of the country), was reported in 2010. However, 89% of the cholera cases were reported from the Lusaka district in both 2009 and 2010. The lowest number of cholera cases (31 cases), reported from 9 districts, occurred in 2015 (Fig 1).

Fig 1. Distribution of cholera cases, by WHO case definition, by year, 2008–2017.

Note: No. of cholera affected districts are recorded on the top of bars.

Cholera cases also varied spatially by year (Fig 2). Starting with only a few districts affected near the DRC in 2008, cholera spread to a larger area from 2009–2012. Subsequently, the number of cases decreased from 2013–2017. Throughout the study period, Lusaka District had the highest burden of cholera, with 29,080 total reported cases.

Fig 2. Cases of cholera by district and by year, 2008–2017.

The spatiotemporal analysis, using the district centers as the geographic coordinates, yielded 16 high-risk clusters. Cholera hotspots were defined based on the location IDs provided by SaTScan for the identified clusters; 16 districts were found to be at a significantly higher risk of having cholera. The risk of having cholera in these districts ranged from 1.25 to 78.87 times that elsewhere in the country (Fig 3).

Fig 3. Spatiotemporal hotspots of cholera in Zambia, 2008–2017.

About 4.7 million people (36% of the total population) live in these districts (Table 1).

Table 1. Number of districts and population by risk group in Zambia.

| Risk group | Relative risk | Number of districts | Population | Percent of total population |
| --- | --- | --- | --- | --- |
| Extremely high | 10.01+ | 1 | 1,747,152 | 13.34 |
| High | 5.01–10.00 | 4 | 469,974 | 3.59 |
| Medium | 2.01–5.00 | 5 | 839,995 | 6.42 |
| Low | 1.25–2.00 | 6 | 1,633,000 | 12.47 |
| Total | | 16 | 4,690,121 | 35.82 |
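The risk groups in Table 1 amount to a simple banding of the relative risk reported for each hotspot district. A minimal sketch of that banding follows; the function name is hypothetical, and note that in the study hotspot status comes from the statistical significance of the scan result, not from a relative risk threshold alone.

```python
def risk_group(relative_risk):
    """Map a hotspot district's relative risk to its Table 1 risk group."""
    if relative_risk > 10.0:
        return "Extremely high"
    if relative_risk > 5.0:
        return "High"
    if relative_risk > 2.0:
        return "Medium"
    if relative_risk >= 1.25:
        return "Low"
    # Below the lowest tabulated band: not classified in this sketch.
    return None
```

The cut points follow the bands as printed (for example, 5.00 falls in "Medium" and 5.01 in "High"), assuming relative risks are reported to two decimal places.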

We noted that although cholera occurred during several years in some districts, most were not identified as being significantly high-risk areas of cholera in the spatiotemporal analysis. For instance, Chongwe district experienced cholera 9 out of 10 years, but was not a hotspot in this model. Similarly, Choma district was affected 8 times, but not determined to be a hotspot. This is because there were fewer than 100 cases in these two districts during the study period, thus the relative risk is too low to be defined as a hotspot.

It is also notable that cholera did not affect entire districts but rather only some parts. For instance, 3 of the 33 wards in Lusaka district reported over 50 cases of cholera per 100,000 population in three years (2016–2018) (Fig 4). These three wards are home to 322,198 people, compared to 1,747,152 people in all of Lusaka district. This indicates that only 18% of the population in the district were at a higher risk for cholera. It is important to note that despite Kabulonga ward being a low density area, this ward includes Bauleni compound, which is a highly populated area that experienced recurrent cholera outbreaks; hence the ward was determined to be a hotspot.

Fig 4. Cholera cases in Lusaka by ward, 2016–2018.

The descriptive statistics of the variables included in the risk factor analysis are presented in Table 2.

Table 2. Descriptive statistics of the study variables (n = 72 districts).

| Variable | Mean | Median | Standard deviation | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- |
| Total population | 181,843 | 135,825 | 208,429 | 24,304 | 1,747,152 |
| Total number of cholera cases 2008–2017 | 486 | 20 | 3,397 | 0 | 29,080 |
| Population living in the urban area (%) | 25.47 | 13.50 | 28.45 | 2.02 | 100.00 |
| Households having access to improved sanitation (%) | 37.73 | 31.75 | 14.22 | 26.01 | 75.00 |
| Households having access to improved water source (%) | 57.52 | 52.81 | 12.94 | 21.51 | 90.00 |
| Households living under poverty (%) | 46.32 | 51.67 | 12.72 | 13.00 | 56.80 |
| Distance from the center of district to the nearest waterbody (km) | 26.33 | 21.41 | 12.81 | 13.00 | 56.80 |

After adjustment for the incidence rate of first-order neighboring districts, the ZINB model determined that a shorter “distance from the center of the district to the nearest waterbody” was associated with increased risk of cholera in Zambia (P < .05) (Table 3). Since no other variables showed an association with risk for cholera, we did not conduct a multivariable analysis.

Table 3. Results of the analysis using the zero-inflated negative binomial (ZINB) model.

| Variables | Estimate | Wald 95% CI | P-value |
| --- | --- | --- | --- |
| Percent of population living in the urban area | 0.0028 | -0.0145 to 0.0201 | 0.75 |
| Percent of households having access to improved sanitation | 0.0056 | -0.0289 to 0.0401 | 0.75 |
| Percent of households having access to improved water source | 0.0213 | -0.0310 to 0.0525 | 0.26 |
| Percent of households living under poverty | -0.0062 | -0.0449 to 0.0324 | 0.75 |
| Distance from center of the district to the nearest waterbodies | -0.0170 | -0.0338 to -0.0003 | 0.045 |

Note: Each variable was entered in the model in combination with the neighborhood incidence rate to adjust for the spatial structure of the disease. Only the negative binomial component of the model is provided, since the zero inflated component of the model did not converge.

The estimated Moran’s I statistic showed a marginally significant spatial dependence (p = 0.09) in cholera incidence (Fig 5), suggesting that spatial regression should be explored.

Fig 5. Moran’s I statistic and associated p-values based on 999 permutations.

Diagnostic tests for the OLS models were conducted, and the value of the Lagrange Multiplier was not statistically significant for either the lag model (value: 0.52, p = 0.47) or the error model (value: 1.86, p = 0.17). However, the Robust Lagrange Multiplier was statistically significant for both the lag (value: 7.04, p = 0.01) and error models (value: 8.38, p = 0.003), indicating that both the SLM and the SEM should be fitted. The results from OLS, SLM and SEM describing the effect of the factors on the outcome are presented in Table 4. Based on the model diagnostics and a comparison of the Akaike Information Criterion (AIC) (the lower the better) and R-square (the higher the better), the SEM was found to be the best-fit model for the data. This was also supported by the lag coefficient (lambda = 0.40) and its associated p-value (0.0003) of the SEM. However, none of the variables included in the spatial regression models were found to be associated with cholera incidence in the country. Importantly, we had a very high multicollinearity condition number (61136) in this analysis, indicative of highly correlated independent variables.

Table 4. Results of the different regression analyses.

| Variables | OLS | SLM | SEM |
| --- | --- | --- | --- |
| Population living in the urban area | -20.51 (0.63) | -21.84 (0.60) | -35.49 (0.33) |
| Households having access to improved sanitation | 32.41 (0.67) | 28.72 (0.69) | 38.74 (0.55) |
| Households having access to improved water source | 0.016 (0.82) | 0.0136 (0.84) | 0.0084 (0.89) |
| Households living under poverty | -9.66 (0.90) | -16.77 (0.82) | -36.12 (0.59) |
| Distance from waterbodies | -0.00057 (0.96) | -0.0014 (0.90) | -0.0067 (0.53) |
| Multicollinearity condition number | 61136 | - | - |
| Lag coefficient | - | 0.18 (0.23) | 0.40 (0.0003) |
| Akaike Information Criterion (AIC) | 318.371 | 319.562 | 314.778 |
| R-square | 0.0906 | 0.1074 | 0.1691 |

OLS = Ordinary Least Squares regression, SLM = Spatial Lag regression Model, SEM = Spatial Error regression Model

Discussion

The results of our study identified 16 districts of Zambia as hotspots, but at varying levels of risk. Five of these districts had a relative risk >5. A major hotspot identified was the city of Lusaka; 89% of the cases in this analysis were reported from Lusaka. Interestingly, only 3 of 33 wards in Lusaka district were identified as high risk areas. The sub-district (constituency) level data analysis of Lusaka found that only three of seven constituencies, with about 20% of the population (600,000 people), experienced high rates of cholera. This suggests that control efforts should focus on these constituencies. Lusaka has several densely populated peri-urban settlement areas with inadequate water and sanitation infrastructure, which compromises sanitation and hygiene, facilitating cholera transmission [3]. Lusaka has experienced prolonged rainfall that results in flooding, and this likely further increases cholera risk [2]. The city is also a center for international cross-border trade, with many people traveling between countries, which may also increase the risk of transmission to the city.

The hotspot areas outside Lusaka were near the borders with DRC, Tanzania, Mozambique, and Zimbabwe. Given the geographical distance between these border area hotspots, control efforts will be challenging. More localized analysis may likely reveal that the hotspot areas in these districts only encompass a select few wards, as was the case in Lusaka. The hotspots near the borders suggest that cholera in Zambia is linked to cross-border movement between countries where cholera is also endemic. This was observed in Uganda [18]. Further, this suggests that a collaborative intervention program with the neighboring countries could be an effective strategy to eliminate cholera in Zambia and a step toward reduction and elimination in the region.

The peak of cholera in Zambia was observed between 2009 and 2010, at the beginning of the study period. Subsequently, the number of cases declined. Some have hypothesized that the large number of cases in Lusaka in 2009 and 2010 might be due to prolonged rainfall and flooding (https://www.who.int/cholera/countries/ZambiaCountryProfile2011.pdf); however, this time period was also a peak time for cholera in other African countries, suggesting that the factors responsible for the high numbers may have occurred more generally in Africa. As depicted in Fig 2, cholera occurred sporadically in Zambia; thus, it was difficult to identify any district level predictors for the increased risk of cholera in Zambia.

Interestingly, Chiengi and Mpulungu districts were identified as the areas of highest risk after Lusaka; however, neighboring Kaputa reported only 2 cases over the 10-year period. Of the highest risk districts after Lusaka (Chiengi, Mpulungu, and Sinazongwe), none reported cases in more than 5 out of the 10 years studied. In contrast, Solwezi district, a hotspot of lesser risk, reported cases in 8 of the 10 years. Since we did not identify any risk factors for cholera from our district level data except distance to the waterbodies, it is difficult to determine the underlying cause(s) of cholera. Note that increasing risk among people living proximate to water bodies has already been documented in a number of studies [18–20]. Since the best fit model in our study, i.e. SEM, explained only 16% of the total variations in the outcome, it is reasonable to believe that other risk factor(s) play a role at a spatial scale in Zambia.

Since cholera is transmitted by the fecal-oral route through water, one expects a relationship between cholera and WaSH conditions [21–24]. It is assumed that a well-managed, improved WaSH infrastructure, as has proven effective in industrialized countries, would be an optimal strategy for the cholera elimination program in Zambia. This study did not find an association between cholera and WaSH at the district level; the data available may not have been sufficiently discriminating to find such an association. With the limitation of resources for major improvements in infrastructure, household level WaSH programs have been conducted in the past and might be effective [25,26]. While large-scale WASH interventions may ultimately eliminate cholera, cholera vaccination can be used in the interim as an effective control measure targeting the identified hotspots.

This study has limitations. First, by conducting the analysis based on the 72 districts with the available census population data, rather than the current 115 districts, there was a loss of spatial resolution in the identification of hotspots. Secondly, we did not include acute watery diarrhea cases in children under 5 years old from areas where cholera was not known to have occurred, or in children under 2 years old from areas where cholera was known to have occurred. Cholera can occur in these age groups, and the decision not to include them may have led to an underestimation of cholera cases. Thirdly, the data were obtained from a routine surveillance system, and there could be differences in the reporting of cases from different parts of the country, leading to reporting bias. The data on water and sanitation (WASH) were available only by urban and rural areas; thus there was a limitation in the ability to calculate the association of cholera with WASH. Also, the WASH data were only for a single time point, precluding time-series analysis with the data. Further, there was limited risk factor data available at the district level that could be used in our analysis; thus, we were unable to identify key risk factors for cholera or to predict future outbreaks.

While accepting these limitations, this analysis has identified the districts with elevated risk of cholera, which will facilitate the selection of sites for more intensive control strategies. Within the highest risk districts identified in our analysis, further investigation and data collection at the sub-district level are needed to identify the specific areas to be targeted for the interventions. This would facilitate the planning of interventions in the highest risk wards in a more cost-effective manner. For this, local participation and knowledge are needed to identify data and refine the analyses to highlight ward level high-risk areas within the districts. In the future, if very sensitive and specific surveillance methods allow for real-time case detection with GIS coordinates of cases, improved maps can be created, allowing for even better targeting of interventions.

The WHO has announced a program, in partnership with priority countries, to eliminate cholera by 2030. Zambia hopes to achieve this goal by 2025, and interventions based on the identified hotspots should assist in this effort.

Supporting information

S1 STROBE Checklist

(DOCX)

S1 Data

(CSV)

Acknowledgments

We are grateful to the staff of the Ministry of Health, Zambia for allowing us to access data from their surveillance system database. We also acknowledge the support rendered during data collection by colleagues at the Centre for Infectious Diseases Research in Zambia, Enteric Diseases and Vaccine Research Unit.

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

Funding support for data analysis and report preparation was provided by Bill and Melinda Gates Foundation through the Delivering Oral Vaccine Effectively (DOVE) project, administered by the Johns Hopkins Bloomberg School of Public Health (OPP1148763). The funder of the study had no role in study design, data collection, analysis, or interpretation, or writing of the report.

References

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0008227.r001

Decision Letter 0

Hélène Carabin, Adam Akullian

2 Dec 2019

Dear Dr. Ali:

Thank you very much for submitting your manuscript "Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017" (#PNTD-D-19-00964) for review by PLOS Neglected Tropical Diseases. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. These issues must be addressed before we would be willing to consider a revised version of your study. We cannot, of course, promise publication at that time.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

When you are ready to resubmit, please be prepared to upload the following:

(1) A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

(2) Two versions of the manuscript: one with either highlights or tracked changes denoting where the text has been changed (uploaded as a "Revised Article with Changes Highlighted" file); the other a clean version (uploaded as the article file).

(3) If available, a striking still image (a new image if one is available or an existing one from within your manuscript). If your manuscript is accepted for publication, this image may be featured on our website. Images should ideally be high resolution, eye-catching, single panel images; where one is available, please use 'add file' at the time of resubmission and select 'striking image' as the file type.

Please provide a short caption, including credits, uploaded as a separate "Other" file. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License at http://journals.plos.org/plosntds/s/content-license (NOTE: we cannot publish copyrighted images).

(4) If applicable, we encourage you to add a list of accession numbers/ID numbers for genes and proteins mentioned in the text (these should be listed as a paragraph at the end of the manuscript). You can supply accession numbers for any database, so long as the database is publicly accessible and stable. Examples include LocusLink and SwissProt.

(5) To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/plosntds/s/submission-guidelines#loc-methods

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

We hope to receive your revised manuscript by Jan 31 2020 11:59PM. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by replying to this email.

To submit a revision, go to https://www.editorialmanager.com/pntd/ and log in as an Author. You will see a menu item called Submission Needing Revision. You will find your submission record there.

Sincerely,

Adam Akullian, Ph.D.

Associate Editor

PLOS Neglected Tropical Diseases

Hélène Carabin

Deputy Editor

PLOS Neglected Tropical Diseases

***********************

Please also address the following re: case definition on lines 105-109. The case definition is unclear. Please describe the inclusion / exclusion criteria for cases, including age-restrictions and whether cases needed to be culture confirmed or just "suspected." Also, it appears that patients under 5 were not included in the study. Please explain and consider adding this as a limitation. The word, "should" on line 108 suggests the case definition may have not been followed. Please comment.

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analysis used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: This is a straightforward study with clearly defined aims and appropriate statistical methodologies employed.

The cholera data section does not explicitly state the spatial resolution of the confirmed/suspected cases until the ethical section. Would it be possible to combine these sections or have the ethical section come immediately after? The cholera data feels lacking in detail about the spatial resolution of the data otherwise.

It would also be helpful if the section on population factors (lines 111-115) described the variables used in more detail rather than requiring the reader to wait for the table in the results section.

Reviewer #2: The objectives of the study were clearly stated; however, the data sources and statistical analysis are not well described.

Are the cholera data used in this study monthly or yearly and at district level? Please explicitly state the level(s) at which the data were collected.

The authors provided a generic description of the Zero-Inflated Negative Binomial model but did not explicitly relate the model to the current study. They mentioned that “y” is the independent variable. They should clearly state whether it is the count of cholera cases in each district at a particular time (yearly?). In formulating the model, subscripts should be used to allow the reader to follow the methods easily.
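
For illustration only, one common way to write a zero-inflated negative binomial model with the kind of subscripts requested here is sketched below. This is not the authors' formulation. All symbols are assumptions introduced for this sketch: i indexes districts, y_i is the cholera case count in district i, N_i is the district population used as an offset, x_i and z_i are covariate vectors, pi_i is the probability of a structural zero, mu_i is the mean of the count component, and alpha is the dispersion parameter.

```latex
\begin{align}
  P(Y_i = 0) &= \pi_i + (1 - \pi_i)\,(1 + \alpha\mu_i)^{-1/\alpha}, \\
  P(Y_i = y) &= (1 - \pi_i)\,
     \frac{\Gamma(y + 1/\alpha)}{\Gamma(y + 1)\,\Gamma(1/\alpha)}
     \left(\frac{1}{1 + \alpha\mu_i}\right)^{1/\alpha}
     \left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y},
     \qquad y = 1, 2, \dots, \\
  \log \mu_i &= \log N_i + \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
  \operatorname{logit}(\pi_i) &= \mathbf{z}_i^{\top}\boldsymbol{\gamma}.
\end{align}
```

If the counts were modeled per district and year rather than aggregated, the same expressions would carry a second subscript t (y_{it}, mu_{it}, pi_{it}).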

Line 137: Provide reference for SaTScan.

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: The analysis of results is equally straightforward as the methods, although I have a few questions:

It is unclear what the numbers on the bars represent in Figure 1, as they do not seem to match the numbers mentioned in the preceding text.

The peak in 2009/2010 is striking. Do the authors have access to enough temporally refined data to study why this was the case, in terms of environmental/socio-economic factors? It would be helpful at least for the authors to discuss potential underlying reasons for this peak and then relative decline.

This leads me to also wonder if the hotspots found might differ if the authors split the data into, say, two groups - pre- and post-2010 and then re-ran their analysis on these two different time periods.

Figure 5 is of somewhat low resolution/quality.

Reviewer #2: The annotation in Figure 1 is not clear. What does the number on each bar represent? It does not correspond to the numbers on the y axis.

The maps in Figure 2 are too small. Change the format to 2 maps per row (5 by 2).

Table 2: Why is N=72? Since your study is spatiotemporal, you should have more data points than 72 districts times time. Were the 2008-2017 data sets aggregated for ZINB model?
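
The difference between the two data layouts behind this question can be sketched with synthetic records. The district labels and zero counts below are hypothetical placeholders, not study data; only the figures of 72 districts and the 2008 to 2017 study period come from the text. The sketch shows that a district-year layout yields 720 observations, while summing over years leaves one observation per district. It does not establish which layout the authors actually used.

```python
from itertools import product

# Hypothetical stand-ins: 72 district identifiers and the 10 study years.
districts = [f"district_{i:02d}" for i in range(1, 73)]
years = list(range(2008, 2018))

# District-year layout: one record per district per year (synthetic zero counts).
panel = [
    {"district": d, "year": y, "cases": 0}
    for d, y in product(districts, years)
]

# Aggregated layout: sum counts over years so each district has one record.
aggregated = {}
for row in panel:
    aggregated[row["district"]] = aggregated.get(row["district"], 0) + row["cases"]

n_panel = len(panel)
n_aggregated = len(aggregated)
print(n_panel, n_aggregated)
```

A regression table reporting N=72 would be consistent with the aggregated layout, which is what the reviewer asks the authors to confirm.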

On line 261, the authors said that “Distance from waterbodies” was the only significant variable. I would be cautious in reporting this as such; the confidence interval in Table 3 does not support this claim.

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Conclusions and limitations are clearly presented and supported by the study, so I have no major comments here.

Reviewer #2: The limitations should also include the use of a single-year survey, ZDHS 2014, to extract the WASH data when the study period is 2008-2017.

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: Some editorial modifications are needed throughout, where words are missing for example.

Reviewer #2: The manuscript "Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017" will require full language edits. There are grammatical and orthographic errors here and there.

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: As per my comments above, I find this to be a straightforward study which is of use for Zambian public health efforts to reduce the burden of cholera. My suggestions have been mainly to explore other ways of dividing the data to see if conclusions remain the same. It would be very interesting to see future work include data from the surrounding countries and look into human movement patterns across and within national boundaries to see how these are affecting the persistence of cholera in the region as a whole.

Reviewer #2: My major concern with this manuscript is the statistical methods used. Rather than using a range of regression models, a multilevel model (ZINB or ZIP with random effects) will suffice. The spatial heterogeneity of the cholera outbreaks will be captured by the random effects. Alternatively, a spatial ZINB or ZIP can be used with spatially correlated errors.

See for example, Loquiha O, Hens N, Chavane L, Temmerman M, Osman N, Faes C, Aerts M. Mapping maternal mortality rate via spatial zero-inflated models for count data: A case study of facility-based maternal deaths from Mozambique. PloS one. 2018 Nov 9;13(11):e0202186.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0008227.r003

Decision Letter 1

Hélène Carabin, Adam Akullian

17 Feb 2020

Dear Dr. Ali,

Thank you very much for submitting your manuscript "Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.  

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. 

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Adam Akullian, Ph.D.

Associate Editor

PLOS Neglected Tropical Diseases

Hélène Carabin

Deputy Editor

PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analysis used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: The objectives of the study are clearly stated and straightforward, and the types of statistical analyses are appropriate. However, I have a few questions and concerns about the data and study design:

Cholera data: What effect might the inclusion of suspected cases have on the analysis, particularly if the goal of the study is to identify vaccine targets rather than zones where sanitation improvements are needed? Are there not other water-borne pathogens in the country?

It also isn’t clear why cases were included for patients 5 years or older from areas where cholera was not known to have occurred. What is the justification behind this decision?

Population and socioeconomic data: In the introduction, the authors state that it is particularly peri-urban areas that are at highest risk for cholera transmission in Zambia. Would it be possible to further refine urban/rural classifications as such? It seems it would be important to identify and target not only those districts which are at risk, but specifically the peri-urban zones within these districts.

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: Figure 1 – Does this figure represent confirmed cases alone, or both confirmed and suspected? It would be good to distinguish between the two in the figure. Furthermore, it seems as though cholera rates have declined overall since the peak in 2010. Can the authors comment on the reasons underlying this pattern? Given this pattern, it would be interesting to see analyses that focus on the data from recent years and compare it to the distribution of previous years. Have the patterns changed, or do they remain the same as for the peak years? If they have changed, what are the implications for vaccine targeting?

Figure 2 – There appears to be a colour in the map that is not represented in the legend (green).

Results – paragraph beginning Line 263: It is interesting to see that some wards were not hotspots within districts, and is pertinent to my comment about identifying peri-urban areas above. Were these wards peri-urban?

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Conclusions are well supported by the data, and limitations are addressed, albeit briefly. There is clear public health relevance of this study, but maps showing specific wards that should be targeted for vaccines would make this even clearer, especially if the goal is specifically to focus limited vaccine resources.

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: There are minor edits needed throughout for missing words and clarity.

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: This paper is of public health interest for Zambia, and has implications for global cholera control as well. I believe it will be even stronger if analyses are split by time period (distinguishing recent years), and if greater attention is paid to ward-level outputs and peri-urban status.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/plosntds/s/submission-guidelines#loc-materials-and-methods

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0008227.r005

Decision Letter 2

Hélène Carabin, Adam Akullian

17 Mar 2020

Dear Dr. Ali,

We are pleased to inform you that your manuscript 'Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017' has been provisionally accepted for publication in PLOS Neglected Tropical Diseases.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Adam Akullian, Ph.D.

Associate Editor

PLOS Neglected Tropical Diseases

Hélène Carabin

Deputy Editor

PLOS Neglected Tropical Diseases

***********************************************************

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0008227.r006

Acceptance letter

Hélène Carabin, Adam Akullian

27 Mar 2020

Dear Dr. Ali,

We are delighted to inform you that your manuscript, "Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017," has been formally accepted for publication in PLOS Neglected Tropical Diseases.

We have now passed your article onto the PLOS Production Department who will complete the rest of the publication process. All authors will receive a confirmation email upon publication.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or type-setting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Editorial, Viewpoint, Symposium, Review, etc...) are generated on a different schedule and may not be made available as quickly.

Soon after your final files are uploaded, the early version of your manuscript will be published online unless you opted out of this process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Serap Aksoy

Editor-in-Chief

PLOS Neglected Tropical Diseases

Shaden Kamhawi

Editor-in-Chief

PLOS Neglected Tropical Diseases

Associated Data

    Attachment

    Submitted filename: Responses to the reviewers comments.docx

    Attachment

    Submitted filename: Responses to the Reviewers comments_R2.docx


    Articles from PLoS Neglected Tropical Diseases are provided here courtesy of PLOS
