Skip to main content
PLOS Global Public Health logoLink to PLOS Global Public Health
. 2023 Dec 8;3(12):e0002260. doi: 10.1371/journal.pgph.0002260

Understanding the fine-scale heterogeneity and spatial drivers of malaria transmission in Kenya using model-based geostatistical methods

Donnie Mategula 1,2,3,*,#, Judy Gichuki 4,5,#
Editor: Collins Otieno Asweto6
PMCID: PMC10707679  PMID: 38064427

Abstract

Malaria remains a public health concern. Monitoring the fine-scale heterogeneity of the malaria burden enables more targeted control efforts. Although malaria indicator surveys (MIS) have been crucial in evaluating the progress of malaria control interventions, they are only designed to provide a cross-sectional national and regional malaria disease burden. Recent advances in geostatistical methods allow us to interpolate national survey data to describe subnational disease burden that is crucial in informing targeted control. A binomial geostatistical model employing Markov chain Monte Carlo (MCMC) parameter estimation methods is used to understand the spatial drivers of malaria risk in Kenya and to predict malaria risk at a fine-scale resolution, including identifying hotspots. A total of 11,549 children aged six months to 14 years from 207 clusters were sampled in this survey and used in the present analysis. The national malaria prevalence based on the data was 8.4%, with the highest in the lake endemic zone (18.1%) and the lowest in the low-risk zone (<1%). The analysis shows that elevation, proportion of insectcide treated net (ITN) distributed, rainfall, temperature and urbanization covariates are all significant predictors of malaria transmission. The 5x5 Km resolution maps show that malaria is heterogeneous in Kenya, with hotspot areas in the lake endemic area, the coastal areas, and some parts of the shores of Lake Turkana and Kajiado. The high-resolution malaria prevalence maps produced as part of the analysis have shown that Kenya has additional malaria hotspots, especially in areas least expected. These findings call for a rethinking of malaria burden classification in some regions for effective planning, implementation, resource mobilization, monitoring, and evaluation of malaria interventions in the country.

Background

Malaria remains a public health concern and continues to be one of the most important tropical diseases affecting human populations to date [1]. In 2020, an estimated 241 million malaria cases occurred worldwide, 90% of which were in sub-Saharan Africa [2]. The disease is caused by protozoa of the genus Plasmodium of which five known species, Plasmodium falciparum, P. vivax, P. ovale, and P. malariae, more recently, P. knowlesi, are responsible for human Infection [3,4]. The vector responsible for human transmission is the female anopheles mosquito. In Sub-Saharan Africa, malaria is one of the leading causes of morbidity and mortality, especially in children under five. Other high-risk groups include pregnant women and immunologically naïve persons like travellers coming from non-endemic places [5].

Malaria remains a significant public health problem in Kenya, accounting for an estimated 13% to 15% of outpatient cases, with nearly 70% of the population at risk for malaria [6]. Four out of the five species of malaria parasite that cause human Infection are present in Kenya, but the Plasmodium falciparum parasite is the predominant cause of Infection in the country. Over the past decade, Kenya has substantially scaled up available malaria control tools, such as insecticide-treated bed nets, indoor residual spraying and the use of artemisinin-based combination therapies [6]. Evidence of this massive scale-up of interventions is the observed decline in prevalence. Kenya has experienced a decrease in the national prevalence of malaria among children ages six months to 14 years, from 13 per cent in 2010 to 8 per cent in 2015, and % in 2020 [6].

One of the key objectives of the Kenya Health Policy 2014–2030 is the elimination of communicable diseases, including malaria. This is supported by the Kenya malaria strategy for 2019 to 2023, which sets a vision of a malaria-free Kenya and targets to reduce malaria incidence and mortality by seventy-five per cent by 2023, with 2016 as the baseline year[6].

Malaria transmission in Kenya varies geographically. This could be due to varied climatic conditions, vector and parasite resistance, differences in intervention uptake across populations and other unmeasured factors that are thought to be responsible for this increasing heterogeneity [6,7]. The country is administratively divided into five malaria epidemiological zones based on risk profiles. These zones include highland epidemic-prone areas, lake endemic areas, coast endemic, semi-arid seasonal, and low-risk malaria areas. The endemic areas lie in the lake and coastal regions with altitudes ranging from 0m to 1300 above sea level. These areas have perennial malaria transmission due to rainfall, temperature, humidity and other critical factors that drive malaria transmission. The semi-arid seasonal malaria transmission areas are in the country’s northern, northeastern, and southeastern parts. These areas experience short periods of intense malaria. The highland epidemic-prone areas are located within the western highlands and have seasonal malaria transmission with some yearly variation. The altitude in these zones is relatively higher than the other zones, lying 1500 meters above sea level. The malaria epidemics in the highland epidemic-prone zones are less predictable. Lastly, the low-risk malaria areas cover Nairobi and the central highland. Temperatures are usually too low to allow the completion of the sporogony cycle of the malaria parasite in the vector in the low-risk zones [6].

Malaria indicator surveys (MIS) measure progress on key malaria indicators in Kenya. The country has conducted four MIS in 2007, 2010, 2015 and 2020. The MIS are nationally representative household surveys that provide estimates of national and regional malaria indicators to assist malaria control programs in tracking their progress and evaluating the impact of strategies and interventions. The MIS follow a standard methodology recommended by the Roll Back Malaria Monitoring and Evaluation working group guidelines [8]. Originally, MIS surveys were designed to measure the blanket scale up of interventions like bed nets, using a classic two-stage sample design and coverage indicators as the primary endpoints. Over time, as coverage increased, interest expanded to the impact of parasite prevalence. While it is still a norm that the MIS traditionally measures progress in these areas, the survey methodologies need to consider the underlying disease heterogeneity [8]. Recent advances in statistical analyses, including geostatistics, have made it possible to make fine-scale inferences of malaria transmission from survey data like the MIS, that is not traditionally designed for such extrapolations. In this paper, we use the Kenya 2020 MIS data to achive the following objectives:

  1. understand the relationship between malaria prevalence and several factors including environmental factors and

  2. understand the disease heterogeneity across the country’s surface including identification of hotspots.

We hypothesized that:

  1. There is a significant association between environmental factors (such as temperature, humidity, and rainfall patterns) and malaria prevalence in different regions of Kenya as evidenced from the 2020 Malaria Indicator Survey (MIS) data.

  2. There is substantial heterogeneity in malaria prevalence across different regions of Kenya, with identifiable hotspots of higher prevalence, as can be determined from an analysis of the 2020 MIS data.

  3. The existing methodology of MIS, primarily focused on intervention coverage indicators, might be inadequate in capturing the evolving complexities and heterogeneities of malaria transmission in Kenya, and propose that model based geostatics has the potential to capture this very well.

Methods

Country profile

Kenya is an East African country that covers an area of 582,550 km2. It is bordered by Ethiopia to the north, Tanzania to the south, Uganda to the west, South Sudan to the northwest, and Somalia to the northeast. Approximately 80% of Kenya’s land is arid and semi-arid, only 20% is arable, and only 1.9% of the total surface area is occupied by standing water. The great East African Rift Valley extends from Lake Victoria to Lake Turkana and further southeast to the Indian Ocean [9]. The country has a number of large rivers including the Tana, Galana, Turkwel and Nzoia [10]. Fig 1 below is a map of Kenya showing the five epidemiological zones as defined by the national malaria program [6].

Fig 1. Kenya epidemiological malaria zones.

Fig 1

Data

This secondary analysis used data from the Kenya MIS [6]. Access to the dataset was given to the authors on Apr 28 2022. The datasets were de-identified. The Institutional Review Board(IRB-)approved procedures for Demographic Health Survey(DHS) public-use datasets do not in any way allow respondents, households, or sample communities to be identified. Authors had no access to the names of individuals or household addresses in the data files. Additionally, the geographic identifiers only go down to the regional level, which is hard to identify individuals.

The 2020 MIS, the fourth conducted by the country, followed a similar design and set-up as the former ones. It was conducted during the peak malaria season in November and December 2020. A two-stage stratified sampling design was used, powered to give malaria parasite prevalence estimates and other key malaria indicators at the national level (urban and rural areas) and for the five epidemiological zones. The first stage sampling unit was a cluster developed from enumeration areas (EAs). EAs are the smallest geographical areas created for purposes of census enumeration. The EAs used were based on the 2019 Kenya population census. In the Kenya MIS, a cluster was defined as either an EA or part of an EA. A total of 301 clusters (134 urban and 167 rural) were sampled in this first stage using the probability proportional to size approach. The second stage sampling unit was households. In each cluster, 30 households were selected from a line listing of the sampled clusters using a systematic random sampling approach. A total of 7,952 households were sampled. All women aged 15–49 in the selected households were eligible for individual interviews. They were asked questions about preventing malaria during pregnancy and treating childhood fevers. In addition, the survey included testing for anaemia and malaria among children aged six months to 14 years using a finger- or heel-prick blood sample.

Permission to use the dataset was obtained from The Demographic and Health Surveys (DHS) Program through the archiving office. The original study received ethical clearance from the Kenyatta National Hospital/University of Nairobi Scientific and Ethics Review Committee. All participants provided oral informed consent.

Variables

Outcome variable

In this analysis, the outcome variable was a binary outcome derived from the total number of children tested and the total number testing positive. This was extrapolated to estimate the cluster-level plasmodium falciparum malaria prevalence (PfPR).

Explanatory variables

The explanatory variables included cluster-level factors such as rainfall, temperature, elevation, and urbanization and individual-level characteristics such as gender and age. These variables have been shown to affect malaria transmission [11].

Data cleaning

The data cleaning and analysis were done in R [12]. Maps produced in the analysis were further processed for better visualization in QGIS (Version 3.2). Relevant variables were extracted and renamed to shorter names for ease of coding. Coordinates in the initial dataset were given the longitude and latitude system, which were transformed into the universal coordinate system (UTM). All distances were scaled to kilometres.

Exploratory analysis

The initial exploratory analysis was descriptive to understand the data and to explore the initial relationships between the outcome variable of prevalence with the covariates in the data set. Scatter plots with fitted linear regression lines were used for this step to observe the relationship between prevalence and the explanatory variables. To further understand the variables, correlation plots were used to understand the relationships between the variables to guide the decisions of which covariates to include in the Model. The additional exploratory analysis involved plotting the clusters on the Kenyan map’s surface, showing the sampled cluster’s distribution and the crude malaria prevalence.

Model fitting

The first objective of the analysis was to understand the relationships between malaria prevalence and several factors, including environmental factors. Several steps were followed:

  1. Fitting a generalized linear model

  2. Assessing evidence of residual correlation

  3. Fitting a generalized linear mixed model

  4. Reassessing evidence of residual correlations

  5. Fitting a binominal geostatistical model and parameter estimation

  6. Model validation

The model description for the generalized linear Model and the generalized linear mixed Model are described in S1 File.

Description for the binominal geostatistical model

A detailed description for the model based geostatistics developed by Diggle and Giorgi are described elsewhere [13]

Let Yi denote the number of individuals that test positive for plasmodium falciparum at survey cluster location xi

And that the survey team went to the sampled clusters given by xi and sampled mi: i = 1….n individuals at risk in the cluster and recoded the outcome of every person that tests positive and negative for plasmodium falciparum malaria.

Then standard geostatistical Model assumes that:

YiBinomial(mi,P(xi)

Yi is a Binomial distribution with mi trials and probability of a positive test P(xi) specified in the binomial geostatistical Model below:

log{P(x)1P(x)}=α+d(xi)Tβ+S(x)+Zi

Where α is the intercept parameter, S(x) are the spatial random effects, representing the spatial variation between the sampled clusters. Zi are mutually independent zero-mean Gaussian random variables with variance r and in this analysis represent the spatial variation within cluster variation, measurement error or small-scale spatial variation.

d(xi)T is a vector of observed spatially referenced explanatory variables associated with the response Yi, and β is a vector of spatial regression coefficients for the covariates.

The Matérn correlation function for the stationary Gaussian processes S(x) used in this analysis, a two-parameter family is given by:

p(u,φ,k)=2(k1)(u/φ)kKK+(u/φ)

Where:

  • u denotes the distance between two locations x and x′,

  • φ >0 is a scale parameter that determines the rate at which correlation decays to 0 as the distance increases, and

  • k >0, is a smoothness parameter which determines the analytic smoothness of the underlying process S(x).

In the binomial geostatistical regression for this analysis, the Matérn shape parameter k was set to 0.5 variance parameters τ2 to 0.

The covariates d(xi)T used in the binomial geostatistical Model for prediction were obtained from an exploratory analysis set to understand the relationship of the variables with the outcome variable of malaria prevalence. This Model included the covariates: elevation, ITN usage, mean temperature, rainfall, and cluster urbanization (urban vs rural). The Markov chain Monte Carlo (MCMC) methods were used for parameter estimation in this Model. Confidence intervals of the estimates are calculated on the log scale then transformed back to the non-log scale that is used to report the results.

To test whether there was any evidence against spatial correlation in the data, empirical variogram methods are used. A simulation of 1000 empirical variograms around the fitted Model is ran and these are used to compute 95% confidence intervals at any given spatial distance of the variogram. A conclusion is reached that that there is a spatial correlation in the data if the empirical variogram obtained from the data falls outside the 95% tolerance bandwidth.

The second objective is to understand the disease heterogeneity across the surface of the country, including identification of hotspots and the uncertainty attached to these hot spots. For this purpose, a binomial geostastical model was used as described above but with covariates that were available as raster. These included urbanization, temperature, and precipitation. The target for the predictions was a prevalence of malaria over the 5 x 5 km regular grid surface covering the whole surface of Kenya. A map of malaria prevalence was generated. Uncertainty of the prevalence was addressed using Exceedance Probabilities, an approach that is more relevant to policy makers, than the traditional approach of using confidence intervals. Exceedance Probabilities (EP) method sets policy relevant thresholds. The EP can be formally expressed as:

EP=Probability{,P(xi)>t|data}

where t is the prevalence threshold, set to 10% in this analysis.

Results

A total of 11,549 children aged six months to 14 years were sampled. The analysis used 297 clusters. The number of clusters per transmission zone is shown in Table 1 below. The lake endemic area had the greatest number of clusters (97), while the coastal endemic area had the lowest number of clusters (29). The national malaria prevalence based on the data was 8.4%, with the highest in the lake endemic zone (18.1%) and the lowest in the low-risk zone (<1%).

Table 1. Malaria prevalence in Kenya across five epidemiological zones.

Epidemiological Zone Total
clusters
Number tested(N) Number positive(n) Prevalence
(n/N)%
Coastal Endemic 29 1088 59 4.95
Highland Epidemic Prone 55 2122 33 1.56
Lake Endemic 97 4621 836 17.93
Low Risk 54 1307 1 0.08
Seasonal 56 2210 33 1.49

The map in Fig 2 below shows the sampled locations and the cluster-level malaria prevalence on. The lake endemic area has the highest number of clusters sampled and is also the zone with the highest prevalence estimates at the cluster level.

Fig 2. Map of sampled locations and the malaria prevalence.

Fig 2

The weather pattern varies across the surface of Kenya. The maps in Fig 3 below show the variation in temperature across space on the top, and the variation in annual precipitation for the year 2020 on the bottom.

Fig 3. Mean temperature (degrees Celsius) (top) and Annual precipitation (mm)in Kenya (2020) (Bottom).

Fig 3

Binomial geostatistical model results

The binomial geostatistical Model results indicate that the elevation, proportion of ITN distributed, rainfall, temperature and urbanization covariates are all significant predictors of malaria transmission (Table 2).

Table 2. Binomial geostatistical model.

Estimate Standard error Lower Bound Upper Bound p-value
Intercept 0.00 3.54 0.00 0.00 <0.0001
elevation 1.86 1.05 1.70 2.02 <0.0001
INT coverage 3.15 1.07 2.75 3.61 <0.0001
Urban vs Rural 0.32 1.11 0.26 0.39 <0.0001
Rainfall(mm) 1.91 1.06 1.69 2.15 <0.0001
Temperature 1.37 1.04 1.28 1.47 <0.0001
Age in months (ref <12)
12–23 1.93 1.54 0.83 4.48 <0.001
24–35 2.72 1.52 1.19 6.21 <0.01
36–47 3.56 1.51 1.59 7.96 <0.001
48–59 7.84 1.45 3.76 16.35 <0.001
Female vs male 1.20 1.08 1.03 1.40 <0.001
Sigma^2* 0.56 1.44 0.27 1.14 NA
Phi** 59.57 1.85 17.87 198.50 NA
Tau^2*** 0.95 2.24 0.20 4.58 NA

sigma2 is the variance of the Gaussian process, phi is the scale parameter of the spatial correlation and tau2 is the variance of the nugget effect.

Model validation

The odds of malaria transmission are less in the urban clusters compared to the rural ones. Urban clusters have nearly 68% less malaria prevalence than rural ones (OR 0.32 CI: 0.26–0.39, P value <0.0001). The higher the rainfall, the higher the risk of malaria transmission. Every mm increase in the average rain increases malaria prevalence by 1.9 times (OR 1.91 CI 1.69–2.15, P value <0.0001). Rise in mean temperature also increases the risk of malaria prevalence. Every degree increase in temperature increases the odds of malaria prevalence by 1.4 times (OR 1.37CI 1.28–1.47, P value <0.0001).

Using variogram-based techniques described above, the Model above was tested for evidence of spatial correlation. The results of this process are shown in Fig 4 below. Since the empirical semi-variogram (solid line) falls within the 95% confidence interval (grey envelope), this shows that the Model is valid; the Model for malaria prevalence is compatible with the data.

Fig 4. Model validation.

Fig 4

Prediction

To understand the disease heterogeneity across the country’s surface, including the identification of hotspots, a 5 × 5 km resolution map for malaria prevalence in children six months to 14 years is presented in Fig 5 below. Overall, malaria prevalence is low in most parts of the country. Hotspots were notable in Western Kenya in the lake endemic areas around Lake Victoria, in the endemic coastal regions along the Indian Ocean and three hotspot areas within the seasonal epidemiological zone, one around the Lake Turkana region, one around the humid and sub-humid belt in Meru County and the other in the semi-arid belt of Kajiado County.

Fig 5. Malaria prevalence predictions among children six months- 14 years in Kenya.

Fig 5

Fig 6 presents a map of malaria exceedance and probabilities, showing areas where p(x)≥ 10% with certainty on the colour gradient.

Fig 6. Map of Kenya showing the exceedance probabilities of malaria prevalence exceeding 10%.

Fig 6

Discussion

Understanding the spatial distribution of malaria and the factors that drive its transmission are key in malaria control. Given the heterogeneity of malaria transmission in Kenya, defining the malaria burden at more localized locations is important to allow for targeted control activities. The national malaria indicator surveys performed in the country are not designed to provide malaria prevalence estimates at localized levels. This paper uses Model-Based geostatistical methods to understand malaria transmission drivers in Kenya and map out malaria prevalence at a very high resolution (5 x 5 Km square grid).

In our analysis, we have found that several factors influence malaria transmission. These include gender, age, temperature, rainfall, bed net coverage, elevation, and urbanization. This is consistent with the well-known predictors of malaria transmission. Studies in the same area have previously found a higher risk of malaria among males and an increasing risk with age compared to the first year of life [12]. Both the natural environment and the artificial environment are known to affect malaria transmission. Temperature, humidity, and rainfall all have interactions with mosquitos at specific points in their life cycle. Temperature regulates the development of mosquitos at each stage. The laying of eggs by mosquitos is reduced in temperature extremes, either too cold or too hot [14]. Temperature also affects the mosquito stage transition, with the optimal temperature being between 22 and 26 degrees Celsius[12].

Rainfall has been shown to be positively correlated with high malaria transmission. During the rainy season, there is usually water logging in the ground, creating mosquito breeding grounds. This analysis observed that prevalence nearly doubles for every mm increase in the annual rainfall. The areas observed to have a higher prevalence of malaria in Kenya are known to have prolonged rainy seasons[6].

The analysis also identified that malaria transmission is higher in rural areas compared to urban areas. This finding is consistent with other studies in the same region. Urban areas may have better housing and improved health services that are easier to access. These factors contribute to the lower risk of malaria. Conversely, rural areas are primarily associated with favourable conditions for malaria, including stagnant water, poor housing, inaccessible health services and agricultural activities [15].

The finding of increasing malaria prevalence with higher bed net coverage can be explained through reverse causality, which is often observed due to the higher distribution of bed nets in areas heavily affected by malaria.

Malaria hotspot areas identified in the analysis include the entire lake and coastal regions classified as malaria endemic [6]. This finding is in keeping with other previous analyses done for past time points [16]. The climatic condition in these areas is known to support malaria transmission. We do find additional hotspots, which highlights the strength in our analysis approach. Localized malaria hotspots are identified in the county of Turkana. Though this area is classified as a seasonal malaria transmission zone, a reactive case detection in the area conducted from 2018 to 2019 also detected high malaria transmission with a prevalence as high as 33.6% [17]. Another study in a refugee camp in the same region identified a malaria prevalence of 64.2% [17]. Evidence from a recent study examining the contribution of P. falciparum parasite importation to local malaria transmission in Central Turkana confirms that malaria in the area is rather endemic, with intense local transmission as opposed to the importation of malaria [18]. Due to its malaria risk classification status, Turkana is often left out of malaria control activities. This is an important finding where an area’s transmission is misclassified. The recent WHO malaria surveillance guide calls for countries to view malaria transmission as a continuum in space and stratify the malaria burden for better targeting and improved efficiency of malaria interventions[8]. As malaria transmission declines, it becomes increasingly focal and prone to outbreaks. Understanding and predicting patterns of transmission risk becomes an essential component of an effective elimination campaign, allowing limited resources for control and elimination to be targeted cost-effectively. By concentrating surveillance, monitoring and control efforts in hotspot areas, programs can maximize the impact of their interventions and reduce the burden of malaria more efficiently. In this study, we also find additional hotspots in the counties of Meru and Kajiado, areas with humid and arid weather conditions, respectively. There is a need for more local surveillance in the area. These areas are also characterized by low implementation of malaria control measures. Future research should prioritize understanding the influence of underlying factors and evaluating the effectiveness of various malaria control interventions in these malaria hotspots.

There are several strengths and limitations of the data used in the analysis. To the best of our knowledge, this is the latest nationally representative data on malaria prevalence. With this, the results of this study are generalizable to the entire population of Kenya. Use of the geostatistical Model as opposed to the traditional non-spatial Model, is a key strength. It allows us to borrow information from the sampled cluster to infer for the unsampled ones and at the same time, account for predictors that influence malaria transmission. The major limitation in the analysis is the lack of adequate environmental covariates to improve spatial predictions.

Conclusion

The objectives of this analysis were to understand the relationship between malaria prevalence and various predictor factors and examine the disease heterogeneity, including identifying hotspots. Our findings show that rainfall, urbanization, temperature, and bed net coverage are important factors that affect malaria transmission. The high-resolution malaria prevalence maps produced as part of the analysis are important in identifying hotspots, an essential element in planning, implementation, resource mobilization, monitoring, and evaluation of malaria interventions in the country. We have also identified malaria hotspots in areas not traditionally classified as endemic, highlighting the need to rethink the classification of malaria transmission epidemiology in Kenya.

Supporting information

S1 File. The model description for the generalized linear model and the generalized linear mixed model.

(DOCX)

Acknowledgments

Many thanks to the DHS programme team for allowing the authors to use the dataset and to the participants that provided the data.

Data Availability

The datasets used and/or analyzed during the current study are available upon request from the Demographic Health Survey Programme Team. The request can be made via the link https://dhsprogram.com/Data/.

Funding Statement

The authors received no specific funding for this work.

References

PLOS Glob Public Health. doi: 10.1371/journal.pgph.0002260.r001

Decision Letter 0

Collins Otieno Asweto

21 Aug 2023

PGPH-D-23-01337

Understanding the fine-scale heterogeneity and spatial drivers of malaria transmission using model-based geostatistical methods in Kenya

PLOS Global Public Health

Dear Mategula,

Thank you for submitting your manuscript to PLOS Global Public Health. After careful consideration, we feel that it has merit but does not fully meet PLOS Global Public Health’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by 21/9/2023. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at globalpubhealth@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pgph/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Collins Otieno Asweto, PhD

Academic Editor

PLOS Global Public Health

Journal Requirements:

1. Please provide separate figure files in .tif or .eps format.

For more information about figure files please see our guidelines:  LINK

https://journals.plos.org/globalpublichealth/s/figures 

https://journals.plos.org/globalpublichealth/s/figures#loc-file-requirements 

1. Please provide separate figure files in .tif or .eps format.

For more information about figure files please see our guidelines:  LINK

https://journals.plos.org/globalpublichealth/s/figures 

https://journals.plos.org/globalpublichealth/s/figures#loc-file-requirements 

2. Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email or email the journal office and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. 

Potential Copyright Issues:

Figure 1-6: please (a) provide a direct link to the base layer of the map (i.e., the country or region border shape) and ensure this is also included in the figure legend; and (b) provide a link to the terms of use / license information for the base layer image or shapefile. We cannot publish proprietary or copyrighted maps (e.g. Google Maps, Mapquest) and the terms of use for your map base layer must be compatible with our CC-BY 4.0 license. 

Note: if you created the map in a software program like R or ArcGIS, please locate and indicate the source of the basemap shapefile onto which data has been plotted.

If your map was obtained from a copyrighted source please amend the figure so that the base map used is from an openly available source. Alternatively, please provide explicit written permission from the copyright holder granting you the right to publish the material under our CC-BY 4.0 license.

Please note that the following CC BY licenses are compatible with PLOS license: CC BY 4.0, CC BY 2.0 and CC BY 3.0, meanwhile such licenses as CC BY-ND 3.0 and others are not compatible due to additional restrictions. 

If you are unsure whether you can use a map or not, please do reach out and we will be able to help you. The following websites are good examples of where you can source open access or public domain maps: 

* U.S. Geological Survey (USGS) - All maps are in the public domain. (http://www.usgs.gov

* PlaniGlobe - All maps are published under a Creative Commons license so please cite “PlaniGlobe, http://www.planiglobe.com, CC BY 2.0” in the image credit after the caption. (http://www.planiglobe.com/?lang=enl) 

* Natural Earth - All maps are public domain. (http://www.naturalearthdata.com/about/terms-of-use/)

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Global Public Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Global Public Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: First :In the abstract, 'ITN' is used as an abbreviation without its full form explicitly mentioned."

Second: In Introduction

In this context, "spatial drivers" refer to the factors or variables that influence the distribution of malaria cases across geographic space. These influential variables, which play a crucial role in shaping the spatial patterns of malaria risk, will be explicitly outlined in the concluding paragraph of the introduction. Although those that were mentioned subsequently.

Third: in methods section : Even though the IRB and DHS acronyms are generally known, they nonetheless require providing explanations before the abbreviation though mentioned later in another paragraph. It should be mentioned first then should be written abbreviation.

Fourth: In the result Section:

The sentence "Malaria Prevalence in five epidemiological zones" could be revised. The term "epidemiological zones" might not be the most commonly used or recognized term in this context. The terminology should be written which are more consistent with standard usage in epidemiological research and might be more appropriate when referring to different areas with distinct epidemiological characteristics.

Fifth: Furthermore, following should be included in the result section as the study focused on Understanding the fine-scale heterogeneity and spatial drivers of malaria transmission using model-based geostatistical methods

1.Geographic Distribution of Malaria Cases: A map displaying the geographical distribution of malaria cases at a fine-scale resolution. This could involve using different colors or shading to represent varying levels of malaria prevalence across different geographic areas. And also mapping of Risk Zones - high-risk or low-risk zones for malaria transmission, these could be delineated on the map.

2. As the study identifies spatial drivers of malaria transmission, you could overlay environmental factors such as temperature, humidity, and rainfall on the map to show how these factors correlate with the spatial distribution of malaria cases.

Reviewer #2: The paper appears to be the introduction and methods section of a scientific research paper focused on understanding the fine-scale heterogeneity and spatial drivers of malaria transmission in Kenya. The paper aims to use model-based geostatistical methods to analyze malaria prevalence data obtained from the Kenya Malaria Indicator Survey (MIS) and investigate the relationship between malaria transmission and various environmental factors. The research also involves creating high-resolution maps of malaria prevalence to identify hotspots and areas of higher transmission risk. I present my comments in three sections, 1. General comments, 2. Specific comments, 3. Comments on the model. Kindly check and address these comments properly.

General Comments:

1. Clarity and Organization: The introduction provides a comprehensive overview of the malaria situation in Kenya and the significance of the study. Consider breaking down the introduction into smaller sub-sections for better readability.

2. Research Objectives: Clearly state the research objectives and hypotheses in the introduction section. This will help readers understand the purpose of the study more explicitly.

3. Citations: There are many sentences, facts, and statistics that are not properly cited. Lines 56-85, the researchers used the same reference. Please consider using multiple references. Most of the article paragraphs are referenced to the same citation (#6).

4. Figure and maps: It is not clear why the researchers added all of these maps. I would recommend removing them and use a specific map with clear lables and purpose (check comment #9).

5. Methods Explanation: The methods section is detailed and comprehensive. However, consider providing more context on the statistical techniques used, especially for readers who may not be familiar with geostatistical models.

6. Variable Definitions: Provide clear definitions of each explanatory variable and its relevance to malaria transmission in Kenya. This will help readers understand the rationale behind the variables chosen for the analysis.

7. Data Source and Cleaning: Discuss any potential limitations or biases in the data source and data cleaning process. This will help readers understand the reliability of the dataset used.

8. Ethics and Consent: While you mentioned obtaining permission to use the dataset and ethical clearance, provide a bit more detail on the ethical considerations, participant consent (if applicable), and data anonymization.

9. Graphical Representation: Use clear, labeled figures and maps that are easy to interpret. Consider providing legends and titles for each figure to enhance their clarity.

Specific Comments:

10. Objectives Clarification: State the specific objectives of the study at the beginning of the "Methods" section. For example, "The objectives of this study were to...".

11. Model Assumptions: Discuss the assumptions underlying the binomial geostatistical model used in the analysis. Highlight any limitations of the model and potential impact on the results.

12. Results Interpretation: In the results section, elaborate on the practical implications of the odds ratios and parameter estimates. Explain how changes in the explanatory variables translate to changes in malaria transmission risk.

13. Model Validation: Provide a brief explanation of the empirical variogram technique used for model validation. This will help readers understand the statistical approach used to assess spatial correlation.

14. Discussion: In the discussion section, relate your findings to existing literature. Discuss similarities and differences with other studies, particularly those conducted in similar settings.

15. Policy Implications: Elaborate on the policy implications of your findings. How can the identified hotspots guide targeted interventions and resource allocation in malaria control programs?

16. Study Limitations: Acknowledge any limitations of your study, such as the availability of certain covariates or potential biases in the data. Discuss how these limitations might affect the interpretation of your results.

17. Future Directions: Suggest potential avenues for future research based on your findings. This could include exploring the effectiveness of specific interventions in the identified hotspots.

18. Conclusion: Summarize the key findings of the study in the conclusion section, relating them back to the research objectives.

Comments on the model:

19. Consider paraphrasing the title of this paragraph: "Model description for the binominal geostatistical Model". You used the word "Model" twice in the title.

20. Clarify Covariate Inclusion: In the model description, provide a clear rationale for the selection of covariates used in the binomial geostatistical model. Explain why elevation, ITN usage, mean temperature, rainfall, and urbanization were chosen as predictor variables. Additionally, elaborate on any potential interactions or collinearity among these covariates.

21. Matérn Correlation Function Parameters: Provide more detailed explanations for the parameters of the Matérn correlation function used in the spatial component of the model. Explain the significance of the scale parameter () and the smoothness parameter () in the context of malaria transmission. This will help readers understand how these parameters influence spatial correlation and variability.

22. Discussion on Variance Parameters: Elaborate on the interpretation and implications of the variance parameters presented in the model. Explain the significance of ² (variance of the Gaussian process), ² (variance of the nugget effect), and their practical implications for the spatial variation of malaria prevalence.

23. Spatial Random Effects: Clarify the concept of spatial random effects (()) and how they capture spatial variation between the sampled clusters. Provide insight into the factors that contribute to the spatial variability, such as geographic features or environmental conditions.

24. Interpretation of Coefficients: In the model formula, explicitly interpret the coefficients associated with each covariate. For example, provide a clear explanation of the interpretation of the coefficient for urbanization (Urban vs Rural) in terms of its effect on the odds of malaria transmission.

25. Inclusion of Interaction Terms: Discuss whether any interaction terms were considered in the model and if they had a significant impact on the results. Interaction terms could capture complex relationships between covariates and provide a more nuanced understanding of their influence on malaria prevalence.

26. Sensitivity to Parameter Choices: Address the sensitivity of the model results to the choices of parameters such as the Matérn correlation function parameters and variance parameters. Explain whether different parameter choices were tested and their potential impact on the findings.

27. Comparison with Alternative Models: Discuss any alternative modeling approaches that were considered and the reasons for choosing the binomial geostatistical model over other options. Compare the strengths and weaknesses of different modeling strategies, if relevant.

28. Implications of Non-Spatial Variation: Explain the potential implications of the Gaussian random variables () representing small-scale spatial variation, measurement error, or within-cluster variation. Discuss how these sources of variation could affect the overall model results and their interpretation.

29. Model Robustness and Validity: Describe how the robustness of the model was assessed. Discuss any sensitivity analyses conducted to examine the impact of different assumptions or parameter choices on the results. Additionally, elaborate on the methods used to validate the model's assumptions and spatial structure.

Reviewer #3: • Congratulations to the researchers. This is a well written manuscript with good statistical analysis to identify disease heterogeneity across Kenya and its policy implications.

• Line 234 indicated that there we 98 clusters in lake endemic areas however in the line 237 (table 1) it was 97. Please check and align.

• Line 239: change ‘’figure one’’ to ‘’Figure 1’’ for consistency.

• To what extent is the misalignment between spatial and especially paucity of environmental covariates affected the validity of the model's results?

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Orwa Al-Abdulla

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLOS Glob Public Health. doi: 10.1371/journal.pgph.0002260.r003

Decision Letter 1

Collins Otieno Asweto

1 Nov 2023

Understanding the fine-scale heterogeneity and spatial drivers of malaria transmission using model-based geostatistical methods in Kenya

PGPH-D-23-01337R1

Dear Mategula,

We are pleased to inform you that your manuscript 'Understanding the fine-scale heterogeneity and spatial drivers of malaria transmission using model-based geostatistical methods in Kenya' has been provisionally accepted for publication in PLOS Global Public Health.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact globalpubhealth@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Global Public Health.

Best regards,

Collins Otieno Asweto, PhD

Academic Editor

PLOS Global Public Health

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #3: All comments have been addressed

Reviewer #4: All comments have been addressed

**********

2. Does this manuscript meet PLOS Global Public Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #3: Yes

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #3: Yes

Reviewer #4: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #3: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Global Public Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #3: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #3: Congratulations to the research team.

Reviewer #4: The authors did great job responding to the comments by the reviewers. As the fourth reviewer, I think most of my concerns have been addressed. However, line 65 & 66 highlighted the decline in Malaria prevalences in previous years 13%, 8% for 2010 and 2015 respectively but the prevalence for year 2020 is missing.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #3: Yes: Richard AMENYAH

Reviewer #4: No

**********

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. The model description for the generalized linear model and the generalized linear mixed model.

    (DOCX)

    Attachment

    Submitted filename: Response to reviewers 4 October 2023.docx

    Data Availability Statement

    The datasets used and/or analyzed during the current study are available upon request from the Demographic Health Survey Programme Team. The request can be made via the link https://dhsprogram.com/Data/.


    Articles from PLOS Global Public Health are provided here courtesy of PLOS

    RESOURCES