PLOS Neglected Tropical Diseases. 2022 Mar 11;16(3):e0010273. doi: 10.1371/journal.pntd.0010273

Predicting future community-level ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data

Christine Tedijanto 1,*, Solomon Aragie 2, Zerihun Tadesse 2, Mahteme Haile 3, Taye Zeru 3, Scott D Nash 4, Dionna M Wittberg 1, Sarah Gwyn 5, Diana L Martin 5, Hugh J W Sturrock 6, Thomas M Lietman 1,7,8,9, Jeremy D Keenan 1,7, Benjamin F Arnold 1,7
Editor: Ali M Somily 10
PMCID: PMC8942265  PMID: 35275911

Abstract

Trachoma is an infectious disease characterized by repeated exposures to Chlamydia trachomatis (Ct) that may ultimately lead to blindness. Efficient identification of communities with high infection burden could help target more intensive control efforts. We hypothesized that IgG seroprevalence in combination with geospatial layers, machine learning, and model-based geostatistics would be able to accurately predict future community-level ocular Ct infections detected by PCR. We used measurements from 40 communities in the hyperendemic Amhara region of Ethiopia to assess this hypothesis. Median Ct infection prevalence among children 0–5 years old increased from 6% at enrollment, in the context of recent mass drug administration (MDA), to 29% by month 36, following three years without MDA. At baseline, correlation between seroprevalence and Ct infection was stronger among children 0–5 years old (ρ = 0.77) than children 6–9 years old (ρ = 0.48), and stronger than the correlation between active trachoma and Ct infection (0-5y ρ = 0.56; 6-9y ρ = 0.40). Seroprevalence was the strongest concurrent predictor of infection prevalence at month 36 among children 0–5 years old (cross-validated R² = 0.75, 95% CI: 0.58–0.85), though predictive performance declined substantially with increasing temporal lag between predictor and outcome measurements. Geospatial variables, a spatial Gaussian process, and stacked ensemble machine learning did not meaningfully improve predictions. Serological markers among children 0–5 years old may be an objective tool for identifying communities with high levels of ocular Ct infections, but accurate, future prediction in the context of changing transmission remains an open challenge.

Author summary

Trachoma, one of the leading infectious causes of blindness globally, is targeted for elimination as a public health problem by 2030. District-level estimates of active trachoma among children 1–9 years old are currently used to guide control programs and assess elimination. However, active trachoma, based on diagnosis of clinical signs, is a subjective indicator. Serological markers present an objective, scalable alternative that could be measured in integrated platforms. In a hyperendemic region, community-level seroprevalence aligned more closely with concurrent infection prevalence than active trachoma. The correlation between seroprevalence and infection prevalence was stronger among 0–5-year-olds compared to 6–9-year-olds and was consistent over a three-year period of increasing transmission. Serosurveillance among children 0–5 years old may be a promising monitoring strategy to identify communities with the highest burdens of ocular chlamydial infection.

Introduction

Trachoma, caused by ocular infection with the bacterium Chlamydia trachomatis (Ct), is a leading infectious cause of blindness worldwide [1] and has been targeted for elimination as a public health problem by 2030 [2]. The World Health Organization’s SAFE strategy (Surgery, Antibiotics, Facial cleanliness, and Environmental improvement) has been successful in countries across Asia and the Middle East, achieving elimination as a public health problem in many cases [2]. Yet, trachoma is a persistent challenge in pockets of Africa, including some areas of Ethiopia that remain hyperendemic despite over 10 years of control activities [3]. The ability to efficiently identify potential areas of ongoing transmission for follow-up surveys and more intensive interventions is crucial for the trachoma endgame.

Trachoma elimination programs are currently guided by estimates of active trachoma in evaluation units (EUs) of 100,000–250,000 people [4]. Evidence of trachoma clusters at the village or sub-village level throughout Africa [5–10] suggests that aggregate estimates may mask heterogeneity in infection: high-transmission villages may be missed by sampling design or their signal may be “washed out” in EU-level averages. Fine-scale estimates of trachoma could facilitate targeted allocation of limited resources to communities with the highest burden [11] and reduce unnecessary antibiotic use and subsequent selection for antibiotic resistance [12].

Mass drug administration (MDA) of azithromycin is currently recommended for EUs with trachomatous inflammation—follicular (TF) prevalence ≥5% among children 1–9 years old [2]. Clinical disease states are relevant signals of progression towards conjunctival scarring and ultimately blindness [1] but are subject to misclassification, even by experienced graders [13]. Immunoglobulin G (IgG) antibody responses to Pgp3 and CT694 antigens are a more objective alternative and have been identified as sensitive, specific, and durable indicators of past ocular Ct infection [14,15]. In addition, dried blood spot specimens used to assess serological markers are easy to collect, and Ct antigens can be included in multiplexed, integrated serosurveillance platforms to simultaneously and cost-effectively monitor numerous pathogens [16].

Thus far, efforts to predict future trachoma prevalence at the village and district level have had modest success [17,18] but have not considered serology or recent advances in machine learning and geostatistics that may facilitate fine-scale prediction. We hypothesized that models incorporating trachoma indicators (active trachoma, ocular Ct infection identified by polymerase chain reaction (PCR), and IgG response to Ct antigens), remotely sensed geospatial layers, and spatial structure would accurately predict future community-level Ct infection prevalence. We also hypothesized that seroprevalence would be a more accurate and stable predictor of Ct infections compared to active trachoma and that communities with high levels of infection would be geographically clustered in stable foci of transmission (“hotspots”). We tested our hypotheses using measurements from 40 communities in the hyperendemic Amhara region of Ethiopia.

Methods

Ethics statement

This research was approved by a human subjects review board at the University of California, San Francisco. Each participant or guardian provided verbal consent before any study activity, with separate consent required for census, examinations and intervention at each study visit.

Data collection

This work was a secondary analysis of data from the WASH Upgrades for Health in Amhara (WUHA) community-randomized trial, one of the trials in the Sanitation, Water, and Instruction in Face-Washing for Trachoma (SWIFT) (NCT02754583) series. Details of study methodology and implementation are described in the published protocol [19]. WUHA was conducted from November 2015 through March 2019 in the Gazgibella, Sekota Zuria (i.e. Sekota) and Sekota Ketema (i.e. Sekota town) woredas of the Wag Hemra Zone in Amhara, Ethiopia (Fig 1). Forty communities were randomized in a 1:1 ratio to receive a comprehensive Water, Sanitation, and Hygiene (WASH) package at baseline or at completion of the study. Communities were not selected at random; they were located in rural areas within a 4-hour drive and/or walk from the main road and included all households within 1.5 km of a potential water point (e.g. hand-dug well or protected spring) as determined by geohydrologic survey; further details are available in the study protocol [19]. Mass administration of azithromycin occurred for seven consecutive years (May 2009 to June 2015, with supplemental administration in October 2014) prior to the start of the study but was suspended in all study communities for the duration of the WUHA trial.

Fig 1. Map of study area.


Inset (top right) highlights the Amhara Region (gray shading) of Ethiopia and the study area (black rectangle). Forty communities from three woredas (administrative level 3) in Amhara were included in the WUHA trial. The base map layer for this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

Trachoma indicators were measured in each study community at baseline and three annual monitoring visits. Approximately one month prior to each monitoring visit, a census was taken to enumerate individuals living in each study community. At each visit, thirty individuals in each of three age groups (0–5 years, 6–9 years, 10+ years) were randomly selected from each community for monitoring; this analysis focused on children 0–9 years old. Per the trial design, not all trachoma indicators were measured in all age groups at each time point; only children 0–5 years old were tested for clinical, serological, and PCR outcomes at all visits. At the end of WUHA, after adjusting for baseline, there was no statistically significant difference in the primary endpoint of community-level ocular Ct infection among 0–5-year-olds between intervention arms across the three post-baseline time points (risk difference: 3.7 percentage points higher in WASH arm, 95% CI: -4.9 to 12.4, p = 0.40) [20]. As a result, we combined information across arms for this analysis.

Measurement and definition of trachoma indicators

We analyzed age-group-specific community-level prevalence of three trachoma indicators: active trachoma, ocular Ct infection detected by PCR, and IgG response to Pgp3 and CT694 antigens.

Each year, eight local nurses and other healthcare professionals were recruited to serve as trachoma graders and swabbers. These individuals completed a four-day training with two days of classroom training and two field practice days. Prior to participation in fieldwork, graders were required to pass a photographic grading test with a Cohen’s kappa score of 0.6 or greater relative to consensus grades from a panel of three expert graders. Grading teams were randomly assigned to clusters. Trained trachoma graders used a pair of 2.5× loupes and a flashlight to assess the everted right superior tarsal conjunctiva for the presence of trachomatous inflammation—follicular (TF) or trachomatous inflammation—intense (TI) according to the WHO grading system [21]. Specifically, TF is characterized by the presence of five or more follicles which are (a) each at least 0.5 mm in diameter and (b) located in the central part of the upper tarsal conjunctiva. TI is distinguished by pronounced inflammatory thickening of the upper tarsal conjunctiva that obscures more than half of the normal deep tarsal vessels. An individual was considered positive for active trachoma if either TF or TI was detected.

Conjunctival swabs were collected and tested in the study laboratory at the Amhara Public Health Institute in Bahir Dar, Ethiopia with the Abbott RealTime assay (automated Abbott m2000 System), which is highly sensitive and specific for Ct [22,23]. Groups of five samples, stratified by community and age group, were pooled for testing, and community-level Ct infection prevalence was estimated from pooled results using a maximum likelihood approach [24]. Swabs from positive pools were tested individually for 0–5-year-olds at all visits, for 6–9-year-olds at months 12, 24, and 36, and if >80% of pools for a cluster were positive for all other age groups and time points. Approximately 12% of samples from 6–9-year-olds with an equivocal or positive pooled result at baseline were also tested individually. Air swabs were collected in every cluster at the beginning and end of each monitoring visit. None of the air swabs tested positive for Ct.
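For intuition about how a prevalence can be recovered from pooled results, the sketch below uses the closed-form maximum likelihood estimate that applies when all pools have the same size and the assay is assumed to be perfectly accurate. The function name and the example counts are illustrative assumptions; the estimator used in the trial [24] may differ, for example in how it handles unequal pool sizes.

```python
def pooled_prevalence_mle(positive_pools: int, total_pools: int, pool_size: int = 5) -> float:
    """Closed-form MLE of individual-level prevalence from equal-sized pools.

    A pool tests negative only if every member is negative, so
    P(pool negative) = (1 - p) ** pool_size. Equating this to the observed
    fraction of negative pools and solving for p gives the estimate.
    Assumes a perfectly sensitive and specific test (a simplification).
    """
    if total_pools <= 0 or not 0 <= positive_pools <= total_pools:
        raise ValueError("need total_pools > 0 and 0 <= positive_pools <= total_pools")
    negative_fraction = 1 - positive_pools / total_pools
    return 1 - negative_fraction ** (1 / pool_size)


# Hypothetical community: 30 children in 6 pools of 5, with 2 positive pools.
estimate = pooled_prevalence_mle(positive_pools=2, total_pools=6)
print(f"estimated prevalence: {estimate:.3f}")
```

When every pool is positive this estimate is 1, which is one reason individual testing of swabs from positive pools is informative.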

To measure antibody response, field staff lanced the index finger of each individual and collected blood onto TropBio filter paper. Samples were tested at the US Centers for Disease Control on a multiplex bead assay on the Luminex platform for antibodies to two recombinant antigens (Pgp3, CT694) that measure previous exposure to C. trachomatis [14,15,25]. Seropositivity thresholds were defined as median fluorescence intensity minus background (MFI-bg) of 1113 for Pgp3 and 337 for CT694 using an ROC cutoff from reference samples [26]. Individuals who were seropositive with respect to both antigens were considered seropositive for the main analysis.
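The double-positive definition can be written down directly. This is a minimal sketch using the two cutoffs quoted above; the function names and the sample values are made up for illustration, and treating a value exactly equal to the cutoff as negative is an assumption.

```python
PGP3_CUTOFF = 1113   # MFI-bg threshold for Pgp3, as stated in the text
CT694_CUTOFF = 337   # MFI-bg threshold for CT694, as stated in the text


def is_seropositive(pgp3_mfi_bg: float, ct694_mfi_bg: float) -> bool:
    """Main-analysis definition: above the cutoff for BOTH antigens."""
    return pgp3_mfi_bg > PGP3_CUTOFF and ct694_mfi_bg > CT694_CUTOFF


def community_seroprevalence(samples) -> float:
    """Proportion of sampled individuals who are seropositive."""
    flags = [is_seropositive(pgp3, ct694) for pgp3, ct694 in samples]
    return sum(flags) / len(flags)


# Made-up (Pgp3, CT694) MFI-bg pairs for four children in one community.
samples = [(2500, 900), (1500, 200), (40, 15), (1200, 400)]
print(community_seroprevalence(samples))
```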

Descriptive analysis of trachoma indicators

Spearman rank correlation coefficients were calculated for pairwise combinations of trachoma indicators by age group and follow-up visit. Correlations were also calculated between PCR prevalence at month 36 and serological, PCR, and active trachoma prevalence at each preceding time point to observe changes in correlation with increasing temporal lag between measurements. 95% confidence intervals were estimated from 1000 bootstrap samples. As communities were the unit of analysis, each bootstrap replicate consisted of forty communities resampled with replacement. This aligns with measures of uncertainty for cluster-level summaries which treat clusters as the primary source of variation [27,28].
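The community-level bootstrap can be sketched as follows, using only the Python standard library. The data are synthetic, the function names are illustrative, and the percentile method for the interval is an assumption, since the text does not state which bootstrap interval was used.

```python
import random


def average_ranks(values):
    """Rank values 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_ci(x, y, n_boot=1000, seed=0):
    """Percentile 95% CI, resampling communities (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop NaN from degenerate resamples
            stats.append(rho)
    stats.sort()
    return stats[int(0.025 * (len(stats) - 1))], stats[int(0.975 * (len(stats) - 1))]


# Synthetic prevalences for 40 communities (not study data).
rng = random.Random(1)
sero = [rng.uniform(0.05, 0.6) for _ in range(40)]
pcr = [max(0.0, 0.8 * s + rng.gauss(0, 0.05)) for s in sero]
rho = spearman(sero, pcr)
lo, hi = bootstrap_ci(sero, pcr)
print(f"rho = {rho:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Resampling whole communities, rather than individual children, keeps the community as the unit of analysis.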

Descriptive spatial analysis

Administrative boundaries for Ethiopia were downloaded from the Humanitarian Data Exchange [29]. Spatially interpolated maps for each trachoma indicator at each time point were generated using a simple kriging model including latitude, longitude, and a Matérn covariance. We estimated empirical variograms after removing linear spatial trends for distances up to 33.3 km (half of the maximum distance between any two study communities) and fit exponential and Matérn models; for stability, we required bins to contain ten or more pairs of communities. The effective, or practical, range was defined as the distance at which the fitted model reached 95% of the sill. We compared the observed variograms to a 95% pointwise envelope based on 1000 Monte Carlo simulations; for each simulation, prevalence residuals were permuted while holding coordinates fixed and the empirical variogram was recalculated [30]. We also calculated Moran’s I, a measure of global spatial autocorrelation, over 1000 permutations of the community-level prevalence values and estimated a p-value based on permutations resulting in a Moran’s I greater than or equal to the observed value.
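A permutation test for Moran's I can be sketched as below. The text does not specify the spatial weights, so inverse-distance weights are an assumption here, as are the function names, the toy grid of coordinates, and the "+1" correction in the p-value.

```python
import math
import random


def inverse_distance_weights(coords):
    """w[i][j] = 1 / distance(i, j), with zeros on the diagonal (assumed scheme)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return w


def morans_i(values, weights):
    """Global Moran's I for one value per location."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    w_sum = sum(sum(row) for row in weights)
    return (n / w_sum) * (cross / sum(d * d for d in dev))


def permutation_p_value(values, weights, n_perm=1000, seed=0):
    """One-sided p-value: share of permutations with I >= the observed I."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign values to locations at random
        if morans_i(shuffled, weights) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Toy example: a 5 x 5 grid with a west-to-east gradient (not study data).
coords = [(float(x), float(y)) for x in range(5) for y in range(5)]
values = [x for x, _ in coords]
weights = inverse_distance_weights(coords)
observed, p_value = permutation_p_value(values, weights)
print(f"Moran's I = {observed:.2f}, permutation p = {p_value:.3f}")
```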

Predictive model selection

Prediction models were limited to children 0–5 years old due to availability of all trachoma indicators for this age range at all time points. We developed several candidate models using baseline data only, with the analysis team masked to any future measurements. A wide range of publicly available environmental [31–35], demographic [36], and socioeconomic [37–39] variables were explored based on prior associations with trachoma or other infectious diseases (S1 Table). When possible, features were extracted and aggregated using Google Earth Engine [40], and means were used for spatial and temporal aggregation unless otherwise specified in S1 Table. All features were aggregated to a grid resolution of 2.5 arc minutes (approximately 4.5 km at the median latitude of the study area) based on the lowest resolution dataset (TerraClimate) and reprojected to WGS84. Each community was assigned to the grid cell containing its household-weighted geographic centroid, defined as the median latitude and longitude across all households in the community.

Models were built using predictor variables measured over the same (“concurrent”) and prior (“forward predictions”) time periods. Time-varying features were summarized based on calendar year, with 2015 data considered “concurrent” with month 0 trachoma indicators and so on. Time-varying features were first aggregated by month and then summarized based on recency relative to the time of monitoring (e.g. last 1 month or December of the calendar year, last 2 months, up to 12 months). To reduce collinearity, we evaluated pairwise Pearson correlation coefficients between temporal summaries of the same variable and dropped the summary over fewer months for pairs with correlation over 0.9.

During preliminary model development with baseline data, we observed that including a large number of predictor variables led to overfitting and unstable model performance due to the relatively small number of communities. As a result, logistic LASSO regression was used to identify a restricted set of geospatial features to include in the final prediction models. Night light radiance and daily precipitation averaged over the preceding 12 months were selected from a model using concurrently measured predictors and outcomes across all follow-up visits.

Logistic regression models of the following form were used as base prediction models:

$$\operatorname{logit}(\pi_{cm}) = \alpha + \sum_{p} \beta_{np}\, x_{cnp} + S(\mathrm{latitude}_c, \mathrm{longitude}_c)$$

where $\pi_{cm}$ represents PCR prevalence for study community $c$ at month $m$, $\alpha$ is the model intercept, and $x_{cn1}, \ldots, x_{cnp}$ denote $p$ covariates (and corresponding coefficients $\beta$) measured at time $n$, where $n = m$ for concurrent predictions and $n = m - k$ for predictions $k$ months forward. Extended models also included a Gaussian process with Matérn covariance function [41] to capture residual spatial structure, represented by the $S$ function dependent on latitude and longitude of each community.

We additionally explored stacked ensemble machine learning, also known as stacked regression [42] or stacked generalization [43]. Stacked ensembles combine predictions from multiple ‘Level 0’ models using a ‘Level 1’ model, also called the superlearner or metalearner [44]. Ensembles are theoretically guaranteed to perform as well as or better than any single member of their library [42,44]. Our ‘Level 0’ learners included logistic regression, generalized additive models [45], random forest [46], extreme gradient boosting [47], and multivariate adaptive regression splines [48]. This set of models, including parametric, semi-parametric, and tree-based methods, was selected to ensure diversity in approach; outcome specification also varied (e.g. binomial, quasibinomial, continuous) based on requirements of the learner. Logistic regression with a Matérn covariance was used as the ‘Level 1’ superlearner for the baseline analysis.

Predictive model assessment

We conducted 10-fold cross-validation to assess predictive performance. Spatial autocorrelation can violate the independence assumption between training and validation sets in cross-validation and lead to overly optimistic estimates of predictive power [49,50]. Therefore, we partitioned the study area into 12 blocks of 15 × 15 km, each containing 1–8 spatially proximate communities. Communities in the same block were assigned to the same validation set, with some sets consisting of more than one block. This approach decreases spatial dependence between training and validation sets in the same fold and simulates prediction in a new, but geographically proximate, area. Predictive performance was assessed using cross-validated root-mean-square-error (RMSE) and R² [51], where R² was calculated as:

$$R^2 = 1 - \frac{\sum_c \left(p_{cm} - \hat{p}_{cm}\right)^2}{\sum_c \left(p_{cm} - \overline{p_{cm}}\right)^2}$$

95% confidence intervals for R² were estimated using the influence function [52,53]. Communities received equal weight in all validation metrics.
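The blocked cross-validation and the two metrics can be sketched together. This is a minimal illustration under stated assumptions: coordinates are taken to be already projected to kilometres, blocks are assigned to folds at random, the data are synthetic, and all function names are hypothetical. A mean-only model is used as the predictor to show why a cross-validated R² near or below zero corresponds to no predictive skill.

```python
import math
import random


def block_id(x_km, y_km, block_km=15.0):
    """Index of the square block containing a projected coordinate."""
    return (math.floor(x_km / block_km), math.floor(y_km / block_km))


def spatial_folds(coords_km, n_folds=10, block_km=15.0, seed=0):
    """Fold index per community; all communities in a block share one fold."""
    blocks = sorted({block_id(x, y, block_km) for x, y in coords_km})
    random.Random(seed).shuffle(blocks)
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    return [fold_of_block[block_id(x, y, block_km)] for x, y in coords_km]


def cv_rmse_r2(observed, predicted):
    """RMSE and R^2 = 1 - SSE / SST, with every community weighted equally."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(sse / n), 1 - sse / sst


# Synthetic study area of 60 x 45 km with 40 communities (not study data).
rng = random.Random(2)
coords_km = [(rng.uniform(0, 60), rng.uniform(0, 45)) for _ in range(40)]
prevalence = [rng.uniform(0.0, 0.6) for _ in coords_km]
folds = spatial_folds(coords_km)

# Mean-only model: predict each held-out community with the training-fold mean.
held_out = []
for fold in folds:
    train = [p for p, g in zip(prevalence, folds) if g != fold]
    held_out.append(sum(train) / len(train))

rmse, r2 = cv_rmse_r2(prevalence, held_out)
print(f"mean-only model: CV RMSE = {rmse:.3f}, CV R2 = {r2:.3f}")
```

With more blocks than folds, some folds contain several blocks, as described above.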

As this was a secondary analysis, the sample size was fixed at 40 communities per survey. To our knowledge, there are no methods available to estimate power for cross-validated error in prediction problems. Instead, we estimated the minimum detectable effect for the correlation analysis. Assuming a two-tailed alpha of 0.05, we had 80% power to detect a correlation of 0.43 or larger with 40 communities [54].
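One common way to approximate the minimum detectable correlation is the Fisher z-transformation, sketched below. Whether this matches the method in the cited reference [54] is an assumption; the function name is illustrative.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_correlation(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given two-sided alpha and power.

    Uses the Fisher z approximation: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return tanh((z_alpha + z_power) / sqrt(n - 3))


print(round(min_detectable_correlation(40), 2))  # 0.43 for 40 communities
```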

Results

Study population

Approximately thirty children from each of two age groups (0–5 years old and 6–9 years old) were randomly sampled from each community at baseline and follow-up visits. The number of children evaluated differed slightly for each trachoma indicator (S2 Table). Over the three-year study period, ocular Ct infection prevalence, as measured by PCR, increased substantially in both age groups (Table 1). Levels of active trachoma fluctuated with time but remained fairly consistent with baseline levels. Seropositivity, defined as antibody response above pre-determined cut-offs for both Pgp3 and CT694 antigens, increased gradually among 0–5-year-olds (two-sided p = 2.6 × 10⁻⁴ in a Wilcoxon signed-rank test comparing month 0 and month 36). Antibodies were not measured among 6–9-year-olds at months 12 and 24 but were similar between study arms at months 0 and 36 (p = 0.44). Results were similar when seroprevalence was assessed for each antigen separately (S3 Table).

Table 1. Community-level prevalence of trachoma across 40 study communities by indicator, age group and month of follow-up visit.

Values other than n are median community-level prevalence (%) (IQR).

| Month | 0–5y: n¹ | 0–5y: PCR² | 0–5y: TF/TI³ | 0–5y: Serology | 6–9y: n¹ | 6–9y: PCR² | 6–9y: TF/TI³ | 6–9y: Serology⁴ |
|---|---|---|---|---|---|---|---|---|
| 0 | 1,269 | 5.6 (2.9–18.1) | 62.9 (51.0–72.5) | 25.0 (10.1–34.8) | 1,135 | 3.5 (0.0–13.9) | 40.3 (25.9–54.9) | 49.2 (29.8–60.2) |
| 12 | 1,162 | 19.1 (6.6–30.2) | 50.8 (40.6–61.1) | 29.7 (15.6–40.2) | 1,092 | 10.9 (5.7–17.4) | 21.3 (14.3–27.8) | - |
| 24 | 1,214 | 27.4 (11.6–34.3) | 67.5 (55.5–77.4) | 33.3 (20.5–39.0) | 1,208 | 19.9 (9.7–34.2) | 45.1 (29.4–53.4) | - |
| 36 | 1,192 | 29.3 (16.2–46.8) | 56.7 (45.2–64.3) | 33.3 (23.5–42.3) | 1,218 | 21.7 (15.2–38.2) | 38.2 (30.1–53.6) | 50.8 (28.9–65.4) |

¹ Number of children tested for any indicator across all study communities

² Polymerase chain reaction

³ Trachomatous inflammation—follicular / trachomatous inflammation—intense

⁴ Serology was not measured for a random sample of 6–9-year-olds at months 12 and 24

Ocular infection was more common in the western and northern regions of the study area (Fig 2A), and seroprevalence and active trachoma were similarly distributed in space (S1A and S2A Figs). Based on empirical variograms (Fig 2B) and Moran’s I (Fig 2C), there was weak spatial structure in community-level Ct PCR prevalence that increased slightly over the study period; serology and active trachoma also did not display clear spatial structure over the study area (S1 and S2 Figs).

Fig 2.


Predicted surface (A), variograms (B), and Moran’s I (C) for PCR-confirmed ocular C. trachomatis infection prevalence among 0–5-year-olds at each study month. Maps display prevalence for 40 study communities at each follow-up visit, spatially interpolated over the convex hull using kriging. Variograms capture similarity between community-level prevalence measurements as a function of distance between community pairs (in km), with smaller semivariance values representing increased similarity. Exponential (magenta) and Matérn (green) models were fit to each empirical variogram, and the effective range (dashed vertical line) is defined as the distance at which the fitted model reaches 95% of the sill. The Monte Carlo envelope (gray shading) displays pointwise 95% coverage of 1000 permutations, representing a null distribution. Moran’s I was calculated over 1000 permutations (gray bars, with observed value represented by red line), and a permutation-based p-value was calculated. The base map layer for panel A in this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

Comparisons between serological, clinical, and molecular trachoma indicators

Seroprevalence demonstrated a stronger rank-preserving relationship, as measured by the Spearman correlation, with contemporaneous PCR prevalence than active trachoma for both age groups (Fig 3A and 3B). Descriptive results were similar when considering either antigen separately (S3 Fig and S3 Table). At baseline, immediately following seven years of MDA, the correlations between trachoma indicators were more pronounced among younger children, potentially reflecting lower transmission in the presence of MDA and saturation in seroprevalence due to durable antibody responses among older children. Similar saturation dynamics may be at play for active trachoma, which has been shown to resolve slowly among children [55]. By month 36, when infections were higher across the study area (Table 1), correlations between trachoma indicators were similar across age groups (Fig 3A and 3B). Rank-preserving relationships between indicators at each time point and month 36 PCR prevalence were stronger for more proximate measurements, and this increase was more pronounced for PCR compared to active trachoma or serology (Fig 3C).

Fig 3. Correlations between trachoma indicators by age group and over time.


Panels display Spearman rank correlations between community-level seroprevalence and PCR prevalence at study months 0 and 36 (A), active trachoma prevalence and PCR prevalence at months 0 and 36 (B), and PCR prevalence at month 36 and trachoma indicators measured at each survey across 40 study communities (C). Correlations are shown separately for 0–5-year-olds (green) and 6–9-year-olds (purple), and 95% confidence intervals were estimated from 1000 bootstrap samples. Serology data were not collected for a random sample of 6–9-year-olds at months 12 and 24.

Concurrent and forward prediction of PCR prevalence

We predicted community-level infection prevalence using a range of model specifications and conducted spatial 10-fold cross-validation (CV) with 15 × 15 km blocks [49] to assess predictive performance using CV R² and root-mean-square-error (RMSE). Fig 4 presents results for models predicting PCR prevalence at month 36. “Concurrent” predictions utilized trachoma indicators measured at month 36 and/or geospatial variables measured over the preceding year (2018), while “forward” predictions used covariates measured 12, 24, or 36 months in the past. Seroprevalence was the single strongest concurrent predictor of month 36 community-level PCR prevalence (CV R²: 0.75, 95% confidence interval (CI): 0.58–0.85, CV RMSE: 0.10), substantially outperforming active trachoma prevalence (CV R²: 0.37, 95% CI: 0.08–0.56, CV RMSE: 0.16) (Fig 4). When predicting 12 months into the future, all trachoma indicators performed moderately well, but predictive performance declined for longer time horizons across all model specifications. No model that we assessed had a CV R² significantly different from 0 (equivalent to an intercept-only or mean-only model) when predicting PCR prevalence 24 months or more into the future.

Fig 4. Cross-validated R² for models predicting month 36 community-level PCR prevalence among 0–5-year-olds.


Cross-validated coefficient of determination (R²), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Logistic regression was used for all models with the exception of the stacked ensemble (gray). Blocks of size 15 × 15 km were used for 10-fold spatial cross-validation.

As anticipated by the weak spatial dependence in PCR prevalence (Fig 2), incorporation of a Gaussian process with a Matérn covariance function did not improve predictions. In addition, LASSO-selected geospatial features (night light radiance and daily precipitation averaged over the preceding 12 months) (S4 Fig) and a stacked ensemble approach leveraging five base models did not meaningfully improve CV R² or CV RMSE compared to simpler models. Results were similar for models predicting PCR prevalence at each time point and pooled over all time points (S5 Fig). We also observed similar results with various superlearner models (S6 Fig) and cross-validation folds (S7 Fig), with the latter perhaps reflecting the weak spatial autocorrelation observed in this dataset (Fig 2).

Efficient identification of high-burden communities

A complementary task to prediction is identifying communities with the highest infection burden, defined here as the number of Ct infections among 0–5-year-olds at a given time. To address variability in sample size, the number of Ct infections in each community was scaled to represent a sample of 30 individuals. At month 36, 80% of Ct infections were concentrated in just over half of the communities (23/40), and ordering communities by cross-validated concurrent predictions using seroprevalence identified infections more efficiently (i.e. in fewer communities, 25/40) than ordering them by predictions using active trachoma (27/40) (Fig 5). Performance declined when using predictors measured 12 months in the past, and communities ranked by most predictors measured 24 and 36 months in the past could not identify high-burden communities based on PCR infections at month 36 better than chance. The distinction between models was greater at month 0, when 80% of Ct infections were concentrated in just the top 15 of 40 communities (38%) (S8 Fig).
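The ranking exercise can be sketched as follows: scale each community's infection count to a sample of 30, visit communities in descending order of a score, and count how many are needed to cover 80% of infections. The counts and scores are invented for illustration, the function names are hypothetical, and counting a community as soon as the cumulative share reaches (rather than strictly exceeds) 80% is an assumption.

```python
def scale_to_sample(infections, tested, target=30):
    """Rescale each community's infection count to a common sample size."""
    return [target * i / t for i, t in zip(infections, tested)]


def communities_needed(scaled_infections, scores, share=0.80):
    """Communities visited, in descending score order, to cover `share` of infections."""
    total = sum(scaled_infections)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    covered = 0.0
    for visited, i in enumerate(order, start=1):
        covered += scaled_infections[i]
        if covered >= share * total:
            return visited
    return len(order)


# Invented data for six communities (not study data).
infections = [9, 6, 3, 1, 1, 0]
tested = [30, 30, 28, 30, 25, 30]
scaled = scale_to_sample(infections, tested)

# Best case: rank by the true scaled infection counts themselves.
optimal = communities_needed(scaled, scores=scaled)

# A hypothetical model whose predictions misrank the second community.
predicted = [0.5, 0.1, 0.4, 0.3, 0.2, 0.0]
with_model = communities_needed(scaled, scores=predicted)

print(f"optimal ordering: {optimal} communities; model ordering: {with_model}")
```

The gap between the model ordering and the optimal ordering is the quantity compared across predictors in Fig 5.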

Fig 5. Cumulative proportion of C. trachomatis infections at month 36 identified by concurrent and forward prediction models.


Dashed lines indicate the point at which the cumulative proportion of identified Ct infections at month 36, scaled to represent a sample of 30 individuals per community, surpassed 80%. The black line in each facet represents the optimal ordering of scaled PCR infections at month 36. To simulate a null distribution, we estimated the cumulative proportion of infections identified for 1000 random orderings of the 40 communities and plotted the 95% pointwise envelope (gray shading). For concurrent and 24-month-forward predictions, models using serology only and PCR only, respectively, performed equally well to a model using all trachoma indicators, geospatial features, a Matérn covariance, and ensemble machine learning; vertical lines were offset slightly for visibility.

Discussion

We conducted a comprehensive study of repeated cross-sectional measurements of active trachoma, PCR-positive ocular Ct infections, and serological responses to Ct antigens over three years in 40 communities in the hyperendemic Amhara region of Ethiopia. In the absence of MDA during the study, ocular Ct infections surged and became increasingly dispersed across study communities. Based on empirical variograms and Moran’s I, we observed weak evidence for global spatial clustering in trachoma indicators over the study region. Seroprevalence among children 0–5 years old aligned closely with PCR prevalence measured at the same time, highlighting the potential for serosurveillance as a monitoring tool that corresponds well with levels of ocular infection and is potentially easier to measure [56]. Predictive performance of all models declined with increasing temporal lag between outcome and predictor measurements. In this setting, remotely sensed demographic, socioeconomic, and environmental geospatial layers, a spatial Gaussian process with Matérn covariance, and stacked ensemble machine learning did not meaningfully improve predictive performance compared to models using only trachoma indicators. We also illustrate a potential application of predictive models to rank-order and therefore efficiently identify communities with high infection burden; we expect that this approach may be most useful when infections are concentrated in a small number of communities.

Identifying potential future trachoma hotspots is notoriously challenging and sometimes termed “chasing ghosts” by trachoma programs [17]. Our results underscore the difficulty of predicting community-level Ct infection prevalence even a year into the future, at least in the context of increasing transmission in the absence of MDA. Furthermore, our “forward prediction” models were trained on infection outcomes from the desired prediction time point and thus were potentially more optimistic than true “forecasting” models trained solely on historical data. Prior efforts to forecast district-level TF [18] and village-level PCR prevalence [17] have explored mechanistic and statistical models and observed modest performance, with one investigation concluding that models with the highest uncertainty resulted in the best predictive performance [17]. It remains unclear why future prediction of trachoma presents such a difficult challenge, though likely contributing factors include the stochasticity of rare events especially in near-elimination settings [57], biological unknowns in the complex natural history of trachoma [58], and the extended duration between survey measurements (often 6 months or greater). Models for other neglected tropical diseases have achieved some success in future prediction at the sub-district level, though often capitalizing on larger datasets. For example, a recent study developed models with over 80% accuracy for prediction of Schistosoma mansoni persistent hotspots (defined as failure of a village to reduce infection prevalence and/or intensity by specific thresholds) up to two years in the future in the context of decreasing prevalence [59]. In a setting with fairly stable transmission, a sub-district-level study for visceral leishmaniasis reported 85.7% coverage of four-month-ahead 25–75% prediction intervals for case counts [60].

Our investigation builds upon an existing body of work characterizing the dynamics between clinical, serological, and molecular trachoma indicators. Reports at the district, village, and individual level have established that relatively high levels of active trachoma or ocular infections tend to correspond to higher seroprevalence and/or seroconversion rates [14,61–64]; post-elimination settings have been of particular interest, with populations often displaying little to no antibody response [15,26,65–69]. Our findings align with earlier studies reporting that active trachoma was more strongly correlated with infection prevalence in populations with ongoing transmission compared to populations in which transmission has been suppressed by MDA [70–72]; also in agreement with prior findings, we observed that TI was slightly, but not significantly, more closely correlated with infection prevalence compared to TF immediately following MDA (S9 Fig) [73].

We additionally found that seroprevalence among children 0–9 years old was more closely aligned with concurrent infection prevalence than active trachoma was, both immediately after MDA and following several years without MDA. Moreover, we found that seroprevalence was more strongly correlated with PCR prevalence among children 0–5 years old compared to children 6–9 years old, especially in the context of recent MDA at month 0. The lower correlation among 6–9-year-olds is likely due to the discordance between dampened transmission due to MDA and high seroprevalence from past exposures in this older age group. In contrast, seroprevalence among younger children reflects more recent transmission patterns, an observation which has been leveraged to investigate potential recrudescence in Tanzania [74]. Our findings support a focus on children 0–5 years old as a key sentinel population for trachoma serosurveillance.

Interestingly, active trachoma maintained a fairly consistent and strong correlation (~0.7) with PCR prevalence at both 0 and 12 months into the future (Fig 3C) and was a slightly better predictor of PCR prevalence 12 and 24 months ahead compared to serology (Fig 4); however, these findings should be interpreted with caution due to substantial uncertainty in predictive performance across models and inconsistent trends in active trachoma (Table 1).

In general, we did not observe strong evidence of global spatial autocorrelation for trachoma indicators over the study region, though spatial structure in PCR prevalence appeared to increase slightly over the study period. A prior analysis over the entire Amhara region reported evidence of spatial autocorrelation in TF between villages within 25 km bands [10], and another study of TF and TI in Southern Sudan detected residual spatial structure between villages at approximately 8 km, after adjusting for age, sex, rainfall, and land cover [75]. A larger number of existing studies have characterized spatial autocorrelation at a fairly small scale. Studies using household-level information identified spatial clustering at less than 2 km for bacterial load [6,9], ocular infection [8,9], and active trachoma [76]. Our ability to detect spatial structure may have been limited by the geographic distribution of the communities, which was determined by the main trial objectives rather than optimized for estimation of spatial model parameters, which often requires points fairly close to one another [77]. In our study, only 26 (out of 780) pairs of study communities were within 5 km of one another, leading to wide uncertainty at small ranges and hindering our ability to assess fine-scale spatial clustering.
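As a rough illustration of the pair count above, 40 communities yield 40 × 39 / 2 = 780 unique pairs. The Python sketch below counts the pairs that fall within a distance cutoff using the haversine great-circle formula. The distance method, the function names, and the toy coordinates are assumptions made for illustration; they are not the study's implementation or its privacy-modified community locations.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def pairs_within(coords, max_km):
    """Return (number of pairs within max_km, total number of unique pairs)."""
    pairs = list(combinations(coords, 2))
    close = sum(1 for a, b in pairs if haversine_km(a, b) <= max_km)
    return close, len(pairs)


# Hypothetical coordinates: two points about 3 km apart and one far away.
toy = [(11.0, 38.0), (11.0, 38.03), (11.5, 38.5)]
close, total = pairs_within(toy, 5.0)
print(f"{close} of {total} pairs within 5 km")
```

When few pairs fall in the shortest distance bins, the empirical variogram at those ranges rests on very little data, which is consistent with the wide uncertainty noted above.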

In addition to rainfall and land cover, studies have reported associations between active trachoma and distance to water source [10,78–80], temperature [7,79,81], altitude [79,81–84], markers of socioeconomic status [7,10,78,80,84,85], and markers of personal or household hygiene, such as facial cleanliness [7,10,78,80,85–92]. Fewer studies have examined Ct infections identified by PCR, but associations reported were generally similar [85,92,93]. Using LASSO to down-select geospatial features, we included night light radiance (often a proxy for socioeconomic activity [94]) and precipitation in prediction models. However, these features were unable to predict infection prevalence better than an intercept-only model. Predictive power of geospatial variables may have been limited by relative homogeneity across the study area, and the relatively small number of communities likely limited the predictive performance of all models.
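To make the comparison with an intercept-only model concrete, the sketch below computes a cross-validated R2 as 1 − SSE/SST, where each community is predicted by a model fit without its fold. This is one common definition and is assumed here; the study's exact estimator, its spatial block folds, and the R packages it used are not reproduced, and all names and values are hypothetical. Under this definition an intercept-only model, which predicts the training-fold mean, scores at or below zero, so a predictor adds value only if it clears that benchmark.

```python
from statistics import mean


def cv_r2(y, folds, fit_predict):
    """Cross-validated R2 = 1 - SSE/SST using out-of-fold predictions."""
    preds = [0.0] * len(y)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        for i, p in zip(test_idx, fit_predict(train_idx, test_idx)):
            preds[i] = p
    y_bar = mean(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


def intercept_only(y):
    """Benchmark model: predict the mean outcome of the training folds."""
    def fit_predict(train_idx, test_idx):
        m = mean(y[i] for i in train_idx)
        return [m for _ in test_idx]
    return fit_predict


def simple_linear(x, y):
    """Least-squares line fit on the training folds only."""
    def fit_predict(train_idx, test_idx):
        xm = mean(x[i] for i in train_idx)
        ym = mean(y[i] for i in train_idx)
        sxx = sum((x[i] - xm) ** 2 for i in train_idx)
        sxy = sum((x[i] - xm) * (y[i] - ym) for i in train_idx)
        slope = sxy / sxx
        return [ym + slope * (x[i] - xm) for i in test_idx]
    return fit_predict


# Toy values (not study data): a predictor and an outcome for 8 communities.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0.05, 0.08, 0.16, 0.18, 0.27, 0.29, 0.36, 0.41]
folds = [[0, 4], [1, 5], [2, 6], [3, 7]]  # arbitrary folds, not spatial blocks

r2_null = cv_r2(y, folds, intercept_only(y))
r2_lin = cv_r2(y, folds, simple_linear(x, y))
print(f"intercept-only: {r2_null:.2f}, linear: {r2_lin:.2f}")
```

A feature set whose cross-validated R2 is no higher than the intercept-only value, as reported for the geospatial variables, carries no out-of-sample signal under this metric.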

As with all secondary analyses, our data were constrained by the objectives and design of the original trial. For instance, communities were purposely selected to be rural, but not too remote, and close to a potential water point; as a result, our findings may not be generalizable to urban or very remote areas, and spatial interpolation across the study site should be interpreted cautiously. Furthermore, this study was conducted in a hyperendemic region with increasing trachoma transmission in the absence of MDA and may not generalize to lower transmission settings. Ethiopia’s Amhara region presents a particularly stubborn elimination challenge, as seven consecutive years of MDA were unable to sustain control before the start of this study. It is unclear whether prediction would be more or less challenging in the context of low transmission; we may expect more predictability in a “steady state” environment, but stochasticity is also a defining characteristic of near-elimination disease dynamics [57]. As an additional sensitivity analysis, we included survey month as a covariate to assess potential benefits of repeated sampling in the context of changing transmission and found only a modest improvement in predictive performance (S10 Fig).

The methods used here may be extended to other surveys in which trachoma prevalence can be estimated at the cluster level, including those supported by the Global Trachoma Mapping Project and Tropical Data service which have typically relied on multi-stage sampling strategies with compact segment sampling within villages [95,96]. Additional steps towards programmatic implementation of serosurveillance for trachoma should focus on further development of survey design and analytic methods including model-based geostatistics, which has recently been applied to trachomatous trichiasis [97], cost-effectiveness analyses to weigh the benefits of targeted interventions against the costs of fine-scale monitoring, and consideration of integrated serosurveillance programs to enable scalability.

Conclusions

Serological markers among children 0–5 years old may be well-suited for community-level trachoma monitoring given their objectivity, durability, relative ease of collection, and strong correlation with ocular Ct infection prevalence. While seroprevalence and active trachoma were both correlated with infection prevalence in the midst of high transmission in the absence of MDA, only seroprevalence was strongly associated with community-level infections in the context of suppressed transmission directly following MDA. Accurate, future prediction of community-level Ct infection prevalence in settings with unstable transmission remains an open challenge.

Supporting information

S1 Table. Description and sources of geospatial variables explored for prediction analysis.

(DOCX)

S2 Table. Number of children evaluated across 40 study communities by trachoma indicator, age group and study month.

(DOCX)

S3 Table. Community-level seroprevalence across 40 study communities by antigen, age group, and study month.

(DOCX)

S1 Fig

Maps (A), variograms (B), and Moran’s I (C) for seroprevalence among 0–5-year-olds at each study month. Maps display prevalence for 40 study communities at each follow-up visit, spatially interpolated over the convex hull using kriging. Variograms capture similarity between community-level prevalence measurements as a function of distance between community pairs (in km), with smaller semivariance values representing increased similarity. Exponential (magenta) and Matérn (green) models were fit to each empirical variogram, and the effective range (dashed vertical line) is defined as the distance at which the fitted model reaches 95% of the sill. The Monte Carlo envelope (gray shading) displays pointwise 95% coverage of 1000 permutations, representing a null distribution. Moran’s I was calculated over 1000 permutations (gray bars, with observed value represented by red line), and a permutation-based p-value was calculated. The base map layer for panel A in this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

(TIF)

S2 Fig

Maps (A), variograms (B), and Moran’s I (C) for active trachoma prevalence among 0–5-year-olds at each study month. Maps display prevalence for 40 study communities at each follow-up visit, spatially interpolated over the convex hull using kriging. Variograms capture similarity between community-level prevalence measurements as a function of distance between community pairs (in km), with smaller semivariance values representing increased similarity. Exponential (magenta) and Matérn (green) models were fit to each empirical variogram, and the effective range (dashed vertical line) is defined as the distance at which the fitted model reaches 95% of the sill. The Monte Carlo envelope (gray shading) displays pointwise 95% coverage of 1000 permutations, representing a null distribution. Moran’s I was calculated over 1000 permutations (gray bars, with observed value represented by red line), and a permutation-based p-value was calculated. The base map layer for panel A in this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

(TIF)

S3 Fig. Correlations between PCR prevalence and antigen-specific seroprevalence by age group and over time.

Panels display Spearman rank correlations between community-level Pgp3 seroprevalence and PCR prevalence at months 0 and 36 (A), CT694 seroprevalence and PCR prevalence at months 0 and 36 (B), and PCR prevalence at month 36 and seroprevalence measured at each follow-up visit across 40 study communities (C). Correlations are shown separately for 0–5-year-olds (green) and 6–9-year-olds (purple) when possible, and 95% confidence intervals were estimated from 1000 bootstrap samples. Serology data was not collected for a random sample of 6–9-year-olds at months 12 and 24.

(TIF)

S4 Fig. Spatio-temporal distribution of LASSO-selected geospatial predictor variables.

Variables were estimated for 240 grid cells of 2.5 × 2.5 arc minutes (approximately 20 km² at the median latitude of the study area). Daily precipitation (A) and monthly night light radiance (B) averaged over the year were included in the final set of prediction models. The base map layer for this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

(TIF)

S5 Fig

Cross-validated R2 for models predicting community-level PCR prevalence among 0–5-year-olds at month 0 (A), at month 12 (B), at month 24 (C), at month 36 (D), and pooled across all months (E). Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation. (D) is equivalent to Fig 4 in the main text and is included here for comparison.

(TIF)

S6 Fig. Cross-validated R2 for stacked ensemble models predicting community-level PCR prevalence at month 36 among 0–5-year-olds using various superlearner models.

Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation.

(TIF)

S7 Fig

Cross-validated R2 for models predicting community-level PCR prevalence among 0–5-year-olds at month 36 using random 10-fold cross-validation (A), 10-fold spatial cross validation with 5x5 km blocks (B), 15x15 km blocks (C), and 20x20 km blocks (D), and leave-one-out cross-validation (E). Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. (C) is equivalent to Fig 4 in the main text and is included here for comparison.

(TIF)

S8 Fig. Cumulative proportion of C. trachomatis infections at months 0 and 36 identified by concurrent prediction models.

The black line in each facet represents the optimal ordering of scaled PCR infections at each respective month. Dashed lines indicate the point at which the cumulative proportion of infections, scaled to represent a sample of 30 individuals per community, surpassed 80%. To simulate a null distribution, we estimated the cumulative proportion of infections identified for 1000 random orderings of the 40 communities and plotted the 95% pointwise envelope (gray shading). At month 36, a model using only serology performed equally well to a model using all trachoma indicators, geospatial features, a Matérn covariance, and ensemble machine learning; vertical lines were offset slightly for visibility.

(TIF)

S9 Fig. Correlations between PCR prevalence and active trachoma by age group and over time.

Panels display Spearman rank correlations between community-level TF prevalence and PCR prevalence at months 0 and 36 (A), TI prevalence and PCR prevalence at months 0 and 36 (B), and PCR prevalence at month 36 and active trachoma measured at each follow-up visit across 40 study communities (C). TF prevalence included any child diagnosed with TF, regardless of TI status, and vice versa. Correlations are shown separately for 0–5-year-olds (green) and 6–9-year-olds (purple), and 95% confidence intervals were estimated from 1000 bootstrap samples.

(TIF)

S10 Fig. Cross-validated R2 for models predicting pooled community-level PCR prevalence among 0–5-year-olds at month 36 with survey month (time) modeled as a linear covariate or Gaussian process.

Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation. For predictions 36 months ahead, time could not be explicitly modeled as a linear covariate as all outcomes were measured at month 36.

(TIF)

Acknowledgments

We would like to thank the WUHA study participants and field team without whom this research would not be possible. We would also like to thank Abbott for its donation of the m2000 RealTime molecular diagnostics system and consumables. The findings and conclusions in this article are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention. Use of trade names is for identification only and does not imply endorsement by the Public Health Service or by the U.S. Department of Health and Human Services.

Software

All analysis was conducted in R Version 4.0.2 (“Taking Off Again”) [98]. The main R packages used for this analysis were automap (variograms) [99], rgee (Google Earth Engine) [100], glmnet (feature selection) [101], spaMM (regression with spatial Gaussian process) [102], sl3 (stacked ensemble) [103], and blockCV (spatial cross-validation) [104].

Data Availability

Community latitude and longitude values have been modified to protect the privacy of study participants. The pre-specified statistical analysis plan is available on Open Science Framework (https://osf.io/t48zb/). De-identified data and code to replicate this work are available at the following repository: https://doi.org/10.5281/zenodo.5851642.

Funding Statement

This work was supported by the National Institute of Allergy and Infectious Diseases (R03 AI147128 to BFA) and the National Eye Institute (U10 EY023939 to JDK). This work was also made possible in part by an Unrestricted Grant from Research to Prevent Blindness. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Taylor HR, Burton MJ, Haddad D, West S, Wright H. Trachoma. The Lancet. 2014 Dec 13;384(9960):2142–52.
  • 2. World Health Organization. WHO Alliance for the Global Elimination of Trachoma by 2020: progress report, 2019. World Health Organization; 2020 Jul p. 349–60. (Weekly epidemiological record). Report No.: 30.
  • 3. Sata E, Nute AW, Astale T, Gessese D, Ayele Z, Zerihun M, et al. Twelve-Year Longitudinal Trends in Trachoma Prevalence among Children Aged 1–9 Years in Amhara, Ethiopia, 2007–2019. Am J Trop Med Hyg. 2021 Apr;104(4):1278–89. doi: 10.4269/ajtmh.20-1365
  • 4. World Health Organization. Validation of elimination of trachoma as a public health problem [Internet]. Geneva: World Health Organization; 2016 [cited 2021 Apr 6]. Report No.: WHO/HTM/NTD/2016.8. Available from: https://apps.who.int/iris/bitstream/handle/10665/208901/WHO-HTM-NTD-2016.8-eng.pdf;jsessionid=4F29D347E33C2D21B3042ADE4F1EF91E?sequence=1
  • 5. Bailey R, Osmond C, Mabey DCW, Whittle HC, Ward ME. Analysis of the Household Distribution of Trachoma in a Gambian Village Using a Monte Carlo Simulation Procedure. Int J Epidemiol. 1989;18(4):944–51. doi: 10.1093/ije/18.4.944
  • 6. Broman AT, Shum K, Munoz B, Duncan DD, West SK. Spatial Clustering of Ocular Chlamydial Infection over Time following Treatment, among Households in a Village in Tanzania. Invest Ophthalmol Vis Sci. 2006 Jan 1;47(1):99–104. doi: 10.1167/iovs.05-0326
  • 7. Hägi M, Schémann J-F, Mauny F, Momo G, Sacko D, Traoré L, et al. Active Trachoma among Children in Mali: Clustering and Environmental Risk Factors. Gyapong JO, editor. PLoS Negl Trop Dis. 2010 Jan 19;4(1):e583. doi: 10.1371/journal.pntd.0000583
  • 8. Yohannan J, He B, Wang J, Greene G, Schein Y, Mkocha H, et al. Geospatial Distribution and Clustering of Chlamydia trachomatis in Communities Undergoing Mass Azithromycin Treatment. Invest Ophthalmol Vis Sci. 2014 Jul;55(7):4144–50. doi: 10.1167/iovs.14-14148
  • 9. Last A, Burr S, Alexander N, Harding-Esch E, Roberts CH, Nabicassa M, et al. Spatial clustering of high load ocular Chlamydia trachomatis infection in trachoma: a cross-sectional population-based study. Pathog Dis [Internet]. 2017 Jul [cited 2021 Apr 6];75(5). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5808645/ doi: 10.1093/femspd/ftx050
  • 10. Altherr FM, Nute AW, Zerihun M, Sata E, Stewart AEP, Gessese D, et al. Associations between Water, Sanitation and Hygiene (WASH) and trachoma clustering at aggregate spatial scales, Amhara, Ethiopia. Parasit Vectors. 2019 Nov 14;12(1):540. doi: 10.1186/s13071-019-3790-3
  • 11. Dowell SF, Blazes D, Desmond-Hellmann S. Four steps to precision public health. Nature. 2016 Dec;540(7632):189–91.
  • 12. O’Brien KS, Emerson P, Hooper P, Reingold AL, Dennis EG, Keenan JD, et al. Antimicrobial resistance following mass azithromycin distribution for trachoma: a systematic review. Lancet Infect Dis. 2019 Jan;19(1):e14–25. doi: 10.1016/S1473-3099(18)30444-4
  • 13. Gebresillasie S, Tadesse Z, Shiferaw A, Yu SN, Stoller NE, Zhou Z, et al. Inter-Rater Agreement between Trachoma Graders: Comparison of Grades Given in Field Conditions versus Grades from Photographic Review. Ophthalmic Epidemiol. 2015 May 4;22(3):162–9. doi: 10.3109/09286586.2015.1035792
  • 14. Goodhew EB, Priest JW, Moss DM, Zhong G, Munoz B, Mkocha H, et al. CT694 and pgp3 as Serological Tools for Monitoring Trachoma Programs. Vinetz JM, editor. PLoS Negl Trop Dis. 2012 Nov 1;6(11):e1873. doi: 10.1371/journal.pntd.0001873
  • 15. Goodhew EB, Morgan SMG, Switzer AJ, Munoz B, Dize L, Gaydos C, et al. Longitudinal analysis of antibody responses to trachoma antigens before and after mass drug administration. BMC Infect Dis. 2014 Dec;14(1):3154.
  • 16. Arnold BF, Scobie HM, Priest JW, Lammie PJ. Integrated Serologic Surveillance of Population Immunity and Disease Transmission. Emerg Infect Dis. 2018 Jul;24(7):1188–94. doi: 10.3201/eid2407.171928
  • 17. Liu F, Porco TC, Amza A, Kadri B, Nassirou B, West SK, et al. Short-term Forecasting of the Prevalence of Trachoma: Expert Opinion, Statistical Regression, versus Transmission Models. Ngondi JM, editor. PLoS Negl Trop Dis. 2015 Aug 24;9(8):e0004000. doi: 10.1371/journal.pntd.0004000
  • 18. Pinsent A, Liu F, Deiner M, Emerson P, Bhaktiari A, Porco TC, et al. Probabilistic forecasts of trachoma transmission at the district level: A statistical model comparison. Epidemics. 2017 Mar 1;18:48–55. doi: 10.1016/j.epidem.2017.01.007
  • 19. Wittberg DM, Aragie S, Tadesse W, Melo JS, Aiemjoy K, Chanyalew M, et al. WASH Upgrades for Health in Amhara (WUHA): study protocol for a cluster-randomised trial in Ethiopia. BMJ Open. 2021 Feb;11(2):e039529. doi: 10.1136/bmjopen-2020-039529
  • 20. Aragie S, Wittberg DM, Tadesse W, Dagnew A, Hailu D, Chernet A, et al. Water, sanitation, and hygiene for control of trachoma in Ethiopia (WUHA): a two-arm, parallel-group, cluster-randomised trial. Lancet Glob Health. 2022 Jan 1;10(1):e87–95. doi: 10.1016/S2214-109X(21)00409-5
  • 21. Thylefors B, Dawson CR, Jones BR, West SK, Taylor HR. A simple system for the assessment of trachoma and its complications. Bull World Health Organ. 1987;65(4):477–83.
  • 22. Møller JK, Pedersen LN, Persson K. Comparison of the Abbott RealTime CT New Formulation Assay with Two Other Commercial Assays for Detection of Wild-Type and New Variant Strains of Chlamydia trachomatis. J Clin Microbiol. 2010 Feb;48(2):440–3. doi: 10.1128/JCM.01446-09
  • 23. Cheng A, Qian Q, Kirby JE. Evaluation of the Abbott RealTime CT/NG Assay in Comparison to the Roche Cobas Amplicor CT/NG Assay. J Clin Microbiol. 2011 Apr;49(4):1294–300. doi: 10.1128/JCM.02595-10
  • 24. Ray KJ, Zhou Z, Cevallos V, Chin S, Enanoria W, Lui F, et al. Estimating Community Prevalence of Ocular Chlamydia trachomatis Infection using Pooled Polymerase Chain Reaction Testing. Ophthalmic Epidemiol. 2014 Apr;21(2):86–91. doi: 10.3109/09286586.2014.884600
  • 25. Woodhall SC, Gorwitz RJ, Migchelsen SJ, Gottlieb SL, Horner PJ, Geisler WM, et al. Advancing the public health applications of Chlamydia trachomatis serology. Lancet Infect Dis. 2018 Dec;18(12):e399–407. doi: 10.1016/S1473-3099(18)30159-2
  • 26. Migchelsen SJ, Martin DL, Southisombath K, Turyaguma P, Heggen A, Rubangakene PP, et al. Defining Seropositivity Thresholds for Use in Trachoma Elimination Studies. Johnson C, editor. PLoS Negl Trop Dis. 2017 Jan 18;11(1):e0005230. doi: 10.1371/journal.pntd.0005230
  • 27. Davison A, Hinkley D. 3.8 Hierarchical Data. In: Bootstrap Methods and their Application. Cambridge University Press; 1997. p. 100–1.
  • 28. Hayes RJ, Moulton LH. 10. Analysis Based on Cluster-Level Summaries. In: Cluster Randomised Trials. 2nd ed. Boca Raton: CRC Press; 2017. (Chapman & Hall / CRC Biostatistics).
  • 29. Central Statistics Agency (CSA), Regional Bureau of Finance and Economic Development (BoFED). Ethiopia—Subnational Administrative Divisions [Internet]. Ethiopia; 2020 [cited 2020 Nov 3]. Available from: https://data.humdata.org/dataset/ethiopia-cod-ab
  • 30. Diggle PJ, Ribiero J Jr. Model-Based Geostatistics. 1st ed. Springer Series in Statistics; 2007.
  • 31. Funk C, Peterson P, Landsfeld M, Pedreros D, Verdin J, Shukla S, et al. The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes. Sci Data. 2015 Dec;2(1):150066.
  • 32. Abatzoglou JT, Dobrowski SZ, Parks SA, Hegewisch KC. TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015. Sci Data. 2018 Dec;5(1):170191. doi: 10.1038/sdata.2017.191
  • 33. Didan K. MOD13Q1 MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V006 [Data set] [Internet]. NASA EOSDIS Land Processes DAAC; 2015. Available from: 10.5067/MODIS/MOD13Q1.006
  • 34. Jarvis A, Reuter H, Nelson A, Guevara E. Hole-filled SRTM for the globe Version 4, available from the CGIAR-CSI SRTM 90m [Internet]. 2008. Available from: http://srtm.csi.cgiar.org
  • 35. Pekel J-F, Cottam A, Gorelick N, Belward AS. High-resolution mapping of global surface water and its long-term changes. Nature. 2016 Dec;540(7633):418–22. doi: 10.1038/nature20584
  • 36. Tiecke TG, Liu X, Zhang A, Gros A, Li N, Yetman G, et al. Mapping the world population one building at a time. ArXiv171205839 Cs [Internet]. 2017 Dec 15 [cited 2020 Oct 1]; Available from: http://arxiv.org/abs/1712.05839
  • 37. OpenStreetMap contributors. Planet dump retrieved from https://planet.osm.org. 2017.
  • 38. Elvidge CD, Baugh K, Zhizhin M, Hsu FC, Ghosh T. VIIRS night-time lights. Int J Remote Sens. 2017 Nov 2;38(21):5860–79.
  • 39. Weiss DJ, Nelson A, Vargas-Ruiz CA, Gligorić K, Bavadekar S, Gabrilovich E, et al. Global maps of travel time to healthcare facilities. Nat Med [Internet]. 2020 Sep 28 [cited 2020 Nov 18]; Available from: http://www.nature.com/articles/s41591-020-1059-1 doi: 10.1038/s41591-020-1059-1
  • 40. Gorelick N, Hancher M, Dixon M, Ilyushchenko S, Thau D, Moore R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens Environ [Internet]. 2017; Available from: 10.1016/j.rse.2017.06.031
  • 41. Rasmussen CE, Williams CKI. Gaussian processes for machine learning. Cambridge, Mass: MIT Press; 2006. 248 p. (Adaptive computation and machine learning).
  • 42. Breiman L. Stacked regressions. Mach Learn. 1996 Jul;24(1):49–64.
  • 43. Wolpert DH. Stacked generalization. Neural Netw. 1992 Jan 1;5(2):241–59.
  • 44. van der Laan MJ, Polley EC, Hubbard AE. Super Learner. 2007 [cited 2020 Nov 25]; Available from: https://biostats.bepress.com/cgi/viewcontent.cgi?article=1226&context=ucbbiostat doi: 10.2202/1544-6115.1309
  • 45. Hastie T, Tibshirani R. Generalized Additive Models. Stat Sci. 1986;1(3):297–310.
  • 46. Breiman L. Random Forests [Internet]. [cited 2020 Sep 19]. Available from: https://link.springer.com/content/pdf/10.1023/A:1010933404324.pdf
  • 47. Friedman JH. Greedy Function Approximation: A Gradient Boosting Machine. Ann Stat. 2001;29(5):1189–232.
  • 48. Friedman JH. Multivariate Adaptive Regression Splines. Ann Stat. 1991;19(1):1–67.
  • 49. Roberts DR, Bahn V, Ciuti S, Boyce MS, Elith J, Guillera-Arroita G, et al. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography. 2017;40(8):913–29.
  • 50. Ploton P, Mortier F, Réjou-Méchain M, Barbier N, Picard N, Rossi V, et al. Spatial validation reveals poor predictive performance of large-scale ecological mapping models. Nat Commun. 2020 Dec;11(1):4540. doi: 10.1038/s41467-020-18321-y
  • 51. Kvålseth TO. Cautionary Note about R². Am Stat. 1985 Nov;39(4):279–85.
  • 52. Hubbard AE, Kherad-Pajouh S, van der Laan MJ. Statistical Inference for Data Adaptive Target Parameters. Int J Biostat. 2016 May 1;12(1):3–19. doi: 10.1515/ijb-2015-0013
  • 53. Benkeser D, Mertens A, Colford JM, Hubbard A, Arnold BF, Stein AD, et al. A machine learning-based approach for estimating and testing associations with multivariate outcomes. Int J Biostat [Internet]. 2020 Aug 20 [cited 2020 Oct 22];0(0). Available from: https://www.degruyter.com/view/journals/ijb/ahead-of-print/article-10.1515-ijb-2019-0061/article-10.1515-ijb-2019-0061.xml
  • 54. Hulley S, Cummings S, Browner W, Grady D, Newman T. Appendix 6C. In: Designing clinical research: an epidemiologic approach. 4th ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2013. p. 79.
  • 55. Keenan JD, Lakew T, Alemayehu W, Melese M, House JI, Acharya NR, et al. Slow resolution of clinically active trachoma following successful mass antibiotic treatments. Arch Ophthalmol Chic Ill 1960. 2011 Apr;129(4):512–3. doi: 10.1001/archophthalmol.2011.46
  • 56. Martin DL, Saboyà-Díaz MI, Abashawl A, Alemayeh W, Gwyn S, Hooper PJ, et al. The use of serology for trachoma surveillance: Current status and priorities for future investigation. Ngondi JM, editor. PLoS Negl Trop Dis. 2020 Sep 24;14(9):e0008316. doi: 10.1371/journal.pntd.0008316
  • 57. Basáñez M-G, McCarthy JS, French MD, Yang G-J, Walker M, Gambhir M, et al. A Research Agenda for Helminth Diseases of Humans: Modelling for Control and Elimination. PLoS Negl Trop Dis. 2012 Apr 24;6(4):e1548. doi: 10.1371/journal.pntd.0001548
  • 58. Pinsent A, Gambhir M. Improving our forecasts for trachoma elimination: What else do we need to know? Ngondi JM, editor. PLoS Negl Trop Dis. 2017 Feb 9;11(2):e0005378. doi: 10.1371/journal.pntd.0005378
  • 59. Shen Y, Sung M-H, King CH, Binder S, Kittur N, Whalen CC, et al. Modeling Approaches to Predicting Persistent Hotspots in SCORE Studies for Gaining Control of Schistosomiasis Mansoni in Kenya and Tanzania. J Infect Dis. 2020 Feb 18;221(5):796–803. doi: 10.1093/infdis/jiz529
  • 60. Nightingale ES, Chapman LAC, Srikantiah S, Subramanian S, Jambulingam P, Bracher J, et al. A spatio-temporal approach to short-term prediction of visceral leishmaniasis diagnoses in India. PLoS Negl Trop Dis. 2020 Jul 9;14(7):e0008422. doi: 10.1371/journal.pntd.0008422
  • 61.Nash SD, Astale T, Nute AW, Bethea D, Chernet A, Sata E, et al. Population-Based Prevalence of Chlamydia trachomatis Infection and Antibodies in four Districts with Varying Levels of Trachoma Endemicity in Amhara, Ethiopia. Am J Trop Med Hyg [Internet]. 2020. Oct 26 [cited 2020 Nov 7]; Available from: http://www.ajtmh.org/content/journals/10.4269/ajtmh.20-0777 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Cama A, Müller A, Taoaba R, Butcher RMR, Itibita I, Migchelsen SJ, et al. Prevalence of signs of trachoma, ocular Chlamydia trachomatis infection and antibodies to Pgp3 in residents of Kiritimati Island, Kiribati. French M, editor. PLoS Negl Trop Dis. 2017. Sep 12;11(9):e0005863. doi: 10.1371/journal.pntd.0005863 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Butcher R, Handley B, Garae M, Taoaba R, Pickering H, Bong A, et al. Ocular Chlamydia trachomatis infection, anti-Pgp3 antibodies and conjunctival scarring in Vanuatu and Tarawa, Kiribati before antibiotic treatment for trachoma. J Infect. 2020. Apr 1;80(4):454–61. doi: 10.1016/j.jinf.2020.01.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Kim JS, Oldenburg CE, Cooley G, Amza A, Kadri B, Nassirou B, et al. Community-level chlamydial serology for assessing trachoma elimination in trachoma-endemic Niger. PLoS Negl Trop Dis [Internet]. 2019. Jan 28 [cited 2021 Mar 4];13(1). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6366708/ doi: 10.1371/journal.pntd.0007127 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.West SK, Munoz B, Mkocha H, Gaydos CA, Quinn TC. The effect of Mass Drug Administration for trachoma on antibodies to Chlamydia trachomatis pgp3 in children. Sci Rep. 2020. Sep 16;10(1):15225. doi: 10.1038/s41598-020-71833-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Martin DL, Bid R, Sandi F, Goodhew EB, Massae PA, Lasway A, et al. Serology for Trachoma Surveillance after Cessation of Mass Drug Administration. Lietman TM, editor. PLoS Negl Trop Dis. 2015. Feb 25;9(2):e0003555. doi: 10.1371/journal.pntd.0003555 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.West SK, Munoz B, Weaver J, Mrango Z, Dize L, Gaydos C, et al. Can We Use Antibodies to Chlamydia trachomatis as a Surveillance Tool for National Trachoma Control Programs? Results from a District Survey. Ngondi JM, editor. PLoS Negl Trop Dis. 2016. Jan 15;10(1):e0004352. doi: 10.1371/journal.pntd.0004352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Migchelsen SJ, Sepúlveda N, Martin DL, Cooley G, Gwyn S, Pickering H, et al. Serology reflects a decline in the prevalence of trachoma in two regions of The Gambia. Sci Rep. 2017. Nov 8;7(1):15040. doi: 10.1038/s41598-017-15056-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. West SK, Zambrano AI, Sharma S, Mishra SK, Muñoz BE, Dize L, et al. Surveillance Surveys for Reemergent Trachoma in Formerly Endemic Districts in Nepal From 2 to 10 Years After Mass Drug Administration Cessation. JAMA Ophthalmol. 2017 Nov 1;135(11):1141. doi: 10.1001/jamaophthalmol.2017.3062
  • 70. Keenan JD, Lakew T, Alemayehu W, Melese M, Porco TC, Yi E, et al. Clinical Activity and Polymerase Chain Reaction Evidence of Chlamydial Infection after Repeated Mass Antibiotic Treatments for Trachoma. Am J Trop Med Hyg. 2010 Mar 1;82(3):482–7. doi: 10.4269/ajtmh.2010.09-0315
  • 71. Amza A, Kadri B, Nassirou B, Cotter SY, Stoller NE, West SK, et al. Community-level Association between Clinical Trachoma and Ocular Chlamydia Infection after MASS Azithromycin Distribution in a Mesoendemic Region of Niger. Ophthalmic Epidemiol. 2019 Jul 4;26(4):231–7. doi: 10.1080/09286586.2019.1597129
  • 72. Ramadhani AM, Derrick T, Macleod D, Holland MJ, Burton MJ. The Relationship between Active Trachoma and Ocular Chlamydia trachomatis Infection before and after Mass Antibiotic Treatment. PLoS Negl Trop Dis. 2016 Oct 26;10(10):e0005080. doi: 10.1371/journal.pntd.0005080
  • 73. Nash SD, Stewart AEP, Zerihun M, Sata E, Gessese D, Melak B, et al. Ocular Chlamydia trachomatis Infection Under the Surgery, Antibiotics, Facial Cleanliness, and Environmental Improvement Strategy in Amhara, Ethiopia, 2011–2015. Clin Infect Dis. 2018 Nov 28;67(12):1840–6. doi: 10.1093/cid/ciy377
  • 74. Odonkor M, Naufal F, Munoz B, Mkocha H, Kasubi M, Wolle M, et al. Serology, infection, and clinical trachoma as tools in prevalence surveys for re-emergence of trachoma in a formerly hyperendemic district. PLoS Negl Trop Dis. 2021 Apr 16;15(4):e0009343. doi: 10.1371/journal.pntd.0009343
  • 75. Clements ACA, Kur LW, Gatpan G, Ngondi JM, Emerson PM, Lado M, et al. Targeting Trachoma Control through Risk Mapping: The Example of Southern Sudan. PLoS Negl Trop Dis. 2010 Aug 17;4(8):e799. doi: 10.1371/journal.pntd.0000799
  • 76. Polack SR, Solomon AW, Alexander NDE, Massae PA, Safari S, Shao JF, et al. The household distribution of trachoma in a Tanzanian village: an application of GIS to the study of trachoma. Trans R Soc Trop Med Hyg. 2005 Mar 1;99(3):218–25. doi: 10.1016/j.trstmh.2004.06.010
  • 77. Diggle P, Lophaven S. Bayesian Geostatistical Design. Scand J Stat. 2006;33(1):53–64.
  • 78. Schémann J-F, Sacko D, Malvy D, Momo G, Traore L, Bore O, et al. Risk factors for trachoma in Mali. Int J Epidemiol. 2002;31(1):194–201. doi: 10.1093/ije/31.1.194
  • 79. Bero B, Macleod C, Alemayehu W, Gadisa S, Abajobir A, Adamu Y, et al. Prevalence of and Risk Factors for Trachoma in Oromia Regional State of Ethiopia: Results of 79 Population-Based Prevalence Surveys Conducted with the Global Trachoma Mapping Project. Ophthalmic Epidemiol. 2016 Nov;23(6):392–405. doi: 10.1080/09286586.2016.1243717
  • 80. Hsieh Y-H, Bobo LD, Quinn TC, West SK. Risk Factors for Trachoma: 6-Year Follow-up of Children Aged 1 and 2 Years. Am J Epidemiol. 2000 Aug 1;152(3):204–11. doi: 10.1093/aje/152.3.204
  • 81. Phiri I, Manangazira P, Macleod CK, Mduluza T, Dhobbie T, Chaora SG, et al. The Burden of and Risk Factors for Trachoma in Selected Districts of Zimbabwe: Results of 16 Population-Based Prevalence Surveys. Ophthalmic Epidemiol. 2018 Dec 28;25(sup1):181–91. doi: 10.1080/09286586.2017.1298823
  • 82. Alemayehu W, Melese M, Fredlander E, Worku A, Courtright P. Active trachoma in children in central Ethiopia: association with altitude. Trans R Soc Trop Med Hyg. 2005 Nov 1;99(11):840–3. doi: 10.1016/j.trstmh.2005.06.013
  • 83. Baggaley RF, Solomon AW, Kuper H, Polack S, Massae PA, Kelly J, et al. Distance to water source and altitude in relation to active trachoma in Rombo district, Tanzania. Trop Med Int Health TM IH. 2006 Feb 1;11(2):220–7. doi: 10.1111/j.1365-3156.2005.01553.x
  • 84. Ngondi J, Gebre T, Shargie EB, Graves PM, Ejigsemahu Y, Teferi T, et al. Risk factors for active trachoma in children and trichiasis in adults: a household survey in Amhara Regional State, Ethiopia. Trans R Soc Trop Med Hyg. 2008 May 1;102(5):432–8. doi: 10.1016/j.trstmh.2008.02.014
  • 85. Harding-Esch EM, Edwards T, Mkocha H, Munoz B, Holland MJ, Burr SE, et al. Trachoma Prevalence and Associated Risk Factors in The Gambia and Tanzania: Baseline Results of a Cluster Randomised Controlled Trial. PLoS Negl Trop Dis. 2010 Nov 2;4(11):e861. doi: 10.1371/journal.pntd.0000861
  • 86. Mesfin MM, de la Camera J, Tareke IG, Amanual G, Araya T, Kedir AM. A Community-Based Trachoma Survey: Prevalence and Risk Factors in the Tigray Region of Northern Ethiopia. Ophthalmic Epidemiol. 2006 Jan;13(3):173–81. doi: 10.1080/09286580600611427
  • 87. Mpyet C, Lass BD, Yahaya HB, Solomon AW. Prevalence of and Risk Factors for Trachoma in Kano State, Nigeria. PLOS ONE. 2012 Jul 6;7(7):e40421. doi: 10.1371/journal.pone.0040421
  • 88. Mpyet C, Goyol M, Ogoshi C. Personal and environmental risk factors for active trachoma in children in Yobe state, north-eastern Nigeria. Trop Med Int Health. 2010;15(2):168–72. doi: 10.1111/j.1365-3156.2009.02436.x
  • 89. Schémann J-F, Guinot C, Ilboudo L, Momo G, Ko B, Sanfo O, et al. Trachoma, flies and environmental factors in Burkina Faso. Trans R Soc Trop Med Hyg. 2003 Jan;97(1):63–8. doi: 10.1016/s0035-9203(03)90025-3
  • 90. Vinke C, Lonergan S. Social and environmental risk factors for trachoma: a mixed methods approach in the Kembata Zone of southern Ethiopia. Can J Dev Stud Can Détudes Dév. 2011 Sep;32(3):254–68.
  • 91. Edwards T, Harding-Esch EM, Hailu G, Andreason A, Mabey DC, Todd J, et al. Risk factors for active trachoma and Chlamydia trachomatis infection in rural Ethiopia after mass treatment with azithromycin. Trop Med Int Health. 2008;13(4):556–65. doi: 10.1111/j.1365-3156.2008.02034.x
  • 92. Abdou A, Nassirou B, Kadri B, Moussa F, Munoz BE, Opong E, et al. Prevalence and risk factors for trachoma and ocular Chlamydia trachomatis infection in Niger. Br J Ophthalmol. 2007 Jan 1;91(1):13–7. doi: 10.1136/bjo.2006.099507
  • 93. Last AR, Burr SE, Weiss HA, Harding-Esch EM, Cassama E, Nabicassa M, et al. Risk Factors for Active Trachoma and Ocular Chlamydia trachomatis Infection in Treatment-Naïve Trachoma-Hyperendemic Communities of the Bijagós Archipelago, Guinea Bissau. PLoS Negl Trop Dis. 2014 Jun 26;8(6):e2900. doi: 10.1371/journal.pntd.0002900
  • 94. Chen X, Nordhaus WD. VIIRS Nighttime Lights in the Estimation of Cross-Sectional and Time-Series GDP. Remote Sens. 2019 May 5;11(9):1057.
  • 95. Design parameters for population-based trachoma prevalence surveys [Internet]. Geneva: World Health Organization; 2018. Available from: https://apps.who.int/iris/bitstream/handle/10665/275523/WHO-HTM-NTD-PCT-2018.07-eng.pdf
  • 96. Solomon AW, Bella ALF, Negussu N, Willis R, Taylor HR. How much trachomatous trichiasis is there? A guide to calculating district-level estimates. Community Eye Health. 2019;31(104):S5–8.
  • 97. Amoah B, Fronterre C, Johnson O, Dejene M, Seife F, Negussu N, et al. Model-based geostatistics enables more precise estimates of neglected tropical-disease prevalence in elimination settings: mapping trachoma prevalence in Ethiopia. Int J Epidemiol [Internet]. 2021 Nov 13 [cited 2021 Nov 16];(dyab227). Available from: 10.1093/ije/dyab227
  • 98. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2020. Available from: https://www.R-project.org/
  • 99. Hiemstra PH, Pebesma EJ, Twenhofel CJW, Heuvelink GBM. Real-time automatic interpolation of ambient gamma dose rates from the Dutch Radioactivity Monitoring Network. Comput Geosci. 2008.
  • 100. Aybar C, Wu Q, Bautista L, Yali R, Barja A. rgee: An R package for interacting with Google Earth Engine. J Open Source Softw. 2020.
  • 101. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33(1):1–22.
  • 102. Rousset F, Ferdy J-B. Testing environmental and genetic effects in the presence of spatial autocorrelation. Ecography. 2014;37(8):781–90.
  • 103. Coyle JR, Hejazi NS, Malenica I, Sofrygin O. sl3: Modern Pipelines for Machine Learning and Super Learning [Internet]. 2021. Available from: 10.5281/zenodo.1342293
  • 104. Valavi R, Elith J, Lahoz-Monfort JJ, Guillera-Arroita G. blockCV: An r package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models. Methods Ecol Evol. 2019;10(2):225–32.
PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0010273.r001

Decision Letter 0

Dileepa Senajith Ediriweera, Ali M Somily

29 Nov 2021

Dear Tedijanto,

Thank you very much for submitting your manuscript "Predicting future ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ali M. Somily

Associate Editor

PLOS Neglected Tropical Diseases

Dileepa Ediriweera

Deputy Editor

PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analysis used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: Yes to all, though I do not profess deep expertise in geostatistical modelling

Reviewer #2: Ok

Reviewer #3: Methods: in general, the description of the methods section needs more detail to determine what was done.

Line 108: Only 40 communities in a zone were selected. How was the selection done? How many total communities exist in Wag Hemra zone? Please elaborate here. While spatial analyses can provide interpolation, their assumptions become more problematic if the data are too sparse.

Line 111: Please provide the dates of the trial here so that it can be placed in context with MDA.

Line 122: Please clarify here and at Line 257: were 30 participants chosen in TOTAL from each community, or 30 individuals in each age group, for a total of roughly 90 per community?

Figure One. A map of all the communities, with the ones enrolled in the study in red, should be created. A visual picture of the potential for clustering would be ideal. This would also help us understand the paucity of villages selected in the south and southeastern areas of the study area if in general that area, which appears mountainous, is sparsely settled.

Line 125: Does molecular refer to a test for infection? Molecular tests are relatively nonspecific (for example, they are used for antibody testing as well) so perhaps using a term with more specificity would be useful or define its use here.

Line 126: It is always tricky to use clinical trial data for other purposes, and especially to combine data across arms. The authors state there was no difference between intervention arms in the endpoint of community-level ocular Ct infection among 0-5 year-olds, but is this based on a reduced sample size (since the age group six years and older is not included)? Are the point estimates very close as well? And what about the 6-9 year-olds? I would suggest they adjust for intervention arm as a variable, as I am not yet convinced that combining the data across arms is justified.

Line 157: This reviewer is concerned that inferences on trachoma, infection, and serologic status are being made per community on the basis of 30 children, when it takes almost 2500 to have confidence in an estimate made in a district. For these correlations, how is the error of measurement in the underlying estimate of interest propagated into the correlation itself? And if it is not, why not?

Line 135: Please provide more information on the training of the graders. How many graders were involved, what was the training they received? Importantly, were the graders assigned to communities at random, or were they clustered in the woredas? If so, what was the agreement between them in grading?

Line 140: Were air swabs taken to control for contamination in the field/lab? If so, what were the findings? Please provide the criteria by which pools were selected for unpooling, including the “other characteristics” and how often this occurred.

Line 159: Apparently predictive correlations were done based just on the 0-5 year-olds, and this should be stated here.

Somewhere in the methods, please provide a sample size estimate or at least a power calculation given the number of children per community and 40 communities.
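The concern raised under Line 157 above, that community-level estimates built on 30 children carry sampling error into any correlation computed from them, can be illustrated with a small simulation. This is a minimal sketch and not the authors' analysis: the evenly spaced true prevalences, the function names, and the assumption that two markers share exactly the same true prevalence in each community are all hypothetical choices made for illustration. The 29% figure is the median month-36 infection prevalence reported in the abstract, and 30 and 2500 are the sample sizes named by the reviewer.

```python
import math
import random


def binomial_se(p, n):
    """Standard error of a prevalence estimated from n sampled children."""
    return math.sqrt(p * (1 - p) / n)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def observed_prevalence(p, n, rng):
    """Share of positives among n children drawn with true prevalence p."""
    return sum(rng.random() < p for _ in range(n)) / n


def simulate_observed_correlation(true_prev, n_children, seed=0):
    """Correlation between two noisy measurements of the same true prevalence.

    Assumption for illustration: both markers have identical true prevalence
    in every community, so the true correlation is exactly 1 and any
    shortfall in the observed value is due to sampling error alone.
    """
    rng = random.Random(seed)
    marker_a = [observed_prevalence(p, n_children, rng) for p in true_prev]
    marker_b = [observed_prevalence(p, n_children, rng) for p in true_prev]
    return pearson(marker_a, marker_b)


# Hypothetical true prevalences for 40 communities, evenly spaced from 5% to 60%.
true_prev = [0.05 + 0.55 * i / 39 for i in range(40)]

se_30 = binomial_se(0.29, 30)      # roughly 0.083
se_2500 = binomial_se(0.29, 2500)  # roughly 0.009
r_30 = simulate_observed_correlation(true_prev, 30)
r_2500 = simulate_observed_correlation(true_prev, 2500)

print(f"SE at p=0.29: n=30 -> {se_30:.3f}, n=2500 -> {se_2500:.3f}")
print(f"Observed correlation: n=30 -> {r_30:.2f}, n=2500 -> {r_2500:.2f}")
```

Under these assumptions the observed correlation with 30 children per community falls below the true value of 1, while with 2500 it stays close to it, which is the attenuation the reviewer is asking the authors to account for.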

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: Yes to all

Reviewer #2: Ok

Reviewer #3: Results:

Line 257. Please provide test of significance for the statement that seropositivity increased among 0-5 year-olds, because unless the test for trend is significant, I do not see a significant difference from baseline to 36 months.

Figure 2b: shouldn’t the title be Semivariogram, if the figure is the spatial dependence of the semivariance as stated?

Figure 3 and Figure 4. Perhaps I am reading this incorrectly, but it appears to me that clinical TF/TI at 24 months correlates better with PCR outcomes at 36 months than does serology or even PCR at 24 months. This is worth mentioning, because readers will notice it and want to have the authors' take on the finding. Although serology was more correlated in the concurrent measurements, if one is trying to determine who is at risk going forward, could clinical TF one year earlier provide clues? This is also worth mentioning in the discussion, along with the authors' thoughts on whether a correlation of 0.75 is strong enough to be useful.

Paragraph starting at line 349: I am concerned that in this setting of very high rates of infection, while high burden communities are truly in need of MDA, it does not mean the other 17 communities do not have worrisome infection. Please show the infection rates in the other 17 communities. By using this approach to identify high burden, what is the likelihood that we are missing communities for intervention that still have infection rates greater than 5%, for example? Would another approach that identifies the communities that have very low infection (efficient identification of low burden communities) be a better way to screen out communities?

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Yes to all

Reviewer #2: Ok

Reviewer #3: limitations section is needed, the public health relevance needs more discussion as outlined below

Line 466. Conclusions: This reviewer is concerned with a conclusion that serological markers may be well suited for community-level monitoring, without a fuller discussion of the practical implementation of the suggestions. At a minimum, a discussion is needed of the need to study the cost and resource balance of undertaking monitoring at community level versus precision targeting of antibiotics. At present, districts will need a sampling approach to monitor some communities at some time points, and at each time point will always be in the position of not including some communities, which has an unknown effect on outcomes. I recognize that this study was not a true district study and was not designed to test how to implement seroprevalence strategically in a district, but given the authors' conclusions, some attention to this issue is warranted.

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

Reviewer #3: (No Response)

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: Trachoma is an important public health problem. This is an important addition to the literature on it. It’s an elegant study and has been very well written-up. To provide some sense that I have done my job in providing peer review, I am forced to provide a list, below, of what amounts to extremely minor comments. But also: I am a pedant.

Great paper, though. I thoroughly enjoyed reading it.

The authors are quite right to use as their outcome variable the prevalence of conjunctival C. trachomatis infection rather than, say, the prevalence of TF. Although programmes currently use the prevalence of TF in 1-9-year-olds as a measure of success, this reflects the fact that when targets for trachoma programmes were being defined, assays for C. trachomatis infection were felt to be either too expensive, too insensitive or too unavailable to be used to monitor progress. Decisions on whether or not to apply antibiotic MDA should absolutely be based on the presence or absence of significant levels of transmission of the organism targeted by the antibiotic in question.

Line 81: please change “prevalence above 5%” to “prevalence ≥5%”

Line 122: please change “three age groups (0-5, 6-9, 10+)” to “three age groups (0-5 years, 6-9 years, 10+ years)”

Line 122: I think the authors mean “thirty individuals in each of three age groups…” Is that correct? If yes, suggest amend the sentence.

Line 124: suggest delete “old” since the word “aged” already appears before the numbers

Line 135: please change the “X” to a multiplication sign

Lines 136 and 137: on each line, please change the hyphens after “inflammation” to em-dashes, without flanking spaces, as recommended in the report of the 4th Global Scientific Meeting on Trachoma

Line 137-138: please change “An individual was considered positive for clinical trachoma if either TF or TI was detected” to “An individual was considered positive for active trachoma if either TF or TI was detected”

Line 216: suggest change one of the “also”s in this sentence to another word. For example, “We additionally explored…”

Lines 253-255: The abbreviations TF and TI have already been defined, and “clinical disease” (by which the authors mean [active trachoma]: please see note above) has, too. The definitions of the signs TF and TI have not yet been given. It might be better to add this material to the section that is currently at lines 136-137. Please note that to be significant, follicles contributing to a diagnosis of TF have to (a) be in the central part of the upper tarsal conjunctiva (not just “on the upper eyelid”) and (b) be at least 0.5mm in diameter; to be significant, the inflammatory thickening of TI has to be “pronounced” such that more than half of the normal deep tarsal vessels are not visible because they are obscured by inflammatory infiltration. I think it is worth adding these details.

Lines 255, 270, 297-298, 302, 308, 324-325, 375-376, 413, 417, 422, 436, 443-444, abstract line 36: please change “clinical disease” or “clinical trachoma” to “active trachoma”

Table 1: the word “clinical” in front of “TF/TI” in the column heads is redundant; please delete

Line 269: please change “active ocular infection” to just “ocular infection”, or even better, “conjunctival infection” (since only the conjunctiva was sampled). Adding the adjective “active” risks confusion between [active trachoma] and [infection], which is already too widespread. The word “active” should also be deleted from lines 378 and 383.

Line 296: can the authors please provide a little more numerical clarity to the expression “very likely”?

Line 312: suggest change “serology data was” to “serology data were”

Anthony Solomon

Reviewer #2: Well written and good focused study on Predicting future ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data.

Reviewer #3: Title. The title should make clear that this is prediction at community level. Trachoma programs focus on district level surveillance, using sampling to represent the district. Identifying communities within districts is another level of complexity altogether.

Abstract:

Line 31: Please indicate if the communities were originally chosen at random or not. If not, the ramifications of this need to be addressed in the discussion.

Line 33. The baseline findings need to be placed in context by indicating that MDA occurred 6 months prior to baseline. Infection would be much lower due to MDA, but disease burden would not yet have declined, as it does by month 12 in both age groups. I would suggest not devoting too much abstract space to baseline findings but rather focus on 24-month associations and then the concurrent associations.

Line 42: the conclusion that serologic markers may be a programmatic tool has not been shown by this study, unless the authors discuss or theorize how this might be implemented in a programmatic setting. The evaluation of this tool in this study was done in a sample of communities, but how it might be extended to manage detection in communities not sampled is unclear.

Introduction

Line 76: The authors nicely describe the evidence of trachoma clusters at village level, and high burden villages that are not apparent in overall EU estimates have been shown in previous studies. However, to argue that fine scale estimates of trachoma at community level could target limited resources is to ignore the fact that significant resources would be needed to undertake these estimates before targeting allocation of resources to communities with the highest burden. This issue needs to be addressed in the discussion, as a nuance to the rationale as stated here.

Discussion

The discussion section is balanced and well written. It would benefit from more discussion as suggested in the comments above. In addition, it would be helpful to discuss why the 6-9 year-olds may not have been as informative in a high prevalence setting (they already had high seropositivity to begin with, and the increase in infection was more modest in this age range). Arguments bolstering the use of younger ages would be enhanced by citing a paper by Odonkor et al (PLoS NTD 2021) where re-emergence was clearly evident from strategic serostatus determination in the youngest children, who should have had no exposure to infection if the program has lowered transmission.

A section on limitations would also be reasonable to add, particularly the issues with small numbers of communities.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Mehmet Sarier

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0010273.r003

Decision Letter 1

Dileepa Senajith Ediriweera, Ali M Somily

12 Feb 2022

Dear Tedijanto,

Thank you very much for submitting your manuscript "Predicting future community-level ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript.

Please note while forming your response that, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ali M. Somily

Associate Editor

PLOS Neglected Tropical Diseases

Dileepa Ediriweera

Deputy Editor

PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analyses used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: Yes, yes, yes, yes, yes; no concerns about ethical or regulatory requirements

Reviewer #2: none

Reviewer #3: Thank you for the detailed response; the changes have greatly improved the clarity. Please insert at Line 110 (of the marked-up manuscript) that the "communities were not selected at random. They were....". This makes it clear that these are not a random sample, so readers do not have to infer it.

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: Yes, yes, yes

Reviewer #2: none

Reviewer #3: Lines 322-325 of the marked-up manuscript. The addition of the cohort study is new and is not described in the methods. Cohort studies are quite complex, and previous work shows that the stability of antibody seropositivity is highly correlated with age, even among children under five years old. Unless there is a description (other than in the title of the figure) of how these children were selected at random, an analysis of the age distribution of the seropositives at baseline, how many were followed longitudinally, etc., these data should not be included, as they are difficult to interpret. They add little to the manuscript and should be deleted along with figure S4.

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Yes, yes, yes, yes

Reviewer #2: none

Reviewer #3: Very nice job with a full description of limitations

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: The response to reviewer 3's comment about line 157 ("This reviewer is concerned that inferences on trachoma, infection, and serologic status are being made per community on the basis of 30 children, when it takes almost 2500 to have confidence in an estimate made in a district. For these correlations, how is the error of measurement in the underlying estimate of interest propagated into the correlation itself? And if not, why not?") could perhaps have taken into account the fact that surveys supported by GTMP/Tropical Data have, primarily for logistical reasons, almost always used compact segment sampling as the mechanism for selecting households and individuals within selected villages. That has the very helpful secondary advantage of meaning that cluster-level disease proportions are true cluster-level prevalences, assuming low levels of absenteeism, refusal and misdiagnosis. This point could perhaps be included in the discussion if the authors wished to do so, since it means that the methods that they propose could have even more applicability in the general case than in the specific dataset employed here (since the latter used a different sampling strategy at village level).

Reviewer #2: none

Reviewer #3: As noted above in two comments.

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: I am satisfied with the changes that the authors have made to the paper in response to the first round of review.

Reviewer #2: Well written and well focused following revision.

Reviewer #3: Overall, excellent revision and response to reviewers. The addition of one sentence and the deletion of a few non-essential lines, as noted above, are my final comments.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

References

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0010273.r005

Decision Letter 2

Dileepa Senajith Ediriweera, Ali M Somily

23 Feb 2022

Dear Tedijanto,

We are pleased to inform you that your manuscript 'Predicting future community-level ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data' has been provisionally accepted for publication in PLOS Neglected Tropical Diseases.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Ali M. Somily

Associate Editor

PLOS Neglected Tropical Diseases

Dileepa Ediriweera

Deputy Editor

PLOS Neglected Tropical Diseases

***********************************************************

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0010273.r006

Acceptance letter

Dileepa Senajith Ediriweera, Ali M Somily

8 Mar 2022

Dear Tedijanto,

We are delighted to inform you that your manuscript, "Predicting future community-level ocular Chlamydia trachomatis infection prevalence using serological, clinical, molecular, and geospatial data," has been formally accepted for publication in PLOS Neglected Tropical Diseases.

We have now passed your article on to the PLOS Production Department, who will complete the rest of the publication process. All authors will receive a confirmation email upon publication.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or typesetting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Editorial, Viewpoint, Symposium, Review, etc.) are generated on a different schedule and may not be made available as quickly.

Soon after your final files are uploaded, the early version of your manuscript will be published online unless you opted out of this process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Shaden Kamhawi

co-Editor-in-Chief

PLOS Neglected Tropical Diseases

Paul Brindley

co-Editor-in-Chief

PLOS Neglected Tropical Diseases

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Description and sources of geospatial variables explored for prediction analysis.

    (DOCX)

    S2 Table. Number of children evaluated across 40 study communities by trachoma indicator, age group and study month.

    (DOCX)

    S3 Table. Community-level seroprevalence across 40 study communities by antigen, age group, and study month.

    (DOCX)

    S1 Fig

    Maps (A), variograms (B), and Moran’s I (C) for seroprevalence among 0–5-year-olds at each study month. Maps display prevalence for 40 study communities at each follow-up visit, spatially interpolated over the convex hull using kriging. Variograms capture similarity between community-level prevalence measurements as a function of distance between community pairs (in km), with smaller semivariance values representing increased similarity. Exponential (magenta) and Matérn (green) models were fit to each empirical variogram, and the effective range (dashed vertical line) is defined as the distance at which the fitted model reaches 95% of the sill. The Monte Carlo envelope (gray shading) displays pointwise 95% coverage of 1000 permutations, representing a null distribution. Moran’s I was calculated over 1000 permutations (gray bars, with observed value represented by red line), and a permutation-based p-value was calculated. The base map layer for panel A in this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

    (TIF)
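    The permutation-based Moran's I described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the binary distance-band weights, the `bandwidth_km` parameter, the one-sided p-value, and planar coordinates in kilometres are all assumptions made for the example.

    ```python
    import math
    import random


    def morans_i(values, coords_km, bandwidth_km):
        """Moran's I with binary weights: w_ij = 1 if communities i and j
        lie within bandwidth_km of each other (an assumed weighting scheme)."""
        n = len(values)
        mean = sum(values) / n
        dev = [v - mean for v in values]
        cross = 0.0   # sum of w_ij * dev_i * dev_j
        w_sum = 0.0   # sum of all weights
        for i in range(n):
            for j in range(n):
                if i != j and math.dist(coords_km[i], coords_km[j]) <= bandwidth_km:
                    cross += dev[i] * dev[j]
                    w_sum += 1.0
        return (n / w_sum) * (cross / sum(d * d for d in dev))


    def permutation_p_value(values, coords_km, bandwidth_km, n_perm=1000, seed=0):
        """Shuffle prevalence values across locations to build a null
        distribution; return the observed statistic and a one-sided p-value."""
        rng = random.Random(seed)
        observed = morans_i(values, coords_km, bandwidth_km)
        shuffled = list(values)
        n_extreme = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            if morans_i(shuffled, coords_km, bandwidth_km) >= observed:
                n_extreme += 1
        # Add one to numerator and denominator so the p-value is never zero.
        return observed, (n_extreme + 1) / (n_perm + 1)
    ```

    The sketch requires at least one pair of communities within the bandwidth and non-constant prevalence values; otherwise the statistic is undefined.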

    S2 Fig

    Maps (A), variograms (B), and Moran’s I (C) for active trachoma prevalence among 0–5-year-olds at each study month. Maps display prevalence for 40 study communities at each follow-up visit, spatially interpolated over the convex hull using kriging. Variograms capture similarity between community-level prevalence measurements as a function of distance between community pairs (in km), with smaller semivariance values representing increased similarity. Exponential (magenta) and Matérn (green) models were fit to each empirical variogram, and the effective range (dashed vertical line) is defined as the distance at which the fitted model reaches 95% of the sill. The Monte Carlo envelope (gray shading) displays pointwise 95% coverage of 1000 permutations, representing a null distribution. Moran’s I was calculated over 1000 permutations (gray bars, with observed value represented by red line), and a permutation-based p-value was calculated. The base map layer for panel A in this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

    (TIF)
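    The effective range quoted in the variogram captions has a closed form for the exponential model. The sketch below assumes the common parameterization gamma(h) = nugget + partial_sill * (1 - exp(-h / phi)) and assumes that "95% of the sill" refers to the partial sill; neither detail is stated in the caption, and the function names are hypothetical.

    ```python
    import math


    def exponential_variogram(h, nugget, partial_sill, phi):
        """Exponential variogram model under the assumed parameterization."""
        return nugget + partial_sill * (1.0 - math.exp(-h / phi))


    def effective_range_exponential(phi, fraction=0.95):
        """Distance at which the model reaches `fraction` of the partial sill.
        Solving 1 - exp(-h / phi) = fraction gives h = -phi * ln(1 - fraction),
        which is roughly 3 * phi when fraction = 0.95."""
        return -phi * math.log(1.0 - fraction)
    ```

    The Matérn model generally has no comparably simple expression for this distance, so its effective range is typically found numerically.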

    S3 Fig. Correlations between PCR prevalence and antigen-specific seroprevalence by age group and over time.

    Panels display Spearman rank correlations between community-level Pgp3 seroprevalence and PCR prevalence at months 0 and 36 (A), CT694 seroprevalence and PCR prevalence at months 0 and 36 (B), and PCR prevalence at month 36 and seroprevalence measured at each follow-up visit across 40 study communities (C). Correlations are shown separately for 0–5-year-olds (green) and 6–9-year-olds (purple) when possible, and 95% confidence intervals were estimated from 1000 bootstrap samples. Serology data was not collected for a random sample of 6–9-year-olds at months 12 and 24.

    (TIF)
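    A Spearman rank correlation with a bootstrap confidence interval, as described in this caption, might be computed along the following lines. The caption does not say which bootstrap interval was used, so the percentile interval here is an assumption, as are all function names; communities are the resampling unit.

    ```python
    import random


    def average_ranks(xs):
        """Rank values from 1..n, giving tied values their average rank."""
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        ranks = [0.0] * len(xs)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
                j += 1
            for k in range(i, j + 1):
                ranks[order[k]] = (i + j) / 2.0 + 1.0
            i = j + 1
        return ranks


    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        if sxx == 0 or syy == 0:
            return float("nan")  # undefined when one variable is constant
        return sxy / (sxx * syy) ** 0.5


    def spearman(xs, ys):
        """Spearman's rho is the Pearson correlation of the ranks."""
        return pearson(average_ranks(xs), average_ranks(ys))


    def bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
        """Percentile bootstrap CI, resampling communities with replacement."""
        rng = random.Random(seed)
        n = len(xs)
        stats = []
        for _ in range(n_boot):
            idx = [rng.randrange(n) for _ in range(n)]
            rho = spearman([xs[i] for i in idx], [ys[i] for i in idx])
            if rho == rho:  # drop degenerate resamples where rho is NaN
                stats.append(rho)
        stats.sort()
        lower = stats[int((alpha / 2) * len(stats))]
        upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
        return spearman(xs, ys), lower, upper
    ```

    Here `xs` and `ys` would hold community-level seroprevalence and PCR prevalence, one pair per community.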

    S4 Fig. Spatio-temporal distribution of LASSO-selected geospatial predictor variables.

    Variables were estimated for 240 grid cells of 2.5 x 2.5 arc minutes (approximately 20 km2 at the median latitude of the study area). Daily precipitation (A) and monthly night light radiance (B) averaged over the year were included in the final set of prediction models. The base map layer for this figure was downloaded from Stamen Maps (“Terrain”) and is available under the CC BY 3.0 license.

    (TIF)
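    As a rough check on the grid cell size quoted in this caption, the area of a 2.5 x 2.5 arc minute cell can be approximated on a spherical Earth. The caption does not give the median latitude of the study area, so the 12 degrees N used below is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

    ```python
    import math

    EARTH_RADIUS_KM = 6371.0  # mean radius under a spherical approximation


    def grid_cell_area_km2(cell_arcmin, latitude_deg):
        """Approximate area of a square lat/lon cell of the given angular size."""
        cell_rad = math.radians(cell_arcmin / 60.0)
        north_south_km = EARTH_RADIUS_KM * cell_rad
        # East-west extent shrinks with the cosine of latitude.
        east_west_km = north_south_km * math.cos(math.radians(latitude_deg))
        return north_south_km * east_west_km


    area = grid_cell_area_km2(2.5, 12.0)  # 12 degrees N is an assumed latitude
    ```

    Under these assumptions each side is a little over 4.5 km and the area comes out near 21 km2, in line with the "approximately 20 km2" stated above.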

    S5 Fig

    Cross-validated R2 for models predicting community-level PCR prevalence among 0–5-year-olds at month 0 (A), at month 12 (B), at month 24 (C), at month 36 (D), and pooled across all months (E). Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation. (D) is equivalent to Fig 4 in the main text and is included here for comparison.

    (TIF)
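    The spatial block cross-validation mentioned in this caption can be illustrated with a small sketch. It assumes planar coordinates in kilometres, square blocks aligned to a grid origin at (0, 0), and a one-predictor least-squares line as a stand-in for the prediction models; none of these details come from the paper, and the influence-function-based confidence interval is not reproduced here.

    ```python
    import random


    def assign_block_folds(coords_km, block_km=15.0, n_folds=10, seed=0):
        """Place each community in a square block, then deal whole blocks
        into folds so that nearby communities are held out together."""
        blocks = [(int(x // block_km), int(y // block_km)) for x, y in coords_km]
        unique = sorted(set(blocks))
        random.Random(seed).shuffle(unique)
        fold_of_block = {b: i % n_folds for i, b in enumerate(unique)}
        return [fold_of_block[b] for b in blocks]


    def fit_line(xs, ys):
        """Ordinary least squares for y = intercept + slope * x."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        return my - slope * mx, slope


    def cross_validated_r2(xs, ys, folds):
        """R2 = 1 - SSE / SST, using out-of-fold predictions only."""
        preds = [0.0] * len(xs)
        for f in sorted(set(folds)):
            train = [i for i, g in enumerate(folds) if g != f]
            intercept, slope = fit_line([xs[i] for i in train],
                                        [ys[i] for i in train])
            for i, g in enumerate(folds):
                if g == f:
                    preds[i] = intercept + slope * xs[i]
        mean_y = sum(ys) / len(ys)
        sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
        sst = sum((y - mean_y) ** 2 for y in ys)
        return 1.0 - sse / sst
    ```

    The sketch needs communities spread over at least two blocks so that every held-out fold leaves some training data. Because predictions are out of fold, this R2 can be negative when a model predicts worse than the overall mean.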

    S6 Fig. Cross-validated R2 for stacked ensemble models predicting community-level PCR prevalence at month 36 among 0–5-year-olds using various superlearner models.

    Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation.

    (TIF)

    S7 Fig

    Cross-validated R2 for models predicting community-level PCR prevalence among 0–5-year-olds at month 36 using random 10-fold cross-validation (A), 10-fold spatial cross validation with 5x5 km blocks (B), 15x15 km blocks (C), and 20x20 km blocks (D), and leave-one-out cross-validation (E). Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. (C) is equivalent to Fig 4 in the main text and is included here for comparison.

    (TIF)

    S8 Fig. Cumulative proportion of C. trachomatis infections at months 0 and 36 identified by concurrent prediction models.

    The black line in each facet represents the optimal ordering of scaled PCR infections at each respective month. Dashed lines indicate the point at which the cumulative proportion of infections, scaled to represent a sample of 30 individuals per community, surpassed 80%. To simulate a null distribution, we estimated the cumulative proportion of infections identified for 1000 random orderings of the 40 communities and plotted the 95% pointwise envelope (gray shading). At month 36, a model using only serology performed equally well to a model using all trachoma indicators, geospatial features, a Matérn covariance, and ensemble machine learning; vertical lines were offset slightly for visibility.

    (TIF)
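    The cumulative-proportion curves and random-ordering envelope described in this caption might be computed as sketched below. The function names are hypothetical, and the input `infections` is assumed to hold scaled infection counts per community (for example, PCR prevalence multiplied by 30); how the authors scaled and ranked communities in detail is not specified here.

    ```python
    import random


    def order_by_prediction(predicted):
        """Indices of communities from highest to lowest predicted prevalence."""
        return sorted(range(len(predicted)), key=lambda i: predicted[i],
                      reverse=True)


    def cumulative_proportion(infections, order):
        """Share of all infections captured after visiting communities in order."""
        total = sum(infections)
        running, curve = 0.0, []
        for i in order:
            running += infections[i]
            curve.append(running / total)
        return curve


    def communities_needed(curve, threshold=0.8):
        """Number of communities visited when the curve first surpasses threshold."""
        for k, p in enumerate(curve, start=1):
            if p > threshold:
                return k
        return len(curve)


    def random_envelope(infections, n_sim=1000, seed=0):
        """Pointwise 95% envelope of curves under random community orderings."""
        rng = random.Random(seed)
        order = list(range(len(infections)))
        curves = []
        for _ in range(n_sim):
            rng.shuffle(order)
            curves.append(cumulative_proportion(infections, order))
        lower, upper = [], []
        for k in range(len(infections)):
            column = sorted(c[k] for c in curves)
            lower.append(column[int(0.025 * n_sim)])
            upper.append(column[int(0.975 * n_sim) - 1])
        return lower, upper
    ```

    Passing the observed infections themselves to `order_by_prediction` gives the optimal ordering (the black line), while passing model predictions gives the curve for that model.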

    S9 Fig. Correlations between PCR prevalence and active trachoma by age group and over time.

    Panels display Spearman rank correlations between community-level TF prevalence and PCR prevalence at months 0 and 36 (A), TI prevalence and PCR prevalence at months 0 and 36 (B), and PCR prevalence at month 36 and active trachoma measured at each follow-up visit across 40 study communities (C). TF prevalence included any child diagnosed with TF, regardless of TI status, and vice versa. Correlations are shown separately for 0–5-year-olds (green) and 6–9-year-olds (purple), and 95% confidence intervals were estimated from 1000 bootstrap samples.

    (TIF)

    S10 Fig. Cross-validated R2 for models predicting pooled community-level PCR prevalence among 0–5-year-olds at month 36 with survey month (time) modeled as a linear covariate or Gaussian process.

    Cross-validated R2 (coefficient of determination), 95% influence-function-based confidence interval, and cross-validated root-mean-square error (RMSE, text label) are shown for each model specification. Blocks of size 15x15km were used for 10-fold spatial cross-validation. For predictions 36 months ahead, time could not be explicitly modeled as a linear covariate as all outcomes were measured at month 36.

    (TIF)

    Attachment

    Submitted filename: reviewer_responses.pdf

    Data Availability Statement

    Community latitude and longitude values have been modified to protect the privacy of study participants. The pre-specified statistical analysis plan is available on Open Science Framework (https://osf.io/t48zb/). De-identified data and code to replicate this work are available at the following repository: https://doi.org/10.5281/zenodo.5851642.


    Articles from PLoS Neglected Tropical Diseases are provided here courtesy of PLOS
