Abstract
Objective:
We identified the geographical clustering of HIV infections, and of women at highest risk of infection, using a decade of data (2002–2012) from KwaZulu-Natal, South Africa.
Methods:
A total of 5,776 women who enrolled in several HIV prevention trials were included in the study. Geo-coded individual-level data were linked to the community-level characteristics using the South African Census. High-risk women were identified using a risk scoring algorithm. Generalized additive models were used to identify the significant geographical clustering of high-risk women and HIV.
Results:
Overall, 60% of the women were classified as being at high risk of HIV. HIV infection rates were estimated to be as high as 10 to 15 per 100 person-years. Areas with a high prevalence of high-risk women and areas with high rates of HIV infection were spatially clustered and overlapped, particularly in the northern part of Durban.
Conclusion:
Strategies that target the multifactorial and complex nature of the epidemic are urgently needed to identify the “high transmission” areas.
Keywords: HIV, spatial clustering, generalized additive models, community-level characteristics, South Africa
1. Introduction
South Africa has the highest number of HIV-infected individuals in the world (UNAIDS, 2013). The epidemic is currently described as hyper-endemic due to its generalized and clustered nature (Tanser et al., 2009; Tanser et al., 2014; Cuadros et al., 2017). The prevalence of HIV infection is estimated to be as high as 40% among women who attended antenatal clinics in KwaZulu-Natal, one of the most populous provinces of South Africa (DOH, 2009). The elevated number of infections has frequently been linked to condomless sex, multiple/concurrent sexual partners and the presence of other sexually transmitted infections (STIs) (Laga et al., 1993; Leon et al., 2010). Studies have also reported strong links between HIV infection and a range of community-level characteristics, particularly those related to relationship factors, low socio-economic conditions, and social, environmental and structural factors (Ghani et al., 1997; Hunter, 2007; Jewkes et al., 2014). Regions with a high proportion of young women and unmarried adults can be considered a surrogate for partnership patterns and high levels of sexual mixing. We also considered other characteristics, such as high unemployment rates and low education levels, which may reflect poor socio-economic conditions in the region. It is also known that infections are not uniformly distributed, and geographical variations have been reported to be associated with the spread of the epidemic in Southern Africa (Williams and Gouws, 2001; Tanser et al., 2014; Cuadros et al., 2017). As the number of HIV-infected individuals continues to increase in the region, effective strategies are urgently needed to reduce the burden of infections among the most vulnerable populations.
In order to understand the complex and multifactorial nature of the epidemic, we conducted analyses by linking a range of individual-, community- and geographical-level data sources. We hypothesized that individual-level factors can identify those at high risk of infection with an acceptable level of accuracy and robustness due to their “direct” effect on HIV incidence, whereas community-level characteristics may shape the individual-level factors and may therefore have an “indirect” effect on HIV incidence; geographical-level analysis then identifies the areas where high-risk individuals live. We further hypothesized that geographical clustering of these populations may represent “high transmission” areas that also overlap with the areas where HIV infections are clustered. We specifically aimed to (1) examine the geographical patterns of HIV infection rates across the region and (2) identify the geographical clustering of women at high risk of HIV infection based on their individual-level risk factors, using a previously developed risk scoring algorithm (Wand et al., 2017).
The current study used generalized additive models (GAMs) (Wood, 2004; Wood, 2006) to analyse the non-linear structure of geographical locations (i.e. latitude, longitude) using a bivariate smooth function. We also created image plots by superimposing estimates from the GAMs onto geographical maps. These maps can help policy makers to prioritize prevention resources, using data from a region where the epidemic is severe. This is particularly crucial for targeting those at high risk of infection with effective prevention interventions such as pre-exposure prophylaxis (PrEP). The computer script was written in the publicly available statistical software system R (version 3.3.3) and is presented in the appendix.
Although a considerable literature confirms geographical variations of the epidemic in South Africa, to our knowledge this is one of the first studies to investigate the spatial locations of high-risk behaviours and HIV infections by linking various data sources to provide a snapshot of the epidemic. As HIV funding resources are declining, identifying and reaching the most-at-risk individuals will likely play a significant role in developing the most efficient and cost-effective interventions. This strategy will also guide policy makers in implementing tailored prevention strategies to allocate scarce funding resources, and it has the potential to change the trajectory of the epidemic.
2. Methods
2.1. Sample
A total of ~8,000 sexually active women aged 16 years or older consented to enrol in one of the four biomedical intervention trials during the period 2002–2012. Women were recruited from the surrounding areas (urban/peri-urban/rural) of Durban, which is the third most populous city in South Africa after Johannesburg and Cape Town. Durban is located in eThekwini Metropolitan Municipality, which is one of the 11 districts in KwaZulu-Natal Province (south of Swaziland and east of Lesotho) (Appendix 1). Study populations have been described elsewhere (McCormack et al., 2010; Microbicides Trials Network 2016; Padian et al., 2007; Skoler-Karpoff et al., 2008). Briefly, we analysed site-specific data from the areas surrounding Durban in KwaZulu-Natal, South Africa (2002–2012). The study population was primarily recruited from family planning, well-baby and general health clinics as well as through community meetings. We also routinely conducted recruitment activities at government agencies such as social welfare grant offices. The current analysis included site-specific data from 5,776 women whose residential addresses provided geographical information and could be linked to community-level characteristics. Further details of the main studies have been described elsewhere (McCormack et al., 2010; Marrazzo et al., 2015; Padian et al., 2007; Skoler-Karpoff et al., 2008). Participants were tested for HIV using two rapid tests on whole blood sourced from either finger-prick or venepuncture (Determine HIV-1/2, Abbott Laboratories, Tokyo, Japan and Oraquick, Orasure Technologies, Bethlehem, PA, USA). At each quarterly study visit, women were also tested for other curable sexually transmitted infections (STIs) (chlamydia, gonorrhoea and syphilis); those who tested positive were treated according to the local treatment guidelines.
2.2. Individual-level characteristics
Individual-level data were extracted from the combined trial populations; they were geo-located at participants’ residential addresses (or the nearest location point to the residence) using global positioning system (GPS) satellite coordinate data collected during enrolment. We used a previously developed risk scoring algorithm for HIV seroconversion to classify women as high-risk (Wand et al., 2017), which was based on the following seven items: (1) age: <20 (HIV risk score=11), 20–24 (HIV risk score=10), 25–29 (HIV risk score=6) and 30+ years old (HIV risk score=0); (2) marital/cohabitation status: married/cohabiting (HIV risk score=0) vs. single/not-cohabiting (HIV risk score=13); (3) number of sexual partners in past three months: <3 (HIV risk score=0) vs. 3+ (HIV risk score=15); (4) parity: <2 births (HIV risk score=10), 2 births (HIV risk score=8) and 3+ births (HIV risk score=0); (5) contraceptive use: injectables (HIV risk score=4) vs. pills/others (HIV risk score=0); (6) age at sexual debut: <16 years (HIV risk score=3) vs. 16+ years (HIV risk score=11); (7) presence of STIs: yes (HIV risk score=5) vs. no (HIV risk score=0) (Appendix 2).
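The seven-item scoring described above can be sketched as a small function. This is an illustrative sketch only, not the authors' code (their scripts are in R): the function and argument names are hypothetical, the item weights are copied exactly as printed in the text (including the weights for age at sexual debut), and the cut-point of 25 is the one the paper uses to define "high-risk".

```python
def hiv_risk_score(age, married_or_cohabiting, partners_past_3_months,
                   births, uses_injectables, age_at_sexual_debut, has_sti):
    """Sum the seven item weights listed in the text into a total score.

    Argument names are hypothetical; the weights are taken as printed.
    """
    score = 0
    # (1) Age group
    if age < 20:
        score += 11
    elif age <= 24:
        score += 10
    elif age <= 29:
        score += 6
    # (2) Single / not cohabiting
    if not married_or_cohabiting:
        score += 13
    # (3) Three or more sexual partners in the past three months
    if partners_past_3_months >= 3:
        score += 15
    # (4) Parity
    if births < 2:
        score += 10
    elif births == 2:
        score += 8
    # (5) Injectable contraceptives
    if uses_injectables:
        score += 4
    # (6) Age at sexual debut (weights exactly as printed in the text)
    score += 3 if age_at_sexual_debut < 16 else 11
    # (7) Presence of STIs
    if has_sti:
        score += 5
    return score


def is_high_risk(score, cut_point=25):
    """Scores at or above the cut-point are classified as high-risk."""
    return score >= cut_point


# Hypothetical participant: 22 years old, single, one partner, one birth,
# on injectables, sexual debut at 17, no STI.
example_score = hiv_risk_score(22, False, 1, 1, True, 17, False)
print(example_score, is_high_risk(example_score))
```

Keeping the scoring and the cut-point in separate functions makes it easy to re-evaluate other cut-points without touching the item weights.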
2.3. Community-level characteristics
Various community-level factors were extracted from the 2011 South African Census Statistics and linked to the individual-level data from the five biomedical prevention trials using geographical components (Census 2011 Statistics South Africa). For this analysis, we focussed on characteristics that can be indicators of low socio-economic conditions and other structural factors: “% female headed households (HH)”, “% population no personal income”, “% population with low schooling”, “% HH with no pipe/running water”, “% HH with no electricity” and “% population living in traditional and/or informal settlement types”. We also considered the following factors, which can be indicators of sexual network/partnership: “% population female aged 18–25 years” and “% adult population unmarried”. Finally, “average distance to a hospital” was included in our analysis as a surrogate for access to the health-care system.
2.4. Geocoding of participants
Residential addresses of the study participants were recorded and geocoded (i.e. latitude and longitude) at the screening of each trial. This information was grouped and forwarded to the Geographical Information System (GIS) Laboratory (Malaria Research Unit of the South African Medical Research Council, Durban). Women provided verbal consent for the collection of coordinates linked to their residential addresses. Participants’ confidentiality was protected by linking the geographical coordinate readings to identifying numbers rather than to names and addresses. Age-standardized HIV incidence rates were calculated using the combined individual-level data at each geographical unit.
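The text does not say which standardization method was used. As a minimal sketch, assuming direct standardization, the age-standardized rate for one geographical unit is a weighted sum of age-specific incidence rates. The function name, the age strata and the standard-population weights below are all hypothetical.

```python
def age_standardized_rate(events, person_years, weights):
    """Directly standardized incidence rate per 100 person-years.

    events, person_years: per-age-stratum counts for one geographical unit.
    weights: share of each age stratum in a chosen standard population
             (hypothetical here); must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("standard-population weights must sum to 1")
    return sum(w * 100.0 * e / py
               for e, py, w in zip(events, person_years, weights))


# Illustrative numbers only (three age strata in one unit).
rate = age_standardized_rate(events=[5, 8, 2],
                             person_years=[40, 100, 60],
                             weights=[0.3, 0.4, 0.3])
print(round(rate, 2))  # stratum rates 12.5, 8.0 and 3.33 per 100 PY, weighted
```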
2.5. Statistical analysis
In this study, we identified a cohort of South African women at high risk of HIV infection based on a previously developed risk scoring algorithm (Wand et al., 2017) (Appendix 2). Details of the risk scoring algorithm have been described previously (Wand et al., 2017). Briefly, data from approximately 9,000 sexually active women who resided in the surrounding areas (urban/peri-urban/rural) of Durban were used to develop the algorithm. A split-sample methodology was employed by randomly allocating 67% and 33% of the study population to “development” and “validation” datasets, respectively. Cox regression models were used to identify the independent predictors of HIV infection. Backward selection was employed to produce the final models. A weighted scoring algorithm was constructed from the logarithms of the hazard ratios (i.e. the regression coefficients), scaled to integer weights. The final integer weights for each risk factor were summed to create a subject-specific risk score. Diagnostic characteristics of various cut-points were assessed using standard statistical measures such as sensitivity, specificity and area under the curve (AUC). The validity and robustness of the algorithm were further assessed using an external dataset from the Africa Centre Demographic Information System. This external validation dataset included more than 4,000 women who were considered to be a nationally representative sample (Tanser et al., 2008). Women with total scores of 25 or more were classified as “high-risk” of HIV. This cut-point has been shown to perform well, with acceptable statistical accuracy (sensitivity = 83% and specificity = 33%) and robustness in internal and external validation datasets (Wand et al., 2017). We predicted the cumulative two-year probabilities of HIV seroconversion for study participants with a risk score of 25 or more versus <25.
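To make the cut-point assessment concrete, the sketch below computes sensitivity and specificity for a given cut-point from a list of risk scores and observed seroconversion outcomes. The function name and the toy data are hypothetical and are not drawn from the study.

```python
def sensitivity_specificity(scores, seroconverted, cut_point=25):
    """Return (sensitivity, specificity) of the rule `score >= cut_point`."""
    tp = fn = tn = fp = 0
    for score, event in zip(scores, seroconverted):
        flagged = score >= cut_point
        if event and flagged:
            tp += 1          # seroconverter correctly flagged as high-risk
        elif event:
            fn += 1          # seroconverter missed by the rule
        elif flagged:
            fp += 1          # non-seroconverter flagged as high-risk
        else:
            tn += 1          # non-seroconverter correctly not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only.
toy_scores = [48, 30, 12, 27, 3, 40, 18, 26]
toy_events = [True, True, False, False, False, True, True, False]
print(sensitivity_specificity(toy_scores, toy_events))
```

A low cut-point flags more women, which raises sensitivity at the cost of specificity; this is the trade-off behind the reported 83% sensitivity and 33% specificity at a score of 25.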
2.6. Analysis using Community-level characteristics
Associations between the community-level characteristics and high-risk women were investigated using logistic regression models. Odds ratios (ORs) and their 95% confidence intervals (CIs) are presented. These characteristics were also examined for their associations with HIV seroconversion using Cox proportional hazards regression models. Hazard ratios (HRs) and 95% CIs are presented. For this analysis, each community-level factor was split into tertiles, i.e. 1st (lowest), 2nd (medium) and 3rd (highest), based on its observed empirical distribution.
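The tertile split and an unadjusted odds ratio can be sketched as follows. This is a simplified illustration, not the authors' code: for a single categorical predictor, the odds ratio from a logistic regression equals the odds ratio of the corresponding 2×2 table, so a Wald interval on the log scale is used here in place of a fitted model. Function names and counts are hypothetical.

```python
import math
import statistics


def tertile_labels(values):
    """Label each value as lowest/medium/highest using empirical tertiles."""
    low_cut, high_cut = statistics.quantiles(values, n=3)
    labels = []
    for v in values:
        if v <= low_cut:
            labels.append("lowest")
        elif v <= high_cut:
            labels.append("medium")
        else:
            labels.append("highest")
    return labels


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       ref_cases, ref_noncases, z=1.96):
    """Odds ratio versus the reference tertile with a Wald 95% CI."""
    odds_ratio = (exposed_cases * ref_noncases) / (exposed_noncases * ref_cases)
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_noncases
                          + 1 / ref_cases + 1 / ref_noncases)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: high-risk vs. not high-risk women in the highest
# tertile of a community-level factor, compared with the lowest tertile.
print(odds_ratio_with_ci(60, 40, 45, 55))
```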
2.7. Population level impacts of community-level factors
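We estimated the population-level impacts of the community-level factors on high-risk status and HIV seroconversion using the population attributable risk (PAR%). For this analysis, we used a methodology that can handle the complex nature of risk factors at multiple levels. We were therefore able to estimate the overall impact of a given community-level factor (i.e. the overall PAR%) as well as its level-specific (i.e. lowest, medium and highest) impacts on the primary outcomes of interest (Wand and Ramjee, 2011). Briefly, PAR% and their 95% CIs were estimated by combining the distribution of a community characteristic in the population (i.e. prevalence) and its strength of association with the outcome of interest (i.e. ORs or HRs for high-risk women and HIV seroconversions, respectively). In general, the method can account for the correlated nature of the factors in a multivariable model. For a binary risk factor:

$$\mathrm{PAR} = \frac{\sum_{s=1}^{2} p_s(\mathrm{HR}_s - 1)}{1 + \sum_{s=1}^{2} p_s(\mathrm{HR}_s - 1)} = 1 - \frac{1}{\sum_{s=1}^{2} p_s\,\mathrm{HR}_s} \tag{1}$$

where $\mathrm{HR}_s$ is the hazard ratio, $p_s$ is the prevalence of the risk factor in the population and $s$ indexes the two strata determined by the value of the risk factor. In a multi-factorial setting, i.e. when there is more than one risk factor with multiple categories:

$$\mathrm{PAR} = \frac{\sum_{s=1}^{S} p_s(\mathrm{HR}_s - 1)}{1 + \sum_{s=1}^{S} p_s(\mathrm{HR}_s - 1)} = 1 - \frac{1}{\sum_{s=1}^{S} p_s\,\mathrm{HR}_s} \tag{2}$$

where $\mathrm{HR}_s$ and $p_s$, $s = 1, \ldots, S$, are the hazard ratios and the prevalences in the target population for the $s$th combination of the risk factors. When background risk factors are also present, the partial PAR is:

$$\mathrm{PAR}_p = \frac{\sum_{s=1}^{S}\sum_{t=1}^{T} p_{st}\,\mathrm{HR}_{1s}\,\mathrm{HR}_{2t} - \sum_{s=1}^{S}\sum_{t=1}^{T} p_{st}\,\mathrm{HR}_{2t}}{\sum_{s=1}^{S}\sum_{t=1}^{T} p_{st}\,\mathrm{HR}_{1s}\,\mathrm{HR}_{2t}} = 1 - \frac{\sum_{t=1}^{T} p_{\cdot t}\,\mathrm{HR}_{2t}}{\sum_{s=1}^{S}\sum_{t=1}^{T} p_{st}\,\mathrm{HR}_{1s}\,\mathrm{HR}_{2t}} \tag{3}$$

where $t$ denotes a stratum of unique combinations of levels of all background risk factors which are not modifiable and/or not under study, $\mathrm{HR}_{2t}$ is the hazard ratio in combination $t$ relative to the lowest risk level (so that $\mathrm{HR}_{2,1} = 1$), $\mathrm{HR}_{1s}$ is the hazard ratio for the $s$th combination of the risk factors under study, $p_{st}$ is the joint prevalence of combinations $s$ and $t$, and $p_{\cdot t} = \sum_{s} p_{st}$. The PAR can be interpreted as the proportion of the HIV seroconversions associated with the factors in the model in the target population. All of these analyses were presented as univariate due to the collinearity of the community-level factors. This part of the analysis was therefore exploratory in nature rather than an investigation of causality.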
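As a worked illustration of the multi-category PAR, written here as PAR = 1 − 1/Σ p_s·HR_s over the levels s of a single factor, the sketch below computes the overall PAR% for a three-level factor. The function names and the prevalences and hazard ratios are hypothetical; the level-specific split shown is one possible decomposition of that formula and is not necessarily the one used by the authors, and no confidence intervals are computed.

```python
def par_percent(prevalences, hazard_ratios):
    """Overall PAR% = 100 * (1 - 1 / sum_s p_s * HR_s).

    prevalences must sum to 1; the reference level has a hazard ratio of 1.
    """
    if abs(sum(prevalences) - 1.0) > 1e-9:
        raise ValueError("prevalences must sum to 1")
    total = sum(p * hr for p, hr in zip(prevalences, hazard_ratios))
    return 100.0 * (1.0 - 1.0 / total)


def level_contributions(prevalences, hazard_ratios):
    """Split the overall PAR% into per-level parts p_s*(HR_s - 1) / sum.

    Assumption: an additive decomposition chosen for illustration.
    """
    total = sum(p * hr for p, hr in zip(prevalences, hazard_ratios))
    return [100.0 * p * (hr - 1.0) / total
            for p, hr in zip(prevalences, hazard_ratios)]


# Illustrative tertiles: lowest (reference), medium, highest.
p = [0.5, 0.3, 0.2]
hr = [1.0, 1.5, 2.0]
print(round(par_percent(p, hr), 1))
print([round(c, 1) for c in level_contributions(p, hr)])
```

By construction, the per-level contributions add up to the overall PAR%, and a factor whose hazard ratios all equal 1 has a PAR% of zero.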
2.8. Generalized additive models:
In this part of the analysis, generalized additive models (GAMs) were used to assess the spatial patterns and clustering of the high-risk areas and HIV infection rates among women who enrolled in several biomedical prevention trials. GAMs are an extension of generalized linear models that can handle complex data with minimal assumptions (Wood, 2004; Wood, 2006). These statistical techniques can handle non-linear geographical structures in the data and can produce image plots by superimposing the mean of a response, modelled as a bivariate function of the geographical components (longitude, latitude), onto maps. The following models were used to analyze the data:

$$y_i = f(\mathrm{lon}_i, \mathrm{lat}_i) + \varepsilon_i$$

$$\mathrm{logit}(p_i) = f(\mathrm{lon}_i, \mathrm{lat}_i)$$

where $f$ is a smooth bivariate function and $\varepsilon_i$ is the error term for measurement $i$, with zero mean and variance $\sigma^2$; $y_i$ corresponds to a continuous outcome measurement for individual $i$, while $p_i$ corresponds to the probability of a binary outcome of interest for individual $i$. Image plots were also produced by superimposing the estimates from the GAMs on the maps using the publicly available R (version 3.3.3) package “mgcv” (Mixed GAM Computation Vehicle) (http://cran.r-project.org/) (Wood, 2004; Wood, 2006).
Besides visual presentations such as “intensity maps”, results from mgcv also provided formal quantitative assessments of geographical variation and clustering based on model-specific estimated degrees of freedom (edf), where higher edf with p<0.05 was interpreted as significant geographical variation/clustering. The goodness of fit of each model was assessed by the proportion of deviance explained.
The computer scripts showing how to use “mgcv” for our analysis have also been written using the publicly available statistical software system R (version 3.3.3). All documentation for this part of the paper is presented in Appendix 3.
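The authors' analysis uses penalized smooths fitted with mgcv in R, and those scripts are in Appendix 3. Purely to illustrate the idea of estimating a smooth surface over longitude and latitude and evaluating it at map locations, the sketch below uses a much simpler Nadaraya-Watson kernel smoother. It is a stand-in, not a GAM: it has no penalty, no edf and no significance test. The coordinates, outcomes and bandwidth are hypothetical, and distances are computed naively in degrees.

```python
import math


def kernel_smooth(points, values, lon, lat, bandwidth=0.05):
    """Gaussian-kernel weighted mean of `values` at location (lon, lat).

    points: list of (lon, lat) pairs; values: outcome per point
    (e.g. 1 = high-risk, 0 = not). Bandwidth is in degrees (assumption).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (p_lon, p_lat), value in zip(points, values):
        squared_distance = (p_lon - lon) ** 2 + (p_lat - lat) ** 2
        weight = math.exp(-squared_distance / (2 * bandwidth ** 2))
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical clusters of participants, north and south of a city.
locations = [(31.00, -29.70), (31.01, -29.71), (31.02, -29.69),
             (30.90, -29.90), (30.91, -29.91), (30.89, -29.89)]
high_risk = [1, 1, 0, 0, 0, 1]

north = kernel_smooth(locations, high_risk, 31.01, -29.70)
south = kernel_smooth(locations, high_risk, 30.90, -29.90)
print(north > south)
```

Evaluating such a function over a regular grid of coordinates yields the kind of surface that an image plot overlays on a map.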
2.9. Ethical statement
All protocols and informed consent forms were approved by the Biomedical Research Ethics Committee at the University of KwaZulu-Natal in Durban.
3. Results
Approximately 8,000 women from 43 geographical locations enrolled in various biomedical intervention trials; 5,776 of them had the necessary geographical information to be linked to the community-level factors and were included in the current analysis. The crude HIV incidence was 7.0 (95% CI: 6.6, 7.7) per 100 person-years. Characteristics of the study population are presented in Table 1. Approximately 43% of the study population was younger than 25 years of age. More than 80% of the women were categorized as single or not cohabiting with their sex partners, and 11% had at least 3 sexual partners in the past three months. The majority of the women (>60%) reported their first sexual encounter at 16 years of age or older; 39% of the study population was nulliparous or primiparous, while 49% were using injectables as a family planning method at baseline. The prevalence of sexually transmitted infections (i.e. chlamydia, gonorrhoea or syphilis) was 19% at baseline.
Table 1:
Baseline characteristics of the study population (N=5,776)
| Baseline characteristics | N (%) |
|---|---|
| Age | |
| <20 years | 592 (10%) |
| 20–24 years | 1921 (33%) |
| 25–29 years | 1160 (20%) |
| 30+ years | 2103 (36%) |
| Age at sexual debut | |
| <16 years old | 1827 (32%) |
| 16+ years old | 3949 (68%) |
| Married/cohabitating | |
| Yes | 866 (15%) |
| No | 4910 (85%) |
| Parity | |
| None/1 child only | 2254 (39%) |
| 2 children | 1864 (32%) |
| 3+ children | 1628 (29%) |
| Number of sex partners | |
| <3 partners | 5135 (89%) |
| 3+ partners | 641 (11%) |
| Contraceptive use | |
| Others | 2954 (51%) |
| Injectables | 2822 (49%) |
| STIs§ diagnosis | |
| No | 4683 (81%) |
| Yes | 1093 (19%) |
§ Chlamydia, gonorrhoea or syphilis
3.1. Impact of Community-level characteristics on high risk behaviours and HIV infections:
The associations between the community-level characteristics and women at highest risk for HIV were examined using logistic regression models (Table 2). These characteristics were further investigated for their associations with HIV seroconversion using Cox regression models. Women who resided in communities where more than 8% of the population was estimated to be young females (18–25 years old) were more likely to have a risk score of 25+ (OR: 1.24, 95% CI: 1.08, 1.44 and OR: 1.36, 95% CI: 1.18, 1.56) and were at higher risk for HIV infection (HR: 1.20, 95% CI: 0.96, 1.50 and HR: 1.68, 95% CI: 1.26, 2.24), although the latter association was statistically significant only for the highest tertile. This community-level characteristic showed a similar impact on both endpoints, with PAR%: 17%, 95% CI: 13%, 19% (for high-risk women) and PAR%: 18%, 95% CI: 13%, 24% (for HIV seroconversion). Areas with higher proportions of female-headed households (40%–45%, >45% vs. <40%) were also correlated with women who had multiple risk factors (OR: 1.39, 95% CI: 1.20, 1.60 and OR: 1.29, 95% CI: 1.12, 1.48) and with an increased risk of HIV infection (HR: 1.50, 95% CI: 1.18, 1.90 and HR: 1.30, 95% CI: 1.02, 1.66). At the population level, 16% of the high-risk women and 25% of the HIV infections were attributed to this specific community-level characteristic. Women who lived in areas where a high proportion of the population (>70%) was reported to be unmarried/not cohabiting were more likely to have a higher risk score (OR: 1.28, 95% CI: 1.12, 1.47, p<0.001) and were also at increased risk of HIV seroconversion (HR: 1.20, 95% CI: 1.03, 1.40, p=0.019). Overall, 10% of the “high-risk” women and HIV infections were associated with the areas where a higher proportion of unmarried adults lived (>70%). Among the socioeconomic characteristics, study participants who lived in communities with higher unemployment rates (70%–80%, >80% vs. <70%) and low levels of schooling (30–32% vs. <30%) were also significantly more likely to have the highest number of risk factors and to be at risk for HIV infection. The population-level impacts of these community-level socio-economic factors ranged from 16% to 22%.
Table 2:
Distribution of community-level factors and their associations with high-risk women and HIV seroconversion and their population-level impacts (PAR%)ǂ;
| Community-level characteristic | % | Risk score 25+ (vs. <25): odds ratio (95% CI) | p-value | PAR% (95% CI) | HIV seroconversion: hazard ratio (95% CI) | p-value | PAR% (95% CI) |
|---|---|---|---|---|---|---|---|
| % population female 18–25 | | | | 17% (13%, 20%) | | | 18% (13%, 24%) |
| Lowest (<8%) | 42 | 1 | | | 1 | | - |
| Medium (8%–10%) | 25 | 1.47 (1.22, 1.76) | <0.001 | 7% (5%, 9%) | 1.20 (0.96, 1.50) | 0.104 | 7% (6%, 8%) |
| Highest (>10%) | 32 | 1.55 (1.30, 1.84) | <0.001 | 10% (8%, 12%) | 1.68 (1.26, 2.24) | <0.001 | 11% (8%, 15%) |
| % population female headed HH | | | | 16% (14%, 20%) | | | 25% (18%, 31%) |
| Lowest (<40%) | 21 | 1 | | | 1 | | - |
| Medium (40%–45%) | 40 | 1.39 (1.20, 1.60) | <0.001 | 8% (6%, 9%) | 1.50 (1.18, 1.90) | 0.001 | 15% (12%, 18%) |
| Highest (>45%) | 39 | 1.29 (1.12, 1.48) | <0.001 | 8% (7%, 10%) | 1.30 (1.02, 1.66) | 0.011 | 10% (7%, 12%) |
| % adults unmarried | | | | 10% (8%, 11%) | | | 13% (4%, 21%) |
| Lowest (<65%) | 38 | 1 | | | 1 | | - |
| Medium (65%–70%) | 30 | 0.93 (0.80, 1.08) | 0.338 | - | 1.01 (0.78, 1.31) | 0.953 | 3% (0%, 9%) |
| Highest (>70%) | 32 | 1.28 (1.12, 1.47) | <0.001 | 10% (8%, 11%) | 1.20 (1.03, 1.40) | 0.019 | 10% (5%, 15%) |
| % population with no personal income | | | | 16% (13%, 19%) | | | 17% (13%, 22%) |
| Lowest (<70%) | 34 | 1 | | | 1 | | - |
| Medium (70%–80%) | 33 | 1.20 (1.03, 1.38) | 0.017 | 5% (4%, 7%) | 1.29 (1.04, 1.61) | 0.020 | 8% (6%, 10%) |
| Highest (>80%) | 33 | 1.39 (1.21, 1.61) | <0.001 | 11% (9%, 13%) | 1.34 (1.08, 1.66) | 0.008 | 9% (7%, 12%) |
| % population < secondary school | | | | 18% (15%, 21%) | | | 22% (18%, 27%) |
| Lowest (<30%) | 37 | 1 | | | 1 | | - |
| Medium (30%–32%) | 31 | 1.50 (1.30, 1.73) | <0.001 | 13% (11%, 15%) | 1.52 (1.23, 1.88) | <0.001 | 12% (11%, 15%) |
| Highest (>32%) | 32 | 1.20 (1.04, 1.39) | 0.015 | 5% (4%, 6%) | 1.40 (1.12, 1.74) | 0.003 | 10% (8%, 12%) |
| Distance to the nearest hospital | | | | 32% (28%, 36%) | | | 7% (4%, 10%) |
| Lowest (<5 km) | 33 | 1 | | | 1 | | - |
| Medium (5–9 km) | 35 | 1.76 (1.49, 2.10) | <0.001 | 14% (12%, 16%) | 1.23 (0.92, 1.63) | 0.160 | 1% (0%, 3%) |
| Highest (>9 km) | 32 | 1.86 (1.55, 2.24) | <0.001 | 18% (16%, 20%) | 1.37 (1.04, 1.80) | 0.026 | 6% (4%, 8%) |
| % informal/traditional living | | | | 20% (17%, 23%) | | | -- (*) |
| Lowest (<25%) | 41 | 1 | | | 1 | | - |
| Medium (25%–55%) | 34 | 1.33 (1.16, 1.53) | <0.001 | 9% (7%, 10%) | 0.85 (0.69, 1.05) | 0.124 | -- (*) |
| Highest (>55%) | 25 | 1.55 (1.34, 1.79) | <0.001 | 11% (9%, 13%) | 1.15 (0.84, 1.34) | 0.229 | -- (*) |
| % HH with no pipes | | | | 14% (11%, 16%) | | | 9% (5%, 14%) |
| Lowest (<20%) | 50 | 1 | | | 1 | | |
| Medium (20%–40%) | 20 | 1.43 (1.19, 1.72) | <0.001 | 8% (6%, 10%) | 1.42 (1.05, 1.91) | 0.024 | 6% (3%, 6%) |
| Highest (>40%) | 30 | 1.28 (1.08, 1.52) | 0.005 | 6% (5%, 7%) | 1.07 (0.85, 1.34) | 0.558 | 3% (2%, 8%) |
| % HH with no electricity | | | | 17% (14%, 19%) | | | -- (*) |
| Lowest (<2%) | 30 | 1 | | | 1 | | |
| Medium (2%–5%) | 43 | 1.42 (1.20, 1.68) | <0.001 | 7% (6%, 8%) | 0.83 (0.64, 1.08) | 0.169 | -- (*) |
| Highest (>5%) | 27 | 1.37 (1.15, 1.65) | 0.001 | 10% (8%, 11%) | 0.91 (0.68, 1.21) | 0.505 | -- (*) |
(*) not applicable
Distance to the nearest hospital was identified as the most influential community-level factor for high-risk women (OR: 1.76, 95% CI: 1.49, 2.10 and OR: 1.86, 95% CI: 1.55, 2.24 for 5–9 km and 10+ km vs. <5 km, respectively). Approximately one third of the women who had a score of 25+ were associated with a long distance to the nearest hospital (PAR%: 32%, 95% CI: 28%, 36%). Other structural factors, including a high proportion of informal/traditional dwellings and households without piped water or electricity, were all associated with an increased prevalence of high-risk women; however, their impact on HIV seroconversion was not notable.
The crude HIV incidence rate was 15.8 (95% CI: 12.0, 18.4) per 100 PY among women who had a score of ≥25 compared to 6.6 (95% CI: 6.1, 7.2) per 100 PY for those with a score of <25 (p-value <0.001, log-rank test) (Table 3). As expected, compared to those who had a score of <25, women with a score of 25+ were more than three times as likely to seroconvert (HR: 3.1, 95% CI: 2.35, 5.00). Discriminative power was estimated to be reasonably high, with an area under the curve (AUC) of 72%. In the first 12 months of follow-up, the predicted probability of HIV seroconversion was significantly higher for women with a score of 25+ than for those with a score of <25 (16.0% versus 6.7%, respectively). These cumulative probabilities were estimated as 27.0% (for score 25+) and 14.7% (for score <25) within two years.
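As a rough cross-check of how an incidence rate relates to a cumulative probability, the sketch below converts a rate per 100 person-years into a probability of seroconversion by time t under a constant-hazard assumption, P(t) = 1 − exp(−λt). This assumption is ours and is not the authors' prediction method; the probabilities in Table 3 come from their fitted models, so the values agree only approximately.

```python
import math


def cumulative_probability(rate_per_100_py, years):
    """Percent probability of an event by `years`, assuming constant hazard."""
    hazard = rate_per_100_py / 100.0          # events per person-year
    return 100.0 * (1.0 - math.exp(-hazard * years))


# Crude rates reported in the text for score <25 and score >=25.
for label, rate in [("score <25", 6.6), ("score >=25", 15.8)]:
    one_year = cumulative_probability(rate, 1)
    two_years = cumulative_probability(rate, 2)
    print(label, round(one_year, 1), round(two_years, 1))
```

Under this simplification the 12-month values are roughly 6.4% and 14.6%, and the 24-month values roughly 12.4% and 27.1%, which are of the same order as the model-based estimates in Table 3.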
Table 3:
Identifying “high risk women” based on the risk scoring algorithm
| | Score <25 (40%) | Score ≥25 (60%) |
|---|---|---|
| Crude incidence rates (95% CI) | 6.6 (6.1, 7.2) per 100 PY | 15.8 (12.0, 18.4) per 100 PY |
| Hazard ratio¹ (95% CI) | 1 (reference) | 3.1 (2.35, 5.00) |
| Predicted probabilities of HIV seroconversion, % (AUC²=72%) | | |
| Within 6 months | 2.8 (2.4, 3.27) | 5.6 (3.9, 8.2) |
| Within 12 months | 6.7 (6.0, 7.2) | 16.0 (12.6, 20.3) |
| Within 18 months | 10.0 (9.2, 11.1) | 23.0 (18.9, 30.0) |
| Within 24 months | 14.7 (12.2, 15.3) | 27.0 (20.0, 36.0) |
¹ p<0.001, test for trend in hazard ratios;
² Area under the curve
3.2. Spatial clustering of high risk behaviours and HIV infections:
We observed significant geographical variations and clustering of high-risk women (edf = 68.91, p-value <0.001) and age-standardized HIV incidence rates (edf = 98.45, p-value <0.001). Overall, areas with a high prevalence of women with higher scores overlapped with areas with high rates of HIV infection, particularly in the northern part of Durban (Figures 1a and 1b, respectively). For example, the prevalence of high-risk women was >50% in the areas where age-standardized HIV incidence rates were as high as 10 to 15 per 100 person-years.
Figure 1.
Figure 1a) Distribution of high-risk women ǂ
Figure 1b) Distribution of age-standardized HIV incidence ratesǂǂ
ǂ Estimated degrees of freedom from GAM= 68.91, p<0.001, R-squared =81.8%
ǂǂ Estimated degrees of freedom from GAM= 98.45, p<0.001, R-squared =85.8%
Circles indicate overlapping areas in Figure 1a & Figure 1b
3.3. Spatial clustering of community-level factors
In an additional analysis, the geographical distributions of selected community-level characteristics were also assessed. The results showed significant spatial disparities in all community-level factors in the region (edf ranged from 52% to 98%, p<0.001 for all). Although these community-level characteristics broadly overlapped with each other, we did not observe apparent associations with the areas where high-risk women lived and high HIV infections were concentrated. We present the maps for “% population female aged 18–25 years”, “% female headed households (HH)”, “% population low education” and “average distance to the nearest hospital” (supplementary Figures 2a–2d, respectively). We observed significant geographical variations, with the characteristics clustered in various smaller geographical units across the region. These areas broadly overlapped with the areas where HIV infection rates were high. In particular, communities with higher proportions of young females (supplementary Figure 2a) and longer distances to a hospital (supplementary Figure 2d) overlapped with relatively higher levels of HIV infection. Other characteristics were omitted from this paper since they showed similar patterns due to high correlations.
4. Discussion
In this study, we identified a cohort of South African women at high risk of HIV infection based on a previously developed risk scoring algorithm (Wand et al., 2017). A combination of established risk factors was used to classify women as high-risk for HIV with statistically acceptable accuracy (Luke, 2003; Dunkle, 2004; Heffron et al., 2011; Wand and Ramjee, 2012; Pettifor et al., 2004). As the epidemic continues to spread, the Centers for Disease Control and Prevention and the President’s Emergency Plan for AIDS Relief have recommended the development of interventions for smaller geographically defined areas where HIV-infected individuals are clustered (UNAIDS, 2013; The Global Fund, 2018). Aligned with these guidelines, we also investigated spatial clustering of HIV infections using a decade of data from a region where the epidemic is known to be severe. We found that the geographical units where HIV infections were clustered were also likely to be the areas where high-risk women lived. These results confirmed the profound impact of combined risk factors on excess HIV infection rates. In an additional analysis, the risk of HIV infection within two years was predicted to be approximately 80% higher among women who were classified as high-risk than among those at relatively lower risk. Therefore, the overlapping nature of the clusters of high-risk women and high HIV infection rates is not surprising. These empirical results collectively pointed to the most vulnerable areas at localized levels. They therefore have significant implications for the optimal allocation of prevention modalities by identifying “where” and “who” to target. This is particularly important for the development of geographically targeted strategies that scale up multiple prevention modalities among the priority populations.
We also reported significant links between a range of community-level characteristics and both high-risk status and HIV infection. We considered several community-level factors because of their potential direct or indirect impact on the epidemic. We hypothesised that these community-level characteristics may increase women’s high-risk sexual behaviours. Consistent with previous research, women living in communities with higher proportions of young women and unmarried adults were likely to be at higher risk of HIV infection because of sexual mixing (Zuma, Lurie, Williams, Mkaya-Mwamburi, & Sturm, 2005). Relationship factors, including lack of formal marriage and/or not cohabiting with a partner, have frequently been linked to HIV infection (Balkus et al., 2016; Ramjee et al., 2016). Risky sexual behaviours such as lack of condom use and multiple sexual partners are known to be more common among single women who do not live with their sexual partners. Given the high HIV incidence and prevalence in the region, condomless sex with multiple and/or concurrent sex partners can spread infections in communities (Street et al., 2015). Our results also showed the impact of these factors on the prevalence of high-risk women as well as on the risk of HIV infection (Lurie et al., 2008). Study participants who resided in areas with high proportions of female-headed households were also more likely to be classified as being at high risk of infection. In South Africa, approximately half of households are headed by women (Census 2011, Statistics South Africa). This relatively high proportion has been linked primarily to male labour migration and lack of formal marriage (Posel, 2001; Posel and Rogan, 2009; Leah, 2016). Consistent with these findings, female-headed households were more likely than male-headed households to be vulnerable to low socio-economic conditions and poverty. Women in these households may therefore be more likely to be exposed to transactional sex, which may decrease their power to negotiate condom use with their sexual partners (Posel and Rogan, 2009; Dunkle, 2004).
Increasing distance to the nearest hospital was also associated with the prevalence of high-risk women as well as with the risk of HIV infection. Barriers to accessing health care at the community level have often been linked to low rates of HIV testing and treatment (Tshuma et al., 2014). Distance to a clinic has previously been reported as a barrier to treatment access among HIV-infected individuals (Frank et al., 2017). Our results showed a strong association between longer distance to the nearest hospital and high-risk status. In fact, at the population level, one third of all high-risk women were associated with communities where the nearest hospital was more than 5 kilometres away. These results are the first to indicate the impact of such community-level barriers on individuals identified as being at highest risk of HIV infection.
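A population-level share such as the one-third figure above is the kind of quantity commonly summarised by a population attributable fraction. The following is a minimal sketch only, assuming Levin's formula, which may differ from the estimator used in the analyses reported here. The function name and the input values are hypothetical and chosen purely for illustration; they are not estimates from this study.

```python
def population_attributable_fraction(exposure_prevalence: float,
                                     relative_risk: float) -> float:
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1)).

    exposure_prevalence: proportion of the population exposed (0 to 1).
    relative_risk: risk in the exposed divided by risk in the unexposed.
    """
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must lie between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs (NOT taken from the study): half of the women live
# more than 5 km from a hospital, and those women are twice as likely
# to be classified as high-risk.
paf = population_attributable_fraction(exposure_prevalence=0.5,
                                       relative_risk=2.0)
print(f"Population attributable fraction: {paf:.3f}")
```

Under this formulation the fraction rises with both the prevalence of the exposure and the strength of its association with the outcome, which is why a moderately strong but common community-level barrier can account for a large share of high-risk women.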
4.1. Limitations:
The current study has some limitations that need to be considered, and the results should be interpreted with caution. Although our analyses used geographically diverse data, the study population consisted of women who participated in large biomedical HIV prevention trials. They may therefore not be representative of women in general. However, based on our knowledge and experience in the region, we can cautiously state that these findings may not differ widely from those of general population surveys (Shisana et al., 2012). Although our results were based on several data sources, we cannot rule out the impact of unmeasured characteristics, such as migration of women or their partners and commercial sex work, which were not captured in these analyses. We were also unable to collect the HIV status and sexual behaviours of the study participants’ male partners. Finally, our analyses were based on data collected during 2002–2012. However, a more recent study conducted in this region has also reported high HIV infection rates (Baeten et al., 2016).
Nevertheless, our study presented one of the most comprehensive analyses and, by using information from various data sources, provided clear implications regarding the complex and multifactorial nature of the epidemic. These results collectively indicated that, besides established individual-level risk factors, certain community-level characteristics need to be targeted. The visual presentation of the geographical variation of the epidemic particularly highlights the sub-geographical areas where multiple prevention modalities are needed.
5. Conclusion
There is an urgent need to reach the most-at-risk populations at a localized level. The results from our analyses provide significant insight into the epidemic by identifying priority populations and their geographical locations. The current study also provides critical implications for policy makers regarding the allocation of scarce resources.
Acknowledgement:
We gratefully acknowledge the women who participated in the studies.
Footnotes
Competing interests: None.
REFERENCES
- 2008 national antenatal sentinel HIV and syphilis prevalence survey, South Africa. Pretoria: National Department of Health (DOH); 2009.
- Abdool Karim SS, Richardson BA, Ramjee G, Hoffman IF, Chirenje ZM, Taha T, et al. Safety and effectiveness of buffer gel and 0.5% PRO2000 gel for the prevention of HIV infection in women. AIDS 2010; 25(7):957–66.
- Baeten JM, Palanee-Phillips T, Brown ER, Schwartz K, Soto-Torres LE, Govender V, et al. Use of a vaginal ring containing dapivirine for HIV-1 prevention in women. N Engl J Med. 2016; 375: 2121–2132.
- Balkus JE, Brown E, Palanee T, Nair G, Gafoor Z, Zhang J, Richardson BA, Chirenje ZM, Marrazzo JM and Baeten JM. An Empiric HIV Risk Scoring Tool to Predict HIV-1 Acquisition in African Women. J Acquir Immune Defic Syndr. 2016 Jul 1; 72(3): 333–343. doi: 10.1097/QAI.0000000000000974.
- Census 2011: Statistical Release. Statistics South Africa. 2012. http://www.statssa.gov.za (Accessed 1 June 2017).
- Cuadros DF, Jingjing L, Branscum JA, Akullian A, Jia P, Mziray EN, Tanser F. Mapping the spatial variability of HIV infection in Sub-Saharan Africa: Effective information for localized HIV prevention and control. Scientific Reports 2017; 7: 9093. doi: 10.1038/s41598-017-09464-y.
- Dunkle K. Transactional sex among women in Soweto, South Africa: prevalence, risk factors and association with HIV infection. Soc Sci Med. 2004;59(8):1581–92.
- Tanser F, Hosegood V, Barnighausen T, Herbst K, Nyirenda M, Muhwava W, et al. (2008) Cohort Profile: Africa Centre Demographic Information System (ACDIS) and population-based HIV survey. Int J Epidemiol 37: 956–962.
- Ghani AC, Swinton J, Garnett GP. The role of sexual partnership networks in the epidemiology of gonorrhea. Sex Transm Dis. 1997;24:45–56.
- Heffron R, Donnell D, Rees H, Celum C, Mugo N, Were E, et al. Use of hormonal contraceptives and risk of HIV-1 transmission: a prospective cohort study. Lancet Infect Dis. 2011. doi: 10.1016/S1473-3099(11)70247-X.
- Hunter M. The changing political economy of sex in South Africa: The significance of unemployment and inequalities to the scale of the AIDS pandemic. Social Science & Medicine 2007; 64, 689–700.
- Jewkes R, Flood M, Lang J. From work with men and boys to changes of social norms and reduction of inequities in gender relations: a conceptual shift in prevention of violence against women and girls. The Lancet 2014;385: 1580–1589.
- Joint United Nations Programme on HIV/AIDS (UNAIDS). Location: Connecting People Faster to HIV Services. Geneva: UNAIDS, 2013. (Accessed May 20, 2018).
- Laga M, Manoka A, Kivuvu M, Malele B, Tuliza M, Nzila N. Non-ulcerative sexually transmitted diseases as risk factors for HIV-1 transmission in women: results from a cohort study. AIDS 1993; 7:95–102.
- Leah G. The mercurial piece of the puzzle: Understanding stigma and HIV/AIDS in South Africa. SAHARA-J: Journal of Social Aspects of HIV/AIDS 2016;13:1, 8–16. doi: 10.1080/17290376.2015.1130644.
- Leon N, Naidoo P, Mathews C, Lewin S, Lombard C. The impact of provider-initiated (opt-out) HIV testing and counseling of patients with sexually transmitted infection in Cape Town, South Africa: a controlled trial. Implement Sci. 2010;5:8.
- Luke N. Age and economic asymmetries in the sexual relationships of adolescent girls in sub-Saharan Africa. Stud Fam Plan. 2003;34(2):67–86.
- Lurie M, Pronyk P, de Moor E, et al. Sexual behavior and reproductive health among HIV-infected patients in urban and rural South Africa. J Acquir Immune Defic Syndr. 2008;47(4): 484–93.
- McCormack S, Ramjee G, Kamali A, Rees H, Crook AM, Gafos M, et al. PRO2000 vaginal gel for prevention of HIV-1 infection (Microbicides Development Programme 301): a phase 3, randomised, double-blind, parallel-group trial. Lancet 2010; 376:1329–37. doi: 10.1016/S0140-6736(10)61086-0.
- Microbicide Trials Network (MTN). MTN statement on decision to discontinue use of Tenofovir gel in VOICE, a major HIV prevention study in women. 2011. http://www.mtnstopshiv.org/node/3909 (Accessed 1 July 2017).
- Padian NS, van der Straten A, Ramjee G, Chipato T, de Bruyn G, Blanchard K, et al. Diaphragm and lubricant gel for prevention of HIV acquisition in southern African women: a randomised controlled trial. Lancet 2007;370:251–61. doi: 10.1016/S0140-6736(07)60950-7.
- Pettifor AE, van der Straten A, Dunbar MS, Shiboski SC, Padian NS. Early age of first sex: a risk factor for HIV infection among women in Zimbabwe. AIDS. 2004;18:1435–42.
- Posel DR (2001) Intra-family transfers and income-pooling: a study of remittances in KwaZulu-Natal. The South African Journal of Economics, Vol. 69: 3. doi: 10.1111/j.1813-6982.2001.tb00023.x.
- Posel D and Rogan M. Women, income and poverty: Gendered access to resources in Post-Apartheid South Africa. Agenda: Empowering Women for Gender Equity 2009; No. 81, Gender & Poverty Reduction, pp. 25–34.
- Ramjee G, Moonsamy S, Abbai NS, Wand H. Individual and Population Level Impact of Key HIV Risk Factors on HIV Incidence Rates in Durban, South Africa. PloS one 2016; 11, 4.
- Shisana O, Rehle T, Simbayi LC, Zuma K, Jooste S, Zungu N, Labadarios D, Onoya D, et al. (2014) South African National HIV Prevalence, Incidence and Behaviour Survey, 2012. Cape Town: HSRC Press.
- Skoler-Karpoff S, Ramjee G, Ahmed K, Altini L, Plagianos M, Friedland B, et al. Efficacy of Carraguard for prevention of HIV infection in women in South Africa: a randomised, double-blind, placebo-controlled trial. Lancet 2008;372:1977–87. doi: 10.1016/S0140-6736(08)61842-5.
- Street R, Reddy T, Ramjee G. The generational effect on age disparate partnerships and the risk for human immunodeficiency virus and sexually transmitted infections acquisition. International Journal of STD & AIDS 2015. 29, 9, 746–752.
- Tanser F, Oliveira T, Maheu-Giroux M, Bärnighausen T. Concentrated HIV sub-epidemics in generalized epidemic settings. Curr Opin HIV AIDS 2014; 9(2):115–25. doi: 10.1097/COH.0000000000000034.
- Tanser F, Barnighausen T, Cooke GS, Newell ML. Localized spatial clustering of HIV infections in a widely disseminated rural South African epidemic. International Journal of Epidemiology 2009; 38, 4, 1008–1016.
- The Global Fund’s New Funding Model: Fourth Replenishment (2014–2016). Geneva: The Global Fund, 2013.
- Tshuma N, Muloongo K, Setswe G, Chimoyi L, Sarfo B, Burger D, Nyasulu P. Potential barriers to rapid testing for human immunodeficiency virus among a commuter population in Johannesburg, South Africa. Dove Press; 25 2014. Volume 2015:7, Pages 11–19.
- UNAIDS. UNAIDS report on the global AIDS epidemic. 2013.
- Wand H and Ramjee G. Combined impact of sexual risk behaviors for HIV seroconversion among women in Durban, South Africa: implications for prevention policy and planning. AIDS Behav. 2011;15(2):479–86.
- Wand H, Ramjee G. The effects of injectable hormonal contraceptives on HIV seroconversion and on sexually transmitted infections. AIDS 2012;26:375–80.
- Wand H, Ramjee G. Early age of coital debut is associated with increased HIV seroprevalence and incidence in a cohort of women in Durban, South Africa. BMJ Open. 2012;2:e000285. doi: 10.1136/bmjopen-2011-000285.
- Wand H, Reddy T, Naidoo S, Moonsamy S, Siva S, Morar NS, Ramjee G. A Simple Risk Prediction Algorithm for HIV Transmission: Results from HIV Prevention Trials in KwaZulu-Natal, South Africa (2002–2012). AIDS Behavior 2017. doi: 10.1007/s10461-017-1785-7.
- Williams BG and Gouws E. The epidemiology of human immunodeficiency virus in South Africa. Philosophical Transactions of the Royal Society B: Biological Sciences 2001; 356(1411), 1077–1086.
- Wood SN. Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association 2004;99, 673–686.
- Wood SN. Low rank scale invariant tensor product smooths for generalized additive mixed models. Biometrics. 2006b;62:1025–1036.