The Journal of Infectious Diseases. 2019 Jun 13;220(8):1325–1334. doi: 10.1093/infdis/jiz294

Association Between Single-Nucleotide Polymorphisms in HLA Alleles and Human Immunodeficiency Virus Type 1 Viral Load in Demographically Diverse, Antiretroviral Therapy–Naive Participants From the Strategic Timing of AntiRetroviral Treatment Trial

Christina Ekenberg 1, Man-Hung Tang 1, Adrian G Zucco 1, Daniel D Murray 1, Cameron Ross MacPherson 1, Xiaojun Hu 2, Brad T Sherman 2, Marcelo H Losso 5, Robin Wood 6, Roger Paredes 7, Jean-Michel Molina 8, Marie Helleberg 1, Nureen Jina 9, Cissy M Kityo 10, Eric Florence 11, Mark N Polizzotto 12, James D Neaton 4, H Clifford Lane 3, Jens D Lundgren 1, for the INSIGHT START Study Group
PMCID: PMC6743845  PMID: 31219150

Abstract

The impact of variation in host genetics on replication of human immunodeficiency virus type 1 (HIV-1) in demographically diverse populations remains uncertain. In the current study, we performed a genome-wide screen for associations of single-nucleotide polymorphisms (SNPs) with viral load (VL) in antiretroviral therapy–naive participants (n = 2440) with varying demographics from the Strategic Timing of AntiRetroviral Treatment (START) trial. Associations were assessed using genotypic data generated by a customized SNP array, imputed HLA alleles, and multiple linear regression. Genome-wide significant associations between SNPs and VL were observed in the major histocompatibility complex class I region (MHC I), with absolute effect sizes ranging between 0.14 and 0.39 log10 VL (copies/mL). Supporting the SNP findings, we identified several HLA alleles significantly associated with VL, extending prior observations that the MHC I is a major host determinant of HIV-1 control with shared genetic variants across diverse populations and underscoring the limitation of genome-wide association studies as merely a screening tool.

Keywords: HIV-1, host genetics, genome-wide association study, GWAS, viral load, HLA


“To investigate the impact of host genetics on human immunodeficiency virus type 1 control among individuals of different ancestry, we performed genome-wide association screening and HLA imputation in a demographically diverse, antiretroviral therapy–naive cohort”.


Human immunodeficiency virus type 1 (HIV-1) viral load (VL) is predictive of disease progression. High VL is associated with faster progression to AIDS and a shorter time to death in HIV-1–infected individuals [1]. Hence, VL serves as an easily accessible prognostic marker [1]. The variability in VL observed among HIV-infected persons may be influenced by various factors, including viral features [2, 3], environmental exposure, and host genetics.

Previous genome-wide association (GWA) studies (GWASs) in persons with HIV have particularly focused on associations with set-point VL (spVL), the relatively stable VL that characterizes the asymptomatic phase most HIV-infected persons enter after acute HIV-1 infection and before initiation of antiretroviral therapy (ART). These studies have shown that host genetics explains approximately 25% [4] of the spVL variability and identified the human major histocompatibility complex class I region (MHC I) and the C-C chemokine receptor type 5 (CCR5) gene region as the major host determinants of HIV-1 control [5–8].

The first GWAS of host genetic variants associated with spVL [7] identified 2 single-nucleotide polymorphisms (SNPs) associated with spVL, rs9264942 and rs2395029, located in the extended gene area of HLA-C and the HLA complex P5, respectively. However, owing to extended linkage disequilibrium (LD) in the HLA region, causal inference of individual SNPs remains challenging. To overcome this issue, many studies have instead focused on associations between HIV control and functional HLA alleles. The primary focus of these studies has been the HLA-B locus at which several alleles associated with lower HIV VL and disease progression have been identified, including HLA-B*57:01 and B*27:05 in European populations [9, 10] and HLA-B*57:03 and B*81:01 [6, 11, 12] in individuals of African descent. Contrary to these protective effects, HLA-B*35 has been associated with a more rapid progression to AIDS [13]. Associations have also been found between HIV VL and HLA-A and HLA-C, including protective effects of HLA-A*25 and HLA-A*32 [5, 6, 9], and HLA-C*06:02 [4, 14, 15] and the aforementioned SNP, rs9264942, which has been associated with both lower VL [6, 7] and increased levels of HLA-C expression [16, 17].

Most of the studies assessed associations with spVL in individuals of predominantly European descent [4, 5, 7, 18]; hence, these studies may not capture critical genetic factors that are present in different demographics. The vast majority (80%) of participants in GWASs across all disciplines are of European descent [19]. This lack of diversity seems even more challenging in the setting of HIV, given that the majority of persons living with HIV are of non-European descent.

Despite extended research within this area and improvements in HIV treatment strategies [20], the general question of which host factors determine HIV viral control remains crucial. Fully understanding the mechanisms controlling VL may allow for the development of interventional strategies that improve the clinical outcome in both treated and untreated HIV-infected persons. In this study, we aimed to replicate current knowledge on HIV control and potentially identify novel host genetic determinants by performing a GWA screen and assessing the influence of these variants in a population of demographically diverse, ART-naive individuals from the Strategic Timing of AntiRetroviral Treatment (START) trial [21].

METHODS

Ethics

Samples included in this study were derived from participants who consented in the clinical trial, START (NCT00867048) [21], run by the International Network for Strategic Initiatives in Global HIV Trials (INSIGHT). The study was approved by the institutional review board or ethics committee at each contributing center, and written informed consent was obtained from all participants. All informed consents were reviewed and approved by participant site ethics review committees.

Study Participants and VL Phenotype

The START trial was an international randomized controlled trial of 4685 ART-naive, asymptomatic HIV-infected persons who had 2 CD4+ cell counts >500/μL at least 2 weeks apart within 60 days of enrollment, had no history of AIDS, and were ≥18 years old at study entry. Participants were enrolled from 35 countries across 5 continents between 2009 and 2013. In the current study, we assessed samples collected at trial entry of START participants consenting for genetic analysis and used corresponding measurements of VL (1 measurement per participant) as outcome.

Genotyping and Quality Control

Participants were genotyped using a custom-content Affymetrix Axiom SNP array, consisting of 770 558 probes, enriched with markers related to immune dysfunction. The Ensembl gene database, assembly hg19/GRCh37, was used to annotate genes within a 5-kilobase window of each variant.

Genotyping was performed at Advanced BioMedical Laboratories, followed by standardized quality assurance procedures. SNP-level quality control (QC) removed probes that (1) were duplicated, multiallelic, or nonautosomal or that had (2) missingness >0.03, (3) reproducibility <0.90, (4) minor allele frequency (MAF) <0.05, or (5) deviation from Hardy-Weinberg equilibrium (a P value <1 × 10⁻⁶ was used as the threshold). At the sample level, participants were excluded owing to (1) sex mismatch, (2) duplication, (3) missingness, (4) cryptic relatedness (PI Hat [identity-by-descent value calculated by PLINK] >0.1875), or (5) outlying heterozygosity (FHat1 [inbreeding coefficient calculated by PLINK] less than −0.1 or >0.1).
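The SNP-level and sample-level thresholds listed above can be written as two simple predicates. The following Python sketch is illustrative only: the study applied these filters with PLINK and laboratory quality assurance procedures, and the function names, argument names, and example values here are hypothetical.

```python
def snp_passes_qc(missingness, reproducibility, maf, hwe_p):
    """Return True if a SNP survives the numeric SNP-level thresholds.

    Exclusion criteria taken from the text: missingness >0.03,
    reproducibility <0.90, MAF <0.05, Hardy-Weinberg P value <1e-6.
    Removal of duplicated, multiallelic, and nonautosomal probes is
    assumed to have happened before this check.
    """
    return (missingness <= 0.03
            and reproducibility >= 0.90
            and maf >= 0.05
            and hwe_p >= 1e-6)


def sample_passes_qc(pi_hat, f_hat):
    """Return True if a sample survives the relatedness and heterozygosity checks.

    Exclusion criteria taken from the text: PI Hat >0.1875 and an
    inbreeding coefficient outside the interval [-0.1, 0.1].
    """
    return pi_hat <= 0.1875 and -0.1 <= f_hat <= 0.1


# Made-up example values, not study data.
common_snp_ok = snp_passes_qc(missingness=0.01, reproducibility=0.99, maf=0.21, hwe_p=0.4)
rare_snp_ok = snp_passes_qc(missingness=0.01, reproducibility=0.99, maf=0.02, hwe_p=0.4)
related_sample_ok = sample_passes_qc(pi_hat=0.25, f_hat=0.01)
print(common_snp_ok, rare_snp_ok, related_sample_ok)
```

In this toy input the second SNP fails only on MAF and the sample fails only on relatedness, which mirrors how each criterion acts as an independent exclusion rule.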

Statistics

GWA Screening

First, VLs were log10-transformed and used as a quantitative trait in a multiple linear regression using an additive genetic model. The EIGENSTRAT program [22] was used to create explanatory variables to control for population stratification. The leading 4 eigenvectors (Supplementary Figure 1) and sex, which has previously been shown to be associated with VL [23], were included as covariates in the model. Associations were considered significant if they reached genome-wide significance, defined by a P value <5 × 10⁻⁸. Recent infection, which had been assessed in a previous study by self-report and a multiassay algorithm [24, 25], was included in subsequent sensitivity analyses, as were CD4+ cell count, CD4/CD8 ratio, geographic region, and age group. PLINK (1.90 beta 4.1) and R (3.2.5) software [26] were used to perform the association testing.
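To make the additive model concrete, the sketch below fits log10 VL on allele dosage (0, 1, or 2 copies of the alternate allele) plus covariates by ordinary least squares. It is a minimal illustration under stated assumptions, not the study's pipeline (which used PLINK and R): the data are made up and noise-free, only one ancestry eigenvector is included instead of four, and the helper `ols` is a hypothetical name.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up participants: (alternate-allele dosage, sex coded 0/1, eigenvector 1).
participants = [(0, 0, 0.1), (1, 0, -0.2), (2, 1, 0.3), (0, 1, -0.1),
                (1, 1, 0.0), (2, 0, 0.2), (0, 0, -0.3), (1, 1, 0.4)]

# Noise-free toy outcome built with a per-allele effect of -0.4 log10 copies/mL.
log10_vl = [4.5 - 0.4 * g + 0.2 * s + 1.0 * pc for g, s, pc in participants]

# Design matrix: intercept, dosage, sex, eigenvector.
X = [[1.0, g, s, pc] for g, s, pc in participants]
beta = ols(X, log10_vl)
per_allele_effect = beta[1]
print(round(per_allele_effect, 3))
```

Because the toy outcome was generated from the model itself, the fitted coefficient on dosage recovers the per-allele effect that was built in; with real data the estimate would carry a standard error and a P value, which is what the genome-wide threshold is applied to.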

Imputation of HLA Alleles

Imputation of classic HLA alleles per locus (ie, class I [HLA-A, HLA-B, and HLA-C] and class II [HLA-DP, HLA-DQ, and HLA-DR]) at 4-digit resolution was performed with the HIBAG software package, using the attribute bagging technique [27]. This imputation method is based on an ensemble of classifiers trained on known SNP genotypes and HLA haplotypes by bootstrapping samples and averaging HLA posterior probabilities of the predictions. HIBAG has been released with prefit classifiers and is available as an R package, which was used in our analyses. In our study, the chosen prefit classifiers were based on a multiethnic trained model for the Affymetrix UK Biobank Axiom Array on data from multiple GlaxoSmithKline clinical trials and HapMap phase 2. Our SNP array had full coverage of the SNPs used in the HIBAG software package. We considered posterior probabilities ≥0.5 as sufficient evidence for high-confidence HLA types; anything less was considered missing. To investigate uncertainty in the imputation, we conducted a second weighted regression whereby allelic dosage reflected the total posterior probability of the allele across all HLA pairs.
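The two ways of using the imputation output described above can be sketched as follows. This is a hedged illustration, not the HIBAG interface: the dictionary layout, the function names `call_hla` and `expected_dosage`, and the probabilities are all hypothetical, and the dosage formula is one plausible reading of "total posterior probability of the allele across all HLA pairs".

```python
def call_hla(posteriors, threshold=0.5):
    """Return the most probable allele pair, or None if it is below the threshold.

    posteriors maps an (allele1, allele2) tuple to its posterior probability.
    """
    best_pair, best_prob = max(posteriors.items(), key=lambda kv: kv[1])
    return best_pair if best_prob >= threshold else None


def expected_dosage(posteriors, allele):
    """Posterior-weighted number of copies of one allele (assumed definition)."""
    return sum(prob * pair.count(allele) for pair, prob in posteriors.items())


# Made-up posterior probabilities for one participant at HLA-B.
confident = {("57:01", "07:02"): 0.7, ("57:03", "07:02"): 0.2, ("07:02", "07:02"): 0.1}
ambiguous = {("57:01", "07:02"): 0.4, ("57:03", "07:02"): 0.35, ("07:02", "07:02"): 0.25}

confident_call = call_hla(confident)   # best pair has probability 0.7, so it is kept
ambiguous_call = call_hla(ambiguous)   # best pair has probability 0.4, so it is treated as missing
dosage_5701 = expected_dosage(ambiguous, "57:01")
print(confident_call, ambiguous_call, dosage_5701)
```

The hard call discards the ambiguous participant entirely, whereas the dosage keeps that participant in the regression with a fractional contribution, which is why the weighted analysis serves as a check on imputation uncertainty.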

START participants who were to be prescribed abacavir were also tested for the presence of the HLA-B*57:01 allele as part of the START trial. This information was further used to assess the sample integrity of the study and reliability of the HLA-B locus imputation by concordance of laboratory-confirmed HLA-B*57:01 alleles with the predicted alleles.

We performed multiple linear regression to test associations between imputed HLA alleles and VL, and because HLA homozygosity was very rare in our population and HLA is codominantly expressed, we used the dominant model. We included the same set of covariates that we used for the GWAS. The false discovery rate [28] was controlled using the Benjamini-Hochberg procedure, and associations with a Q value (false discovery rate–adjusted P value) <.05 were considered significant. Tagging of HLA alleles by the alternate allele of each SNP was assessed using positive predictive value (PPV, ie, the proportion of HLA carriers among SNP carriers), and negative predictive value (NPV, ie, the proportion of HLA-naive participants among SNP-naive participants). Tests of association between SNP alleles and HLA alleles were conducted using the Fisher exact test.
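The Benjamini-Hochberg adjustment named above converts a set of P values into Q values by scaling each P value by the number of tests divided by its rank and then enforcing monotonicity from the largest P value downward. The sketch below is a plain-Python illustration with made-up P values; the function name is hypothetical, and the set of tests over which the study applied the correction is not assumed here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (Q values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, keeping a running minimum
    # so that Q values never increase as P values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Made-up P values for four hypothetical alleles.
p_example = [0.01, 0.04, 0.03, 0.005]
q_example = benjamini_hochberg(p_example)
significant = [q < 0.05 for q in q_example]
print(q_example, significant)
```

An allele is then declared significant when its Q value is below .05, as in the analysis described above.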

RESULTS

Study Participants and Sample QC

A total of 2549 participants consented to genetic testing and had DNA extracted from stored blood specimens. Samples from 2 participants could not be genotyped owing to sample quality, and 1 participant was discovered to have signed the incorrect consent and was removed from the study. Three additional participants were excluded owing to missing VL data at trial entry. Of these 2543 eligible and genotyped participants, 103 were excluded owing to cryptic relatedness or outlying heterozygosity, leaving 2440 participants in the analysis (see Table 1 and Supplementary Figure 2 for characteristics at trial entry and Supplementary Table 1 for geographic distribution of sampling sites). Principal component analysis showed distinct population structures with strong separation between Black, Hispanic, and White participants (Supplementary Figure 3).

Table 1.

Characteristics at Trial Entry Among Participants Included in Analysis

Characteristic START Participants After Quality Control, No. (%)a (n = 2440)
Age, median (IQR), y 37 (29–45)
Sex
 Female 489 (20.0)
 Male 1951 (80.0)
Race/ethnic group
 Black 572 (23.4)
 Hispanic 418 (17.1)
 Asian 13 (0.5)
 White 1398 (57.3)
 Other 39 (1.6)
Geographic region
 Africa 339 (13.9)
 Australia 91 (3.7)
 Europe and Israel 1136 (46.6)
 Latin America 423 (17.3)
 United States 451 (18.5)
Mode of HIV infection
 Injection drug use 45 (1.8)
 Sex with same sex 1560 (63.9)
 Sex with opposite sex 721 (29.5)
 Other 114 (4.7)
Time since HIV diagnosis, median (IQR), y 1.1 (0.4–3.0)
Recent infection (within 6 mo) 165 (6.8)
CD4+ cell count, median (IQR), cells/μL 651 (585–697)
HIV load, median (IQR), copies/mL 14 734 (3500–45 796)

Abbreviations: HIV, human immunodeficiency virus; IQR, interquartile range; START, Strategic Timing of AntiRetroviral Treatment.

aData represent no. (%) of participants unless otherwise specified.

Genotyping and SNP QC

A total of 339 439 SNPs were included in the analysis after probe QC that excluded poorly reproducible, duplicated, multiallelic, or nonautosomal SNPs (n = 91 398), SNPs with high missingness (n = 9317), deviation from Hardy-Weinberg equilibrium (n = 86 511), or MAF <5% (n = 243 893).
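As a consistency check, the exclusion counts reported above can be subtracted from the 770 558 probes on the array. The short sketch below assumes that the four counts are mutually exclusive (that is, each probe is counted under one exclusion reason only); the variable names are illustrative.

```python
total_probes = 770_558  # probes on the custom array, as stated in Methods

# Exclusion counts as reported in the text above.
excluded = {
    "poorly reproducible, duplicated, multiallelic, or nonautosomal": 91_398,
    "high missingness": 9_317,
    "deviation from Hardy-Weinberg equilibrium": 86_511,
    "MAF < 5%": 243_893,
}

total_excluded = sum(excluded.values())
remaining_snps = total_probes - total_excluded
print(total_excluded, remaining_snps)
```

Under that assumption the subtraction reproduces the 339 439 SNPs carried into the analysis.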

GWA Between SNPs and VL

We identified 5 genome-wide significant (P < 5 × 10⁻⁸) SNPs associated with VL in the MHC I region (Figures 1 and 2 and Table 2). The strongest association was with rs57989216 (P = 2.6 × 10⁻¹³), which lies upstream of HLA-B. The minor A allele of this SNP was associated with lower VL (0.39 log10 copies/mL lower VL per additional A allele). A strong association (P = 3.4 × 10⁻⁹) with lower VL was also observed for the C allele of rs4418214 (0.24 log10 copies/mL lower VL per additional C allele), which is located near the MICA gene and was found to be in moderate LD (r² = 0.45) with our top candidate SNP. Distributions of VL by SNP genotype for all genome-wide significant SNPs are shown in Figure 2. Outside the MHC region, no variants were significantly associated with VL. In line with our expectations, we observed variations in the MAF across race/ethnic groups for several SNPs (Table 2), and there seemed to be similar trends for distributions of VL across race/ethnic groups (Supplementary Table 2). Effect sizes for all significant SNPs, including 95% confidence intervals, are shown in Figure 3. Performing the regression without covariates changed the overall results markedly, although comparable effect sizes were observed for several SNPs (Supplementary Tables 3 and 4).

Figure 1.


Single-nucleotide polymorphisms (SNPs) associated with viral load. The Manhattan plot shows the association between SNPs and log10 viral load in the cohort. Each SNP is represented by a point and plotted by chromosomal location (x-axis), and –log10(P-value) per SNP is shown on the y-axis. Genome-wide significance is indicated by the horizontal red line (P-value < 5 × 10⁻⁸).

Figure 2.


Summary of genome-wide significant major histocompatibility complex class I (MHC I) single-nucleotide polymorphisms (SNPs) associated with viral load. A, The location of the 5 significant SNPs in the MHC I region of chromosome 6 and a heatmap of linkage disequilibrium highlighting local structures of SNPs in linkage disequilibrium with one another. The positions of the top 5 SNPs are shown as blue points. B, Violin plots of viral load distributions for each of the 5 SNPs and their different genotypes. Abbreviations: alt/alt, homozygotes for the alternate allele; kb, kilobase; ref/alt, heterozygotes; ref/ref, homozygotes for the reference allele; VL, viral load.

Table 2.

Summary of Genome-Wide Significant Single-Nucleotide Polymorphisms Associated With Viral Loada

Rank | SNP | Chromosome | Location in hg19, bp | Nearest Gene (5 kb) | Alternate Allele | Reference Allele | MAF,b All | MAF,b White | MAF,b Black | MAF,b Hispanic | Effect Size,c All (SE) | P Value
1 | rs57989216 | 6 | 31335791 | - | A | G | 0.053 | 0.053 | 0.056 | 0.043 | −0.39 (0.05) | 2.6 × 10⁻¹³
2 | rs2507976 | 6 | 31351887 | - | A | C | 0.379 | 0.321 | 0.552 | 0.331 | 0.16 (0.02) | 1.1 × 10⁻¹⁰
3 | rs4418214 | 6 | 31391401 | - | C | T | 0.093 | 0.102 | 0.081 | 0.073 | −0.24 (0.04) | 3.4 × 10⁻⁹
4 | rs2523608 | 6 | 31322559 | HLA-B(0)|MIR6891 (−0.441 kb) | G | A | 0.575 | 0.591 | 0.555 | 0.541 | 0.14 (0.02) | 4.3 × 10⁻⁹
5 | rs2442752 | 6 | 31351764 | - | C | T | 0.378 | 0.358 | 0.422 | 0.376 | 0.14 (0.02) | 5.9 × 10⁻⁹

Abbreviations: bp, base pairs; hg19, human genome version 19; kb, kilobase; MAF, minor allele frequency; SE, standard error; SNP, single-nucleotide polymorphism.

aThe covariates included in the regression analysis were the leading 4 eigenvectors and sex.

bThe MAF refers to the frequency of the alternate allele. (With PLINK software, the “keep-allele-order” operation was used; hence, the MAF does not necessarily refer to the frequency at which the second-most-common allele occurs in our population.)

cEffect size is defined as the difference in log10 viral load per additional alternate allele.

Figure 3.


Effect sizes (including 95% confidence intervals [CIs]) of significant single-nucleotide polymorphisms (SNPs) associated with viral load. The effect size (beta) is the difference in log10 viral load per additional alternate allele. The covariates included in the regression analysis were the leading 4 eigenvectors and sex.
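For readers who want to relate an effect size and its standard error to an interval and to the copies/mL scale, the sketch below works through the top SNP from Table 2 (effect size −0.39, SE 0.05). It assumes a normal-approximation (Wald) interval with a multiplier of 1.96, so the result may differ slightly from the intervals plotted in Figure 3; the function name `wald_ci` is hypothetical.

```python
def wald_ci(beta, se, z=1.96):
    """Approximate 95% confidence interval as beta +/- z * SE."""
    return beta - z * se, beta + z * se


# rs57989216: effect size and standard error as reported in Table 2.
beta, se = -0.39, 0.05
ci_low, ci_high = wald_ci(beta, se)

# A difference of beta on the log10 scale corresponds to multiplying
# copies/mL by 10**beta for each additional alternate allele.
fold_change = 10 ** beta

print(round(ci_low, 3), round(ci_high, 3), round(fold_change, 2))
```

Under this approximation the interval is roughly −0.49 to −0.29 log10 copies/mL, and each additional A allele corresponds to a VL of about 0.41 times that of the reference genotype.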

Sensitivity Analyses of Genome-Wide Significant SNPs

Sensitivity analyses were performed in which each of the following covariates was included in the regression analysis: CD4+ cell count and CD4/CD8 ratio at trial entry, recent HIV infection (ie, within 6 months before enrollment), age group (median split), and geographic region. Age was not associated with VL in our population but was included in the sensitivity analysis because an association between age and VL had previously been suggested [29]. Each of these covariates was included singly in addition to the leading 4 eigenvectors and sex. After adjustment for these covariates, we observed comparable effect sizes for all significant SNPs (Supplementary Table 5).

Prediction of HLA Alleles

Owing to previous findings and the strong associations between SNPs located in the MHC I and VL observed in our analysis, we aimed to further explore this signal and examine associations between HLA alleles and VL in our population. To investigate this, we performed imputation of HLA alleles at the following loci in all participants: HLA-A, HLA-B, HLA-C, HLA-DPB1, HLA-DQA1, HLA-DQB1, and HLA-DRB1. The mean accuracy of the model based on out-of-bag samples was >90% for all predicted HLA loci (Supplementary Table 6). The proportions of participants with predicted alleles per locus were as follows: HLA-A, 2330 (95.5%); HLA-B, 2020 (82.8%); HLA-C, 2353 (96.4%); HLA-DPB1, 2284 (93.6%); HLA-DQA1, 2353 (96.4%); HLA-DQB1, 2284 (93.6%); and HLA-DRB1, 1956 (80.2%). Calling rates by race/ethnic group are shown in Supplementary Table 7. In addition, 1122 participants included in this study were tested for HLA-B*57:01 using a local routine testing method as part of the conduct of the START trial; 102 participants were reported as being HLA-B*57:01 positive, of whom 100 were successfully predicted by means of the imputation method, whereas among the remaining reported to be HLA-B*57:01 negative, 2 persons were classified as HLA-B*57:01 positive by means of imputation.
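The HLA-B*57:01 comparison above can be summarized as agreement statistics. The sketch below treats the local laboratory test as the reference and assumes that every laboratory-negative participant who was not classified as positive by imputation counts as a concordant negative (the text does not state how many of these had no confident imputed call); the variable names are illustrative.

```python
tested = 1122                 # participants with a laboratory HLA-B*57:01 result
lab_positive = 102            # reported positive by local testing
imputed_among_positive = 100  # of those, also predicted positive by imputation
imputed_among_negative = 2    # laboratory-negative but predicted positive

lab_negative = tested - lab_positive

# Share of laboratory positives recovered by imputation.
sensitivity = imputed_among_positive / lab_positive
# Share of laboratory negatives not called positive by imputation (assumption above).
specificity = (lab_negative - imputed_among_negative) / lab_negative

print(lab_negative, round(sensitivity, 3), round(specificity, 3))
```

With these inputs, imputation recovers about 98% of laboratory-confirmed carriers and misclassifies about 0.2% of laboratory-negative participants as carriers.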

Association Between HLA Alleles and VL

We identified 10 HLA alleles significantly associated with VL (Q < .05). Associations were observed within all 3 MHC I loci, with HLA-B*57:01 (Q = 1.4 × 10⁻⁵) and HLA-B*57:03 (Q = 2.1 × 10⁻⁴) being the most significant alleles and associated with lower VL (Table 3 and Supplementary Table 8). We observed varying HLA allele carrier frequencies across races; however, all significant alleles were present in the major subpopulations of our study, that is, among White, Black, and Hispanic participants (Table 4). Owing to varying sample size and call rates across races/ethnic groups, the HLA associations were determined for the entire population (Supplementary Figure 4) [30]. Results of the dominant model without covariates and using the posterior probabilities to weight the regression are shown in Supplementary Tables 9 and 10, respectively. We also applied the additive model and found 8 significant alleles (Table 3).

Table 3.

Summary of HLA Alleles Significantly Associated With Viral Load Using The Dominant Modela

Rank According to Q Value | Locus | Allele | Log10 VL, Median, Allele Noncarriers | Log10 VL, Median, Allele Carriers | Effect Size (95% CI) | P Value | Q Value
1 | B | 57:01b | 4.23 (n = 1833) | 4.00 (n = 187) | −0.32 (−.44 to −.20) | 2.1 × 10⁻⁷ | 1.4 × 10⁻⁵
2 | B | 57:03b | 4.23 (n = 1951) | 3.32 (n = 69) | −0.47 (−.67 to −.26) | 6.4 × 10⁻⁶ | 2.1 × 10⁻⁴
3 | C | 04:01b | 4.15 (n = 1750) | 4.26 (n = 603) | 0.16 (.09 to .24) | 2.9 × 10⁻⁵ | 8.5 × 10⁻⁴
4 | A | 30:01b | 4.19 (n = 2194) | 3.81 (n = 136) | −0.26 (−.40 to −.12) | 3.5 × 10⁻⁴ | 1.2 × 10⁻²
5 | C | 12:03b | 4.18 (n = 2107) | 4.13 (n = 246) | −0.18 (−.29 to −.07) | 1.1 × 10⁻³ | 1.6 × 10⁻²
6 | B | 27:05b | 4.21 (n = 1849) | 4.08 (n = 171) | −0.21 (−.34 to −.09) | 8.7 × 10⁻⁴ | 1.9 × 10⁻²
7 | C | 05:01b | 4.16 (n = 2144) | 4.38 (n = 209) | 0.16 (.05 to .28) | 5.7 × 10⁻³ | 3.6 × 10⁻²
8 | C | 06:02 | 4.19 (n = 1908) | 4.10 (n = 445) | −0.12 (−.20 to −.03) | 6.2 × 10⁻³ | 3.6 × 10⁻²
9 | C | 08:04b | 4.18 (n = 2329) | 3.24 (n = 24) | −0.46 (−.79 to −.13) | 5.8 × 10⁻³ | 3.6 × 10⁻²
10 | C | 15:05 | 4.18 (n = 2330) | 3.85 (n = 23) | −0.44 (−.77 to −.10) | 1.0 × 10⁻² | 4.9 × 10⁻²

Abbreviations: CI, confidence interval; VL, viral load (copies/mL).

aThe covariates included in the regression analysis were the leading 4 eigenvectors and sex.

bAlleles that were also significant using the additive model.

Table 4.

HLA Allele Carrier Frequencies Among HLA-Predicted Participants, Overall and by Race/Ethnic Group

Locus and Allele | All, No. (%) | White, No. (%) | Black, No. (%) | Hispanic, No. (%) | Asian, No. (%) | Other, No. (%)
Aa | n = 2330 | n = 1384 | n = 494 | n = 404 | n = 13 | n = 35
 30:01 | 136 (5.8) | 46 (3.3) | 70 (14.2) | 17 (4.2) | 0 | 3 (8.6)
Ba | n = 2020 | n = 1312 | n = 382 | n = 285 | n = 11 | n = 30
 27:05 | 171 (8.5) | 138 (10.5) | 14 (3.7) | 16 (5.6) | 1 (9.1) | 2 (6.7)
 57:01 | 187 (9.3) | 144 (11.0) | 11 (2.9) | 25 (8.8) | 3 (27.3) | 4 (13.3)
 57:03 | 69 (3.4) | 4 (0.3) | 56 (14.7) | 5 (1.8) | 0 | 4 (13.3)
Ca | n = 2353 | n = 1390 | n = 505 | n = 411 | n = 13 | n = 34
 04:01 | 603 (25.6) | 317 (22.8) | 151 (30.0) | 123 (29.9) | 5 (38.5) | 7 (20.6)
 05:01 | 209 (8.9) | 163 (11.7) | 14 (2.8) | 29 (7.1) | 1 (7.7) | 2 (5.9)
 06:02 | 445 (18.9) | 292 (21.0) | 91 (18.0) | 49 (11.9) | 4 (30.8) | 9 (26.5)
 08:04 | 24 (1.0) | 3 (0.2) | 19 (3.8) | 2 (0.5) | 0 | 0
 12:03 | 246 (10.5) | 167 (12.0) | 36 (7.1) | 38 (9.2) | 2 (15.4) | 3 (8.8)
 15:05 | 23 (1.0) | 11 (0.8) | 7 (1.4) | 5 (1.2) | 0 | 0

aNumbers shown in the rows for the 3 loci indicate the number of HLA-predicted participants per locus and by race/ethnic group. See also Supplementary Table 7.

To determine whether SNPs significantly associated with VL were tags of significant HLA alleles, we calculated PPV and NPV as a measure of tagging. We observed high PPVs and NPVs of rs57989216 for HLA-B*57:01, found in both White (PPV, 97.9%; NPV, 99.9%; P = 1.19 × 10⁻¹⁸⁶) and Asian (PPV, 100%; NPV, 100%; P = 6.1 × 10⁻³) participants. Among Hispanic participants, the PPV of rs57989216 for HLA-B*57:01 was 85.7% (NPV, 99.6%; P = 1.0 × 10⁻²⁹), and it was even lower in Black participants (PPV, 24.4%; NPV, 100%; P = 1.9 × 10⁻¹¹). Furthermore, rs57989216 showed a high PPV for HLA-C*06:02 (PPV, 89.8%; NPV, 87.2%; P = 9.3 × 10⁻⁸⁴) in White participants. The PPV of rs57989216 for HLA-B*57:03 in Black participants was 53.3% (NPV, 90.8%; P = 2.1 × 10⁻¹¹), considerably higher than in White and Hispanic (PPV, <15%) participants. PPVs for the other significant SNPs were ≤75% for all significant HLA alleles.
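The tagging measures follow directly from the definitions given in Methods: PPV is the proportion of HLA carriers among SNP carriers, and NPV is the proportion of HLA noncarriers among SNP noncarriers. The sketch below computes both from per-participant carrier indicators; the function name and the ten made-up participants are hypothetical and do not reproduce any value reported above.

```python
def ppv_npv(snp_carrier, hla_carrier):
    """Return (PPV, NPV) of SNP carriage as a tag for HLA allele carriage.

    Both arguments are equal-length sequences of 0/1 indicators. The input is
    assumed to contain at least one SNP carrier and one SNP noncarrier.
    """
    pairs = list(zip(snp_carrier, hla_carrier))
    tp = sum(1 for s, h in pairs if s and h)          # SNP carrier, HLA carrier
    fp = sum(1 for s, h in pairs if s and not h)      # SNP carrier, HLA noncarrier
    tn = sum(1 for s, h in pairs if not s and not h)  # SNP noncarrier, HLA noncarrier
    fn = sum(1 for s, h in pairs if not s and h)      # SNP noncarrier, HLA carrier
    return tp / (tp + fp), tn / (tn + fn)


# Made-up carrier indicators for ten participants.
snp = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
hla = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
ppv, npv = ppv_npv(snp, hla)
print(ppv, npv)
```

A SNP with a high NPV but a low PPV in one group, as seen for rs57989216 and HLA-B*57:01 among Black participants, is one whose carriers often carry a different allele even though noncarriers almost never carry the tagged allele.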

DISCUSSION

The host genetic mechanisms influencing replication of HIV-1 and their possible variation across ethnic groups remain poorly understood. In the current study, we aimed to examine host determinants of VL variation in a demographically diverse population of HIV-1-infected individuals. We identified proxy SNPs for causal variants in the MHC I that were significantly associated with VL. After HLA imputation, we found several HLA alleles significantly associated with VL.

Our GWA screen revealed 5 genome-wide significant SNPs in the MHC I region. Similar to other studies, the identification of causal variants was challenged because of extended LD in this region. Our top association, rs57989216, is a noncoding SNP and has not previously been reported, which may be owing to its relatively low MAF (<0.05) in many populations [31]. However, this SNP is in moderate LD with rs4418214, a noncoding SNP located 8 kilobases downstream from the MICA gene, and with rs2395029 (r² = 0.66), a nonsynonymous SNP in the nonprotein coding HLA complex P5 gene. Both SNPs have previously been associated with HIV-1 control [5, 6], and rs2395029 serves as proxy for the HLA-B*57:01 allele, which has been associated with lower spVL and long-term nonprogression in European populations [9, 10]. A significant association with VL was also observed for rs4418214, which has been reported in previous studies [6, 32]. In the study by McLaren et al, rs4418214 was found to be associated with HIV-1 acquisition in long-term nonprogressors. Likewise, the START trial was enriched for participants with better-than-average viral control and/or decline in CD4+ T-cell count without ART, and other participants might have started ART before enrollment began in START and thus would not have been eligible.

We also identified a significant GWA between rs2523608, which has been shown to tag HLA-B*57:03 [11], and VL. An association between this SNP and spVL has previously been reported in populations of African descent [6, 11]. However, to our knowledge this association has not previously been observed in a GWAS including a demographically diverse population in which approximately half of the participants were White. Because previous studies have identified different SNPs associated with VL in different populations, it has been suggested that shared causally associated variants may be tagged by different SNPs across populations, or there may be population-specific mechanisms of HIV-1 control [6]. However, our results suggest that the A alleles of rs57989216 and rs2523608 and the C allele of rs4418214 are shared genetic variants associated with lower VL in Black, Asian, Hispanic, and White participants.

Our findings that 2 members of the HLA-B*57 group, HLA-B*57:01 and B*57:03, are significantly associated with VL are consistent with previous studies [6, 11, 33, 34]. HLA-B*57:01 is an important host determinant of HIV-1 control in European populations, whereas HLA-B*57:03 has been shown to have a similar effect in populations of African descent. These alleles occur at different frequencies according to race, HLA-B*57:01 predominantly being more common than HLA-B*57:03 in European populations and vice versa in African populations [35], which we also observed in our current study. These findings suggest that certain host genetic determinants may be population specific. A previous study [36] found a correlation between the frequency of HLA supertypes and VL, with the least frequent HLA supertypes being associated with lower VL, suggesting that HIV adapts to the most frequent alleles in the population, providing a selective advantage for those individuals who express rare alleles.

In the current study, we observed lower VL in HLA-B*57:01 allele carriers compared with noncarriers among White participants, and lower VL in HLA-B*57:03 allele carriers compared with noncarriers among Black participants, suggesting functional similarities of these alleles. We also replicated the association of HLA-B*27, which has been found to be overrepresented in long-term nonprogressors [9], similar to HLA-B*57:01 and B*57:03 [33, 34]. In our study, HLA-B*57:01 was tagged by our top candidate SNP, rs57989216, in Asian, Hispanic, and White participants, whereas the PPV in Black participants was considerably lower. These findings suggest that SNPs tag HLA alleles differently across racial/ethnic groups or, rather, tag functional groups of HLA alleles [37]. However, further work is required to elucidate this.

Our study has some limitations. A single VL measurement was used as the outcome, so fluctuations around the steady state may not have been adequately captured. Because the inclusion criteria of the START trial may have truncated higher VL, the study population may not be representative of the broader HIV population. However, owing to the numerous sampling sites, we believe this study represents early HIV infection in a global setting. Moreover, because recent infection is likely to influence levels of VL, we included this covariate in our sensitivity analyses and found comparable results. Our study evaluated the influence of host genetic factors on VL variation and did not include viral genetics. The inability to account for HIV adaptation to HLA or other host allelic polymorphisms could affect our results. Given the distribution of sampling sites, the predominant HIV subtypes included in our study would most likely be subtypes B and C [38].

As another limitation, the genomic regions encompassing the HIV VL-associated polymorphisms CCR5-delta32 and SDF1-3’A [8, 39–41] were not covered by the Affymetrix Axiom array. Hence, we were not able to assess associations between these variants and VL. CCR2-64I [42, 43], however, was covered by the array but did not show a significant GWA with VL. Similar to findings in previous studies, we observed varying HLA imputation call rates across races [44, 45]. We did not have a reference population that was specific to the unique composition of our population, which may have reduced imputation accuracy, particularly among participants of non-European ancestry. Decreased ability to call HLA alleles across races accurately, along with differing sample sizes, affected power, preventing investigation in subpopulations.

In conclusion, using a demographically diverse population, we replicated findings from previous studies that were restricted to certain subpopulations and observed comparable distributions of VL across SNP genotypes, suggesting that these VL-associated SNPs are shared genetic variants across populations. We also found that SNP tagging of HLA alleles differs across populations, which could indicate varying linkage patterns between proxy SNPs and causal variants across populations or population-specific causal variants.

Our study demonstrated that GWAS is useful for screening but has limited usefulness for mapping complex associations with the HLA region, because SNP associations are generally identified in noncoding regions and related to gene function through LD. This underlines the need for accurate HLA data. Although HLA imputation seems to be a powerful tool to obtain these data, our study is a reminder that appropriate reference populations are needed for all subpopulations. HIV-1 control is a dynamic interplay between host and viral genetics, and studies have shown that HIV-1 sequence variation, including diversity within subtypes, affects VL and disease progression independently of host factors [3, 46]. Therefore, to fully explore the underlying mechanisms of HIV-1 control and variation in VL, studies with substantial sample sizes, using joint host/viral genetic data to capture diversity across races and regions, are warranted and could provide valuable insight into this research area.

SUPPLEMENTARY DATA

Supplementary materials are available at The Journal of Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.

jiz294_Suppl_Supplementary_Figure-S1
jiz294_Suppl_Supplementary_Figure-S2
jiz294_Suppl_Supplementary_Figure-S3
jiz294_Suppl_Supplementary_Figure-S4
jiz294_Suppl_Supplementary_Figure_Legends
jiz294_Suppl_Supplementary_Table_S1
jiz294_Suppl_Supplementary_Table_S2
jiz294_Suppl_Supplementary_Table_S3
jiz294_Suppl_Supplementary_Table_S4
jiz294_Suppl_Supplementary_Table_S5
jiz294_Suppl_Supplementary_Table_S6
jiz294_Suppl_Supplementary_Table_S7
jiz294_Suppl_Supplementary_Table_S8
jiz294_Suppl_Supplementary_Table_S9
jiz294_Suppl_Supplementary_Table_S10

Notes

Acknowledgments. We would like to thank all participants in the Strategic Timing of AntiRetroviral Treatment (START) trial and all trial investigators. (See Lundgren et al [21] for the complete list of START investigators.)

Author contributions. C. E., M. H., J. D. N., and J. D. L. conceived and designed the study. M. H. L., R. W., R. P., J. M. M., N. J., C. M. K., E. F., M. N. P., J. D. N., H. C. L., and J. D. L. contributed to the collection of samples and phenotypic data. X. H. and B. T. S. performed the genotyping. C. E., M. H. E. T., A. G. Z., D. D. M., C. R. M., J. D. N., and J. D. L. analyzed and interpreted the data. C. E., M. H. E. T., A. G. Z., D. D. M., C. R. M., and J. D. L. wrote the manuscript. All authors revised the manuscript critically and approved the final manuscript.

Financial support. This study was funded by the Danish National Research Foundation (grant DNRF126) and the National Institute of Allergy and Infectious Diseases, Division of Clinical Research and Division of AIDS (National Institutes of Health grants UM1-AI068641 and UM1-AI120197). The START trial was supported by the National Institute of Allergy and Infectious Diseases, National Institutes of Health Clinical Center, National Cancer Institute, National Heart, Lung, and Blood Institute, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institute of Mental Health, National Institute of Neurological Disorders and Stroke, National Institute of Arthritis and Musculoskeletal and Skin Diseases, Agence Nationale de Recherches sur le SIDA et les Hépatites Virales (France), National Health and Medical Research Council (Australia), National Research Foundation (Denmark), Bundesministerium für Bildung und Forschung (Germany), European AIDS Treatment Network, Medical Research Council (United Kingdom), National Institute for Health Research, National Health Service (United Kingdom), and the University of Minnesota. Antiretroviral drugs were donated to the central drug repository by AbbVie, Bristol-Myers Squibb, Gilead Sciences, GlaxoSmithKline/ViiV Healthcare, Janssen Scientific Affairs, and Merck.

Potential conflicts of interest. M. H. L. has received research grants from the Ministry of Science and Technology and the Ministry of Health (both in Argentina), the University of New South Wales, Family Health International, Westat, and ViiV. R. P. has received research support from and has consulted for MSD, ViiV Healthcare, and Gilead Sciences; he is coinventor for a patent titled “Enhanced rapid immunogen selection method for HIV gp120 variants” (application/patent no. EP 12382328.8-1405). All other authors report no potential conflicts. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.

Presented in part: IDWeek conference, San Francisco, California, October 2018. Abstract 69891; presentation 1284.

References

  • 1. Mellors JW, Rinaldo CR Jr, Gupta P, White RM, Todd JA, Kingsley LA. Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science 1996; 272:1167–70.
  • 2. Bartha I, Carlson JM, Brumme CJ, et al. A genome-to-genome analysis of associations between human genetic variation, HIV-1 sequence diversity, and viral control. Elife 2013; 2:e01123.
  • 3. Bartha I, McLaren PJ, Brumme C, Harrigan R, Telenti A, Fellay J. Estimating the respective contributions of human and viral genetic variation to HIV control. PLoS Comput Biol 2017; 13:e1005339.
  • 4. McLaren PJ, Coulonges C, Bartha I, et al. Polymorphisms of large effect explain the majority of the host genetic contribution to variation of HIV-1 virus load. Proc Natl Acad Sci U S A 2015; 112:14658–63.
  • 5. Fellay J, Ge D, Shianna KV, et al.; NIAID Center for HIV/AIDS Vaccine Immunology (CHAVI). Common genetic variation and the control of HIV-1 in humans. PLoS Genet 2009; 5:e1000791.
  • 6. Pereyra F, Jia X, McLaren PJ, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science 2010; 330:1551–7.
  • 7. Fellay J, Shianna KV, Ge D, et al. A whole-genome association study of major determinants for host control of HIV-1. Science 2007; 317:944–7.
  • 8. Dean M, Carrington M, Winkler C, et al. Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Hemophilia Growth and Development Study, Multicenter AIDS Cohort Study, Multicenter Hemophilia Cohort Study, San Francisco City Cohort, ALIVE Study. Science 1996; 273:1856–62.
  • 9. Kaslow RA, Carrington M, Apple R, et al. Influence of combinations of human major histocompatibility complex genes on the course of HIV-1 infection. Nat Med 1996; 2:405–11.
  • 10. Kiepiela P, Leslie AJ, Honeyborne I, et al. Dominant influence of HLA-B in mediating the potential co-evolution of HIV and HLA. Nature 2004; 432:769–75.
  • 11. Pelak K, Goldstein DB, Walley NM, et al.; Infectious Disease Clinical Research Program HIV Working Group; National Institute of Allergy and Infectious Diseases Center for HIV/AIDS Vaccine Immunology (CHAVI). Host determinants of HIV-1 control in African Americans. J Infect Dis 2010; 201:1141–9.
  • 12. Leslie A, Matthews PC, Listgarten J, et al. Additive contribution of HLA class I alleles in the immune control of HIV-1 infection. J Virol 2010; 84:9879–88.
  • 13. Carrington M, Nelson GW, Martin MP, et al. HLA and HIV-1: heterozygote advantage and B*35-Cw*04 disadvantage. Science 1999; 283:1748–52.
  • 14. Bashirova AA, Martin-Gayo E, Jones DC, et al. LILRB2 interaction with HLA class I correlates with control of HIV-1 infection. PLoS Genet 2014; 10:e1004196.
  • 15. Naranbhai V, Carrington M. Host genetic variation and HIV disease: from mapping to mechanism. Immunogenetics 2017; 69:489–98.
  • 16. Thomas R, Apps R, Qi Y, et al. HLA-C cell surface expression and control of HIV/AIDS correlate with a variant upstream of HLA-C. Nat Genet 2009; 41:1290–4.
  • 17. Apps R, Qi Y, Carlson JM, et al. Influence of HLA-C expression level on HIV control. Science 2013; 340:87–91.
  • 18. McLaren PJ, Pulit SL, Gurdasani D, et al. Evaluating the impact of functional genetic variation on HIV-1 control. J Infect Dis 2017; 216:1063–9.
  • 19. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature 2016; 538:161–4.
  • 20. Consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection: recommendations for a public health approach. 2nd ed. WHO Guidelines Approved by the Guidelines Review Committee. Geneva, Switzerland: World Health Organization, 2016.
  • 21. Lundgren JD, Babiker AG, Gordin F, et al. Initiation of antiretroviral therapy in early asymptomatic HIV infection. N Engl J Med 2015; 373:795–807.
  • 22. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006; 38:904–9.
  • 23. Napravnik S, Poole C, Thomas JC, Eron JJ Jr. Gender difference in HIV RNA levels: a meta-analysis of published studies. J Acquir Immune Defic Syndr 2002; 31:11–9.
  • 24. Laeyendecker O, Kulich M, Donnell D, et al. Development of methods for cross-sectional HIV incidence estimation in a large, community randomized trial. PLoS One 2013; 8:e78818.
  • 25. Schlusser KE, Sharma S, de la Torre P, et al.; INSIGHT START Study Group. Comparison of self-report to biomarkers of recent HIV infection: findings from the START trial. AIDS Behav 2018; 22:2277–83.
  • 26. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2018. https://www.R-project.org/. Accessed 2017.
  • 27. Zheng X, Shen J, Cox C, et al. HIBAG–HLA genotype imputation with attribute bagging. Pharmacogenomics J 2014; 14:192–200.
  • 28. Krzywinski M, Altman N. Comparing samples—part II. Nat Methods 2014; 11:355.
  • 29. The Natural History Project Working Group for the Collaboration of Observational HIV Epidemiological Research Europe (COHERE) in EuroCoord. Factors associated with short-term changes in HIV viral load and CD4+ cell count in antiretroviral-naive individuals. AIDS 2014; 28:1351–6.
  • 30. Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 2014; 15:335–46.
  • 31. Auton A, Brooks LD, Durbin RM, et al. A global reference for human genetic variation. Nature 2015; 526:68–74.
  • 32. McLaren PJ, Coulonges C, Ripke S, et al. Association study of common genetic variants and HIV-1 acquisition in 6,300 infected cases and 7,200 controls. PLoS Pathog 2013; 9:e1003515.
  • 33. Migueles SA, Sabbaghian MS, Shupert WL, et al. HLA B*5701 is highly associated with restriction of virus replication in a subgroup of HIV-infected long term nonprogressors. Proc Natl Acad Sci U S A 2000; 97:2709–14.
  • 34. Costello C, Tang J, Rivers C, et al. HLA-B*5703 independently associated with slower HIV-1 disease progression in Rwandan women. AIDS 1999; 13:1990–1.
  • 35. Cao K, Hollenbach J, Shi X, Shi W, Chopek M, Fernández-Viña MA. Analysis of the frequencies of HLA-A, B, and C alleles and haplotypes in the five major ethnic groups of the United States reveals high levels of diversity in these loci and contrasting distribution patterns in these populations. Hum Immunol 2001; 62:1009–30.
  • 36. Trachtenberg E, Korber B, Sollars C, et al. Advantage of rare HLA supertype in HIV disease progression. Nat Med 2003; 9:928–35.
  • 37. Kennedy AE, Ozbek U, Dorak MT. What has GWAS done for HLA and disease associations? Int J Immunogenet 2017; 44:195–211.
  • 38. Hemelaar J, Elangovan R, Yun J, et al. Global and regional molecular epidemiology of HIV-1, 1990–2015: a systematic review, global survey, and trend analysis. Lancet Infect Dis 2019; 19:143–55.
  • 39. Hill CM, Littman DR. Natural resistance to HIV? Nature 1996; 382:668–9.
  • 40. Samson M, Libert F, Doranz BJ, et al. Resistance to HIV-1 infection in Caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor gene. Nature 1996; 382:722–5.
  • 41. Winkler C, Modi W, Smith MW, et al. Genetic restriction of AIDS pathogenesis by an SDF-1 chemokine gene variant. ALIVE Study, Hemophilia Growth and Development Study (HGDS), Multicenter AIDS Cohort Study (MACS), Multicenter Hemophilia Cohort Study (MHCS), San Francisco City Cohort (SFCC). Science 1998; 279:389–93.
  • 42. Smith MW, Carrington M, Winkler C, et al. CCR2 chemokine receptor and AIDS progression. Nat Med 1997; 3:1052–3.
  • 43. Ioannidis JP, Rosenberg PS, Goedert JJ, et al.; International Meta-Analysis of HIV Host Genetics. Effects of CCR5-delta32, CCR2-64I, and SDF-1 3’A alleles on HIV-1 disease progression: an international meta-analysis of individual-patient data. Ann Intern Med 2001; 135:782–95.
  • 44. Karnes JH, Shaffer CM, Bastarache L, et al. Comparison of HLA allelic imputation programs. PLoS One 2017; 12:e0172444.
  • 45. Pappas DJ, Lizee A, Paunic V, et al. Significant variation between SNP-based HLA imputations in diverse populations: the last mile is the hardest. Pharmacogenomics J 2018; 18:367–76.
  • 46. Pant Pai N, Shivkumar S, Cajas JM. Does genetic diversity of HIV-1 non-B subtypes differentially impact disease progression in treatment-naive HIV-1-infected individuals? A systematic review of evidence: 1996–2010. J Acquir Immune Defic Syndr 2012; 59:382–8.


Articles from The Journal of Infectious Diseases are provided here courtesy of Oxford University Press
