Abstract
Associations between HIV-1 cytotoxic T lymphocyte (CTL) escape mutations and their restricting human leukocyte antigen (HLA) alleles imply that HIV could adapt to divergent HLA repertoires of human populations globally. Using publicly available databases, we examine the relationship between the frequencies of 19 experimentally validated CTL escape mutations in HIV-1 reverse transcriptase and their restricting HLA alleles in 59 countries. From these extensive data, we find evidence of differential HIV adaptations to human populations at only a limited number of the studied epitope sites.
TEXT
The cytotoxic T lymphocyte (CTL) response is a major driving force of host-specific HIV-1 adaptation (1). As a result, HIV polymorphisms in CTL epitopes that “escape” this immune response tend to correlate with human leukocyte antigen (HLA) variations among individuals (2), an observation that has been subsequently refined and corroborated (3, 4). Under the control of this mechanism, it is conceivable that HIV populations are differentially adapting to the HLA allele repertoires of each human population (5). Such adaptation would pose a challenge for HIV vaccine development, because the compositions of CTL epitopes in transmitted HIV variants would differ among regions. In an analysis of cohorts from 8 countries, Kawashima and colleagues (5) reported significant positive correlations between protective HLA alleles, such as B*51, and their respective CTL epitope variants among HIV-1 sequences circulating in these populations. In the present study, we analyzed large public databases to evaluate associations between HLA and HIV genotypes from up to 59 countries using an ecological approach to reassess this hypothesis.
HIV-1 reverse transcriptase (RT) nucleotide sequence data were obtained from the Stanford University HIV Drug Resistance Database (6). Sequences were restricted to those from therapy-naive subjects and filtered to one sequence per patient, leaving 44,934 sequences in total. A total of 132 countries were represented by these data. The median number of sequences per country was 123 (interquartile range [IQR] = 19 to 279), with the greatest number of sequences collected in the United States (n = 4,616). Countries with fewer than 100 sequences were excluded from further analyses. Pairwise alignment of the translated sequences against the HXB2 reference amino acid sequence (GenBank accession number K03455) was performed using HyPhy (7); sequence insertions relative to this reference were excluded. Although the sequence data set comprised many HIV subtypes, CTL epitopes are defined with respect to HXB2. Nonsynonymous mixtures were assumed to contribute equally to all possible amino acid resolutions of the respective codons. We selected 19 HIV RT amino acid polymorphisms within 9 optimally described CTL epitopes restricted by 8 HLA alleles (A*02, A*03, A*11, A*68, B*40, B*51, B*52, and B*57). These polymorphisms have been experimentally validated to confer diminished CTL responses in vitro (8).
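The handling of nonsynonymous mixtures described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `amino_acid_weights` and `variant_frequency` are hypothetical, and the sketch assumes that "contribute equally" means equal weight per distinct amino acid (rather than per nucleotide resolution). Alignment gaps are not handled.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def amino_acid_weights(codon):
    """Map a (possibly ambiguous) codon to {amino acid: weight}.

    Assumption: each distinct amino acid resolution gets equal weight.
    """
    codon = codon.upper()
    resolutions = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[nt] for nt in codon))
    }
    return {aa: 1.0 / len(resolutions) for aa in resolutions}


def variant_frequency(codons, variant):
    """Weighted frequency of one amino acid among codons from many sequences."""
    total = sum(amino_acid_weights(c).get(variant, 0.0) for c in codons)
    return total / len(codons)


# Hypothetical codons at one site: Ile, Thr, an Ile/Thr mixture, Ile.
example = ["ATA", "ACA", "AYA", "ATA"]
print(amino_acid_weights("AYA"))
print(variant_frequency(example, "T"))
```

Under this assumption, a synonymous mixture such as `ATH` resolves to a single amino acid and carries full weight, whereas an Ile/Thr mixture contributes half a count to each residue.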
HLA frequencies were retrieved from the Allele Frequencies Database (9) using a custom Python Web script. HLA frequency records were restricted to published studies with sample sizes of n = 100 or greater. This criterion excluded countries where the total number of observations exceeded 100 but where no individual study met this cutoff. Frequencies reported at the HLA subtype level (e.g., B*51:01 frequencies) were assumed to be subsets of those reported at the type level (e.g., B*51 frequencies) and were removed to prevent double counting. In total, 27,480,449 HLA alleles from 85 countries were represented in the HLA data set; of these countries, 59 were also represented in the HIV sequence data. The median sample size in the overlapping set was 2,996 (IQR = 1,078 to 62,340) alleles per country, with the largest sample size from Brazil (n = 22,735,132), primarily owing to records from the Brazilian Registry of Bone Marrow Volunteer Donors (10). We used “traditional” allele frequency definitions, where the denominator is the total number of allele copies (2N) sampled from N diploid individuals. Correlations between HIV-1 polymorphisms and HLA frequencies among countries (Table 1) were evaluated in R using Spearman's rank correlation and were adjusted for multiple comparisons using the Benjamini-Hochberg method.
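As a rough illustration of the country-level comparison, the following Python sketch computes an allele frequency with a 2N denominator and a Spearman rank correlation over the countries shared by two data sets. The study itself used R; the function names and all numbers here are hypothetical placeholders, not data from the study.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def country_correlation(hla_freq, hiv_freq):
    """Spearman's rho over countries present in both dictionaries."""
    shared = sorted(set(hla_freq) & set(hiv_freq))
    return spearman_rho([hla_freq[c] for c in shared],
                        [hiv_freq[c] for c in shared])


# Hypothetical input: country -> (allele copies observed, individuals typed).
hla_counts = {"A": (30, 150), "B": (90, 300), "C": (10, 200), "D": (60, 120)}
# Allele frequency with the 2N denominator.
hla_freq = {c: k / (2 * n) for c, (k, n) in hla_counts.items()}
# Hypothetical polymorphism frequencies; country "E" has no HLA data.
hiv_freq = {"A": 0.20, "B": 0.35, "C": 0.10, "D": 0.30, "E": 0.50}

print(country_correlation(hla_freq, hiv_freq))
```

Only the correlation coefficient is computed here; the P value for a rank correlation would in practice come from a statistics package.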
TABLE 1.
Summary of correlations between restricting HLA alleles and select polymorphisms in key optimally described viral epitopes^a
| Viral epitope (amino acids) | Epitope sequence | HLA allele | Polymorphism | Spearman's ρ | P value^b |
|---|---|---|---|---|---|
| RT (5–12) | I**E**TVPVKL | B*40 | E6D | 0.67 | 1.6 × 10⁻⁵* |
| RT (158–166) | AIFQ**S**SMTK | A*03 | S162A | 0.52 | 4.6 × 10⁻⁴* |
| RT (128–135) | TAFTIPS**I** | B*51 | I135T | 0.47 | 1.3 × 10⁻³* |
| RT (128–135) | TAFTIPS**I** | B*52 | I135X | 0.44 | 4.5 × 10⁻³* |
| RT (128–135) | TAFTIPS**I** | B*51 | I135X | 0.39 | 9.6 × 10⁻³* |
| RT (5–12) | IETVPV**K**L | B*40 | K11R | 0.35 | 0.04 |
| RT (33–41) | AL**V**EICTEM | A*02 | V35I | 0.32 | 0.04 |
| RT (158–166) | AI**F**QSSMTK | A*68 | F160L | 0.24 | 0.13 |
| RT (127–135) | YTAFTIPS**V** | A*02 | V135I | 0.07 | 0.65 |
| RT (158–166) | AIFQSSMT**K** | A*11 | K166R | 0.07 | 0.68 |
| RT (158–166) | AIFQSSMT**K** | A*03 | K166R | −0.0 | 0.98 |
| RT (202–210) | **I**EELRQHLL | B*40 | I202V | −0.10 | 0.55 |
| RT (158–166) | AIFQSSMT**K** | A*68 | K166R | −0.10 | 0.54 |
| RT (158–166) | AI**F**QSSMTK | A*03 | F160L | −0.23 | 0.14 |
| RT (158–166) | AI**F**QSSMTK | A*11 | F160L | −0.24 | 0.16 |
| RT (179–187) | VI**Y**QYMDDL | A*02 | Y181C | −0.24 | 0.12 |
| RT (309–317) | ILK**E**PVHGV | A*02 | E312D | −0.27 | 0.23 |
| RT (244–252) | I**V**LPEKDSW | B*57 | V245E | −0.28 | 0.08 |
| RT (202–210) | IEELR**Q**HLL | B*40 | Q207E | −0.46 | 3.4 × 10⁻³* |
^a The location of the amino acid polymorphism in the CTL epitope is indicated by bolding of the corresponding residue. Rows are sorted by Spearman's ρ values in descending order.
^b Asterisks (*) indicate associations that remained significant at the 0.05 level after adjusting for multiple comparisons using the Benjamini-Hochberg method.
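The Benjamini-Hochberg adjustment named in the table footnote can be sketched in a few lines of Python. This is an illustrative reimplementation, not the authors' R code; the input is the list of rounded P values as printed in Table 1, so the adjusted values are only approximate.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Rounded P values from Table 1, in row order (top to bottom).
table1_p = [
    1.6e-5, 4.6e-4, 1.3e-3, 4.5e-3, 9.6e-3, 0.04, 0.04, 0.13, 0.65, 0.68,
    0.98, 0.55, 0.54, 0.14, 0.16, 0.12, 0.23, 0.08, 3.4e-3,
]

adjusted = benjamini_hochberg(table1_p)
# Zero-based row indices whose adjusted P value falls below 0.05.
significant_rows = [i for i, q in enumerate(adjusted) if q < 0.05]
print(significant_rows)
```

With these rounded inputs, the rows that fall below 0.05 after adjustment are the first five rows and the last row of the table, which are the six rows marked with asterisks.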
Our results indicated a lack of consistent relationships among human populations between experimentally validated escape mutations in optimally defined CTL epitopes and their corresponding HLA allele frequencies. We found a wide range of associations between these variables, with only about half of the correlations being positive: overall, the median rank correlation was ρ = 0.07 (range, −0.46 to 0.67) (Table 1). After adjusting for multiple comparisons, positive correlations predominated in the significant subset (5:1), although 3 of the 5 positive correlations involved HIV-1 RT codon 135. The latter subset included the association featured in reference 5 between HIV-1 RT I135X (where X denotes any other residue) and HLA-B*51, which remained significant in comparisons among populations (Spearman's ρ = 0.39; P < 0.01). An even stronger association was obtained by restricting polymorphisms to I135T, the predominant B*51-driven escape mutation at this position (Fig. 1A; ρ = 0.47; P = 1.3 × 10⁻³). We also observed a significant positive correlation between I135X and HLA-B*52 among countries (11). The only other HIV RT polymorphisms with significant positive associations with HLA alleles at this level were E6D with B*40 and S162A with A*03 (Table 1). Of note, we observed a significant negative association between the polymorphism Q207E and HLA-B*40 (Fig. 1C). As in previous work (5), our analysis does not correct for the sequence divergence among HIV subtypes that may segregate with the global distribution of HLA alleles by chance (3). For example, Q207E occurs at high prevalence in non-B subtypes; therefore, its statistical association with B*40 may be partially confounded by the global HIV-1 subtype distribution. Importantly, although these results come from unmatched HIV and HLA data obtained from different samples, we obtained similar results from matched data (see the supplemental material).
FIG 1.
Associations between HIV-1 polymorphisms in CTL epitopes and their restricting HLA alleles among countries. Each point represents the relative frequencies of the HLA allele and HIV polymorphism in the given country. To resolve overlapping labels, a limited number of labels (italicized) were shifted up or down by a small amount. Representative subsets of three associations are shown as examples of positive (I135T and B*51) (A), nonsignificant (K166R and A*03) (B), and negative (Q207E and B*40) (C) correlations. The relative frequencies of I135X (not shown) and B*51 in Japan, the United Kingdom (U.K.), Australia, and South Africa were consistent with the data reported by Kawashima et al. (5). The corresponding Spearman rank correlations are reported in Table 1.
Taken together, our results illustrate the complexity of intra- and interhost adaptation of HIV to HLA-mediated selection pressures. Because the response to CTL selection within hosts may typically span several years (12), it tends to operate on a time scale similar to that of transmission. As a result, the statistical associations that emerge at the level of host populations depend on the intensity of selection for CTL escape in persons expressing the restricting HLA and on the frequency of reversion upon transmission to persons lacking it (13). For example, if reversion occurs faster than escape, then the prevalence of the escape mutation in the host population may be substantially lower than the frequency of the restricting HLA allele would predict (14). Consistent reversion, which implies fitness costs of CTL escape in relatively conserved viral regions such as RT, would agree with previous work; for example, Boutwell and colleagues (15) reported that the majority of CTL escape mutations in HIV-1 gag reduced virus replication capacity by 8% to 69%. Nevertheless, our results suggest that HIV polymorphisms such as RT-I135X are the exception rather than the rule and that adaptation of HIV to human populations may occur at only a limited number of sites in the genome.
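The interplay between escape and reversion can be made concrete with a deliberately simple toy model in Python. This sketch is not taken from the study or from the cited references: it assumes random transmission between hosts, a single restricting allele carried by a fraction `hla_freq` of hosts, and hypothetical per-infection probabilities `escape_prob` (escape arises before onward transmission in a host with the allele) and `reversion_prob` (reversion occurs before onward transmission in a host without it).

```python
def next_prevalence(q, hla_freq, escape_prob, reversion_prob):
    """Escape-variant prevalence among transmitted viruses, one step later."""
    # Hosts with the allele keep the variant or newly select it.
    in_hla_pos = q + (1.0 - q) * escape_prob
    # Hosts without the allele retain the variant unless it reverts.
    in_hla_neg = q * (1.0 - reversion_prob)
    return hla_freq * in_hla_pos + (1.0 - hla_freq) * in_hla_neg


def equilibrium_prevalence(hla_freq, escape_prob, reversion_prob):
    """Fixed point of next_prevalence (requires gain + loss > 0)."""
    gain = hla_freq * escape_prob
    loss = (1.0 - hla_freq) * reversion_prob
    return gain / (gain + loss)


# Hypothetical parameter values, chosen only for illustration.
h = 0.2
slow_escape = equilibrium_prevalence(h, escape_prob=0.3, reversion_prob=0.6)
fast_escape = equilibrium_prevalence(h, escape_prob=0.6, reversion_prob=0.3)
print(slow_escape, fast_escape)

# Iterating the update from zero prevalence approaches the same fixed point.
q = 0.0
for _ in range(200):
    q = next_prevalence(q, h, 0.3, 0.6)
print(q)
```

In this toy model the equilibrium prevalence is below the HLA allele frequency exactly when the reversion probability exceeds the escape probability, and above it in the opposite case, which mirrors the qualitative argument in the text.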
ACKNOWLEDGMENTS
This work was supported by Canadian Institutes of Health Research (CIHR) operating grants awarded to A.F.Y.P. (HOP-111406) and Z.L.B. (HOP-115700). Z.L.B. is a recipient of a CIHR New Investigator Award and currently holds a Scholar Award from the Michael Smith Foundation for Health Research (MSFHR). A.F.Y.P. is a recipient of an MSFHR Scholar Award in partnership with St. Paul's Hospital Foundation and Providence Health Care Research Institute and of a CIHR New Investigator Award (Canadian HIV Vaccine Initiative).
Footnotes
Supplemental material for this article may be found at http://dx.doi.org/10.1128/JVI.01355-15.
REFERENCES
- 1. Price DA, Goulder PJ, Klenerman P, Sewell AK, Easterbrook PJ, Troop M, Bangham CR, Phillips RE. 1997. Positive selection of HIV-1 cytotoxic T lymphocyte escape variants during primary infection. Proc Natl Acad Sci U S A 94:1890–1895. doi: 10.1073/pnas.94.5.1890.
- 2. Moore CB, John M, James IR, Christiansen FT, Witt CS, Mallal SA. 2002. Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science 296:1439–1443. doi: 10.1126/science.1069660.
- 3. Bhattacharya T, Daniels M, Heckerman D, Foley B, Frahm N, Kadie C, Carlson J, Yusim K, McMahon B, Gaschen B, Mallal S, Mullins JI, Nickle DC, Herbeck J, Rousseau C, Learn GH, Miura T, Brander C, Walker B, Korber B. 2007. Founder effects in the assessment of HIV polymorphisms and HLA allele associations. Science 315:1583–1586. doi: 10.1126/science.1131528.
- 4. Carlson JM, Brumme CJ, Martin E, Listgarten J, Brockman MA, Le AQ, Chui CK, Cotton LA, Knapp DJ, Riddler SA, Haubrich R, Nelson G, Pfeifer N, Deziel CE, Heckerman D, Apps R, Carrington M, Mallal S, Harrigan PR, John M, Brumme ZL, International HIV Adaptation Collaborative. 2012. Correlates of protective cellular immunity revealed by analysis of population-level immune escape pathways in HIV-1. J Virol 86:13202–13216. doi: 10.1128/JVI.01998-12.
- 5. Kawashima Y, Pfafferott K, Frater J, Matthews P, Payne R, Addo M, Gatanaga H, Fujiwara M, Hachiya A, Koizumi H, Kuse N, Oka S, Duda A, Prendergast A, Crawford H, Leslie A, Brumme Z, Brumme C, Allen T, Brander C, Kaslow R, Tang J, Hunter E, Allen S, Mulenga J, Branch S, Roach T, John M, Mallal S, Ogwu A, Shapiro R, Prado JG, Fidler S, Weber J, Pybus OG, Klenerman P, Ndung'u T, Phillips R, Heckerman D, Harrigan PR, Walker BD, Takiguchi M, Goulder P. 2009. Adaptation of HIV-1 to human leukocyte antigen class I. Nature 458:641–645. doi: 10.1038/nature07746.
- 6. Rhee SY, Gonzales MJ, Kantor R, Betts BJ, Ravela J, Shafer RW. 2003. Human immunodeficiency virus reverse transcriptase and protease sequence database. Nucleic Acids Res 31:298–303. doi: 10.1093/nar/gkg100.
- 7. Pond SL, Frost SD, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679. doi: 10.1093/bioinformatics/bti079.
- 8. Frahm N, Baker B, Brander C. 2008. Identification and optimal definition of HIV-derived cytotoxic T lymphocyte (CTL) epitopes for the study of CTL escape, functional avidity and viral evolution, p 3–24. In Korber BTM, Brander C, Haynes BF, Koup R, Moore JP, Walker BD, Watkins DI (ed), HIV molecular immunology. Los Alamos National Laboratory, Theoretical Biology and Biophysics, Los Alamos, NM.
- 9. Gonzalez-Galarza FF, Christmas S, Middleton D, Jones AR. 2011. Allele frequency net: a database and online repository for immune gene frequencies in worldwide populations. Nucleic Acids Res 39:D913–D919. doi: 10.1093/nar/gkq1128.
- 10. Fraga AM, Sukoyan M, Rajan P, Braga DP, Iaconelli A Jr, Franco JG Jr, Borges E Jr, Pereira LV. 2011. Establishment of a Brazilian line of human embryonic stem cells in defined medium: implications for cell therapy in an ethnically diverse population. Cell Transplant 20:431–440. doi: 10.3727/096368910X522261.
- 11. Yagita Y, Kuse N, Kuroki K, Gatanaga H, Carlson JM, Chikata T, Brumme ZL, Murakoshi H, Akahoshi T, Pfeifer N, Mallal S, John M, Ose T, Matsubara H, Kanda R, Fukunaga Y, Honda K, Kawashima Y, Ariumi Y, Oka S, Maenaka K, Takiguchi M. 2013. Distinct HIV-1 escape patterns selected by cytotoxic T cells with identical epitope specificity. J Virol 87:2253–2263. doi: 10.1128/JVI.02572-12.
- 12. Roberts HE, Hurst J, Robinson N, Brown H, Flanagan P, Vass L, Fidler S, Weber J, Babiker A, Phillips RE, McLean AR, Frater J, SPARTAC trial investigators. 2015. Structured observations reveal slow HIV-1 CTL escape. PLoS Genet 11:e1004914. doi: 10.1371/journal.pgen.1004914.
- 13. Poon AF, Kosakovsky Pond SL, Bennett P, Richman DD, Leigh Brown AJ, Frost SD. 2007. Adaptation to human populations is revealed by within-host polymorphisms in HIV-1 and hepatitis C virus. PLoS Pathog 3:e45. doi: 10.1371/journal.ppat.0030045.
- 14. Fryer HR, Frater J, Duda A, Palmer D, Phillips RE, McLean AR. 2012. Cytotoxic T-lymphocyte escape mutations identified by HLA association favor those which escape and revert rapidly. J Virol 86:8568–8580. doi: 10.1128/JVI.07020-11.
- 15. Boutwell CL, Carlson JM, Lin TH, Seese A, Power KA, Peng J, Tang Y, Brumme ZL, Heckerman D, Schneidewind A, Allen TM. 2013. Frequent and variable cytotoxic-T-lymphocyte escape-associated fitness costs in the human immunodeficiency virus type 1 subtype B Gag proteins. J Virol 87:3952–3965. doi: 10.1128/JVI.03233-12.