Journal of Virology. 2011 Aug;85(15):7523–7534. doi: 10.1128/JVI.02697-10

Demographic Processes Affect HIV-1 Evolution in Primary Infection before the Onset of Selective Processes

Joshua T Herbeck 1, Morgane Rolland 1, Yi Liu 1, Sherry McLaughlin 1, John McNevin 4, Hong Zhao 1, Kim Wong 1, Julia N Stoddard 1, Dana Raugi 1, Stephanie Sorensen 1, Indira Genowati 1, Brian Birditt 1, Angela McKay 1, Kurt Diem 2, Brandon S Maust 1, Wenjie Deng 1, Ann C Collier 3, Joanne D Stekler 3, M Juliana McElrath 4, James I Mullins 1,2,3,*
PMCID: PMC3147913  PMID: 21593162

Abstract

HIV-1 transmission and viral evolution in the first year of infection were studied in 11 individuals representing four transmitter-recipient pairs and three independent seroconverters. Nine of these individuals were enrolled during acute infection; all were men who have sex with men (MSM) infected with HIV-1 subtype B. A total of 475 nearly full-length HIV-1 genome sequences were generated, representing on average 10 genomes per specimen at 2 to 12 visits over the first year of infection. Single founding variants with nearly homogeneous viral populations were detected in eight of the nine individuals who were enrolled during acute HIV-1 infection. Restriction to a single founder variant was not due to a lack of diversity in the transmitter as homogeneous populations were found in recipients from transmitters with chronic infection. Mutational patterns indicative of rapid viral population growth dominated during the first 5 weeks of infection and included a slight contraction of viral genetic diversity over the first 20 to 40 days. Subsequently, selection dominated, most markedly in env and nef. Mutants were detected in the first week and became consensus as early as day 21 after the onset of symptoms of primary HIV infection. We found multiple indications of cytotoxic T lymphocyte (CTL) escape mutations while reversions appeared limited. Putative escape mutations were often rapidly replaced with mutually exclusive mutations nearby, indicating the existence of a maturational escape process, possibly in adaptation to viral fitness constraints or to immune responses against new variants. We showed that establishment of HIV-1 infection is likely due to a biological mechanism that restricts transmission rather than to early adaptive evolution during acute infection. 
Furthermore, the diversity of HIV strains coupled with complex and individual-specific patterns of CTL escape did not reveal shared sequence characteristics of acute infection that could be harnessed for vaccine design.

INTRODUCTION

While some HIV-1 infections result in the initial outgrowth of multiple viral variants, in most cases a single variant establishes infection (1, 4, 14, 19, 22, 24, 33, 34, 37, 54, 55, 61, 65, 73). Seroconversion generally occurs within 3 to 12 weeks (9); then, within 6 to 12 months, plasma viral levels typically reach a quasi-stable set point that is prognostic for disease progression (43, 46–48, 64). Symptoms of acute retroviral syndrome, when they are noted, coincide with emerging or peak viral loads, which then decline sharply as HIV-1-specific CD8+ cytotoxic T lymphocyte (CTL) responses emerge (6, 35).

Although viral populations early in HIV infection have been known for 2 decades to typically be nearly homogeneous (14, 75, 77), recent studies have better characterized HIV-1 sequences in the earliest weeks of infection, including sequences obtained prior to the selective pressure imposed by the nascent immune response of the newly infected individual (1, 22, 34, 61). The low genetic variability of viruses in early HIV-1 infection and the rapid viral population expansion and contraction that occur in the first weeks of infection underline the potential importance of stochastic processes in the earliest phases of HIV-1 adaptation to a new host. Analyses of both env (34) and whole-genome sequences (62) showed that, before peak viremia, HIV-1 evolution proceeds randomly under a star-like phylogeny, conforming to a model of exponential HIV-1 population growth without selective pressure (elaborated by Lee and colleagues [38] for single founder strains). In the week(s) after peak viremia, major changes occur in viral sequences, with clear signs of adaptive evolution reflected in the occurrence of new mutations. Most early mutations are selected by cellular immune responses (21), corroborating that CTL responses are a major force acting on the viral population in a single host (2, 41) as well as at the interhost population level (5, 32).

To better understand the interplay between stochastic and selective processes and how this affects initial HIV-1 viral outgrowth and adaptation to a new host, we studied transmitter-recipient transmission pairs and acutely infected individuals (five individuals first sampled in Fiebig stage I and four first sampled in stage V). We examined the evolutionary patterns observed in their HIV-1 subtype B genomes, using 475 nearly full-length (∼9,100 nucleotides) HIV-1 sequences derived at multiple time points for up to 350 days after the onset of clinical symptoms of primary HIV-1 infection. Genomic sequences were obtained prior to and following peak plasma viral load for three individuals, allowing us to assess how viral population dynamics and selection impact HIV-1 evolution in very early stages of infection.

MATERIALS AND METHODS

Study participants.

Eleven adult subjects were recruited through the University of Washington Primary Infection Clinic (PIC) and gave informed consent under clinical protocols approved by the University of Washington Institutional Review Board. All were men who have sex with men (MSM), and nine were enrolled in primary HIV-1 infection (Fiebig stages I to V [17]). All were antiretroviral therapy naïve during the study period. Blood samples were collected every week for the first month and every 1 to 3 months thereafter. Plasma specimens were tested for HIV-1 RNA, p24 antigen, and virus-specific antibodies to determine Fiebig stages. HLA class I genotyping was performed by sequence-specific primer PCR (absolute resolution to two digits and high-probability resolution to four digits) (8). The duration of infection was estimated as the number of days after the onset of symptoms of an acute retroviral syndrome; among PIC enrollees for whom a determination could be made, a median of 12 days elapsed between transmission and symptom onset (J. D. Stekler et al., unpublished data).

HIV-1 near-full-length genome sequencing.

Viral RNA extraction from plasma samples, cDNA synthesis, genome (∼9.1 kb) amplification, cloning, and sequencing were performed as described previously (59). We amplified viral genomes from single viral RNA templates. PCR amplification followed endpoint dilution methodology in order to avoid template resampling bias and was conducted using primers 1.U5 (TGAGTGCTTCAAGTAGTGTGTGCCCGTCTGT; HXB2 coordinates 541 to 571) and 1.3′3′pl (GGGTGAAGCACTCAAGGCAAGCTTTATTG; HXB2 coordinates 9611 to 9636) for the first-round PCR and primers 2.U5 (GGCCGCGGATCCAGTAGTGTGTGCCCGTCTGTTGTGTGACTC; HXB2 coordinates 552 to 581) and 2.3′3′pl (GGCCGCGCGGCCGCTGAAGCACTCAAGGCAAGCTTTATTGAGGCTTA; HXB2 coordinates 9604 to 9636) for the second-round PCR. For transmitters, we obtained 10 genomes from one (two for the subject designated transmitter 4 [T4]) time point near the estimated time of HIV-1 transmission. For individuals followed longitudinally, we obtained 9 to 15 genome sequences at up to 12 time points (up to 350 days after onset of symptoms). Genome and subgenomic sequences (encompassing ∼60% of the genome) for two individuals (T1 and recipient 1 [R1]) were reported previously (41).

IFN-γ ELISPOT assays.

Gamma interferon (IFN-γ) enzyme immunospot (ELISPOT) assays were done on cryopreserved peripheral blood mononuclear cells (PBMC) using a panel of HIV-1 subtype B peptides (9- to 11-mers) (see Table 1 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/); we report spot-forming cells per million PBMC (SFC/M) when >50 SFC/M after background is subtracted (41). IFN-γ ELISPOT assays were performed on PBMC from the individual designated seroconverter 1 (S1) at days 7 and 127 using 54 peptides (10 Gag, 15 Pol, 10 Env, 15 Nef, 2 Vif, and 2 Rev), on PBMC from subject R3 at days 25 and 144 using 47 peptides, and on PBMC from R4 at day 14 using 17 peptides.

Table 1.

Viral genome nucleotide diversity at the first visit in acute infection

Group and subject^a | Time postonset of symptoms^b | Fiebig stage^c | Genome sequence length (no. of nt)^d | Mean no. of InSites/genome^e | Mean pairwise genetic distance (range)^f | Mean HD (range)^g | Mean InSites HD (range)^h
--- | --- | --- | --- | --- | --- | --- | ---
Transmission pairs | | | | | | |
T1-PIC58368^i | ∼10 yr | VI | | | | |
R1-PIC87014^i | 8 days | I | 9,217 | 0.1 | 0.17 (0.08–0.27) | 14.98 (6–25) | 0
T2-PIC88403 | 14 days | V | 9,104 | 0.2 | 0.33 (0.17–0.48) | 28.09 (6–25) | 0.71 (0–2)
R2-PIC38051 | 3 days | I | 9,107 | 0.22 | 0.29 (0.19–0.37) | 25.28 (17–32) | 0.39 (0–1)
T3-PIC51550 | ∼9 yr | VI | | | | |
R3-PIC38417 | 25 days | V | 9,030 | 0.7 | 0.44 (0.27–0.68) | 40.07 (24–63) | 2.78 (0–5)
T4-PIC55751 | 18 days^j | V^j | 9,147 | 0.3 | 0.31 (0.17–0.45) | 27.67 (15–41) | 1.07 (0–3)
R4-PIC11286 | 13 days | V | 9,098 | 18.6 | 0.72 (0.15–1.49) | 67.38 (13–40) | 47.51 (0–110)
R4, variant 1 (n = 7) | | | | | 0.32 (0.14–0.52) | 28.76 (13–43) |
R4, variant 2 (n = 3) | | | | | 0.76 (0.57–0.96)^k | 67.33 (51–84) |
Seroconverters | | | | | | |
S1-PIC71101 | 7 days | I | 9,096 | 0.3 | 0.36 (0.22–0.51) | 31.44 (20–41) | 0.36 (0–1)
S2-PIC83747 | 6 days | I | 9,110 | 0.1 | 0.3 (0.14–0.39) | 26.53 (13–36) | 0.36 (0–1)
S3-PIC90770 | 3 days | I | 9,057 | 0.4 | 0.39 (0.21–0.57) | 34.51 (19–52) | 1.07 (0–3)
^a Transmission pairs consist of the transmitting (T) and recipient (R) partners; seroconverters S1 to S3 are subjects without identified transmitting partners. PIC identification numbers are given.

^b Time postonset of clinical symptoms of primary HIV infection at visit 1.

^c Fiebig stages (I to V) within acute infection were defined according to reference 17. Stage VI corresponds to chronic infection of open-ended duration.

^d Total nucleotide (nt) length of nearly full-length genome sequence alignment.

^e Mean number of phylogenetically informative sites (InSites) per genome sequence at visit 1 (phylogenetically informative sites are sites at which a mutation is present in at least two sequences in the individual's data set).

^f Mean pairwise diversity (HKY substitution model corrected) for visit 1 genome sequences.

^g Mean Hamming distance for visit 1 genome sequences (all nucleotide sites).

^h Hamming distance for genome phylogenetically informative sites only.

^i Sequences from these subjects were described in reference 41.

^j Subject T4, although enrolled during primary infection, was infected for ∼9 years prior to transmission to R4.

^k Recombination among variants resulted in higher diversity for this variant population.

Sequence analysis.

Nucleotide sequences were aligned with Clustal W, version 1.8 (70), and manually edited with MacClade, version 4.08 (44). Alignments are available at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/. Alignments of phylogenetically informative nucleotide sites omit mutations that occur only once, which are possibly introduced by polymerase-induced errors during PCR. This informative-sites (InSites) approach (http://indra.mullins.microbiol.washington.edu/DIVEIN/insites.html) results in slightly decreased estimates of nucleotide diversity relative to single-template amplification methods (61) although standard methods of PCR/cloning have been shown to produce measures of population structure and genetic diversity equivalent to those found with single-genome amplification methods (31). An insertion or deletion that spanned multiple sites was counted as a single informative site. APOBEC3G/APOBEC3F (APOBEC3F/G)-induced mutations were evaluated using Hypermut, version 2.0 (http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html), in intrahost datasets by taking the consensus sequence at visit 1 as a reference; one putative APOBEC-induced G-to-A hypermutated sequence in subject S1 was identified and excluded from subsequent analyses. Maximum-likelihood phylogenetic trees were reconstructed using the general time-reversible model of substitution with gamma distribution in PhyML (version 2.4.5) (23). Potential N-linked glycosylation sites (PNGS) in Env were predicted using N-GLYCOSITE (76). All Env sequences were evaluated for CCR5 or CXCR4 coreceptor specificity using the position-specific scoring matrix (PSSM) web tool (30) (http://indra.mullins.microbiol.washington.edu/webpssm). For each individual with five or more sequenced time points, the rate of nucleotide diversity increase was estimated using univariate linear regression analysis.
Overall rates of diversity increase were calculated by pooling all data points and, alternatively, by estimating the mean of rates calculated separately for each individual. Intrahost phylogenies were reconstructed to identify distinct lineages, taken to indicate multiple founders, replicating within each individual. Using sequences from the first time point examined (visit 1), we examined the distribution of pairwise genetic diversity and Hamming distances (HD; the uncorrected count of nucleotide differences between two sequences) for all nucleotide sites and for phylogenetically informative sites.
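The two descriptive measures used above, Hamming distance and phylogenetically informative sites, can be sketched in a few lines. This is a minimal illustration and not the InSites tool itself: the function names and the toy alignment are hypothetical, gaps are treated as ordinary characters, and multi-site insertions or deletions are not collapsed into a single site as they were in the study.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Uncorrected count of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(seqs):
    """Mean Hamming distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def informative_sites(seqs):
    """Alignment columns where a non-majority state occurs in >= 2 sequences."""
    sites = []
    for i, column in enumerate(zip(*seqs)):
        counts = Counter(column).most_common()
        # counts[0] is the majority state; states seen only once are ignored,
        # mirroring the exclusion of possible polymerase-induced errors.
        if any(n >= 2 for _, n in counts[1:]):
            sites.append(i)
    return sites


# Hypothetical toy alignment (not study data).
toy = ["ACGTA", "ACGTA", "ATGTA", "ATGTC", "ACGTA"]
print(informative_sites(toy))      # 0-based column indices
print(mean_pairwise_hamming(toy))
```

In the toy alignment the T at the second column is shared by two sequences and is kept, whereas the single C at the last column is discarded as a possible PCR error.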

Identifying signatures of sequence evolution. (i) Neutrality tests.

Two statistical tests of neutral evolution implemented in the DnaSP software (60) were used. Tajima's D (69) is based on the difference between two estimates of θ (θ = 2Neμ in a haploid population, where Ne is effective population size, and μ is the mutation rate per generation); one estimate is based on the number of segregating nucleotide sites (θW), and the other is based on the average pairwise distance (π, θπ). In a population of constant size in neutral equilibrium, the two estimates of θ will be statistically indistinguishable, and values of D are near zero. Deviations from zero (the null hypothesis of neutral evolution) can reflect selective or demographic processes. The D* of Fu and Li (18) compares θW to θ based on the total number of mutations on a genealogy. D and D* were calculated for genome nucleotide alignments at each time point. Bonferroni corrections for 36 tests (P = 0.05) were done after P values were estimated from null distributions created from 10^4 simulations under a neutral coalescent model with no recombination, conditioned on the sample size and level of variation in the observed data (60).
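For readers unfamiliar with the statistic, Tajima's D can be computed directly from an alignment using the standard formula from reference 69. The sketch below is not the DnaSP implementation used in the study: the function name is hypothetical, every polymorphic column counts as one segregating site, gaps are not handled, and no P value is simulated.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    if s == 0:
        return None  # D is undefined without variation
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: pairwise-difference estimate minus Watterson's estimate.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical star-like sample: every mutation is a singleton.
star = ["CAAAA", "ACAAA", "AACAA", "AAACA", "AAAAA"]
print(tajimas_d(star))  # negative, as expected for an excess of rare variants
```

An excess of singletons, as produced by a recent expansion from one founder, pulls π below S/a1 and gives a negative D; two haplotypes at intermediate frequency give a positive value.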

(ii) Neutrality tests across multiple loci.

D and D* were calculated separately for each gene, with the heuristic assumption of free recombination among genes (67) and no recombination within them (i.e., to test for deviations from neutrality among genes, we assumed that genes are independent and that selection operating on Gag will not affect Env). D and D* values were then compared across env, gag, nef, and pol, with statistical correction for 144 tests after estimating P values as above (60). Quantitative evidence of distinct evolutionary processes among loci was evaluated using a Hudson-Kreitman-Aguadé (HKA) test (28), treating each time point and gene alignment (env, gag, nef, and pol) as a separate population. The HKA test is based on the fact that selection acting on a specific locus will violate the neutral condition where Ne is equivalent across loci, with statistical significance estimated with a χ² goodness-of-fit test (28).
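The effect of the multiple-testing corrections mentioned above (36 genome-level tests, 144 gene-level tests) can be made concrete with a short calculation. The helper names below are hypothetical, and the P values in the usage line are invented for illustration only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff under a Bonferroni correction."""
    return alpha / n_tests


def significant(p_values, alpha=0.05, n_tests=None):
    """Flag P values below the Bonferroni cutoff.

    n_tests defaults to the number of P values supplied.
    """
    m = n_tests if n_tests is not None else len(p_values)
    cutoff = bonferroni_threshold(alpha, m)
    return [p < cutoff for p in p_values]


print(bonferroni_threshold(0.05, 36))   # genome-level analyses
print(bonferroni_threshold(0.05, 144))  # gene-level analyses
print(significant([0.001, 0.0001], n_tests=144))  # illustrative P values
```

With 144 tests the cutoff is four times stricter than with 36, which is why a P value that passes at the genome level may not pass in the gene-specific analysis.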

(iii) Assessing positive selection.

We tested for evidence of positive selection in all nine HIV-1 genes in each individual with five or more time points. First, we measured the ratio of nonsynonymous (dN) to synonymous (dS) substitutions, dN/dS, or ω, (20, 56) using HyPhy (http://www.datamonkey.org/) (53). The fixed-effects likelihood (FEL) method with the general reversible nucleotide substitution model (REV) was used, and sites with ω of >1 and P of <0.1 were considered to be under positive selection. Second, we tested for directional positive selection using the method of Liu et al. (41), which compares the accumulation rate of amino acid mutations to the expected rate if the accumulation were due to genetic drift alone (determined by simulation).

In silico protein sequence analysis. (i) Epitope repertoires.

HLA-specific HIV-1 epitopes were predicted in all protein sequences using Epipred (25; http://atom.research.microsoft.com/bio/epipred.aspx) and NetMHC (10, 49). Epipred identifies known and potential CTL epitope motifs using 2-digit HLA information; we accepted all epitope motifs with a posterior probability of >0.5. NetMHC predicts binding of peptides to 4-digit HLA alleles; we accepted both strong and weak binders.

(ii) Comparison of each proteome to the consensus at visit 1.

For each individual, we derived a consensus from sequences found at visit 1 (in the event of two founder viruses, two respective consensus sequences were derived). Each sequence from later visits was compared to the visit 1 consensus, and we tracked the frequency of all amino acid mutations longitudinally.
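The longitudinal comparison to the visit 1 consensus can be sketched as follows. This is an illustrative outline under simple assumptions rather than the pipeline used in the study: the function names are hypothetical, sequences are assumed to be aligned protein strings of equal length, ties in the consensus are broken by first occurrence, and the two-founder case is not handled.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority-rule consensus of aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def track_mutations(visit1_seqs, later_visits):
    """Return {(position, amino_acid): {day: frequency}} relative to visit 1.

    Positions are 1-based; later_visits maps a sampling day to a list of
    aligned sequences from that visit.
    """
    ref = consensus(visit1_seqs)
    freqs = defaultdict(dict)
    for day, seqs in sorted(later_visits.items()):
        for pos, col in enumerate(zip(*seqs), start=1):
            for aa, n in Counter(col).items():
                if aa != ref[pos - 1]:  # record only non-consensus residues
                    freqs[(pos, aa)][day] = n / len(seqs)
    return dict(freqs)


# Hypothetical three-residue example (not study data).
visit1 = ["MKT", "MKT", "MRT"]
later = {21: ["MRT", "MRT", "MKT", "MRT"], 44: ["MRT"] * 4}
print(track_mutations(visit1, later))
```

In this toy case the K-to-R change at position 2 rises from a minority at visit 1 to fixation by day 44, which is the kind of trajectory tracked across visits.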

(iii) Population frequency of specific amino acids.

For each amino acid mutation, we calculated the frequency of the mutant and consensus amino acid in circulating HIV-1 sequences, using a data set of independent HIV-1 clade B sequences (comprising 200 sequences from Env, 125 from Gag, 227 from Pol, 514 from Nef, 184 from Rev, 286 from Tat, 327 from Vif, 225 from Vpr, and 203 from Vpu) (57). We defined an amino acid mutation as a forward (putative escape) mutation when there was a decrease of >50% between the database frequencies of the visit 1 consensus amino acid and the mutated amino acid. Conversely, a reversion corresponded to an increase of >50% between the database frequencies of the visit 1 consensus amino acid and the mutated amino acid.
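The forward/reversion rule can be written as a small classifier. The text does not state whether the 50% threshold is a relative or an absolute change in database frequency; the sketch below assumes a relative change with respect to the visit 1 consensus residue, and that assumption, the function name, and the handling of a zero consensus frequency are illustrative choices rather than the authors' definitions.

```python
def classify_mutation(cons_freq, mut_freq, threshold=0.5):
    """Classify a mutation from database frequencies of the two residues.

    ASSUMPTION: the >50% change is measured relative to the frequency of the
    visit 1 consensus residue; the source text does not specify this.
    """
    if cons_freq == 0:
        # Relative change is undefined; treat any observed mutant residue
        # as a move toward a more common state (illustrative choice).
        return "reversion" if mut_freq > 0 else "other"
    change = (mut_freq - cons_freq) / cons_freq
    if change < -threshold:
        return "forward"    # putative escape: toward a rarer residue
    if change > threshold:
        return "reversion"  # toward a more common residue
    return "other"


print(classify_mutation(0.8, 0.1))  # common residue replaced by a rare one
print(classify_mutation(0.1, 0.8))  # rare residue replaced by a common one
```

Under the alternative reading (an absolute difference of more than 50 percentage points), the comparison would be `mut_freq - cons_freq` against the threshold directly.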

RESULTS

We analyzed HIV-1 subtype B evolution in 11 individuals, including four transmitter-recipient pairs (Fig. 1) and three independent seroconverters. At the time of transmission, three transmitters were chronically infected while one was acutely infected (Table 1). Sequences from transmission pair 1 (subjects T1 and R1 in Table 1) have been described previously (41); from the nine other individuals, 475 nearly full-length viral genome sequences (“genomes”) were generated, representing an average of 10 genomes at up to 12 serial time points (Table 2). No evidence of dual infection was found. All viruses were predicted by PSSM (30) to use the CCR5 coreceptor, as expected for early HIV-1 infection (63).

Fig. 1.

Phylogenetic trees of HIV-1 genome sequences. The large (left side, unboxed) tree contains nine individuals and is based on 9 to 15 full genome sequences from each sampled time point (up to 12 time points extending up to 350 days after the onset of symptoms). The boxed tree (lower right) contains four transmitter-recipient pairs (T1-R1, T2-R2, T3-R3, and T4-R4) and is based on genome sequences from the recipient's first sampled time point during acute infection (and from the transmitter near the time of transmission). Pair T1-R1 is not included in the larger tree because sequencing at later time points primarily involved subgenomic fragments. Both trees were generated using the maximum-likelihood method with the general time reversible-gamma distribution model employed in PhyML (version 2.4.5). The scale bar represents the number of substitutions per site.

Table 2.

Longitudinal follow-up of subjects

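Transmission pair follow-up data

Pair | Subject | DPS^a | No. of viral NFLG^b | Plasma VL (RNA copies/ml)^c
--- | --- | --- | --- | ---
Pair 2 | T2 | 14 | 10 | 290,040
Pair 2 | R2 | 3 | 9 | 620,820
Pair 2 | R2 | 7 | 9 | 2,498,840
Pair 3 | T3 | ∼9 yr* | 10 | 28,610
Pair 3 | R3 | 25 | 10 | 472,810
Pair 3 | R3 | 33 | 10 | 169,580
Pair 3 | R3 | 40 | 12 | 117,310
Pair 3 | R3 | 53 | 10 | 71,230
Pair 3 | R3 | 83 | 10 | 30,650
Pair 3 | R3 | 116 | 11 | 32,430
Pair 3 | R3 | 144 | 10 | 24,740
Pair 3 | R3 | 200 | 10 | 19,460
Pair 3 | R3 | 247 | 10 | 14,610
Pair 4 | T4 | 3,343 | 10 | 40,600
Pair 4 | T4 | 3,394 | 9 | 22,190
Pair 4 | R4 | 13 | 10 | 2,804,000
Pair 4 | R4 | 18 | 10 | 54,160
Pair 4 | R4 | 26 | 10 | 29,180

Seroconverter follow-up data^d

Subject | DPS^a | No. of viral NFLG^b | Plasma VL (RNA copies/ml)^c
--- | --- | --- | ---
S1 | 7 | 10 | 377,060
S1 | 13 | 10 | 1,912,200
S1 | 21 | 15 | 312,200
S1 | 44 | 10 | 32,630
S1 | 68 | 10 | 232,050
S1 | 96 | 10 | 109,950
S1 | 127 | 10 | 45,590
S1 | 181 | 10 | 105,870
S2 | 6 | 10 | 923
S2 | 12 | 10 | 1,049,870
S2 | 20 | 10 | 26,785,000
S2 | 42 | 9 | 665,680
S2 | 48 | 10 | 668,660
S2 | 56 | 14 | 124,000
S2 | 69 | 9 | 254,300
S2 | 98 | 11 | 151,000
S2 | 151 | 10 | 242,940
S2 | 210 | 10 | 60,950
S2 | 273 | 10 | 51,120
S2 | 350 | 10 | 49,600
S3 | 3 | 10 | 469,830
S3 | 11 | 9 | >1,600,000
S3 | 15 | 10 | 42,810
S3 | 71 | 10 | 12,200
S3 | 99 | 9 | 14,170
S3 | 128 | 10 | 18,210
S3 | 196 | 10 | 70,780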
^a DPS, days postonset of clinical symptoms, with the exception of the value for T3 (*).

^b NFLG, nearly full-length genome sequence.

^c VL, viral load.

^d No transmitter identified.

HIV-1 infection is typically founded by a single variant.

Eight of the nine acutely infected individuals had infections founded by a single HIV-1 lineage, while one individual (R4) replicated two lineages. Founder viral populations were remarkably homogeneous (Fig. 1) based on Hamming distances and pairwise diversity measures (Table 1). For single founder infections, the mean pairwise diversity among genomes at visit 1 was 0.32% (range, 0.17 to 0.44%) (Table 1). For R4, the individual with two founder lineages identified at 13 days postonset of symptoms of acute retroviral infection (referred to as “days”), the two distinct variant lineages differed by a mean of 1.12%, whereas each individual lineage was nearly homogeneous (Table 1).

Transmitter-recipient transmission pairs.

Transmission pair 2 corresponds to two individuals with acute infections. Since the transmitter, T2, had a single variant with little diversity among sequences, infection was founded by a single strain in the recipient partner (R2) (Fig. 1). Sequences from both individuals were intermingled in the tree, and interhost genome pairwise distances ranged from 0.18% to 0.43%, a range conforming to the variation seen with a single founder within one host at the earliest time point (Table 1). Transmissions in two other pairs resulted in infections established by a single founder strain (pairs 1 and 3) even though each of the transmitting partners was chronically infected, with extensive diversity among their sequences (41). Transmitting partner T4 had been enrolled during primary infection, and little viral genetic variation was observed (Fig. 1 and Table 1), but 9 years later, at the time of transmission to R4, genomes from T4 contained extensive variation, and two variants were found in primary infection in the recipient (see Fig. 1 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/).

For the four transmission pairs, we compared sequences from the recipient to sequences from the respective transmitter. There were exact matches (100% identity) between transmitter and recipient sequences when we considered the conserved genes gag or pol. However, over the whole genome there were no exact matches between recipient sequences and those from the transmitting partner; the divergence between the closest transmitter and recipient sequences ranged from 0.18% (T2-R2) to 1.69% (T3-R3).

An important question is whether the founder variant in the recipient can be distinguished from sequences in the transmitter due to properties advantageous for the establishment of infection. That is, is the founder variant rare or common (typical) in the transmitter? To address this question for each transmission pair, we compared the consensus of the recipient population to each sequence in the transmitter. From the resulting ranked pairwise distances, we identified an approximate transmitted variant, i.e., the transmitter sequence that is most closely related to the recipient consensus (founder variant). Next, we compared all transmitter sequences to the transmitter consensus sequence under the hypothesis that an approximate transmitted variant that is rare would have a greater distance to the transmitter consensus than most, if not all, other transmitter sequences. Figure 2 shows the distribution of genetic distances for all the sequences from the transmitter. For each transmission pair, the transmitter sequence that matches most closely the consensus sequence in the recipient at visit 1 (i.e., our best approximation of the transmitted/founder virus) was found to be representative of the sequences in the transmitter. Indeed, it was generally very close to the mean genetic distance corresponding to all the transmitter sequences. Thus, the approximate founder variants did not appear to be unusual or rare in the transmitter (Fig. 2).
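The comparison described above can be outlined in code. This is a simplified sketch, not the analysis behind Fig. 2: it uses uncorrected Hamming distances in place of model-corrected genetic distances, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Uncorrected count of differing positions between aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def consensus(seqs):
    """Majority-rule consensus of aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def approximate_transmitted_variant(transmitter_seqs, recipient_seqs):
    """Find the transmitter sequence closest to the recipient consensus.

    Returns (index of that sequence, its distance to the transmitter
    consensus, mean distance of all transmitter sequences to that consensus).
    """
    rec_cons = consensus(recipient_seqs)
    tr_cons = consensus(transmitter_seqs)
    # Step 1: rank transmitter sequences by distance to the recipient consensus.
    idx = min(range(len(transmitter_seqs)),
              key=lambda i: hamming(transmitter_seqs[i], rec_cons))
    # Step 2: compare every transmitter sequence to the transmitter consensus.
    dists = [hamming(s, tr_cons) for s in transmitter_seqs]
    return idx, dists[idx], sum(dists) / len(dists)


# Hypothetical toy data (not study sequences).
transmitter = ["AAAAAA", "AAAAAA", "AAAAAC", "AAAACC", "AAACCC"]
recipient = ["AAAACC", "AAAACC", "AAAACG"]
print(approximate_transmitted_variant(transmitter, recipient))
```

If the approximate transmitted variant were rare in the transmitter, its distance to the transmitter consensus (second value) would sit well above the mean (third value); in the toy data the two are equal, which corresponds to the "typical" outcome reported for the study pairs.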

Fig. 2.

The recipient founder virus is typical of the viral population in the transmitter. Distribution of genetic distances between the 10 transmitter sequences obtained near the time of transmission and the corresponding consensus sequence in the transmitter is shown. The line represents the mean genetic distance for each transmitter. The transmitter sequence that was the closest (had the lowest pairwise genetic distance) to any recipient sequence at visit 1 is represented as an open symbol. For pair T4-R4, two variants established infection in the recipient, so the transmitter sequence closest to each variant is shown. T2 was acutely infected at the time of transmission to R2.

Stochastic versus selective processes in the first weeks of HIV-1 infection.

Viral genome diversity increased over time across all individuals at a yearly rate of 0.55% for all nucleotide sites (Fig. 3A); the rate of accumulation of selected sites corresponded to an average of 40 sites in each subject in the first year of infection (Fig. 3B). However, the evolutionary rate at the genome level masks decoupled rates in the different genes. When we examined individual genes, as expected, the average rates of diversification were lower in gag (0.33%) and pol (0.31%) and higher in env or C2V5 (1.07%) and nef (1.34%) (all values are pooled estimates for the four individuals chosen because they had five or more time points evaluated).
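The two ways of summarizing the rate of diversity increase (pooling all data points versus averaging per-individual slopes) can be illustrated with ordinary least squares. The function names and the diversity values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study measurements.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def diversity_rates(series, min_points=5):
    """Yearly rates of diversity increase: (pooled, mean of individual slopes).

    series maps a subject to [(day, diversity_percent), ...]; subjects with
    fewer than min_points time points are excluded, as in the text.
    """
    kept = {s: pts for s, pts in series.items() if len(pts) >= min_points}
    pooled_pts = [p for pts in kept.values() for p in pts]
    pooled = ols_slope(*zip(*pooled_pts))
    individual = [ols_slope(*zip(*pts)) for pts in kept.values()]
    return pooled * 365, sum(individual) / len(individual) * 365


# Hypothetical diversity trajectories (percent), not study data.
toy = {
    "A": [(0, 0.2), (100, 0.3), (200, 0.4), (300, 0.5), (400, 0.6)],
    "B": [(0, 0.3), (100, 0.6), (200, 0.9), (300, 1.2), (400, 1.5)],
    "C": [(0, 0.1), (50, 0.9)],  # too few time points, excluded
}
print(diversity_rates(toy))
```

When subjects are sampled on the same days, as in the toy data, the pooled slope equals the mean of the individual slopes; with uneven sampling the two estimates can diverge, which is a reason to report both.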

Fig. 3.

Trends in genetic diversity, positive selection, potential N-linked glycosylation, and epitope number. (A) Mean pairwise nucleotide diversity across genomes (corrected with the Hasegawa-Kishino-Yano [HKY] substitution model). (B) Cumulative number of amino acid sites under positive selection as identified by FEL or the simulation method of Liu et al. (41). Only positively selected sites at which two or more mutations occur at the same time point were counted (in order to better identify the beginning of selective events). (C) Mean pairwise nucleotide diversity across genomes; values for each time point are reported relative to those found in the first sampled time point (visit 1). (D) Mean number of epitopes predicted by the Epipred algorithm over the first 200 days after symptoms; as in panel C, values for each time point are relative to the first sampled time point. The dashed lines cross zero. (E) Potential N-linked glycosylation sites.

At visits in the first month of infection, we observed a transient decrease (a dip) in nucleotide diversity for both genomes (Fig. 3C) and independent gene sequences (data not shown). This suggested a contraction in diversity following the establishment of infection. We also noted a decrease in APOBEC3F/G-mediated mutations that coincided with the dip in nucleotide diversity (see Fig. 2 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/), yet the dip in nucleotide diversity was of substantially larger magnitude and thus not due to the decrease in APOBEC-induced mutations (see Fig. 2 posted at the URL mentioned above).

To evaluate potential factors behind the dip in nucleotide diversity and assess the forces acting on the viral population very early in infection, we assessed how the data conformed to the neutral theory. The dramatic, several-order-of-magnitude change in plasma viremia that occurs during acute infection suggests that changes in HIV-1 population size (i.e., demographic processes) might influence genetic diversity in this time period. Trends in genome diversity and divergence are plotted along with viral load data in Fig. 4. We performed neutrality tests on genomes from the four individuals with five or more sequential visits (R3, S1, S2, and S3). Both Tajima's D (69) and Fu and Li's D* (18) tests revealed negative deviations from neutral evolution, suggesting positive selection, demographic events, or both (69) (see Table 2 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/). The most significant negative deviations (P < 0.001) were observed in the earliest time points after infection, specifically before ∼50 days, coinciding with the rapid viral population growth and contraction during acute infection (shaded in Fig. 4). Next, to distinguish demographic and selective processes, we calculated D and D* separately for env, gag, nef, and pol; there was no evidence of selection acting specifically on a particular gene as genomes and individual genes showed similar patterns, implying the existence of demographic processes acting uniformly across genomes. Significant negative deviations were again more common at the first time points, and the strongest P values in the gene-specific analyses coincided with negative deviations in the whole-genome analyses.
Since sequential visits are not independent due to shared evolutionary history, the number of independent tests can be reduced (compared to strict Bonferroni correction for 144 tests), thus revealing significant deviations from neutrality in the early time points (see Table 3 posted at the URL mentioned above). In addition, in pairwise comparisons of genes for each time point, the Hudson-Kreitman-Aguadé (HKA) tests (28) revealed no sign of adaptive positive selection.

Fig. 4.

Stochastic processes predominate during acute HIV-1 infection. Plots for four individuals followed from 3 up to 350 days postonset of symptoms. Trends in pairwise diversity and divergence from the first visit consensus for genome nucleotide alignments are shown along with plasma viral RNA load over the same period (the legend is shown in the S3 panel). Times during which significant negative deviations for Tajima's D and Fu and Li's D* neutrality tests were detected are shown in shaded blocks.

The significant negative deviations observed for both genomes and separate genes persisted until the rapid decline in viral loads (Fig. 4). Importantly, negative D and D* values that are due to demographic processes can result from a founder effect or from a recent population expansion with a subsequent delay in the population reaching neutral equilibrium (69). Evolution of HIV-1 during acute infection is therefore marked by both a founder effect and subsequent population expansion. We conclude that the observed early dip in viral diversity is likely caused by rapid viral population expansion; the process of population expansion can result in decreased mean population diversity as most lineages in the growing population are descended from a limited number of ancestral lineages.

Indicators of selection begin to appear in the first week after onset of symptoms.

While demographic processes predominated at the earliest time points (before ∼50 days), later visits revealed the role of positive selection in HIV-1 primary infection. Using a comparative dN/dS approach (53) and a simulation approach that identifies directional selection (41), we identified amino acid sites under positive selection for the four individuals whose data are shown in Fig. 3B. Over the whole proteome, an average of 24 sites were under positive selection for each individual (range, 20 in R3 with 222 days of follow-up to 37 in S2 with 346 days of follow-up) (see Table 4 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/). No significant change in the number of potential N-linked glycosylation sites (PNGS) was seen over these time periods or between transmitters and recipients (Fig. 3E). The mean number of PNGS ranged between 27 and 34 per sequence. However, only two to five PNGS had variation (of which only one site, in S1, had a positively selected mutation).

To assess T cell-mediated pressure on HIV-1 evolution, we analyzed CTL responses and predicted epitopes based on each individual's HLA type. Akin to the dip in viral diversity, we noted that the average number of predicted epitopes also decreased in the first ∼50 days after infection (Fig. 3D). However, with the exception of subject S2, these dips occurred later and for a more prolonged period than the dips in viral diversity for the same individuals. The above data along with CTL response data are illustrated for four newly infected individuals: three enrolled in Fiebig stage I (Fig. 5A; see also Fig. 3 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/) and one in Fiebig stage V (see Fig. 4 posted at the URL mentioned above). Overall, mutations accumulated gradually over the genome through time. The initial appearance of a mutation that later came to fixation in the Tat protein was detected at 7 days in subject S1 (Fig. 5A). The earliest fixations of mutant amino acids were at day 21 (in Tat from S1) (Fig. 5A) and day 33 (in Nef from R3) (see Fig. 4 posted at the URL mentioned above), although the mutation in Tat was not identified as positively selected by the two algorithms used here because of the extremely abrupt change in the population. By ∼6 months postonset of symptoms (181 to 210 days), positively selected mutations were much more frequent, ranging from 9 in subject S2 to 18 in S3. Selected loci were more frequent in the 3′ half of the genome, which includes the most variable HIV-1 genes.

Fig. 5.

InSites diagrams of genomes from longitudinal samples. The figure shows the alignment of phylogenetically informative sites identified in genome sequences relative to the visit 1 consensus sequence in the recipient. Genome sequences from different time points are separated by horizontal lines; days postonset of symptoms are displayed on the left of each row. The header row includes the visit 1 consensus sequence with HXB2 numbering, shaded in gray for positively selected sites (as detected by FEL or by a simulation approach [41, 53]) and in purple for putative N-linked glycosylation sites. The bottom row shows known or potential epitopes predicted by NetMHC (unshaded), by Epipred (shaded in coral), or by both methods (in yellow). Amino acid sites within epitopes are shaded in black; amino acid sites located near known or predicted epitopes (up to 5 aa away) are shaded in gray. Green boxes surround HIV-1 segments recognized by IFN-γ ELISPOT responses. Red boxes surround mutually exclusive mutation patterns. Orange cells represent forward mutations (decrease in database frequency of the amino acid by 50% or more), blue cells represent reverse mutations (increase in database frequency of the amino acid by 50% or more), and green cells represent less substantial changes in database frequency.

When examining mutations in CTL epitopes (recognized and predicted), we noted several instances of the initial mutations being replaced by secondary mutations located nearby and usually in mutually exclusive sequence patterns. These patterns were seen in each of the four individuals followed for more than 180 days at one to eight sites across the proteome (Fig. 5A; see also Fig. 3 and 4 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/). Similar to the first amino acid mutations, the second mutations noted were most often to amino acids of low database frequency. These mutual exclusion patterns were seen in epitopes corresponding to three CTL responses against Env in S1 (outlined in green boxes in Fig. 5). In this complex case, the original Env epitopes were replaced by day 68 by two to four variants harboring mutually exclusive mutations. A response was detected against the known epitope SFNCGGEFF (C04; residues 375 to 383) (SFC of 620 at day 127; not measured at day 7), which had been replaced by day 68 by two mutually exclusive variants (mutated residues are underlined in the sequences) SVNCGGEFF and SFNCRGEFF. The epitope RRGWEILKY (A01; residues 787 to 795) represented >90% of sequences until day 13 and only 27% at day 21 and was not detected afterwards, while the variant RRGWETLKY became the consensus (the ELISPOT assay response was 15 SFC at day 7 and 715 at day 127). A stronger response at day 127 (SFC of 1,310; not detected at day 7) was elicited against RQGLERALL (B08; residues 848 to 856), which was the predominant variant until day 44, when it was replaced by RQGLERVLL/RQGLERAFL. Five more CTL responses were detected by ELISPOT assay in subject S1; however, their targeted epitopes showed no sequence variation over 181 days of follow-up. Three other examples of mutually exclusive mutations were observed in this subject. 
In Pol, two mutations were 7 amino acids (aa) apart in the predicted epitope B*0801; in RGRRKVVSL an R-to-K mutation found at position 1 was in mutual exclusion with an S-to-P mutation at position 8 of the epitope. In Rev, the mutually exclusive mutations were 7 aa apart, but only one site was found within a predicted epitope. In one case, the known Gag B08-restricted epitope DCKTILKAL (residues 197 to 205) was transiently replaced between days 68 and 96 by DCRTILKAL (Fig. 5A). This corresponded to a switch in database frequency from 94% to 3% for the K331R mutation (a previously documented HLA-B08-associated polymorphism [50]). The resurgence of the original amino acid at day 127 was accompanied by a mutually exclusive A-to-S mutation located 2 aa downstream of the epitope, corresponding to a 76% decrease in database frequency (from 87% to 11%).
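The notion of mutual exclusion used above can be made explicit with a short Python sketch: two mutated positions are treated as mutually exclusive when no sampled sequence carries both. The epitope sequences are those reported for the Env C04 epitope in S1; the number of copies of each variant in the sample and the helper names are hypothetical.

```python
from itertools import combinations


def mutated_positions(founder, variant):
    """1-based positions where `variant` differs from `founder`."""
    return {i + 1 for i, (f, v) in enumerate(zip(founder, variant)) if f != v}


def mutually_exclusive_pairs(founder, variants):
    """Pairs of mutated positions that never occur together in one sequence."""
    per_seq = [mutated_positions(founder, v) for v in variants]
    seen = sorted(set().union(*per_seq))
    return [
        (p, q)
        for p, q in combinations(seen, 2)
        if not any(p in m and q in m for m in per_seq)
    ]


founder = "SFNCGGEFF"
# Hypothetical sample composition; only the two variant sequences are reported.
sample = ["SVNCGGEFF", "SFNCRGEFF", "SVNCGGEFF", "SFNCRGEFF"]
print(mutually_exclusive_pairs(founder, sample))  # [(2, 5)]
```

With only about 10 genomes per visit, the absence of a double mutant in such a sample is suggestive rather than conclusive, so a check of this kind is best read as a screening step.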

Escape versus reversion.

We assessed the direction of mutations by comparing the conservation level of the founder and mutant amino acids in a database of circulating HIV-1 sequences (41). We defined forward (likely to be escape) mutations as those that reflected a decrease in database frequency of at least 50% (Fig. 5, shown in orange) and reverse (likely to be reversion) mutations as those that reflected an increase of at least 50% (Fig. 5, shown in turquoise; see also Fig. 3 and 4 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/). Amino acids with less substantial changes in database frequency are highlighted in green. A predominance of forward mutations was observed in all individuals. When we counted the mutations that became fixed, the majority corresponded to forward mutations with a drastic switch to amino acids with lower database frequencies. The ratio of forward to reverse mutations was 37/2 for subject S7, 6/0 for T4, 53/7 for S1, 91/7 for S2, 55/10 for R3, and 4/1 for R4 (for R4, in whom two founder variants were identified, we classified mutations relative to the consensus of the respective founder variant).
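The classification rule can be sketched in a few lines of Python. Reading the 50% threshold as a difference in percentage points is an assumption, although it matches the 76% decrease reported for a change from 87% to 11%. The function name and the label "neutral" for less substantial changes are hypothetical; the example frequencies are those given in the text.

```python
def classify_direction(founder_freq, mutant_freq, threshold=50.0):
    """Classify a mutation from database frequencies given in percent.

    The threshold is applied to the difference in percentage points
    (an assumption about how the criterion is defined).
    """
    change = mutant_freq - founder_freq
    if change <= -threshold:
        return "forward"   # putative escape: shift to a rarer amino acid
    if change >= threshold:
        return "reverse"   # putative reversion: shift to a more common one
    return "neutral"       # less substantial change in database frequency


print(classify_direction(94, 3))    # Gag K-to-R in DCKTILKAL
print(classify_direction(1, 98))    # Env I-to-M in IYAPPIQGL
print(classify_direction(39, 45))   # Nef L80V
```

Lowering `threshold` to 30 reproduces the less strict criterion mentioned in the Discussion; the Nef L80V change (39% to 45%) remains unclassified under either setting.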

Reverse mutations were also rare when we analyzed only mutations located in targeted/predicted epitopes. Regarding S2, only one reverse mutation among 11 potential epitopes was suggested (Env; C08-QFEDKTIIF replaced by QFENKTIIF at day 98); the original residue D was found in 0.005% of sequences in the HIV database, while N was found in 96%. For S3, two putative reversions were found, both in Env, including the mutation of IYAPPIQGL to MYAPPIQGL, corresponding to a switch from residues found in 1% (I) to 98% (M) of database sequences. In contrast, several possible escape mutations were seen in Env, Nef, and Gag, including some complex patterns with, for example, four different amino acid mutations in the known Nef epitope VLMWKFDSHL (A02); all were found in less than 8% of circulating sequences. Six independent ELISPOT assay responses were detected in R3, all against invariable epitopes, except for one Env response in which the original DPNPQEIRL epitope was replaced by DPNPQEIGL from day 144.

DISCUSSION

In this study, we described the evolution of HIV-1 genome sequences encompassing the entire viral proteome in 11 individuals, including four transmission pairs, and with viral sequencing prior to and following the peak viremia of acute infection in four individuals.

Infection by a single variant that is typical of the distribution of variants in the transmitter.

We reaffirmed that, in MSM, HIV-1 infection is typically established by a single founder variant. A recent report underlined a higher proportion of infections with multiple variants in MSM (36%; 10 of 28) than in heterosexual transmissions (40). In contrast, we have observed multiple founder infections in about 20% of MSM transmissions we studied (1 of 9 in this study; 5 of 37 [22] and 16 of 65 [58] previously). It has been unclear, however, whether the presence of a single founder reflects a selective process and whether that process occurs at transmission or in the earliest stages of HIV-1 infection. A large fraction of transmissions are thought to occur during acute/early infection (12, 66, 72), when viral load is highest (27, 51). On this basis alone, one would expect to find single founder variants in recipients infected by transmitters in the early stages of infection (when viral populations are typically homogeneous). If the transmitting partner were in a later stage of infection with a diverse viral population, however, infections could be established by several variants. Moreover, the number of founder strains may reflect the network of HIV-1 transmissions; recent data showed that 25% of transmissions occurred in the first 6 months of infection in a cohort of MSM in the United Kingdom, as opposed to 1% for heterosexual transmissions (29). Importantly, while our cohort consisted of MSM, the restriction to a single founder virus was not due to a lack of variation in the transmitter viruses. We studied three transmission pairs in which the donor was chronically infected, and, despite extensive variation in the transmitters, only one recipient contained as many as two founder variants (and different genomes of these two were nearly homogeneous). Thus, establishment of infection by a single variant is likely not a result of lack of variation in the transmitter and must be related to a biological mechanism that restricts the establishment of multiple variants.

The founder variant in the recipient did not appear to be rare in the transmitter but was, rather, a representative variant from the complex viral population in the transmitter. This corroborates data showing that viruses isolated during acute or chronic infection were not distinguishable in terms of viral fitness (7, 36). Although the founder variants were typical of the population of sequences found in the transmitter, the founder in the recipient differed from variants predominating in the transmitter, and these differences warrant further analyses. This issue will need to be addressed with larger sampling and deep sequencing of the viral populations of the transmitter and recipient, the evaluation of viruses from semen in addition to plasma (as the possibility of compartmentalization has been shown [e.g., in references 3, 11, and 52], albeit not consistently [3, 13, 15]), and characterization of selective processes that could explain potential differences between transmitter and recipient viruses. Across transmission pairs, there were about 100 sites that varied over the genome. We considered the possibility that the distinguishing mutations between transmitter and recipient viruses represented reversions in the recipient of CTL escape mutations developed in the transmitter, yet we found little evidence of such a retrograde process over the time periods examined.

Stochastic evolution related to viremic expansion.

During the rapid viral population growth of primary infection, HIV-1 evolution appears to be stochastic before signs of adaptation emerge within 1 to 3 weeks after the onset of symptoms of acute HIV-1 infection. Our data conform to a model of exponential population growth with the development of substitutions in a star-like phylogenetic pattern (68), consistent with the proposal of Lee et al. (38) describing a single infection. We observed star-like tree topologies as a result of both the single-variant founder effect (short distances to the most recent common ancestor [MRCA]) and the multiplicity of variable sites rapidly developing in the genomes (data not shown); sequences from these early time points (before ∼50 days) showed no temporal clustering, in contrast to more protracted periods of evolution (65). By including longitudinal sequence data starting before peak plasma viremia, we showed that demographic effects were dominant during the rapid population expansion and contraction of the first 50 days after the onset of symptoms although positively selected sites were detected (in a sampling of an average of 10 viral genomes per time point) within 1 to 3 weeks postonset of symptoms. Whatever forces are responsible for the successful outgrowth of the founder strain, positive selection is minimal at this stage.

Diversity contraction.

Evidence of purifying selection was suggested by the slight contraction of genetic diversity in the first 20 to 40 days. We consider this contraction to be a qualitative observation consistent with (i) the results of the neutrality tests, (ii) the rapid population growth over the same time period, and (iii) the lack of positive selection during the same time period. Although our sample size of 10 sequences per time point limits our ability to comprehensively test the contraction of genetic diversity (with 10 sequences we have a 60% chance of missing a variant that would be found in 5% of sequences), we consider this dip to be an important observation that should be tested with larger sampling. While this dip in diversity can be explained by the concurrent rapid viral population growth, we also found that the average number of APOBEC-specific G-to-A mutations per genome dipped during this time frame (see Fig. 2 posted at http://mullinslab.microbiol.washington.edu/publications/herbeck_2011/). This suggests that sequences found at the earliest time points (Fiebig stage I) may have been weakly enriched with APOBEC-induced mutations and that the dip in diversity may be due in part to the elimination of these variants via purifying selection. A previous study also highlighted that APOBEC-induced mutations may be an important phenomenon in early HIV-1 infection (74). In contrast, the number of predicted CTL epitopes, while also decreasing, generally did so over a more protracted period (Fig. 3).
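The sampling limitation quoted above follows from a simple binomial argument, sketched here in Python under the assumption that sequences are sampled independently from a large viral population. The function names are illustrative, and the 5% miss probability used in the second call is an arbitrary example target.

```python
from math import ceil, log


def prob_missing(freq, n):
    """Probability that none of n sampled sequences carries a variant
    present at population frequency `freq` (independent sampling assumed)."""
    return (1.0 - freq) ** n


def min_sample_size(freq, miss_prob):
    """Smallest n for which the chance of missing the variant is <= miss_prob."""
    return ceil(log(miss_prob) / log(1.0 - freq))


print(round(prob_missing(0.05, 10), 3))  # 0.599, i.e., about 60%
print(min_sample_size(0.05, 0.05))       # 59
```

Under these assumptions, roughly 59 sequences per time point would be needed to have a 95% chance of detecting a variant present at 5%, which gives a sense of the scale of the larger sampling called for above.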

Positive selection and onset of CTL-driven immune escape.

Amino acids were identified as positively selected (in a sampling of an average of 10 viral genomes per time point) at about 4 weeks postonset of symptoms, with some of these mutations emerging as early as 1 week postonset of symptoms. The replacement of the founder variant was more rapid in the 3′ half of the genome, with more mutated amino acids and higher rates of mutation of these amino acids, echoing higher nucleotide evolutionary rates in env and nef than in gag and pol. The rate found in the C2V5 region of env is comparable to rates calculated from longitudinal sampling over 6 to 12 years in chronically infected individuals (65). When assessing selected sites observed over time, we focused on the relative importance of forward (putative CTL escape) and reversion mutations as CTL have been found to be the major selective force acting on the virus population early in infection (2, 21, 41). We found more examples of mutations to rare amino acids than to conserved amino acids, suggesting more escape than reversions. Conflicting reports about the relative prevalence of reversions in early HIV-1 infection have been published (16, 21, 26, 32, 33, 39, 45). The large number of reversions found in earlier studies may have been due to the experimental methods, i.e., using (i) consensus sequencing (39), (ii) a cross-sectional cohort (45), or (iii) HXB2 or another unrelated sequence as a reference to score reversions (rather than a baseline sequence) (16). In our study, we analyzed the database frequency of mutations relative to that of the consensus sequence at the first time point, using a perhaps stringent criterion for assigning the direction of mutations: a 50% decrease/increase in database frequency for forward/reverse mutations (the reference database consists of independent circulating HIV-1 subtype B sequences from the Los Alamos National Laboratory HIV database [HIVDB]). 
Nonetheless, a less strict threshold of 30% also revealed many more forward than reversion mutations. A 50% threshold allows us to partially avoid counting as forward/reversion mutations sites where variation appears well tolerated. For example, we did not consider as a reversion the L80V mutation in Nef where the initial amino acid L was found in 39% of circulating sequences while V was the consensus amino acid found in 45% of sequences. Based on these criteria, evidence for reversions was minimal during acute/early infection.

Mutually exclusive changes characterize a subset of epitopic escapes.

Initial forward mutations were often replaced by (or fluctuated in the population with) variants with one or more secondary mutations that were mutually exclusive. Given the substantial decreases in database frequency observed for the mutated amino acids, the first and, in some cases, every mutation may have been detrimental to HIV-1 viability (42, 71). These maturational escape patterns may be explained by the selective pressure acting on CTL epitopes driving virus escape via multiple pathways through single amino acid mutations. The simultaneous presence of two forward mutations may be too debilitating for variant survival (in almost all cases, the mutation corresponded to a sharp drop in database frequency), resulting in mutual exclusion. Each mutation may be detrimental to viral fitness although the second mutation is perhaps sufficiently less so to provide a selective advantage over the primary mutation. Alternatively, the secondary mutations may confer escape against responses elicited by the initial escape mutant although substantial population shifts reflecting these changes occurred in less than 2 weeks. The A-to-S mutation observed in the 5′ upstream region of the Gag B08-restricted potential epitope DCKTILKAL in subject S1 could also be a processing mutation, as previously reported in R1 (41). Detailed analysis of the functional consequences of these changes and CTL responses toward each of these variants will shed light on the selective forces driving these alternative mutations. We conjecture that this sequential, maturational pattern of linked mutually exclusive mutations might be more flagrant in acute infections as an unstable mutation might rapidly be removed by selection. 
Later in infection, other new mutations might serve as compensatory sites for previously deleterious mutations, and mutually exclusive patterns may be harder to identify due to the readily available set of potential compensatory mutations in a diverse viral population that could be obtained via recombination.

The results reported here should influence the design of vaccine immunogens. For example, understanding the forces (selective or stochastic) acting on the establishment of the founder strain(s) in HIV-1 infections can help in the design of vaccines that take into account evolutionary pathways shared among founder viruses. Recognition of the dynamic evolution of CTL epitopes will assist efforts to develop antigen cocktails that seek to block escape pathways. However, our studies illustrate the difficulty in blocking such a dynamic repertoire of antigenic determinants.

ACKNOWLEDGMENTS

Funding for this study was provided to J.I.M. by U.S. Public Health Service grants P01AI57005 and R37AI47734, to J.T.H. and J.I.M. by the University of Washington Center for AIDS Research (P30 AI27757), to J.T.H. by NIH T32 AI07140, and to M.R. by an amfAR Mathilde Krim Fellowship, 107005-43-RFNT.

Footnotes

Published ahead of print on 18 May 2011.

REFERENCES

  • 1. Abrahams M. R., et al. 2009. Quantitating the multiplicity of infection with human immunodeficiency virus type 1 subtype C reveals a non-Poisson distribution of transmitted variants. J. Virol. 83:3556–3567.
  • 2. Allen T. M., et al. 2005. Selective escape from CD8+ T-cell responses represents a major driving force of human immunodeficiency virus type 1 (HIV-1) sequence diversity and reveals constraints on HIV-1 evolution. J. Virol. 79:13239–13249.
  • 3. Anderson J. A., et al. 2010. HIV-1 populations in semen arise through multiple mechanisms. PLoS Pathog. 6:e1001053.
  • 4. Bar K. J., et al. 2010. Wide variation in the multiplicity of HIV-1 infection among injection drug users. J. Virol. 84:6241–6247.
  • 5. Bhattacharya T., et al. 2007. Founder effects in the assessment of HIV polymorphisms and HLA allele associations. Science 315:1583–1586.
  • 6. Borrow P., Lewicki H., Hahn B. H., Shaw G. M., Oldstone M. B. 1994. Virus-specific CD8+ cytotoxic T-lymphocyte activity associated with control of viremia in primary human immunodeficiency virus type 1 infection. J. Virol. 68:6103–6110.
  • 7. Brockman M. A., et al. 2010. Early selection in Gag by protective HLA alleles contributes to reduced HIV-1 replication capacity that may be largely compensated for in chronic infection. J. Virol. 84:11937–11949.
  • 8. Bunce M., Fanning G. C., Welsh K. I. 1995. Comprehensive, serologically equivalent DNA typing for HLA-B by PCR using sequence-specific primers (PCR-SSP). Tissue Antigens 45:81–90.
  • 9. Busch M. P., Satten G. A. 1997. Time course of viremia and antibody seroconversion following human immunodeficiency virus exposure. Am. J. Med. 102:117–126.
  • 10. Buus S., et al. 2003. Sensitive quantitative predictions of peptide-MHC binding by a “Query by Committee” artificial neural network approach. Tissue Antigens 62:378–384.
  • 11. Byrn R. A., Zhang D., Eyre R., McGowan K., Kiessling A. A. 1997. HIV-1 in semen: an isolated virus reservoir. Lancet 350:1141.
  • 12. Delwart E., et al. 2002. Homogeneous quasispecies in 16 out of 17 individuals during very early HIV-1 primary infection. AIDS 16:189–195.
  • 13. Delwart E. L., et al. 1998. Human immunodeficiency virus type 1 populations in blood and semen. J. Virol. 72:617–623.
  • 14. Delwart E. L., Sheppard H. W., Walker B. D., Goudsmit J., Mullins J. I. 1994. Human immunodeficiency virus type 1 evolution in vivo tracked by DNA heteroduplex mobility assays. J. Virol. 68:6672–6683.
  • 15. Diem K., et al. 2008. Male genital tract compartmentalization of human immunodeficiency virus type 1 (HIV). AIDS Res. Hum. Retroviruses 24:561–571.
  • 16. Duda A., et al. 2009. HLA-associated clinical progression correlates with epitope reversion rates in early human immunodeficiency virus infection. J. Virol. 83:1228–1239.
  • 17. Fiebig E. W., et al. 2003. Dynamics of HIV viremia and antibody seroconversion in plasma donors: implications for diagnosis and staging of primary HIV infection. AIDS 17:1871–1879.
  • 18. Fu Y. X., Li W. H. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
  • 19. Furuta Y., Bergstrom T., Norkrans G., Horal P. 1994. HIV type 1 V3 sequence diversity in contact-traced Swedish couples at the time of sexual transmission. AIDS Res. Hum. Retroviruses 10:1187–1191.
  • 20. Gojobori T., Yamaguchi Y., Ikeo K., Mizokami M. 1994. Evolution of pathogenic viruses with special reference to the rates of synonymous and nonsynonymous substitutions. Jpn. J. Genet. 69:481–488.
  • 21. Goonetilleke N., et al. 2009. The first T cell response to transmitted/founder virus contributes to the control of acute viremia in HIV-1 infection. J. Exp. Med. 206:1253–1272.
  • 22. Gottlieb G. S., et al. 2008. HIV-1 variation before seroconversion in men who have sex with men: analysis of acute/early HIV infection in the multicenter AIDS cohort study. J. Infect. Dis. 197:1011–1015.
  • 23. Guindon S., Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704.
  • 24. Haaland R. E., et al. 2009. Inflammatory genital infections mitigate a severe genetic bottleneck in heterosexual transmission of subtype A and C HIV-1. PLoS Pathog. 5:e1000274.
  • 25. Heckerman D., Kadie C., Listgarten J. 2006. Leveraging information across HLA alleles/supertypes improves epitope prediction, p. 296–308 In Proceedings of the Tenth Annual International Conference on Research in Computational Molecular Biology (RECOMB), Venice, Italy.
  • 26. Herbeck J. T., et al. 2006. Human immunodeficiency virus type 1 env evolves toward ancestral states upon transmission to a new host. J. Virol. 80:1637–1644.
  • 27. Hollingsworth T. D., Anderson R. M., Fraser C. 2008. HIV-1 transmission, by stage of infection. J. Infect. Dis. 198:687–693.
  • 28. Hudson R. R., Kreitman M., Aguade M. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–159.
  • 29. Hughes G. J., et al. 2009. Molecular phylodynamics of the heterosexual HIV epidemic in the United Kingdom. PLoS Pathog. 5:e1000590.
  • 30. Jensen M. A., et al. 2003. Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of human immunodeficiency virus type 1 Env V3 loop sequences. J. Virol. 77:13376–13388.
  • 31. Jordan M. R., et al. 2010. Comparison of standard PCR/cloning to single genome sequencing for analysis of HIV-1 populations. J. Virol. Methods 168:114–120.
  • 32. Kawashima Y., et al. 2009. Adaptation of HIV-1 to human leukocyte antigen class I. Nature 458:641–645.
  • 33. Kearney M., et al. 2009. Human immunodeficiency virus type 1 population genetics and adaptation in newly infected individuals. J. Virol. 83:2715–2727.
  • 34. Keele B. F., et al. 2008. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc. Natl. Acad. Sci. U. S. A. 105:7552–7557.
  • 35. Koup R. A., et al. 1994. Temporal association of cellular immune responses with the initial control of viremia in primary human immunodeficiency virus type 1 syndrome. J. Virol. 68:4650–4655.
  • 36. Lassen K. G., et al. 2009. Elite suppressor-derived HIV-1 envelope glycoproteins exhibit reduced entry efficiency and kinetics. PLoS Pathog. 5:e1000377.
  • 37. Learn G. H., et al. 2002. Virus population homogenization following acute human immunodeficiency virus type 1 infection. J. Virol. 76:11953–11959.
  • 38. Lee H. Y., et al. 2009. Modeling sequence evolution in acute HIV-1 infection. J. Theor. Biol. 261:341–360.
  • 39. Li B., et al. 2007. Rapid reversion of sequence polymorphisms dominates early human immunodeficiency virus type 1 evolution. J. Virol. 81:193–201.
  • 40. Li H., et al. 2010. High multiplicity infection by HIV-1 in men who have sex with men. PLoS Pathog. 6:e1000890.
  • 41. Liu Y., et al. 2006. Selection on the human immunodeficiency virus type 1 proteome following primary infection. J. Virol. 80:9519–9529 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Liu Y., et al. 2007. Evolution of human immunodeficiency virus type 1 cytotoxic T-lymphocyte epitopes: fitness-balanced escape. J. Virol. 81:12179–12188 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Lyles R. H., et al. 2000. Natural history of human immunodeficiency virus type 1 viremia after seroconversion and proximal to AIDS in a large cohort of homosexual men. Multicenter AIDS Cohort Study. J. Infect. Dis. 181:872–880 [DOI] [PubMed] [Google Scholar]
  • 44. Maddison W. P., Maddison D. R. 2001. MacClade: analysis of phylogeny and character evolution, version 4. Sinauer Associates, Inc., Sunderland, MA [Google Scholar]
  • 45. Matthews P. C., et al. 2008. Central role of reverting mutations in HLA associations with human immunodeficiency virus set point. J. Virol. 82:8548–8559 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Mellors J. W., et al. 1995. Quantitation of HIV-1 RNA in plasma predicts outcome after seroconversion. Ann. Intern. Med. 122:573–579 [DOI] [PubMed] [Google Scholar]
  • 47. Mellors J. W., et al. 2007. Prognostic value of HIV-1 RNA, CD4 cell count, and CD4 Cell count slope for progression to AIDS and death in untreated HIV-1 infection. JAMA 297:2349–2350 [DOI] [PubMed] [Google Scholar]
  • 48. Mellors J. W., et al. 1996. Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science 272:1167–1170 (Erratum, 275:14, 1997.)
  • 49. Nielsen M., et al. 2003. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci. 12:1007–1017
  • 50. Nowak M. A., et al. 1995. Antigenic oscillations and shifting immunodominance in HIV-1 infections. Nature 375:606–611
  • 51. Pilcher C. D., et al. 2004. Brief but efficient: acute HIV infection and the sexual transmission of HIV. J. Infect. Dis. 189:1785–1792
  • 52. Pillai S. K., et al. 2005. Semen-specific genetic characteristics of human immunodeficiency virus type 1 env. J. Virol. 79:1734–1742
  • 53. Pond S. L., Frost S. D., Muse S. V. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679
  • 54. Poss M., et al. 1998. Evolution of envelope sequences from the genital tract and peripheral blood of women infected with clade A human immunodeficiency virus type 1. J. Virol. 72:8240–8251
  • 55. Ritola K., et al. 2004. Multiple V1/V2 env variants are frequently present during primary infection with human immunodeficiency virus type 1. J. Virol. 78:11208–11218
  • 56. Rodrigo A. G., Mullins J. I. 1996. Human immunodeficiency virus type 1 molecular evolution and the measure of selection. AIDS Res. Hum. Retroviruses 12:1681–1685
  • 57. Rolland M., Nickle D. C., Mullins J. I. 2007. HIV-1 group M conserved elements vaccine. PLoS Pathog. 3:e157.
  • 58. Rolland M., et al. 2011. Genetic impact of vaccination on breakthrough HIV-1 sequences from the STEP trial. Nat. Med. 17:366–371
  • 59. Rousseau C., et al. 2006. Large-scale amplification, cloning and sequencing of near full-length HIV-1 subtype C genomes. J. Virol. Methods 136:118–125
  • 60. Rozas J., Rozas R. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174–175
  • 61. Salazar-Gonzalez J. F., et al. 2008. Deciphering human immunodeficiency virus type 1 transmission and early envelope diversification by single-genome amplification and sequencing. J. Virol. 82:3952–3970
  • 62. Salazar-Gonzalez J. F., et al. 2009. Genetic identity, biological phenotype, and evolutionary pathways of transmitted/founder viruses in acute and early HIV-1 infection. J. Exp. Med. 206:1273–1289
  • 63. Scarlatti G., et al. 1997. In vivo evolution of HIV-1 co-receptor usage and sensitivity to chemokine-mediated suppression. Nat. Med. 3:1259–1265
  • 64. Schacker T. W., Hughes J. P., Shea T., Coombs R. W., Corey L. 1998. Biological and virologic characteristics of primary HIV infection. Ann. Intern. Med. 128:613–620
  • 65. Shankarappa R., et al. 1999. Consistent viral evolutionary changes associated with the progression of human immunodeficiency virus type 1 infection. J. Virol. 73:10489–10502
  • 66. Sheppard H. W., Ascher M. S. 1992. The relationship between AIDS and immunologic tolerance. J. Acquir. Immune Defic. Syndr. 5:143–147
  • 67. Shriner D., Rodrigo A. G., Nickle D. C., Mullins J. I. 2004. Pervasive genomic recombination of HIV-1 in vivo. Genetics 167:1573–1583
  • 68. Slatkin M., Hudson R. R. 1991. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129:555–562
  • 69. Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595
  • 70. Thompson J. D., Higgins D. G., Gibson T. J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680
  • 71. Troyer R. M., et al. 2009. Variable fitness impact of HIV-1 escape mutations to cytotoxic T lymphocyte (CTL) response. PLoS Pathog. 5:e1000365.
  • 72. Wawer M. J., et al. 2005. Rates of HIV-1 transmission per coital act, by stage of HIV-1 infection, in Rakai, Uganda. J. Infect. Dis. 191:1403–1409
  • 73. Wolfs T. F., Zwart G., Bakker M., Goudsmit J. 1992. HIV-1 genomic RNA diversification following sexual and parenteral virus transmission. Virology 189:103–110
  • 74. Wood N., et al. 2009. HIV evolution in early infection: selection pressures, patterns of insertion and deletion, and the impact of APOBEC. PLoS Pathog. 5:e1000414.
  • 75. Zhang L. Q., et al. 1993. Selection for specific sequences in the external envelope protein of human immunodeficiency virus type 1 upon primary infection. J. Virol. 67:3345–3356
  • 76. Zhang M., et al. 2004. Tracking global patterns of N-linked glycosylation site variation in highly variable viral glycoproteins: HIV, SIV, and HCV envelopes and influenza hemagglutinin. Glycobiology 14:1229–1246
  • 77. Zhu T., et al. 1993. Genotypic and phenotypic characterization of HIV-1 in patients with primary infection. Science 261:1179–1181

Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
