Journal of Virology. 2012 Dec;86(23):12582–12590. doi: 10.1128/JVI.01440-12

Constraints on Viral Evolution during Chronic Hepatitis C Virus Infection Arising from a Common-Source Exposure

Justin R Bailey a, Sarah Laskey a, Lisa N Wasilewski a, Supriya Munshaw a, Liam J Fanning c, Elizabeth Kenny-Walsh c, Stuart C Ray a,b
PMCID: PMC3497661  PMID: 22973048

Abstract

Extraordinary viral sequence diversity and rapid viral genetic evolution are hallmarks of hepatitis C virus (HCV) infection. Viral sequence evolution has previously been shown to mediate escape from cytotoxic T-lymphocyte (CTL) and neutralizing antibody responses in acute HCV infection. HCV evolution continues during chronic infection, but the pressures driving these changes are poorly defined. We analyzed plasma virus sequence evolution in 5.2-kb hemigenomes from multiple longitudinal time points isolated from individuals in the Irish anti-D cohort, who were infected with HCV from a common source in 1977 to 1978. We found phylogenetically distinct quasispecies populations at different plasma time points isolated late in chronic infection, suggesting ongoing viral evolution and quasispecies replacement over time. We saw evidence of early pressure driving net evolution away from a computationally reconstructed common ancestor, known as Bole1b, in predicted CTL epitopes and E1E2, with balanced evolution toward and away from the Bole1b amino acid sequence in the remainder of the genome. Late in chronic infection, the rate of evolution toward the Bole1b sequence increased, resulting in net neutral evolution relative to Bole1b across the entire 5.2-kb hemigenome. Surprisingly, even late in chronic infection, net amino acid evolution away from the infecting inoculum sequence still could be observed. These data suggest that, late in chronic infection, ongoing HCV evolution is not random genetic drift but rather the product of strong pressure toward a common ancestor and concurrent net ongoing evolution away from the inoculum virus sequence, likely balancing replicative fitness and ongoing immune escape.

INTRODUCTION

An estimated 170 million individuals are infected with hepatitis C virus (HCV) worldwide (2, 47), and approximately two-thirds of infected individuals subsequently develop chronic infection, which persists throughout life without antiviral treatment. Chronic HCV infection remains the leading cause of hepatocellular carcinoma and liver transplantation in the United States (30, 38, 43). Despite significant recent improvements in HCV treatment (16), a vaccine to prevent HCV infection is still desperately needed.

HCV replicates to high viral loads using an error-prone polymerase, generating in each host a group of related but genetically distinct viral variants called quasispecies (4, 31). This extensive genetic diversity and high rate of viral genetic evolution are major challenges for vaccine design. It has been demonstrated in simian immunodeficiency virus (SIV) and human immunodeficiency virus (HIV) infection that selection from among a distribution of quasispecies variants allows viral escape from immune pressure, but that this escape is balanced by fitness constraints that drive reversion to restore replicative fitness (15, 27). Similarly, HCV has been shown to escape immune pressure from cytotoxic T lymphocytes (CTL) and neutralizing antibody in acute infection, and studies have shown evidence of both escape and reversion, suggesting that intrinsic viral replicative fitness also constrains HCV escape from immune selection in vivo (8, 11, 19, 23, 37, 39, 42, 46).

Most studies of HCV evolution and escape have been performed early in infection, as the extensive quasispecies diversity during chronic infection complicates analysis of evolution (4, 7, 9, 12, 36, 44). It has therefore remained unclear what forces drive ongoing HCV evolution during decades of chronic infection, particularly given evidence that the breadth and magnitude of cellular immune responses wane during this time (7, 17, 24, 41) but neutralizing antibody responses are maintained (11, 29, 35, 45). To analyze HCV evolution during chronic infection, we used longitudinal plasma samples from the Irish anti-D cohort, a group of women inadvertently infected with genotype 1b HCV from a single acutely infected source through treatment with contaminated anti-D immune globulin between May 1977 and November 1978 (Fig. 1) (20). In addition, we used the Bole1b sequence, a phylogenetically reconstructed common ancestor of all known genotype 1b HCV sequences (S. Munshaw and S. C. Ray, unpublished data), as a reference point in the analysis of genetic evolution in the anti-D cohort (Fig. 2A). Bole1b is analogous to the recently described genotype 1a sequence, Bole1a (5, 33). Evolution toward Bole1b is comparable to centripetal change toward a worldwide genotype 1b consensus sequence (1, 6, 23, 27, 42). However, because immune-selected changes tend to be enriched in branch tips on phylogenetic trees (3), evolution toward Bole1b may more accurately reflect pressure toward optimal replicative fitness (5).

Fig 1.

Timeline of anti-D cohort plasma sample isolation. Hemigenomes (5.2 kb) were amplified from inoculum virus (time A) and plasma virus from 10 infected subjects from two time points (time B and time C) during chronic infection.

Fig 2.

Viruses isolated at longitudinal time points are phylogenetically distinct. Phylogenetic trees of anti-D HCV sequences spanning 1,651 amino acids from the Core through NS3 proteins. (A) Neighbor-joining amino acid tree with anti-D inoculum (purple circles), chronic anti-D sequences, unrelated genotype 1b sequences, and Bole1b (green circle). (B) Neighbor-joining amino acid trees of anti-D sequences. Center tree contains Bole1b (green circle) and time A (inoculum) sequences (purple circles). Time B and time C sequences are shown for each study subject, with each color indicating sequences from a different subject. Dots at proximal nodes indicate bootstrap values of >94. For outer trees, green circles indicate Bole1b, purple circles indicate time A sequences, blue circles indicate sequences amplified from time B plasma, and red circles indicate sequences amplified from time C plasma. Asterisks indicate subjects with statistically significant phylogenetic separation between time B and time C sequences.

Using a high-fidelity protocol, we amplified, cloned, and sequenced 204 independent 5.2-kb hemigenomes spanning regions encoding Core through NS3 proteins from the inoculum virus and plasma of 10 anti-D cohort subjects at two chronic infection time points separated by 3 to 7 years. We analyzed the phylogenetic relationship between clones from these three time points and, using novel computational methods, mapped and characterized quasispecies amino acid changes from the inoculum sample (time A) to the first chronic time point (time B) as well as from the first chronic infection time point to the second (time C). These amino acid changes were characterized as to whether they represented changes away from, toward, or tangential to the Bole1b amino acid sequence. We also characterized time B to time C amino acid changes as away from, toward, or tangential to the inoculum (time A) virus amino acid sequence. Finally, amino acid changes were mapped relative to HLA-matched or unmatched class I epitopes.

MATERIALS AND METHODS

Study subjects.

Ten women from the anti-D cohort were studied because time B and time C specimens were available, they provided consent, their HLA class I genotyping was complete, and 5.2-kb hemigenomes had previously been amplified from their time B plasma. Informed consent was obtained from the subjects studied, and the research protocol was approved by the Cork University Hospital Ethics Committee.

Reverse transcription-PCR (RT-PCR) amplification and sequencing.

Hemigenomes (5.2 kb) from time A and time B were previously amplified, cloned, and stored as glycerol stocks. Four previously sequenced 5.2-kb time A clones and 2 previously sequenced time B clones per subject were used in this analysis (39). Previous studies demonstrated that the inoculum (time A) virus was homogeneous and time B virus was heterogeneous. To address the heterogeneity of time B virus, 8 additional time B clones for each subject were amplified from glycerol stocks and sequenced. Hemigenomes of 5.2 kb spanning Core-NS3 genes were also amplified from time C plasma according to previously described methods (39), except that PCR was performed using Accuprime Pfx (Invitrogen). The majority of samples could be amplified with outer PCR only. If necessary, a nested PCR was performed. Test PCRs with 1:40 diluted cDNA were invariably positive, confirming low probability of template resampling in PCRs performed with undiluted cDNA. Amplicons of 5.2 kb were Topo cloned (Invitrogen), and 10 clones from each time point in each individual were sequenced. Sequences were aligned using Clustal X, and alignments were manually adjusted in Bioedit. Probable PCR errors (single-nucleotide changes present in only a single clone in the entire 204 clone alignment) were removed using CleanCollapse (v1.6; http://sray.med.som.jhmi.edu/SCRoftware/CleanCollapse). Single-nucleotide insertions or deletions in homopolymeric tracts were likewise considered to be PCR or sequencing artifacts and removed prior to analysis.
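The singleton filter described above (single-nucleotide changes present in only one clone of the alignment are treated as probable PCR errors) can be sketched as follows. This is a minimal illustration and not the CleanCollapse program itself: the function name `revert_singletons`, the list-of-strings input format, and the choice to replace a singleton base with the most common base in its column are all assumptions made for the example.

```python
from collections import Counter


def revert_singletons(alignment):
    """Revert bases that occur in exactly one clone at an alignment column.

    `alignment` is a list of equal-length nucleotide strings (hypothetical
    input format). A singleton base is replaced by the column's most common
    base, but only when that most common base is shared by at least two clones.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    cleaned = [list(seq) for seq in alignment]
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        majority, majority_count = counts.most_common(1)[0]
        if majority_count < 2:
            continue  # no shared base in this column, so nothing to revert to
        for row in cleaned:
            if counts[row[col]] == 1:
                row[col] = majority
    return ["".join(row) for row in cleaned]


clones = ["ACGT", "ACGT", "ACGA", "TCGT"]
print(revert_singletons(clones))
```

Changes shared by two or more clones are left untouched, which is the property that lets real quasispecies variation survive the filter.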

Characterization of nucleotide and amino acid changes.

The Bole1b sequence was generated using previously published methods (33). Neighbor-joining amino acid trees were constructed with bootstrapping using the Jones-Taylor-Thornton (JTT) model in Mega version 5. UniFrac analysis was performed using FastUnifrac (http://bmf2.colorado.edu/fastunifrac/) with 1,000 Monte Carlo iterations (18). Sliding window analyses of nonsynonymous and synonymous changes were done with VarPlot (v1.7; http://sray.med.som.jhmi.edu/SCRoftware/VarPlot), with dN and dS calculated by the Nei-Gojobori method (34). Pairwise synonymous and nonsynonymous distances between clones were calculated by the Nei-Gojobori method in Mega version 5. HVR1 was excluded from these calculations. Counting and characterization of amino acid changes from time A to time B and time B to time C relative to the Bole1b sequence or the time A virus sequence were performed using Python code written by S. Laskey (available on request). Amino acid changes between time points at each position in the 1,651-amino-acid sequence were categorized as away from, toward, or tangential to a reference sequence (Bole1b or time A sequence). The criteria are stated here for the time A to time B comparison: a change is classified as “away” when the time A amino acid is the same as the reference sequence amino acid and the time B amino acid at the same position is different from the reference sequence amino acid; a change is classified as “toward” when the time A amino acid is different from the reference sequence amino acid and the time B amino acid at the same position is the same as the reference sequence amino acid; and a change is classified as “tangential” when the time A and time B amino acids differ from each other and both are different from the reference sequence amino acid.
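The away/toward/tangential criteria can be expressed compactly in code. The sketch below is an illustration written for this text, not the authors' Python code (which is available from them on request); the function names and the use of plain aligned strings are assumptions.

```python
def classify_change(earlier, later, reference):
    """Classify one aligned position relative to a reference residue.

    Returns "none" when the residue did not change between time points.
    """
    if earlier == later:
        return "none"
    if earlier == reference:
        return "away"        # started at the reference residue, left it
    if later == reference:
        return "toward"      # started elsewhere, arrived at the reference
    return "tangential"      # differs from the reference before and after


def count_changes(earlier_seq, later_seq, reference_seq):
    """Tally change types across three aligned, equal-length sequences."""
    if not (len(earlier_seq) == len(later_seq) == len(reference_seq)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"away": 0, "toward": 0, "tangential": 0}
    for earlier, later, reference in zip(earlier_seq, later_seq, reference_seq):
        label = classify_change(earlier, later, reference)
        if label != "none":
            counts[label] += 1
    return counts


# Toy example with hypothetical 4-residue sequences.
print(count_changes("ACDE", "GCKF", "ACDF"))
```

Because the reference is passed as an argument, the same function serves for comparisons against Bole1b or against the time A sequence.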

CTL epitope analysis.

A list of published HCV T cell epitopes was obtained from the Immune Epitope Database (www.immuneepitope.org). All class I-restricted epitopes of known HLA restriction that were less than or equal to 12 amino acids in length were aligned to the Bole1b amino acid sequence using PepMap at the Los Alamos Sequence Database (www.hiv.lanl.gov), and epitopes with less than 50% homology to Bole1b were discarded. A final list of 135 unique epitopes was used for further analysis. Evolution in each subject was analyzed relative to class I epitopes restricted by that individual's HLA type (HLA matched) as well as class I epitopes restricted by HLA types of any other study subject (HLA unmatched). Amino acid changes occurring in HLA-matched or HLA-unmatched class I epitopes were counted for all sequence comparisons from time A to time B or time B to time C. These values were then divided by the total number of amino acids in HLA-matched or HLA-unmatched epitopes for all of the clones analyzed to give a proportion of changes/amino acids. The numbers of changes were also summed across all comparisons for all subjects and then divided by the total number of amino acids examined to generate a proportion of all amino acids changing in HLA-matched or HLA-unmatched epitopes.
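As a rough illustration of the proportion described above, the following sketch counts changed residues that fall inside a set of epitope positions and divides by the number of epitope residues examined across all comparisons. The interval representation of epitopes, the function names, and the decision to count a position covered by overlapping epitopes only once are assumptions, not details taken from the study.

```python
def epitope_positions(epitopes):
    """Collect positions covered by epitopes.

    `epitopes` is an iterable of (start, end) pairs, 0-based and half-open
    (hypothetical format). Overlapping epitopes contribute each position once.
    """
    positions = set()
    for start, end in epitopes:
        positions.update(range(start, end))
    return positions


def proportion_changed(sequence_pairs, positions):
    """Proportion of examined epitope residues that differ between time points.

    `sequence_pairs` is a list of (earlier_seq, later_seq) aligned strings.
    """
    changed = 0
    examined = 0
    for earlier, later in sequence_pairs:
        for pos in positions:
            examined += 1
            if earlier[pos] != later[pos]:
                changed += 1
    return changed / examined if examined else 0.0


matched = epitope_positions([(0, 3), (2, 5)])
pairs = [("ABCDEFG", "ABXDEFG"), ("ABCDEFG", "XBCDYFZ")]
print(proportion_changed(pairs, matched))
```

Running the same function with the HLA-unmatched epitope positions gives the second proportion needed for the matched versus unmatched comparison.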

Statistical analysis.

UniFrac significance was calculated using 1,000 Monte Carlo iterations with Bonferroni correction for multiple comparisons. Significance of differences between rates of evolution away from, toward, and tangential to reference sequences was calculated using paired t tests in Excel. Significance of differences in pairwise synonymous and nonsynonymous distances was calculated using t tests with Bonferroni correction for multiple comparisons. Significance of differences in proportions of amino acid changes in HLA-matched or HLA-unmatched epitopes was calculated by comparison of proportions (z test) in SigmaPlot with Bonferroni correction for multiple comparisons.
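For readers who want to reproduce the comparison of proportions without SigmaPlot, a two-sided, pooled two-proportion z test with Bonferroni correction can be sketched with the Python standard library. Whether SigmaPlot applies exactly this pooled form (or adds a continuity correction) is not stated in the text, so treat the variant below as an assumption; the counts in the example are hypothetical.

```python
import math


def two_proportion_z_test(changes_1, sites_1, changes_2, sites_2):
    """Two-sided pooled z test for a difference between two proportions.

    Returns (z, p_value).
    """
    p1 = changes_1 / sites_1
    p2 = changes_2 / sites_2
    pooled = (changes_1 + changes_2) / (sites_1 + sites_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sites_1 + 1 / sites_2))
    if se == 0:
        return 0.0, 1.0  # both proportions are 0 or 1, so no difference is testable
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: 80 of 1,000 matched-epitope residues changed,
# versus 40 of 1,000 unmatched-epitope residues.
z, p = two_proportion_z_test(80, 1000, 40, 1000)
print(z, bonferroni(p, n_tests=4))
```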

Nucleotide sequence accession numbers.

The GenBank accession numbers for the sequences used in the study are JX649674 to JX649854.

RESULTS

Viruses isolated at longitudinal time points are phylogenetically distinct.

We constructed neighbor-joining trees with 204 clonal, 1,651-amino-acid sequences from inoculum (time A) and each subject (time B and time C) (Fig. 2B). The majority of subjects showed clear phylogenetic separation between time A, time B, and time C amino acid sequences, suggesting ongoing replication and quasispecies replacement over time. For subjects AD01, AD05, AD07, and AD11, the phylogenetic separation between time B and time C sequences was statistically significant by UniFrac analysis (P < 0.001) (18). For the remaining subjects, separation between time B and time C sequences was also statistically significant when UniFrac analysis was repeated with a larger number of E1E2-only amino acid sequences (data not shown). Bootstrap analysis confirmed that time B and C sequences from each individual were more related to each other than to sequences from any other study subject (bootstrap, >94), except for subject AD05, whose time B and C sequences were not clearly related by bootstrap analysis. Given that reinfection or cross-contamination could not be ruled out for this subject, AD05 sequences were not used for further analyses.

Purifying selection dominates evolution late in chronic infection.

To better characterize the genetic evolution occurring at a nucleotide level among time A, time B, and time C, we quantitated the nonsynonymous and synonymous nucleotide changes occurring from time A to time B clones and from time A to time C clones in sliding windows across the 5.2-kb hemigenome (Fig. 3). As has been previously described from analysis of a smaller number of sequences from this cohort (39), from time A to time B, the number of synonymous changes exceeded the number of nonsynonymous changes for all regions except HVR1. The median pairwise synonymous distance from time A to time C clones was greater than the distance from time A to time B clones (0.071 versus 0.057; P < 0.001), which was expected given continuous viral replication with an error-prone polymerase. In contrast, the median pairwise nonsynonymous distance from time A to time C clones was very similar to the distance from time A to time B clones (0.011 versus 0.010; P was not significant), suggesting relatively little net nonsynonymous evolution away from the time A virus sequence over the time B to time C period. These results suggest that purifying selection dominates nonsynonymous nucleotide change during late chronic infection, likely due to viral fitness constraints.

Fig 3.

Purifying selection dominates evolution late in chronic infection. Sliding window analysis of nonsynonymous and synonymous nucleotide changes from time A to time B and time A to time C. An average nonsynonymous and synonymous distance for all pairwise comparisons was calculated with a 20-nucleotide sliding window and 1-nucleotide steps. Average synonymous change from time A to all time B sequences is indicated by a light blue line and synonymous change from time A to time C by a light red line. Nonsynonymous change from time A to time B is indicated by a dark blue line and nonsynonymous change from time A to time C by a dark red line. Borders of each gene and HVR1 (dashed lines) are indicated. Synonymous change exceeds nonsynonymous change in all regions except HVR1.

Evolution toward Bole1b accelerates later in chronic infection.

We next analyzed changes from time A to time B and time B to time C at an amino acid level. Pairwise comparisons were performed between each clonal amino acid sequence at each time point (4 time A, 10 time B, and 10 time C sequences for each study subject). We observed a total of 38,462 amino acid changes for 1,260 comparisons of 204 independent clonal 1,651-amino-acid sequences: 17,294 total amino acid changes from time A to time B sequences and 21,168 total amino acid changes from time B to time C sequences. The number of observed amino acid changes was divided by the total number of clonal sequence comparisons, the amino acid length of the region in question, and the number of years between time points to give an average rate of change at each site per pairwise comparison. Since E1E2 could be expected to evolve differently from non-E1E2 genes due to pressure from both cellular and humoral immunity (7, 8, 13, 19, 23, 28, 45), and because past studies have found different evolutionary patterns in HVR1, E1E2 excluding HVR1, and the non-E1E2 genes (Core, P7, NS2, and NS3 genes), we analyzed these regions separately in subsequent analyses. Here, we refer to these regions as HVR1, E1E2, and non-E1E2, respectively.
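The normalization described above, and the comparison count it relies on, can be checked with a few lines of Python. The function names are hypothetical. The figure of 1,260 comparisons is consistent with nine subjects (the ten studied minus AD05, which was excluded earlier) each contributing 4 × 10 time A to time B comparisons and 10 × 10 time B to time C comparisons; that reading is an inference from the numbers in the text rather than something the text states directly.

```python
def total_comparisons(n_subjects, n_time_a, n_time_b, n_time_c):
    """Pairwise clone comparisons: time A vs. B plus time B vs. C, per subject."""
    per_subject = n_time_a * n_time_b + n_time_b * n_time_c
    return n_subjects * per_subject


def rate_per_site_per_year(n_changes, n_comparisons, n_sites, years):
    """Average amino acid changes per site per year per pairwise comparison."""
    if n_comparisons <= 0 or n_sites <= 0 or years <= 0:
        raise ValueError("comparisons, sites, and years must all be positive")
    return n_changes / (n_comparisons * n_sites * years)


print(total_comparisons(n_subjects=9, n_time_a=4, n_time_b=10, n_time_c=10))  # 1260
print(17294 + 21168)  # 38462, matching the reported total

# Hypothetical single-subject example: 250 "toward" changes in the 1,096
# non-E1E2 sites over 100 time B vs. time C comparisons spanning 5 years.
print(rate_per_site_per_year(250, 100, 1096, 5))
```

Dividing by years as well as by sites and comparisons is what allows the roughly two-decade time A to time B interval to be compared with the much shorter time B to time C interval.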

Changes in HVR1, E1E2, and non-E1E2 were characterized as away from, toward, or tangential to the Bole1b amino acid sequence, a computationally reconstructed ancestor representing genotype 1b HCV sequences. As shown in Fig. 4, from time A to time B, both non-E1E2 and E1E2 exhibited significantly higher rates of amino acid evolution away from the Bole1b sequence than toward the Bole1b sequence. From time B to time C, the rate of amino acid change away from Bole1b in both non-E1E2 and E1E2 remained constant relative to the time A to time B period. However, the rate of evolution toward Bole1b increased significantly for both regions of the genome, resulting in net neutral evolution relative to the Bole1b sequence during the time B to time C period. Surprisingly, in both non-E1E2 and E1E2, for time A to time B as well as time B to time C, the rate of amino acid change toward Bole1b significantly exceeded the rate of change tangential to the Bole1b sequence (Fig. 4). This was unexpected, since random probability of tangential change in the absence of selection (18 possible amino acids) would be much higher than probability of change toward Bole1b (1 possible amino acid). This suggests nonrandom selection across the genome favoring amino acids present in the Bole1b sequence.
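The random expectation invoked above can be made explicit. As a simplifying assumption (it ignores codon structure and unequal substitution rates), suppose a residue that already differs from Bole1b is replaced by any of the other 19 amino acids with equal probability. Exactly one replacement restores the Bole1b residue and the remaining 18 are tangential:

```latex
P(\text{toward}) = \frac{1}{19} \approx 0.053,
\qquad
P(\text{tangential}) = \frac{18}{19} \approx 0.947
```

Under this null model tangential changes would outnumber changes toward Bole1b by 18 to 1, so an observed excess of changes toward Bole1b points to selection rather than chance.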

Fig 4.

Evolution toward Bole1b accelerates later in chronic infection in all regions except HVR1. Rate of amino acid change from time A (inoculum) to time B and time B to time C relative to Bole1b. Total amino acid changes were counted for all pairwise comparisons between 4 time A sequences and 10 time B sequences for each subject and between 10 time B and 10 time C sequences for each subject. Each change was characterized as either away from, toward, or tangential to the Bole1b amino acid sequence. These values were then divided by the number of comparisons, the number of amino acids in the region in question, and the number of years between time A and time B or time B and time C for each subject. Each symbol indicates the rate for a single subject. Horizontal lines indicate medians. (A) Rate of amino acid change in non-E1E2 (Core, P7, NS2, and NS3 proteins; 1,096 sites). (B) Rate of amino acid change in E1E2 without HVR1 (529 sites). (C) Rate of amino acid change in HVR1 (26 sites).

T cell epitope evolution occurs early.

We mapped changes for each subject from time A to time B and time B to time C in 15-amino-acid sliding windows across the hemigenome. Most amino acid changes occurred in E2, with variation between subjects in sites of evolution away from, toward, and tangential to Bole1b (see Fig. S1 in the supplemental material). To better understand the sites of common amino acid evolution, we calculated the proportion of all amino acids changing as well as the proportion changing away from, toward, and tangentially to the Bole1b sequence in HLA-matched and HLA-unmatched class I epitopes for each subject and for all subjects combined (Fig. 5; also see Table S2 in the supplemental material). In non-E1E2, from time A to time B, a significantly higher proportion of amino acids changed within HLA-matched epitopes relative to HLA-unmatched epitopes (Fig. 5A). A significantly higher proportion of amino acids also changed away from and tangentially to Bole1b within HLA-matched epitopes relative to unmatched epitopes. Taken together, these findings suggest selective pressure driving evolution away from Bole1b at HLA-matched class I epitopes, which was likely due to CD8+ T cell pressure. This agrees with previous studies of HCV evolution early in infection (8, 39). In the same genes from time B to time C, this difference in evolution in matched and unmatched epitopes was no longer present. From time B to time C, the proportion of all amino acids changing, the proportion changing away from Bole1b, the proportion changing toward Bole1b, and the proportion changing tangentially to Bole1b were all equivalent in HLA-matched and HLA-unmatched epitopes (Fig. 5B). This suggests that CD8+ T cells exert less selective pressure on the virus late in chronic infection.

Fig 5.

T cell epitope evolution occurs early. The location of amino acid changes relative to HLA-matched and HLA-unmatched class I epitopes. Total amino acid changes, changes away from the Bole1b amino acid sequence, changes toward Bole1b, and changes tangential to Bole1b were mapped and identified as falling within HLA-matched (black bars) or HLA-unmatched epitopes (gray bars) for each study subject. The total number of changes of each type was added for all study subjects and divided by the total number of amino acids analyzed. P values were calculated by comparison of proportions (z test). An asterisk indicates P < 0.0001 after correction for multiple comparisons. (A) Changes in non-E1E2 (Core, P7, NS2, and NS3 proteins) from time A to time B. (B) Changes in non-E1E2 (Core, P7, NS2, and NS3 proteins) from time B to time C. (C) Changes in E1E2 (without HVR1) from time A to time B. (D) Changes in E1E2 (without HVR1) from time B to time C.

We performed similar analyses for E1E2 (Fig. 5C and D). While E1E2, like non-E1E2, showed net evolution away from Bole1b over the time A to time B period, the proportion of all E1E2 amino acids changing as well as the proportion changing away from Bole1b was equivalent in HLA-matched and HLA-unmatched epitopes (Fig. 5C). This was also true of E1E2 during the time B to time C period (Fig. 5D). Moreover, in E1E2 from time A to time B, there was actually more evolution toward Bole1b in HLA-matched epitopes than in unmatched epitopes. This suggests that CTL are not the primary force driving evolution in E1E2 in either the early or the late periods of infection.

Evolution away from inoculum continues late in infection.

Given the lack of net evolution relative to Bole1b over the time B to time C period, we also analyzed changes during this time period relative to time A (inoculum) virus sequences (Fig. 6). Surprisingly, from time B to time C, both non-E1E2 and E1E2 regions showed overall net evolution away from the time A virus sequence, despite approximately 2 decades of preceding chronic infection. In both non-E1E2 and E1E2, this evolution did not localize preferentially to HLA-matched or HLA-unmatched epitopes (Fig. 7). To better understand whether ongoing evolution away from the time A virus sequence over the time B to time C period was more likely due to immune pressure or genetic drift, we compared the rate of evolution away from the time A sequence to the rate of evolution away from an unrelated genotype 1b sequence, Con1 (26) (see Fig. S2 in the supplemental material). The normalized rates of evolution in non-E1E2 away from the time A virus sequence and away from the Con1 sequence were equivalent, but in E1E2, the rate of change away from time A significantly exceeded the rate of change away from Con1, suggesting that the observed net evolution in E1E2 away from the time A virus sequence is likely nonrandom.

Fig 6.

Evolution away from inoculum continues late in infection. Rate of amino acid change from time B to time C relative to the time A (inoculum) virus amino acid sequence. Amino acid changes were counted for all pairwise comparisons between 10 time B and 10 time C sequences for each subject, and then each change was characterized as either away from, toward, or tangential to time A virus amino acid sequence. These values were divided by the number of sequence comparisons performed, the number of amino acids in the region in question, and the number of years between time B and time C for each subject. Each symbol indicates the rate for a single subject. Horizontal lines indicate medians. (A) Rate of amino acid change in non-E1E2 (Core, P7, NS2, and NS3 proteins; 1,096 sites). (B) Rate of amino acid change in E1E2 (without HVR1) (529 sites).

Fig 7.

Late evolution away from the time A virus sequence does not localize to class I-restricted epitopes. Time B to time C amino acid changes relative to time A virus sequence mapped to HLA-matched and HLA-unmatched class I epitopes. Total amino acid changes, changes away from time A virus amino acid sequence, changes toward time A virus sequence, and changes tangential to time A virus sequence were mapped and identified as falling within HLA-matched (black bars) or HLA-unmatched epitopes (gray bars) for each study subject. The total number of changes of each type was added for all study subjects and divided by the total number of amino acids analyzed. P values were calculated by comparison of proportions (z test). An asterisk indicates P < 0.0001 after correction for multiple comparisons. (A) Changes in non-E1E2 (Core, P7, NS2, and NS3 proteins) from time B to time C. (B) Changes in E1E2 (without HVR1) from time B to time C.

Evolution away from inoculum and toward Bole1b occurs simultaneously and independently.

To test the hypothesis that evolution from time B to time C was the result of concurrent and independent evolution away from time A (inoculum) virus amino acid sequence and toward the Bole1b amino acid sequence, we reanalyzed amino acid changes from time B to time C, first excluding all changes toward the Bole1b amino acid sequence and then excluding all changes away from time A virus sequence (see Fig. S3 in the supplemental material). After exclusion of all time B to C changes that were toward Bole1b, most remaining changes were away from the time A virus sequence, with very low rates of evolution toward or tangential to the time A sequence. The median rate of evolution away from the time A virus sequence was 13 times higher than the rate of evolution toward time A sequence in non-E1E2 and was 5 times higher in E1E2 (P = 0.001 for non-E1E2 and 0.002 for E1E2) (see Fig. S3A). These changes were not enriched in HLA-matched epitopes (see Fig. S4). After exclusion of all time B to C changes that were away from the time A virus sequence, the majority of remaining changes were toward Bole1b (see Fig. S3B). The median rate of evolution toward Bole1b was 21 times higher than the rate of evolution away from Bole1b in non-E1E2 and 4 times higher in E1E2 (P < 0.001 for non-E1E2 and 0.001 for E1E2). These findings confirm that evolution away from the inoculum virus amino acid sequence and toward the Bole1b amino acid sequence contribute independently to ongoing evolution late in chronic infection.
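The first exclusion analysis described above can be illustrated with a small sketch: each time B to time C change is classified against Bole1b, changes toward Bole1b are dropped, and the remaining changes are then classified against the time A sequence. This is an illustration under assumed names and toy sequences, not the authors' code; the mirror-image analysis (excluding changes away from time A, then classifying against Bole1b) follows the same pattern.

```python
def classify_change(earlier, later, reference):
    """Return "none", "away", "toward", or "tangential" for one position."""
    if earlier == later:
        return "none"
    if earlier == reference:
        return "away"
    if later == reference:
        return "toward"
    return "tangential"


def changes_vs_time_a_excluding_toward_bole1b(time_b, time_c, time_a, bole1b):
    """Classify time B -> C changes against time A after excluding
    every change that moves toward Bole1b.

    All four arguments are aligned, equal-length amino acid strings.
    """
    if not (len(time_b) == len(time_c) == len(time_a) == len(bole1b)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"away": 0, "toward": 0, "tangential": 0}
    for b, c, a, ref in zip(time_b, time_c, time_a, bole1b):
        if b == c:
            continue  # no change at this position
        if classify_change(b, c, ref) == "toward":
            continue  # excluded: change toward Bole1b
        counts[classify_change(b, c, a)] += 1
    return counts


# Hypothetical 4-residue toy sequences.
bole1b = "GAAT"
time_a = "AAAA"
time_b = "AACT"
time_c = "GCCA"
print(changes_vs_time_a_excluding_toward_bole1b(time_b, time_c, time_a, bole1b))
```

In the toy data the first position changes toward Bole1b and is excluded, even though relative to time A it would also count as a change away from the inoculum. This overlap is why the two pressures have to be separated before either can be called independent.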

Evolution of HVR1 accelerates late in infection.

HVR1 showed a different pattern of evolution than non-E1E2 and the remainder of E1E2 excluding HVR1 (Fig. 4C; also see Fig. S5 in the supplemental material). Over both time periods studied, HVR1 showed an extremely high rate of amino acid change, approximately 10-fold higher than the rates of non-E1E2 or E1E2 (without HVR1) proteins. Unlike the remainder of the hemigenome, which showed relatively constant rates of evolution away from and tangential to Bole1b with accelerating evolution toward Bole1b, HVR1 showed a relatively constant rate of evolution toward Bole1b and an accelerating rate of evolution away from and tangential to the Bole1b sequence, suggesting ongoing immune pressure on HVR1 late in chronic infection (Fig. 4C). Rates of evolution tangential to and toward Bole1b were nearly constant from time A to time B and time B to time C, suggesting that there is less constraint on evolution in this region than in the remainder of the hemigenome. However, unlike the remainder of the hemigenome, the rate of time A to time B evolution in HVR1 toward Bole1b exceeded the rate of evolution away from Bole1b, and there was no significant net evolution in HVR1 relative to the Bole1b sequence or time A virus sequence over the time B to time C period (see Fig. S5), suggesting that evolution in HVR1 is not entirely unconstrained. Overall, HVR1 showed high rates of amino acid evolution from time A to time B and from time B to time C, suggesting strong immune pressure, and high rates of tangential change, suggesting less constraint on evolution in this region than in the remainder of the hemigenome.

DISCUSSION

Extensive quasispecies diversity is a key feature of HCV infection, and this diversity complicates analysis of pressures driving HCV evolution during chronic infection (4, 7, 8, 13, 36, 45). In this study, we amplified and sequenced 204 independent 5.2-kb clones using an RT-PCR spanning regions encoding Core through NS3 proteins, allowing analysis of longitudinal evolution of both structural and nonstructural genes. We used multiple techniques to minimize background from sporadic mutations, including use of high-fidelity PCR enzymes, in silico elimination of sporadic mutations prior to analysis, and averaging of rates of amino acid change across multiple clonal sequence comparisons. This analysis also utilized a novel computationally reconstructed ancestral genotype 1b HCV sequence, Bole1b. This sequence inherently contains fewer common immune escape mutations than an arbitrarily chosen outgroup or a consensus sequence, and amino acid changes toward the Bole1b sequence therefore are more likely to represent evolution toward optimal replicative fitness (3, 5).

Our analysis of the time A to time B period of infection supports previous analyses suggesting that T cells play a key role in control of HCV early in infection and confirms previous findings that, early in infection, in Core and nonstructural proteins, amino acid changes accumulate preferentially in HLA-matched class I epitopes (39). A previous study in other subjects from the Irish anti-D cohort concluded that HLA-B*27, HLA-A*03, and HLA-Cw*01 were associated with viral clearance (32). Later studies identified T cell epitopes commonly targeted by HLA-A*03- or HLA-B*27-positive women in the cohort and showed that T cell pressure drives selection of escape mutations (10, 14). In another study of HLA-B*57-positive individuals with persistent infection, mutations accumulated in HLA-B*57-restricted epitopes in E2 and NS5 (21). It is likely that the majority of these amino acid changes occur quite early, as previous longitudinal studies of acute infection have shown development of mutations in the majority of recognized class I epitopes during the first year of infection (8). Our analysis of E1E2 from time A to time B also agrees with other analyses showing early net evolution away from the consensus, which has been shown to accelerate around a year after infection as neutralizing antibody titers increase (28). This early evolution by E1E2 away from Bole1b did not localize to HLA-matched class I epitopes. In fact, in E1E2 from time A to time B, for unclear reasons, there was more evolution toward Bole1b in HLA-matched than in HLA-unmatched epitopes. This may represent reversion of mutations that developed at highly polymorphic sites very early in infection. Together, these results suggest that CTL likely are not the primary force driving evolution in E1E2.

We have extended these observations by also studying evolution late in chronic infection between time points approximately 20 and 25 years after infection (time B to time C). We found that most amino acid changes late in chronic infection are negatively selected, as shown by the greater rate of synonymous than nonsynonymous nucleotide change over this time period in all regions except HVR1. Surprisingly, though, at the amino acid level, we found phylogenetically distinct populations of virus at these longitudinal late chronic infection time points, suggesting ongoing viral evolution and quasispecies replacement. Strikingly, the vast majority of amino acid changes late in chronic infection either represented evolution away from the time A (inoculum) virus sequence or evolution toward the Bole1b sequence.
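The comparison of synonymous and nonsynonymous change described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function name `count_codon_changes` and the short example sequences are hypothetical. It only counts codons whose change does or does not alter the encoded amino acid, and it does not normalize by the number of synonymous and nonsynonymous sites, as site-based estimators such as that of Nei and Gojobori (34) do.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_codon_changes(seq1, seq2):
    """Count synonymous and nonsynonymous codon differences.

    Both inputs are assumed to be aligned, in-frame coding sequences
    of equal length (an assumption of this sketch).
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    synonymous = nonsynonymous = 0
    for pos in range(0, len(seq1), 3):
        codon1 = seq1[pos:pos + 3].upper()
        codon2 = seq2[pos:pos + 3].upper()
        if codon1 == codon2:
            continue  # unchanged codon
        if codon1 not in CODON_TABLE or codon2 not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical example: GCT -> GCC keeps Ala; AAA -> AGA changes Lys to Arg.
print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

An excess of synonymous over nonsynonymous change in counts of this kind, once normalized per site, is the pattern interpreted above as negative selection.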

It is noteworthy that net evolution away from the time A (inoculum) virus sequence continues late in chronic infection. Since evolution away from the time A virus sequence in non-E1E2 genes late in chronic infection did not localize to known CTL epitopes, it may represent CTL escape mutations developing at subdominant epitopes, compensatory mutations for CTL escape mutations that developed earlier in infection, or genetic drift. It is not currently possible to distinguish between these possibilities, given the limited availability of T cells from these subjects. Targeting of new subdominant epitopes may be less likely given that T cell responses have been shown to decline in magnitude and become dysfunctional in chronic infection (7, 8, 22, 25, 40). Given prior studies associating HLA-B*57 and HLA-B*27 with development of CTL escape mutations, it is possible that we would have detected additional mutations if our study had included subjects with these alleles and we had sequenced the entire genome, including NS5B (10, 21).

The majority of net evolution away from the inoculum virus sequence from time B to time C occurred in E1E2, which is not surprising given that high titers of neutralizing antibody are present in many chronically infected individuals, and a previous study showed some evidence of ongoing escape from neutralizing antibody in chronic HCV infection (29, 45). The rate of evolution away from the inoculum virus sequence in E1E2 exceeded the rate of evolution away from an unrelated genotype 1b sequence (Con1), suggesting nonrandom immune pressure rather than genetic drift. The accelerating rate of HVR1 evolution late in infection was surprising and also supports a key role for antibody in driving E1E2 evolution in late chronic infection. While not definitive, these results argue against cell-to-cell transmission without antibody exposure as a major mechanism of viral persistence in chronic infection.

Even more striking than the ongoing evolution away from inoculum was the observation of positive selective pressure toward the Bole1b amino acid sequence. From time A to time B and from time B to time C, across the entire hemigenome aside from HVR1, the rate of evolution toward Bole1b significantly exceeded the rate of evolution tangential to Bole1b. This was quite unexpected, since at any position that did not initially match the Bole1b sequence, there were 18 possible amino acids that would result in tangential change and only 1 amino acid that would result in change toward Bole1b. Therefore, evolution toward the Bole1b sequence would be extremely unlikely to occur by random chance. From time A to time B, non-E1E2 and E1E2 showed net evolution away from Bole1b, but by the time B to time C period, rates of evolution toward Bole1b accelerated, resulting in net neutral evolution relative to Bole1b. It may be that the virus could not diverge any further from the Bole1b sequence at that point and still maintain adequate replicative fitness.
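The toward, away, and tangential categories, and the chance expectation discussed above, can be sketched in Python as follows. This is a minimal illustration rather than the code used in this study: the function name `classify_changes`, the toy sequences, and the assumption that every amino acid replacement is equally likely under random change are all introduced here for illustration only.

```python
from collections import Counter


def classify_changes(earlier, later, reference):
    """Classify amino acid changes between two time points.

    All three sequences are assumed to be aligned and of equal length.
    toward:     the later residue matches the reference, the earlier did not
    away:       the earlier residue matched the reference, the later does not
    tangential: neither residue matches the reference
    """
    if not (len(earlier) == len(later) == len(reference)):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for before, after, ref in zip(earlier, later, reference):
        if before == after:
            continue  # no change at this site
        if after == ref:
            counts["toward"] += 1
        elif before == ref:
            counts["away"] += 1
        else:
            counts["tangential"] += 1
    return counts


# Chance expectation at a site that does not match the reference:
# of the 19 possible replacement residues, 1 is toward and 18 are tangential.
# Assumes all replacements are equally likely (a simplification).
N_AMINO_ACIDS = 20
p_toward_by_chance = 1 / (N_AMINO_ACIDS - 1)

# Toy example (hypothetical sequences, not study data).
counts = classify_changes(earlier="ACDKG", later="ACNEH", reference="ACDEF")
print(counts["toward"], counts["away"], counts["tangential"])  # 1 1 1
print(round(p_toward_by_chance, 4))  # 0.0526
```

Under this simplified null model, only about 1 in 19 changes at a mismatched site would be expected to restore the reference residue, which is why a rate of evolution toward Bole1b exceeding the tangential rate is difficult to attribute to chance.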

Together, these data suggest that, late in infection, most ongoing HCV evolution is not random genetic drift but rather the product of strong pressure toward a common ancestor (Bole1b) and concurrent net ongoing evolution away from the inoculum virus sequence. These two types of amino acid change likely balance replicative fitness and ongoing immune escape.

ACKNOWLEDGMENTS

We thank David L. Thomas and members of the Center for Viral Hepatitis Research for useful discussions, Anna Snider for technical assistance, and plasma donors from the anti-D cohort.

This study was supported by NIH grants R01 DA024565 and U19 AI088791-2.

Footnotes

Published ahead of print 12 September 2012

Supplemental material for this article may be found at http://jvi.asm.org/.

REFERENCES

  • 1. Allen TM, et al. 2004. Selection, transmission, and reversion of an antigen-processing cytotoxic T-lymphocyte escape mutation in human immunodeficiency virus type 1 infection. J. Virol. 78:7069–7078
  • 2. Alter MJ, et al. 1999. The prevalence of hepatitis C virus infection in the United States, 1988 through 1994. N. Engl. J. Med. 341:556–562
  • 3. Bhattacharya T, et al. 2007. Founder effects in the assessment of HIV polymorphisms and HLA allele associations. Science 315:1583–1586
  • 4. Bukh J, Miller RH, Purcell RH. 1995. Genetic heterogeneity of hepatitis C virus: quasispecies and genotypes. Semin. Liver Dis. 15:41–63
  • 5. Burke KP, et al. 2012. Immunogenicity and cross-reactivity of a representative ancestral sequence in hepatitis C virus infection. J. Immunol. 188:5177–5188
  • 6. Cabrera R, et al. 2004. An immunomodulatory role for CD4(+)CD25(+) regulatory T lymphocytes in hepatitis C virus infection. Hepatology 40:1062–1071
  • 7. Cox AL, et al. 2005. Comprehensive analyses of CD8+ T cell responses during longitudinal study of acute human hepatitis C. Hepatology 42:104–112
  • 8. Cox AL, et al. 2005. Cellular immune selection with hepatitis C virus persistence in humans. J. Exp. Med. 201:1741–1752
  • 9. Cox AL, et al. 2005. Prospective evaluation of community-acquired acute-phase hepatitis C virus infection. Clin. Infect. Dis. 40:951–958
  • 10. Dazert E, et al. 2009. Loss of viral fitness and cross-recognition by CD8+ T cells limit HCV escape from a protective HLA-B27-restricted human immune response. J. Clin. Investig. 119:376–386
  • 11. Dowd KA, Netski DM, Wang XH, Cox AL, Ray SC. 2009. Selection pressure from neutralizing antibodies drives sequence evolution during acute infection with hepatitis C virus. Gastroenterology 136:2377–2386
  • 12. Farci P, Bukh J, Purcell RH. 1997. The quasispecies of hepatitis C virus and the host immune response. Springer Semin. Immunopathol. 19:5–26
  • 13. Farci P, et al. 2000. The outcome of acute hepatitis C predicted by the evolution of the viral quasispecies. Science 288:339–344
  • 14. Fitzmaurice K, et al. 2011. Molecular footprints reveal the impact of the protective HLA-A*03 allele in hepatitis C virus infection. Gut 60:1563–1571
  • 15. Friedrich TC, et al. 2004. Reversion of CTL escape-variant immunodeficiency viruses in vivo. Nat. Med. 10:275–281
  • 16. Ghany MG, Nelson DR, Strader DB, Thomas DL, Seeff LB. 2011. An update on treatment of genotype 1 chronic hepatitis C virus infection: 2011 practice guideline by the American Association for the Study of Liver Diseases. Hepatology 54:1433–1444
  • 17. Gruener NH, et al. 2001. Sustained dysfunction of antiviral CD8+ T lymphocytes after infection with hepatitis C virus. J. Virol. 75:5550–5558
  • 18. Hamady M, Lozupone C, Knight R. 2010. Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J. 4:17–27
  • 19. Keck ZY, et al. 2009. Mutations in hepatitis C virus E2 located outside the CD81 binding sites lead to escape from broadly neutralizing antibodies but compromise virus infectivity. J. Virol. 83:6149–6160
  • 20. Kenny-Walsh E. 1999. Clinical outcomes after hepatitis C infection from contaminated anti-D immune globulin. Irish Hepatology Research Group. N. Engl. J. Med. 340:1228–1233
  • 21. Kim AY, et al. 2011. Spontaneous control of HCV is associated with expression of HLA-B 57 and preservation of targeted epitopes. Gastroenterology 140:686–696
  • 22. Kim AY, et al. 2006. Impaired hepatitis C virus-specific T cell responses and recurrent hepatitis C virus in HIV coinfection. PLoS Med. 3:e492 doi:10.1371/journal.pmed.0030492
  • 23. Kuntzen T, et al. 2007. Viral sequence evolution in acute hepatitis C virus infection. J. Virol. 81:11658–11668
  • 24. Lechner F, et al. 2000. CD8+ T lymphocyte responses are induced during acute hepatitis C virus infection but are not sustained. Eur. J. Immunol. 30:2479–2487
  • 25. Lechner F, et al. 2000. Analysis of successful immune responses in persons infected with hepatitis C virus. J. Exp. Med. 191:1499–1512
  • 26. Le Pogam S, et al. 2006. In vitro selected Con1 subgenomic replicons resistant to 2′-C-methyl-cytidine or to R1479 show lack of cross resistance. Virology 351:349–359
  • 27. Leslie AJ, et al. 2004. HIV evolution: CTL escape mutation and reversion after transmission. Nat. Med. 10:282–289
  • 28. Liu L, et al. 2010. Acceleration of hepatitis C virus envelope evolution in humans is consistent with progressive humoral immune selection during the transition from acute to chronic infection. J. Virol. 84:5067–5077
  • 29. Logvinoff C, et al. 2004. Neutralizing antibody response during acute and chronic hepatitis C virus infection. Proc. Natl. Acad. Sci. U. S. A. 101:10149–10154
  • 30. Maheshwari A, Ray S, Thuluvath PJ. 2008. Acute hepatitis C. Lancet 372:321–332
  • 31. Martell M, et al. 1992. Hepatitis C virus (HCV) circulates as a population of different but closely related genomes: quasispecies nature of HCV genome distribution. J. Virol. 66:3225–3229
  • 32. McKiernan SM, et al. 2004. Distinct MHC class I and II alleles are associated with hepatitis C viral clearance, originating from a single source. Hepatology 40:108–114
  • 33. Munshaw S, et al. 2012. Computational reconstruction of bole1a, a representative synthetic hepatitis C virus subtype 1a genome. J. Virol. 86:5915–5921
  • 34. Nei M, Gojobori T. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426
  • 35. Netski DM, et al. 2004. The development of neutralizing antibodies during acute hepatitis C virus infection. Abstr. 11th Int. Symp. Hepatitis C Vir. Rel. Vir., abstr. P-201
  • 36. Netski DM, et al. 2005. Humoral immune response in acute hepatitis C virus infection. Clin. Infect. Dis. 41:667–675
  • 37. Neumann AU, et al. 1998. Hepatitis C viral dynamics in vivo and the antiviral efficacy of interferon-alpha therapy. Science 282:103–107
  • 38. Perz JF, Armstrong GL, Farrington LA, Hutin YJ, Bell BP. 2006. The contributions of hepatitis B virus and hepatitis C virus infections to cirrhosis and primary liver cancer worldwide. J. Hepatol. 45:529–538
  • 39. Ray SC, et al. 2005. Divergent and convergent evolution after a common-source outbreak of hepatitis C virus. J. Exp. Med. 201:1753–1759
  • 40. Rutebemberwa A, et al. 2008. High-programmed death-1 levels on hepatitis C virus-specific T cells during acute infection are associated with viral persistence and require preservation of cognate antigen during chronic infection. J. Immunol. 181:8215–8225
  • 41. Thimme R, et al. 2001. Determinants of viral clearance and persistence during acute hepatitis C virus infection. J. Exp. Med. 194:1395–1406
  • 42. Timm J, et al. 2004. CD8 epitope escape and reversion in acute HCV infection. J. Exp. Med. 200:1593–1604
  • 43. Tong MJ, El-Farra NS, Reikes AR, Co RL. 1995. Clinical outcomes after transfusion-associated hepatitis C. N. Engl. J. Med. 332:1463–1466
  • 44. Villano SA, Vlahov D, Nelson KE, Cohn S, Thomas DL. 1999. Persistence of viremia and the importance of long-term follow-up after acute hepatitis C infection. Hepatology 29:908–914
  • 45. von Hahn T, et al. 2007. Hepatitis C virus continuously escapes from neutralizing antibody and T-cell responses during chronic infection in vivo. Gastroenterology 132:667–678
  • 46. Wang XH, et al. 2007. Progression of fibrosis during chronic hepatitis C is associated with rapid virus evolution. J. Virol. 81:6513–6522
  • 47. World Health Organization 1997. Hepatitis C: global prevalence. Wkly. Epidemiol. Rec. 72:341–348

Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)