Journal of Virology. 2016 Jul 27;90(16):7171–7183. doi: 10.1128/JVI.00243-16

Single-Molecule Sequencing Reveals Complex Genome Variation of Hepatitis B Virus during 15 Years of Chronic Infection following Liver Transplantation

B D Betz-Stablein a,*, A Töpfer b,c, M Littlejohn d, L Yuen d, D Colledge d, V Sozzi d, P Angus e, A Thompson d,f, P Revill d, N Beerenwinkel b,c, N Warner d, F Luciani a,*
Editor: J-H J Ou g
PMCID: PMC4984637  PMID: 27252524

ABSTRACT

Chronic hepatitis B (CHB) is prevalent worldwide. The infectious agent, hepatitis B virus (HBV), replicates via an RNA intermediate and is error prone, leading to the rapid generation of closely related but not identical viral variants, including those that can escape host immune responses and antiviral treatments. The complexity of CHB can be further enhanced by the presence of HBV variants with large deletions in the genome generated via splicing (spHBV variants). Although spHBV variants are incapable of autonomous replication, their replication is rescued by wild-type HBV. spHBV variants have been shown to enhance wild-type virus replication, and their prevalence increases with liver disease progression. Single-molecule deep sequencing was performed on whole HBV genomes extracted from samples, including the liver explant, longitudinally collected from a subject with CHB over a 15-year period after liver transplantation. By employing novel bioinformatics methods, this analysis showed that the dynamics of the viral population across a period of changing treatment regimens was complex. The spHBV variants detected in the liver explant remained present posttransplantation, and a highly diverse novel spHBV population as well as variants with multiple deletions in the pre-S genes emerged. The identification of novel mutations outside the HBV reverse transcriptase gene that co-occurred with known drug resistance-associated mutations highlights the relevance of using full-genome deep sequencing and supports the hypothesis that drug resistance involves interactions across the full length of the HBV genome.

IMPORTANCE Single-molecule sequencing allowed the characterization, in unprecedented detail, of the evolution of HBV populations and offered unique insights into the dynamics of defective and spHBV variants following liver transplantation and complex treatment regimens. This analysis also showed the rapid adaptation of HBV populations to treatment regimens with evolving drug resistance phenotypes and evidence of purifying selection across the whole genome. Finally, the new open-source bioinformatics tools with the capacity to easily identify potential spliced variants from deep sequencing data are freely available.

INTRODUCTION

Hepatitis B virus (HBV) causes a widely prevalent chronic viral infection and infects 240 million people worldwide (1). Chronic hepatitis B (CHB) can lead to cirrhosis, liver failure, and hepatocellular carcinoma (HCC) (1, 2). HBV is a member of the Hepadnaviridae family and has a partially double-stranded relaxed circular DNA genome (3). The approximately 3,200-nucleotide (nt) genome consists of four overlapping, frame-shifted open reading frames (ORF) encoding the polymerase (P), envelope (ENV), X, and core (C) proteins. HBV can be divided into nine genotypes, labeled genotypes A to I, with a further putative genotype (genotype J) being proposed (4). With the exception of genotypes E and G, all genotypes can be further classified into subgenotypes. HBV genotypes differ in diversity by more than 8%, while subgenotypes differ by 4 to 8% (5). HBV replicates via an RNA intermediate, but its reverse transcriptase (RT) lacks the ability to proofread (6). This, in combination with a high replication rate, leads to the presence of many closely related but not identical variants that form a rapidly evolving population of viral genomes.

CHB is typically managed with oral antiviral agents that inhibit the function of HBV RT, thus impairing viral replication (7). However, the presence of low-level replication may result in the selection of drug resistance-associated mutations, which in turn can lead to treatment failure and the further progression of liver disease (8). Currently, five nucleoside/nucleotide analogues (NAs) have been approved for use for the treatment of CHB (3). The rate of emergence of drug resistance-associated mutations is NA dependent (9). Following 5 years of monotherapy by drug-naive patients, greater than 80% developed lamivudine (LMV) resistance, 29% developed resistance to adefovir (ADV), and 1.2% developed resistance to entecavir (ETV) (9). Moreover, after 5 years of treatment with ETV, the proportion of LMV-experienced patients who developed resistance to ETV was 51%. There are eight codons within the RT domain of HBV polymerase that are associated with primary resistance to the five approved NAs (9).

HBV variants with deletions in the pre-S domains of the ENV ORF are also commonly detected, with estimates of prevalence being up to 30%, depending on the geographic region and HBV genotype (10). The HBV ENV ORF encodes three envelope proteins of graduating length: large (pre-S1, pre-S2, and S), middle (pre-S2 and S), and small/surface (S). The mRNAs of these envelope proteins are transcribed from a single ORF; however, each has its own initiation codon (11). In general, the number of pre-S mutations can increase during the progression of CHB (12), and pre-S mutations are more prevalent in patients with HCC and liver cirrhosis (13). It has also been shown that NA treatments can select viral variants whose genomes encode truncated surface proteins, which could, in turn, accelerate liver disease progression (14).

The HBV population within a host may also harbor variants generated by splicing of pregenomic RNA (spHBV variants) (15). These variants can be packaged and reverse transcribed, but viral particle packaging requires the presence of wild-type virus (HBV with a full-length genome) (16, 17) to provide the necessary proteins. In addition, the spHBV variants have been reported to enhance wild-type virus replication in subjects with CHB and have been associated with advanced liver disease (18, 19). Indeed, the genome of the most frequently detected spliced variant (termed Sp1) encodes a novel protein, termed the hepatitis B spliced protein (HBSP), which has also been associated with disease progression. Recent analysis also showed a significant association between an increase in the level of spHBV variants in serum and the time to diagnosis of HCC (20). To date, spliced variants have mostly been identified by population-based sequencing. This does not allow the accurate detection of the full distribution of these variants and the identification of the exact locations of the donor and acceptor splice sites (15–17, 20). In addition, only a few studies have addressed the longitudinal dynamics of spHBV variants. For example, population-based sequencing was employed to examine spHBV variants prior to and following the diagnosis of HCC (20).

Few studies of a mixed viral population have been performed to identify coevolving single-nucleotide variants (SNVs) within a single HBV genome. One study carried out limited dilution followed by single-molecule sequencing of 639 full-length HBV genomes to assess the evolution of drug-resistant HBV (21). The study demonstrated that LMV resistance is a complex genetic trait, with multiple mutations within a single HBV genome being responsible for LMV resistance. With the advent of next-generation sequencing (NGS), deep sequencing of complex and highly variable genomic populations has enabled detailed estimates of rare genomic variants (22). However, because the sequence reads produced are error prone and generally of a short length, analysis of complete genome sequences to study rare variants and complex structural genomic rearrangements, such as recombinant variants and variants with large insertions and deletions, remains challenging (23). The recent advent of high-throughput single-molecule sequencing technologies, such as the Pacific Biosciences (PacBio) and the Oxford Nanopore technologies, has allowed investigation of complex genomic rearrangements by generating long reads (up to 60,000 nt in length) (24). These technologies can be used to generate full-length HBV genomes in one read, thus allowing investigation of the complex and diverse distribution of rapidly mutating genomes. To our knowledge, no studies have applied high-throughput single-molecule deep sequencing to assess the full-length genomes of HBV variants, including spliced variants, and to determine their distributions. Hence, the current level of HBV genomic diversity is likely underestimated.

In this study, single-molecule deep sequencing and a novel bioinformatic work flow were employed to study the genomes of HBV variants extracted from an individual with CHB following liver transplantation. Samples collected longitudinally over a period spanning 15 years included the liver explant and follow-up blood samples. Through the identification of novel HBV variants that could not be detected by population-based sequencing or NGS, this study demonstrates the power of long-read single-molecule sequencing of complete viral genomes for analysis of novel HBV variants over time.

MATERIALS AND METHODS

Subject and samples.

The patient was a 69-year-old male who underwent a liver transplant in 1991 for HBV-related liver cirrhosis and was diagnosed with recurrent HBV infection at 6 months posttransplantation. This patient is highly drug experienced and over time has developed resistance to LMV, ETV, and ADV. The case history of this patient has been previously reported (patient B in reference 25), and clinical information is shown in Fig. 1. The patient was infected with HBV genotype D, and samples available following liver transplantation were collected over a period spanning 15 years. These included a sample from the liver explant (designated sample collected at time zero [T0], from 1991) and three posttransplantation blood samples (designated samples collected at time 1 [T1; 2000], time 2 [T2; 2001], and time 3 [T3; 2005]). Approval for this work was obtained from the Austin Health Human Research Ethics Committee. Informed consent to use these samples for research purposes was obtained from the patient.

FIG 1.

Treatment history of the chronically HBV-infected subject. The antiviral therapy (colored lines) and immunosuppressants (shaded boxes) taken by the subject throughout the study period are shown at the top. The peaks in viral load and ALT levels reflect the drug resistance observed in this patient that led to a change in the treatment regimen. The genomes of isolates from four samples were deep sequenced, and the samples are identified by the black diamonds. The sample obtained at T0 was from the explant liver, while the samples obtained at T1 to T3 were blood samples. Shannon entropy, calculated across the full genome, represents the diversity of the viruses within each sample that underwent deep sequencing.

HBV extraction and amplification.

HBV DNA was extracted from both liver tissue and 200-μl serum samples using a QIAamp DNA minikit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Extracted DNA was eluted in a final volume of 50 μl of elution buffer supplied by the manufacturer. In order to generate a sufficient amount of DNA for deep sequencing, amplification of the nearly complete HBV genome was carried out as previously described (26) using a FastStart high-fidelity PCR system (Roche, Basel, Switzerland) and primers P1 (CCG GAA AGC TTG AGC TCT TCT TTT TCA CCT CTG CCT AAT CA, nucleotides 1821 to 1841 [numbering from the EcoRI start site]) and P2 (CCG GAA AGC TTG AGC TCT TCA AAA AGT TGC ATG GTG CTG G, nucleotides 1806 to 1825). These primers are located at conserved sites which include direct repeat 1, a segment required for replication.

For analysis of pre-S deletions, the entire surface gene (pre-S1, pre-S2, surface) of HBV was amplified using a PicoMaxx high-fidelity PCR amplification system (Stratagene, La Jolla, CA, USA) and primers OS1 (GCC TCA TTT TGT GGG TCA CCA TA, nucleotides 2846 to 2868) and OS2 (TCT CTG ACA TAC TTT CCA AT, nucleotides 2798 to 2817).

Clonal analysis.

The HBV genome amplicons were cloned into a PCRScript Amp SK+ cloning kit (Stratagene) per the manufacturer's instructions. HBV DNA from the clones was PCR amplified, and 30 to 32 clones from each sample were sequenced by the population-based method described previously (27), using the PCR primers as sequencing primers. To specifically identify spHBV variants, amplicons of lower molecular weight were cut from an agarose gel following electrophoresis and cloned, and their sequences were determined by the population-based method.

Single-molecule sequencing.

For single-molecule sequencing, SMRTbell templates were produced using a DNA template preparation kit (version 2.0; 250 bp to <3,000 bp; protocol number [p/n] 001-540-726; Pacific Biosciences) according to the manufacturer's instructions. Briefly, the DNA amplicons were inspected for their quality by their size and integrity on an Agilent Bioanalyzer 2100 1-kb DNA chip. The PCR product (500 to 750 ng) was end repaired using polishing enzymes. A blunt-end adapter ligation was performed to create the SMRTbell template. The library was inspected for quality and quantified on an Agilent Bioanalyzer 12-kb DNA chip and on a Qubit fluorimeter (Life Technologies), respectively. A ready-to-sequence SMRTbell-polymerase complex was created using a DNA/polymerase P6 binding kit (Pacific Biosciences) according to the manufacturer's instructions. For the liver sample, the Pacific Biosciences RS2 instrument was programmed to sequence the library on one SMRT cell using P6/C4 chemistry with the magnetic bead loading method, taking one movie of 180 min. The three blood samples were prepared as described above, and two SMRT cells were used with P4/C2 chemistry.

454 Roche FLX sequencing.

Nearly full-length HBV genomes amplified from the three blood samples were also sequenced using a 454 Roche FLX Titanium sequencing instrument, as previously described (28).

Bioinformatics analyses. (i) Processing of raw data.

For each raw PacBio RS read, the SMRT Analysis software suite was used to remove SMRTbell adapters and to generate a circular consensus sequence (CCS) if the amplicon had at least five full passes. A de novo consensus sequence was then created (29) from the CCS reads obtained from the liver genome (obtained at T0) and manually checked for software-generated errors in the open reading frames.

Manual correction involved alignment of the de novo consensus sequence to a full-genome HBV subtype D1 reference sequence (GenBank accession number X02496.1). In total, 2 single-base deletions and a sequence of 9 consecutive unknown bases (coded N by Vicuna software) were replaced by the corresponding nucleotides from the reference sequence, ensuring the complete translation of all ORFs. Additionally, the 5′ and 3′ ends of the full genome were trimmed of 21 and 30 nucleotides, respectively, because the de novo consensus sequence consisted of strings of unknown bases coded as N. CCS data from the PacBio read of the samples collected longitudinally were aligned to the consensus sequence obtained from the liver explant using the BWA MEM algorithm (30).

Roche 454 raw sequence reads were first quality checked and trimmed using an in-house script as previously described (28) and then also aligned using the BWA MEM algorithm.

(ii) SNV identification and SE.

For data sets generated by the PacBio RS and 454 Roche systems, SNVs were called from aligned reads (bam file) using the LoFreq program (31). Shannon entropy (SE) was then calculated using an in-house script written in R (32).
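
As an illustration of the diversity measure, the following Python sketch computes Shannon entropy per genome position from SNV frequencies and averages it over a region. It is not the in-house R script used in this study: the function names, the natural logarithm, the biallelic treatment of each SNV, and the normalization by region length are assumptions made for the example, and the frequencies are hypothetical.

```python
import math

def site_entropy(freqs):
    """Shannon entropy (in nats) of one genome position.

    freqs: frequencies of the alleles observed at that position (should sum to 1).
    """
    return sum(-p * math.log(p) for p in freqs if p > 0)

def mean_entropy(snv_freqs, region_length):
    """Average per-site entropy over a region.

    snv_freqs: dict mapping position -> frequency of the variant allele
               (biallelic sites assumed; positions without an SNV contribute 0).
    region_length: number of positions in the region (assumed normalization).
    """
    total = sum(site_entropy([f, 1.0 - f]) for f in snv_freqs.values())
    return total / region_length

# Hypothetical SNV frequencies, for illustration only.
example = {204: 0.5, 180: 0.99}
print(round(mean_entropy(example, region_length=3158), 6))
```

A position at which the variant allele is fixed (frequency 1.0) or absent contributes zero entropy, so under this definition diversity is driven by SNVs at intermediate frequencies.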

Identification of spliced variants.

Detection of spliced variants was performed with a new algorithm implemented in the software Split2Del (https://github.com/armintoepfer/split2del). This was applied to both PacBio RS CCS reads and 454 Roche sequence data. Briefly, the algorithm was developed to detect large deletions from aligned single reads in the bam format generated by the BWA MEM algorithm. Split2Del takes as the input reads that are split and aligned to two or more distinct regions of the genome. The longest partial read aligned is referred to as the primary read, and any other partial reads are referred to as supplementary reads. Split2Del takes the supplementary reads aligned by the BWA MEM algorithm and merges them together with primary reads when supplementary reads are longer than 50 nt in length. The gap between the primary and supplementary reads is then treated as a deletion, and a new bam file is generated. After identification of the deletions using Split2Del, only deletions that were greater than 500 nt and that occurred in more than 10 reads from samples from each time point were selected for further analysis. In a final step, all reads ±5 nt from the putative deletion start and stop positions were selected and then used to generate a consensus sequence for each identified deletion. By allowing a length mismatch of ±5 nt at either end, this algorithm accounts for sequencing errors that lead to alignment errors in the anchoring sequence of each deletion.
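
The filtering logic described above can be sketched in Python as follows. This is not the Split2Del code: it works on plain (start, end) reference intervals rather than bam records, handles only one supplementary segment per read, and the function names and the greedy grouping of breakpoints are assumptions made for illustration. Only the thresholds (50 nt, 500 nt, 10 reads, ±5 nt) are taken from the text.

```python
from collections import Counter

MIN_SUPP_LEN = 50   # supplementary segments must be longer than this (nt)
MIN_DEL_LEN = 500   # keep deletions greater than this (nt)
MIN_READS = 10      # keep deletions seen in more than this many reads
TOLERANCE = 5       # allowed breakpoint mismatch at either end (nt)

def deletion_from_segments(primary, supplementary):
    """Return the reference gap between two aligned parts of one read.

    Each segment is a (ref_start, ref_end) half-open interval on the reference.
    Returns (del_start, del_end), or None if the supplementary segment is too
    short or the two segments leave no gap between them.
    """
    if supplementary[1] - supplementary[0] <= MIN_SUPP_LEN:
        return None
    left, right = sorted([primary, supplementary])
    if right[0] <= left[1]:
        return None
    return (left[1], right[0])

def select_deletions(read_deletions):
    """Group per-read deletions whose breakpoints agree within TOLERANCE and
    keep groups that are long enough and supported by enough reads."""
    counts = Counter(d for d in read_deletions
                     if d is not None and d[1] - d[0] > MIN_DEL_LEN)
    clusters = {}  # representative (start, end) -> number of supporting reads
    for (start, end), n in counts.most_common():
        for rep in clusters:
            if abs(start - rep[0]) <= TOLERANCE and abs(end - rep[1]) <= TOLERANCE:
                clusters[rep] += n
                break
        else:
            clusters[(start, end)] = n
    return {rep: n for rep, n in clusters.items() if n > MIN_READS}

# Hypothetical reads: 11 with identical breakpoints, 2 shifted by 2 nt,
# and 5 supporting a second deletion that falls below the read threshold.
reads = [deletion_from_segments((0, 400), (1300, 1600))] * 11
reads += [(402, 1298)] * 2 + [(2000, 2600)] * 5
print(select_deletions(reads))
```

In this sketch the most frequent breakpoint pair serves as the representative of each group, so reads whose breakpoints are shifted by a few nucleotides through alignment error are counted toward the same putative deletion.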

Haplotype reconstruction of full-length HBV genomic variants.

To analyze full-length reads that did not contain deletions, statistical haplotype reconstruction was performed on single-molecule sequencing data using the software ShoRAH (33), which performs error correction as well as haplotype reconstruction to identify full-genome variants. ShoRAH was applied to CCS reads that had a length greater than 3,000 nt to create error-corrected full-length haplotypes. Local error correction was performed with an overlapping window of 1,002 nt (parameter w in ShoRAH).

Reconstructed haplotypes representing the distribution of complete genome variants as well as spliced variants identified with Split2Del were combined and used to construct phylogenetic trees with MEGA7 software (34). The substitution model GTR (G+I) was used for this analysis with 500 bootstrap samples. The gaps and missing data option was set to include all sites to retain full-length sequences when large deletions were present (35). The SMuPFi program (36) was used to identify co-occurring substitutions, and frequencies were calculated on the basis of the number of haplotypes containing drug resistance-associated combinations which also contained the co-occurring substitution.
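
The frequency calculation for co-occurring substitutions can be illustrated with a short Python sketch. This is not the SMuPFi program: the function name, the representation of each haplotype as a set of substitution labels, the unweighted counting of haplotypes, and the use of the resistance-carrying haplotypes as the denominator are assumptions, and the label "hypothetical_sub" is a placeholder rather than a substitution reported here.

```python
def cooccurrence_frequency(haplotypes, resistance_combo, candidate):
    """Fraction of haplotypes carrying every substitution in resistance_combo
    that also carry the candidate substitution.

    haplotypes: iterable of sets of substitution labels, one set per haplotype.
    Returns None when no haplotype carries the full resistance combination.
    """
    combo = set(resistance_combo)
    carriers = [h for h in haplotypes if combo <= set(h)]
    if not carriers:
        return None
    return sum(1 for h in carriers if candidate in h) / len(carriers)

# Hypothetical haplotypes, for illustration only.
haplotypes = [
    {"rtL180M", "rtM204V", "hypothetical_sub"},
    {"rtL180M", "rtM204V"},
    {"rtL180M"},
    {"hypothetical_sub"},
]
print(cooccurrence_frequency(haplotypes, ["rtL180M", "rtM204V"], "hypothetical_sub"))
```

Haplotypes that carry only part of the resistance combination, or only the candidate substitution, do not enter the calculation under these assumptions.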

Accession number.

PacBio sequence data are available at http://www.ebi.ac.uk/ena/data/view/ERP013934.

RESULTS

Single-molecule nearly full-length HBV genome sequencing.

In order to study the diversity and evolutionary dynamics of a complex population of HBV variants, longitudinally collected viremic samples from a subject with CHB were analyzed using deep sequencing across the nearly full-length genome (3,158 nt). Analysis was performed on four samples: one sample collected from a liver explant (T0, 1991) and three blood samples collected over a 15-year period after liver transplantation (T1, 2000; T2, 2001; T3, 2005) (Fig. 1). Single-molecule deep sequencing of a single amplicon covering the nearly full-length genome of each sample was performed with the PacBio RS technology; thus, high-quality circular consensus sequence (CCS) reads covering the nearly full-length genome were obtained. The CCS reads had average lengths of 1,493, 1,780, 1,465, and 1,404 nt for time points T0, T1, T2, and T3, respectively. The CCS read coverage was 36,618, 9,116, 25,769, and 19,578 reads, respectively, across the time points, with the sequences having greater than 98% alignment to the de novo consensus sequence obtained at T0. Of the CCS reads obtained at T0, T1, T2, and T3, 4.5%, 16.9%, 6.7%, and 7.1%, respectively, had lengths greater than 3,000 nt. These reads were utilized to estimate the diversity of the nearly full-length genome sequences of the circulating HBV genomes, as well as defective smaller genome variants, such as spHBV variants and variants with pre-S deletions. The consensus sequences of the HBV genomes obtained at T1, T2, and T3 were also generated from the 454 Roche data, and these matched the corresponding sequences obtained by PacBio RS sequencing.

Spliced variants/large deletions.

Novel bioinformatics methods were developed to utilize CCS data in order to study the distribution of structural variations in the form of large deletions and spliced variants across the four time points at which samples were obtained (Split2Del; see Materials and Methods). This analysis revealed the presence of a highly diverse population of complete and smaller genomes both in the liver explant and in the blood samples. For each sample (samples obtained at T0 to T3), the distribution of aligned reads against the full HBV genome revealed that 68.6%, 35.5%, 38.8%, and 34.6% of those reads, respectively, contained deletions of greater than 500 nt, which made them potential spHBV variants. A total of 95 distinct spHBV variants were identified in the four samples studied. Twelve of these were present at multiple time points (Fig. 2), while the remaining variants were detected at one time point only (Fig. 3). Five of the distinct spHBV variants (5.2%) have been previously identified on the basis of their deletion breakpoints (37) and were present at all four time points (Fig. 2). Seven (7.4%) putative novel spHBV variants were also identified at multiple time points, and three of these were identified as putative double-spliced variants (Fig. 2). The majority (80%) of deletion breakpoints in the seven novel spHBV variants were located at known splice donor and acceptor sites but in combinations different from those previously reported (37) (Fig. 2). The remaining 83 (87.4%) spHBV variants could be detected in only one of the four samples studied (Fig. 3), and 12 of these contained double deletions with lengths of >500 nt each. Of the spHBV variants present in all four samples, the proportion with breakpoints located between known splice donor and acceptor motifs was 17% for single-deletion variants and 73% for double-deletion variants.

Notably, nine of the spHBV variants identified in the liver explant sample were also found in the posttransplant blood samples, with four still being present 15 years posttransplant (Fig. 2). The most common spliced variant present at T1 was the previously reported Sp1 (37), and known variant Sp3 dominated at T0 and T2 (15, 37) (Fig. 4). The largest number of distinct, novel, and known spHBV variants was observed at T0 (n = 81), and the smallest number was observed at T2 (n = 13) (Fig. 4). Interestingly, a more diverse population of novel spHBV variants was observed at T3, following a change in immunosuppressant therapy from cyclosporine to mycophenolate mofetil (MMF) that occurred between T2 and T3. All the spHBV variants identified carried the pregenomic RNA encapsidation signal sequence required for the viral RNA to be encapsidated.

FIG 2.

Distribution of spliced variants common to multiple time points. The spliced variants present at two or more time points relative to the four overlapping reading frames of the HBV genome (shown at the top) are shown. Each unique variant is represented with a different color. The spliced regions are represented as black dashed lines. Known spliced variants (indicated with Sp plus a number) and predicted spliced variants (indicated with pSp plus a number) are shown. Known donor and acceptor splicing sites are indicated by black diamonds above the horizontal axis, while putative splice sites are represented below the horizontal axis. Checkmarks, the variant was found at the indicated sampling time; # and *, spHBV variants that were also detected by use of the 454 Roche technology and population-based sequencing, respectively.

FIG 3.

Distribution of spliced variants unique to one time point. The spHBV variants are colored to indicate the time point that each unique variant was observed. The majority of spliced variants unique to one time point were identified in the liver explant sample at T0 (red). As in Fig. 2, the spliced regions are represented as black dashed lines, and known donor and acceptor splicing sites are indicated by black diamonds above the horizontal axis. The unmarked splice sites are putative sites. # and *, spHBV variants that were also detected by use of the 454 Roche technology and population-based sequencing, respectively.

FIG 4.

Diversity of spliced variants over the course of the infection. Colored sectors in the pie chart represent variants identified at at least two time points, while gray sectors represent variants unique to one time point. The color code is the same as that used to represent spHBV variants in Fig. 2. The majority of spliced variants present at any time point are known spliced variant Sp1 and/or Sp3.

Validation of single-molecule sequencing analysis.

To validate the spHBV variants detected via single-molecule deep sequencing, population sequencing was performed on 25 clones derived from the HBV population in the sample obtained at T3. The HBV sequences determined from the clones were then compared with the single-molecule sequencing data. Six of the 25 clones covered the full genome. These six clones carried pre-S deletions and were also matched to variants detected by single-molecule sequencing (Table 1). The remaining 19 clones (76%) presented large deletions, identifying 6 distinct spHBV variants, 4 of which were experimentally known and previously reported (Sp1 [7/19], Sp3 [8/19], Sp6 [1/19], and Sp11 [1/19]) (Fig. 2). The remaining two spHBV variants were novel, but only one was detected by single-molecule sequencing (Fig. 3). In total, single-molecule sequencing in combination with the novel bioinformatics tools identified 83% (5/6) of the spHBV variants that were identified via molecular cloning of the sample obtained at T3.

TABLE 1.

Length, position, and frequency of pre-S deletions across all samples

Protein and segment; amino acid length; amino acid position; frequency of deletion (% of isolates) in the following samples:
July 1991 (T0) (cov a = 1,638); Sept 2000 (T1) (cov = 1,313); Oct 2001 (T2) (cov = 1,833); July 2002 b (n c = 31); March 2004 b (n = 30); May 2005 (T3) (cov = 1,359); May 2005 b (T3) (n = 44)
Pre-S1
    a 15 54–68 72 14 23 100 98 100
    b 3 98–100 86 12 18 100 97 100
Pre-S2
    c 1 108 99 26 82 83 3
    d 5 109–113 60 5 8 100 97 100
    e 7 124–130 >99 93 99 100 100 88 100
a cov, number of reads aligned across the pre-S deletions.
b Data determined from clones (Sanger sequencing).
c n, number of clones sequenced.

To validate the bioinformatics algorithm, the viral amplicons from samples obtained at T1, T2, and T3 were also sequenced by use of the short-read 454 Roche technology. The average read lengths were 374, 384, and 543 nt for samples obtained at T1, T2, and T3, respectively, and the coverage was 48,158, 66,164, and 119,667 sequences, respectively. All the spHBV variants detected by the Roche 454 technology were also detected by use of the PacBio technology, and no other novel variants were identified. The percentages of reads containing deletions of greater than 500 nt were 1.2%, 4.1%, and 9.7%, respectively, which were significantly lower than those from single-molecule sequencing. Five unique spHBV variants were identified at T1 and T2 (Fig. 2), and seven were identified at T3 (Fig. 2 and 3). The unique spHBV variant that was identified by population-based sequencing (Fig. 3) was not identified by PacBio RS or by 454 Roche technology.

Pre-S deletions.

In order to assess the longitudinal changes of viral variants carrying deletions in the pre-S regions, the consensus sequences from the CCS reads were compared with the complete sequence of the HBV subtype D1 reference genome (GenBank accession number X02496.1). This analysis identified a total of five types of pre-S deletions, two of which were present at all time points (Table 1). Several of these deletions were also identified by molecular cloning of the HBV population in the sample obtained at T3 and two additional samples obtained between time points T2 and T3 (Table 1). The pre-S deletion variants detected by cloning and single-molecule sequencing were consistent in terms of both their size and frequency of occurrence.

Viral evolution, diversity, and antiviral treatments.

To assess the relationship between viral diversity and the patient's clinical history, the viral load, alanine aminotransferase (ALT) level, and distribution of single-nucleotide variants (SNVs) were measured and compared across time points. Shannon entropy (SE), a measure of viral diversity, across the genomes was also calculated. Overall, SE showed a sizable increase in viral diversity between T0 and T1 (Fig. 1). The sample obtained at T1 was collected 10 years posttransplant following reinfection of the new liver, and the patient had been treated with LMV for the 3 years prior to collection of the sample at T1. This increase in diversity was additionally supported by the increase in SE across all ORFs (Fig. 5A) and, more specifically, the increase in SE for the functional domains of the P ORF (Fig. 5B) between T0 and T1. This trend was more pronounced for nonsynonymous mutations. The drop in viral load between T1 and T2 was associated with a concomitant reduction in viral diversity. During this time period, the treatment regimen was ETV, which has a strong genetic barrier to resistance (25). Despite the decline in the total SE value between T1 and T2 (Fig. 1 and 5A), the number of SNVs reaching a frequency of occurrence in the population above 75% (defined from here on as a fixation event) continued to increase across all genomic regions (Fig. 5C and D). At T3, further fixation events occurred, corresponding to an increased SE and viral load, which coincided with a treatment change to ADV in 2002. Overall, the majority of the fixation mutations across all time points were nonsynonymous. The majority of the nonsynonymous mutations were present in the P ORF, which harbors the RT region containing the known drug resistance-associated variants. Of note, lower rates of diversity were observed in the X gene, which was less affected by the presence of large deletions.
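
The definition of a fixation event used above can be made concrete with a brief Python sketch. It is an illustration only: the function names and the SNV frequencies are hypothetical, and the strict inequality follows the wording "above 75%" (the legend to Fig. 5 states ≥75%, which differs only for a frequency of exactly 0.75).

```python
FIXATION_THRESHOLD = 0.75  # frequency above which an SNV is called fixed

def fixed_snvs(snv_freqs):
    """Return the SNVs whose population frequency is above the threshold.

    snv_freqs: dict mapping an SNV label to its frequency at one time point.
    """
    return {snv for snv, freq in snv_freqs.items() if freq > FIXATION_THRESHOLD}

def new_fixation_events(earlier, later):
    """SNVs that are fixed at the later time point but not at the earlier one."""
    return fixed_snvs(later) - fixed_snvs(earlier)

# Hypothetical frequencies at two consecutive time points.
t_early = {"snv_A": 0.78, "snv_B": 0.40}
t_late = {"snv_A": 0.01, "snv_B": 0.97, "snv_C": 0.75}
print(new_fixation_events(t_early, t_late))
```

Counting fixation events separately from entropy explains how the two measures can diverge: an SNV that sweeps from an intermediate frequency to near 100% adds a fixation event while lowering the entropy at that position.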

FIG 5.

Viral diversity over the course of the infection. (A and B) Shannon entropy values normalized for all ORFs within the HBV genome (A) and functional domains within the polymerase P ORF (B) are shown. The different colors represent distinct genomic regions. Solid lines, Shannon entropy values from the full distribution of SNVs; dashed lines, Shannon entropy values from the distribution of nonsynonymous (NonSyn) mutations only. (C and D) Mutations reaching fixation (≥75% in frequency) within all ORFs of the HBV genome (C) and within the P ORF (D). The functional domains within the P ORF are the terminal protein (tp), spacer (sp), reverse transcriptase (rt), and RNase H (rn).

To further address the rapid change in viral diversity between the time points, phylogenetic analysis was performed to assess the relationship between HBV variants across all time points (Fig. 6). This analysis was performed by utilizing the distribution of reconstructed HBV genomic variants obtained by single-molecule sequencing of the nearly full-length genome, the spHBV variants, as well as the population-based sequences of clones generated from the sample obtained at T3. This analysis showed distinct populations of HBV variants, with little overlap between the virus populations at different time points being detected. The highly diverse population identified at T0 (in the liver explant) was consistent with the results from the SNV and SE analyses described above. The HBV population from the liver explant evolved into an almost completely distinct population at each time point. At T1 (10 years posttransplant, following reinfection of the new liver), a distinct new population of viruses characterized by novel viral variants emerged, consistent with the increased genetic diversity estimated by the SE values (Fig. 1 and 5). Novel HBV variants were observed at T2, although they had a reduced overall genomic diversity. Similarly, at T3, the distribution of both complete genome and spHBV variants showed that they formed a new population, with the population-based sequences of the clones being genetically similar to those of the variants detected via single-molecule sequencing.

FIG 6.

Phylogenetic tree of the rapid evolution of HBV populations over a 15-year period. The phylogenetic tree (rooted with the consensus sequence from the liver explant sample T0) shows the diversity of the viral populations across the time points. Circles, full-genome variants; triangles, spliced variants. Different colors indicate the different time points. The location of an unrelated genotype D sequence is also shown (solid orange circle).

Co-occurrence of novel mutations with drug-resistant variants.

Known drug resistance-associated amino acid substitutions were observed in the viral variants detected by single-molecule sequencing and were found to evolve dynamically as the treatment regimens changed. At T1, mutations that result in the known antiviral resistance-associated amino acid substitutions rtL180M, rtT184T/S, and rtM204V in the RT sequence, corresponding to resistance to LMV and reduced susceptibility to ETV, were detected. At T2, viral variants with rtT184S were replaced by variants with rtT184G and rtS202I, resulting in ETV resistance. At T3, rtM204V was replaced by rtM204A, and the novel amino acid substitutions rtN236A/V were also detected. In the three blood samples, nonsynonymous mutations associated with drug resistance were found in five codons, and all these mutations became fixed over time (Table 2). Population-based sequencing as well as Roche 454 sequencing of the samples obtained at T1 to T3 identified all the SNVs associated with drug resistance at similar frequencies (Table 2).

TABLE 2.

Frequencies of known drug resistance- and vaccine escape-associated mutations across time points obtained with the PacBio and Roche 454 technologies

Amino acid change Drug Frequency of mutation (a)
T0 T1 T2 T3
rtI169T ETV 0.16 (0.13)
rtV173L LMV 0.15 (0.23)
rtL180M LMV 0.99 (0.99) 0.97 (0.99) 0.02 (0.05)
rtT184G ETV 0.97 (0.98) 0.97 (0.99)
rtT184S ETV 0.78 (0.74) 0.01 (0.01)
rtS202I ETV 0.93 (0.96) 0.96 (0.97)
rtM204V LMV 0.96 (0.97) 0.98 (0.99) 0.02 (0.05)
rtM204A LMV 0.01 (0.00) 0.96 (0.91)
rtN236A ADV 0.67 (0.56)
rtN236V ADV 0.22 (0.26)
S P120T (b) 1.00 (1.00) 0.95 (0.93) 1.00 (1.00) 1.00 (1.00)
(a) Mutation frequencies determined by use of the PacBio RS technology are shown, with the corresponding Roche 454 values in parentheses. Boldface indicates a frequency of mutation of >75%, i.e., a fixation event.

(b) Vaccine escape-associated substitution.
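The fixation rule in the table footnote and the agreement between the two platforms can be checked with a few lines of code. The sketch below is illustrative only: it copies a handful of paired values from Table 2 (PacBio first, Roche 454 second), leaves out the time-point columns because only the paired values are needed, and uses hypothetical function and variable names.

```python
# (substitution, PacBio frequency, Roche 454 frequency), copied from Table 2.
# Time-point columns are omitted; this sketch needs only the paired values.
TABLE2_PAIRS = [
    ("rtL180M", 0.99, 0.99),
    ("rtT184S", 0.78, 0.74),
    ("rtM204A", 0.96, 0.91),
    ("rtN236A", 0.67, 0.56),
    ("rtN236V", 0.22, 0.26),
]

FIXATION_THRESHOLD = 0.75  # table footnote: frequency > 75% counts as fixation

def is_fixed(freq, threshold=FIXATION_THRESHOLD):
    """True when a mutation frequency exceeds the fixation threshold."""
    return freq > threshold

def summarize(pairs):
    """Flag fixation per platform and report the absolute difference."""
    rows = []
    for name, pacbio, roche in pairs:
        rows.append({
            "substitution": name,
            "fixed_pacbio": is_fixed(pacbio),
            "fixed_454": is_fixed(roche),
            "abs_diff": round(abs(pacbio - roche), 2),
        })
    return rows

for row in summarize(TABLE2_PAIRS):
    print(row)
```

With a hard cutoff, values close to the threshold can be classified differently by the two platforms even when the measured frequencies are similar, as the rtT184S pair (0.78 versus 0.74) shows.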

Haplotype reconstruction was employed to identify the distribution of full-length viral variants and to study the rate of co-occurring drug resistance-associated mutations in the viral variants identified. This approach employed a statistical model for error correction of reads and subsequent reconstruction of viral variants across the full genome, thus allowing detection of the genomic population down to a frequency of 1% (see Materials and Methods). Analysis was focused on viral variants that had either a complete genome or pre-S deletions only and was confined to the nonsynonymous mutations in the P ORF, representing ∼80% of the genome. While known drug resistance-associated mutations were observed in the genomes of the viral variants at multiple time points, their combinations were unique at each time point. The amino acid change rtM204V (Table 2), which is known to be associated with LMV resistance and reduced replication efficiency (1, 8), was always detected with a co-occurring rtL180M change in samples collected at T1 and T2. This variant also co-occurred with rtV173L in samples collected at T1 only, confirming previous findings showing that rtL180M is a compensatory amino acid change that can restore the reduced replication efficiency observed in viral variants with a modified YMDD catalytic domain (9). Interestingly, a high proportion of the viral variants observed at T1 carried, in addition to rtL180M and rtM204V, the rtT184S mutation, which is associated with ETV resistance, although the patient had received only LMV monotherapy at this time point. Thus, the patient was primed for resistance when his treatment was switched to ETV monotherapy, and a viral load rebound was observed within a year of the LMV-to-ETV switch. At T2, the P ORF of all the viral variants had additional mutations resulting in multiple amino acid substitutions, rtL180M, rtT184G, rtS202I, and rtM204V.
It is noteworthy that the amino acid change from rtT184S to rtT184G requires an additional mutation within the same codon of the RT gene. Given that close to 100% of the viral variants observed at T2 possessed amino acid changes associated with LMV resistance, the brief switch from ETV monotherapy to LMV monotherapy resulted in no difference in the viral load. A subsequent switch from LMV monotherapy to ADV monotherapy initially resulted in a viral load reduction to ∼10^6 IU/ml, but the viral load rapidly rebounded to ∼10^7 IU/ml. At T3, although the patient had been on ADV monotherapy for 3 years, nearly all the viral variants had maintained the LMV and ETV resistance-associated mutations within their genomes. In addition, these viral variants further accumulated within their genomes novel mutations that may be associated with ADV resistance. At T3, nearly all viral variants had a complex combination of amino acid changes in the polymerase: rtT184G, rtS202I, rtM204A, and rtN236A/V (Table 2; Fig. 7) (38). It should be noted that the amino acid changes rtT184G, rtM204A, rtN236A, and rtN236V all require two codon mutations to occur. The classic ADV resistance-associated amino acid substitution rtN236T was not detected in any of the HBV isolates recovered from this patient at any time point.
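The statement that some of these substitutions require two nucleotide changes within a codon can be checked against the standard genetic code. The following sketch computes the smallest number of nucleotide differences between any codon for the original amino acid and any codon for the replacement. It is a lower bound only: the real count depends on the codon actually present in the patient's sequence, and constraints from the overlapping S ORF are not considered. The function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated with bases in TCAG order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_for(aa):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, a in CODON_TABLE.items() if a == aa]

def min_nucleotide_changes(aa_from, aa_to):
    """Fewest nucleotide differences between any pair of codons
    encoding aa_from and aa_to (a lower bound for a real sequence)."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )

# Substitutions named in the text, written as (label, from, to).
for label, src, dst in [
    ("rtT184G", "T", "G"),
    ("rtM204A", "M", "A"),
    ("rtN236A", "N", "A"),
    ("rtN236V", "N", "V"),
    ("rtM204V", "M", "V"),
    ("rtN236T", "N", "T"),
]:
    print(label, min_nucleotide_changes(src, dst))
```

Under the standard code, rtT184G, rtM204A, rtN236A, and rtN236V each need at least two nucleotide changes, whereas the classic substitutions rtM204V and rtN236T need only one.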

FIG 7.

Heat plot of amino acid substitutions co-occurring with drug resistance-associated (DR) combinations over time. Nonsynonymous mutations associated with drug-resistant haplotypes are presented by the amino acid position across the P ORF. The shade of blue represents the frequency of each substitution within each haplotype, where white indicates that the substitution is not present and dark blue indicates that the substitution is present in all variants. Each row represents a haplotype which carries specific combinations of known drug resistance-associated substitutions. Each row is labeled by time point and the drug-resistant variants present in each haplotype. The frequency of that haplotype within each time point is shown in parentheses (rows on the left-hand side). At T0, there were no drug-resistant SNVs present in any haplotypes. At T1, there were four haplotypes carrying different combinations of drug resistance-associated mutations, while there were only two at T2 and T3. The drug received prior to the sampling time point is indicated at the left. The x axis shows the amino acid positions within the terminal protein (tp), spacer (sp), reverse transcriptase/polymerase (rt), and RNase H (rn) regions of the P ORF. Known drug resistance-associated substitutions are shown in red.

Extension of the analysis to regions outside the RT domain resulted in the identification of several co-occurring nonsynonymous mutations in the P ORF in association with drug-resistant viral variants (Fig. 7). An interesting pattern consisting of amino acid changes at terminal protein codons 163 and 164 (tp163 and tp164, respectively) and spacer codon 117 (sp117) was observed in the viral variants detected at T1. Although these amino acid changes reverted to the wild-type sequence in the sample obtained at T2, they reappeared in the sample obtained at T3. It is possible that these mutation patterns are a reflection of the changes in treatment regimens over time. A vaccine escape-associated amino acid substitution, S P120T (39), located in a B-cell epitope of the HBV surface antigen (HBsAg), was detected in the viral variants obtained at all time points, including the variants obtained from the liver explant (Table 2). An additional substitution of interest was the surface gene nonsynonymous mutation that resulted in S C69* (data not shown). This amino acid change was observed at T1 (89%) and T3 (97%) but at a much lower rate (13%) at T2 and would result in the production of a truncated S protein, which has been associated with false-negative HBsAg detection in diagnostic serological assays (40). The corresponding change in the overlapping P ORF was rtS78T (Fig. 7).
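The kind of co-occurrence tabulation that underlies a plot such as Fig. 7 can be sketched from reconstructed haplotypes and their frequencies. The example below is a minimal illustration, not the study's pipeline: the substitution names come from the text, but the haplotype frequencies are invented and the function names are hypothetical.

```python
# Hypothetical haplotypes for one time point: (frequency, substitutions carried).
# The substitution names come from the text; the frequencies are invented.
HAPLOTYPES = [
    (0.60, {"rtL180M", "rtT184S", "rtM204V"}),
    (0.25, {"rtL180M", "rtM204V"}),
    (0.15, {"rtV173L", "rtL180M", "rtM204V"}),
]

def frequency_of(haplotypes, required):
    """Total frequency of haplotypes carrying every substitution in `required`."""
    required = set(required)
    return sum(freq for freq, subs in haplotypes if required <= subs)

def cooccurrence(haplotypes, given, other):
    """Frequency-weighted fraction of `given`-carrying haplotypes that
    also carry `other`; None when `given` is absent."""
    base = frequency_of(haplotypes, {given})
    if base == 0:
        return None
    return frequency_of(haplotypes, {given, other}) / base

print(cooccurrence(HAPLOTYPES, "rtM204V", "rtL180M"))
print(cooccurrence(HAPLOTYPES, "rtM204V", "rtT184S"))
```

Because each haplotype is a full-length reconstructed genome, the same tabulation extends to substitutions outside the RT domain (for example, tp163 or sp117) without any change to the code.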

DISCUSSION

In this work, single-molecule sequencing was employed to investigate the dynamics of HBV variants observed in a CHB patient who had had a liver transplant followed by complex treatment regimens and from whom samples were taken from the liver explant and blood over a 15-year period. The complex clinical history of this patient offered a possibility to study the long-term evolution of viral populations while they were undergoing different types of selection pressure. The application of single-molecule deep sequencing and novel bioinformatic analyses allowed the detection of both novel and previously known spHBV variants as well as coevolving drug-resistant variants in unprecedented detail and depth. This novel approach revealed a much higher level of genetic complexity in the viral population in an individual with CHB than has previously been possible using population-based or short-read next-generation sequencing protocols. Overall, three types of viral variants were observed in the four samples examined in this study: complete genome viral variants, pre-S deletion viral variants, and large-deletion viral variants (spHBV variants).

Novel spliced variants.

A greater diversity of spHBV genomes than that reported previously was identified (Fig. 2 and 3), with evidence of novel spliced variants characterized by novel splice donor and acceptor sites as well as novel combinations of known donor and acceptor motifs (37). Similar to the findings described in previous reports, the majority of the novel spliced variants identified here retained the X gene (Fig. 3) (6, 15), while approximately 50% retained the precore/core genes. The greatest diversity of spHBV variants was observed in the liver explant (obtained at T0), and evidence of reinfection and the ongoing evolution of these variants in the new liver over the 15 years of analysis was obtained. The viruses detected in the serum samples following liver transplantation most likely reflect variants newly generated following reinfection of the new liver and as a result of the selection pressure of both immunosuppression and subsequent sequential antiviral therapy.

The dynamics of the spHBV variants observed over the course of the infection could be explained by a genetic bottleneck event upon liver transplantation, with a small subset of the HBV population reestablishing infection in the new liver. It remains to be proven whether spHBV variants are themselves infectious and, if so, whether they establish covalently closed circular DNA upon infection of hepatocytes. Nonetheless, all the spHBV variants identified in this study had the encapsidation signal within their genomes, and, hence, their genomes could be packaged and reverse transcribed using polymerase and core proteins supplied in trans (15). Finally, the level of liver damage over the course of infection may be associated with positive selection of spHBV variants. This is supported by the increased diversity of spHBV variants observed concomitantly with flares in ALT levels (e.g., in the liver explant at T0 and at the final time point, T3), suggesting that splicing may be more frequent with the progression of liver disease. Indeed, population-based sequencing has shown that spliced variants are more common in patients with advanced liver disease (18, 19). These findings suggest that not only the abundance but also the diversity of spliced variants is influenced by disease status. In addition, the change in immunosuppressant therapy between time points T2 and T3 may have impacted the increase in diversity in the spHBV population observed between these two time points. Contrary to previous reports, our longitudinal analysis revealed that Sp1 is not always the dominant spliced variant (15, 17). This finding highlights the importance of deep sequencing and longitudinal sampling, showing that spHBV variants are more varied than currently appreciated. 
Standard population-based sequencing of variants in samples obtained at T3 supported the validity of detection of spHBV variants via computational analyses of sequences obtained by single-molecule deep sequencing, with only one clone sequence not being detected by single-molecule sequencing with the PacBio RS instrument. In the proposed approach, a conservative coverage threshold of 10 reads was employed to identify a designated spHBV variant to eliminate potential in silico artifacts mostly due to low numbers of technical errors in the reads obtained with the PacBio RS instrument. Future analysis could investigate the sensitivity of the novel bioinformatics work flow employed in this study.
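The read-support filter described above can be sketched as a simple count over per-read deletion patterns. This is an illustrative sketch under stated assumptions: the threshold is interpreted as "at least 10 reads", the junction coordinates are hypothetical placeholders, and the function name is not taken from the study's software.

```python
from collections import Counter

MIN_SUPPORTING_READS = 10  # conservative threshold stated in the text

def call_spliced_variants(read_junctions, min_reads=MIN_SUPPORTING_READS):
    """Keep junction patterns supported by at least `min_reads` reads.

    read_junctions: one entry per read; each entry is a tuple of
    (donor, acceptor) coordinate pairs describing the deletions in that read.
    """
    counts = Counter(read_junctions)
    return {pattern: n for pattern, n in counts.items() if n >= min_reads}

# Hypothetical reads: 12 with a single deletion, 3 with two deletions.
reads = [((100, 900),)] * 12 + [((100, 900), (1200, 1500))] * 3
print(call_spliced_variants(reads))
```

Patterns seen in only a few reads are discarded, which trades sensitivity for protection against artifacts; a truly rare variant below the threshold would be missed.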

Distribution and kinetics of pre-S deletions.

HBV variants with pre-S deletions were identified in both liver explant and blood samples. In general, these variants are often observed in patients with advanced disease, such as liver cirrhosis and HCC (12, 13). This is consistent with the findings of this study, as the subject evaluated here had end-stage liver disease prior to a liver transplantation. The variant with a pre-S2 deletion (amino acids [aa] 16 to 22) was present at all sampled time points at a prevalence of greater than 94%, suggesting that this viral variant was involved with reinfection of the new liver and thus may be transmissible, despite the lack of evidence of direct transmission to a new host (9, 41). Other pre-S2 deletion variants detected included those with a complete abolishment of the pre-S2 start codon. This viral variant would not be able to produce M protein, but infectivity would not be affected (42). Nonetheless, as the first 5 aa of the middle envelope protein are involved with capsid binding (43), deletion of pre-S2 aa 1 to 6 likely interferes with HBV assembly.

Evolution of drug-resistant variants.

Over the 15-year study period following liver transplantation, the subject underwent five distinct regimens of antiviral drugs and a change in immunosuppressive drugs. The antiviral regimens would predominantly affect HBV variants with full-length genomes or genomes with minor insertions/deletions and not spHBV variants. A viral response was observed following each treatment regimen change. Importantly, single-molecule sequencing demonstrated the rapid selection and fixation of nonsynonymous mutations that accumulated within the same viral genome as the treatment regimen changed. Such dynamic evolution is consistent with antiviral agents exerting a strong purifying selection across the HBV genomes.

The decreased diversification at T2 was due to long-term LMV monotherapy, which resulted in the selection of viral variants that possessed triple RT amino acid substitutions that were associated with resistance to both LMV and ETV (rtL180M, rtT184S, and rtM204V) (Fig. 7), and additional nonsynonymous mutations were not selected in these viral variants when therapy was switched to ETV monotherapy (9). Nonetheless, these viral variants did accumulate additional RT mutations that resulted in a double mutation at RT codon 184 (a change from rtT184S to rtT184G) and rtS202I, possibly to compensate for the higher fitness costs associated with ETV resistance, as previously noted (25). These data confirm that preexisting drug resistance-associated RT mutations in HBV variants can influence the response to future antiviral treatments.

The majority of the LMV resistance-associated mutations were maintained following the switch to ADV monotherapy. This observation is consistent with conformational changes of the polymerase and reduced sensitivity to ADV. To escape the influence of ADV, RT mutations resulting in the amino acid changes rtM204A and rtN236A/V (requiring two codon mutations), which have not previously been described in HBV isolates recovered from patients with LMV or ADV resistance, were further selected in these viral variants. In vitro phenotypic analysis of HBV clones whose genomes encoded rtM204A (44) demonstrated that they had reduced susceptibility to LMV compared to wild-type virus (which was susceptible to all antiviral agents) but were more susceptible than HBV isolates whose genomes encode the classic LMV resistance-associated changes, rtM204V/I. Although the study did not investigate sensitivity to ADV, this amino acid substitution has emerged during ADV therapy, thus raising the possibility that it could contribute to ADV resistance.

Several patterns of amino acid substitutions consistent with the existence of coevolving nonsynonymous mutations were observed in the four functional domains of the HBV polymerase, including several amino acid fixation events not known to be associated with drug resistance. However, accumulation concomitant with the emergence of known drug resistance-associated amino acid substitutions suggests a potential role in the life cycle of these drug-resistant viral variants. Strong associations between known drug resistance-associated mutations and other nucleotide variations found across the full HBV genome have previously been identified via Bayesian network analysis (21). Our analysis supports the findings of these studies, in that the development of drug resistance in HBV is a complex trait involving the coevolution of multiple sites throughout the HBV genome. It should also be noted that the majority of the nucleotide variations were observed as fixation events and, hence, have coevolved with known drug-resistant SNVs. This result precluded the possibility of identifying compensatory mutations that occur after the onset of drug-resistant SNVs.

In general, the biological role for viral populations is to create a pool of viruses that have a reservoir of mutations that enable viral adaptation to changing environments, such as drug selection pressure. Changes in selection pressure, in this instance, sequential antiviral therapy, select for different dominant viral quasispecies. The mutation rate, viral population dynamics, and selection pressures all affect the evolution of the overall viral quasispecies. In addition, the preexistence of mutations may place constraints on the selection of subsequent mutations in the presence of new antiviral therapies. Finally, this study showed that the majority of spHBV variants lacked the highly antigenic S regions as well as the RT region, which is consistent with the hypothesis of an advantage in terms of escape from strong immune pressures as well as treatment.

Limitations and future application of single-molecule sequencing.

Although only one subject was analyzed in this study, the application of novel single-molecule sequencing along with detailed clinical data and specialized bioinformatics methods revealed novel insights into the complex evolution of HBV infection. The population-based sequencing results were consistent with the distribution of the more common spHBV variants identified by single-molecule deep sequencing, with several known and novel variants being identified with both approaches at the same sampling time points. In this study, PCR amplification was required to ensure that minor variants were detected via deep sequencing; hence, the rare novel variants detected via single-molecule sequencing may be affected by primer-specific amplification bias. However, the positions of the primers utilized for this experiment were within very conserved regions of the HBV genome. A limitation of any next-generation sequencing approach is the possible preferential amplification of shorter HBV genomes during the PCR, as there is a size selection bias toward shorter genomes (45). The high degree of diversification observed across the full genome may also be useful in directing future work investigating the effect of new drug treatment regimens. However, experimental validation of the novel spHBV variants identified is required.

Conclusions.

This study showed that deep sequencing analyses of full-length HBV genomes reveal accurate evolutionary pathways that shape the complex dynamics of CHB. In this first study to apply single-molecule sequencing to full-length HBV genomes, it has been shown that the population of spliced variants is much more diverse than that previously reported. Longitudinal analysis revealed that the diversity of spHBV variants changes considerably over time. Future studies should be directed to the longitudinal analysis of spliced variants and their effect on immune escape and the evolution of drug resistance. The identification of mutations outside RT which co-occur with known drug resistance-associated mutations highlights the relevance of using full-genome sequencing and supports the hypothesis that drug resistance involves epistatic interactions across the full-length HBV genome. The proposed approach could be utilized in future studies and with larger cohorts to investigate the between-patient variation in the distribution of HBV variants and their clinical relevance in determining disease outcome. Finally, the new open-source bioinformatics tools are freely available and have the capacity to easily identify potential spliced variants from deep sequencing data.

REFERENCES

1. Trépo C, Chan HLY, Lok A. 2014. Hepatitis B virus infection. Lancet 384:2053–2063. doi: 10.1016/S0140-6736(14)60220-8.
2. Lavanchy D. 2004. Hepatitis B virus epidemiology, disease burden, treatment, and current and emerging prevention and control measures. J Viral Hepat 11:97–107. doi: 10.1046/j.1365-2893.2003.00487.x.
3. Locarnini S, Zoulim F. 2010. Molecular genetics of HBV infection. Antivir Ther 15:3–14. doi: 10.3851/IMP1619.
4. Kramvis A. 2014. Genotypes and genetic variability of hepatitis B virus. Intervirology 57:141–150. doi: 10.1159/000360947.
5. McMahon BJ. 2009. The influence of hepatitis B virus genotype and subgenotype on the natural history of chronic hepatitis B. Hepatol Int 3:334–342. doi: 10.1007/s12072-008-9112-z.
6. Rodriguez-Frias F, Buti M, Tabernero D, Homs M. 2013. Quasispecies structure, cornerstone of hepatitis B virus infection: mass sequencing approach. World J Gastroenterol 19:6995–7023. doi: 10.3748/wjg.v19.i41.6995.
7. Khudyakov Y. 2010. Coevolution and HBV drug resistance. Antivir Ther 15:505–515. doi: 10.3851/IMP1515.
8. Bartholomeusz A, Locarnini SA. 2006. Antiviral drug resistance: clinical consequences and molecular aspects. Semin Liver Dis 26:162–170. doi: 10.1055/s-2006-939758.
9. Zoulim F, Locarnini S. 2009. Hepatitis B virus resistance to nucleos(t)ide analogues. Gastroenterology 137:1593–1608.e1–e2. doi: 10.1053/j.gastro.2009.08.063.
10. Huy TT-T, Ushijima H, Win KM, Luengrojanakul P, Shrestha PK, Zhong Z-H, Smirnov AV, Taltavull TC, Sata T, Abe K. 2003. High prevalence of hepatitis B virus pre-S mutant in countries where it is endemic and its relationship with genotype and chronicity. J Clin Microbiol 41:5449–5455. doi: 10.1128/JCM.41.12.5449-5455.2003.
11. Gao ZY, Li T, Wang J, Du JM, Li YJ, Li J, Lu FM, Zhuang H. 2007. Mutations in preS genes of genotype C hepatitis B virus in patients with chronic hepatitis B and hepatocellular carcinoma. J Gastroenterol 42:761–768. doi: 10.1007/s00535-007-2085-1.
12. Liu S, Zhang H, Gu C, Yin J, He Y, Xie J, Cao G. 2009. Associations between hepatitis B virus mutations and the risk of hepatocellular carcinoma: a meta-analysis. J Natl Cancer Inst 101:1066–1082. doi: 10.1093/jnci/djp180.
13. Choi MS, Kim DY, Lee DH, Lee JH, Koh KC, Paik SW, Rhee JC, Yoo BC. 2007. Clinical significance of pre-S mutations in patients with genotype C hepatitis B virus infection. J Viral Hepat 14:161–168. doi: 10.1111/j.1365-2893.2006.00784.x.
14. Warner N, Locarnini S. 2008. The antiviral drug selected hepatitis B virus rtA181T/sW172* mutant has a dominant negative secretion defect and alters the typical profile of viral rebound. Hepatology 48:88–98. doi: 10.1002/hep.22295.
15. Günther S, Sommer G, Iwanska A, Will H. 1997. Heterogeneity and common features of defective hepatitis B virus genomes derived from spliced pregenomic RNA. Virology 238:363–371. doi: 10.1006/viro.1997.8863.
16. Ma Z-M, Lin X, Wang Y-X, Tian X-C, Xie Y-H, Wen Y-M. 2009. A double-spliced defective hepatitis B virus genome derived from hepatocellular carcinoma tissue enhanced replication of full-length virus. J Med Virol 81:230–237. doi: 10.1002/jmv.21393.
17. Abraham TM, Lewellyn EB, Haines KM, Loeb DD. 2008. Characterization of the contribution of spliced RNAs of hepatitis B virus to DNA synthesis in transfected cultures of Huh7 and HepG2 cells. Virology 379:30–37. doi: 10.1016/j.virol.2008.06.021.
18. Soussan P, Pol J, Garreau F, Schneider V, Le Pendeven C, Nalpas B, Lacombe K, Bonnard P, Pol S, Kremsdorf D. 2008. Expression of defective hepatitis B virus particles derived from singly spliced RNA is related to liver disease. J Infect Dis 198:218–225. doi: 10.1086/589623.
19. Soussan P, Tuveri R, Nalpas B, Garreau F, Zavala F, Masson A, Pol S, Brechot C, Kremsdorf D. 2003. The expression of hepatitis B spliced protein (HBSP) encoded by a spliced hepatitis B virus RNA is associated with viral replication and liver fibrosis. J Hepatol 38:343–348.
20. Bayliss J, Lim L, Thompson AJV, Desmond P, Angus P, Locarnini S, Revill PA. 2013. Hepatitis B virus splicing is enhanced prior to development of hepatocellular carcinoma. J Hepatol 59:1022–1028. doi: 10.1016/j.jhep.2013.06.018.
21. Thai H, Campo DS, Lara J, Dimitrova Z, Ramachandran S, Xia G, Ganova-Raeva L, Teo C-G, Lok A, Khudyakov Y. 2012. Convergence and coevolution of hepatitis B virus drug resistance. Nat Commun 3:789. doi: 10.1038/ncomms1794.
22. Barzon L, Lavezzo E, Costanzi G, Franchin E, Toppo S, Palù G. 2013. Next-generation sequencing technologies in diagnostic virology. J Clin Virol 58:346–350. doi: 10.1016/j.jcv.2013.03.003.
23. Beerenwinkel N, Guenthard HF, Roth V, Metzner KJ. 2012. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data. Front Microbiol 3:329. doi: 10.3389/fmicb.2012.00329.
24. Rhoads A, Au KF. 2015. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics 13:278–289. doi: 10.1016/j.gpb.2015.08.002.
25. Tenney DJ, Levine SM, Rose RE, Walsh AW, Weinheimer SP, Discotto L, Plym M, Pokornowski K, Yu CF, Angus P, Ayres A, Bartholomeusz A, Sievert W, Thompson G, Warner N, Locarnini S, Colonno RJ. 2004. Clinical emergence of entecavir-resistant hepatitis B virus requires additional substitutions in virus already resistant to lamivudine. Antimicrob Agents Chemother 48:3498–3507. doi: 10.1128/AAC.48.9.3498-3507.2004.
26. Günther S, Li BC, Miska S, Krüger DH, Meisel H, Will H. 1995. A novel method for efficient amplification of whole hepatitis B virus genomes permits rapid functional analysis and reveals deletion mutants in immunosuppressed patients. J Virol 69:5437–5444.
27. Ayres A, Locarnini S, Bartholomeusz A. 2004. HBV genotyping and analysis for unique mutations. Methods Mol Med 95:125–149.
28. Bull RA, Luciani F, McElroy K, Gaudieri S, Pham ST, Chopra A, Cameron B, Maher L, Dore GJ, White PA, Lloyd AR. 2011. Sequential bottlenecks drive viral evolution in early acute hepatitis C virus infection. PLoS Pathog 7:e1002243. doi: 10.1371/journal.ppat.1002243.
29. Yang X, Charlebois P, Gnerre S, Coole MG, Lennon NJ, Levin JZ, Qu J, Ryan EM, Zody MC, Henn MR. 2012. De novo assembly of highly diverse viral populations. BMC Genomics 13:475. doi: 10.1186/1471-2164-13-475.
30. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760. doi: 10.1093/bioinformatics/btp324.
31. Wilm A, Aw PPK, Bertrand D, Yeo GHT, Ong SH, Wong CH, Khor CC, Petric R, Hibberd ML, Nagarajan N. 2012. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res 40:11189–11201. doi: 10.1093/nar/gks918.
32. R Core Team. 2015. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
33. Zagordi O, Bhattacharya A, Eriksson N, Beerenwinkel N. 2011. ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing data. BMC Bioinformatics 12:119. doi: 10.1186/1471-2105-12-119.
34. Kumar S, Stecher G, Tamura K. 22 March 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol doi: 10.1093/molbev/msw054.
35. Hall BG. 2013. Building phylogenetic trees from molecular data with MEGA. Mol Biol Evol 30:1229–1235. doi: 10.1093/molbev/mst012.
36. Leung P, Bull R, Lloyd A, Luciani F. 2014. A bioinformatics pipeline for the analyses of viral escape dynamics and host immune responses during an infection. Biomed Res Int 2014:264519. doi: 10.1155/2014/264519.
37. Sommer G, Heise T. 2008. Posttranscriptional control of HBV gene expression. Front Biosci 13:5533–5547.
38. Liu Y, Li X, Xin S, Xu Z, Chen R, Yang J, Liu L, Wong VW-S, Yang D, Chan HL-Y, Xu D. 2015. The rtA181S mutation of hepatitis B virus primarily confers resistance to adefovir dipivoxil. J Viral Hepat 22:328–334. doi: 10.1111/jvh.12298.
39. Torresi J, Earnest-Silveira L, Civitico G, Walters TE, Lewin SR, Fyfe J, Locarnini SA, Manns M, Trautwein C, Bock TC. 2002. Restoration of replication phenotype of lamivudine-resistant hepatitis B virus mutants by compensatory changes in the “fingers” subdomain of the viral polymerase selected as a consequence of mutations in the overlapping S gene. Virology 299:88–99. doi: 10.1006/viro.2002.1448.
40. Pourkarim MR, Sharifi Z, Soleimani A, Amini-Bavil-Olyaee S, Fakhr AE, Sijmons S, Vercauteren J, Karimi G, Lemey P, Maes P, Alavian SM, Van Ranst M. 2014. Evolutionary analysis of HBV “S” antigen genetic diversity in Iranian blood donors: a nationwide study. J Med Virol 86:144–155. doi: 10.1002/jmv.23798.
41. Cao G-W. 2009. Clinical relevance and public health significance of hepatitis B virus genomic variations. World J Gastroenterol 15:5761–5769. doi: 10.3748/wjg.15.5761.
42. Ni Y, Sonnabend J, Seitz S, Urban S. 2010. The pre-S2 domain of the hepatitis B virus is dispensable for infectivity but serves a spacer function for L-protein-connected virus assembly. J Virol 84:3879–3888. doi: 10.1128/JVI.02528-09.
43. Urban S, Bartenschlager R, Kubitz R, Zoulim F. 2014. Strategies to inhibit entry of HBV and HDV into hepatocytes. Gastroenterology 147:48–64. doi: 10.1053/j.gastro.2014.04.030.
44. Ono-Nita SK, Kato N, Shiratori Y, Masaki T, Lan KH, Carrilho FJ, Omata M. 1999. YMDD motif in hepatitis B virus DNA polymerase influences on replication and lamivudine resistance: a study by in vitro full-length viral DNA transfection. Hepatology 29:939–945. doi: 10.1002/hep.510290340.
45. Head SR, Komori HK, LaMere SA, Whisenant T, Van Nieuwerburgh F, Salomon DR, Ordoukhanian P. 2014. Library construction for next-generation sequencing: overviews and challenges. Biotechniques 56:61–64, 66, 68. doi: 10.2144/000114133.

Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
