Abstract
To identify predictors of early hepatocellular carcinoma (HCC) based on the dynamic changes of hepatitis B virus (HBV) quasispecies, this study used next-generation sequencing (NGS) and high-order multiplex droplet digital PCR (ddPCR) to examine HBV quasispecies in the serum of 247 subjects recruited from an area with a high incidence of HCC. In the discovery stage, 15 non-synonymous single-nucleotide polymorphisms (SNPs) with a higher variant proportion in the HCC case group were found (all P<0.05). Furthermore, the variant proportions of some of these SNPs changed in a regular pattern within the 5 years before the onset of HCC; 5 of them were located in HBX, 2 in HBS and 2 in HBC. The predominant HBV quasispecies and their consensus sequences were identified by genetic evolution analysis. High quasispecies heterogeneity in HBS and HBC was associated with the formation of multifarious quasispecies clones, whereas the HBX gene had the highest proportion of predominant quasispecies (46.7 % in HBX vs 12.7 % and 13.8 % in HBS and HBC respectively), with the key variations G1512A, A1630G, T1753C/G/A, A1762T and G1764A. In the validation stage, we confirmed that the double mutations G1512A+A1630G and A1762T+G1764A and the triple mutation T1753C/G/A+A1762T+G1764A were all more frequent in early HCC cases than in the control group (all P<0.05). We also demonstrated the advantages of ddPCR for detecting multiple variations in large samples for early HCC surveillance and screening. We therefore propose that the dynamics of the key HBV variation positions and their combinations, as determined by the quasispecies analysis in this study, can serve as novel predictors of early hepatocarcinoma and are suitable for wide application in HCC screening.
Keywords: Hepatitis B virus, Hepatocellular carcinoma, Single-nucleotide polymorphism, Quasispecies
1. Introduction
Chronic hepatitis B virus (HBV) infection remains a serious public health problem. Its adverse sequelae, such as liver cirrhosis, liver failure, and hepatocellular carcinoma (HCC), resulted in 887 000 deaths in 2022 (Jeng et al., 2023). An estimated 257 million people were living with chronic HBV infection globally, and 75 % of them resided in the Asia-Pacific region (Liaw et al., 2012). China is the country most affected by HBV infection and HCC. According to the International Cancer Center of the World Health Organization, new liver cancer cases and deaths in China were estimated to account for about 50 % of the worldwide total (Siegel et al., 2016). Moreover, about two-thirds of HCC patients in China had reached the middle or advanced stage at diagnosis, so they missed the opportunity for surgical treatment and had a poor prognosis. Even when these patients received comprehensive treatment, their 5-year survival rate was lower than 10 % (Schmidt et al., 2011); in contrast, the 5-year survival rate of early-stage HCC could reach 60 %∼80 % (Carr, 2004; Nathan et al., 2009). Therefore, screening and surveillance of the high-risk population for early diagnosis and early treatment of HCC is very important, and is an effective preventive and therapeutic measure to reduce mortality and increase the survival rate of HCC.
Since HBV infection accounts for up to 60 % of HCC in developing countries (El-Serag, 2012), and for more than 90 % in some areas of China, screening for liver cancer among HBV-infected people is more cost-effective than screening the general population (Bruix and Sherman, 2005). In China, the current screening strategy is to test serum alpha-fetoprotein (AFP) and perform liver ultrasonography every six months in hepatitis B surface antigen (HBsAg)-positive men older than 40 years and HBsAg-positive women older than 50 years. However, China has the highest burden of HBV infection in the world, with an estimated 86 million HBsAg carriers, including 32 million chronic hepatitis B patients, making up more than one-third of all chronic HBsAg carriers worldwide (Global prevalence, treatment, and prevention of hepatitis B virus infection in 2016: a modelling study, 2018). The HCC screening project we implemented in rural areas with a high incidence of liver cancer in Guangxi, China also showed a high HBsAg carrier rate of 17.83 % among the people who took part in the screening (20.21 % in males and 13.18 % in females) (Zhang et al., 2006). Moreover, HBV-infected persons in this region rarely receive antiviral treatment for economic reasons. In short, the high HBsAg-positive rate makes screening expensive and labor-intensive, so the current liver cancer screening program is difficult to apply widely. Therefore, in addition to HBsAg, more non-invasive markers for risk stratification are needed to identify patients at high risk of HCC for early diagnosis and treatment.
It has been recommended that certain HBV genome variation positions be evaluated for inclusion in personal risk prediction for HCC (Ringelhan et al., 2015). The epidemiological characteristics, clinical manifestations and prognosis after infection have been shown to differ among HBV genotypes, and the HBV genome is also heterogeneous among HBV-infected persons. However, it remains unclear how the DNA sequences of different copies of the HBV genome differ within the same HBV-infected person and what the significance of these differences is, especially in relation to the duration of HBV infection and the progression of the disease. The polymerase encoded by the P gene of HBV has reverse transcriptase activity: in the HBV life cycle, pre-genomic RNA is synthesized and then used as a template to copy the genomic sequence of the progeny HBV through reverse transcription, and the reverse transcriptase lacks 3′–5′ proofreading function. Theoretically, the HBV genome of the progeny should therefore show some variation compared with the parental template. According to this theory, the serum of each HBV-infected person contains a large number of HBV copies whose DNA sequences differ from one another, thus forming an HBV quasispecies (Blum, 1993; Ngui and Teo, 1997).
For quasispecies testing, clone-based sequencing (CBS) is the main classical method. The main steps of CBS are viral genome extraction, PCR amplification of target genes, cloning of PCR products, identification of positive recombinant clones and Sanger sequencing. However, the CBS method has complicated experimental procedures, and the number of clones selected from each sample is limited (Chevaliez et al., 2012; Beerenwinkel and Zagordi, 2011), which causes selection bias and cannot fully represent the genetic variation of the whole virus population (Forns et al., 1997). With the advent of next-generation sequencing (NGS) and its "sequencing-by-synthesis" approach, applying it to the detection of quasispecies has become feasible. NGS can detect low-abundance mutations present at < 1 %, so its sensitivity is higher than that of CBS. Meanwhile, NGS can produce hundreds of sequences per sample, which may cover the entire quasispecies (including dominant and low-abundance strains), so it samples a much larger virus population than CBS and can be used to evaluate viral diversity while avoiding the bias caused by a limited number of clones. These advantages of NGS have been confirmed by several studies (Yousif et al., 2014; Ramírez et al., 2013; Mese et al., 2013; Margeridon-Thermet et al., 2013; Gong et al., 2013) and by our previous HBX gene quasispecies analysis in one HCC case (Mei et al., 2019). However, when we applied NGS to quasispecies analysis of the whole HBV genome in many more subjects for early HCC surveillance and screening, the disadvantages of NGS became prominent, such as low accessibility, high cost, and the need for highly specialized expertise to analyze massive amounts of raw data.
Additionally, the numerous mutations previously found in the HBX gene, including A1630G, T1753, and the region spanning from 1753 to 1764 (Mei et al., 2019), were difficult to detect in one reaction tube by traditional PCR. It is therefore necessary to focus on the key variations related to HCC and to explore new experimental techniques suitable for wide application in HCC screening. Recently, droplet digital PCR (ddPCR), with advantages including high detection accuracy, elevated detection rates for low-copy mutations, and the ability to detect multiple mutations simultaneously in a single reaction, has gradually been accepted as a new choice for precise mutation detection (Zhang et al., 2015; Link-Lenczowska et al., 2018).
In this study, we aimed to identify the HBV variation positions and the potentially carcinogenic sequences of HBV quasispecies correlated with the progression of HCC. First, the dynamic changes in the nucleotide sequences of HBV quasispecies during the 5 years before HCC onset were detected and analyzed by NGS. To shed light on the association between HCC and the variants and dominant quasispecies sequences recognized by NGS, we then used ddPCR to validate the differential expression of HBX mutant positions in HCC cases and high-risk HCC populations. Through this study, we hope to provide new criteria for selecting the high-risk population, based on detection of the variation positions and predominant quasispecies of HBV, for the early diagnosis and early treatment of HCC.
2. Materials and methods
2.1. Subjects
Subjects of this study were recruited from a cohort of 3500 chronic HBV carriers enrolled in the hyperendemic areas of HCC in Southwest China. They were aged 30 to 65 years and had positive serum HBsAg but no history of cirrhosis or antiviral therapy during follow-up, which was performed every six months.
In the discovery stage, five cases from the cohort who eventually developed early HCC and two controls without HCC or other malignant tumors were selected randomly as subjects. The diagnosis of HCC depended on positive histopathological findings, or on an imaging indication (including abdominal ultrasonography, angiography, CT and MRI examination) plus a serum AFP concentration higher than 400 μg/L or an increase in concentration of more than 50 μg/L (Yang et al., 2011). Referring to the TNM staging criteria of the Union for International Cancer Control for HCC, a single nodule with a diameter of less than two centimeters, or multiple nodules with a total diameter of less than two centimeters, was defined as early HCC (Fishman and Branch, 2009). A total of 40 sera from these 7 study subjects were collected successively during the 5-year follow-up period before the diagnosis of HCC, and the HBV quasispecies in serum were detected by NGS and analyzed to discover dynamic HBV variation positions.
In the validation stage, 83 HCC patients from the same cohort were recruited as the case group and 157 non-HCC subjects as the control group. The inclusion and exclusion criteria for subjects in the validation set were exactly the same as those in the discovery set. Serum samples from these 240 subjects were collected and tested by ddPCR to validate the variation positions found in the discovery set.
This study was approved by the Ethics Committee of Guangxi Medical University Cancer Hospital (NO. KY2020041), and informed consent was obtained from all subjects.
2.2. Measured parameters of HBV infection and hepatic lesion
The biomarkers of HBV infection, including HBsAg and HBeAg in serum, were detected with enzyme-linked immunosorbent assay (ELISA) kits (Rong Sheng Biotech, Shanghai, China). HBV DNA levels were tested with the PCR-Fluorescence Quantification Kit for Hepatitis B Virus (Da An Gene Co, Ltd, of Sun Yat-sen University, Guangzhou, China), which has a lower detection limit of 100 IU/mL. ALT was measured by the Reitman-Frankel method and AFP by ELISA (Biocell, Zhengzhou, China) to estimate the degree of liver damage and the hepatic lesions of the subjects. All experimental procedures were performed according to the manufacturers' instructions.
2.3. Extraction of HBV DNA
The QIAamp MinElute Virus Spin Kit (Qiagen, Hilden, Germany) was used to extract HBV DNA from the serum samples of all subjects according to the manufacturer's instructions. The extracted HBV DNA was stored at −80 °C for subsequent detection of HBV genotype and HBV quasispecies and for validation of the variation positions.
2.4. PCR amplification of HBS, HBC and HBX genes
The full-length HBV X gene (HBX) and segments of the HBV C (HBC) and HBV S (HBS) genes in the serum samples of all subjects were amplified by nested PCR with HotStarTaq Plus DNA Polymerase (Qiagen, Hilden, Germany). The 100 µl reaction system consisted of 10 µl of 10 × CoralLoad PCR Buffer, 2 µl of dNTP mix (10 mM of each), 5 µl each of forward and reverse primer solutions (10 µM), 0.5 µl of HotStarTaq Plus DNA Polymerase, 10 µl of HBV DNA and 67.5 µl of ddH2O. The PCR procedure was as follows: pre-denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 50∼68 °C for 1 min and extension at 72 °C for 1 min, and a final extension at 72 °C for 10 min. The PCR products were separated by 1 % agarose gel electrophoresis and analyzed with a gel imaging and analysis system (Bio-Rad, USA). The PCR primers and product lengths for the HBS, HBC and HBX genes are shown in Supplementary Materials Table S1.
2.5. Next generation sequencing and data processing
The amplified HBS, HBC and HBX gene fragments derived from the discovery set were purified and sent to the Beijing Genomics Institute for quasispecies detection by NGS (Illumina MiSeq PE250, San Diego, California). First, at least 3 µg of genomic DNA was fragmented randomly, with most fragments about 170 bp in size. DNA end repair was then performed, and a single 'A' nucleotide was added to the 3′ ends of the blunt fragments. Next, adapters were ligated and additional sequences were added by tailed primers during PCR to construct the library. Subsequently, the constructed libraries were hybridized with virus probes to enrich the target fragments. After elution, PCR amplification was performed to construct the hybridized library. Finally, sequencing was conducted to obtain raw data for each sample.
For data processing, reads with too many null bases (> 10 %) or low base quality (> 50 % of bases with base quality < 5) were discarded from the raw data to form the “clean data”. Sequencing reads were aligned to the reference genome sequence using the Burrows-Wheeler Alignment Tool (BWA 0.6), with the HBV genome (NC_003977.1) as the reference for mapping. Picard (version 1.141) was used to mark duplicated reads (redundant information produced by PCR). After that, the alignment results were combined into one BAM file, which is the compressed form of a SAM file. Overlapping sequence fragments in the BAM file were assembled with FLASH v1.2.11, with the mismatch density kept below 0.1. In this way, the nucleotide sequence data of HBV quasispecies colonies from 40 serum samples, covering the full-length HBX gene and segments of the HBC and HBS genes, were obtained and used for further single-nucleotide polymorphism (SNP) analysis and quasispecies analysis.
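The read-filtering rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline actually used by the sequencing provider: the function name `is_clean_read`, the Phred+33 quality encoding and the example reads are hypothetical.

```python
def is_clean_read(seq: str, qual: str,
                  max_n_frac: float = 0.10,    # discard if > 10 % null (N) bases
                  min_q: int = 5,              # a base counts as low quality if Q < 5
                  max_lowq_frac: float = 0.50, # discard if > 50 % low-quality bases
                  phred_offset: int = 33) -> bool:  # assumed Phred+33 encoding
    """Return True if a read passes both filters described in the text."""
    if not seq or len(seq) != len(qual):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_q = sum(1 for ch in qual if ord(ch) - phred_offset < min_q)
    low_q_frac = low_q / len(qual)
    return n_frac <= max_n_frac and low_q_frac <= max_lowq_frac


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
reads = [
    ("ACGTACGTAC", "IIIIIIIIII"),  # clean
    ("NNACGTACGT", "IIIIIIIIII"),  # 20 % N bases, discarded
    ("ACGTACGTAC", "######IIII"),  # 60 % low-quality bases, discarded
]
clean = [r for r in reads if is_clean_read(*r)]
```

Both thresholds are exposed as parameters so that the sketch can be adapted if the cut-offs differ.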
2.6. The detection and analysis of HBV genotype
HBV is classified into eight genotypes (A, B, C, D, E, F, G, and H) based on nucleotide sequence heterogeneity of >4 % in the S gene region. In this study, the HBS sequence of each sample was obtained by Sanger sequencing, and the HBV genotype was identified using the NCBI HBV genotyping tool (https://www.ncbi.nlm.nih.gov/projects/genotyping).
2.7. SNPs analysis of HBV quasispecies colonies
The Genome Analysis Toolkit (GATK 3.1) was used to detect SNPs in the HBX, HBC and HBS gene quasispecies. For each sequenced sample, the genotype with the highest probability at each locus was identified, and the consensus sequence of the sample was assembled and saved in CNS format (a kind of export file of Microsoft). Using the consensus sequence, the loci that were polymorphic between the identified genotype and the reference were filtered and highlighted to constitute the high-confidence SNP dataset. The dataset was saved as a tab-separated text file.
2.8. The quasispecies heterogeneity analysis
The heterogeneity of the quasispecies, including complexity and diversity, was evaluated at the nucleotide and amino acid levels. Quasispecies complexity was expressed as the Shannon entropy (Sn) (Fishman and Branch, 2009), which reflects the proportions of the different virus quasispecies. Sn ranges from 0 to 1; 0 means that all sequences are identical and 1 means that each sequence is unique. Sn was calculated as Sn = −∑i (pi ln pi) / ln N, where pi is the frequency of each sequence in the virus population and N is the total clone number. The diversity of quasispecies was described by the average genetic distance (d, also known as the Hamming distance), the number of synonymous substitutions per site (ds) and the number of non-synonymous substitutions per site (dn). Diversity was calculated with Molecular Evolutionary Genetics Analysis 7.0 (MEGA 7.0) software (Zhang et al., 2016) on 300 sequences selected randomly from each sample; d was calculated under the Kimura 2-Parameter model, and ds and dn under the Jukes-Cantor model.
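As a rough illustration of these two measures, the sketch below computes the normalized Shannon entropy Sn from the formula above and an uncorrected mean pairwise distance. It is not the MEGA 7.0 procedure: the distance here is a plain proportion of differing sites (p-distance) with no Kimura 2-Parameter or Jukes-Cantor correction, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def normalized_shannon_entropy(sequences):
    """Sn = -sum(p_i * ln p_i) / ln N, with N the number of sequences."""
    n = len(sequences)
    if n < 2:
        return 0.0
    counts = Counter(sequences)
    # p * ln(1/p) is the same term as -p * ln(p), written to stay non-negative.
    h = sum((c / n) * math.log(n / c) for c in counts.values())
    return h / math.log(n)


def mean_pairwise_p_distance(sequences):
    """Average proportion of differing sites over all sequence pairs
    (uncorrected; aligned, equal-length sequences are assumed)."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        if len(a) != len(b):
            raise ValueError("sequences must be aligned to the same length")
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Toy population: four identical clones give Sn = 0, four unique clones give Sn = 1.
identical = ["ACGT"] * 4
unique = ["ACGT", "ACGA", "ACTT", "TCGT"]
```

Dividing by ln N is what bounds Sn between 0 and 1 regardless of how many sequences are sampled.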
2.9. Analysis of HBV predominant quasispecies
In MEGA7.0, the Neighbor-Joining method (Jukes-Cantor model) was used to construct a phylogenetic tree of the 300 sequences in order to evaluate the genetic similarity of the HBV variants, that is, the HBV quasispecies, and to find the predominant quasispecies sequence, defined as the one with the largest proportion. The predominant quasispecies sequences of cases and controls in the discovery stage were then compared to reveal the relationship of the predominant HBV quasispecies and key variation positions with early HCC.
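The idea of a predominant quasispecies can be illustrated with a simple count of identical sequences. This is a simplification and an assumption on our part: the analysis above groups sequences through a phylogenetic tree, whereas the hypothetical helper `predominant_quasispecies` below only treats exactly identical sequences as one quasispecies.

```python
from collections import Counter


def predominant_quasispecies(sequences, top=3):
    """Return the `top` most frequent sequences with their proportions,
    most frequent first."""
    if not sequences:
        return []
    n = len(sequences)
    return [(seq, count / n) for seq, count in Counter(sequences).most_common(top)]


# Toy sample of 10 sequences (hypothetical data).
sample = ["ACGT"] * 5 + ["ACGA"] * 3 + ["TCGT"] * 2
ranked = predominant_quasispecies(sample)
first_seq, first_prop = ranked[0]  # the predominant quasispecies and its share
```

The first, second and third entries correspond to the first, second and third dominant quasispecies of a sample.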
2.10. ddPCR for validation of multi-variation positions
To validate the differential expression of the HBV variation positions found in the discovery stage, we performed multiplex ddPCR in the validation stage using the MicroDrop-100A droplet digital PCR system (FOREVENRGEN) to quantify the copy number of HBV variation positions. Two ddPCR reaction systems were designed: ddPCR Reaction I detected the co-occurring G1512A and A1630G mutations, while ddPCR Reaction II detected the co-occurring T1753C/G/A, A1762T, and G1764A mutations. We designed probes for these mutation sites and detected them in the droplets by dual-channel fluorescence on the same system. The ddPCR mixture (20 μL) included 10 μL of ddPCR super-mix for probes, 1800 nmol/L of upstream and downstream primers to generate a 161 bp amplicon, 500 nmol/L of probes (as shown in Table S2), and 4 μL of DNA sample. The mixture was loaded into a DG8 cartridge and combined with 50 μL of droplet generation oil to form droplets using the droplet generator (FOREVENRGEN). After processing, the droplets were transferred to a 96-well PCR plate (Eppendorf). The thermal cycling conditions for both ddPCR reactions were as follows: an initial denaturation at 95 °C for 10 min, followed by 45 cycles of 95 °C for 30 s and 62 °C for 60 s, a final step at 98 °C for 10 min, and cooling to 16 °C. A no-template control was included in each run to monitor PCR contamination. After amplification, the plate was loaded onto the MicroDrop-100B biochip analyzer (FOREVENRGEN) to obtain the quantification data of the target molecules, which were expressed as copy number per microliter of DNA sample and analyzed using the QuantDrop analysis software (FOREVENRGEN).
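How droplet counts translate into copies per microliter of DNA sample can be sketched with the Poisson correction that is commonly used in droplet digital PCR. The algorithm implemented in the QuantDrop software is not described here, so this is an assumption rather than the vendor's method; the function name, the droplet counts and the 1 nL droplet volume in the example are hypothetical, while the 20 μL reaction and 4 μL sample volumes come from the protocol above.

```python
import math


def copies_per_ul_sample(positive, total, droplet_volume_nl,
                         reaction_volume_ul=20.0, sample_volume_ul=4.0):
    """Estimate target copies per microliter of input DNA sample.

    positive / total  : droplets positive for the target / droplets read
    droplet_volume_nl : droplet volume in nanoliters (instrument-specific)
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    # Poisson correction: mean copies per droplet = -ln(1 - positive/total).
    copies_per_droplet = math.log(total / (total - positive))
    copies_per_ul_reaction = copies_per_droplet / (droplet_volume_nl * 1e-3)
    # Scale from the reaction mixture back to the DNA sample that was added.
    return copies_per_ul_reaction * reaction_volume_ul / sample_volume_ul


# Hypothetical run: 5000 of 10000 droplets positive, droplets assumed to be 1 nL.
estimate = copies_per_ul_sample(5000, 10000, droplet_volume_nl=1.0)
```

The correction matters at high target concentrations, where a single droplet is likely to contain more than one copy.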
2.11. Statistical analysis
Statistical analysis was performed with the SPSS 22.0 and GraphPad Prism 5.0 software packages. Between the case and control groups, the distributions of gender and age were compared by Fisher's Exact Test, the differences in SNP proportions by the Chi-squared Test, and the quasispecies complexity and diversity by the Wilcoxon Signed Rank Test. Statistical significance was set at p < 0.05.
3. Results
3.1. Demographic characteristic and baseline information of the subjects
This study included a total of 247 subjects, 7 in the discovery set and 240 in the validation set. The 40 serum samples of the 7 subjects in the discovery set were numbered in order of sampling. In the discovery stage, the time from cohort entry to the end of follow-up spanned 5 years, which was divided into four stages according to the time from sampling to HCC diagnosis (for the control group, the last sampling time was used instead): 2.6–5 years, 1.6–2.5 years, 0.6–1.5 years and 0–0.5 year. This division was used to assess the dynamic change of HBV variation as the onset of HCC approached. The demographic characteristics and baseline information of these subjects, including specimen number, gender, age and HBV-DNA level at each sampling point, are shown in Supplementary Materials Table S3. NGS of the discovery set reached an average sequencing depth of 577 ×, and each sample produced 1,095,885 clean reads on average. The distributions of the demographic characteristics and baseline information of the discovery and validation populations were compared and are shown in Supplementary Materials Table S4.
3.2. SNPs distribution of HBV quasispecies
As shown in Table 1, across the total clean reads of HBV quasispecies in the discovery set, variation was detected at 167 nucleotide loci, covering 4843 (167 × 29) positions in the 29 samples of the case group and 1837 (167 × 11) positions in the 11 samples of the control group. The average variation frequency in the case group (31.80 %, 1540/4843) was higher than that in the control group (18.29 %, 336/1837), and the difference was statistically significant (χ2=120.30, p < 0.001).
Table 1.
The overall variation frequencies of SNPs in HBV Quasispecies between case and control group in discovery stage.
| Variation types | Case (n = 4843): n | Case (n = 4843): % | Control (n = 1837): n | Control (n = 1837): % | Total (n = 6680): n | Total (n = 6680): % | χ2 | P |
|---|---|---|---|---|---|---|---|---|
| A/G | 372 | 7.68 | 98 | 5.33 | 470 | 7.04 | 11.21 | 0.001⁎⁎ |
| C/T | 587 | 12.12 | 130 | 7.08 | 717 | 10.73 | 35.36 | <0.001⁎⁎⁎ |
| Transition | 959 | 19.80 | 228 | 12.41 | 1187 | 17.77 | 49.78 | <0.001⁎⁎⁎ |
| A/C | 325 | 6.71 | 56 | 3.05 | 381 | 5.70 | 33.21 | <0.001⁎⁎⁎ |
| A/T | 132 | 2.73 | 11 | 0.60 | 143 | 2.14 | 28.76 | <0.001⁎⁎⁎ |
| C/G | 66 | 1.36 | 23 | 1.25 | 89 | 1.33 | 0.12 | 0.724 |
| G/T | 58 | 1.20 | 18 | 0.98 | 76 | 1.14 | 0.56 | 0.454 |
| Transversion | 581 | 12.00 | 108 | 5.88 | 689 | 10.31 | 53.88 | <0.001⁎⁎⁎ |
| SNPs | 1540 | 31.80 | 336 | 18.29 | 1876 | 28.08 | 120.30 | <0.001⁎⁎⁎ |
Note: 1. * p < 0.05; ⁎⁎ p < 0.01; ⁎⁎⁎ p < 0.001.
2. #: Statistical analysis using exact probability method.
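The comparison in the last row of Table 1 can be retraced with a Pearson chi-square test on the 2 × 2 table of variant versus non-variant positions. The sketch below is a plain re-computation without continuity correction, which is an assumption about the settings used in SPSS; the function name is hypothetical, and the counts are taken from Table 1.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# "SNPs" row of Table 1: variant positions out of all positions per group.
case_variant, case_total = 1540, 4843
control_variant, control_total = 336, 1837

chi2 = chi_square_2x2(case_variant, case_total - case_variant,
                      control_variant, control_total - control_variant)
```

With these counts the statistic comes out at about 120.3, in line with the value reported in the table; small differences in the second decimal place may reflect rounding.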
3.3. SNPs affecting HBV gene expression and associated with HCC
Of the 167 SNPs found in the HBV quasispecies in the discovery stage, 43 had different variation proportions in the HCC case group compared with the control group (Supplementary Materials Table S5); 11 were in the HBS region, 20 in HBX and 12 in HBC.
Regarding their influence on HBV gene expression, 15 of these 43 SNPs were non-synonymous, that is, they changed the encoded amino acid. Of the 15 non-synonymous SNPs, four were in HBS (C312T, A453G, C455A and T531A/C), nine in HBX (T1473C, C1480A, G1512A, C1629T, A1630G, C1653T, T1753C/G/A, A1762T and G1764A) and two in HBC (G2153A and C2288A). The 15 non-synonymous SNPs and the corresponding amino acid changes are shown in Table 2.
Table 2.
The 15 SNPs in HBV quasispecies related to the amino acid changes and HCC.
| NO. | Nucleotide | Wild-type | Variant-type | Amino acid substitution in HBS | Amino acid substitution in HBX | Amino acid substitution in HBC |
|---|---|---|---|---|---|---|
| 1 | nt 312 | C | T | S53L | | |
| 2 | nt 453 | A | G | Y100C | | |
| 3 | nt 455 | C | A | Q101K | | |
| 4 | nt 531 | T | A/C | I126N/T | | |
| 5 | nt 1473 | T | C | | F34L | |
| 6 | nt 1480 | C | A | | P36H | |
| 7 | nt 1512 | G | A | | A47T | |
| 8 | nt 1629 | C | T | | H86R | |
| 9 | nt 1630 | A | G | | H86R | |
| 10 | nt 1653 | C | T | | H94Y | |
| 11 | nt 1753 | T | C/G/A | | I127T/S/N | |
| 12 | nt 1762 | A | T | | K130M | |
| 13 | nt 1764 | G | A | | V131I | |
| 14 | nt 2153 | G | A | | | V114I |
| 15 | nt 2288 | C | A | | | P159T |
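How a nucleotide change is classified as synonymous or non-synonymous can be sketched as follows, using the standard genetic code. The helper names are hypothetical, and two inputs are assumptions made only for illustration: an HBX reading frame starting at nt 1374 and a wild-type codon AAG at HBX codon 130. Both are consistent with the K130M entry for A1762T in Table 2 but are not taken from the sequencing data.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codon_position(nt, gene_start):
    """Map a genome position to (codon number, 0-based offset within the codon)."""
    offset = nt - gene_start
    return offset // 3 + 1, offset % 3


def classify_snp(wild_codon, offset, variant_base):
    """Return (wild-type amino acid, variant amino acid, is_non_synonymous)."""
    mutant = wild_codon[:offset] + variant_base + wild_codon[offset + 1:]
    wt_aa, mut_aa = CODON_TABLE[wild_codon], CODON_TABLE[mutant]
    return wt_aa, mut_aa, wt_aa != mut_aa


HBX_START = 1374                              # assumed start of the HBX reading frame
codon, offset = codon_position(1762, HBX_START)
result = classify_snp("AAG", offset, "T")     # assumed wild-type codon AAG (Lys)
```

In this example nt 1762 falls on the second base of codon 130, and the A to T change turns AAG (K) into ATG (M), so the SNP is classified as non-synonymous.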
3.4. Dynamics of the 15 HBV non-synonymous variations during HCC oncogenesis
We found that the average variation proportion of the 15 SNPs was highest at 2.6–5 years before HCC onset (66.67 %), followed by 0.6–1.5 years (65.71 %), 1.6–2.5 years (60.95 %) and 0–0.5 years (58.33 %). When the variation proportion of each SNP was examined stage by stage, the dynamics of these 15 HBV non-synonymous variations during HCC oncogenesis differed between the case group and the control group (Fig. 1).
Fig. 1.
(A) The average variation proportion of the 15 HBV SNPs during the four stages of HCC progression: highest at 2.6–5 years (66.67 %), followed by 0.6–1.5 years (65.71 %) and 1.6–2.5 years (60.95 %), and lowest at 0–0.5 years (58.33 %). (B) Dynamics of the 15 HBV SNPs during HCC progression. In the control group, only one nucleotide (nt1630) changed, two years before the end of the study (blue line). In the case group, the variation proportions of the 15 HBV non-synonymous variations at 2.6–5 years, 1.6–2.5 years, 0.6–1.5 years and 0–0.5 year are drawn as red, green, yellow and purple lines respectively. (C) The 15 SNPs were categorized into a high mutation proportion group (100 %) (nt1630, nt1762 and nt1764), a medium mutation proportion group (80 % ∼ 60 %) (nt312, nt531, nt1653, nt1753 and nt2288), and a low mutation proportion group (60 % ∼ 20 %) (nt453, nt455, nt1473, nt1480, nt1512 and nt1629). Nucleotide nt2153 was exceptional: its variation proportion was higher at 2.6–5 years before onset and then remained at a lower level until HCC occurred.
3.5. Analysis of the characteristics of HBV quasispecies
The diversity and complexity of the HBS, HBX and HBC quasispecies were compared between cases and controls in the discovery set, and the results are shown in Table 3. In the HBS quasispecies, the d_nt, ds, dn and Sn of the case group were significantly higher than those of the control group. There was no significant difference between the case and control groups in the diversity or complexity of the HBX quasispecies. In contrast, all heterogeneity indexes of the HBC quasispecies in the case group were significantly higher than those of the control group. The dynamic changes in HBS, HBX and HBC quasispecies heterogeneity during disease progression in the case group are shown in Supplementary Materials Figure S1.
Table 3.
Comparison of the diversity and the complexity of HBS, HBX and HBC between case and control in discovery stage.
| Quasispecies characteristic | HBS: Case (N = 29) | HBS: Control (N = 11) | HBS: P | HBX: Case (N = 29) | HBX: Control (N = 11) | HBX: P | HBC: Case (N = 29) | HBC: Control (N = 11) | HBC: P |
|---|---|---|---|---|---|---|---|---|---|
| Diversity | |||||||||
| d_nt | 0.0117 | 0.0073 | 0.019* | 0.0082 | 0.0068 | 0.584 | 0.0154 | 0.0038 | <0.001⁎⁎⁎ |
| d_AA | 0.0187 | 0.0181 | 0.125 | 0.0172 | 0.0134 | 0.331 | 0.0208 | 0.0046 | <0.001⁎⁎⁎ |
| ds | 0.0205 | 0.0108 | 0.006⁎⁎ | 0.0105 | 0.0106 | 0.879 | 0.0322 | 0.0072 | <0.001⁎⁎⁎ |
| dN | 0.0086 | 0.0056 | 0.014* | 0.0075 | 0.0055 | 0.292 | 0.0124 | 0.0023 | <0.001⁎⁎⁎ |
| Complexity | |||||||||
| Sn | 0.9333 | 0.6585 | <0.001⁎⁎⁎ | 0.6639 | 0.7671 | 0.209 | 0.928 | 0.6123 | <0.001⁎⁎⁎ |
Note: 1. * p < 0.05; ⁎⁎ p < 0.01; ⁎⁎⁎ p < 0.001.
2. #: Statistical analysis using exact probability method.
3.6. Phylogenetic tree analysis of HBV quasispecies
The quasispecies compositions of HBS, HBX and HBC were calculated and compared, and the results are shown in Fig. 2. The average proportions of the HBS (Fig. 2A) and HBC (Fig. 2C) predominant quasispecies in the case group were lower than those in the control group (12.71 % vs 48.89 % in HBS, 13.82 % vs 47.18 % in HBC, each p < 0.001), while the average proportion of the HBX predominant quasispecies (Fig. 2B) in the case group was higher than that in the control group (46.69 % vs 41.99 %, p < 0.001). Within the case group, the proportion of the HBX predominant quasispecies was strikingly higher than those of HBS and HBC (46.69 % in HBX vs 12.71 % and 13.82 % in HBS and HBC respectively).
Fig. 2.
The quasispecies composition of HBS (A), HBX (B) and HBC (C). The blue, pink and green bars represent the first, second and third dominant quasispecies in each sample. The average proportions of the HBS (A) and HBC (C) predominant quasispecies in the case group were lower than those in the control group (12.7 % vs 48.9 % in HBS, 13.8 % vs 47.2 % in HBC, each p < 0.05), while the average proportion of the HBX predominant quasispecies (B) in the case group was higher than that in the control group (46.7 % vs 42.0 %, P <0.05). Within the case group, the proportion of the HBX predominant quasispecies was strikingly higher than those of HBS and HBC (46.7 % in HBX vs 12.7 % and 13.8 % in HBS and HBC respectively).
By constructing phylogenetic trees and aligning the predominant sequences of the HBS, HBX and HBC quasispecies, the consensus sequences of the case and control groups were identified, as shown in Fig. 3 and Supplementary Materials Table S6. For HBS, the consensus predominant sequences of the case and control groups differed at aa23 and aa96, which were Leucine and Asparagine in the case group but Serine and Isoleucine in the control group (Fig. 3A–C). For HBX, five amino acid sites differed, namely aa47, aa86, aa127, aa130 and aa131, which were Threonine, Arginine, Threonine, Methionine and Isoleucine in the case group but Alanine, Histidine, Isoleucine, Lysine and Valine in the control group (Fig. 3D, E, F). For HBC, the consensus predominant sequences of the case and control groups were completely identical (Fig. 3G, H, I).
Fig. 3.
Phylogenetic trees of the HBS, HBX and HBC predominant quasispecies and their consensus sequences after alignment. (A) and (B) are the phylogenetic trees of the HBS quasispecies predominant sequences in the case and control groups; (C) shows that the HBS consensus predominant sequences of the case and control groups differed at aa23 and aa96. (D) and (E) are the phylogenetic trees of the HBX quasispecies predominant sequences in the case and control groups; the HBX consensus predominant sequences of the case and control groups differed at five amino acid sites, aa47, aa86, aa127, aa130 and aa131 (F). (G) and (H) are the phylogenetic trees of the HBC quasispecies predominant sequences in the case and control groups; the consensus predominant sequences of the case and control groups were completely identical (I). When the predominant sequences of each case were compared, the HBS quasispecies were the most stable in genetic evolution: only Cases C and D showed nucleotide fluctuations in the HBS predominant sequences, three years before HCC onset (A). The stability of the HBX quasispecies population (D) was close to that of HBS: Cases C and D showed nucleotide fluctuations three years and one year before HCC onset, respectively. The HBC quasispecies (G) were the most variable of the HBV gene fragments: all cases except D had non-consensus HBC predominant sequences during HCC progression.
When the predominant sequences of each case were compared, the HBS quasispecies were the most stable in genetic evolution: only Cases C and D showed nucleotide fluctuations in the HBS predominant sequences, three years before HCC onset (Fig. 3A). The stability of the HBX quasispecies population (Fig. 3D) was close to that of HBS: Cases C and D showed nucleotide fluctuations three years and one year before HCC onset, respectively. The HBC quasispecies (Fig. 3G) were the most variable of the HBV gene fragments: all cases except D had non-consensus HBC predominant sequences during HCC progression.
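The comparison of consensus sequences described above can be illustrated with a minimal sketch. It assumes the predominant sequences are already aligned to equal length; the function names `consensus` and `differing_sites` and the toy amino acid fragments are hypothetical and are not taken from the study, which used phylogenetic software for this step.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most frequent residue at each column of equal-length aligned sequences."""
    if not aligned_seqs:
        raise ValueError("no sequences given")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    # most_common(1) returns the top residue per column; ties keep first-seen order.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_seqs))


def differing_sites(case_consensus, control_consensus):
    """List (1-based position, case residue, control residue) where two consensus sequences differ."""
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(case_consensus, control_consensus), start=1)
        if a != b
    ]


# Toy fragments invented for illustration only (not study data).
case_seqs = ["MTRLA", "MTRLA", "MARLA"]
control_seqs = ["MARHA", "MARHA", "MTRHA"]

case_cons = consensus(case_seqs)
control_cons = consensus(control_seqs)
print(case_cons, control_cons, differing_sites(case_cons, control_cons))
```

Reporting 1-based positions mirrors the aa23/aa96 style of site naming used in the text; a real analysis would also need to handle alignment gaps and ambiguous residues.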
3.7. Validation of multi-HBX variation positions using ddPCR
Using the high-order multiplex ddPCR method, the HBX variation positions of the 240 subjects in the validation dataset were tested; the results are shown in Fig. 4 and Table 4.
Fig. 4.
Two-dimensional scatterplots of ddPCR Reaction I and Reaction II. Reaction I was designed to detect the co-occurring G1512A and A1630G mutations, while Reaction II was designed to detect the co-occurring T1753C/G/A, A1762T and G1764A mutations. The mutation types found in the droplets of the Reaction I system are shown in (A)–(D): the cluster of gray dots in the lower-left corner represents the blank control, the cluster of green dots in the lower-right corner represents HBV wild type (WT), and the other clusters of blue, green and red dots at different locations represent WT, the G1512A and A1630G mutations, or their different combinations. The mutation types found in the droplets of the Reaction II system are shown in (E)–(L): the cluster of gray dots in the lower-left corner represents the blank control, and the other clusters of blue, green and red dots at different locations represent the T1753C/G/A, A1762T and G1764A mutations, or their different combinations.
Table 4.
Comparison of the distribution of mutation types in the case and control groups in the validation stage.
| ddPCR reaction system | Mutation type | HCC group (n = 83), cases (%) | Control group (n = 157), cases (%) | χ2 | P | Early HCC group (n = 68), cases (%) | Control group (n = 157), cases (%) | χ2 | P |
|---|---|---|---|---|---|---|---|---|---|
| Reaction I | G1512A+A1630G | 44(53.01) | 42(26.75) | 16.284 | <0.001⁎⁎⁎ | 36(52.94) | 42(26.75) | 14.369 | <0.001⁎⁎⁎ |
| | G1512A | 2(2.41) | 7(4.46) | 0.191 | 0.662 | 1(1.47) | 7(4.46) | 0.518 | 0.472 |
| | A1630G | 20(24.10) | 45(28.66) | 0.573 | 0.449 | 16(23.53) | 45(28.66) | 0.633 | 0.426 |
| Reaction II | T1753C/G/A+A1762T+G1764A | 30(36.14) | 42(26.75) | 2.281 | 0.131 | 28(41.18) | 42(26.75) | 4.607 | 0.032* |
| | T1753C/G/A+A1762T | 1(1.20) | 0(0) | / | 0.346# | 0(0) | 0(0) | / | / |
| | T1753C/G/A+G1764A | 7(8.43) | 9(5.73) | 0.637 | 0.425 | 7(10.29) | 9(5.73) | 0.884 | 0.347 |
| | A1762T+G1764A | 27(32.53) | 28(17.83) | 6.638 | 0.010* | 22(32.35) | 28(17.83) | 5.787 | 0.016* |
| | T1753C/G/A | 0(0) | 2(1.27) | / | 0.546# | 0(0) | 2(1.27) | / | 1.000# |
| | A1762T | 1(1.20) | 3(1.91) | 0.000 | 1.000 | 1(1.47) | 3(1.91) | 0.000 | 1.000 |
| | G1764A | 11(13.25) | 48(30.57) | 8.785 | 0.003⁎⁎ | 7(10.29) | 48(30.57) | 10.565 | 0.001⁎⁎ |
In ddPCR Reaction I, the rate of the G1512A+A1630G combined mutation was significantly higher in the HCC group than in the control group (53.01 % vs 26.75 %; χ2=16.284, P<0.001). The rate of this double mutation was also significantly higher in the early HCC group (52.94 %, 36/68) than in the control group (26.75 %, 42/157; χ2=14.369, P<0.001). In ddPCR Reaction II, the rate of the A1762T+G1764A combined mutation was significantly higher in the HCC group than in the control group (32.53 % vs 17.83 %; χ2=6.638, P = 0.010), and also in the early HCC group (32.35 %, 22/68) than in the control group (17.83 %, 28/157; χ2=5.787, P = 0.016). The rate of the T1753C/G/A+A1762T+G1764A triple mutation was significantly higher in the early HCC group (41.18 %, 28/68) than in the control group (26.75 %, 42/157; χ2=4.607, P = 0.032). In contrast, the rate of the G1764A single mutation was significantly lower in the HCC group (13.25 %, 11/83) than in the control group (30.57 %, 48/157; χ2=8.785, P = 0.003), and also lower in the early HCC group (10.29 %, 7/68) than in the control group (30.57 %, 48/157; χ2=10.565, P = 0.001).
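As a worked check of the group comparisons above, the following sketch computes a Pearson chi-square statistic for a 2×2 table from the counts in Table 4. It assumes an uncorrected Pearson test with one degree of freedom, which reproduces the reported values for the G1512A+A1630G rows; the rows with very small counts appear to have used a corrected or exact test and are not covered here. The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(pos_a, n_a, pos_b, n_b):
    """Uncorrected Pearson chi-square and P value (1 df) for two proportions.

    pos_a / n_a: carriers and group size in group A; pos_b / n_b: same for group B.
    """
    a, b = pos_a, n_a - pos_a  # group A: mutation present / absent
    c, d = pos_b, n_b - pos_b  # group B: mutation present / absent
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; the statistic is undefined")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# G1512A+A1630G counts from Table 4.
chi2_hcc, p_hcc = pearson_chi2_2x2(44, 83, 42, 157)      # HCC vs control
chi2_early, p_early = pearson_chi2_2x2(36, 68, 42, 157)  # early HCC vs control
print(round(chi2_hcc, 3), p_hcc < 0.001)
print(round(chi2_early, 3), p_early < 0.001)
```

The closed-form shortcut for 2×2 tables avoids an external statistics library; for larger tables or small expected counts, a dedicated routine with continuity correction or an exact test would be the safer choice.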
As shown in Table S7, we compared the HBX variation positions after stratifying the validation population by HBV genotype, HBV DNA level and ALT level. For subjects infected with HBV genotype C, the G1512A+A1630G double mutation, the T1753C/G/A+A1762T+G1764A triple mutation and the G1764A single mutation differed significantly between HCC cases/early HCC cases and controls (all P < 0.05), whereas only the distribution of the G1512A+A1630G double mutation differed in subjects infected with HBV genotype B (P < 0.05). For subjects with a high serum HBV DNA level, the G1512A+A1630G double mutation, the A1762T+G1764A double mutation and the G1764A single mutation differed significantly between HCC cases/early HCC cases and controls (all P < 0.05), whereas only the distribution of the T1753C/G/A+A1762T+G1764A triple mutation differed in subjects with a low HBV DNA level (P < 0.05). For subjects with a normal ALT level, the G1512A+A1630G double mutation, the A1762T+G1764A double mutation and the G1764A single mutation differed significantly between HCC cases/early HCC cases and controls (all P < 0.05), whereas no variation position differed in distribution in subjects with an abnormal ALT level (all P > 0.05).
4. Discussion
Using NGS and ddPCR technology to examine HBV quasispecies, this study described the dynamics of nucleotide variants and the predominant sequences of HBV quasispecies during the progression of HCC, especially the semi-annual changes approaching the time of HCC diagnosis, to elucidate how the key HBV variant positions and predominant sequences influence the development of HCC in asymptomatic HBsAg carriers.
With regard to the SNPs of the HBV quasispecies, the present study found that, in the discovery stage, all types of variation except the C/G and G/T transversions were more frequent in the HCC case group than in the control group. A greater number of HBV variant positions gives the virus more options for adapting to its environment when interacting with the host. In total, 15 non-synonymous SNPs with a higher variant proportion in the HCC case group were confirmed in the discovery stage, and most of them were located in the X gene. HBx is highly expressed in the cytoplasm, especially in mitochondria. It can inhibit thymine DNA glycosylase-mediated base excision repair, cell apoptosis and tumor suppressor gene expression; promote cell proliferation, malignant transformation, distant metastasis and oncogene expression; and cause oxidative stress and mitochondrial DNA damage through integration with liver cells (Link-Lenczowska et al., 2018; Xu et al., 2014). Among the HBX SNPs found in the discovery stage, the mutations at nt1630 and nt1753 and the A1762T/G1764A double mutation are the HBV mutations most often emphasized in relation to HCC (Yu et al., 2012; Chou et al., 2008; Yin et al., 2011; Liu et al., 2009; Zhu et al., 2010; Sung et al., 2008; Li and Ou, 2001). We also found these four variation positions in one HCC case in our previous study on detection technology for HBV quasispecies (Mei et al., 2019), and, using the experimental technique established there, we confirmed their associations with HCC in a large sample in the present study.
In addition, the other HBV sites overlapping the X gene that we found in this study (T1473C, C1480A, G1512A, C1629T and C1653T) also contributed to HBV-induced hepatocarcinogenesis. The mechanism may be related to enhanced HBV replication (Kay and Zoulim, 2007; Chen et al., 2007), altered binding of nuclear factors, and transactivation of oncogenes responsible for the development of HCC; alternatively, transactivators encoded by some oncogenes may select specific HBV mutations during HBV-induced hepatocarcinogenesis (Kay and Zoulim, 2007; Chen et al., 2006). Besides the SNPs found and confirmed in HBX, this study found 6 novel variants in HBS and HBC that might play an important role in hepatocarcinogenesis. For the HBS gene, the sG145R, K141E and T131I variants and other insertion mutations can significantly affect the structure of HBsAg and induce endoplasmic reticulum stress, with consequent oxidative DNA damage and genomic instability (Seddigh-Tonekaboni et al., 2000). More recently, some determinant substitutions were reported in association with vaccine escape (i.e., T116N, P120S/E, I/T126A/N/I/S, Q129H/R, M133L, K141E, P142S and D144A/E) (Basuni et al., 2004). Of these, only I/T126A/N/I/S, encoded by the T531A/C variant, was also found in this study. In the HBC gene, the SNP at nt 2288, leading to a Pro-to-Thr substitution at position 159, is located in both T- and B-cell epitopes of the viral core gene. This mutation was found to increase markedly during acute exacerbation of hepatitis B, but has not yet been reported for HCC (Okumura et al., 2001).
Recently, attention has turned to the continuous increase in HBV variants as the clinical onset of HCC approaches: the highest cumulative number of HBV mutations was reported within 4.5 years before the diagnosis of HCC, compared with >4.5 years before diagnosis (Sung et al., 2016). Another study found that the increase in HBV pre-S quasispecies complexity and diversity was greater at 1–3 years than at 4–6 years and 7–9 years before HCC development (Zhang et al., 2017). Therefore, in this study, we further investigated the semi-annual changes of HBV variants in the 5 years before the diagnosis of HCC, which may be a critical period for the development of HBV-induced HCC. Although much evidence has shown that the A1762T/G1764A double mutation and other variant positions, especially in the enhancer II/basal core promoter sequence, exist even 10 years before the diagnosis of HCC (Chou et al., 2008; Kao et al., 2003; Guo et al., 2008; Zhu et al., 2008; Tong et al., 2006; Jang et al., 2007), the temporal fluctuation of these variants, especially approaching the onset of HCC, has seldom been described, owing to the high replication frequency and mutation rate of HBV and the limitations of sequencing technology. HBV, a member of the hepadnavirus family, replicates via a reverse transcriptase that lacks proofreading activity, which appears to explain its high mutation rate (3.2 × 10⁻⁵ to 7.9 × 10⁻⁵ nucleotide substitutions/replicative cycle), 100 times higher than that of other DNA viruses (Dandri et al., 2008). This poses a great challenge to sequencing technology. In recent years, the emergence of NGS technology has provided a simpler, more convenient and high-throughput method for quasispecies research, as verified in our previous study (Mei et al., 2019). Using NGS, this study detected and analyzed the temporal trend of HBV variations within 5 years before the onset of HCC.
According to our findings, the continual presence of three variants (A1630G, A1762T and G1764A) or the sudden disappearance of one variant (G2153A) was related to high-risk HBV infection and poor prognosis. Individuals who meet this criterion should be recommended regular follow-up and active surveillance. Additionally, C312T, T531A/C, C1653T, T1753C/G/A and C2288A maintained high variation proportions before HCC onset, so these five variants may also serve as biomarkers for predicting the clinical outcomes of patients with chronic hepatitis B.
Because of the high replication rate and the absence of proofreading during reverse transcription, the HBV population consists of genetically distinct but closely related variants known as quasispecies. A quasispecies comprises variants with different fitness in a given environment (Domingo and Gomez, 2007; Domingo et al., 2012; Domingo et al., 2006). HBV quasispecies usually play a carcinogenic role not only as single mutation sites but also as haplotypes. Therefore, this study further explored the relationship between HBV quasispecies haplotypes, their temporal changes and HCC progression. The results showed that the quasispecies heterogeneity of HBS and HBC was higher in the case group than in the control group, which explains the marked differences in the quasispecies nucleotide sequences and genetic distances of both genes between the two groups. This suggests that, in the case group, HBS and HBC quasispecies may achieve immune escape and maintain a high level of HBV replication by generating more diverse quasispecies clones that adapt better to changes in the host environment, and thereby contribute to hepatocarcinogenesis through endoplasmic reticulum stress, activation of signaling cascades and mediation of necro-inflammation (Seddigh-Tonekaboni et al., 2000; Basuni et al., 2004; Okumura et al., 2001). Although this study found no difference in the quasispecies heterogeneity of HBX between the case and control groups, comparison and analysis of the phylogenetic trees showed high proportions of HBX predominant quasispecies in both groups, as well as a striking difference in their nucleotide sequences between the two groups at five non-synonymous SNPs (G1512A, A1630G, T1753C/G/A, A1762T and G1764A), which were exactly consistent with the predicted variant positions mentioned above.
This result suggested that HBX was more likely to play its role as a carcinogen by altering functional proteins or changing the topological structure (Ringelhan et al., 2015; Xu et al., 2014).
Meanwhile, the results of the high-order multiplex ddPCR in the validation stage demonstrated that, compared with the control group, the combined double mutations G1512A+A1630G and A1762T+G1764A, as well as the combined triple mutation T1753C/G/A+A1762T+G1764A, all occurred at higher rates in HCC cases, including early-stage cases. This finding further supports the hypothesis established in the discovery stage by NGS detection and phylogenetic tree analysis that these combined mutations are more sensitive than single mutations for early HCC screening and diagnosis, especially the T1753C/G/A+A1762T+G1764A triple mutation for people infected with HBV genotype C and with a low HBV DNA level, and the G1512A+A1630G double mutation for people with HBV genotype B and normal serum ALT. Previous research has demonstrated that the A1762T/G1764A double mutation increases the risk of hepatocellular carcinoma by decreasing the synthesis of HBeAg and increasing the replication level of HBV DNA, resulting in HBeAg-negative chronic HBV infection, as well as by upregulating S-phase kinase-related protein 2 (Skp2) and downregulating p53 (Yan et al., 2015) through the expression of the double mutant protein. The combined T1753A/A1762T/G1764A mutation has a larger impact on cell migration and proliferation than a single mutation, and it also promotes HBV replication (Chen et al., 2016). Based on the foregoing, we suspect that these five variation positions located in HBX may be crucial in the oncogenesis of HCC, and we recommend incorporating them, especially in combination, into existing liver cancer prediction models to identify populations at high risk of HCC for precise screening and surveillance.
This study has several limitations. First, the limited processing capacity of the MEGA software restricts the number of input sequences, so only 300 sequences per sample were randomly selected for the quasispecies analysis. To assess this choice, we selected 300, 500 and 1000 sequences and carried out multiple rounds of analysis with the same method to test the running speed of MEGA. The differences in the calculated quasispecies characteristic indexes were negligible, but the 300-sequence group ran significantly faster than the other two, which is why 300 sequences were selected for the quasispecies research. Second, because of potential mutual interference between different primer and probe combinations, a 5-plex ddPCR reaction could not be built to detect all 5 HBX mutation sites (G1512A, A1630G, T1753C/G/A, A1762T and G1764A) simultaneously, as these sites are located close to each other. Finally, this study only proposed variant positions, predominant quasispecies and their dynamic changes within 5 years before the onset of HCC; a comprehensive prediction model has not yet been established. It is well known that multiple factors act synergistically to increase the risk of hepatocarcinogenesis, including demographic and environmental factors, viral co-infections (hepatitis C, hepatitis D or HIV), HBV genotype, viral DNA integration into the host genome or other direct effects of viral proteins, and even host susceptibility gene polymorphisms. Therefore, in the future it will be necessary to add the HBV risk factors found in this study to known prediction models to establish a more complete model with higher sensitivity and specificity for HCC screening, and to validate the prediction model in a prospective cohort study.
In conclusion, the present study first identified 15 HBV non-synonymous SNPs associated with HCC and, by analyzing the dynamic changes of these SNPs, confirmed that 5 variation positions in HBX, 2 in HBS and 2 in HBC were related to disease progression. Additionally, high HBS and HBC quasispecies heterogeneity was found to be associated with the formation of diverse quasispecies clones. The predominant quasispecies and their consensus sequences were identified by genetic evolution analysis, and the key variations G1512A, A1630G, T1753C/G/A, A1762T and G1764A were then determined by comparing the HBV predominant quasispecies. We also verified the relationship of these 5 variations with HCC by ddPCR and demonstrated the advantages of ddPCR for detecting multiple variations in large samples for early HCC surveillance and screening. Given the novelty of the study method and experimental technique and their application value, the results of this study can help identify high-risk individuals among HBsAg carriers for early HCC screening, and may eventually provide preliminary references for optimizing and establishing novel screening strategies for the prevention and treatment of HCC.
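The subsampling check described in the first limitation can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: plain Shannon entropy over haplotype frequencies stands in for the unspecified "quasispecies characteristic indexes", the read pool is synthetic, and the names `shannon_entropy` and `mean_subsample_entropy` are hypothetical.

```python
import math
import random
from collections import Counter


def shannon_entropy(seqs):
    """Shannon entropy (natural log) of haplotype frequencies in a list of sequences."""
    n = len(seqs)
    if n == 0:
        raise ValueError("no sequences given")
    counts = Counter(seqs)
    # abs() only normalizes the -0.0 produced when a single haplotype is present.
    return abs(-sum((c / n) * math.log(c / n) for c in counts.values()))


def mean_subsample_entropy(seqs, size, rounds=20, seed=0):
    """Average entropy over repeated random subsamples (without replacement) of a given size."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    values = [shannon_entropy(rng.sample(seqs, size)) for _ in range(rounds)]
    return sum(values) / rounds


# Synthetic read pool with three haplotypes at 60 %, 30 % and 10 % (illustration only).
pool = ["hapA"] * 3000 + ["hapB"] * 1500 + ["hapC"] * 500

results = {size: mean_subsample_entropy(pool, size) for size in (300, 500, 1000)}
for size, value in results.items():
    print(size, round(value, 3))
```

In this toy setting the averaged index changes little between subsample sizes, which is the kind of behavior that would justify the smaller, faster sample; whether the same holds for a given dataset depends on how many low-frequency haplotypes it contains.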
Funding
This study was supported by grants from the National Natural Science Foundation of China (No. 82160638); The Guangxi Natural Science Foundation (No. 2021GXNSFAA220094 and No. 2019GXNSFDA245001); 2018 Guangxi One Thousand Young and Middle-aged College and University Backbone Teachers Cultivation Program (To Wei Deng).
Ethics approval and consent to participate
The study was conducted according to the guidelines of the Declaration of Helsinki and was approved by the Ethics Committee of Guangxi Medical University Cancer Hospital (NO.KY2020041).
Informed consent statement
Informed consent was obtained from all individual participants included in the study.
Author statement
We confirm that this manuscript has not been published elsewhere and is not under consideration by another journal. All authors have approved the revised manuscript and agreed with the author list and resubmission to Journal of Virus Research.
CRediT authorship contribution statement
Chaojun Zhang: Formal analysis, Validation, Visualization, Writing – original draft, Writing – review & editing. Sanchun An: Formal analysis, Validation, Visualization, Writing – original draft, Writing – review & editing. Ruibo Lv: Formal analysis, Validation. Kezhi Li: Formal analysis, Validation. Haizhou Liu: Formal analysis, Validation. Jilin Li: Formal analysis, Validation. Yanping Tang: Formal analysis, Validation. Zhengmin Cai: Formal analysis, Validation. Tianren Huang: Formal analysis, Validation. Long Long: Conceptualization, Methodology, Project administration, Supervision. Wei Deng: Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors would like to thank the staff members of The People's Hospital of Fusui County, who implemented the HCC screening and followed up the HCC patients.
Footnotes
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.virusres.2024.199317.
Contributor Information
Tianren Huang, Email: tianrenhuang@sina.com.
Long Long, Email: frank@nnnu.edu.cn.
Wei Deng, Email: dengwei@gxmu.edu.cn.
Appendix. Supplementary materials
Fig. S1. The relation between the quasispecies heterogeneity of HBS, HBX and HBC and four periods before HCC diagnosis.
Table S1. The PCR primers and product lengths of the HBS, HBC and HBX genes.
Table S2. Primers and probes in high-order multiplex ddPCR.
Table S3. Demographic characteristics and baseline information for 7 study participants in the discovery dataset.
Table S4. The comparison of demographic characteristics and baseline data between the discovery and validation datasets.
Table S5. The 44 SNPs with different variation proportions between the HCC case and control groups.
Table S6. The consensus sequences of the predominant sequences of HBS, HBX and HBC quasispecies in the case and control groups in the discovery stage.
Table S7. Comparison of the mutation distribution between case and control groups after stratification by genotype, ALT level and HBV DNA level in the validation set.
Data availability
Data will be made available on request.
References
- Basuni A.A., Butterworth L., Cooksley G., Locarnini S., Carman W.F. Prevalence of HBsAg mutants and impact of hepatitis B infant immunisation in four Pacific Island countries. Vaccine. 2004;22(21–22):2791–2799. doi: 10.1016/j.vaccine.2004.01.046.
- Beerenwinkel N., Zagordi O. Ultra-deep sequencing for the analysis of viral populations. Curr. Opin. Virol. 2011;1(5):413–418. doi: 10.1016/j.coviro.2011.07.008.
- Blum H.E. Hepatitis B virus: significance of naturally occurring mutants. Intervirology. 1993;35(1–4):40–50. doi: 10.1159/000150294.
- Bruix J., Sherman M. Management of hepatocellular carcinoma. Hepatology. 2005;42(5):1208–1236. doi: 10.1002/hep.20933.
- Carr B.I. Hepatocellular carcinoma: current management and future trends. Gastroenterology. 2004;127(5 Suppl 1):S218–S224. doi: 10.1053/j.gastro.2004.09.036.
- Chen B.F., Liu C.J., Jow G.M., Chen P.J., Kao J.H., Chen D.S. High prevalence and mapping of pre-S deletion in hepatitis B virus carriers with progressive liver diseases. Gastroenterology. 2006;130(4):1153–1168. doi: 10.1053/j.gastro.2006.01.011.
- Chen C.H., Hung C.H., Lee C.M., et al. Pre-S deletion and complex mutations of hepatitis B virus related to advanced liver disease in HBeAg-negative patients. Gastroenterology. 2007;133(5):1466–1474. doi: 10.1053/j.gastro.2007.09.002.
- Chen Z., Tang J., Cai X., et al. HBx mutations promote hepatoma cell migration through the Wnt/β-catenin signaling pathway. Cancer Sci. 2016;107(10):1380–1389. doi: 10.1111/cas.13014.
- Chevaliez S., Rodriguez C., Pawlotsky J.M. New virologic tools for management of chronic hepatitis B and C. Gastroenterology. 2012;142(6):1303–1313.e1. doi: 10.1053/j.gastro.2012.02.027.
- Chou Y.C., Yu M.W., Wu C.F., et al. Temporal relationship between hepatitis B virus enhancer II/basal core promoter sequence variation and risk of hepatocellular carcinoma. Gut. 2008;57(1):91–97. doi: 10.1136/gut.2006.114066.
- Dandri M., Murray J.M., Lutgehetmann M., Volz T., Lohse A.W., Petersen J. Virion half-life in chronic hepatitis B infection is strongly correlated with levels of viremia. Hepatology. 2008;48(4):1079–1086. doi: 10.1002/hep.22469.
- Domingo E., Gomez J. Quasispecies and its impact on viral hepatitis. Virus Res. 2007;127(2):131–150. doi: 10.1016/j.virusres.2007.02.001.
- Domingo E., Martin V., Perales C., Grande-Pérez A., García-Arriaza J., Arias A. Viruses as quasispecies: biological implications. Curr. Top. Microbiol. Immunol. 2006;299:51–82. doi: 10.1007/3-540-26397-7_3.
- Domingo E., Sheldon J., Perales C. Viral quasispecies evolution. Microbiol. Mol. Biol. Rev. 2012;76(2):159–216. doi: 10.1128/MMBR.05023-11.
- El-Serag H.B. Epidemiology of viral hepatitis and hepatocellular carcinoma. Gastroenterology. 2012;142(6):1264–1273.e1. doi: 10.1053/j.gastro.2011.12.061.
- Fishman S.L., Branch A.D. The quasispecies nature and biological implications of the hepatitis C virus. Infect. Genet. Evol. 2009;9(6):1158–1167. doi: 10.1016/j.meegid.2009.07.011.
- Forns X., Bukh J., Purcell R.H., Emerson S.U. How Escherichia coli can bias the results of molecular cloning: preferential selection of defective genomes of hepatitis C virus during the cloning procedure. Proc. Natl. Acad. Sci. U. S. A. 1997;94(25):13909–13914. doi: 10.1073/pnas.94.25.13909.
- Global prevalence, treatment, and prevention of hepatitis B virus infection in 2016: a modelling study. Lancet Gastroenterol. Hepatol. 2018;3(6):383–403. doi: 10.1016/S2468-1253(18)30056-6.
- Gong L., Han Y., Chen L., et al. Comparison of next-generation sequencing and clone-based sequencing in analysis of hepatitis B virus reverse transcriptase quasispecies heterogeneity. J. Clin. Microbiol. 2013;51(12):4087–4094. doi: 10.1128/JCM.01723-13.
- Guo X., Jin Y., Qian G., Tu H. Sequential accumulation of the mutations in core promoter of hepatitis B virus is associated with the development of hepatocellular carcinoma in Qidong, China. J. Hepatol. 2008;49(5):718–725. doi: 10.1016/j.jhep.2008.06.026.
- Jang J.W., Lee Y.C., Kim M.S., et al. A 13-year longitudinal study of the impact of double mutations in the core promoter region of hepatitis B virus on HBeAg seroconversion and disease progression in patients with genotype C chronic active hepatitis. J. Viral Hepat. 2007;14(3):169–175. doi: 10.1111/j.1365-2893.2006.00788.x.
- Jeng W.J., Papatheodoridis G.V., Lok A. Hepatitis B. Lancet. 2023;401(10381):1039–1052. doi: 10.1016/S0140-6736(22)01468-4.
- Kao J.H., Chen P.J., Lai M.Y., Chen D.S. Basal core promoter mutations of hepatitis B virus increase the risk of hepatocellular carcinoma in hepatitis B carriers. Gastroenterology. 2003;124(2):327–334. doi: 10.1053/gast.2003.50053.
- Kay A., Zoulim F. Hepatitis B virus genetic variability and evolution. Virus Res. 2007;127(2):164–176. doi: 10.1016/j.virusres.2007.02.021.
- Li J., Ou J.H. Differential regulation of hepatitis B virus gene expression by the Sp1 transcription factor. J. Virol. 2001;75(18):8400–8406. doi: 10.1128/JVI.75.18.8400-8406.2001.
- Liaw Y.F., Kao J.H., Piratvisuth T., et al. Asian-Pacific consensus statement on the management of chronic hepatitis B: a 2012 update. Hepatol. Int. 2012;6(3):531–561. doi: 10.1007/s12072-012-9365-4.
- Link-Lenczowska D., Pallisgaard N., Cordua S., et al. A comparison of qPCR and ddPCR used for quantification of the JAK2 V617F allele burden in Ph negative MPNs. Ann. Hematol. 2018;97(12):2299–2308. doi: 10.1007/s00277-018-3451-1.
- Liu S., Zhang H., Gu C., et al. Associations between hepatitis B virus mutations and the risk of hepatocellular carcinoma: a meta-analysis. J. Natl. Cancer Inst. 2009;101(15):1066–1082. doi: 10.1093/jnci/djp180.
- Margeridon-Thermet S., Svarovskaia E.S., Babrzadeh F., et al. Low-level persistence of drug resistance mutations in hepatitis B virus-infected subjects with a past history of Lamivudine treatment. Antimicrob. Agents Chemother. 2013;57(1):343–349. doi: 10.1128/AAC.01601-12.
- Mei F., Ren J., Long L., Li J., Li K., Liu H., Tang Y., Fang X., Wu H., Xiao C., et al. Analysis of HBV X gene quasispecies characteristics by next-generation sequencing and cloning-based sequencing and its association with hepatocellular carcinoma progression. J. Med. Virol. 2019;91:1087–1096. doi: 10.1002/jmv.25421.
- Mese S., Arikan M., Cakiris A., et al. Role of the line probe assay INNO-LiPA HBV DR and ultradeep pyrosequencing in detecting resistance mutations to nucleoside/nucleotide analogues in viral samples isolated from chronic hepatitis B patients. J. Gen. Virol. 2013;94(Pt 12):2729–2738. doi: 10.1099/vir.0.053041-0.
- Nathan H., Schulick R.D., Choti M.A., Pawlik T.M. Predictors of survival after resection of early hepatocellular carcinoma. Ann. Surg. 2009;249(5):799–805. doi: 10.1097/SLA.0b013e3181a38eb5.
- Ngui S.L., Teo C.G. Hepatitis B virus genomic heterogeneity: variation between quasispecies may confound molecular epidemiological analyses of transmission incidents. J. Viral Hepat. 1997;4(5):309–315. doi: 10.1046/j.1365-2893.1997.00066.x.
- Okumura A., Ishikawa T., Yoshioka K., Yuasa R., Fukuzawa Y., Kakumu S. Mutation at codon 130 in hepatitis B virus (HBV) core region increases markedly during acute exacerbation of hepatitis in chronic HBV carriers. J. Gastroenterol. 2001;36(2):103–110. doi: 10.1007/s005350170138.
- Ramírez C., Gregori J., Buti M., et al. A comparative study of ultra-deep pyrosequencing and cloning to quantitatively analyze the viral quasispecies using hepatitis B virus infection as a model. Antiviral Res. 2013;98(2):273–283. doi: 10.1016/j.antiviral.2013.03.007.
- Ringelhan M., O'Connor T., Protzer U., Heikenwalder M. The direct and indirect roles of HBV in liver cancer: prospective markers for HCC screening and potential therapeutic targets. J. Pathol. 2015;235(2):355–367. doi: 10.1002/path.4434.
- Schmidt S., Follmann M., Malek N., Manns M.P., Greten T.F. Critical appraisal of clinical practice guidelines for diagnosis and treatment of hepatocellular carcinoma. J. Gastroenterol. Hepatol. 2011;26(12):1779–1786. doi: 10.1111/j.1440-1746.2011.06891.x.
- Seddigh-Tonekaboni S., Waters J.A., Jeffers S., et al. Effect of variation in the common "a" determinant on the antigenicity of hepatitis B surface antigen. J. Med. Virol. 2000;60(2):113–121. doi: 10.1002/(sici)1096-9071(200002)60:2<113::aid-jmv2>3.0.co;2-0.
- Siegel R.L., Miller K.D., Jemal A. Cancer statistics, 2016. CA Cancer J. Clin. 2016;66(1):7–30. doi: 10.3322/caac.21332.
- Sung F.Y., Lan C.Y., Huang C.J., et al. Progressive accumulation of mutations in the hepatitis B virus genome and its impact on time to diagnosis of hepatocellular carcinoma. Hepatology. 2016;64(3):720–731. doi: 10.1002/hep.28654.
- Sung J.J., Tsui S.K., Tse C.H., et al. Genotype-specific genomic markers associated with primary hepatomas, based on complete genomic sequencing of hepatitis B virus. J. Virol. 2008;82(7):3604–3611. doi: 10.1128/JVI.01197-07.
- Tong M.J., Blatt L.M., Kao J.H., Cheng J.T., Corey W.G. Precore/basal core promoter mutants and hepatitis B viral DNA levels as predictors for liver deaths and hepatocellular carcinoma. World J. Gastroenterol. 2006;12(41):6620–6626. doi: 10.3748/wjg.v12.i41.6620.
- Xu C., Zhou W., Wang Y., Qiao L. Hepatitis B virus-induced hepatocellular carcinoma. Cancer Lett. 2014;345(2):216–222. doi: 10.1016/j.canlet.2013.08.035.
- Yan J., Yao Z., Hu K., et al. Hepatitis B Virus Core Promoter A1762T/G1764A (TA)/T1753A/T1768A Mutations Contribute to Hepatocarcinogenesis by Deregulating Skp2 and P53. Dig. Dis. Sci. 2015;60(5):1315–1324. doi: 10.1007/s10620-014-3492-9.
- Yang H.I., Yuen M.F., Chan H.L., et al. Risk estimation for hepatocellular carcinoma in chronic hepatitis B (REACH-B): development and validation of a predictive score. Lancet Oncol. 2011;12(6):568–574. doi: 10.1016/S1470-2045(11)70077-8.
- Yin J., Xie J., Liu S., et al. Association between the various mutations in viral core promoter region to different stages of hepatitis B, ranging of asymptomatic carrier state to hepatocellular carcinoma. Am J Gastroenterol. 2011;106(1):81–92. doi: 10.1038/ajg.2010.399. [DOI] [PubMed] [Google Scholar]
- Yousif M., Bell T.G., Mudawi H., Glebe D., Kramvis A. Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools. PLoS One. 2014;9(4):e95377. doi: 10.1371/journal.pone.0095377. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yu H., Zhu R., Zhu Y.Z., Chen Q., Zhu H.G. Effects of mutations in the X gene of hepatitis B virus on the virus replication. Acta Virol. 2012;56(2):101–110. doi: 10.4149/av_2012_02_101. [DOI] [PubMed] [Google Scholar]
- Zhang A.Y., Lai C.L., Huang F.Y., et al. Deep sequencing analysis of quasispecies in the HBV pre-S region and its association with hepatocellular carcinoma. J. Gastroenterol. 2017;52(9):1064–1074. doi: 10.1007/s00535-017-1334-1. [DOI] [PubMed] [Google Scholar]
- Zhang A.Y., Lai C.L., Poon R.T., et al. Hepatitis B virus full-length genomic mutations and quasispecies in hepatocellular carcinoma. J. Gastroenterol. Hepatol. 2016;31(9):1638–1645. doi: 10.1111/jgh.13316. [DOI] [PubMed] [Google Scholar]
- Zhang B.O., Xu C.W., Shao Y., et al. Comparison of droplet digital PCR and conventional quantitative PCR for measuring EGFR gene mutation. Exp. Ther. Med. 2015;9(4):1383–1388. doi: 10.3892/etm.2015.2221. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang C.H.T, Zhang Z.Y.J, He Z.Z.D, et al. Study on HBV infection of the cancer prevention screening population in the primary liver cancer high incidence area of Guangxi. Guangxi Med. J. 2006;(12):1857–1859. [Google Scholar]
- Zhu R., Zhang H.P., Yu H., et al. Hepatitis B virus mutations associated with in situ expression of hepatitis B core antigen, viral load and prognosis in chronic hepatitis B patients. Pathol. Res. Pract. 2008;204(10):731–742. doi: 10.1016/j.prp.2008.05.001. [DOI] [PubMed] [Google Scholar]
- Zhu Y., Jin Y., Guo X., et al. Comparison study on the complete sequence of hepatitis B virus identifies new mutations in core gene associated with hepatocellular carcinoma. Cancer Epidemiol. Biomarkers Prev. 2010;19(10):2623–2630. doi: 10.1158/1055-9965.EPI-10-0469. [DOI] [PubMed] [Google Scholar]
Supplementary Materials
Fig. S1. The relation between the quasispecies heterogeneity of HBS, HBX and HBC and four periods before HCC diagnosis.
Table S1. PCR primers and product lengths for the HBS, HBC and HBX genes.
Table S2. Primers and probes used in high-order multiplex ddPCR.
Table S3. Demographic characteristics and baseline information for 7 study participants in the discovery dataset.
Table S4. Comparison of demographic characteristics and baseline data between the discovery and validation datasets.
Table S5. The 44 SNPs with different variant proportions between the HCC case and control groups.
Table S6. Consensus sequences of the predominant HBS, HBX and HBC quasispecies in the case and control groups in the discovery stage.
Table S7. Comparison of the mutation distribution between the case and control groups after stratification by genotype, ALT level, and HBV DNA level in the validation set.
Data Availability Statement
Data will be made available on request.




