Abstract
An HCV outbreak occurred in 2012 in China, affecting hundreds of patients. We characterized HCV subtype 2a and 6a sequences from 60 and 102 patients, respectively, and co-analyzed them with 82 local controls and 103 calibrating references. The close grouping of the patients’ sequences contrasted sharply with the diversity of local controls. Scaled by the calibrating references, the emergence of patients’ isolates was estimated at 2–5 years before sampling. In contrast, the controls intermingled with the calibrating references that were much older. For both subtypes, major and minor clusters could be defined, and the close relatedness within each cluster indicated linked transmission.
Conclusion
HCV sequences from the study patients grouped into three subtype 2a and two subtype 6a clusters, in addition to three 6a solitary branches, representing descendants of eight earlier strains that were distinct and otherwise sporadic. Due to iatrogenic transmission through reusing needles, five strains were highly selected and preferentially spread.
Keywords: Outbreak, evolution, HCV, genetic sequence
INTRODUCTION
The analysis of genetic sequences has been effectively used to characterize the source of infection and transmission routes in outbreaks or linked cases caused by rapidly evolving RNA viruses, such as hepatitis C virus (HCV) and human immunodeficiency virus type 1 (HIV-1). Since 1992 a number of such studies have been reported (Ou et al., 1992; Esteban et al., 1996; Birch et al., 2000; Yerly et al., 2001; Metzker et al., 2002; Bracho et al., 2005), but it was not until 2013 that a molecular clock approach was first used in evolutionary analysis to confirm HCV transmission from an anesthetist to 275 patients (González-Candelas et al., 2013).
An effective analysis of this kind requires two critically important factors. First, the inclusion of a number of local cases of sporadic infection caused by a virus of the same genetic lineage. Second, the characterization of sequences in different genomic regions of the virus, at least one of which is in a highly variable region, provided that sufficient time has elapsed to allow new variations to accumulate (González-Candelas et al., 2013). Hypervariable region 1 (HVR1) is an example of such a highly variable region in HCV; by using the sequences in HVR1 from both donors and recipients, a very precise analysis may be performed even when only a short time has elapsed from transmission to sampling.
Although HCV is endemic worldwide, there is a large degree of geographic variability in its distribution. Countries with the highest prevalence rates are located in Africa and Asia. Areas with lower prevalence include the industrialized nations in North America, northern and western Europe, and Australia. The global prevalence of HCV viraemia is estimated at 1.4% (1.2%-1.7%) among adults and 1.1% (0.9%-1.4%) in all ages, corresponding to 75 (62–89) and 80 (64–103) million people, respectively. In China, a country that holds one-fifth of the world’s population, these rates are at 1.3% and 0.8%, corresponding to 8.9 (2.7–13.4) and 14.8 (4.4–22.3) million people (Gower et al., 2014). The most common risk factors for HCV transmission are blood transfusion from unscreened donors, injection drug use (IDU), unsafe medical injections, and other healthcare-related procedures. Over the past 30 years, IDU has been the predominant source of new HCV infection in developed countries, such that in the US and Australia IDU has accounted for 68% and 80% of current HCV infections, respectively (Alter, 2002; Dore et al., 2003). In the developing world, little is known about the prevalence of IDU and its contribution to HCV transmission (Wasley & Alter, 2000). In a few developed countries, however, a high prevalence of HCV has also been seen in some older age groups. This may indicate a substantial role for unsafe medical injections, which are thought to have occurred 30–50 years ago in a few isolated, hyper-endemic areas (Kiyosawa et al., 1994; Guadagnino et al., 1997; Okayama et al., 2002). Unsafe medical injections are the major and continuing contributing factor to new HCV infections in many developing countries such as China and India. In the remote and rural areas of these countries, sterile medical supplies may be inadequate or in short supply. 
Non-professionals often give injections in an unsanitary setting, and injections are often given to deliver medications that could otherwise be delivered by the oral route (Hauri et al., 2004). In such an environment, people may receive multiple contaminated injections over the course of a lifetime, incurring a substantial cumulative risk of HCV infection (Shepard et al., 2005). Although not officially acknowledged, such scenarios are often reported, characterizing the current HCV epidemics in China with multiple small-scale outbreaks scattered in different geographic regions. We strongly believe that the recent rapid increase in new HCV-infected cases in the country (http://www.chinacdc.cn/tjsj/fdcrbbg/) could have largely resulted from such scenarios.
In February 2012, a small outbreak of HCV infection affecting hundreds of people was reported in the Zicheng Township of Zijin County, Guangdong Province in China. Medical malpractice relating to improper reuse of needles was suspected as the cause, because the majority of the patients had received medical care in the same small clinic. A number of cases had been identified earlier in 2010, but this was not formally recognized until many patients went to Guangzhou City, the capital of Guangdong province, for medication, after which the outbreak was made known to the public (http://www.huffingtonpost.com/2012/02/27/hepatitis-c-china-guangdong_n_1304040.html; http://www.chinadaily.com.cn/china/2012-02/24/content_14681120.htm). In response to this situation, the central and provincial governments sent a group of experts to the site to conduct an investigation and to devise a treatment plan covered by a government health insurance program. Approximately 400 people from the community or its vicinity, who had been exposed to this outbreak, had their blood screened. However, for some reason this result has not been officially reported. In this study, we obtained the serum samples that remained from the investigation for an evolutionary analysis to explore the genetic relatedness of the viral isolates among these patients.
RESULTS
Sequence amplification and genotyping
RT-PCR and DNA sequencing were used to characterize HCV sequences. In 162/192 (84.4%) samples, we obtained HCV sequences in four genomic regions: Core-E1, E1-E2, NS5A, and NS5B (Fig.S1). In 21/192 (10.9%) samples, we obtained HCV sequences only in 1–2 regions (Table S1). Collectively, viral isolates were genotyped in 185/192 (96.4%) samples, of which 66 (35.7%) belonged to subtype 2a and 116 (62.7%) belonged to subtype 6a. For the remaining three (1.6%) isolates, the 5’UTR sequences were classified into genotype 6, but the subtype could not be determined. Whenever HCV sequences could be amplified from several genomic regions of the same sample, the genotyping results were consistent.
Fig.S2 shows eight maximum likelihood (ML) phylogenies used to genotype 162 HCV isolates, for which sequences were first obtained in both Core-E1 and NS5B regions and then in E1-E2 and NS5A. In the trees based on Core-E1 and NS5B sequences, many isolates appear to be genetically identical, which may reflect the same viral sources in the outbreak or carry-over contamination in the laboratory. To exclude the latter possibility, we further amplified more variable E1-E2 and NS5A sequences. Since E1-E2 contains HVR1 and NS5A includes V3, both of which are fast evolving, they should allow us to better differentiate the isolates that have newly diverged from a recent common ancestor (Penin et al., 2004). Analyses of E1-E2 and NS5A sequences showed that isolates of subtype 2a appeared to be more variable than subtype 6a, while E1-E2 sequences better differentiated those from different individuals. However, exceptions were observed in four cases: BL122 (KJ678485) was identical to ZJ15 (KJ678544); and BL228 (KJ678506) was identical to ZJ96 (KJ678577) in their E1-E2 sequences (Fig.S2). These can arise in different isolates that are derived from the same viral sources. However, new genetic variations may have been selected after transmission through hosts. Such variations may be identified by cloning E1-E2 amplicons and sequencing multiple clones. By sequencing 9–10 clones for each isolate, we subsequently revealed their differences in terms of quasi-species (Fig.S3). Similar to the tree based on the consensus sequences, these quasi-species were more distant from the BL37 group (KJ678857-KJ678866), which were used to root the tree, than from each other. They were more closely related within hosts than among hosts, while a few identical quasi-species were only seen within hosts.
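The within-host versus among-host comparison of cloned E1-E2 sequences described above can be illustrated with a simple pairwise distance calculation. The following is a minimal sketch only: it uses the uncorrected p-distance rather than the ML phylogeny used in this study, and the host labels and clone sequences are made-up toy data, not sequences from this work.

```python
from __future__ import annotations
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where either sequence has a gap or an ambiguity code.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_within_and_among(clones: dict[str, list[str]]) -> tuple[float, float]:
    """Mean p-distance for clone pairs from the same host and from different hosts."""
    labelled = [(host, seq) for host, seqs in clones.items() for seq in seqs]
    within, among = [], []
    for (h1, s1), (h2, s2) in combinations(labelled, 2):
        (within if h1 == h2 else among).append(p_distance(s1, s2))
    return sum(within) / len(within), sum(among) / len(among)


# Hypothetical toy clones (illustrative only).
toy_clones = {
    "hostA": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"],
    "hostB": ["ACGAACGTAC", "ACGAACGTCC", "ACGAACGTAC"],
}
within_mean, among_mean = mean_within_and_among(toy_clones)
print(f"within hosts: {within_mean:.3f}, among hosts: {among_mean:.3f}")
```

In this toy example the mean distance within hosts is smaller than that among hosts, which is the pattern expected when clones are more closely related within a host than between hosts.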
Analysis of concatenated sequences
To provide an overview of the phylogenetic relationship among the isolates characterized from the patients in the outbreak, further phylogenetic analyses were performed by including a number of local controls and calibrating references (for which the sampling dates were known). When longer sequences are analyzed, more phylogenetic information is available and bootstrap values tend to be stronger, which better supports the ancestral relatedness of the viral isolates. In addition, analysis of sequences concatenated from different genomic regions is a way to verify that no apparent viral recombination has resulted from crossover contamination, which is otherwise common and can considerably undermine an evolutionary analysis. As a result, two comprehensive ML phylogenies were reconstructed, one for subtype 2a and the other for 6a, based on the sequences concatenated from four subgenomic regions: Core-E1, E1-E2, NS5A, and NS5B. The resulting sequences were each 2227 nucleotides (nt) in length for subtype 2a and 2102 nt for subtype 6a. The analyses included 82 controls that represented the local HCV-infected population (in which the virus was probably contracted via sporadic transmission): 24 infected with subtype 2a strains and 58 infected with 6a strains (Lu et al., 2013). They also included 103 calibrating references: 68 determined in this study (Lu et al., 2005; Murphy et al., 2007; Fu et al., 2011, 2012; Gu et al., 2013; Li et al., 2014) and 35 retrieved from GenBank (Kato et al., 2001; Kurihara et al., 2001; Noppornpanth et al., 2008; Okamoto et al., 1991, 1992; Tokita et al., 1994; Zhou et al., 2011).
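The concatenation step can be sketched as follows. This is an illustrative outline, not the procedure actually used in this study: it assumes that each region has already been aligned separately, that the regions are joined end to end without merging any overlap, and that only isolates sequenced in all four regions are retained. The isolate names and sequences are hypothetical.

```python
from __future__ import annotations

# Fixed region order used for joining (region names taken from the text).
REGIONS = ("Core-E1", "E1-E2", "NS5A", "NS5B")


def concatenate(per_region: dict[str, dict[str, str]]) -> dict[str, str]:
    """Join per-region alignments, keeping isolates present in every region."""
    for region in REGIONS:
        lengths = {len(seq) for seq in per_region[region].values()}
        if len(lengths) != 1:
            raise ValueError(f"{region}: sequences are not aligned to one length")
    shared = set.intersection(*(set(per_region[r]) for r in REGIONS))
    return {
        isolate: "".join(per_region[r][isolate] for r in REGIONS)
        for isolate in sorted(shared)
    }


# Hypothetical toy alignments; iso3 lacks E1-E2 and is therefore dropped.
toy = {
    "Core-E1": {"iso1": "AAAA", "iso2": "AAAT", "iso3": "AATT"},
    "E1-E2": {"iso1": "CC", "iso2": "CG"},
    "NS5A": {"iso1": "GGG", "iso2": "GGA", "iso3": "GAA"},
    "NS5B": {"iso1": "T", "iso2": "A", "iso3": "C"},
}
joined = concatenate(toy)
for name, seq in joined.items():
    print(name, len(seq), seq)
```

With real data, the length of every joined sequence would be expected to equal the sum of the four regional alignment lengths (2227 nt for subtype 2a and 2102 nt for subtype 6a in this study).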
The two ML trees based on the concatenated sequences are too large to display and are thus provided in Fig.S4 and S5, respectively, and both look similar to those presented in Fig.S2. Although the sequences from the study patients are distinguishable from each other, they form tightly grouped clusters in both trees. In contrast, the local controls and calibrating references are mixed to some extent, with the latter appearing to be more genetically diverse. Two additional features are indicated by the 6a tree: (1) three isolates with the ZJ initials, ZJ60 (KJ678563, KJ678717 and KJ678960), ZJ72 (KJ678567, KJ678721 and KJ678964) and ZJ78 (KJ678570, KJ678724 and KJ678967), from the study patients sampled in the surrounding region, each diverge in a solitary branch positioned outside the tightly grouped cluster - they appear to represent sporadic strains (this was also apparent in Fig.S2); and (2) almost all of the calibrating references from Vietnam and from Canada are located at the other end of the tree and are genetically highly diverse.
Evolutionary analysis
Based on the concatenated sequences, the entire E1 and partial NS5A regions were trimmed for use in BEAST (Bayesian Evolutionary Analysis by Sampling Trees) analyses based on three clock models, exponential, lognormal and strict. We used different models in an attempt to show that, regardless of the models used, we would determine similar epidemic histories of HCV for the study patients - this should demonstrate the robustness of our approach.
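Trimming to reference coordinates can be sketched as below. The figure captions give the trimmed regions as nucleotides 739–1490 (entire E1) and 6822–7664 (partial NS5A) in the H77 genome; everything else in this sketch is an assumption for illustration, including the presence of an H77 row in the alignment, the helper names, and the toy sequences.

```python
# H77 coordinates of the trimmed regions, as stated in the figure captions.
E1_RANGE = (739, 1490)
NS5A_RANGE = (6822, 7664)


def region_length(first: int, last: int) -> int:
    """Number of reference positions in an inclusive 1-based range."""
    return last - first + 1


def columns_for_ref_range(aligned_ref: str, ref_start: int,
                          first: int, last: int) -> tuple:
    """Map reference positions first..last to a half-open alignment column range.

    aligned_ref is the reference row of the alignment ('-' marks gaps) and
    ref_start is the reference coordinate of its first non-gap base.
    """
    pos = ref_start - 1
    col_first = col_last = None
    for col, base in enumerate(aligned_ref):
        if base == "-":
            continue  # gap columns do not advance the reference coordinate
        pos += 1
        if pos == first:
            col_first = col
        if pos == last:
            col_last = col
            break
    if col_first is None or col_last is None:
        raise ValueError("requested range is not covered by the aligned reference")
    return col_first, col_last + 1


def trim(aligned_seq: str, aligned_ref: str, ref_start: int,
         first: int, last: int) -> str:
    """Cut one aligned sequence to the columns spanning the reference range."""
    lo, hi = columns_for_ref_range(aligned_ref, ref_start, first, last)
    return aligned_seq[lo:hi]


# Hypothetical toy alignment: the reference row starts at coordinate 100.
print(trim("ACTGTA", "AC-GTA", 100, 101, 103))
print(region_length(*E1_RANGE) + region_length(*NS5A_RANGE))
```

The two coordinate ranges span 752 and 843 reference positions, so the trimmed dataset covers 1595 H77 positions per sequence before any alignment gaps are counted.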
Fig.1 presents six maximum clade credibility (MCC) trees for subtype 2a sequences, all of which suggest very similar ancestral relationships among subtype 2a isolates determined from the study patients. The closely grouped sequences from the study patients are in sharp contrast with the local controls, which are genetically diverse. Scaled by the calibrating references, the ages of the majority of the isolates from the study patients can be delimited within 2–5 years before their sampling. In contrast, the diverse branches of the controls were older and to a certain extent mixed with the calibrating references that had emerged even earlier. The closely grouped sequences from the study patients can be divided into three clusters, “a”, “b” and “c”, containing 50, 2 and 8 sequences, respectively. The “a” cluster can be further divided into several small subsets. Using this group information in further BEAST analyses, their tMRCAs (the time of the most recent common ancestor) were estimated as 2.7–3.9, 3.1–3.2 and 3.6–4.4 years for “a”, 2.2–2.7, 2.2–3.2 and 3.5–4.0 years for “b”, 2.0–4.6, 3.8–4.6 and 3.1–5.0 years for “c”, and 7.4–7.8, 7.7–9.1 and 9.3–9.8 years for the three clusters as a whole, using the exponential, lognormal and strict models, respectively (Fig.2).
Figure 1.
Six MCC trees estimated for subtype 2a sequences in the entire E1 and partial NS5A regions, corresponding to the nucleotides numbered 739–1490 and 6822–7664 in the H77 genome, using the exponential, lognormal and strict clock models. Each tree includes three groups of sequences: 60 from the study patients (blue), 24 local controls (red), and 32 calibrating references (green). Branch length represents the evolutionary ages measured by the grids corresponding to a reverse timescale at the base of each tree, starting from the sampling time (right) to the past (left). Three clusters, a, b, and c, indicate the three tentatively classified subtype 2a lineages determined from the patients exposed in this outbreak. The common ancestor preceding these three clusters is indicated with a magenta circle.
Figure 2.
The tMRCAs estimated for three clusters (a, b, and c) of subtype 2a sequences and two clusters (L and S) of subtype 6a sequences determined from the study patients and their common ancestors (ANCES) using three clock models, exponential (red), lognormal (green), and strict (yellow). The error bars indicate the 95% highest posterior density credible intervals.
Fig.3 depicts six MCC trees based on subtype 6a sequences. Although the trees may be drawn in different ways, the example of the trees in this figure is used for the following description. The sequences from the study patients are all restricted to the upper parts of the trees and are markedly right skewed; they display very short branches that appear to be aligned at the right ends. In contrast, those from the local controls, most of which were >10 years old (branch length from the most peripheral node to the branch end, as measured by the ruler under the tree), are distributed largely in the middle and are intermingled with a fraction of the calibrating references that were sampled in China (Fu et al., 2011, 2012; Gu et al., 2013; Lu et al., 2005; Zhou et al., 2011). At the bottom of the trees is another fraction of the calibrating references, sampled in Vietnam or from Asian immigrants living in Canada; many of these sequences were >20 years old (branch length from the most peripheral node to the branch end) (Murphy et al., 2007; Li et al., 2014; Tokita et al., 1994; Noppornpanth et al., 2008). The 6a sequences from the study patients can be divided into a large (L) and a small (S) cluster in addition to three solitary branches. The large cluster contains 93 isolates while the small one includes six. The tMRCAs were estimated to be 4.0–4.4, 3.5–4.8 and 4.6–4.8 years for cluster (L), and 2.3–2.3, 2.1–2.8 and 2.3–2.8 years for cluster (S), using the exponential, lognormal and strict models, respectively (Fig.2). Cluster (L) can also be divided into 3–4 small subsets, each containing a comparable number of isolates, with their node ages being very similar. 
Regardless of a few controls that are interspersed between the two clusters and three solitary branches, we included these sequences as a whole to estimate their tMRCAs, which were 11.5–14.9, 11.7–14 and 17.3–17.6 years, using the exponential, lognormal and strict models, respectively (Fig.2).
Figure 3.
Six MCC trees estimated for subtype 6a sequences in the entire E1 and partial NS5A regions, corresponding to the nucleotides numbered 739–1490 and 6822–7664 in the H77 genome, using the exponential, lognormal and strict models. Each tree contains three groups of sequences: 102 from the study patients (blue or cyan), 58 from the local controls (red), and 71 calibrating references. The latter includes three subgroups: 46 from China (green), 15 from Vietnam (purple), and 10 from Canada (yellow). As a whole, these sequences are divided into four subsets: I, II, III and IV, and many other lineages. Into subset III are classified all the sequences determined from the study patients, of which the majority are divided into clusters S and L. Within cluster L, 3–4 internal nodes are marked, each with a magenta circle and a number, which indicate the ages of the putative earlier 6a strains that had been serially transmitted among patients, likely via the iatrogenic network.
The estimated Bayes factors indicated that for both subtypes 2a and 6a the exponential model was better than the lognormal model, which was in turn better than the strict model (Table S2). Similar BEAST analyses were also performed on the concatenated sequences. However, because of the heavy computational burden resulting from a large number of long sequences, the subtype 6a analyses had difficulty converging and therefore these results are not included.
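A Bayes factor comparison of two clock models can be sketched as the difference between their log marginal likelihoods. The sketch below uses the harmonic mean estimator computed from posterior log-likelihood samples; whether this matches the estimator applied internally by the software used in this study is an assumption, and the sample values are hypothetical rather than taken from Table S2.

```python
import math


def log_marginal_likelihood_hme(log_likelihoods):
    """Harmonic mean estimate of the log marginal likelihood.

    ln ML = ln N - logsumexp(-ln L_i), evaluated stably in log space.
    """
    n = len(log_likelihoods)
    if n == 0:
        raise ValueError("no samples")
    neg = [-x for x in log_likelihoods]
    m = max(neg)
    log_sum = m + math.log(sum(math.exp(v - m) for v in neg))
    return math.log(n) - log_sum


def log_bayes_factor(ln_ml_a, ln_ml_b):
    """ln Bayes factor of model A over model B; positive values favour A."""
    return ln_ml_a - ln_ml_b


# Hypothetical posterior ln L samples for two clock models (illustrative only).
model_a_samples = [-10250.1, -10248.7, -10251.3, -10249.5]
model_b_samples = [-10262.4, -10260.9, -10263.0, -10261.8]

ln_bf = log_bayes_factor(
    log_marginal_likelihood_hme(model_a_samples),
    log_marginal_likelihood_hme(model_b_samples),
)
print(f"ln BF (A over B): {ln_bf:.2f}")
```

The harmonic mean estimator is known to be unstable, so a result of this kind is best read as a rough ranking of models rather than a precise measure of support.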
DISCUSSION
We characterized the epidemiological relationship among sequences of HCV determined from 162 patients involved in an outbreak occurring in southern China in February 2012. We revealed the robust grouping of subtype 2a and 6a isolates in five distinct clusters, all showing tMRCAs of 2–5 years before sampling, independent of the molecular clocks used and genomic regions sequenced. These distinct clusters may represent separate introductions of HCV into the population who visited the clinic, and they resemble individual isolates of the local background strains that were otherwise sporadic in this area.
The HCV isolates from the study patients belong to either subtype 2a or 6a and no other genetic lineages were detected. In a series of our recent studies, we reported that approximately 75% of the HCV isolates in China are 1b strains, followed in prevalence (i.e. at 7.7%-14.0%) by subtype 2a (Lu et al., 2005, 2013; Fu et al., 2011). However, in Guangdong province (Zijin County, where this outbreak occurred, is in the east of this province) subtype 6a is increasingly prevalent, detectable in 51.5% of HCV-infected IDUs (Fu et al., 2012), 49.7% of HCV-infected volunteer blood donors (Fu et al., 2011), and 17.1% of patients with HCV-related chronic liver disease (Gu et al., 2013; Lu et al., 2013), along with multiple other genotypes and subtypes. Although the higher frequency of subtype 6a than 2a detected in this study was consistent with the findings described above, the complete absence of other HCV lineages, particularly subtype 1b, was unexpected. At least three plausible explanations exist. First, other HCV lineages may have been missed because of laboratory errors. Second, the absence may have been due to insufficient sampling. Third, both subtypes 2a and 6a may have been locally epidemic over a long period of time and thus have become predominant over other HCV genetic lineages.
To minimize the first possibility, we used a combination of strategies for HCV genotyping. By following the criteria in the consensus paper for HCV classification (Simmonds et al., 2005; Smith et al., 2014), we first amplified the Core-E1 and NS5B sequences, which yielded the expected amplicons from 177 (i.e. 166 yielded both Core-E1 and NS5B sequences, and 11 yielded only NS5B) out of 192 study patients. However, it was possible that the negativity in 15 patients was due to rare or new HCV genetic variations. Therefore, we further amplified a more conserved Core sequence and the most conserved 5’UTR, which yielded positive results in eight patients (Table S1). Analyses of these sequences demonstrated the exclusive presence of subtype 2a and 6a strains.
It is possible that the sampling was insufficient because we only tested 192 out of the total 269 HCV-RNA positive patients. Among the remaining 77 patients who were not included in this study, other HCV genetic lineages may have been present. In parallel, aliquots of samples from 193 patients who were positive for HCV-RNA and lived in a small area (denoted BL) were also tested by another research group. They reported Core, E1 and/or NS5B sequences detected in 129 patients, showing the similar exclusive presence of subtype 2a and 6a strains. We did not test 48 patient samples from this set of 129 patients, but their 2a or 6a sequences have been reported to GenBank with the following accession numbers: JX194813-5062 and KF926865-982.
Although we were unable to test the third hypothesis in detail due to the lack of a wide-ranging HCV surveillance program, the absence of other HCV genotypes and subtypes may be genuine. Since there was a free treatment plan covered by a government health insurance program, every household in this local area was encouraged to participate in the epidemiological investigation. We therefore believe that the individuals exposed in this outbreak or with either a pre-existing HCV infection or similar symptoms would all have been included for HCV screening. Given that the HCV screening assays used were sufficiently sensitive, the exclusive detection of subtype 2a and 6a strains could reflect a local HCV epidemic pattern that has become established over years.
Excluding the above-described three plausible scenarios, other possibilities also exist. For example, due to differences in the history of viral transmission, particular lineages of HCV may be predominant in a certain subset of people or prevalent in a specific region of a country as we have recently described in China (An et al., 2014; Fu et al., 2011, 2012; Lu et al., 2014) and reported by others in Europe (Ansaldi et al., 2005; Dal Molin et al., 2002) and in Latin America (Ré et al., 2003, 2011; Di Lello et al., 2015).
In the 2a trees, three clusters were consistently adjacent and formed a larger clade. The age of the common ancestor of this clade was estimated to be 7.4–9.8 years before sampling, depending on the clock model used. This time frame suggests an earlier 2a strain seeded in this local area. After divergence, many genetically related but distinct descendants could have been generated. As the iatrogenic transmission proceeded in the local population and persisted, the infections caused by a few descendant strains would have been highly selected and accumulated. In other words, some patients may have accessed the transmission network many times, which made it likely that they were infected first and then passed the related viral strains to other patients. Although there are fewer 2a cases than 6a examples, they nevertheless comprised a significant portion of those in the outbreak.
The accumulated transmission of a few selected HCV strains from such genetically related but distinct descendants is more clearly illustrated by the analysis of subtype 6a sequences. This analysis showed not only phylogenetic clusters analogous to those of 2a sequences, but also indicated other descendants that did not appear to be linked to the outbreak. In the trees, all 6a sequences from the study patients formed one large and one small cluster in addition to three solitary branches that were all classified into a tentatively designated 6a group, group III (Fu et al., 2011, 2012; Gu et al., 2013). However, if we compress the two clusters into two single branches, it is evident that this group III is consistently formed by five branches from the study patients, 10 branches of the local controls and two branches of the calibrating references. This indicates that at least five related descendant strains of 6a group III have been circulating in this area, which may indicate a long-term local epidemic circulation. We hypothesize that, because of medical malpractice, two such strains emerged, became highly selected and spread efficiently through the iatrogenic network, causing the majority of cases in the outbreak. In contrast, the other three strains were sporadically distributed, but were only seen in the surrounding region. Based on these trees, we were able to estimate the tMRCAs that preceded the two clusters and the three solitary branches, although they were interspersed with a few local controls. For subtype 6a strains, the tMRCAs were 4–6 years longer than the 2a counterparts. These estimates may indicate a period when an earlier strain of 6a group III was first introduced into this local area. After divergence, related but distinct descendants were generated, five of which were detected in this study. There could be additional 6a strains in this local area that were not detected. 
Characterizing such strains will require more extensive HCV surveillance over a wider region that includes this local area, which would allow us to assess why these otherwise efficiently transmitted 6a strains were not detected in this outbreak.
With reference to two other equally important blood-borne pathogens, HBV and HIV-1, it is known that the former has shown a much higher prevalence than HCV in the general population in China (He et al., 2005), while the latter has now entered a generally disseminated stage (Saksena et al., 2005). Strikingly, no similar outbreaks of HBV and HIV-1 infections were observed in the same area and during the same period. For HBV, this is likely for two reasons: 1) the national HBV immunization program, which has resulted in a drastic decrease in the incidence of new HBV infections in the country, particularly in Guangdong province (Xiao et al., 2012) in which the described outbreak of HCV infection occurred; and 2) possible saturation of HBV infection in the general population, as suggested by the relatively constant sero-prevalence of HBV markers (HBsAg and anti-HBs) in a Chinese IDU cohort over five years (Garten et al., 2004). Concerning the potential spread of HIV-1, this agent has shown 50–100 times lower infectivity than HBV and HCV and much lower prevalence in the general population in the country (WHO, 2011). Regardless, there was an outbreak of HIV transmission in neighboring Cambodia last year, which affected ⩾200 inhabitants of a rural village. An unlicensed doctor who reused syringes and other medical equipment was charged with spreading the infection. Unfortunately, in such a rural area of extreme poverty and lax law enforcement, it is not yet known how widespread the reuse of needles is or whether it will lead to further tragedies (http://www.nytimes.com/2015/01/20/world/asia/farming-village-in-cambodia-grieves-as-hundreds-learn-they-have-hiv.html?_r=0).
The transfusion of contaminated blood or blood products was the main route of HCV transmission in China before 2000. Since the implementation of volunteer blood donation and the outlawing of paid blood donation in 1998, IDU and unsafe medical practice have become the major transmission routes (Fu et al., 2012). Nevertheless, in the four consecutive years before and up to this 2012 HCV outbreak, the number of newly reported HCV-infected cases in China continued to surge, with annual increases of 21.58%, 16.07%, 13.61% and 15.96%, respectively, such that the total reported new cases increased from 108,446 in 2008 to 201,622 in 2012 (http://www.chinacdc.cn/tjsj/fdcrbbg/). Among these cases, a large number of patients may have acquired HCV infections in similar outbreaks that occurred in other regions. For example, an outbreak was reported in December 2011, in a region adjoining the Danchen Township of Guoyang County in Anhui Province and the Maqiao Township of Yongcheng City in Henan Province (http://hzdaily.hangzhou.com.cn/dskb/html/2011-12/01/content_1181001.htm?jdfwkey=yujoo3). More than 400 people were affected, including many children, all of whom had received injections in a clinic in which the doctors reused needles. A second outbreak occurred in November 2012 in Fengyang County, which is also in Anhui Province. Approximately 200 people were affected, all of whom had visited a clinic in the Chezhan Village of Banqiao Township. The doctors there reused needles and intravenous drip bottles (http://health.dir.groups.yahoo.com/neo/groups/hepcan/conversations/topics/42852). A similar event also occurred in June 2012 in Xuyong County of Sichuan Province, but the patient number remains unknown (http://news.lzep.cn/2012/0612/70228.html). Additional events are suggested by the high prevalence of anti-HCV reported in some populations. For example, in a village in Putian City of Fujian Province, the anti-HCV prevalence was as high as 29% among people ⩾2 years old. 
However, this prevalence appeared to not be linked to blood donation, transfusion or IDU (http://health.people.com.cn/GB/15521324.html). Based on our personal observations, a similar scenario may also characterize a region adjoining Shizong County and Luxi County in Yunnan province. In Zhehei Village and its surrounding region, we believe that an HCV epidemic of considerable scale has persisted for years, affecting nearly a thousand people, with many families having more than one patient. Most individuals affected have complained of receiving unsterilized injections from clinics in the village.
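The national case counts quoted earlier in this paragraph can be checked for internal consistency by compounding the four annual increases from the 2008 total. This is a minimal arithmetic sketch; it assumes that each percentage is a year-on-year increase relative to the preceding year (2009 through 2012), which the text implies but does not state explicitly.

```python
# Figures as quoted in the text.
REPORTED_2008 = 108_446
REPORTED_2012 = 201_622
ANNUAL_INCREASES_PCT = (21.58, 16.07, 13.61, 15.96)


def compound(start, increases_pct):
    """Apply successive percentage increases to a starting count."""
    total = float(start)
    for pct in increases_pct:
        total *= 1.0 + pct / 100.0
    return total


projected_2012 = compound(REPORTED_2008, ANNUAL_INCREASES_PCT)
relative_gap = abs(projected_2012 - REPORTED_2012) / REPORTED_2012
print(f"projected 2012 total: {projected_2012:,.0f}")
print(f"relative gap from reported total: {relative_gap:.5f}")
```

Under that assumption the compounded total agrees with the reported 2012 figure to within rounding of the percentages, so the quoted numbers are mutually consistent.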
These findings illustrate the continuing spread of HCV in Asian countries due to unsterile injections or contaminated medical equipment. The reports of other events in China are instructive for people in the western world who have started to believe that nosocomial HCV transmission belongs to the past. In fact, China is not the only country with infection control problems. Other countries such as India, Pakistan, and the central Asian countries of the former Soviet Union face the same problem, and the battle against HCV will not be easily won unless medical standards improve.
MATERIALS AND METHODS
Serum samples from studied patients
Serum samples from the studied patients were surplus to the requirements of the epidemiological investigation described above. After collection, these samples were sent to the Third Affiliated Hospital of Sun Yat-sen University to screen for anti-HCV antibodies (ELISA kit, Kehua Biotech Co. Ltd, Shanghai, China) and HCV RNA (Roche AmpliPrep/Cobas TaqMan HCV Assay version 2, Roche Molecular Systems, Pleasanton, CA), which identified 269 patients positive for both markers. Of these 269 patients, 193 lived in a small area (denoted BL) along Xiangshui Street, while 76 lived in the surrounding region (denoted ZJ). There were sufficient samples (one from each patient) for use in this study (Fig. 1) from 192 patients (87 males and 105 females), with an age range of 3 to 90 years and a mean age of 44.2 ± 18.4 years. Written informed consent was obtained from all patients. The ethical review committee of the Third Affiliated Hospital of Sun Yat-sen University approved this study based on the provisions of the Declaration of Helsinki.
Serum samples from local controls
As local controls (individuals who may have acquired HCV infection due to sporadic transmission other than in an outbreak), archived serum samples from 24 patients infected with HCV subtype 2a strains and 58 patients infected with subtype 6a strains were also used in this study (Fig. 1). We had previously collected these samples during 2009–2011 from patients who sought medication at the Third Affiliated Hospital of Sun Yat-sen University; both Core-E1 and NS5B sequences of HCV have recently been reported for these patients (Lu et al., 2013).
Serum samples for calibrating sequences
To calibrate the timescale over which HCV has circulated and diversified, sequences of 32 subtype 2a strains and 71 subtype 6a strains, for which the sampling dates are known, were used as calibrating references (Fig. S4–S5). Among them, 15 subtype 2a and 53 subtype 6a strains were sequenced in this study using archived serum samples that we had collected during 2002–2011 or that were kindly provided by our collaborators (Okamoto et al., 1991; Lu et al., 2005; Murphy et al., 2007; Fu et al., 2011, 2012; Gu et al., 2013; Li et al., 2014).
Sequence amplification and characterization
HCV sequences were amplified in six subgenomic regions, 5’UTR, Core, Core-E1, E1-E2, NS5A, and NS5B, using the protocols previously described (Lu et al., 2005) and the primers listed in Table S3. To obtain consensus sequences, all amplicons were directly sequenced in both directions. Any errors in base calling were corrected using the SeqMan program, and the sequences were edited using the EditSeq program followed by alignment using the MegAlign program (DNASTAR Inc., Madison, WI). To characterize the higher genetic variation in the E1-E2 region, selected amplicons were cloned followed by plasmid sequencing (Lu et al., 2008). All finalized sequences, together with reference sequences, were classified by phylogenetic analyses, for which the most appropriate substitution model, GTR+I+Γ4 (general time-reversible with invariable sites plus gamma), was selected using the jModelTest program (Posada, 2008) and the ML phylogenies were heuristically searched using the PhyML program (Guindon and Gascuel, 2003).
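For readers unfamiliar with the model name, the following is a sketch of the standard textbook definition of GTR+I+Γ4, not a description of settings specific to this study. The symbols (r_ij, π_j, p_inv, α, r_c, D_s) are notation assumed here for illustration and do not appear in the original text.

```latex
% GTR: symmetric exchangeabilities r_{ij} = r_{ji} and equilibrium
% base frequencies \pi_j define the instantaneous rate matrix Q.
Q_{ij} =
\begin{cases}
  r_{ij}\,\pi_j, & i \neq j,\\[4pt]
  -\sum_{k \neq i} Q_{ik}, & i = j,
\end{cases}
\qquad i, j \in \{\mathrm{A}, \mathrm{C}, \mathrm{G}, \mathrm{T}\}

% +I and +\Gamma_4: a proportion p_inv of sites is invariable; the rest
% evolve at one of four equally probable relative rates r_1, ..., r_4
% taken from a discretized gamma distribution with shape \alpha and mean 1.
L(D_s) = p_{\mathrm{inv}}\, L(D_s \mid r = 0)
       + \left(1 - p_{\mathrm{inv}}\right) \frac{1}{4} \sum_{c=1}^{4} L(D_s \mid r = r_c)
```

Here L(D_s) denotes the likelihood of the alignment column at site s; the tree likelihood is the product of these terms over all sites.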
Evolutionary analysis
Sequences spanning the entire E1 and partial NS5A regions were assembled by HCV subtype, with the inclusion of local controls and calibrating references. Evolutionary analyses were performed using the Bayesian MCMC (Markov chain Monte Carlo) algorithm implemented in the BEAST package (version 1.6.1) (Drummond & Rambaut, 2007) under the default prior settings, except that the GTR+I+Γ substitution model and the Bayesian skyline coalescent model were selected in combination with one of three clock models: exponential, lognormal, or strict. For the exponential model, the evolutionary rate was first estimated from the calibrating sequences and then included as a prior in the subsequent analysis of a given dataset. For the latter two models, however, the analyses used the prior evolutionary rates (Table S2). In setting the Bayesian skyline coalescent model, different prior numbers were tested for different datasets using different clock models until all posteriors reached an effective sample size (ESS) of ≥200. The MCMC analysis was run for 100–500 million steps, outputting a tree every 20,000–30,000 steps. The Tracer program (version 1.5) was used to interpret the chains, to calculate the Bayes factor for comparing the analyses under different models, to summarize the tMRCAs (times to the most recent common ancestor), and to delimit the 95% HPD (highest posterior density) credible intervals. Lastly, we used the TreeAnnotator program (version 1.6.1) to summarize the resulting set of credible trees as a maximum clade credibility (MCC) tree. Since a molecular clock was incorporated, the branch lengths and the tree node heights were marked in units of years. Phylogenetic structure was then displayed using the FigTree program, in which clades, lineages and internal node heights were indicated as required.
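As an illustration of two quantities mentioned above, the following minimal Python sketch shows how a 95% HPD interval can be obtained from posterior samples (the narrowest window covering 95% of the sorted values) and how many states a chain yields for a given sampling interval. It is not the Tracer or BEAST implementation; the function names hpd_interval and sampled_states are hypothetical, and the sample values in the usage lines are toy numbers rather than results from this study.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval (lo, hi) covering `mass` of the samples.

    Illustrative only: assumes a unimodal posterior summarized by a plain
    list of numeric samples (for example, sampled tMRCA values in years).
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must contain; the small tolerance
    # guards against floating-point noise in mass * n.
    k = min(n, max(1, math.ceil(mass * n - 1e-9)))
    best_lo, best_hi = ordered[0], ordered[k - 1]
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = ordered[i], ordered[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


def sampled_states(chain_length, sample_every):
    """Number of states written when logging every `sample_every` steps."""
    return chain_length // sample_every


# Toy usage: a skewed sample whose outlier falls outside the 80% interval.
print(hpd_interval([0, 1, 2, 3, 100], mass=0.8))
# Chain settings at the two ends of the ranges stated in the text.
print(sampled_states(100_000_000, 20_000), sampled_states(500_000_000, 30_000))
```

The sliding-window approach is the usual empirical estimate for a unimodal posterior; a multimodal posterior would need a different summary.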
Nucleotide sequence accession numbers
The nucleotide sequences reported in this study were deposited in GenBank with the following accession numbers: KJ678155-KJ678975, KJ700856-KJ700861, and KM502262-KM502265 (Fig. S4–S5).
Supplementary Material
Highlights.
We studied linked cases in an HCV outbreak that occurred in February 2012 in China.
We consistently characterized subtype 2a and 6a isolates from 60 and 102 cases.
They were descendants of 8 earlier strains closely related and otherwise sporadic.
Evolutionary analysis estimated the common ancestors 2–5 years before sampling.
Due to unsafe injections, 5 strains were selected causing linked transmission.
Acknowledgments
Financial support: The study described was supported by a grant from the National Institute of Allergy and Infectious Diseases (5 R01 AI080734). The funding agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Footnotes
Conflict of interest: None reported.
REFERENCES
- Alter MJ. Prevention of spread of hepatitis C. Hepatology. 2002;36:S93–S98. doi: 10.1053/jhep.2002.36389.
- An Y, Wu T, Wang M, Lu L, Li C, Zhou Y, Fu Y, Chen G. Conservation in China of a novel group of HCV variants dating to six centuries ago. Virology. 2014;464–465C:21–25. doi: 10.1016/j.virol.2014.06.011.
- Ansaldi F, Bruzzone B, Salmaso S, Rota MC, Durando P, Gasparini R, Icardi G. Different seroprevalence and molecular epidemiology patterns of hepatitis C virus infection in Italy. J. Med. Virol. 2005;76:327–332. doi: 10.1002/jmv.20376.
- Birch CJ, McCaw RF, Bulach DM, Revill PA, Carter JT, Tomnay J, Hatch B, Middleton TV, Chibo D, Catton MG, Pankhurst JL, Breschkin AM, Locarnini SA, Bowden DS. Molecular analysis of human immunodeficiency virus strains associated with a case of criminal transmission of the virus. J. Infect. Dis. 2000;182:941–944. doi: 10.1086/315751.
- Bracho MA, Gosalbes MJ, Blasco D, Moya A, Gonzalez-Candelas F. Molecular epidemiology of a hepatitis C virus outbreak in a hemodialysis unit. J. Clin. Microbiol. 2005;43:2750–2755. doi: 10.1128/JCM.43.6.2750-2755.2005.
- Dal Molin G, Ansaldi F, Biagi C, D’Agaro P, Comar M, Crocè L, Tiribelli C, Campello C. Changing molecular epidemiology of hepatitis C virus infection in Northeast Italy. J. Med. Virol. 2002;68:352–356. doi: 10.1002/jmv.10210.
- Di Lello FA, Farias AA, Culasso AC, Pérez PS, Pisano MB, Contigiani MS, Campos RH, Ré VE. Changing epidemiology of hepatitis C virus genotypes in the central region of Argentina. Arch. Virol. 2015;160:909–915. doi: 10.1007/s00705-015-2390-6.
- Dore GJ, Law M, MacDonald M, Kaldor JM. Epidemiology of hepatitis C virus infection in Australia. J. Clin. Virol. 2003;26:171–184. doi: 10.1016/s1386-6532(02)00116-6.
- Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 2007;7:214. doi: 10.1186/1471-2148-7-214.
- Esteban J, Gómez J, Martell M, Cabot B, Quer J, Camps J, González A, Otero T, Moya A, Esteban R, Guardia J. Transmission of hepatitis C virus by a cardiac surgeon. N. Engl. J. Med. 1996;334:555–560. doi: 10.1056/NEJM199602293340902.
- Fu Y, Xia W, Wang Y, Pybus OG, Lu L, Nelson K. New trends of HCV infection in China revealed by genetic analysis of first-time volunteer blood donors. J. Viral Hepat. 2011;18:42–52. doi: 10.1111/j.1365-2893.2010.01280.x.
- Fu Y, Qin W, Cao H, Xu R, Tan Y, Lu T, Chen G. HCV 6a prevalence in Guangdong province had the origin from Vietnam and recent dissemination to other regions of China: a phylogeography analysis. PLoS One. 2012;7:e28006. doi: 10.1371/journal.pone.0028006.
- Garten RJ, Lai S, Zhang J, Liu W, Chen J, Vlahov D, Yu XF. Rapid transmission of hepatitis C virus among young injecting heroin users in Southern China. Int. J. Epidemiol. 2004;33:182–188. doi: 10.1093/ije/dyh019.
- González-Candelas F, Bracho MA, Wróbel B, Moya A. Molecular evolution in court: analysis of a large hepatitis C virus outbreak from an evolving source. BMC Biology. 2013;11:76. doi: 10.1186/1741-7007-11-76.
- Gu L, Tong W, Yuan M, Lu T, Li C, Lu L. An increased diversity of HCV isolates were characterized among 393 patients with liver disease in China representing six genotypes, 12 subtypes, and two novel genotype 6 variants. J. Clin. Virol. 2013;57:311–317. doi: 10.1016/j.jcv.2013.04.013.
- Guadagnino V, Stroffolini T, Rapicetta M, et al. Prevalence, risk factors, and genotype distribution of hepatitis C virus infection in the general population: a community-based survey in Southern Italy. Hepatology. 1997;26:1006–1011. doi: 10.1002/hep.510260431.
- Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 2003;52:696–704. doi: 10.1080/10635150390235520.
- Hauri AM, Armstrong GL, Hutin YJF. The global burden of disease attributable to contaminated injections given in health care settings. Int. J. STD AIDS. 2004;15:7–16. doi: 10.1258/095646204322637182.
- He J, Gu D, Wu X, Reynolds K, Duan X, Yao C, Wang J, Chen CS, Chen J, Wildman RP, Klag MJ, Whelton PK. Major causes of death among men and women in China. N. Engl. J. Med. 2005;353:1124–1134. doi: 10.1056/NEJMsa050467.
- Kiyosawa K, Tanaka E, Sodeyama T, Yoshizawa K, Yabu K, Furuta K, Imai H, Nakano Y, Usuda S, Uemura K, et al. Transmission of hepatitis C in an isolated area of Japan: community-acquired infection. Gastroenterology. 1994;106:1596–1602. doi: 10.1016/0016-5085(94)90416-2.
- Kato T, Furusaka A, Miyamoto M, Date T, Yasui K, Hiramoto J, Nagayama K, Tanaka T, Wakita T. Sequence analysis of hepatitis C virus isolated from a fulminant hepatitis patient. J. Med. Virol. 2001;64:334–339. doi: 10.1002/jmv.1055.
- Kurihara C, Ishiyama N, Nishiyama Y, Fukushi S, Kageyama T, Katayama K. Molecular characterization of hepatitis C virus genotype 2a from the entire sequences of four isolates. J. Med. Virol. 2001;64:466–475. doi: 10.1002/jmv.1073.
- Li C, Yuan M, Lu L, Lu T, Xia W, Pham VH, Vo AX, Nguyen MH, Abe K. The genetic diversity and evolutionary history of hepatitis C virus in Vietnam. Virology. 2014;468–470C:197–206. doi: 10.1016/j.virol.2014.07.026.
- Lu L, Nakano T, He Y, Fu Y, Hagedorn CH, Robertson BH. Hepatitis C virus genotype distribution in China: predominance of closely related subtype 1b isolates and existence of new genotype 6 variants. J. Med. Virol. 2005;75:538–549. doi: 10.1002/jmv.20307.
- Lu L, Tastunori N, Li C, Gao F, Robertson BH. Selection of HCV strains and evolution of HVR1 variants in a chimpanzee chronically infected with HCV-1 over 12 years. Hepatol. Res. 2008;38:704–716. doi: 10.1111/j.1872-034X.2008.00320.x.
- Lu L, Tong W, Li C, Gu L, Lu T, Tee KK, Chen G. The current HCV prevalence in China may have mainly resulted from an officially encouraged plasma campaign in the 1990s: a coalescence inference with genetic sequences. J. Virol. 2013;87:12041–12050. doi: 10.1128/JVI.01773-13.
- Lu L, Xu R, Huang J, Yuan M, Huang K, Wang M, Zhang X, Xiong H, Nakano T, Bennett P, Fu Y. The migration patterns of HCV in China characterized for five major subtypes based on 411 volunteer blood donors sampled in 17 provinces/municipalities. J. Virol. 2014;88:7120–7129. doi: 10.1128/JVI.00414-14.
- Metzker ML, Mindell DP, Liu XM, Ptak RG, Gibbs RA, Hillis DM. Molecular evidence of HIV-1 transmission in a criminal case. Proc. Natl. Acad. Sci. USA. 2002;99:14292–14297. doi: 10.1073/pnas.222522599.
- Murphy DG, Willems B, Deschenes M, Hilzenrat N, Mousseau R, Sabbah S. Use of sequence analysis of the NS5B region for routine genotyping of hepatitis C virus with reference to C/E1 and 5’ untranslated region sequences. J. Clin. Microbiol. 2007;45:1102–1112. doi: 10.1128/JCM.02366-06.
- Noppornpanth S, Poovorawan Y, Lien TX, Smits SL, Osterhaus AD, Haagmans BL. Complete genome analysis of hepatitis C virus subtypes 6t and 6u. J. Gen. Virol. 2008;89:1276–1281. doi: 10.1099/vir.0.83593-0.
- Okamoto H, Kurai K, Okada S, Yamamoto K, Iizuka H, Tanaka T, Fukuda S, Tsuda F, Mishiro S. Full-length sequence of a hepatitis C virus genome having poor homology to reported isolates: comparative study of four distinct genotypes. Virology. 1992;188:331–341. doi: 10.1016/0042-6822(92)90762-e.
- Okamoto H, Okada S, Sugiyama Y, Kurai K, Iizuka H, Machida A, Miyakawa Y, Mayumi M. Nucleotide sequence of the genomic RNA of hepatitis C virus isolated from a human carrier: comparison with reported isolates for conserved and divergent regions. J. Gen. Virol. 1991;72:2697–2704. doi: 10.1099/0022-1317-72-11-2697.
- Okayama A, Stuver SO, Tabor E, et al. Incident hepatitis C virus infection in a community-based population in Japan. J. Viral Hepat. 2002;9:43–51. doi: 10.1046/j.1365-2893.2002.00331.x.
- Ou CY, Ciesielski CA, Myers G, Bandea CI, Luo CC, Korber BT, Mullins JI, Schochetman G, Berkelman RL, Economou AN, Witte JJ, Furman LJ, Satten GA, MacInnes KA, Curran JW, Jaffe HW; Laboratory Investigation Group; Epidemiologic Investigation Group. Molecular epidemiology of HIV transmission in a dental practice. Science. 1992;256:1165–1171. doi: 10.1126/science.256.5060.1165.
- Penin F, Dubuisson J, Rey FA, Moradpour D, Pawlotsky JM. Structural biology of hepatitis C virus. Hepatology. 2004;39:5–19. doi: 10.1002/hep.20032.
- Posada D. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 2008;25:1253–1256. doi: 10.1093/molbev/msn083.
- Ré V, Lampe E, Yoshida CF, de Oliveira JM, Lewis-Ximénez L, Spinsanti L, Elbarcha O, Contigiani M. Hepatitis C virus genotypes in Córdoba, Argentina. Unexpected high prevalence of genotype 2. Medicina (B Aires). 2003;63:205–210.
- Ré VE, Culasso AC, Mengarelli S, Farías AA, Fay F, Pisano MB, Elbarcha O, Contigiani MS, Campos RH. Phylodynamics of hepatitis C virus subtype 2c in the province of Córdoba, Argentina. PLoS One. 2011;6:e19471. doi: 10.1371/journal.pone.0019471.
- Saksena NK, Wang B, Steain M, Yang RG, Zhang LQ. Snapshot of HIV pathogenesis in China. Cell Res. 2005;15:953–961. doi: 10.1038/sj.cr.7290373.
- Shepard CW, Finelli L, Alter MJ. Global epidemiology of hepatitis C virus infection. Lancet Infect. Dis. 2005;5:558–567. doi: 10.1016/S1473-3099(05)70216-4.
- Simmonds P, Bukh J, Combet C, Deleage G, Enomoto N, Feinstone S, Halfon P, Inchauspé G, Kuiken C, Maertens G, Mizokami M, Murphy DG, Okamoto H, Pawlotsky JM, Penin F, Sablon E, Shin-I T, Stuyver LJ, Thiel HJ, Viazov S, Weiner AJ, Widell A. Consensus proposals for a unified system of nomenclature of hepatitis C virus genotypes. Hepatology. 2005;42:962–973. doi: 10.1002/hep.20819.
- Tokita H, Shrestha SM, Okamoto H, Sakamoto M, Horikita M, Iizuka H, Shrestha S, Miyakawa Y, Mayumi M. Hepatitis C virus variants from Nepal with novel genotypes and their classification into the third major group. J. Gen. Virol. 1994;75:931–936. doi: 10.1099/0022-1317-75-4-931.
- Wasley A, Alter M. Epidemiology of hepatitis C: geographic differences and temporal trends. Semin. Liver Dis. 2000;20:1–16. doi: 10.1055/s-2000-9506.
- World Health Organization. Sexually transmitted infections. [Accessed 5 October 2011];Fact sheet NO 110. Available: http://www.who.int/mediacentre/factsheets/fs110/en/index.html.
- Xiao J, Zhang J, Wu C, Shao X, Peng G, Peng Z, Ma W, Zhang Y, Zheng H. Impact of hepatitis B vaccination among children in Guangdong Province, China. Int. J. Infect. Dis. 2012;16:e692–e696. doi: 10.1016/j.ijid.2012.05.1027.
- Yerly S, Quadri R, Negro F, Barbe KP, Cheseaux JJ, Burgisser P, Siegrist CA, Perrin L. Nosocomial outbreak of multiple bloodborne viral infections. J. Infect. Dis. 2001;184:369–372. doi: 10.1086/322036.
- Zhou X, Chan PK, Tam JS, Tang JW. A possible geographic origin of endemic hepatitis C virus 6a in Hong Kong: evidences for the association with Vietnamese immigration. PLoS One. 2011;6:e24889. doi: 10.1371/journal.pone.0024889.