Abstract
In the late 1980s an HIV-1 epidemic emerged in Romania that was dominated by subtype F1. The main route of infection is believed to be parenteral transmission in children. We sequenced partial pol coding regions of 70 subtype F1 samples from children and adolescents from the PENTA-EPPICC network, of which 67 were from Romania. Phylogenetic reconstruction using these sequences and other publicly available global subtype F sequences showed that 79% of Romanian F1 sequences formed a statistically robust monophyletic cluster. The monophyletic cluster was epidemiologically linked to parenteral transmission in children. Coalescent-based analysis dated the origins of the parenteral epidemic to 1983 [1981–1987; 95% HPD]. The analysis also shows that the epidemic's effective population size has remained fairly constant since the early 1990s, suggesting limited onward spread of the virus within the population. Furthermore, phylogeographic analysis suggests that the root location of the parenteral epidemic was Bucharest.
The use of unsterilized needles and syringes in a healthcare setting led to the emergence of an HIV-1 epidemic in Romania involving approximately 10,000 babies and children in the late 1980s.1–4 A distinctive feature of the Romanian epidemic was that it was dominated by subtype F HIV-1, which accounts for less than 1% of the infections worldwide, with the majority being found in Africa and South America.5–8 Subtype F is further divided into two sub-subtypes, namely F1 and F2.9 The Romanian epidemic was caused by subtype F1. In Europe, subtype F is rarely observed outside of Romania suggesting that the Romanian F1 epidemic might have originated from elsewhere.10–12
A recent study showed that F1 sequences sampled in Romania are closely related to those from Angola and dated the time of the most recent common ancestor (tMRCA) of the Romanian sequences to 1978.12 The study also revealed that the Romanian epidemic is composed of two distinct strains of viruses: one consisting of sequences sampled mainly from children and the second consisting mostly of adult sequences. Here, we examine in detail the phylodynamic and phylogeographic patterns of the Romanian parenteral epidemic in children using new sequences with linked epidemiological data collected over a 6-year period as part of a EuroCoord-CHAIN study. EuroCoord-CHAIN is a large European collaboration of HIV observational cohorts (EuroCoord), including the PENTA-EPPICC network (Paediatric European Network for Treatment of AIDS-European Pregnancy and Paediatric HIV Cohort Collaboration), and the European Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN).
As part of the study we performed population-based sequencing of part of the pol gene of 176 samples from HIV-1-infected children and adolescents in the PENTA-EPPICC network, collected between 2002 and 2008. Seventy of the samples were from children infected with HIV-1 subtype F1, of which 67 were from Romania. To investigate the genetic relationship of the subtype F1 sequences from Romania to worldwide subtype F sequences, we compiled a dataset that included 57 subtype F1 sequences we genotyped plus 616 subtype F1 and F2 sequences from different geographic regions of the world that were downloaded from the Los Alamos HIV sequence database (Table 1). Where multiple sequences from the same patient or identical sequences were present, only the most recent sequence was used for the analysis.
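The deduplication rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` type, its field names, and the toy sequences are hypothetical and are not taken from the study data, and the order in which the two filters are applied (per patient first, then per identical sequence) is one plausible reading of the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical field names, chosen for illustration only.
    patient_id: str
    year: int
    sequence: str


def keep_most_recent(records, key):
    """Return one record per key value, preferring the latest sampling year."""
    latest = {}
    for rec in records:
        kept = latest.get(key(rec))
        if kept is None or rec.year > kept.year:
            latest[key(rec)] = rec
    return list(latest.values())


def deduplicate(records):
    """Keep the most recent record per patient, then per identical sequence."""
    per_patient = keep_most_recent(records, key=lambda r: r.patient_id)
    per_sequence = keep_most_recent(per_patient, key=lambda r: r.sequence)
    return sorted(per_sequence, key=lambda r: (r.patient_id, r.year))


# Toy input: P1 was sampled twice, and P1 and P2 share an identical sequence.
records = [
    Record("P1", 2003, "ACGT"),
    Record("P1", 2006, "ACGA"),
    Record("P2", 2005, "ACGA"),
    Record("P3", 2004, "TTGA"),
]
kept = deduplicate(records)
```

In this toy input the 2003 sample from P1 is dropped in favor of the 2006 sample, and the P2 sequence is dropped because it is identical to the more recent P1 sequence.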
Table 1.
Country of sampling | Subtype F1, EuroCoord (a) | Subtype F1, Los Alamos (b) | Subtype F2, Los Alamos (b) |
---|---|---|---|
Romania | 54 | 338 | |
Brazil | 157 | ||
Angola | 22 | ||
Belgium | 7 | 2 | |
Spain | 4 | 2 | |
France | 1 | 4 | 1 |
Germany | 1 | 1 | |
Italy | 24 | ||
Portugal | 3 | ||
Czech Republic | 5 | ||
Russia | 3 | ||
Great Britain | 2 | ||
Luxembourg | 1 | ||
Finland | 1 | ||
Switzerland | 1 | ||
Austria | 1 | ||
Slovenia | 1 | ||
Argentina | 3 | ||
United States | 3 | ||
Cuba | 1 | ||
Honduras | 1 | ||
Cameroon | 1 | 23 | |
DRC | 1 | 1 | |
Equatorial Guinea | 1 | ||
Republic of Congo | 1 | ||
Mozambique | 1 | ||
Total | 57 | 585 | 31 |
(a) Sequences from the EuroCoord/PENPACT1 study generated by this study.
(b) Sequences downloaded from the Los Alamos HIV sequence database.
The sequences were aligned using the Clustal W algorithm in the MEGA4 software13 and were manually edited and trimmed to encompass only the first 858 nucleotides of the reverse transcriptase region of the HIV-1 pol gene. A preliminary phylogenetic analysis of the sequence alignment was performed using a rapid parallelized maximum-likelihood (ML) inference method (RAxML)14 with a general time-reversible (GTR) model of nucleotide substitution. This analysis showed that the sequences generally segregated into three main groups of subtype F1 sequences and one group of subtype F2 sequences (Fig. 1).
The first group of F1 sequences contained a majority of sequences from Romania (390 out of 411; in red in Fig. 1). The other two F1 groups contained a majority of sequences from Angola (19 out of 21; in green in Fig. 1) and Brazil (157 out of 197; in yellow in Fig. 1), respectively. The three F1 clusters had high bootstrap support values of 84%, 100%, and 75%, respectively, as did the F2 cluster that had a bootstrap value of 100%. A sequence from the Republic of Congo was at the base of the F1 and F2 clusters indicating that the origin of the global subtype F epidemic was in Central Africa, which concurs with previous observations.11,12 Thus, this initial analysis suggests that most of the Romanian F1 epidemic is composed of lineages that cluster separately from other worldwide subtype F1 lineages.
To investigate the evolution and origins of the Romanian subtype F1 epidemic further, we performed a more robust phylogenetic reconstruction using Bayesian Markov chain Monte Carlo (MCMC) methods as implemented in version 1.6.1 of the BEAST program.15,16 To undertake this analysis, we reduced the number of sequences in our dataset to 128 using the following criteria: (1) sequences without drug resistance mutations or from treatment-naive individuals, and (2) sequences with known year of sampling. The new dataset included 52 sequences from Romania, of which 38 were from the EuroCoord-CHAIN study.
An alignment of the dataset was then used for phylogenetic analyses using Bayesian MCMC inference under the Hasegawa, Kishino, and Yano (HKY) model of substitution. All the Bayesian MCMC analyses carried out in this study were preceded by a molecular clock and demographic model comparison using Bayes factor (with a BF >20 level of significance) to determine the appropriate models to use (data not shown). A Bayesian-inferred phylogeny was estimated from one MCMC run of 50,000,000 generations with sampling every 5,000th generation and a 10% burn-in under a relaxed molecular clock and constant population size. The resulting maximum clade credibility (MCC) tree showed three main clusters of F1 sequences each containing a majority of sequences from Romania, Angola, and Brazil (Fig. 2; Clusters I, II, and III, respectively) and a cluster of F2 sequences (Cluster IV) with posterior probabilities >0.9.
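The sampling scheme stated above fixes how many posterior samples the MCC tree is built from. The short Python sketch below works through that arithmetic; the function name is ours, and the count ignores the initial state of the chain, which some programs also write to the log.

```python
def mcmc_sample_counts(chain_length, sample_every, burn_in_percent):
    """Return (total, discarded, retained) sample counts for one MCMC run."""
    total = chain_length // sample_every
    discarded = total * burn_in_percent // 100
    return total, discarded, total - discarded


# Values taken from the text: 50,000,000 generations, sampling every
# 5,000th generation, 10% burn-in.
total, discarded, retained = mcmc_sample_counts(50_000_000, 5_000, 10)
print(total, discarded, retained)
```

Under these settings the run yields 10,000 logged samples, of which 1,000 are discarded as burn-in and 9,000 are retained.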
This clustering of F sequences was similar to that observed using the rapid parallelized ML method (Fig. 1). Of particular note, 79% (41 out of 52) of the Romanian F1 sequences were part of the highly supported monophyletic cluster (Cluster I) with a posterior probability of 1. The cluster included three non-Romanian sequences from Austria, Germany, and the United States. Interestingly, all the sequences within the monophyletic cluster were from children, whereas the sequence at the base of the cluster was from an adult patient (labeled “a†” in Fig. 2). Furthermore, the ancestral node of the monophyletic cluster and the adult sequence at the base of the cluster was highly supported with a posterior probability of 0.99. The adult sequence was part of a group of nine sequences, eight of which were from Romania and one from France. Of the eight Romanian sequences, five were from adults.
This phylogenetic analysis suggests that a major bottleneck event might have occurred resulting in the establishment of the Romanian monophyletic cluster from one or a few closely related lineages. Data on the route of transmission were available for the samples within the EuroCoord-CHAIN study as well as some of the samples downloaded from the Los Alamos HIV sequence database.
When this information was mapped onto the monophyletic cluster, it showed that the route of transmission was parenteral for 37 out of 41 of the Romanian sequences in Cluster I. The routes of transmission for the remaining Romanian sequences in Cluster I were blood transfusion (1/41), perinatal (1/41), heterosexual contact (1/41), and unknown (1/41). In contrast, only two out of the remaining 13 Romanian sequences were transmitted parenterally and both sequences came from children. The remaining sequences from children in this group were acquired perinatally (2/5) or through an unknown route (1/5). Two of the adult sequences in the group were acquired sexually, whereas the remaining adult sequences had unknown routes of transmission. Taken together, these data indicate that sequences in Cluster I represent the Romanian F1 parenteral epidemic in children and that the founder of Cluster I was probably a lineage introduced into the adult population in Romania and/or Europe at an earlier time.
The Bayesian-inferred phylogeny also showed that the Angolan sequences (Cluster II) were nested between a group of sequences from Romania and Europe, suggesting that the Romanian sequences are more closely related to the Angolan than to the Brazilian sequences (Fig. 2). This suggests an African origin for the HIV-1 subtype F1 lineage in Romania; however, this observation had weak statistical support (posterior probability of 0.48 for the closest ancestral node linking the Angolan and Romanian lineages).
Next, the history of the parenteral F1 epidemic in Romania was reconstructed using the program BEAST. The tMRCA of the parenteral cluster was estimated to be 1983 [1981–1987; 95% highest posterior density (HPD)]. Demographic reconstruction under a strict molecular clock and a Bayesian Skyline Plot (BSP) demographic model showed that the number of effective infections (Ne) for the parenteral epidemic, a number that represents infections contributing to new transmissions, grew exponentially until the early 1990s from an initial median estimate of 7 [2–38; 95% HPD] in 1983. However, the effective population size has remained constant from the early 1990s to 2008, the final year of sampling, with a final median estimate of 465 [193–1182; 95% HPD], which is approximately 5% of the estimated 10,000 children infected parenterally (Fig. 3). This suggests that very little onward spread through the parenteral route or by other means has occurred since the early 1990s. The analysis also revealed that the parenteral epidemic viruses have evolved at a slightly slower rate than those of other HIV-1 epidemics, with an estimated rate of 1.1×10⁻³ [0.7–1.5×10⁻³; 95% HPD] substitutions per site per year.17
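The "approximately 5%" figure follows directly from the numbers quoted above. As a minimal worked check in Python (the variable names are ours, and the same division is applied to the HPD bounds purely for illustration):

```python
# Figures quoted in the text: final Ne estimate in 2008 and the estimated
# number of children infected parenterally.
ESTIMATED_INFECTED = 10_000
ne_2008 = {"median": 465, "lower": 193, "upper": 1182}

# Express each Ne value as a percentage of the estimated infected children.
percent_of_infected = {
    label: 100 * value / ESTIMATED_INFECTED for label, value in ne_2008.items()
}

for label, pct in percent_of_infected.items():
    print(f"{label}: {pct:.2f}%")
```

The median corresponds to 4.65% of the estimated infected children, and the HPD bounds correspond to roughly 2% and 12%.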
Lastly, we estimated the phylogeographic pattern of the subtype F1 epidemic in Romania. Epidemiological data were available for the EuroCoord-CHAIN study as well as some of the sequences downloaded from the Los Alamos HIV sequence database, which included the county of origin. The data showed that the children were mostly infected in their first year of life; therefore the county of birth of the child was ascribed as the origin of the sequences.
The sequences originated from 16 counties in Romania (Fig. 4). This information was used to estimate the geographic origin of the ancestral node of the parenteral epidemic using a Bayesian inference framework, which models spatial diffusion of time-stamped lineages as a continuous-time Markov chain (CTMC) process over discrete sampling locations, as implemented in the BEAST software.18 Using a relaxed molecular clock model and a BSP demographic model, we show that Bucharest is the most probable location of the ancestral node subtending the parenteral cluster (root state probability of 0.47; Fig. 4). Only two other locations showed a root state probability greater than the prior probability of 0.0625 (1/16), these being Giurgiu (0.25) and Mures (0.08).
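The comparison with the prior can be made explicit with a short Python sketch. It assumes, as the quoted value of 0.0625 implies, a uniform prior over the 16 sampled counties; only the three root state probabilities reported in the text are included, and the probabilities for the other 13 counties are not given here.

```python
N_LOCATIONS = 16
prior = 1 / N_LOCATIONS  # uniform prior over the sampled counties: 0.0625

# Root state probabilities reported in the text.
reported_posterior = {"Bucharest": 0.47, "Giurgiu": 0.25, "Mures": 0.08}

# Ratio of posterior to prior for each location that exceeds the prior.
fold_over_prior = {
    county: round(p / prior, 2)
    for county, p in reported_posterior.items()
    if p > prior
}
print(prior, fold_over_prior)
```

Bucharest's root state probability is about 7.5 times the prior, whereas that of Mures is only marginally above it.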
Of note, the rest of the parenteral cluster showed no clear spatial structure with the major ancestral nodes having undetermined root state locations. This finding was not surprising considering the route of transmission and the constant population dynamics. In contrast, the ancestral nodes at the base of the parenteral cluster that includes sequences sampled from the adult population indicate Bucharest as the predominant location. However, it is worth noting that these results could have been influenced by the high number of sequences from Bucharest in this group.
In summary, we make three important observations. First, we show that a large part of the Romanian subtype F1 epidemic is represented by a subpopulation of viruses sampled from children who mostly acquired the infection via the parenteral route as early as 1983, 5 years after the estimated introduction of subtype F1 into Romania.12 This is also in keeping with the first reports of the HIV-1 F1 epidemic in Romanian children in 1989.1
Second, we show that the parenteral epidemic has mostly maintained a constant population size after an initial exponential growth period that lasted up to the beginning of the 1990s. This suggests that very little parenteral transmission or onward spread of the virus within this population has occurred since the early 1990s. This can be rationalized by the fact that the infections occurred in children, who are unlikely to be engaged in high-risk behavior, and that public health measures to control the spread of the epidemic initiated in the early 1990s were a major success.
Lastly, phylogeographic analysis indicated that the root state location of the epidemic in the early 1980s was the adult population in Bucharest. However, it is possible that the location does not represent the ultimate source of the parenteral epidemic, as the children could have relocated within Romania or worldwide since acquiring the infection; such relocation occurred after the fall of the Communist regime in 1989. This could explain the presence of the three non-Romanian sequences within the monophyletic cluster from the United States, Germany, and Austria (Fig. 2). Further analysis with increased sampling of F1 sequences from both children and adult patients in the other significantly affected counties in Romania will need to be undertaken to obtain a clear phylogeographic picture of the Romanian parenteral epidemic.
Sequence Data
GenBank accession numbers of the subtype F1 HIV-1 sequences from EuroCoord-CHAIN generated and used in this study are JQ280885–JQ280942.
Acknowledgments
We especially thank Hannah Castro, David Dunn, Ali Judd, and the MRC Clinical Trials Unit for administrative and technical support for the EuroCoord-CHAIN study. We also thank all the members of the EuroCoord-CHAIN study group and all the patients, staff, and project management staff of the cohorts participating in the EuroCoord-CHAIN study, in particular those from the PENTA-EPPICC network. The research leading to these results has received part funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under the project “Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN),” grant agreement no. 223131.
Author Disclosure Statement
No competing financial interests exist.
References
- 1. Patrascu IV. Constantinescu SN. Dublanchet A. HIV-1 infection in Romanian children. Lancet. 1990;335(8690):672. doi: 10.1016/0140-6736(90)90466-i.
- 2. Hersh BS. Popovici F. Apetrei RC, et al. Acquired immunodeficiency syndrome in Romania. Lancet. 1991;338(8768):645–649. doi: 10.1016/0140-6736(91)91230-r.
- 3. Hersh BS. Popovici F. Jezek Z, et al. Risk factors for HIV infection among abandoned Romanian children. AIDS. 1993;7(12):1617–1624. doi: 10.1097/00002030-199312000-00012.
- 4. Patrascu IV. Dumitrescu O. The epidemic of human immunodeficiency virus infection in Romanian children. AIDS Res Hum Retroviruses. 1993;9(1):99–104. doi: 10.1089/aid.1993.9.99.
- 5. Apetrei C. Necula A. Holm-Hansen C, et al. HIV-1 diversity in Romania. AIDS. 1998;12(9):1079–1085.
- 6. Hemelaar J. Gouws E. Ghys PD. Osmanov S. Global and regional distribution of HIV-1 genetic subtypes and recombinants in 2004. AIDS. 2006;20(16):W13–W23. doi: 10.1097/01.aids.0000247564.73009.bc.
- 7. Apetrei C. Loussert-Ajaka I. Collin G, et al. HIV type 1 subtype F sequences in Romanian children and adults. AIDS Res Hum Retroviruses. 1997;13(4):363–365. doi: 10.1089/aid.1997.13.363.
- 8. Dumitrescu O. Kalish ML. Kliks SC. Bandea CI. Levy JA. Characterization of human immunodeficiency virus type 1 isolates from children in Romania: Identification of a new envelope subtype. J Infect Dis. 1994;169(2):281–288. doi: 10.1093/infdis/169.2.281.
- 9. Robertson DL. Anderson JP. Bradac JA, et al. HIV-1 nomenclature proposal. Science. 2000;288(5463):55–56. doi: 10.1126/science.288.5463.55d.
- 10. Bandea CI. Ramos A. Pieniazek D, et al. Epidemiologic and evolutionary relationships between Romanian and Brazilian HIV-subtype F strains. Emerg Infect Dis. 1995;1(3):91–93. doi: 10.3201/eid0103.950305.
- 11. Guimaraes ML. Vicente AC. Otsuki K, et al. Close phylogenetic relationship between Angolan and Romanian HIV-1 subtype F1 isolates. Retrovirology. 2009;6:39. doi: 10.1186/1742-4690-6-39.
- 12. Mehta SR. Wertheim JO. Delport W, et al. Using phylogeography to characterize the origins of the HIV-1 subtype F epidemic in Romania. Infect Genet Evol. 2011;11(5):975–979. doi: 10.1016/j.meegid.2011.03.009.
- 13. Tamura K. Dudley J. Nei M. Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007;24(8):1596–1599. doi: 10.1093/molbev/msm092.
- 14. Stamatakis A. Ludwig T. Meier H. RAxML-III: A fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics. 2005;21(4):456–463. doi: 10.1093/bioinformatics/bti191.
- 15. Guindon S. Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003;52(5):696–704. doi: 10.1080/10635150390235520.
- 16. Drummond AJ. Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7:214. doi: 10.1186/1471-2148-7-214.
- 17. Hue S. Pillay D. Clewley JP. Pybus OG. Genetic analysis reveals the complex structure of HIV-1 transmission within defined risk groups. Proc Natl Acad Sci USA. 2005;102(12):4425–4429. doi: 10.1073/pnas.0407534102.
- 18. Lemey P. Rambaut A. Drummond AJ. Suchard MA. Bayesian phylogeography finds its roots. PLoS Comput Biol. 2009;5(9):e1000520. doi: 10.1371/journal.pcbi.1000520.