Abstract
We reconstructed the 2016–2017 Zika virus epidemic in Puerto Rico by using complete genomes to uncover the epidemic’s origin, spread, and evolutionary dynamics. Our study revealed that the epidemic was propelled by multiple introductions that spread across the island, intricate evolutionary patterns, and ≈10 months of cryptic transmission.
Keywords: Zika, Zika virus, next-generation sequencing, NGS, Puerto Rico, phylogenetics, genomic epidemiology, viruses, vector-borne infections, molecular evolution, United States
Puerto Rico reported its first confirmed case of Zika virus (ZIKV) disease in November 2015 and subsequently experienced epidemic transmission that peaked by mid-August 2016 (1). Despite the large number of confirmed cases detected by traditional surveillance, the origin, spread, and evolutionary dynamics of this epidemic remain undetermined. We sought to reconstruct the epidemic transmission period by using a genomic epidemiology approach and to determine how the virus evolved on the island.
To investigate the emergence and subsequent epidemic of ZIKV in Puerto Rico, we generated 83 complete genomes (2,3) directly from PCR-positive serum samples (4) (Appendix) collected from the 8 health regions of Puerto Rico during March 2016–January 2017, providing a geotemporal representation of the epidemic on the island. We then performed phylogenetic analysis with an additional 233 published genomes from GenBank that represent the emergence and spread of ZIKV in the Americas during 2015–2017. The resulting reconstructed phylogeny was consistent with published tree topologies, nucleotide substitution rate ranges, and divergence patterns observed elsewhere for the entirety of the Americas (Appendix Figure 1, panel A), providing context for the proposed model of spread and divergence of ZIKV in Puerto Rico (5). At least 8 separate foreign-introduction events were captured within the ancestry of the viruses sequenced: 2 expanded into autochthonous lineages, and 6 were represented by individual sequences associated with genomes from the United States, the Caribbean, South America, and Central America, suggesting that these 6 introductions had limited spread.
In addition, we analyzed the temporal molecular evolutionary signal in our dataset by reconstructing time-calibrated phylogenies from genomes annotated with the date of sample collection (year, month, and day) for temporal precision. The correlation between date of sample collection and root-to-tip genetic distance supported the heterochronous nature of our dataset. The estimated divergence from the root (i.e., time of most recent common ancestor [tMRCA] of this tree) occurred in February 2013 (because 2013–2014 ZIKV genomes from French Polynesia were used as the root), and the within-epidemic evolutionary rate was 1.09 × 10⁻³ substitutions/site/year (Appendix Figure 1, panel B).
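The root-to-tip analysis described above can be illustrated with a simple linear regression of genetic distance on sampling date, in which the slope approximates the evolutionary rate and the x-intercept approximates the date of the root. The following is a minimal sketch only, not the analysis pipeline used in this study. The sampling dates, the rate, and the root date in the toy dataset are hypothetical values chosen for illustration, and the function names are our own.

```python
from datetime import date


def decimal_year(d: date) -> float:
    """Convert a calendar date to a decimal year (Jan 1 of year Y -> Y.0)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r_squared):
      slope       ~ substitutions/site/year
      x_intercept ~ estimated date of the root (distance = 0)
      r_squared   ~ strength of the temporal signal
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    x_intercept = -intercept / slope
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, x_intercept, r_squared


# Hypothetical, noise-free toy data: distances are generated from an assumed
# rate and root date, so the regression should recover both values.
ASSUMED_RATE = 1.0e-3   # substitutions/site/year (illustrative only)
ASSUMED_ROOT = 2013.1   # decimal year (illustrative only)

sampling_dates = [
    date(2016, 3, 15),
    date(2016, 6, 1),
    date(2016, 8, 20),
    date(2016, 11, 5),
    date(2017, 1, 10),
]
decimal_dates = [decimal_year(d) for d in sampling_dates]
distances = [ASSUMED_RATE * (t - ASSUMED_ROOT) for t in decimal_dates]

slope, root_date, r_squared = root_to_tip_regression(decimal_dates, distances)
print(f"rate ~ {slope:.2e} subs/site/year, root ~ {root_date:.2f}, R^2 = {r_squared:.3f}")
```

With real data, the distances would come from a phylogeny rather than from an assumed rate, and a positive slope with a reasonable correlation would indicate enough temporal signal to justify time-calibrated Bayesian analysis.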
Bayesian reconstruction showed that Puerto Rico clade 1 (PR C1) represents the largest autochthonous monophyletic cluster, which originated from viruses from South America and the Caribbean, including Brazil, Suriname, French Guiana, the US Virgin Islands, and Dominican Republic (Figure). tMRCA estimates place the divergence of PR C1 in mid-June 2015 (95% highest posterior density [HPD] February 2015–October 2015), with a within-outbreak evolutionary rate of 1.61 × 10⁻³ (95% HPD 1.13–2.10 × 10⁻³) substitutions/site/year. In addition, PR C1 diverged further into 2 subclades (SC1 and SC2) that spread across the island. The second clade, Puerto Rico clade 2 (PR C2), represents a smaller autochthonous monophyletic cluster that originated from viruses in Central America, including Nicaragua and Honduras (Figure). Our tMRCA estimates placed the emergence of PR C2 in February 2016 (95% HPD October 2015–April 2016), and its evolutionary rate was similar to that of PR C1 at 1.87 × 10⁻³ (95% HPD 1.1–2.64 × 10⁻³) substitutions/site/year. We compared the ZIKV epidemic history of Puerto Rico with the time-calibrated Bayesian phylogenies and observed that the tMRCA of PR C1 precedes the initial confirmation of ZIKV on the island through traditional surveillance methods by 3–10 months and that expansion of all PR lineages coincides with the peak of the epidemic curve (Figure). We assessed phylogenetic clustering patterns for geographic association with each of the health regions and detected none (Appendix Figure 2).
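To give a sense of scale for these per-site rates, they can be converted into an expected number of substitutions per genome per year. The sketch below is illustrative arithmetic only. It assumes a ZIKV genome length of approximately 10,800 nt, a value that is not stated in this article, and the function and variable names are our own.

```python
# Assumption (not from this article): approximate ZIKV genome length in nucleotides.
GENOME_LENGTH_NT = 10_800

# Rates reported above, in substitutions/site/year: (mean, 95% HPD lower, 95% HPD upper).
clade_rates = {
    "PR C1": (1.61e-3, 1.13e-3, 2.10e-3),
    "PR C2": (1.87e-3, 1.10e-3, 2.64e-3),
}


def substitutions_per_genome_per_year(rate, genome_length=GENOME_LENGTH_NT):
    """Scale a per-site rate to an expected number of substitutions per genome per year."""
    return rate * genome_length


# Apply the conversion to the mean and to both HPD bounds of each clade.
per_genome = {
    clade: tuple(substitutions_per_genome_per_year(r) for r in rates)
    for clade, rates in clade_rates.items()
}

for clade, (mean, lower, upper) in per_genome.items():
    print(f"{clade}: ~{mean:.1f} substitutions/genome/year (95% HPD {lower:.1f}-{upper:.1f})")
```

Under this assumed genome length, both clades accumulate on the order of 10 to 30 substitutions per genome per year, and the overlapping intervals are consistent with the statement that the two rates are similar.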
We inferred past viral population dynamics by using Bayesian Skygrid plots, which show an increase in genomic diversity that coincides in time with the emergence of ZIKV in the Americas, followed by a series of fluctuations in the effective population size, characteristic of the virus spreading rapidly through the region (Appendix Figure 3). In Puerto Rico, we observed a similar sharp increase upon emergence and subsequent patterns that mirror the trends observed in the Americas.
Our study revealed the origin and epidemic spread of ZIKV on the island after a period of cryptic transmission undetected by traditional surveillance. Similar cryptic transmission was reported in Brazil and Colombia (6–8), where case detection was hindered by the difficulty of capturing asymptomatic or mild cases with clinical manifestations that overlap those of endemic arboviruses and by other laboratory testing limitations particular to ZIKV (9). The dataset we generated contributes to the geotemporal sampling of ZIKV genomes from the region, enabling the study of the evolutionary and epidemic dynamics in the Americas.
The integration of genomic epidemiology into arbovirus surveillance has proven to be central to the ascertainment of disease epidemiology, uncovering information otherwise concealed by the nature of the disease and the limitations of surveillance systems. Fundamentally, integrated proactive genomic surveillance may help us predict virus emergence and more effectively mitigate regional or global expansion.
Acknowledgments
We thank the collaborators from the Ponce Medical School Foundation, Inc. (grant no. U01CK000580), the Puerto Rico Health Department, and members of the Puerto Rico Zika Task Force for the valuable contributions to the enhanced surveillance during the Zika outbreak in 2016.
This project was partially funded by the Centers for Disease Control and Prevention’s Advanced Molecular Detection Program and the Yale University’s School of Public Health start-up package provided to N.D.G. Additional support for coauthors C.K. and A.H. was provided by the Yale University’s Jackson Institute of Global Health Field Experience Award and the Yale Collaborative Action Fellowship.
Biography
Dr. Santiago is a lead research microbiologist at the Centers for Disease Control and Prevention in San Juan, Puerto Rico. His research is focused on the development of molecular diagnostic tests and genomic epidemiology of dengue virus and severe acute respiratory syndrome coronavirus 2.
Footnotes
Suggested citation for this article: Santiago GA, Kalinich CC, Cruz-López F, González GL, Flores B, Hentoff A, et al. Tracing the origin, spread, and molecular evolution of Zika virus, Puerto Rico, 2016–2017. Emerg Infect Dis. 2021 Nov [date cited]. https://doi.org/10.3201/eid2711.211575
These first authors contributed equally to this article.
These senior authors contributed equally to this article.
References
- 1. Sharp TM, Quandelacy TM, Adams LE, Aponte JT, Lozier MJ, Ryff K, et al. Epidemiologic and spatiotemporal trends of Zika Virus disease during the 2016 epidemic in Puerto Rico. PLoS Negl Trop Dis. 2020;14:e0008532. 10.1371/journal.pntd.0008532
- 2. Quick J, Grubaugh ND, Pullan ST, Claro IM, Smith AD, Gangavarapu K, et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat Protoc. 2017;12:1261–76. 10.1038/nprot.2017.066
- 3. Grubaugh ND, Gangavarapu K, Quick J, Matteson NL, De Jesus JG, Main BJ, et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019;20:8. 10.1186/s13059-018-1618-7
- 4. Santiago GA, Vázquez J, Courtney S, Matías KY, Andersen LE, Colón C, et al. Performance of the Trioplex real-time RT-PCR assay for detection of Zika, dengue, and chikungunya viruses. Nat Commun. 2018;9:1391. 10.1038/s41467-018-03772-1
- 5. Metsky HC, Matranga CB, Wohl S, Schaffner SF, Freije CA, Winnicki SM, et al. Zika virus evolution and spread in the Americas. Nature. 2017;546:411–5. 10.1038/nature22402
- 6. Faria NR, Quick J, Claro IM, Thézé J, de Jesus JG, Giovanetti M, et al. Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Nature. 2017;546:406–10. 10.1038/nature22401
- 7. Black A, Moncla LH, Laiton-Donato K, Potter B, Pardo L, Rico A, et al. Genomic epidemiology supports multiple introductions and cryptic transmission of Zika virus in Colombia. BMC Infect Dis. 2019;19:963. 10.1186/s12879-019-4566-2
- 8. Grubaugh ND, Saraf S, Gangavarapu K, Watts A, Tan AL, Oidtman RJ, et al.; GeoSentinel Surveillance Network. Travel surveillance and genomics uncover a hidden Zika outbreak during the waning epidemic. Cell. 2019;178:1057–1071.e11. 10.1016/j.cell.2019.07.018
- 9. Peters R, Stevenson M. Zika virus diagnosis: challenges and solutions. Clin Microbiol Infect. 2019;25:142–6. 10.1016/j.cmi.2018.12.002