ABSTRACT
Western equine encephalitis virus (WEEV) is an arbovirus from the genus Alphavirus, family Togaviridae, which circulates in North America between birds and mosquitoes, occasionally causing disease in humans and equids. In recent decades, human infection has decreased dramatically; the last documented human case in North America occurred in 1994, and the virus has not been detected in mosquito pools since 2008. Because limited information exists regarding the evolution of WEEV, we analyzed the genomic sequences of 33 low-passage-number strains with diverse geographic and temporal distributions and performed comprehensive phylogenetic analyses. Our results indicated that WEEV is a highly conserved alphavirus with only approximately 5% divergence in its most variable genes. We confirmed the presence of the previously determined group A and B lineages and further resolved group B into three sublineages. We also observed an increase in relative genetic diversity during the mid-20th century, which correlates with the emergence and cocirculation of several group B sublineages. The estimated WEEV population size dropped in the 1990s, with only the group B3 lineage being sampled in the past 20 years. Structural mapping showed that the majority of substitutions in the envelope glycoproteins occurred at the E2-E2 interface. We hypothesize that an event occurred in the mid-20th century that resulted in the increased genetic diversity of WEEV in North America, followed by genetic constriction due to either competitive displacement by the B3 sublineage or stochastic events resulting from a population decline.
IMPORTANCE Western equine encephalitis virus (WEEV) has caused several epidemics that resulted in the deaths of thousands of humans and hundreds of thousands of equids during the past century. During recent decades, human infection decreased drastically and the virus has not been found in mosquito pools since 2008. Because limited information exists regarding the evolution of WEEV, we analyzed 33 complete genome sequences and conducted comprehensive phylogenetic analyses. We confirmed the presence of two major lineages, one of which diverged into three sublineages. Currently, only one of those sublineages is found circulating in nature. Understanding the evolution of WEEV over the past century provides a unique opportunity to observe an arbovirus that is in decline and to better understand what factors can cause said decline.
INTRODUCTION
Western equine encephalitis virus (WEEV) is a mosquito-borne arbovirus and the causative agent of western equine encephalitis (WEE). Infections of humans and horses can be fatal, and survivors often suffer permanent neurological sequelae (1, 2). WEEV belongs to the genus Alphavirus in the family Togaviridae and has a positive-sense, single-stranded RNA genome approximately 11.5 kb in length, including two open reading frames (ORFs) flanked by 5′- and 3′-untranslated regions (UTRs) (3, 4). One unusual feature of WEEV is that it is the descendant of an ancient recombination event between Sindbis virus (SINV)-like and eastern equine encephalitis virus (EEEV)-like ancestors (5, 6).
WEEV is found in North and South America. In North America, it circulates enzootically among passerine birds and is transmitted by its primary mosquito vector, Culex (Culex) tarsalis. Mammals can participate in a secondary cycle (7–9). Both humans and horses are thought to be dead-end hosts (10), although some equids, such as burros and ponies, develop low to moderate levels of viremia (slightly under 10^4 PFU/ml) (11, 12), which could allow these hosts to contribute to epizootic amplification.
In the 1930s through 1950s, WEEV produced widespread outbreaks encompassing western North America, extending north into Saskatchewan, Canada (10). Western states were affected by several outbreaks during the 1930s and, by 1937, the epidemic/epizootic reached the eastern side of the Canadian Rockies (13, 14). Sporadic outbreaks continued to occur throughout the mid-20th century in the western and midwestern United States. However, the incidence of WEE has drastically decreased over the past 4 decades. The 1970s saw 209 human cases; 87 were reported during the 1980s, only 4 cases during the 1990s, and no cases have been reported in the United States or Canada since 1998 (15).
Several studies investigated possible reasons for the decrease in human WEE incidence to explain these epidemiological data (16–21). While some suggested a reduction in mammalian virulence, interpretations were confounded by the viral strains used (different viral lineages, various passage histories, etc.).
Only two detailed phylogenetic studies of WEEV have been conducted (6, 22). By sequencing partial E1 envelope glycoprotein and nsP4 genes, Weaver et al. (6) identified two monophyletic lineages and proposed that one had become extinct. Kramer and Fallah (22) sequenced the E2 envelope glycoprotein gene of a large collection of WEEV isolates from California and observed the maintenance over time of local enzootic lineages. However, both studies were limited by the short sequence fragments employed and the phylogenetic methods available at that time.
To accurately assess the evolutionary history of WEEV and identify population changes and mutations that might be related to the historic decline in WEEV incidence, we conducted robust phylogenetic analyses using complete WEEV genomic sequences representing a diverse temporal and geographic distribution, with a focus on low-passage-number virus strains. We also generated a three-dimensional (3D) homology model of the E1 and E2 proteins and mapped the locations of several substitutions that define major WEEV lineages and evolutionary events.
MATERIALS AND METHODS
Virus strain selection, propagation, and isolation of RNA.
Thirty-three WEEV strains were chosen based on varied locations and years of collection, with a focus on low-passage-number histories (Table 1). Viruses were propagated on C6/36 cells (17) and precipitated with polyethylene glycol (23), and RNA was extracted using TRIzol LS (Invitrogen, Carlsbad, CA) per the manufacturer's instructions.
TABLE 1.
Strain | Location | Date (day-mo-yr) | Host | Passage history^a | Accession no. |
---|---|---|---|---|---|
AG80646 | Chaco Province, Argentina | 1980 | Culex ocossa | v (2), sm (1) | GQ287646 |
California | San Joaquin Valley, CA | 1930 | Horse | gp (?), sm (27), C6 (1) | KJ554965 |
McMillan | Ontario Province, Canada | 1941 | Human | mp (2), sm (2), v (2), C6 (1) | GQ287640 |
BFS932 | Bakersfield, CA | 1946 | Culex tarsalis | sm (1), v (1) | KJ554966 |
EP-6 | Missouri | 1950 | Mosquito | ce (1), C6 (1) | KJ554967 |
BFS1703 | Bakersfield, CA | 1953 | Culex tarsalis | sm (1), C6 (1) | KJ554968 |
BFS2005 | Bakersfield, CA | 1954 | Culex tarsalis | de (1) | GQ287644 |
E1416 | Kern County, CA | 25-Jan-1961 | Zonotrichia leucophrys | bhk (4), C6 (1) | KJ554969 |
Montana64 | Montana | 1967 | Horse | de (1), C6 (1) | GQ287643 |
S8–122 | Butte County, CA | 2-Aug-1968 | Sciurus griseus | sm (1), C6 (1) | KJ554970 |
BFN3060 | Butte County, CA | 19-Jul-1971 | Culex tarsalis | ce (1), sm (1), C6 (1) | KJ554972 |
71V1658 | Oregon | 13-Aug-1971 | Horse | v (2), smb (1) | GQ287645 |
TBT-235 | Texas | 1971 | Gopherus berlandieri | wc (1), de (1), sm (1), bhk (1), C6 (1) | KJ554971 |
75V9291 | Wilkin City, MN | 26-Jul-1975 | Culex tarsalis | v (2), C6 (1) | KJ554973 |
BFS09997 | Kern County, CA | 30-Jun-1978 | Culex tarsalis | v (1), C6 (1) | KJ554974 |
CHLV53 | Riverside County, CA | 19-Jul-1983 | Culex tarsalis | v (1), C6 (1) | KJ554976 |
KERN5547 | Kern County, CA | 1983 | Culex tarsalis | v (1), C6 (1) | KJ554975 |
85452NM | New Mexico | 1985 | Culex tarsalis | sm (2), C6 (1) | GQ287647 |
PV02808A | Lubbock County, TX | 1990 | Mosquito | v (1) or sm (1), C6 (1) | KJ554977 |
IMPR441 | Imperial County, CA | 21-Jul-1992 | Culex tarsalis | sm (1), C6 (1) | KJ554978 |
CO921356 | Larimer City, CO | 30-Jul-1992 | Culex tarsalis | v (1), C6 (1) | KJ554979 |
93A38 | Tacna, AZ | 8-Jun-1992 | Mosquito | v (1), C6 (1) | KJ554980 |
93A27 | Parker, AZ | 9-Jun-1992 | Mosquito | v (1) | KJ554981 |
93A30 | Phoenix, AZ | 10-Jun-1993 | Mosquito | v (1), C6 (1) | KJ554982 |
93A79 | Yuma, AZ | 13-Jul-1993 | Mosquito | v (1), C6 (1) | KJ554983 |
CNTR34 | Contra Costa County, CA | 1993 | Culex tarsalis | v (1), C6 (1) | KJ554984 |
Lake43 | Lake County, CA | 1994 | Culex tarsalis | v (2), C6 (1) | KJ554985 |
PV72102 | El Paso County, TX | 1997 | Mosquito | v (1) or sm (1), C6 (1) | KJ554986 |
PV012357A | El Paso County, TX | 2001 | Mosquito | v (1) or sm (1), C6 (1) | KJ554987 |
R02PV002957B | El Paso County, TX | 2002 | Mosquito | v (1) or sm (1), C6 (1) | KJ554988 |
R02PV001807A | El Paso County, TX | 2002 | Mosquito | v (1) or sm (1), C6 (1) | KJ554989 |
R05PV003422B | El Paso County, TX | 2005 | Mosquito | v (1) or sm (1), C6 (1) | KJ554990 |
R0PV00384A | El Paso County, TX | 2005 | Mosquito | v (1) or sm (1), C6 (1) | KJ554991 |
Imperial181 | Imperial County, CA | 2005 | Mosquito | v (2) | GQ287641 |
^a Passage numbers are in parentheses. Abbreviations: mp, mouse; sm, suckling mouse; smb, suckling mouse brain; v, Vero cells; bhk, baby hamster kidney cells; wc, wet chicks; de, duck embryonic fibroblast; ce, chick embryonic fibroblast; C6, C6/36; p, passage in unknown medium; ?, unknown passage number.
RT-PCR, PCR amplification, and sequencing.
cDNA was prepared using SuperScript III (Invitrogen) per the manufacturer's instructions. Overlapping PCR amplicons covering the WEEV genome were generated using WEEV-specific primers (sequences are available on request) and Phusion high-fidelity DNA polymerase (New England BioLabs, Ipswich, MA). PCR amplicons were purified from agarose gels using a gel extraction kit (Qiagen, Netherlands), and direct sequencing of amplicons was performed using WEEV-specific internal primers, a BigDye terminator v3.1 cycle sequencing kit (ABI, Foster City, CA), and a 3500 Genetic Analyzer (ABI). Sequences were assembled using Sequencher v5.0.1 (Gene Codes Corporation, Ann Arbor, MI).
Sequence analysis.
The ORFs from 27 genomic WEEV sequences we determined were aligned with all genomic sequences from the GenBank library using Seaview v4.1 (24). MacVector v.11.0.2 (MacVector Inc., Cary, NC) was used to determine percent nucleotide and amino acid identity for the complete concatenated ORFs and for each gene. Comparisons of all strains were made against Imperial181.
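The percent identity comparison described above can be illustrated with a minimal Python sketch. It is not the MacVector procedure used in this study: the function name, the choice to skip gapped columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(seq_a, seq_b, gap_chars="-?"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both sequences come from the same alignment, so they have
    equal length. Gap handling here is a simplification and may differ from
    what a given alignment package reports.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip alignment gaps and missing data
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments (invented for illustration; not WEEV data).
reference = "ATGGAGAAAGTTCACGTTGA"
query = "ATGGAAAAAGTTCACGTCGA"
print(f"{percent_identity(reference, query):.1f}% identity")  # 18 of 20 columns match
```

Applied gene by gene against a single reference strain, a calculation of this kind yields one nucleotide value and one amino acid value per gene for each strain.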
Phylogenetic methods.
A coalescent phylogenetic analysis of the WEEV sequences was performed using BEAST v.1.7.5 (25). The analysis was run once for 50 million steps, sampling every 10,000 steps and discarding the first 10% as burn-in; 1st, 2nd, and 3rd codon positions were analyzed independently. A Bayesian skyline analysis was conducted under the strict clock, uncorrelated log-normal clock (UCLN), and uncorrelated exponential clock (UCEX) models, and convergence was assessed by examining the stationary ln-likelihood and effective sample size (ESS, >200) parameters in Tracer v1.4 (http://tree.bio.ed.ac.uk/software/tracer/). To determine the best-fit substitution and clock models, path-sampling and stepping-stone analyses were used (26, 27). A maximum clade credibility (MCC) tree, node heights (hypothesized dates of divergence events), evolutionary rates, and a Bayesian skyline plot were then generated. The BEAST output tree file was analyzed using Tree Annotator (included in the BEAST v.1.7.5 software package), discarding the first 10% as burn-in, and visualized in FigTree v.1.3.1. To verify its accuracy, a Bayesian phylogeny was inferred in MrBayes (http://mrbayes.sourceforge.net/) using the general time-reversible (GTR+I+Γ4) model, which was determined to be optimal using Modeltest (28). The analysis was performed for 1 million steps, with sampling every 1,000 steps and discarding the first 10% as burn-in.
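For orientation, the chain lengths, sampling intervals, and burn-in fractions stated above translate into the following numbers of posterior samples. This is a bookkeeping sketch only; the function name is hypothetical, and actual log files may differ by one sample because many MCMC programs also record the initial state.

```python
def mcmc_sample_counts(chain_length, sample_every, burnin_percent=10):
    """Return (logged, discarded, retained) sample counts for one MCMC run.

    Assumption: one sample is logged every `sample_every` steps and the
    first `burnin_percent` percent of logged samples is discarded.
    """
    logged = chain_length // sample_every
    discarded = logged * burnin_percent // 100
    return logged, discarded, logged - discarded


# BEAST run: 50 million steps, sampled every 10,000 steps, 10% burn-in.
beast = mcmc_sample_counts(50_000_000, 10_000)
# MrBayes run: 1 million steps, sampled every 1,000 steps, 10% burn-in.
mrbayes = mcmc_sample_counts(1_000_000, 1_000)

print("BEAST (logged, discarded, retained):", beast)
print("MrBayes (logged, discarded, retained):", mrbayes)
```

Under these assumptions, roughly 4,500 BEAST samples and 900 MrBayes samples remain after burn-in.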
Nonsynonymous synapomorphic mutations of interest.
To identify mutations that define major WEEV lineages, each amino acid was manually traced on the inferred MCC tree using MacClade v4.08 (http://macclade.org/macclade.html). To assess potential selective pressures accompanying WEEV evolution, we estimated the number and locations of nonsynonymous and synonymous nucleotide substitutions per site and determined if the sites were positively or negatively selected using the Data Monkey server (29). The dN/dS ratio compares the rate of nonsynonymous substitutions (dN), which more often reflect phenotypic alterations, with the rate of synonymous substitutions (dS), which generally reflect neutral change; a ratio below 1 indicates a predominance of synonymous change. Codon-based selection analyses use dN/dS to estimate the overall impact of selection on specific codons, which, when paired with nucleotide substitution models and viewed in a phylogenetic framework, can identify selected mutations across lineages (30). The overall dN/dS ratio and selection pressure were determined by single likelihood ancestor counting (SLAC) and fast unbiased Bayesian approximation (FUBAR) methods (31, 32). Positive and negative selection events at each codon also were inferred using internal fixed-effect likelihood (IFEL), FEL, and FUBAR methods, and the associated statistical tests were run as part of the Data Monkey server package (31–33).
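The distinction between synonymous and nonsynonymous change that underlies dN/dS can be sketched in a few lines of Python using the standard genetic code. This is a per-codon classification only and is not an implementation of the SLAC, FEL, IFEL, or FUBAR methods; the function name is hypothetical, and the example codons are assumptions chosen to produce the stated amino acid changes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return (aa_before, aa_after, kind) for a change between two codons."""
    before = CODON_TABLE[codon_before.upper().replace("U", "T")]
    after = CODON_TABLE[codon_after.upper().replace("U", "T")]
    kind = "synonymous" if before == after else "nonsynonymous"
    return before, after, kind


# Illustrative codons (assumed; full codons are not given in the text).
print(classify_substitution("ACA", "ATA"))  # Thr -> Ile via C->T at codon position 2
print(classify_substitution("AAA", "AGA"))  # Lys -> Arg via A->G at codon position 2
print(classify_substitution("ACA", "ACG"))  # Thr -> Thr via a third-position change
```

Likelihood-based methods go further by estimating substitution rates on a phylogeny, but the same codon-level distinction is what they count.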
Molecular modeling of WEEV E1 and E2 envelope proteins.
Sequences of the E1 and E2 proteins from the BFS932 and Imperial181 WEEV strains were submitted to fold recognition servers (34) for homologous sequence alignment using crystal structures of SINV (PDB entry 3MUU) and chikungunya virus (CHIKV) (PDB entry 3N40) proteins as templates. Because domain 2 is missing from the SINV E2 structure, we used the SINV E1 protein to generate three-dimensional (3D) model structures of the BFS932 and Imperial181 E1 proteins, while the CHIKV E2 protein was used to generate 3D model structures for E2. MPACK (35, 36) was used to build homology model structures, which were energy minimized using the Fantom program (37). Finally, trimeric model structures of BFS932 and Imperial181 were obtained by fitting their E1 and E2 proteins into a trimeric structure of SINV using PyMOL (http://www.pymol.org/).
RESULTS
Percent nucleotide and amino acid identities indicated that WEEV has maintained a highly conserved genome since 1930 (Table 2). The percent identities for the genome and individual genes all were greater than 95%. Some genes, including E2, contained higher nucleotide than amino acid identities, and E2 had the greatest variation in both amino acids and nucleotides. Conversely, nsP1 was the most highly conserved gene.
TABLE 2.
% Nucleotide (amino acid) identity with the Imperial181 strain
Strain | Genome | nsP1 | nsP2 | nsP3 | nsP4 | Capsid | E3 | E2 | 6K | E1 |
---|---|---|---|---|---|---|---|---|---|---|
California | 97.3 (98.0) | 98.7 (99.4) | 97.1 (98.9) | 97.0 (97.6) | 97.5 (98.4) | 97.0 (97.3) | 97.2 (96.7) | 96.3 (95.1) | 97.3 (100) | 97.0 (98.6) |
McMillan | 97.3 (98.1) | 98.7 (99.4) | 97.2 (98.9) | 97.0 (97.6) | 97.5 (98.4) | 97.0 (97.3) | 97.8 (98.3) | 96.3 (95.6) | 97.3 (100) | 97.2 (98.9) |
BFS932 | 98.0 (98.6) | 99.2 (99.8) | 97.8 (99.1) | 97.5 (98.1) | 98.1 (99.0) | 97.9 (98.1) | 98.3 (98.3) | 97.4 (97.0) | 97.3 (98.0) | 97.9 (99.1) |
EP6 | 97.9 (98.6) | 99.1 (99.8) | 97.8 (99.1) | 97.6 (98.1) | 98.1 (99.2) | 97.7 (97.7) | 98.3 (98.3) | 97.2 (97.2) | 97.3 (98.0) | 98 (98.9) |
BFS1703 | 97.8 (98.4) | 98.9 (99.2) | 97.6 (99.1) | 97.2 (97.6) | 97.7 (98.7) | 97.8 (98.1) | 98.3 (98.3) | 97.4 (97.0) | 97.3 (98.0) | 97.7 (98.9) |
BFS2005 | 97.7 (98.4) | 99 (99.4) | 97.6 (99.0) | 97.2 (97.6) | 97.7 (98.5) | 97.7 (98.1) | 98.3 (98.3) | 97.4 (97.0) | 97.3 (98.0) | 97.7 (98.9) |
E1416 | 97.8 (98.4) | 98.9 (99.2) | 97.6 (99.1) | 97.2 (97.6) | 97.7 (98.7) | 97.8 (98.1) | 98.3 (98.3) | 97.4 (97.0) | 97.3 (98.0) | 97.7 (98.9) |
Montana64 | 98.4 (98.8) | 99.2 (99.6) | 98.3 (99.1) | 98.0 (98.5) | 98.7 (99.2) | 98.5 (98.5) | 98.3 (98.3) | 97.7 (97.4) | 98.7 (98.0) | 98.2 (99.3) |
S8122 | 98.2 (98.8) | 99.2 (99.8) | 98.1 (99.0) | 97.9 (98.3) | 98.4 (99.3) | 98.3 (98.5) | 98.3 (98.3) | 97.7 (97.4) | 98 (100) | 98 (99.3) |
TBT235 | 98.6 (99.0) | 99.3 (99.6) | 98.6 (99.2) | 97.8 (98.5) | 99.1 (99.5) | 99.1 (99.2) | 98.3 (98.3) | 98.1 (97.7) | 98.0 (98.0) | 98.3 (99.1) |
BFN3060 | 98.2 (98.7) | 99.2 (99.8) | 98.1 (99.0) | 97.9 (98.1) | 98.4 (99.3) | 98.1 (98.2) | 98.3 (98.3) | 97.7 (97.4) | 98.0 (100) | 97.8 (98.6) |
71V1658 | 98.2 (98.7) | 99.5 (99.8) | 98.2 (99.0) | 97.7 (98.1) | 98.5 (99.2) | 97.9 (98.1) | 97.2 (96.7) | 97.4 (97.0) | 98.0 (98.0) | 98.1 (99.3) |
75V9291 | 98.2 (98.8) | 99.2 (99.8) | 98.2 (99.1) | 97.6 (98.1) | 98.5 (99.0) | 97.8 (98.5) | 97.8 (98.3) | 97.7 (97.7) | 98.0 (98.0) | 98.2 (99.3) |
BFS09997 | 97.7 (98.4) | 98.9 (99.2) | 97.6 (99.1) | 97.2 (97.6) | 97.7 (98.7) | 97.8 (98.1) | 98.3 (98.3) | 97.2 (96.5) | 97.3 (98.0) | 97.7 (98.9) |
KERN5547 | 98.9 (99.2) | 99.4 (99.8) | 98.7 (99.2) | 98.6 (98.7) | 99.2 (99.7) | 99.4 (99.6) | 98.3 (98.3) | 98.6 (98.6) | 98 (96.0) | 98.6 (99.5) |
CHLV53 | 98.8 (99.1) | 99.4 (99.8) | 98.6 (99.1) | 98.6 (98.9) | 99.1 (99.5) | 99.1 (99.6) | 98.3 (98.3) | 98.3 (98.4) | 98 (96.0) | 98.8 (99.5) |
85452NM | 99.2 (99.4) | 99.7 (99.8) | 99.2 (99.6) | 99.0 (99.1) | 99.3 (99.5) | 99.5 (100) | 98.3 (98.3) | 98.9 (98.8) | 98.7 (98.0) | 99.1 (99.5) |
PV02808A | 97.7 (98.7) | 99.1 (99.8) | 97.3 (99.0) | 97.0 (97.9) | 98.2 (99.3) | 97.7 (98.1) | 96.7 (98.3) | 97.3 (97.2) | 97.3 (96.0) | 97.7 (99.3) |
IMPR441 | 99.0 (99.2) | 99.2 (99.6) | 99.0 (99.5) | 98.8 (98.3) | 99.2 (99.5) | 99.0 (100) | 98.3 (98.3) | 98.8 (98.8) | 98.7 (98.0) | 98.9 (99.5) |
CO921356 | 97.7 (98.6) | 99.1 (99.8) | 97.3 (99.0) | 96.9 (97.6) | 98.2 (99.3) | 97.4 (98.1) | 96.7 (98.3) | 97.4 (97.2) | 97.3 (96.0) | 97.8 (99.3) |
93A38 | 97.7 (98.6) | 99.1 (99.8) | 97.4 (98.9) | 97.1 (97.7) | 98.0 (99.2) | 97.7 (98.1) | 96.7 (98.3) | 97.4 (97.2) | 97.3 (96.0) | 97.7 (99.3) |
93A27 | 99.0 (99.2) | 99.4 (99.6) | 99.0 (99.6) | 98.8 (98.5) | 99.0 (99.3) | 99.1 (100) | 98.3 (98.3) | 98.5 (98.8) | 98.7 (98.0) | 98.9 (99.3) |
93A30 | 99.5 (99.7) | 99.8 (99.8) | 99.6 (99.9) | 99.4 (99.6) | 99.5 (99.6) | 99.6 (100) | 98.3 (98.3) | 99.4 (99.3) | 99.3 (100) | 99.2 (99.5) |
93A79 | 99.0 (99.4) | 99.6 (99.8) | 98.9 (99.6) | 98.8 (99.1) | 99.0 (99.7) | 99.2 (99.6) | 98.3 (98.3) | 98.7 (98.8) | 98.7 (98.0) | 99.0 (99.5) |
CNTR34 | 97.7 (98.6) | 99.0 (99.8) | 97.4 (99.0) | 97.0 (97.7) | 98.0 (99.3) | 97.7 (98.1) | 96.7 (98.3) | 97.3 (97.2) | 97.3 (96.0) | 97.7 (99.3) |
Lake43 | 97.7 (98.6) | 99.1 (99.8) | 97.4 (98.9) | 96.9 (97.6) | 98.1 (99.3) | 97.7 (98.1) | 96.7 (98.3) | 97.4 (97.2) | 97.3 (96.0) | 97.7 (99.3) |
PV71202 | 99.5 (99.7) | 99.6 (99.8) | 99.5 (99.9) | 99.4 (99.6) | 99.6 (99.8) | 99.7 (100) | 98.3 (98.3) | 99.4 (99.3) | 99.3 (100) | 99.3 (99.8) |
PV012357A | 99.6 (99.7) | 99.7 (99.8) | 99.6 (99.7) | 99.4 (99.4) | 99.8 (99.8) | 99.9 (100) | 99.4 (100) | 99.5 (99.5) | 100 (100) | 99.5 (99.8) |
R02PV002957B | 99.6 (99.8) | 99.9 (100) | 99.5 (99.7) | 99.2 (99.4) | 99.7 (99.8) | 99.9 (100) | 99.4 (100) | 99.4 (99.5) | 100 (100) | 99.5 (99.8) |
R02PV001807A | 99.6 (99.7) | 99.7 (99.8) | 99.6 (99.7) | 99.4 (99.2) | 99.8 (99.8) | 99.9 (100) | 99.4 (100) | 99.5 (99.5) | 100 (100) | 99.5 (99.8) |
R02PV003422B | 99.9 (100) | 99.9 (100) | 99.9 (100) | 99.8 (100) | 99.9 (100) | 99.9 (100) | 100 (100) | 99.8 (100) | 100 (100) | 99.9 (100) |
R0PV003814A | 99.9 (99.9) | 99.7 (99.8) | 99.9 (100) | 99.7 (99.8) | 99.9 (100) | 99.9 (100) | 100 (100) | 99.8 (100) | 100 (100) | 100 (100) |
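Because the Results text reads the Table 2 values as percent identities, divergence from Imperial181 is simply 100 minus each value. The sketch below parses a few E2 cells copied from Table 2 and converts them; the helper names are hypothetical, and only three strains are included for brevity.

```python
import re


def parse_cell(cell):
    """Split a cell such as '96.3 (95.1)' into (nucleotide, amino_acid) floats."""
    match = re.fullmatch(r"\s*([\d.]+)\s*\(([\d.]+)\)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def divergence(cell):
    """Convert percent identities to percent divergence (100 - identity)."""
    nt_identity, aa_identity = parse_cell(cell)
    return round(100 - nt_identity, 1), round(100 - aa_identity, 1)


# E2 cells for three strains, copied from Table 2.
e2_cells = {
    "California": "96.3 (95.1)",
    "McMillan": "96.3 (95.6)",
    "BFS932": "97.4 (97.0)",
}

for strain, cell in e2_cells.items():
    nt_div, aa_div = divergence(cell)
    print(f"{strain}: {nt_div}% nucleotide, {aa_div}% amino acid divergence")
```

For the California strain this gives 3.7% nucleotide and 4.9% amino acid divergence in E2, in line with the roughly 5% maximum divergence noted for the most variable genes.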
Stepping-stone and path-sampling analyses indicated GTR+I+Γ4 as the best-fit nucleotide substitution model along with an uncorrelated exponential clock model. The inferred Bayesian MCC phylogeny showed the presence of four main WEEV lineages (Fig. 1). The California and McMillan isolates did not form a monophyletic group as they did in previous analyses (6), owing to the absence of a more divergent WEEV strain as an outgroup. However, group A was confirmed as monophyletic using a Markov chain Monte Carlo (MCMC) analysis with the South American isolate AG80-646 as an outgroup (data not shown).
Estimated dates of lineage divergence were obtained for groups B1 to B3. Group A sequences were difficult to resolve, and the divergence of group A from group B1 could not be reliably estimated. Based on the phylogenetic data and the history of WEE outbreaks, this divergence probably occurred in the mid-1930s to early 1940s. Group B2 diverged from group B1 in approximately 1944, with 95% highest posterior density (HPD) values of 1942 to 1946. Finally, group B3 diverged around 1967 (95% HPD = 1965 to 1970). The overall rate of WEEV evolution was estimated at 2.8 × 10^-4 substitutions/site/year (HPD = 2.2 × 10^-4 to 3.4 × 10^-4), with rates for individual lineages in a narrow range of 3.0 × 10^-4 to 8.0 × 10^-4 substitutions/site/year.
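The 95% HPD values quoted above are the narrowest intervals containing 95% of the posterior samples for a parameter. A minimal sketch of that calculation follows, assuming a unimodal set of samples; the function name and toy data are hypothetical, and this is not the code used by BEAST or Tracer.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing at least `mass` of the samples.

    Assumption: the posterior is unimodal, so a single contiguous window of
    the sorted samples is an adequate summary.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples supplied")
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of fixed sample count and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Toy posterior sample (invented for illustration): 19 clustered values, 1 outlier.
toy_samples = list(range(19)) + [100]
print(hpd_interval(toy_samples))  # the outlier falls outside the interval
```

Unlike an equal-tailed interval, the HPD interval excludes the least probable values first, which is why it can be asymmetric around the point estimate.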
A Bayesian skyline plot showed a slight increase in the WEEV estimated population size between 1940 and 1965 during the era of the last major outbreaks (Fig. 2) (10). This increase was followed by a plateau and then by a decline beginning around 1990, corresponding to the establishment of group B3 that contains all recently circulating strains.
Upon manual analysis of the MCC tree and alignment file using MacClade v4.08, six nonsynonymous synapomorphic mutations of interest were found that delineated the clades we resolved on the MCC tree (Fig. 1 and Table 3). Selection analysis showed that the WEEV genome has evolved mainly under purifying selection (dN/dS ratio of 0.145). The IFEL analysis detected only one positively selected site versus 39 negatively selected sites at P ≤ 0.1. The positively selected site, encoding a Val-to-Ile substitution, involved part of group B3 (strain 93A30 and more recent strains) as well as strains S8122, BFN3060, California, and McMillan (Fig. 1). When we looked at the mutations we manually traced, IFEL analysis suggested positive selection; however, P values were >0.1 (Table 3). Positive selection on these sites also was suggested by both FEL and FUBAR analyses (although P values still were >0.1).
TABLE 3.
Gene | Amino acid change | Amino acid position | Mutation | Nucleotide position (from beginning of nonstructural protein ORF) | Codon position | IFEL selection type | IFEL P value |
---|---|---|---|---|---|---|---|
nsP3 | Thr→Ile | 152 | C→T | 4436 | 2 | + | 0.11 |
nsP4 | Asn→Ser | 602 | A→G | 7382 | 2 | + | 0.23 |
Capsid | Lys→Arg | 89 | A→G | 7714 | 2 | + | 0.16 |
Capsid | Lys→Trp | 250 | A→T, A→G | 8196, 8197 | 1, 2 | + | 0.28 |
E2 | Ala→Thr | 23 | G→A | 8472 | 1 | + | 0.38 |
E1 | Thr→Ser | 374 | A→T | 10959 | 1 | + | 0.28 |
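The nucleotide and codon positions in Table 3 can be cross-checked within a gene: codons that are N positions apart must be 3N nucleotides apart. The sketch below applies this to the two capsid entries, assuming contiguous, in-frame codon numbering within the capsid gene; the function name is hypothetical.

```python
def expected_nt_position(ref_nt, ref_codon, ref_codon_pos, target_codon, target_codon_pos):
    """Predict a nucleotide position from a reference entry in the same gene.

    ref_nt is the nucleotide position of a base at position ref_codon_pos
    (1, 2, or 3) of codon number ref_codon. Assumes no gaps or frameshifts
    between the reference and target codons.
    """
    ref_codon_start = ref_nt - (ref_codon_pos - 1)  # first base of the reference codon
    target_start = ref_codon_start + 3 * (target_codon - ref_codon)
    return target_start + (target_codon_pos - 1)


# Reference: capsid codon 89, codon position 2, nucleotide 7714 (Table 3).
first_base = expected_nt_position(7714, 89, 2, 250, 1)
second_base = expected_nt_position(7714, 89, 2, 250, 2)
print(first_base, second_base)  # compare with the capsid codon 250 entries in Table 3
```

The predicted positions for capsid codon 250 are 8196 and 8197, matching the two changes listed for the Lys→Trp substitution.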
3D models for both the BFS932 and Imperial181 E1 proteins indicated three domains: 1 and 2 had an interlinking beta sheet structure with a long hydrophobic fusion loop at one end of domain 2, while domain 3 shared high structural similarity with the immunoglobulin domain.
A complete list of amino acid positions that differed between BFS932 and Imperial181 was obtained for the E1 and E2 proteins. The BFS932 E1 protein had 11 differences compared to all other WEEV strains (Fig. 3A to C, yellow), while there were only 4 differences compared to Imperial181 (Fig. 3D to F, yellow). There were 25 differences in the BFS932 E2 protein compared to all WEEV strains (Fig. 3A to C, red) and 13 differences compared to Imperial181 (Fig. 3D to F, red). To visualize these amino acids, we used a trimeric structure of the BFS932 E1-E2 heterodimer (Fig. 3). Most of the E2 substitutions were located at the E2-E2 interface. Nonsynonymous, synapomorphic mutations of interest were mapped on the 3D structure and are indicated in magenta (Thr→Ser at position 374 in E1) and green (Ala→Thr at position 23 in E2) (Fig. 3).
DISCUSSION
We report here the first detailed phylogenetic and evolutionary analysis of WEEV using complete genomic sequences. We confirmed the presence of the two major North American lineages described previously (6) and further delineated several subgroups (B1 to B3) within lineage B. We also identified purifying selection as the major influence on WEEV evolution in North America, as described previously for several other arboviruses (38, 39), and identified several mutations that define the divergence of groups A and B1 to B3. No phylogenetically significant differences in glycosylation sites, cysteine residues, or UTR folding patterns were recognized.
In North America, WEEV was first isolated in 1930 from a fatal case of equine encephalitis (40). This California strain fell into the group A lineage along with the 1941 McMillan isolate. This relationship suggests WEEV's early spread to the eastern side of the Canadian Rockies. We hypothesize that group A became extinct in the 1940s and was displaced by group B1, which then became predominant. Strains from ancestral groups A and B1 generally are more virulent than more recent strains from groups B2 and B3 (17, 19, 20).
In the late 1940s, group B2 displaced group B1, which probably went extinct. Subsequently, in the late 1960s, group B3 lineage emerged from group B2, eventually displacing it. The consistency of the tightly grouped HPDs throughout the MCC tree supports the reliability of these temporal estimates (Fig. 1).
When all 3 group B sublineages were circulating, our Bayesian skyline analysis indicated a concurrent increase in WEEV estimated population size between 1965 and the late 1980s. However, after the late 1980s, a reduction in estimated population size occurred when the group B3 viruses became predominant in North America. These interpretations require caution, because the 95% HPD values for 1930 to 1950 were relatively broad. There is more confidence in the estimates between the years 1950 and 1995, when the HPD range was tighter (Fig. 2). However, when the estimated tMRCAs and Skyline analysis (Fig. 2) are considered in concert, the evidence for these population size interpretations is compelling. Interestingly, the pattern seen on our skyline analysis is similar to that of the annual influenza A virus cycle, although not as pronounced and over a much longer period of time (41). This could be an effect of purifying selection on WEEV, trimming the tree to the group B3 lineage.
The selection analyses indicated that many nucleotide sites within the WEEV genome are under purifying selection. When population sizes are reduced, stochastic drift could result in the accumulation of deleterious mutations, fitness declines, and lineage extinction events that could explain not only the lineage replacements we observed but also the decline in WEEV genetic diversity. However, manual inspection of our WEEV alignment revealed several mutations that may represent positively selected codons. These mutations, for the most part, narrowly missed the threshold of significance as determined by IFEL analysis (P > 0.1) (Table 3). Both FEL and FUBAR flagged these same mutations at P values that also narrowly missed the threshold of significance, suggesting that they may have important phenotypes. Furthermore, current codon-based analyses sometimes lack the sensitivity to detect positive selection. For example, analyses performed on CHIKV sequences failed to identify all mosquito vector-adaptive mutations shown by experimental studies (38, 42–44).
Our homology models suggested that the E2-E2 and E1-E2 interfaces, locations far removed from receptor binding or potential antibody binding sites, are important sites of WEEV evolution. Mutations at these interfaces (Fig. 3), including the substitution at E2 position 23 (Table 3), may stabilize the E2-E2 trimer spikes and further prevent the release of genomic RNA during fusion.
The recent epidemiology of WEE, with no reported human cases in North America since 1998 and a dearth of WEEV detected in mosquito surveillance since 2008 (2, 15), together with the pattern of lineage displacement observed in our phylogeny (Fig. 1), in which one lineage became predominant along with a decline in genetic diversity in lineage B3, raises a key question: what factor(s) caused the apparent reduction in WEEV circulation and resulting spillover disease? We hypothesize that a significant disturbance in WEEV circulation occurred roughly between 1945 and 1965. This event may have affected WEEV evolution in one of two ways: (i) changes in selective pressures altered the trajectory of WEEV evolution during the late 20th century, or (ii) a reduction in WEEV populations and/or diversity caused genetic drift and a decline in WEEV fitness, possibly coupled with reduced mammalian virulence. The key synapomorphic mutations we delineated, including those that may be subject to positive selection, deserve reverse genetic analyses to test these hypotheses by assessing their phenotypic properties.
In summary, using comprehensive phylogenetic analyses, we confirmed the major group A and B lineages described previously (6) and determined the further divergence of group B into 3 sublineages, two of which probably went extinct. We delineated several mutations that define groups A and B1 to B3 and which may have been positively selected. However, overall, WEEV's evolution has been dominated by purifying selection. WEEV has undergone a reduction in genetic diversity coincident with the circulation of only the group B3 lineage since the 1990s, suggesting that drift may have reduced its fitness, levels of circulation, and possibly its virulence for mammals. These data, as well as the apparent submergence of WEEV as an equine and human pathogen, provide a unique opportunity to study a phenomenon that, compared to studies of arboviral emergence, may be equally instructive regarding their maintenance and evolution and the prediction of future trends.
ACKNOWLEDGMENTS
We thank William Reisen of the University of California, Davis, Mary D'Anton of the Texas Department of Health and Human Services, and Robert Tesh of the World Reference Center for Emerging Viruses and Arboviruses (WRCEVA) at UTMB for providing WEEV isolates.
Footnotes
Published ahead of print 4 June 2014
REFERENCES
- 1. Whitley RJ, Gnann JW. 2002. Viral encephalitis: familiar infections and emerging pathogens. Lancet 359:507–513. doi:10.1016/S0140-6736(02)07681-X
- 2. Centers for Disease Control and Prevention. 1995. Arboviral disease–United States, 1994. MMWR Morb. Mortal. Wkly. Rep. 44:641–644
- 3. Netolitzky DJ, Schmaltz FL, Parker MD, Rayner GA, Fisher GR, Trent DW, Bader DE, Nagata LP. 2000. Complete genomic RNA sequence of western equine encephalitis virus and expression of the structural genes. J. Gen. Virol. 81:151–159. http://vir.sgmjournals.org/content/81/1/151.long
- 4. Strauss JH, Strauss EG. 1994. The alphaviruses: gene expression, replication, and evolution. Microbiol. Rev. 58:491–562
- 5. Hahn CS, Lustig S, Strauss EG, Strauss JH. 1988. Western equine encephalitis virus is a recombinant virus. Proc. Natl. Acad. Sci. U. S. A. 85:5997–6001. doi:10.1073/pnas.85.16.5997
- 6. Weaver SC, Kang W, Shirako Y, Rumenapf T, Strauss EG, Strauss JH. 1997. Recombinational history and molecular evolution of western equine encephalomyelitis complex alphaviruses. J. Virol. 71:613–623
- 7. Hardy JL, Milby MM, Wright ME, Beck AJ, Presser SB, Bruen JP. 1977. Natural and experimental arboviral infections in a population of blacktail jackrabbits along the Sacramento River in Butte County, California (1971–1974). J. Wildl. Dis. 13:383–392. doi:10.7589/0090-3558-13.4.383
- 8. Hardy JL, Reeves WC, Scrivani RP, Roberts DR. 1974. Wild mammals as hosts of group A and group B arboviruses in Kern County, California. Am. J. Trop. Med. Hyg. 23:1165–1177
- 9. Bowers JH, Hayes RO, Hughes TB. 1969. Studies on the role of mammals in the natural history of western encephalitis in Hale County, Texas. J. Med. Entomol. 6:175–178
- 10. Reisen WK, Monath TP. 1988. Western equine encephalomyelitis, p 89–137 In Monath TP. (ed), The arboviruses: epidemiology and ecology, vol V. CRC Press, Boca Raton, FL
- 11. Byrne RJ, French GR, Yancey FS, Gochenour WS, Russell PK, Ramsburg HH, Brand OA, Scheider FG, Buescher EL. 1964. Clinical and immunological interrelationships among Venezuelan, eastern, and western encephalomyelitis in burros. Am. J. Vet. Res. 25:24–31
- 12. Sponseller ML, Binn LN, Wooding WL, Yager RH. 1966. Field strains of western encephalitis virus in ponies: virologic, clinical, and pathologic observations. Am. J. Vet. Res. 27:1591–1598
- 13. Artsob H, Spence L. 1979. Arboviruses in Canada, p 39 In Kurstak E. (ed), Arctic and tropical arboviruses. Academic Press, New York, NY
- 14. Cameron GDW. 1942. Western equine encephalitis. Can. Public Health J. 33:383–387
- 15. CDC. 2010. Western equine encephalitis virus neuroinvasive disease cases reported by state, 1964–2010. Centers for Disease Control and Prevention, Atlanta, GA: http://www.cdc.gov/easternequineencephalitis/tech/epi.html
- 16. Zhang M, Fang Y, Brault AC, Reisen WK. 2011. Variation in western equine encephalomyelitis viral strain growth in mammalian, avian, and mosquito cells fails to explain temporal changes in enzootic and epidemic activity in California. Vector-Borne Zoonotic Dis. 11:269–275. doi:10.1089/vbz.2010.0078
- 17. Forrester NL, Kenney JL, Deardorff E, Wang E, Weaver SC. 2008. Western equine encephalitis submergence: lack of evidence for a decline in virus virulence. Virology 380:170–172. doi:10.1016/j.virol.2008.08.012
- 18. Reisen WK, Fang Y, Brault AC. 2008. Limited interdecadal variation in mosquito (Diptera: Culicidae) and avian host competence for western equine encephalomyelitis virus (Togaviridae: Alphavirus). Am. J. Trop. Med. Hyg. 78:681–686. http://www.ajtmh.org/content/78/4/681.long
- 19. Nagata LP, Hu W-G, Parker M, Chau D, Rayner GA, Schmaltz FL, Wong JP. 2006. Infectivity variation and genetic diversity among strains of Western equine encephalitis virus. J. Gen. Virol. 87:2353–2361. doi:10.1099/vir.0.81815-0
- 20. Logue CH, Bosio CF, Welte T, Keene KM, Ledermann JP, Phillips A, Sheahan BJ, Pierro DJ, Marlenee N, Brault AC, Bosio CM, Singh AJ, Powers AM, Olson KE. 2009. Virulence variation among isolates of western equine encephalitis virus in an outbred mouse model. J. Gen. Virol. 90:1848–1858. doi:10.1099/vir.0.008656-0
- 21. Mossel EC, Ledermann JP, Phillips AT, Borland EM, Powers AM, Olson KE. 2013. Molecular determinants of mouse neurovirulence and mosquito infection for western equine encephalitis virus. PLoS One 8:e60427. doi:10.1371/journal.pone.0060427
- 22. Kramer LD, Fallah HM. 1999. Genetic variation among isolates of western equine encephalomyelitis virus from California. Am. J. Trop. Med. Hyg. 60:708–713
- 23. Vasilakis N, Forrester NL, Palacios G, Nasar F, Savji N, Rossi SL, Guzman H, Wood TG, Popov V, Gorchakov R, González AV, Haddow AD, Watts DM, da Rosa APAT, Weaver SC, Lipkin WI, Tesh RB. 2013. Negevirus: a proposed new taxon of insect-specific viruses with wide geographic distribution. J. Virol. 87:2475–2488. doi:10.1128/JVI.00776-12
- 24. Galtier N, Gouy M, Gautier C. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12:543–548
- 25. Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29:1969–1973. doi:10.1093/molbev/mss075
- 26. Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol. Biol. Evol. 29:2157–2167. doi:10.1093/molbev/mss084
- 27. Baele G, Li WLS, Drummond AJ, Suchard MA, Lemey P. 2013. Accurate model selection of relaxed molecular clocks in Bayesian phylogenetics. Mol. Biol. Evol. 30:239–243. doi:10.1093/molbev/mss243
- 28. Posada D, Crandall KA. 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14:817–818. doi:10.1093/bioinformatics/14.9.817
- 29. Pond SLK, Frost SDW. 2005. Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 21:2531–2533. doi:10.1093/bioinformatics/bti320
- 30. Pond SLK, Poon AFY, Frost SDW. 2009. Estimating selection pressures on alignments of coding sequences, p 419–490 In Lemey P, Salemi M, Vandamme A-M. (ed), The phylogenetic handbook. Cambridge University Press, New York, NY
- 31. Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K. 2013. FUBAR: a fast, unconstrained Bayesian approximation for inferring selection. Mol. Biol. Evol. 30:1196–1205. doi:10.1093/molbev/mst030
- 32. Kosakovsky Pond SL, Frost SDW. 2005. Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol. Biol. Evol. 22:1208–1222. doi:10.1093/molbev/msi105
- 33. Kosakovsky Pond SL, Frost SDW, Grossman Z, Gravenor MB, Richman DD, Brown AJL. 2006. Adaptation to different human populations by HIV-1 revealed by codon-based analyses. PLoS Comput. Biol. 2:e62. doi:10.1371/journal.pcbi.0020062
- 34. Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33:W244–W248. doi:10.1093/nar/gki408
- 35. Mumenthaler C, Braun W. 1995. Automated assignment of simulated and experimental NOESY spectra of proteins by feedback filtering and self-correcting distance geometry. J. Mol. Biol. 254:465–480. doi:10.1006/jmbi.1995.0631
- 36. Sanner M, Widmer A, Senn H, Braun W. 1989. GEOM, a new tool for molecular modeling based on distance geometry calculations with NMR data. J. Comp. Aided Mol. Des. 3:195–210. doi:10.1007/BF01533068
- 37. Schaumann T, Braun W, Wuthrich K. 1990. A program, FANTOM, for energy refinement of polypeptides and proteins using a Newton-Raphson Minimizer in the torsion angle space. Biopolymers 29:679–694. doi:10.1002/bip.360290403
- 38. Volk SM, Chen R, Tsetsarkin KA, Adams AP, Garcia TI, Sall AA, Nasar F, Schuh AJ, Holmes EC, Higgs S, Maharaj PD, Brault AC, Weaver SC. 2010. Genome-scale phylogenetic analyses of chikungunya virus reveal independent emergences of recent epidemics and various evolutionary rates. J. Virol. 84:6497–6504. doi:10.1128/JVI.01603-09
- 39. Auguste AJ, Lemey P, Pybus OG, Suchard MA, Salas RA, Adesiyun AA, Barrett AD, Tesh RB, Weaver SC, Carrington CVF. 2010. Yellow fever virus maintenance in Trinidad and its dispersal throughout the Americas. J. Virol. 84:9967–9977. doi:10.1128/JVI.00588-10
- 40. Meyer KF, Haring CM, Howitt B. 1931. The etiology of epizootic encephalomyelitis of horses in the San Joaquin Valley, 1930. Science 74:227–228. doi:10.1126/science.74.1913.227
- 41. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. 2008. The genomic and epidemiological dynamics of human influenza A virus. Nature 453:615–619. doi:10.1038/nature06945
- 42. Tsetsarkin KA, McGee CE, Volk SM, Vanlandingham DL, Weaver SC. 2009. Epistatic roles of E2 glycoprotein mutations in adaption of Chikungunya virus to Aedes albopictus and Ae. aegypti mosquitos. PLoS One 4:e6835. doi:10.1371/journal.pone.0006835
- 43. Tsetsarkin K, Weaver SC. 2011. Sequential adaptive mutations enhance efficient vector switching by chikungunya virus and its epidemic emergence. PLoS Pathog. 7:e1002412. doi:10.1371/journal.ppat.1002412
- 44. Tsetsarkin K, Chen R, Yun R, Rossi SL, Plante K, Guerbois M, Forrester NL, Perng GC, Sreekumar E, Leal G, Huang C, Mukhopadhyay S, Weaver SC. Multi-peaked adaptive landscape for chikungunya virus evolution predicts continued fitness optimization in Aedes albopictus mosquitos. Nat. Commun., in press