Abstract
Dengue virus (DENV) is a mosquito-borne virus that causes Dengue Haemorrhagic Fever and Dengue Shock Syndrome. It comprises four distinct serotypes (DENV-1 to DENV-4). DENV-1, DENV-3 and DENV-4 are classified into five genotypes (GI–GV), whereas DENV-2 belongs to the American and Cosmopolitan genotypes. Dengue virus is most prevalent in South and Southeast Asia, including India. This study was undertaken to examine the genetic diversity and evolution of Dengue isolates from India. Pairwise comparison of amino acid sequences among the serotypes showed that DENV-3 has less sequence diversity than the other serotypes, which differ in their numbers of amino acids. We analyzed 50 Indian strains; 19 of them were identified as recombinant strains using the RDP4 package and were excluded from the subsequent selection analysis. Episodic positive selection in DENV was detected using MEME at P ≤ 0.05. Positive selection on several codons was used to correlate the genetic diversity between serotypes. This study established that amino acid diversity and inter-genotypic recombination of strains are the major causes of antigenic variation and evolution of DENV within India.
Electronic supplementary material
The online version of this article (10.1007/s13337-019-00538-1) contains supplementary material, which is available to authorized users.
Keywords: Protein diversity, Recombination analysis, Antigenicity, Positive selection, Phylogenetic analysis
Introduction
Dengue is a pandemic-prone, fast-emerging viral disease spread by Aedes aegypti mosquitoes. Dengue flourishes in poor urban areas, suburbs and the countryside, but also affects more affluent neighbourhoods in tropical and subtropical countries [1]. Dengue causes a severe flu-like illness and sometimes a potentially lethal complication called severe dengue [2]. The incidence of dengue has increased 30-fold over the last 50 years. According to the World Health Organization (WHO) in 2012, the incidence of dengue epidemics is increasing rapidly worldwide, posing a health risk to about half of the world's population.
Dengue virus (DENV) belongs to the family Flaviviridae and the genus Flavivirus. Its members are enveloped, positive-sense, single-stranded RNA viruses. The DENV genome encodes a single open reading frame (ORF). The ORF encodes three structural proteins, capsid (C), pre-membrane (prM) and envelope (E), and seven non-structural proteins, NS1, NS2a, NS2b, NS3, NS4a, NS4b and NS5 [3]. DENV comprises four antigenically distinct serotypes, designated DENV-1 to DENV-4 [4]. The genetic diversity and antigenicity of Dengue virus depend mainly on the envelope protein [5]. Based on phylogenetic analysis of the envelope protein, these serotypes can be further subdivided into distinct genotypes. In India, the first epidemic of Dengue fever was reported in 1950. In 1956, DENV-1 was the first Dengue serotype to be isolated, from Vellore, Tamil Nadu, India. A major outbreak of Dengue was reported in Calcutta in 1963. Since then, many outbreaks of Dengue have been observed all over the world [1, 6]. Although all four Dengue serotypes have been reported to circulate in the country [6], only DENV-2 and DENV-3 have been implicated in major Dengue Haemorrhagic Fever (DHF) outbreaks in India [7, 8]. DENV-2 and DENV-3 are considered the most dominant serotypes in terms of spreading potential. Earlier reports state that DENV-1 mostly circulates along with these dominant serotypes [8]. Even though all Dengue serotypes have been isolated from different parts of India, DENV-1 has been the most predominantly observed in recent decades. DENV-1 is also now widely reported in the Andaman and Nicobar Islands in the Indian Ocean [9]. The genetic diversity of DENV serotypes has been attributed to their high mutation rate [10, 11]. Genomic sequences with information about their geographical origin are available at NCBI-GenBank [12]. The present study was planned to interpret the sequence information of all four DENV serotypes from India.
Extensive comparative analysis was carried out for the Dengue genome sequences reported from India to understand the evolutionary relationship and amino acid diversity.
Materials and methods
Data set of DENV strains
The data set of 50 complete genome sequences, comprising sixteen DENV-1, fifteen DENV-2, twelve DENV-3 and seven DENV-4 strains isolated in India, was obtained from NCBI-GenBank. Details of these genome sequences, such as GenBank accession numbers, geographical origin and year, are given in Supplementary Table S1.
Genotyping
Genotyping of the above-mentioned Indian Dengue virus sequences was performed using the Dengue Subtyper Tool [13]. The resulting genotype information for all genome sequences is given in Supplementary Table S2.
Genome sequence analysis
Multiple sequence alignments were carried out at the level of individual serotypes for both the genomic and the protein sequences of DENV-1, DENV-2, DENV-3 and DENV-4. Both nucleotide and amino acid sequence analyses were carried out using the MegAlign module of the Lasergene 5 software package (DNASTAR Inc, USA).
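The pairwise comparisons reported later as percent identity can be illustrated with a short Python sketch. It is not the MegAlign procedure, whose exact handling of gaps is not described here; the function names, the toy sequences and the choice to skip gapped columns are all assumptions made for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored; tools differ here
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Percent identity for every unordered pair of named, aligned sequences."""
    names = sorted(aligned)
    return {
        (x, y): percent_identity(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical aligned protein fragments, not taken from the data set.
toy_alignment = {
    "isolate_A": "MNNQRKKA",
    "isolate_B": "MNNQRKRA",
    "isolate_C": "MN-QRKKA",
}
for pair, identity in identity_matrix(toy_alignment).items():
    print(pair, round(identity, 1))
```

Skipping gapped columns keeps length differences between proteins from lowering the identity score; counting them as mismatches is an equally defensible convention and would give lower values.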
Phylogenetic analysis
The data set of 50 DENV strains, comprising sixteen genome sequences from DENV-1, fifteen from DENV-2, twelve from DENV-3 and seven from DENV-4, was collected from GenBank [12]. A multiple sequence alignment of all sequences was obtained using MUSCLE [14]. From this alignment, phylogenetic trees were constructed using the Neighbour-Joining (NJ), Maximum Likelihood (ML) and Maximum Parsimony (MP) methods available in the MEGA V6 suite [25].
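Distance-based methods such as Neighbour-Joining start from a matrix of pairwise evolutionary distances. The sketch below computes the proportion of differing sites (p-distance) and the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The substitution model actually selected in MEGA is not stated in the text, so Jukes-Cantor is an assumption chosen for its simplicity, and the function names and sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites among ungapped aligned nucleotide columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: pairwise deletion of gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance; undefined once p reaches the saturation value 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned nucleotide fragments.
p = p_distance("ACGTACGT", "ACGTACGA")
print(p, round(jukes_cantor(p), 4))
```

The correction matters more as sequences diverge: for small p the two values nearly coincide, whereas near saturation the corrected distance grows without bound.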
Recombinant strains detection
The complete genome alignment of DENV sequences was used as input for the identification of recombinant DENV strains using the RDP4 package [15, 16]. RDP4 was used to identify potential recombinants along with their major and minor parents. The following detection methods were used: BOOTSCAN [16], CHIMAERA [17], SiScan [18], GENECONV [19], 3SEQ [15] and MAXCHI [20]. Recombination in this data set was also checked by GARD analysis in the Datamonkey web application.
Selection pressure
Potential recombinants identified by the RDP4 package were excluded from the analysis. After this exclusion, the sequences of each protein were aligned individually using MUSCLE. Each of these data sets was then subjected to selection pressure analysis using the HyPhy package on the Datamonkey server with the default value of P = 1 [21]. After recombination analysis using GARD, four approaches were used to detect selection: single likelihood ancestor counting (SLAC), fixed effects likelihood (FEL), the Branch-site Unrestricted Statistical Test for Episodic Diversification (BUSTED) [22] and the mixed effects model of evolution (MEME), the last of which was applied to detect episodic positive selection [23].
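All of these methods rest on the distinction between synonymous codon changes, which leave the amino acid unchanged, and non-synonymous changes, which alter it. The sketch below only classifies codon differences between two aligned coding sequences using the standard genetic code. It does not implement SLAC, FEL, BUSTED or MEME, which fit codon substitution models along a phylogeny; the function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(ref_cds: str, alt_cds: str):
    """List (codon_number, ref_codon, alt_codon, kind) for every differing codon."""
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be aligned in frame and of equal length")
    changes = []
    for start in range(0, len(ref_cds), 3):
        ref = ref_cds[start:start + 3].upper()
        alt = alt_cds[start:start + 3].upper()
        if ref == alt:
            continue
        if ref not in CODON_TABLE or alt not in CODON_TABLE:
            continue  # codons with gaps or ambiguity codes are skipped
        same_residue = CODON_TABLE[ref] == CODON_TABLE[alt]
        kind = "synonymous" if same_residue else "nonsynonymous"
        changes.append((start // 3 + 1, ref, alt, kind))
    return changes


# Hypothetical example: codon 2 changes Lys to Arg, codon 3 stays Gly.
print(classify_codon_changes("ATGAAAGGT", "ATGAGAGGC"))
```

An excess of non-synonymous over synonymous change at a site, judged against a codon model, is what the selection tests interpret as positive selection; the reverse pattern indicates purifying selection.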
Results and discussion
Genotyping
Genotyping of the four DENV serotypes revealed the following. In DENV-1, 14 strains belong to genotype III, one to genotype I and one to the sylvatic group. In DENV-2, 11 are Cosmopolitan strains and the remaining 4 belong to the American genotype. In DENV-3, 9 strains belong to genotype III, 2 to genotype V and 1 to genotype II. In DENV-4, 6 strains belong to genotype I and 1 to genotype II.
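As a quick consistency check, the genotype counts above can be summed per serotype and compared with the data set sizes given in Materials and methods. The counts are copied from the text; the dictionary layout and variable names are only an illustrative choice.

```python
# Genotype counts per serotype, as reported in the genotyping results.
genotype_counts = {
    "DENV-1": {"genotype III": 14, "genotype I": 1, "sylvatic": 1},
    "DENV-2": {"Cosmopolitan": 11, "American": 4},
    "DENV-3": {"genotype III": 9, "genotype V": 2, "genotype II": 1},
    "DENV-4": {"genotype I": 6, "genotype II": 1},
}

# Number of genomes per serotype, as stated for the data set.
dataset_sizes = {"DENV-1": 16, "DENV-2": 15, "DENV-3": 12, "DENV-4": 7}

totals = {serotype: sum(counts.values()) for serotype, counts in genotype_counts.items()}
grand_total = sum(totals.values())

for serotype, total in totals.items():
    status = "ok" if total == dataset_sizes[serotype] else "mismatch"
    print(serotype, total, status)
print("all strains:", grand_total)
```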
Amino acid sequence diversity
Pairwise comparison of the deduced amino acid sequences of the complete ORF was carried out to determine the degree of relatedness of these viruses at the protein level. The important amino acid substitutions among these strains are shown in Table 1. Amino acid changes mostly occur on the surface of the virus. Forty-seven variable sites have already been identified by comparing envelope gene sequences of Dengue virus [24]. The whole genome of strain KU216208, isolated from Rajasthan, India, has 5 amino acid variations in the structural proteins and 20 in the non-structural proteins [11]. Multiple sequence alignment of DENV proteins revealed that some protein sequences differ in length. Strain KU509255.1 (DENV-1), identified in 2011 by German scientists, has a shorter membrane protein than the other strains. Three DENV-2 strains (KY427084.1, KY427085.1 and KU509271.1) and two DENV-3 strains (KU509286.1 and KU509281.1) have one extra amino acid in the NS3 protein. This might be due to deletions in the 3′ UTR region. An identity of 94.8–98.8% was found between northern and southern Indian DENV-1 isolates. This reflects the distinct genetic makeup of DENV-1 in different parts of India [9]. Pairwise comparison of amino acid sequences of DENV-1 isolated between 2009 and 2011 revealed 99.8–99.9% identity [11, 13].
Table 1.
Description of unique amino acid substitutions among Indian DENV isolates compared to global reference strains
| Protein (b) | DENV subtype (c) | Acc. no. (a) | Position of amino acid in protein | Substituted amino acid | Predecessor amino acid |
|---|---|---|---|---|---|
| Capsid | DENV-1 | JQ922547.1 | 9–12 | S, Q, G, G | F, N, M, L |
| | DENV-2 | JQ922553.1 | 81 | K | R |
| Envelope | DENV-1 | JQ922545.1 | 67–72 | T, G, A, P, I | L, V, T, P, S |
| NS 1 | DENV-1 | JQ922547.1 | 49–52 | C, V, I, R | W, E, E, G |
| | DENV-2 | JQ922553.1 | 160 | R | G |
| | | | 161 | K | V |
| | DENV-4 | JQ922559.1 | 96 | T | S |
| | | LC069810.1 | | | |
| | | JF262783.1 | | | |
| | | JQ922559.1 | 2 | G | E |
| | | | 11 | N | K |
| | | LC069810.1 | 262 | S | T |
| NS 2a | DENV-4 | | 80 | V | I |
| NS 2b | DENV-1 | JN903578.1 | 52 | K | R |
| | DENV-2 | JQ922549.1 | 90 | P | Q |
| NS 3 | DENV-4 | LC069810.1 | 1 | S | T |
| | | | 1 | L | T |
| NS 4a | DENV-1 | JQ922546.1 | 79 | N | I |
| | | KJ755855.1 | 77–79 | A, G, N | T, S, I |
| | DENV-3 | | 144 | V | I |
| | DENV-4 | | 144 | V | I |
| | | | 82 | M | L |
| NS 4b | DENV-2 | JQ922549.1 | 93 | P | L |
| NS 5 | DENV-1 | KJ755855.1 | 316 | K | T |
| | DENV-4 | JF262783.1 | 7 | N | D |
| | | JQ922558.1 | 114, 325 | H, S | Q, N |
| | | KU509287.1 | 114, 325 | K, I | Q, N |
| | | KX845005.1 | 407 | E | K |
(a) Genome accession number from GenBank; (b) protein annotation taken from GenBank; (c) subtype according to WHO classification
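Substitutions of the kind listed in Table 1 can be read off an aligned pair of protein sequences. The following Python sketch compares an isolate with a reference and reports each difference as predecessor residue, position and substituted residue. The function names and sequences are hypothetical, and numbering positions along the ungapped reference is an assumption about the convention, not something stated in the table.

```python
def list_substitutions(reference: str, isolate: str, gap: str = "-"):
    """Return (position, predecessor, substituted) for each differing aligned residue.

    Positions are counted along the ungapped reference sequence, starting at 1.
    Columns with a gap in either sequence are not reported as substitutions.
    """
    if len(reference) != len(isolate):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []
    position = 0
    for ref_aa, iso_aa in zip(reference.upper(), isolate.upper()):
        if ref_aa != gap:
            position += 1  # advance only on real reference residues
        if ref_aa == gap or iso_aa == gap:
            continue
        if ref_aa != iso_aa:
            substitutions.append((position, ref_aa, iso_aa))
    return substitutions


def format_substitution(substitution) -> str:
    """Render a substitution in the compact form predecessor-position-substituted."""
    position, predecessor, substituted = substitution
    return f"{predecessor}{position}{substituted}"


# Hypothetical aligned fragments, not taken from the data set.
found = list_substitutions("MKRLI", "MKKLN")
print([format_substitution(s) for s in found])
```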
Phylogenetic analysis and recombinant strain detection
Phylogenetic trees based on the complete ORF of the selected Indian DENV isolates were generated using the maximum parsimony, maximum likelihood and neighbour-joining methods in MEGA V6 [25]. The evolutionary history of the 50 selected Indian DENV strains was inferred by phylogenetic analysis, which was further used to study the evolutionary distance between the sequences. The tree with the highest log likelihood is shown in Fig. 1.
Fig. 1.
Phylogenetic tree of Indian DENV isolates with the highest log-likelihood score, computed using MEGA7 with the Maximum Likelihood method. It is used to study the evolutionary distance between the sequences.
Recombination is a key evolutionary process that shapes the architecture of genomes and the genetic structure of populations. The RDP4 suite [16] identifies potential recombinant sequences and their major and minor parents. A sequence considered a potential recombinant should have a P value of > 1 by at least three of seven recombination detection methods [4, 22]. Recombination analysis performed using the RDP4 package revealed 19 recombinants among the 50 strains: 6 from DENV-1, 11 from DENV-2, 1 from DENV-3 and 1 from DENV-4 (Supplementary Tables S3–S6). According to our results, GENECONV and 3SEQ were the best methods for detecting recombinant strains. JQ922550.1 of DENV-2, isolated in 2012 from Maharashtra, India, had the lowest P value (0.013) among the strains.
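The consensus rule described above, under which a sequence counts as a potential recombinant only when at least three detection methods support the event, can be sketched in Python as follows. This is not RDP4 code or its output format. The significance threshold `alpha` is a placeholder assumption (set to 0.05 here) and should be replaced by the cut-off actually applied; the p-values in the example are invented purely to exercise the function.

```python
from typing import Optional


def supporting_methods(p_values: dict[str, Optional[float]], alpha: float) -> list[str]:
    """Names of methods whose p-value is at or below alpha (None = no signal found)."""
    return sorted(
        method for method, p in p_values.items() if p is not None and p <= alpha
    )


def is_potential_recombinant(
    p_values: dict[str, Optional[float]],
    alpha: float = 0.05,   # assumed threshold, not taken from the study
    min_methods: int = 3,  # "at least three" methods, as stated in the text
) -> bool:
    """Apply the consensus rule: enough independent methods must support the event."""
    return len(supporting_methods(p_values, alpha)) >= min_methods


# Invented p-values for a single hypothetical recombination event.
event = {
    "RDP": 0.2,
    "GENECONV": 0.001,
    "BootScan": None,
    "MaxChi": 0.03,
    "Chimaera": 0.4,
    "SiScan": 0.6,
    "3Seq": 0.01,
}
print(supporting_methods(event, alpha=0.05))
print(is_potential_recombinant(event))
```

Requiring agreement between several methods reduces false positives, because the methods detect recombination from different signals and rarely fail in the same way.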
Selection pressure
The selection pressure analysis revealed that the majority of codons in DENV were under strong positive selection. In total, 36 codons from DENV-1, 2 from DENV-3 and 20 from DENV-4 were identified as being under positive selection by the HyPhy software package. In this study, significant evidence of episodic positive selection in codons of structural and non-structural proteins was obtained; these codons are listed in Supplementary Tables S7–S10. Positive selection on several codons was used to correlate the genetic diversity between serotypes with their antigenicity, as described earlier [22].
Significant evidence of pervasive positive selection (p ≤ 0.05) was obtained using the SLAC, FEL and BUSTED methods and was found to be specific to the selected Indian strains (Supplementary Tables S11–S14). The site-specific selection pressure analysis of all serotypes revealed strong purifying selection [26]. The maximum number of sites was identified in NS4B, except in DENV-4, which had the maximum number of sites in the NS5 protein. These strains were observed to have unique mutations in the codons. In DENV-4, a unique mutation was observed in codon 58 of NS5 (Glu-114-His), and episodic positive selection was similarly observed in codon 79 of NS4A of DENV-1 (Ile-154-Asn). The amino acid residues encoded by these codons are known to be part of B and T cell epitopes [27, 28]. Notably, considerable genetic changes were also observed within these strains. Even though the evidence of positive selection on DENV strains was limited, mutations or changes in non-structural proteins play a vital role in disease severity [4, 23]. The observations on DENV-1 to DENV-4 indicate that strains circulating in India are more prone to undergo mutations, which are likely to serve as a reservoir for the emergence of diverse strains. The population diversity thus appears to be influenced by inter-genotype admixture and adaptive evolution.
Analysis of DENV strains in India using these approaches revealed that the population is subdivided into five major genotypes (GI–GV), except for DENV-2, which belongs to the Cosmopolitan and American genotypes. Inter-genotypic recombination was not observed in any strains. The diversity of amino acids in both structural and non-structural proteins gives an indication of the variation in the antigenicity of DENV. Significant evidence of positive selection was observed on codons of all ten proteins.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Chen R, Vasilakis N. Dengue – Quo tu et quo vadis. Viruses. 2011;3:1562–1608. doi: 10.3390/v3091562.
- 2. Halstead SB. Pathogenesis of dengue: challenges to molecular biology. Science. 1988;239:476–481. doi: 10.1126/science.239.4839.476.
- 3. Gubler DJ. Dengue and dengue hemorrhagic fever. Clin Microbiol Rev. 1998;11:480–496. doi: 10.1128/CMR.11.3.480.
- 4. Waman PV, Mohan MK, Urmila KK. Genetic diversity and evolution of dengue virus serotype 3: a comparative genomics study. Infect Genet Evol. 2017;49:234–240. doi: 10.1016/j.meegid.2017.01.022.
- 5. Amarilla AA, et al. Genetic diversity of the E protein of dengue type 3 virus. Virol J. 2009;6:113. doi: 10.1186/1743-422X-6-113.
- 6. Ram S, Khurana S, Kaushal V, Gupta R, Khurana SB. Incidence of dengue fever in relation to climatic factors in Ludhiana, Punjab. Indian J Med Res. 1998;108:128–133.
- 7. Singh UB, Maitra A, Broor S, Rai A, Pasha ST, Seth P. Partial nucleotide sequencing and molecular evolution of epidemic causing dengue 2 strains. J Infect Dis. 1999;180:959–965. doi: 10.1086/315043.
- 8. Dash PK, Parida MM, Saxena P, Kumar M, Rai A, Pasha ST, Jana AM. Emergence and continued circulation of dengue-2 (genotype IV) virus strains in northern India. J Med Virol. 2004;74:314–322. doi: 10.1002/jmv.20166.
- 9. Kukreti H, Chaudhary A, Rautela RS, Anand R, Mittal V, Chhabra M, Bhattacharya D, Lal S, Rai A. Emergence of an independent lineage of dengue virus type 1 (DENV-1) and its co-circulation with predominant DENV-3 during the 2006 dengue fever outbreak in Delhi. Int J Infect Dis. 2008;12:542–549. doi: 10.1016/j.ijid.2008.02.009.
- 10. Amarilla AA, de Almeida FT, Jorge DM, Alfonso HL, de Castro-Jorge LA, Nogueira NA, Figueiredo LT, Aquino VH. Genetic diversity of the E protein of dengue type 3 virus. Virol J. 2009;6:113. doi: 10.1186/1743-422X-6-113.
- 11. Annette A, Bennet A, Ajay PJ, Rajendra KB, Suman R, Vinod J. First study of complete genome of Dengue-3 virus from Rajasthan, India: genomic characterization, amino acid variations and phylogenetic analysis. Virol Rep. 2016;6:32–40. doi: 10.1016/j.virep.2016.05.003.
- 12. Benson DA, et al. GenBank. Nucleic Acids Res. 2013;41(Database issue):D36–D42. doi: 10.1093/nar/gks1195.
- 13. Kolekar P, Kale M, Kulkarni-Kale U. Alignment-free distance measure based on return time distribution for sequence analysis: applications to clustering, molecular phylogeny and subtyping. Mol Phylogenet Evol. 2012;65:510–522. doi: 10.1016/j.ympev.2012.07.003.
- 14. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
- 15. Boni MF, Posada D, Feldman MW. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics. 2007;176:1035–1047. doi: 10.1534/genetics.106.068874.
- 16. Martin D, Posada D, Crandall K, Williamson C. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum Retrovir. 2005;21:98–102. doi: 10.1089/aid.2005.21.98.
- 17. Posada D, Crandall KA. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci. 2001;98:13757–13762. doi: 10.1073/pnas.241370698.
- 18. Gibbs MJ, Armstrong JS, Gibbs AJ. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics. 2000;16:573–582. doi: 10.1093/bioinformatics/16.7.573.
- 19. Padidam M, Sawyer S, Fauquet CM. Possible emergence of new geminiviruses by frequent recombination. Virology. 1999;265:218–225. doi: 10.1006/viro.1999.0056.
- 20. Smith JM. Analyzing the mosaic structure of genes. J Mol Evol. 1992;34:126–129. doi: 10.1007/BF00182389.
- 21. Delport W, Poon AFY, Frost SDW, Pond SL. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics. 2010;26(19):2455–2457. doi: 10.1093/bioinformatics/btq429.
- 22. Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Pond SLK. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012;8:e1002764. doi: 10.1371/journal.pgen.1002764.
- 23. Brooks AJ, Johansson M, John AV, Xu Y, Jans DA, Vasudevan SG. The interdomain region of dengue NS5 protein that binds to the viral helicase NS3 contains independently functional importin beta 1 and importin alpha/beta-recognized nuclear localization signals. J Biol Chem. 2002;277:36399–36407. doi: 10.1074/jbc.M204977200.
- 24. Dash PK, Sharma S, Soni M, Agarwal A, Parida M, Rao PVL. Complete genome sequencing and evolutionary analysis of Indian isolates of dengue 2 virus. Biochem Biophys Res Commun. 2013;436:478–485. doi: 10.1016/j.bbrc.2013.05.130.
- 25. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–2729. doi: 10.1093/molbev/mst197.
- 26. Holmes E, Twiddy S. The origin, emergence and evolutionary genetics of dengue virus. Infect Genet Evol. 2003;3:19–28. doi: 10.1016/S1567-1348(03)00004-2.
- 27. Kurane I. Dengue hemorrhagic fever with special emphasis on immunopathogenesis. Comp Immunol Microbiol Infect Dis. 2007;30:329–340. doi: 10.1016/j.cimid.2007.05.010.
- 28. Matsui K, et al. Characterization of dengue complex-reactive epitopes on dengue 3 virus envelope protein domain III. Virology. 2009;384(1):16–20. doi: 10.1016/j.virol.2008.11.013.