PLOS Pathogens. 2022 May 2;18(5):e1010500. doi: 10.1371/journal.ppat.1010500

Beneath the surface: Amino acid variation underlying two decades of dengue virus antigenic dynamics in Bangkok, Thailand

Angkana T Huang 1,2, Henrik Salje 1,3, Ana Coello Escoto 1,4, Nayeem Chowdhury 1, Christian Chávez 1, Bernardo Garcia-Carreras 1, Wiriya Rutvisuttinunt 5, Irina Maljkovic Berry 5, Gregory D Gromowski 5, Lin Wang 3, Chonticha Klungthong 2, Butsaya Thaisomboonsuk 2, Ananda Nisalak 2, Luke M Trimmer-Smith 1, Isabel Rodriguez-Barraquer 6, Damon W Ellison 5, Anthony R Jones 2, Stefan Fernandez 2, Stephen J Thomas 7, Derek J Smith 8, Richard Jarman 5, Stephen S Whitehead 9, Derek A T Cummings 1,*, Leah C Katzelnick 1,4,*
Editor: Anice C Lowen 10
PMCID: PMC9098070  PMID: 35500035

Abstract

Neutralizing antibodies are important correlates of protection against dengue. Yet, the determinants of variation in neutralization across strains within the four dengue virus serotypes (DENV1–4) are imperfectly understood. Studies focus on structural DENV proteins, especially the envelope (E), the primary target of anti-DENV antibodies. Although changes in immune recognition (antigenicity) are often attributed to variation in epitope residues, viral processes influencing conformation and epitope accessibility also affect neutralizability, suggesting possible modulating roles of nonstructural proteins. We estimated effects of residue changes in all 10 DENV proteins on antigenic distances between 348 DENV collected from individuals living in Bangkok, Thailand (1994–2014). Antigenic distances were derived from the response of each virus to a panel of twenty non-human primate antisera. Across 100 estimations, excluding 10% of virus pairs each time, 77 of 295 positions with residue variability in E consistently conferred antigenic effects; 52 were within ±3 sites of known binding sites of neutralizing human monoclonal antibodies, exceeding expectations from random assignments of effects to sites (p = 0.037). Effects were also identified for 16 sites on the stem/anchor of E, which were only recently shown to become exposed under physiological conditions. For all proteins except nonstructural protein 2A (NS2A), root-mean-squared-error (RMSE) in predicting distances between pairs held out in each estimation did not outperform sequences of equal length derived from all proteins or E, suggesting that the antigenic signals present were likely due to linkage with E. Adjusted for E, we identified 62/219 sites embedding the excess signals in NS2A. Concatenating these sites to E additionally explained 3.4% to 4.0% of observed variance in antigenic distances compared to E alone (50.5% to 50.8%); RMSE outperformed concatenating E with sites from any protein of the virus (ΔRMSE, 95%IQR: 0.01, 0.05). Our results support examining antigenic determinants beyond the DENV surface.

Author summary

Dengue viruses, even of the same serotype, are differentially recognized by preexisting antibodies of individuals. With antibody levels being an important indicator of infection risk and pathogenicity, understanding the mechanisms underlying these differences is crucial for vaccine design and development. Investigations have primarily targeted surface regions of the envelope protein (E), where virus-antibody interactions were thought to primarily occur. However, investigation of the roles of non-surface regions of the E protein and of nonstructural proteins has been limited. We looked at the entire virus to identify associations between specific changes in the protein sequence and differences in how viruses were recognized by antibodies. In addition to recovering known determinants on the surface, we found signals in other areas on the structural building blocks of the virus. We also identified additional signals in specific areas of a protein that does not form structures of the virus but orchestrates virus formation. Our results point towards broadening the frame of investigation to gain a more comprehensive understanding of the mechanisms giving rise to antibody recognition of dengue viruses, and may aid the design and evaluation of vaccines and/or assays to characterize dengue immunity.

Introduction

Dengue virus (DENV) is a vector-borne flavivirus with four recognized serotypes, DENV1–4, which circulate in the human population and cause a spectrum of disease ranging from mild to life-threatening. Anti-DENV immunity is complex and imperfectly understood. The long-standing belief is that infection by a strain of one serotype induces long-term protection against the homologous serotype but only protects against other serotypes for a limited amount of time [1, 2]. However, evidence of antigenic heterogeneity within and among the four DENV serotypes challenges this long-standing belief [3]. Instances of reinfection with strains that exhibit reduced neutralization, albeit rare, have also been reported [4]. As neutralizing antibodies are the best supported correlates of protection [5–8], understanding the determinants of these differences is crucial for vaccine design and implementation.

The DENV genome encodes three structural proteins, the envelope protein (E), precursor membrane protein (prM), and the capsid protein (C), and seven non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5) (Fig 1). Anti-DENV antibodies primarily target epitopes on the E protein, prM protein, and the secreted NS1 protein [9, 10]. Previous studies have identified numerous amino acid differences in specific epitopes that modulate neutralization of genotypes within serotypes by both monoclonal antibodies and polyclonal serum antibodies [11–15]. In addition to directly impacting the recognition of epitopes by antibodies, amino acid changes can also affect other viral processes, imposing large effects on the antigenicity of the virus. For instance, a single change on the envelope protein of DENV4 genotype V disrupts the glycosylation motif of the virus. This change both reduces cell infectivity and hinders the binding and neutralization of some monoclonal antibodies [16]. In addition to increasing epitope heterogeneity, variation in prM cleavage was shown in DENV2 to also affect its structural kinetics. Both mechanisms led to differences in antibody recognition [17]. A mutation which alters the structural kinetics introduced into DENV1 made cryptic epitopes on the E protein more accessible, increasing sensitivity to neutralization [18]. As such, residue changes in non-epitope sites may also modulate DENV antigenic properties more broadly but have yet to be studied in detail.

Fig 1. Dengue proteins and antigenicity.

a) Structure of immature (left half) and mature (right half) dengue virion with viral RNA encapsulated. b) Organization of dengue proteins on a polyprotein translated from the viral RNA. c) Representation of a two-dimensional antigenic map of dengue viruses. Viruses in each of the serotypes form antigenic clusters on the map. Antigenic distances can be measured from the map.

Numerous previous studies have identified the antigenic effects of substitutions between serotypes [19], between genotypes [15, 20], or in highly lab-adapted viruses [21]. To our knowledge, there are no previous studies that have examined amino acid variation that arises among closely related lineages co-circulating in the same location over time, where antigenic evolution is likely to be most evident. Thus, while previous studies have identified key regions associated with antigenic effects for one serotype or another, little is known about the positions that are variable and result in antigenic shifts both within and between serotypes, especially those that emerge and disappear naturally during circulation. Comprehensive antigenic characterization with a robust serological assay of a large number of closely related, co-circulating strains enables identification of these more subtle antigenically important sites that might otherwise be missed or overlooked. Further, dense sampling of highly similar strains paired with matched full genome sequences enables screening for amino acid changes outside major epitopes that may have subtle but significant antigenic effects. Such sites are not generally studied in smaller experimental studies of the effects of specific amino acid changes on structural proteins.

To identify associations of antigenic signal with specific substitutions across all DENV proteins, we leveraged a dataset of paired whole genome sequences and neutralizing antibody titers that includes 348 DENV1–4 strains co-circulating in Bangkok, Thailand over two decades (1994 to 2014, S1 Fig) [22]. Viruses were placed onto a 3-dimensional antigenic map using antigenic cartography, a method which fits the Euclidean distance between virus-serum pairs on a map to the difference in titer measurements across a panel of reference sera [3]. Antigenic distances between virus pairs were calculated from these coordinates. We implemented a method that was previously used to estimate effect sizes of substitutions in the E protein on antigenic distances between 47 global DENV1–4 strains [23] on this new dataset. These genetically similar but antigenically diverse viruses allowed us to identify antigenic determinants within serotypes and even within genotypes. In addition to substitutions that were congruent with known epitopes on the E protein, we identified other substitutions that resided in the stem/anchor of E and in nonstructural proteins. After removing substitutions that harbored signals due to co-ancestry with antigenic determinants in E, we found that the remaining substitutions were associated with antigenic effects above that expected by chance. We describe the positions of these substitutions on the protein and discuss their potential role in modulating viral antigenicity. Finally, we probe our virus set for natural experiments to test for observable antigenic effects of individual residues.
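
As a minimal sketch of the last step of the map construction, assuming map coordinates are already available, the antigenic distance between two viruses is the Euclidean distance between their positions on the map. The virus names and coordinates below are made up for illustration and are not taken from the dataset.

```python
import math
from itertools import combinations


def antigenic_distances(coords):
    """Return {(a, b): distance} for every unordered virus pair.

    coords maps a virus name to its (x, y, z) position on the antigenic map.
    """
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


# Hypothetical coordinates, for illustration only.
example_coords = {
    "virus_A": (0.0, 0.0, 0.0),
    "virus_B": (1.0, 2.0, 2.0),
    "virus_C": (1.0, 2.0, 0.0),
}
example_distances = antigenic_distances(example_coords)
```

Each key is a name pair in sorted order, so a lookup such as `example_distances[("virus_A", "virus_B")]` returns the distance for that pair.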

Results

Substitutions in E are associated with antigenic variation

We applied a substitution model similar to that described in Bell et al. [23] to estimate the effects of observed amino acid changes on antigenic distances in the Thai virus dataset. The model assumes that the distances observed between virus pairs are the additive effects of the amino acid substitutions separating them. Virus-specific intercepts were added to account for the contributions of virus-specific measurement uncertainties. Biologically, the intercept quantifies the expected distance between two independent characterizations of the same virus. Estimations were done 100 times with a random 10% of the 120,756 virus pairs held out each time. Substitutions which showed effects in at least 95% of the estimations were deemed antigenically relevant.
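
The additive assumption and the 95% consistency rule can be sketched as follows. This is an illustrative outline, not the authors' implementation: the function names, the unordered labeling of a substitution as (site, residue, residue), and the use of a single average intercept are assumptions made for the example, and the fitting step itself is omitted.

```python
def substitutions_between(seq_a, seq_b):
    """List the substitutions separating two aligned sequences.

    Each substitution is labeled (site, residue_1, residue_2) with the two
    residues sorted, so that A->T and T->A share one label (an assumption).
    Sites are 1-indexed.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    subs = []
    for site, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            r1, r2 = sorted((a, b))
            subs.append((site, r1, r2))
    return subs


def predict_distance(seq_a, seq_b, effects, intercept):
    """Additive model: intercept plus the summed effects of all substitutions."""
    return intercept + sum(
        effects.get(sub, 0.0) for sub in substitutions_between(seq_a, seq_b)
    )


def consistent_substitutions(fold_effects, min_fraction=0.95):
    """Keep substitutions with a nonzero effect in >= min_fraction of folds.

    fold_effects is a list of {substitution: effect} dicts, one per estimation.
    """
    n_folds = len(fold_effects)
    counts = {}
    for effects in fold_effects:
        for sub, value in effects.items():
            if value != 0.0:
                counts[sub] = counts.get(sub, 0) + 1
    return {sub for sub, c in counts.items() if c >= min_fraction * n_folds}
```

With 100 estimations, `consistent_substitutions` keeps a substitution only if it has a nonzero effect in at least 95 of them, mirroring the threshold stated above.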

Using E protein sequences as input, distributions of estimated virus-specific intercepts were similar across the 100 estimations (S2 Fig), with an average of 0.74-fold titer reduction (95%IQR: 0.74, 0.75) across all viruses. This average is similar to the expected antigenic distance resulting from variability of PRNT50 measurements (95%IQR: 0.74, 0.75).

Using the effect size estimates and the average intercept in each fold, we predicted antigenic distances between all virus pairs in the dataset. The predictions of antigenic distances from models fitted with the E protein substitutions showed a tight correlation with the observed distances (correlation coefficient of 0.87, S3 Fig). Benchmarked against predictions assuming distances between centroids of serotypes, the substitutions in E explained 50.5% to 50.9% of the residual variance. To quantify how much within serotype variations were captured, we restricted the calculation to only intraserotype distances and found that 79.1% to 79.3% of the variance was explained.
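
One plausible way to compute these summaries is sketched below. The definition of "residual variance explained" as one minus the ratio of the model's squared error to the baseline's squared error is an assumption, as are the function names; the paper's exact calculation may differ.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted distances."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def residual_variance_explained(observed, baseline, predicted):
    """Fraction of the baseline's squared error removed by the model.

    baseline holds the benchmark predictions (for example, distances between
    serotype centroids); predicted holds the substitution-model predictions.
    """
    sse_baseline = sum((o - b) ** 2 for o, b in zip(observed, baseline))
    sse_model = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - sse_model / sse_baseline
```

Restricting the three input lists to intraserotype pairs before calling `residual_variance_explained` would give the within-serotype version of the same quantity.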

Substitution model captures known epitopes in E

The model identified 394 nonzero effect substitutions positioned on 77 of the 295 sites on the E protein with residue diversity observed in the Thai DENV dataset (Fig 2 and S4 Fig, and S1 File). The number of substitutions identified and number of sites involved changed minimally when we only considered effects present in 100% of estimations as antigenically relevant (S5 Fig). To evaluate whether the nonzero effect size positions were in epitopes previously identified for anti-DENV antibodies, we compared the positions against those compiled in a comprehensive database of DENV monoclonal antibodies (DENVab) [24]. This database includes information on the identified sites which constituted the epitopes (footprints) for 253 anti-DENV human and mouse monoclonal antibodies (mAbs) that were reported in the literature up to 2016, including potently neutralizing antibodies that have been characterized previously [25–27].

Fig 2. Association between effect sites and known epitopes of neutralizing antibodies.

a) Number and percentage of sites with and without effects by whether or not they are part of known epitopes. Odds ratios were calculated either considering epitopes of both human-derived monoclonal antibodies (hmAb) and murine-derived monoclonal antibodies (mmAb) or restricting to hmAb epitopes only. b) Defining neighborhoods of known hmAb epitopes as positions within N sites (linear distance), the probability of nonzero effect sites being within the neighborhood at random (red) is contrasted against the proportion of variable sites that were within the neighborhood (gray). c) Analogous analysis but with neighborhoods defined as being within X angstroms of known hmAb epitopes (3-dimensional spatial distance). N = 0 and X = 0 correspond to neighborhoods consisting of exactly the reported epitope positions.

According to the DENVab database, 159 positions in the E protein contribute to epitopes of characterized anti-DENV mAbs while 336 positions have not yet been associated with any epitopes. Seventy of the mAbs were recorded to have neutralization activity, with footprints involving 111 sites. Of these, 74 sites were variable in our dataset, meaning their effects have the potential to be detected by the model. Of the 77 amino acid positions identified as having non-zero effect sizes by our model, 22 (28.6%) were within these footprints, while 55 (71.4%) were not (Fig 2 and S4 Fig). The odds ratio for nonzero effects being in known epitope sites versus other sites was not significantly different from one (1.28, 95%CI: 0.71, 2.29). Restricting the assessment to footprints of human-derived mAb (hmAb) that were variable in our dataset (57 sites) revealed a significant positive association (odds ratio of 1.90, 95%CI: 1.02, 3.51); 21/77 (27.3%) fell within hmAb footprints. Well characterized binding sites of potently neutralizing type-specific hmAb 1F4, 14C10, 2D22, and 5J7, as well as broadly cross-neutralizing mAbs EDE1–2B2 and EDE1–2C8, were captured (Fig 3).
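
The two odds ratios can be reproduced from the counts reported above. The sketch below assumes a Wald interval on the log odds ratio (the text does not name the interval method) and derives the 2×2 cells from the text: 295 variable sites, 77 with nonzero effects, 74 variable sites in footprints of neutralizing mAbs (22 with effects), and 57 variable sites in hmAb footprints (21 with effects). The function names are illustrative.

```python
import math
from statistics import NormalDist


def odds_ratio_wald(a, b, c, d, level=0.95):
    """Odds ratio and Wald confidence interval for a 2x2 table [[a, b], [c, d]]."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


def table_from_counts(n_variable, n_effect, n_epitope, n_effect_in_epitope):
    """Build the 2x2 cells from marginal counts over variable sites."""
    a = n_effect_in_epitope                 # effect, in epitope
    b = n_effect - n_effect_in_epitope      # effect, not in epitope
    c = n_epitope - n_effect_in_epitope     # no effect, in epitope
    d = n_variable - n_effect - c           # no effect, not in epitope
    return a, b, c, d


all_mab = odds_ratio_wald(*table_from_counts(295, 77, 74, 22))
hmab_only = odds_ratio_wald(*table_from_counts(295, 77, 57, 21))
```

Rounded to two decimals, `all_mab` is (1.28, 0.71, 2.29) and `hmab_only` is (1.90, 1.02, 3.51), which agrees with the values reported in the text.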

Fig 3. Effects of substitutions in the envelope protein.

a) Substitutions with non-zero effect sizes with 95% interquartile range across the 100-fold Monte Carlo cross-validations as whiskers, median as points. Points are colored red if they match positions of known epitopes for monoclonal antibodies compiled in the DENVab database [24]. Gray vertical lines indicate positions with known human-derived monoclonal antibody (hmAb) epitopes, long if within site diversity exists in our dataset and short if not. b) Footprints of potently neutralizing hmAbs, colored red if the positions showed non-zero effects. c) Top and side views of the envelope protein structure with known epitopes colored red if estimated as non-zero effect, and gray if estimated as zero effect. Non-zero effect positions not matching reported hmAb epitopes are in black.

Interestingly, while 36 sites previously identified as DENV-specific hmAb epitopes were marked as zero-effect size by the model, 29 of them (80.6%) were within 3 linear positions of a nonzero effect residue. Conversely, of the 56 nonzero effect sites that did not match the reported hmAb epitopes, 28 were within 3 sites of known hmAb epitopes, suggesting that they plausibly could contribute to epitopes for some previously identified antibodies. The chance of observing at least this amount of overlap (21 captured + 28 proximal) if 77 sites were chosen at random from the 295 sites with variability was small (p = 0.037, Fig 2B). We repeated the analysis using distances extracted from a resolved 3-dimensional structure of E [28]. The chance was also small when proximal sites were defined as being within 3.5 angstroms (p = 0.014, Fig 2C).
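
If effects were assigned uniformly at random to variable sites, the number of effect sites falling in a neighborhood would follow a hypergeometric distribution. The sketch below computes that upper-tail probability. It illustrates the idea rather than the authors' randomization procedure, and the neighborhood size used in the example is a made-up value, because the text does not report how many of the 295 variable sites lie within each neighborhood.

```python
from math import comb


def hypergeom_upper_tail(n_total, n_in_neighborhood, n_drawn, k_observed):
    """P(X >= k_observed) when n_drawn sites are chosen without replacement."""
    upper = min(n_drawn, n_in_neighborhood)
    favorable = sum(
        comb(n_in_neighborhood, k) * comb(n_total - n_in_neighborhood, n_drawn - k)
        for k in range(k_observed, upper + 1)
    )
    return favorable / comb(n_total, n_drawn)


# 295 variable sites, 77 drawn, 49 observed in the neighborhood (from the text);
# 160 sites in the neighborhood is a HYPOTHETICAL value for illustration.
p_example = hypergeom_upper_tail(295, 160, 77, 49)
```

Because the neighborhood size is invented, `p_example` is not expected to match the reported p-values; substituting the true count of variable sites in the neighborhood would be required for a meaningful comparison.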

Sixteen of the sites identified by the model to have antigenic signals were located in the stem/anchor domain, sites that were unlikely targets of antibodies and were not within or near the sites of previously identified epitopes. However, recent conformational studies have revealed large increases in accessibility to sites on the amphipathic stem helices (E:431–448) and the transmembrane helices (E:465–486) when temperature was heightened from 28° C to 37° C (DENV2) and 40° C (DENV1) [29] which may allow these cryptic epitopes to be exposed under these physiological conditions.

Antigenic signals exist in proteins beyond E

Aside from sites on the exposed DENV proteins, non-antigenically relevant sites can also harbor antigenic signals if they are linked to antigenic sites. Phylogenies inferred from individual genes were shown to have branching patterns similar to ones inferred from the complete genome or the open reading frame (ORF), with nonstructural genes, except NS4A, yielding better resolution (i.e., stronger clade support values) than structural genes [30–32] despite some reports of DENV intraserotype recombination [33]. Hence, it is not unexpected that correlated amino acid changes in other proteins, as a result of correlated nucleotide changes due to shared ancestries, would appear as predictive. However, if the association with antigenic signals were only due to associations with other sites, the prediction performance should be bounded at most by the performance of the signal contributing protein and decline as the linkage dissolves. To identify antigenic determinants in proteins other than E, (1) we fitted effect sizes for each of the DENV proteins separately, (2) screened for proteins with predictive performance exceeding that of sites in E, (3) sifted for sites that were consistently associated with antigenic signals when adjusted for sites in E, and (4) showed that the signals in these sites were not coincidentally mapped to antigenic variation at random.

To screen for proteins with more antigenic signal than expected from linkage alone, the performance of antigenic distance predictions for each of the DENV proteins, measured as root mean squared error (RMSE), was compared against 20 random subsamples of the polyprotein (sampled to the length of each respective protein). This adjusts for protein size: as seen in Fig 4, larger proteins are associated with better prediction performance (lower RMSE). Most proteins showed performance equivalent to or worse than the random samples from the polyprotein (95%IQR of the RMSE overlapping or higher than the 95%IQR of the comparator), except for NS2A, which on average outperformed the comparator by 0.04 (0.02, 0.06) in absolute difference in RMSE. A similar analysis, subsampling from sites on the E protein rather than the entire polyprotein, was performed to screen for proteins harboring signals beyond those expected from covariation with sites in E. Again, only NS2A had a lower RMSE (and non-overlapping 95%IQR) than expected: an RMSE difference of 0.03 (0.01, 0.06) from sites sampled from E.
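
The length-matched comparator can be sketched as drawing random site subsets of the same size as the protein under test. The function name, the seed handling, and the sizes in the example are assumptions for illustration; fitting the substitution model to each subset and computing its RMSE is left out.

```python
import random


def matched_length_subsamples(n_sites, length, n_draws=20, seed=0):
    """Draw n_draws random site subsets, each with `length` distinct sites.

    Sites are 0-indexed positions in the sequence being subsampled
    (the polyprotein, or E for the second comparison).
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_sites), length)) for _ in range(n_draws)]


# Placeholder sizes, not the actual protein lengths.
example_subsets = matched_length_subsamples(n_sites=3000, length=200)
```

Each subset would then be scored the same way as the real protein, giving a distribution of RMSE values against which the protein's own RMSE is compared.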

Fig 4. Antigenic signal in each DENV protein.

a) Average within-site variability in DENV proteins observed in the dataset. Bars were annotated with the number of variable sites, total number of sites, and percentage of sites variable. b) Prediction performance of each DENV protein as observed (red) contrasted against expectations derived from random subsamples of sites of the same length from any DENV protein (gray) and random downsamples of sites from the envelope protein (E, blue). Points and lines are the median and 95% interquartile range (IQR) of the root mean squared error (RMSE) evaluated under 100-fold Monte Carlo cross-validation. Lengths of the proteins are shown in parentheses. Only nonstructural protein 2A (NS2A) appeared to have better predictive performance than the expectations.

Antigenic role of NS2A beyond co-ancestry with E

To rule out sites that may have harbored antigenic signals from co-ancestry with E, we concatenated 60 random NS2A sites to the E protein sequence to adjust for its effects and reran the inference. The frequency at which a site had nonzero effect substitutions when included in the random subsets was summarized across 300 randomizations. The number of times sites were included in the subsets ranged from 60 to 107, with a median of 82. Considering 50% nonzero effect frequency as the null, an observed 80% frequency with these sample sizes would achieve a 1% false positive rate and <1% false negative rate under a binomial test. RMSE of the E protein concatenated with NS2A sites was lower than when concatenated with sites outside of E and NS2A (Fig 5A). Changing the number of sites from 60 to 30 yielded consistent results (Fig 5B). The frequency at which sites in NS2A were estimated to have nonzero effects appeared bimodal, showing effects either in <1% or in >99% of the times sampled (Fig 5C). When the 62 sites that both analyses suggested harbor signals beyond linkage with sites in E were concatenated to E, the effects remained (Fig 5D). Its performance (95%IQR of RMSE: 0.68, 0.70) was better than the performance using data concatenating E with random sets of 62 sites from other proteins (95%IQR: 0.71, 0.73), and with the same sites but with residues permuted across viruses (lowest achieved 95%IQR: 0.72, 0.74). The within-site permutation dissolves the association between residues and antigenicity of the virus but maintains the within-site diversity. The consistently lower RMSE of the actual protein sequence supports the existence of antigenic signals in these NS2A sites beyond the linkage with E and indicates that these signals were unlikely to have been mapped to the antigenic patterns at random. The refitted substitution model with these 62 sites concatenated to the sequences of E was able to explain 54.2% to 54.5% of the variation in distances (S3(B) Fig), a 3.3% to 4.0% increase from E alone, with improvements more pronounced for interserotypic variation (5.6% to 6.6% increase) than intraserotypic variation (0.6% to 1.0% increase).
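
The within-site permutation described above can be sketched as shuffling each alignment column independently across viruses. This is a minimal illustration under the assumption that sequences are equal-length strings; the function name and the toy sequences are not from the source.

```python
import random


def permute_within_sites(sequences, seed=0):
    """Shuffle residues across viruses independently at each site.

    Each column keeps exactly the same residues (so within-site diversity is
    conserved), but their assignment to viruses is randomized.
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*sequences)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]


# Toy alignment of four viruses at three sites, for illustration only.
toy_sites = ["ACD", "AKD", "GCE", "GKE"]
toy_permuted = permute_within_sites(toy_sites)
```

Because every column is shuffled on its own, any association between a residue and the antigenic position of the virus carrying it is broken, while the residue counts at each site are unchanged.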

Fig 5. Sites embedding antigenic signals beyond the envelope protein.

Prediction performance of downsampled NS2A sites concatenated with E when randomly downsampled to a) 60 sites and b) 30 sites, contrasted against E concatenated with random sites from other proteins. c) Distribution of frequencies at which sites showed non-zero effects when sampled in the two downsampling schemes. Black lines link frequencies of the same sites. d) Performance when concatenating the 62 sites that were estimated to have non-zero effect sizes in >99% of the times sampled when adjusted for E in both schemes (red), compared against the same sites but permuted (yellow), and sites from other proteins of the same length (gray). Permutation was done by permuting residues observed at each site across viruses to conserve its diversity.

Distributions of antigenic determinants in NS2A

The 62 sites identified in NS2A (S2 File) were scattered throughout the protein, involving all eight predicted transmembrane segments of the protein [34] (S6 Fig), covering 28.8% of NS2A sites: 31.2% of sites in the cytosol, 36.2% in the ER lumen, and 21.1% in the transmembrane domain. None of the protein segments included more nonzero effect sites than would be expected if effects were assigned to variable sites at random (S7 Fig). The protein segment with the lowest chance of observing this number of nonzero effect sites at random was pTMS-2 (p = 0.135, 9/20 sites, 45%). Segment pTMS-2 does not truly traverse the ER membrane but electrostatically interacts with the membrane through residues 30–55 [35]. Given its characterized properties, Nemésio and Villalaín [35] speculated that it has a role in membrane rearrangements during replication. Another region with a high proportion of sites harboring signals is the C-terminus, which influences viral assembly and secretion [36]: 4/9 sites (44%), p = 0.337. It may be noteworthy that the sites both preceding and following pTMS-5, where a predicted amphipathic helix resides [34], also showed nonzero effects. Like pTMS-2, pTMS-5 is located in the ER lumen and does not traverse the membrane, but was not found to be associated with the membrane [34]. The role of pTMS-5 is unclear, but mutation D125A disrupted the signaling of NS1-NS2A cleavage, abolishing viral RNA synthesis [37]. Recent findings revealed how NS2A couples the encapsulation of viral RNA and the assembly of infectious virions [38]. NS2A binds prM in the C-prM-E polyprotein and the 3’UTR of the viral RNA. Following a cascade of cleavages, C traverses the ER membrane, encapsulates the viral RNA, and the nucleocapsid buds with prM and E into the ER lumen. Our identified sites are not at the key molecules that modulate these activities but are proximally localized in the ER lumen. Just as a single amino acid change in a non-epitope site on the E protein could affect the conformation of the virus ensemble, leading to differences in neutralization profiles [39], the influence of these sites on the interaction during the cascade of C-prM-E cleavage and assembly may have led to differences in the resulting virions and thus varied their antigenicity.

E-NS2A coevolution hotspots support interprotein interactions

Coevolution between sites may indicate interprotein interactions [40]. To explore whether antigenic signals in NS2A could be linked to interactions with sites in E, we applied two coevolution detection methods to our dataset. The first method, fastcov [41], retains both site and residue information and takes into account asynchronous changes at different sites. The use of covariance between sites in the method has been shown to correspond well with branching patterns in the phylogeny. S8 Fig illustrates the density of coevolving residue pairs between sites in E and NS2A identified by fastcov. The second method, SpydrPick [42], is a mutual information (MI) based method with phylogenetic signal adjustment that detects coevolution between nucleotide positions. Filtering for pairs of nucleotide positions with MI values greater than the 99th percentile across all position pairs on the genome, NS2A appears to show a comparatively high level of coevolution with E compared to other proteins (S9 Fig). The coevolution hotspots suggested by both methods were around positions 40 (pTMS-2), 115 (pTMS-4), and 160 (pTMS-6) in the NS2A protein, which coincide with regions of high diversity and the identified nonzero effect sites. These results suggest possible interactions between E and NS2A at these sites.
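
To make the mutual information idea concrete, the sketch below computes plain MI (in bits) between two alignment columns. It is a simplified illustration only: it does not include SpydrPick's phylogenetic signal adjustment or fastcov's covariance-based approach, and the function name and toy columns are assumptions.

```python
import math
from collections import Counter


def mutual_information(column_x, column_y):
    """Mutual information (bits) between two equally long alignment columns."""
    if len(column_x) != len(column_y):
        raise ValueError("columns must cover the same set of sequences")
    n = len(column_x)
    count_x = Counter(column_x)
    count_y = Counter(column_y)
    count_xy = Counter(zip(column_x, column_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy columns: perfectly coupled positions versus independent positions.
mi_coupled = mutual_information("AATT", "CCGG")
mi_independent = mutual_information("AATT", "CGCG")
```

Perfectly coupled columns with two equally frequent states give 1 bit, while independent columns give 0; a genome-wide scan would compute this for every position pair and keep those above a high percentile, as described above.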

Effects of substitutions are background-specific

Drawing from the existing diversity of the 348 closely-related virus strains in our dataset, we examined whether the marginal effects identified in the substitution model could be observed for viruses separated by individual substitutions. We consider viruses with identical sequences in E and the 62 nonzero effect sites in NS2A as effectively identical. With the high genetic similarity between viruses in our dataset, we were able to identify pairs of viruses that were separated by a substitution of interest (virus i vs. virus j) and a control virus that was otherwise effectively identical (virus jc). We identified a sufficient number of these ‘triplets’ (i, j, jc) to test isolated effects of six substitutions in footprints of human-derived mAb (hmAb), one substitution in EDI/II/III but outside of known mAb epitopes, eight substitutions in the stem/anchor domain of E, and twenty substitutions in NS2A. No nonzero effect substitutions in footprints of murine-derived mAb (mmAb) but outside of hmAb footprints had sufficient virus triplets for evaluation. The number of virus triplets was primarily limited by the low number of control viruses due to coupling of the substitutions with other substitutions (470/698 substitutions). Notably, of the strongest effect sizes observed in our models, 138/229 substitutions with effect size >0.5 were not testable with the triplet analysis because these mutations were often accompanied by other antigenically important changes.

We found broad correspondence between differences in antigenic distances observed from virus triplets and effect sizes estimated by our model in all substitution groups (S10 Fig). For all substitutions, differences in antigenic distance observed from virus triplets (ΔDm) have wide 95%IQR. Given that we have matched for all changes in E and the 62 NS2A sites, we suspect that the wide intervals are due to the small sample sizes of ‘testable’ triplets. This validation is thus likely underpowered and cannot overcome variability of the measurements, an issue that would also likely affect experimental studies introducing individual mutations synthetically into infectious clones. We found that none of the 6 testable substitutions in footprints of hmAb had significant effects (S11 Fig). However, the genetic background had an important effect on the significance of each triplet. Take for example a substitution in the footprint of hmAbs on E, M160K, which has been shown experimentally to have a modest antigenic effect [43]. S12 Fig contrasts the overall distribution of ΔDm for M160K against ΔDm associated with each virus tested individually. Nearly half of the individual viruses have significant effects, and these effects are clustered when mapped to the phylogeny, indicating the effect is dependent on the background genome (S13 Fig). This suggests that the particular virus this mutation is introduced into will affect the magnitude of the antigenic effect observed, even when working with closely related viruses of the same genotype circulating in a single city over time.

In the few substitutions that involved multiple serotype pairs, effects of the substitutions appeared to vary by serotype. For instance, although significant overall effects were observed for E:S169P, DENV2-DENV3 pairs were far from rejecting the null (S14 Fig). This heterogeneity further suggests that the effects of substitutions are background-dependent, which also partly explains the wide variation observed in ΔDm pooled across virus triplets with variable backgrounds.

We also performed the triplet analyses on other sites in E and in NS2A. We found significant effects (p ≤ 0.05) for 1 of 1 substitutions in EDI/II/III but outside of known mAb epitopes (S169P, S14 Fig), 0 of 8 substitutions in the stem/anchor of E (S15 Fig), and 2 of 20 substitutions in NS2A (L19F and C41L, S16 Fig). The NS2A substitution C41L is in one of the coevolution hotspots with E, and is within pTMS-2, the region most associated with antigenic effect in our larger model.

Discussion

Through studying viruses sampled from a concentrated locality over long periods of time, we were able to recover a large portion of known antigenically relevant sites targeted by neutralizing mAbs. The small number of substitutions separating the viruses within each serotype (and genotype) allowed the substitution model to more precisely draw the link between genetic and antigenic heterogeneity compared to past studies [23]. The fact that viruses of each serotype in the Thai dataset were mainly of a particular genotype means that these changes were associated with antigenic variation within genotypes, an aspect that has rarely been studied.

Our studies of the E protein suggest our model is likely conservative in attributing effects to sites/substitutions and is returning hits more specific to antibody responses in primates. Of the sites on the E protein marked as antigenically relevant by our model, 63.6% were within or neighboring known human epitopes but not mouse epitopes. This association was greater than expected by chance within 3 positions or 3.5 angstroms of known epitopes. Of the remaining antigenically relevant sites, 16/28 were in the stem/anchor domains, which have recently been shown to become exposed under physiological conditions, although mAbs targeting these sites have yet to be identified. These comparisons provide support for antigenic signal in sites as measured by polyclonal responses, which may be similar to identified monoclonal antibodies but may target the same antigenic regions in a slightly different way. Alternatively, some of the sites we identified were not near known epitopes. Our findings suggest that polyclonal antisera may target epitopes beyond those of currently identified monoclonal antibodies and also support recent studies showing that changes at specific sites may introduce global changes to the virus that affect polyclonal neutralization in a non-epitope-specific manner.

With the availability of whole genome sequences, we were able to assess the presence of antigenic signals in all DENV proteins. In doing so, we detected an excess of signal in NS2A. We went on to identify the sites embedding the information in NS2A and observed a small gain in antigenic variation explained, especially for distances between serotypes (5.6% to 6.6%), compared to only considering sites in E. Many of these sites were found to coincide with E-NS2A hotspots in our coevolution analyses. These findings suggest that changes in replication machinery (NS2A) in addition to changes in structural proteins (E) may influence antigenic properties of the virus. We did not find an association between other segments of the genome and antigenic change. Notably, we did not include untranslated regions (UTRs) in our analyses, despite work suggesting their roles in replication [44]. However, their contributions, if any, may not be totally lost so long as linkage between sites in the coding sequence and the UTR is retained. Also, we did not account for recombination between DENV, which has been reported to occur within serotypes between homologous sites [33]. Although recombination complicates phylogenetic reconstructions, our model is unlikely to be affected by it because the model is phylogeny-free. In fact, recombination accelerates the breakdown of linkage between sites, increasing the diversity of sequence combinations, which makes effects of individual substitutions more likely to be detected.

An important part of our analysis approach is that we are able to query residues across all four serotypes and diverse circulating strains, enabling us to identify the effect of individual mutations within specific genetic backgrounds in an epidemiological context. To further evaluate how the observed effects hold across genetic backgrounds, we tested whether viruses with and without identified antigenic determinants in E and NS2A differ in antigenicity in the absence of other sources of antigenically relevant changes, thus drawing on the redundancy in our existing dataset to identify the antigenic effects attributable to single amino acid changes. These analyses are a prerequisite for experimental studies to test individual mutations in a clonal background. Our analysis shows that the background virus is important, suggesting experimental studies to identify substitutions driving antigenic changes should be conducted using infectious clones specific to the virus population under study. As designing infectious clones for flaviviruses is difficult, these substitutions provide the best candidates for extensive studies to uncover molecular mechanisms underlying the relationship between these substitutions and changes in antigenicity of DENV.

Measurement of the genetic determinants of antigenic difference will inform development of diagnostics to allow finer characterization of virus antigenic properties and establish the link between antigenic variation and severity of infection [45]. Further, identification of the genetic changes that contribute to antigenic variation is an enabling step towards the study of DENV evolution. Replacements of invading genotypes have primarily been attributed to substitutions that confer functional advantage, e.g., infectivity in specific cell types [46] and transmissibility [47, 48], which do not depend on past infection histories in the population. However, differences in susceptibility to heterotypic immunity have also been linked to clade replacements, suggesting antigenic selection may have a role in DENV evolution [49, 50]. In support of this hypothesis, we recently showed that antigenic traits of DENV have changed over time and are associated with both epidemic dynamics and genotype replacement [22]. The genetic determinants of antigenic differences identified in our study here will enable formal inferences into these evolutionary processes. Deeper exploration of DENV antigenic variation and factors underlying its evolution is needed to inform development of more broadly effective preventive and therapeutic countermeasures.

Material and methods

Ethics statement

This study was approved by the ethical review boards of the Queen Sirikit National Institute of Child Health, Walter Reed Army Institute of Research, and Johns Hopkins Bloomberg School of Public Health (former location of DATC) and University of Florida. The work of NIH and WRAIR was deemed non-human subjects research by their respective IRBs. Because researchers at UF, Cambridge and QSNICH can link these data to identifiers (age and location, though not used in this study), IRB approval was obtained at these institutions. These IRB approvals (Queen Sirikit National Institute of Children’s Health (QSNICH 61–062), University of Florida (UF IRB201701844) and the University of Cambridge (HBREC.2019.36)) include a waiver of consent. We followed the National Institutes of Health guidelines for the humane treatment of laboratory animals. The NIAID Animal Care and Use Committee approved these protocols (11DEN33 and 14DEN34, parent protocol NIAID ASP LID 9).

Data

Our study utilized whole genome sequences and 3-dimensional antigenic map coordinates of 348 DENV previously characterized by Katzelnick et al [22]. In brief, 1,944 isolated viruses were derived from serum specimens collected from acute illnesses admitted to the Queen Sirikit National Institute of Child Health (QSNICH) in Bangkok, Thailand, mostly between 1994 and 2014. Aside from a genotype replacement of DENV3 from genotype II to genotype III, viruses were primarily of a single dominant genotype for each serotype (DENV1 genotype I, DENV2 genotype Asian I, DENV4 genotype I). From the 1,944 whole genome sequences acquired (667 DENV1, 440 DENV2, 454 DENV3, and 383 DENV4), the isolates were systematically sampled to represent amino acid variations in the envelope (E) protein and pre-membrane (prM) protein and to balance across all years between 1994 and 2014, resulting in 348 virus isolates (18%; 87 DENV1, 80 DENV2, 90 DENV3, and 91 DENV4) being antigenically characterized. Plaque reduction neutralization titers (PRNT) for a panel of twenty anti-DENV antisera were determined for all 348 viruses. The antisera were of Chlorocebus sabaeus challenged with global representative DENV strains, five per serotype, as described elsewhere [3]. Using antigenic cartography, viruses were placed onto map coordinates in N-dimensions to best preserve the measured PRNT50 titers, finding 3-dimensions to be optimal. Euclidean distances on the map are in units of log2 neutralization titer reductions. Antigenic data are stored on Zenodo (doi:10.5281/zenodo.5365818). To obtain pairwise antigenic distances between the viruses, we computed the Euclidean distances between their coordinates. Substitutions separating virus pairs were identified from translated coding sequences of each gene in the whole genome sequence alignment. 
All sequence data is publicly available on GenBank (accession numbers: KY586306 to KY586946, MW881266, MW945425 to MW945427, MW945430, MW945433 to MW945437, MW945454 to MW945763, MW945772 to MW946604, MW946607 to MW946985).
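As a minimal illustration of the distance step described above, the sketch below computes pairwise Euclidean distances from 3-dimensional map coordinates. The function name, virus names, and coordinates are hypothetical and are not values from the study's dataset.

```python
import math
from itertools import combinations


def pairwise_antigenic_distances(coords):
    """Return {(a, b): Euclidean distance} for every unordered virus pair.

    coords maps a virus name to its (x, y, z) antigenic map coordinates,
    so distances are in map units (log2 neutralization titer reductions).
    """
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


# Hypothetical coordinates for illustration only.
example_coords = {
    "virus_A": (0.0, 0.0, 0.0),
    "virus_B": (3.0, 4.0, 0.0),
    "virus_C": (0.0, 0.0, 2.0),
}
example_distances = pairwise_antigenic_distances(example_coords)
```

Because the distances come from map coordinates, the result is symmetric by construction, which is why only unordered pairs are stored here.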

Substitution effect size estimation

We adapted the substitution model described in Bell et al. [23] to analyze the data in our study.

$$D_{ij} \approx \hat{D}_{ij} = \sum_{m} d_m + v_i + v_j$$

Our model approximates the observed antigenic distance Dij between virus i and virus j by the predicted antigenic distance D̂ij. The predicted distance is the sum of the effects of all substitutions present between the two viruses, ∑m dm, where dm is the effect of a single substitution m, and the virus-specific measurement uncertainties, vi and vj. For a pair of identical viruses, ∑m dm = 0, so any antigenic distance observed between them equals vi + vj. The use of map distances in our study intrinsically implies Dij = Dji. Nonetheless, we followed the initial formulation and allowed the effects of a residue change m and its inverse to be asymmetric. An L1 regularization term on dm was added to favor attributing the effects to a small number of substitutions. Virus-specific intercepts were under L2 regularization. Weights of these regularization terms were governed by three parameters, which were set to the values used in Bell et al., λ = 3.0, κ = 0.6, δ = 1.2, where the relatively high value of λ disfavors attributing effects to substitutions, reducing the chance of false attributions. Results were shown to be insensitive to these values [23]. Parameters were solved as quadratic programming problems under nonnegativity constraints using the function pnnqp in the R package lsei, minimizing the following cost function.

$$C = \sum_{i,j} \left(\hat{D}_{ij} - D_{ij}\right)^2 + \lambda \sum_{m} d_m + \kappa \sum_{i} v_i^2 + \delta \sum_{j} v_j^2$$
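The following is a minimal sketch of how a cost of this form (squared error, an L1 penalty on substitution effects, and L2 penalties on virus intercepts) could be minimized under nonnegativity constraints. It uses plain cyclic coordinate descent, not the pnnqp routine used in the study. The function names, the data layout, and the toy pairs are hypothetical. It also assumes that each pair contains two distinct viruses and that both intercept penalties run over every virus, so each intercept is penalized with weight κ + δ.

```python
def fit_substitution_model(pairs, lam=3.0, kappa=0.6, delta=1.2, sweeps=500):
    """Fit nonnegative substitution effects d and virus intercepts v.

    pairs: list of (virus_i, virus_j, set_of_substitutions, observed_distance).
    """
    subs = sorted({m for _, _, ms, _ in pairs for m in ms})
    viruses = sorted({x for i, j, _, _ in pairs for x in (i, j)})
    d = dict.fromkeys(subs, 0.0)
    v = dict.fromkeys(viruses, 0.0)

    def predict(i, j, ms):
        return sum(d[m] for m in ms) + v[i] + v[j]

    for _ in range(sweeps):
        for m in subs:
            rows = [r for r in pairs if m in r[2]]
            # Residuals with this substitution's own contribution removed.
            rho = sum(obs - predict(i, j, ms) + d[m] for i, j, ms, obs in rows)
            d[m] = max(0.0, (rho - lam / 2.0) / len(rows))
        for x in viruses:
            rows = [r for r in pairs if x in (r[0], r[1])]
            rho = sum(obs - predict(i, j, ms) + v[x] for i, j, ms, obs in rows)
            # Assumption: each intercept carries both L2 weights.
            v[x] = max(0.0, rho / (len(rows) + kappa + delta))
    return d, v


def cost(pairs, d, v, lam=3.0, kappa=0.6, delta=1.2):
    """Evaluate the penalized cost for given effects and intercepts."""
    squared_error = sum(
        (sum(d[m] for m in ms) + v[i] + v[j] - obs) ** 2
        for i, j, ms, obs in pairs
    )
    penalty = lam * sum(d.values()) + (kappa + delta) * sum(
        x * x for x in v.values()
    )
    return squared_error + penalty


# Hypothetical toy data: three viruses, two substitutions.
toy_pairs = [
    ("A", "B", {"m1"}, 2.0),
    ("A", "C", {"m1", "m2"}, 5.0),
    ("B", "C", {"m2"}, 3.0),
]
toy_d, toy_v = fit_substitution_model(toy_pairs)
```

Each coordinate update is the closed-form minimizer of the cost in that single variable, clipped at zero, so the cost never increases from one update to the next.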

Effect size estimations were repeated 100 times, each time including a random 90% of the virus pairs. The 10% held out were used to test the performance of each estimation. In each estimation, substitutions present in only one virus pair were excluded. To avoid collinearity, sites with the same residue patterns were grouped into clusters. Estimates were summarized by their median and 95% interquartile range (IQR). Substitutions were determined as having nonzero effects if the 95%IQR excluded zero. Root mean squared error (RMSE) evaluated on the test sets was used to describe the prediction performance of the fits.
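The repeated hold-out procedure and the summary rule can be sketched as below. This is an illustration under stated assumptions, not the study's code: the helper names are hypothetical, and the "95% interquartile range" is read here as the interval between the 2.5th and 97.5th percentiles of the estimates.

```python
import random
import statistics


def split_pairs(pairs, held_out_fraction=0.1, rng=random):
    """Randomly split virus pairs into (training, held-out) lists."""
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * held_out_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(estimates):
    """Median, central 95% interval, and whether the interval excludes zero."""
    lower = percentile(estimates, 2.5)
    upper = percentile(estimates, 97.5)
    return {
        "median": statistics.median(estimates),
        "lower": lower,
        "upper": upper,
        "nonzero": lower > 0 or upper < 0,
    }
```

In use, `split_pairs` would be called once per estimation, the model fitted on the training list, and `summarize` applied to the collected estimates of each substitution.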

Performance of antigenic distance predictions

We evaluated the performance of the model separately for predicting antigenic distances based on mutations in the E protein, each DENV protein, each DENV protein concatenated to E, and within NS2A. For each of the 100 estimations, we predicted the antigenic distances for the 10% of virus pairs held out during the estimation process. To predict antigenic distances for pairs whose virus-specific intercepts are not known, we summed the effects of the substitutions separating them and added twice the mean per-virus intercept to the sum. We computed the root mean squared error (RMSE) between predicted distances and antigenic distances derived from the 3-dimensional antigenic map. We report the median and 95% interquartile ranges (IQR) across the 100 estimations.
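The held-out prediction and its error can be written compactly as follows. The function names are hypothetical, and treating substitutions without an estimated effect as contributing zero is an assumption of this sketch.

```python
import math


def predict_held_out(substitutions, effects, mean_intercept):
    """Sum the effects of the substitutions separating a pair and add
    twice the mean per-virus intercept.

    Assumption: substitutions with no estimated effect contribute zero.
    """
    return sum(effects.get(m, 0.0) for m in substitutions) + 2.0 * mean_intercept


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```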

Quantifying effects of measurement variability on antigenic distance

We synthesized 100 datasets that included multiple sets of measurements of the same virus to study the amount of antigenic distance attributable to measurement variability of the PRNT50 assay. For each dataset, we randomly selected eight viruses (two per serotype) and synthesized four sets of PRNT measurements for each of these viruses by multiplying the original titers by scaling factors 10^m, with m sampled from a normal distribution of mean one and variance 0.13 [51]. Together with observed measurements of the remaining viruses in the original dataset, 3-dimensional coordinates of the synthetic entries were inferred through antigenic cartography. We computed the pairwise distances between synthetic entries of the same viruses and divided the amount by two to quantify the contribution of each virus entry.
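A sketch of the titer perturbation step is given below. The defaults follow the mean and variance stated in the text. Drawing an independent exponent for every titer, rather than one per measurement set, is an assumption of this sketch, and the function name is hypothetical. The cartography step is not reproduced.

```python
import math
import random


def synthesize_titers(titers, n_sets=4, mean=1.0, variance=0.13, rng=None):
    """Return n_sets noisy copies of one virus's titers.

    Each titer is multiplied by 10**m, with m drawn from a normal
    distribution with the given mean and variance.
    Assumption: m is drawn independently for every titer.
    """
    rng = rng or random.Random()
    sd = math.sqrt(variance)
    return [
        [t * 10 ** rng.gauss(mean, sd) for t in titers]
        for _ in range(n_sets)
    ]
```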

Assessing sensitivity of effect determination threshold

Corrections for multiple comparisons involve adjusting the stringency of significance thresholds [52]. We counted the number of estimations that each substitution showed nonzero effect and divided the count by the number of estimations at which effect size estimation of the substitution was attempted to obtain the proportion of estimations in which substitutions showed nonzero effect. We examined the change in number of substitutions with significant effects as we increased the threshold proportion.
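The threshold sweep described above can be sketched as follows. The names and the input layout (one boolean per estimation in which a substitution was fitted) are hypothetical choices for illustration.

```python
def nonzero_proportions(flags_by_substitution):
    """Proportion of estimations in which each substitution was nonzero.

    flags_by_substitution maps a substitution to a list of booleans, one
    per estimation in which its effect size was estimated.
    """
    return {
        m: sum(flags) / len(flags)
        for m, flags in flags_by_substitution.items()
        if flags
    }


def count_passing(proportions, thresholds):
    """Number of substitutions whose proportion meets each threshold."""
    return {
        t: sum(p >= t for p in proportions.values())
        for t in thresholds
    }
```

Raising the threshold from 0.95 toward 1.0 then shows how many substitutions would be lost under a stricter rule.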

Assessing association between effect sites and known epitopes

For a set of epitope neighborhood sites M and a set of nonzero effect sites S, the observed number of overlapping sites equals |M ∩ S|. If |S| nonzero effect sites were sampled from the set of variable sites V at random, we would expect the proportion of overlap p to be |M ∩ V| / |V|. Because effects can only be attributed to variable sites, S ⊆ V, it follows that |S ∩ V| = |S| and |M ∩ S ∩ V| = |M ∩ S|. The binomial probability of observing an overlap of at least |M ∩ S| if S was sampled from V at random equals

$$\sum_{u=|M \cap S|}^{|S|} \binom{|S|}{u} \, p^{u} (1-p)^{|S|-u}$$
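The binomial tail probability can be evaluated directly, as in this short sketch. The function names are hypothetical, and sites are represented as Python sets of positions.

```python
from math import comb


def expected_overlap_proportion(epitope_neighborhood, variable_sites):
    """Proportion of variable sites that fall in the epitope neighborhood."""
    return len(epitope_neighborhood & variable_sites) / len(variable_sites)


def overlap_tail_probability(n_overlap, n_effect_sites, p):
    """P(X >= n_overlap) for X ~ Binomial(n_effect_sites, p)."""
    return sum(
        comb(n_effect_sites, u) * p ** u * (1 - p) ** (n_effect_sites - u)
        for u in range(n_overlap, n_effect_sites + 1)
    )
```

Here `n_overlap` plays the role of the observed overlap between epitope neighborhood sites and nonzero effect sites, and `n_effect_sites` the number of nonzero effect sites.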

Identifying E-NS2A coevolution hotspots

Coevolution analyses were performed using sequences of all 1,944 Thai viruses in our study. An alignment of E protein sequences concatenated with NS2A sequences was used as input for fastcov v1.0.3 [41] with default configurations. Covarying residues passing default significance thresholds were extracted. SpydrPick v1.2.0 [42] was run with the --linear-genome flag on whole genome alignments of the viruses. Pairs of nucleotide positions with mutual information (MI) values greater than the 99th percentile of MI values across all pairs were extracted. The density of pairs extracted by each method was visualized using geom_density_2d_filled in the R package ggplot2 [53].

Assessing observable isolated effects of substitutions

To evaluate further how the specific substitutions estimated to have nonzero effects by the substitution model hold across genetic backgrounds, a suitable first step is to test whether viruses with and without these substitutions differ in antigenicity in the absence of other sources of antigenically relevant changes. Thus, we queried our dataset for virus triplets to simulate, as closely as possible, experimental validation using infectious clones, where each mutation would be introduced separately into clonal backgrounds. Because our outcome measure is antigenic distance, the equivalent experiment would be to take a reference virus i and measure the fold-difference in titers across all sera in the serum panel to virus j. We would then do the same with control virus jc, which is equivalent to virus j apart from lacking the substitution of interest. All measures of distance are antigenic distances between pairs of viruses, which are related to the fold-drop in neutralization titers.

In our dataset of Thai DENV, we identified virus pairs (i, j) that were separated by a set of substitutions M containing the substitution of interest (m ∈ M), then queried for control viruses jc for which the substitutions separating (i, jc) equal M − {m}. As a result of the common substitution requirement, j and jc were always of the same serotype. For each virus triplet (i, j, jc) identified, we computed the difference in observed antigenic distance between (i, j) and (i, jc). We denote this difference as ΔDm. In considering only substitutions in E and the 62 sites in NS2A, our analysis assumes that substitutions outside of E and the 62 NS2A sites do not contribute to antigenic changes. We derived the p-value for rejecting the null hypothesis that ΔDm ≤ 0 by calculating the proportion of triplets with ΔDm ≤ 0. As effects may be background dependent, the calculations were also done separately for each serotype pair of (i, j) identified. Calculations were limited to sets of virus triplets that involved more than two distinct viruses i and had more than 30 triplets identified.
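The triplet query and the resulting p-value can be sketched as follows. The data layout is a hypothetical choice: substitution sets and distances are keyed by ordered pairs (i, j), and all names are illustrative. The filters on the number of distinct reference viruses and the number of triplets are not applied here.

```python
def find_triplets(substitutions, m):
    """Yield (i, j, jc) where (i, j) is separated by a set M containing m
    and (i, jc) is separated by exactly M without m.

    substitutions maps an ordered pair (i, j) to a frozenset of the
    substitutions separating the two viruses.
    """
    by_reference = {}
    for (i, j), subs in substitutions.items():
        by_reference.setdefault(i, {}).setdefault(subs, []).append(j)
    for (i, j), subs in substitutions.items():
        if m in subs:
            for jc in by_reference[i].get(subs - {m}, []):
                yield i, j, jc


def delta_distances(triplets, distance):
    """Difference in antigenic distance between (i, j) and (i, jc)."""
    return [distance[(i, j)] - distance[(i, jc)] for i, j, jc in triplets]


def p_value(deltas):
    """Proportion of triplets whose difference is at most zero."""
    return sum(x <= 0 for x in deltas) / len(deltas)
```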

Supporting information

S1 Fig. Time-calibrated maximum likelihood phylogenies of virus isolates.

Collected from Queen Sirikit National Institute of Child Health (QSNICH) between 1994–2014. Viruses selected for antigenic characterization were marked as orange circles.

(PDF)

S2 Fig. Virus-specific intercepts fitted using E protein sequences.

a) Distributions and b) variation in virus-specific intercepts estimated using E protein sequences across the 100 estimations. Gray horizontal lines represent the mean intercepts across viruses for each of the estimations. c) Boxplot illustrating the amount of distance attributable to measurement variability across 100 synthetic samples. Divided by two to represent the per virus contribution. Thick lines denote the means (black) and medians (orange).

(PDF)

S3 Fig. Relationship between observed antigenic distance and antigenic distance predicted by the substitution model.

a) when effects were fitted to envelope protein sequences (E) and b) when effects were fitted to E concatenated with 62 nonzero effect sites in nonstructural protein 2A (NS2A).

(PDF)

S4 Fig. Association between effect sites and known epitopes of neutralizing antibodies.

a) Number and percentage of sites with and without effects by whether or not they are part of known epitopes. Odds ratios were calculated by either considering epitopes of both human-derived monoclonal antibodies (hmAb) and murine-derived monoclonal antibodies (mmAb) and when only restricted to hmAb epitopes. Defining neighborhoods of known epitopes as positions within N sites away (linear distance), the probability of nonzero effect sites being within the neighborhood at random (red) are contrasted against the proportion of variable sites that were within the neighborhood (gray): b) known epitopes for either hmAb or mmAb, c) known epitopes for hmAb, and d) known epitopes for mmAb outside of hmAb epitopes. N = 0 was when the neighborhood was exactly at the reported epitope positions. e, f, g) Respective analogous analysis but with neighborhoods defined as being within X angstroms away from known epitopes (3-dimensional spatial distance). X = 0 was when the neighborhood was exactly at the reported epitope positions.

(PDF)

S5 Fig. Proportion of estimations in which substitutions showed nonzero effect.

a) Substitutions in envelope protein (E) only, ordered by the proportion at which substitutions showed nonzero effect across the 100 estimations. Substitutions identified by our threshold of 95% were highly similar to those identified at the maximum stringency of 100%; 372/394 substitutions (94.4%). Involvement was retained in 76/77 (99%) of the sites. b) In the analysis where E was concatenated to the 62 nonstructural protein 2A (NS2A) sites which consistently showed nonzero effects in our site sampling analysis, 292/304 substitutions (96.1%) in the NS2A sites remained nonzero at a threshold of 100%. Involvement was retained in 62/62 (100%) of the sites. Proportions corresponding to nonzero effect substitutions reported in our study (threshold of 95%) are colored red.

(PDF)

S6 Fig. Substitutions with non-zero effect sizes in NS2A.

Median effect size of substitutions across the 100-fold Monte Carlo cross-validations shown as points, 95% interquartile range as whiskers. Points are colored by locations of the sites: ER lumen (green), transmembrane (yellow), or cytosol (blue). Locations of the sites and domain annotations were taken from [34].

(PDF)

S7 Fig. Distribution of nonzero effect sites across NS2A segments.

a) Total number of sites in each segment (hollow), number of variable sites (filled black), and number of sites estimated to have nonzero effects (filled red). b) Probability that at least these number of nonzero effect sites were associated with the segments at random. Amino acid positions of the segments shown in parentheses.

(PDF)

S8 Fig. Density of coevolving residue pairs detected by fastcov.

Density values were scaled to maximum value of one. Distributions of nonzero effect substitutions (red) and site-specific Wu-Kabat variability coefficient (gray) of the respective proteins are shown on top (nonstructural protein 2A, NS2A) and side (envelope protein, E).

(PDF)

S9 Fig. Density of coevolving nucleotide pairs detected by SpydrPick.

a) Density of nucleotide positions with mutual information (MI) values greater than 99th percentile of MI values between pairs throughout the DENV genome. Density scaled to maximum value of one. Thin rectangle corresponds to coevolution relationship between E gene (y-axis) and sites throughout the genome. Thick rectangle highlights relationship between E gene and NS2A gene. b) Density plot expanding the highlighted region in panel (a).

(PDF)

S10 Fig. Relationship between difference in antigenic distance observed in virus triplets and effect size estimates from the substitution model.

Shown separately for substitutions located in epitopes of human-derived monoclonal antibodies (hmAb), E domain I/II/III but outside of known epitopes (EDI/II/III), E stem/anchor domain, and nonstructural protein 2A (NS2A). Points are the medians of the observations/estimates. Lines are 95% interquartile ranges.

(PDF)

S11 Fig. Effects of substitutions in footprints of human-derived mAb (hmAb).

Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified.

(PDF)

S12 Fig. Observable effects of substitution differ within the same serotype pair.

a) Distribution of difference in antigenic distance, ΔDm, for E:M160K substitution including all triplets with the same serotype pair (DENV2, DENV2) and the resultant p-value shown in comparison to b) median and 95% interquartile range of ΔDm shown separately for each virus j involved in the virus triplets and their respective p-values.

(PDF)

S13 Fig. Distribution of virus-specific difference in antigenic distance on the phylogeny.

Median difference in antigenic distance, ΔDm, specific to each virus j involved in the virus triplets shown in S12 Fig are colored on the phylogeny. Points are shown as solid circles for p-values ≤ 0.05 and as hollow triangles otherwise.

(PDF)

S14 Fig. Effects of substitutions in EDI/II/III but outside of known mAb epitopes.

a) Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified. b) Distribution of difference in antigenic distance for substitution with p-value ≤ 0.1 colored by serotypes of the virus pairs.

(PDF)

S15 Fig. Effects of substitutions in the stem/anchor domain of E.

Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified.

(PDF)

S16 Fig. Effects of substitutions in nonstructural protein 2A (NS2A).

a) Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified. b) Distribution of difference in antigenic distance for substitutions with p-value ≤ 0.1 colored by serotypes of the virus pairs.

(PDF)

S1 File. Nonzero effect substitutions in envelope protein (E).

(CSV)

S2 File. Nonzero effect substitutions in nonstructural protein 2A (NS2A).

(CSV)

Acknowledgments

We thank the study participants and all individuals involved in the collection and isolation of virus samples which were used to generate data used in this study.

Disclaimer

Material has been reviewed by the Walter Reed Army Institute of Research. There is no objection to its presentation and/or publication. The opinions or assertions contained herein are the private views of the author, and are not to be construed as official, or as reflecting the views of the U.S. Department of the Army, the U.S. Department of Defense, or the U.S. Government.

Data Availability

The authors confirm that all data underlying the findings are fully available without restriction. Data and code for the analyses are held in Zenodo (https://doi.org/10.5281/zenodo.5615512).

Funding Statement

This research was supported by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases (SSW, LCK), National Institute of Allergy and Infectious Diseases and National Institutes of Health Grant R01AI114703-01 (https://www.nih.gov, ATH, HS, ACE, NC, CC, BGC, WR, IMB, GDG, LW, CK, BT, AN, LMT, DWE, ARJ, SF, DJS, RJ, DATC, LCK), the Military Infectious Disease Research Program (https://midrp.amedd.army.mil, ATH, WR, IMB, GG, RJ), and a European Research Council Grant 804744 (https://erc.europa.eu, HS). Sequencing for infectious disease surveillance was additionally supported by the Global Emerging Infections Surveillance (GEIS) Branch (https://www.health.mil/Military-Health-Topics/Combat-Support/Armed-Forces-Health-Surveillance-Branch/Global-Emerging-Infections-Surveillance-and-Response, RJ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Wahala WMPB, de Silva AM. The human antibody response to dengue virus infection. Viruses. 2011;3(12):2374–2395. doi: 10.3390/v3122374 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Sabin AB. Research on dengue during World War II. Am J Trop Med Hyg. 1952;1(1):30–50. doi: 10.4269/ajtmh.1952.1.30 [DOI] [PubMed] [Google Scholar]
  • 3. Katzelnick LC, Fonville JM, Gromowski GD, Bustos Arriaga J, Green A, James SL, et al. Dengue viruses cluster antigenically but not as discrete serotypes. Science. 2015;349(6254):1338–1343. doi: 10.1126/science.aac5017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Forshey BM, Reiner RC, Olkowski S, Morrison AC, Espinoza A, Long KC, et al. Incomplete Protection against Dengue Virus Type 2 Re-infection in Peru. PLoS Negl Trop Dis. 2016;10(2):e0004398. doi: 10.1371/journal.pntd.0004398 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Buddhari D, Aldstadt J, Endy TP, Srikiatkhachorn A, Thaisomboonsuk B, Klungthong C, et al. Dengue virus neutralizing antibody levels associated with protection from infection in thai cluster studies. PLoS Negl Trop Dis. 2014;8(10):e3230. doi: 10.1371/journal.pntd.0003230 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Katzelnick LC, Montoya M, Gresh L, Balmaseda A, Harris E. Neutralizing antibody titers against dengue virus correlate with protection from symptomatic infection in a longitudinal cohort. Proc Natl Acad Sci U S A. 2016;113(3):728–733. doi: 10.1073/pnas.1522136113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Moodie Z, Juraska M, Huang Y, Zhuang Y, Fong Y, Carpp LN, et al. Neutralizing Antibody Correlates Analysis of Tetravalent Dengue Vaccine Efficacy Trials in Asia and Latin America. J Infect Dis. 2018;217(5):742–753. doi: 10.1093/infdis/jix609 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Salje H, Cummings DAT, Rodriguez-Barraquer I, Katzelnick LC, Lessler J, Klungthong C, et al. Reconstruction of antibody dynamics and infection histories to evaluate dengue risk. Nature. 2018;557(7707):719–723. doi: 10.1038/s41586-018-0157-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Hertz T, Beatty PR, MacMillen Z, Killingbeck SS, Wang C, Harris E. Antibody Epitopes Identified in Critical Regions of Dengue Virus Nonstructural 1 Protein in Mouse Vaccination and Natural Human Infections. J Immunol. 2017;198(10):4025–4035. doi: 10.4049/jimmunol.1700029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Tsai WY, Lin HE, Wang WK. Complexity of Human Antibody Response to Dengue Virus: Implication for Vaccine Development. Front Microbiol. 2017;8:1372. doi: 10.3389/fmicb.2017.01372 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Sukupolvi-Petty S, Austin SK, Engle M, Brien JD, Dowd KA, Williams KL, et al. Structure and function analysis of therapeutic monoclonal antibodies against dengue virus type 2. J Virol. 2010;84(18):9227–9239. doi: 10.1128/JVI.01087-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Brien JD, Austin SK, Sukupolvi-Petty S, O’Brien KM, Johnson S, Fremont DH, et al. Genotype-specific neutralization and protection by antibodies against dengue virus type 3. J Virol. 2010;84(20):10630–10643. doi: 10.1128/JVI.01190-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Wahala WMPB, Donaldson EF, de Alwis R, Accavitti-Loper MA, Baric RS, de Silva AM. Natural strain variation and antibody neutralization of dengue serotype 3 viruses. PLoS Pathog. 2010;6(3):e1000821. doi: 10.1371/journal.ppat.1000821 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Messer WB, Yount B, Hacker KE, Donaldson EF, Huynh JP, de Silva AM, et al. Development and characterization of a reverse genetic system for studying dengue virus serotype 3 strain variation and neutralization. PLoS Negl Trop Dis. 2012;6(2):e1486. doi: 10.1371/journal.pntd.0001486 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Martinez DR, Yount B, Nivarthi U, Munt JE, Delacruz MJ, Whitehead SS, et al. Antigenic Variation of the Dengue Virus 2 Genotypes Impacts the Neutralization Activity of Human Antibodies in Vaccinees. Cell Rep. 2020;33(1):108226. doi: 10.1016/j.celrep.2020.108226 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Gallichotte EN, Baric TJ, Nivarthi U, Delacruz MJ, Graham R, Widman DG, et al. Genetic Variation between Dengue Virus Type 4 Strains Impacts Human Antibody Binding and Neutralization. Cell Rep. 2018;25(5):1214–1224. doi: 10.1016/j.celrep.2018.10.006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Dowd KA, Mukherjee S, Kuhn RJ, Pierson TC. Combined effects of the structural heterogeneity and dynamics of flaviviruses on antibody recognition. J Virol. 2014;88(20):11726–11737. doi: 10.1128/JVI.01140-14 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Goo L, VanBlargan LA, Dowd KA, Diamond MS, Pierson TC. A single mutation in the envelope protein modulates flavivirus antigenicity, stability, and pathogenesis. PLoS Pathog. 2017;13(2):e1006178. doi: 10.1371/journal.ppat.1006178 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. VanBlargan LA, Mukherjee S, Dowd KA, Durbin AP, Whitehead SS, Pierson TC. The type-specific neutralizing antibody response elicited by a dengue vaccine candidate is focused on two amino acids of the envelope protein. PLoS Pathog. 2013;9(12):e1003761. doi: 10.1371/journal.ppat.1003761 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Gallichotte EN, Baric TJ, Nivarthi U, Delacruz MJ, Graham R, Widman DG, et al. Genetic Variation between Dengue Virus Type 4 Strains Impacts Human Antibody Binding and Neutralization. Cell Rep. 2018;25(5):1214–1224. doi: 10.1016/j.celrep.2018.10.006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Dowd KA, DeMaso CR, Pierson TC. Genotypic Differences in Dengue Virus Neutralization Are Explained by a Single Amino Acid Mutation That Modulates Virus Breathing. MBio. 2015;6(6):e01559–15. doi: 10.1128/mBio.01559-15
  • 22. Katzelnick LC, Coello AE, Huang AT, Garcia-Carreras B, Chowdhury N, Maljkovic IB, et al. Antigenic evolution of dengue viruses over 20 years. Science. 2021;374(6570):999–1004. doi: 10.1126/science.abk0058
  • 23. Bell SM, Katzelnick L, Bedford T. Dengue genetic divergence generates within-serotype antigenic variation, but serotypes dominate evolutionary dynamics. Elife. 2019;8. doi: 10.7554/eLife.42496
  • 24. Chaudhury S, Gromowski GD, Ripoll DR, Khavrutskii IV, Desai V, Wallqvist A. Dengue virus antibody database: Systematically linking serotype-specificity with epitope mapping in dengue virus. PLoS Negl Trop Dis. 2017;11(2):e0005395. doi: 10.1371/journal.pntd.0005395
  • 25. de Alwis R, Smith SA, Olivarez NP, Messer WB, Huynh JP, Wahala WMPB, et al. Identification of human neutralizing antibodies that bind to complex epitopes on dengue virions. Proc Natl Acad Sci U S A. 2012;109(19):7439–7444. doi: 10.1073/pnas.1200566109
  • 26. Dejnirattisai W, Wongwiwat W, Supasa S, Zhang X, Dai X, Rouvinski A, et al. A new class of highly potent, broadly neutralizing antibodies isolated from viremic patients infected with dengue virus. Nat Immunol. 2015;16(2):170–177. doi: 10.1038/ni.3058
  • 27. Teoh EP, Kukkaro P, Teo EW, Lim APC, Tan TT, Yip A, et al. The structural basis for serotype-specific neutralization of dengue virus by a human antibody. Sci Transl Med. 2012;4(139):139ra83. doi: 10.1126/scitranslmed.3003888
  • 28. Zhang X, Ge P, Yu X, Brannan JM, Bi G, Zhang Q, et al. Cryo-EM structure of the mature dengue virus at 3.5-Å resolution. Nat Struct Mol Biol. 2013;20(1):105–110. doi: 10.1038/nsmb.2463
  • 29. Lim XX, Chandramohan A, Lim XYE, Bag N, Sharma KK, Wirawan M, et al. Conformational changes in intact dengue virus reveal serotype-specific expansion. Nat Commun. 2017;8:14339. doi: 10.1038/ncomms14339
  • 30. Ali A, Ali I. The Complete Genome Phylogeny of Geographically Distinct Dengue Virus Serotype 2 Isolates (1944-2013) Supports Further Groupings within the Cosmopolitan Genotype. PLoS One. 2015;10(9):e0138900. doi: 10.1371/journal.pone.0138900
  • 31. Klungthong C, Putnak R, Mammen MP, Li T, Zhang C. Molecular genotyping of dengue viruses by phylogenetic analysis of the sequences of individual genes. J Virol Methods. 2008;154(1-2):175–181. doi: 10.1016/j.jviromet.2008.07.021
  • 32. Zhang H, Zhang Y, Hamoudi R, Yan G, Chen X, Zhou Y. Spatiotemporal characterizations of dengue virus in mainland China: insights into the whole genome from 1978 to 2011. PLoS One. 2014;9(2):e87630. doi: 10.1371/journal.pone.0087630
  • 33. Worobey M, Rambaut A, Holmes EC. Widespread intra-serotype recombination in natural populations of dengue virus. Proc Natl Acad Sci U S A. 1999;96(13):7352–7357. doi: 10.1073/pnas.96.13.7352
  • 34. Xie X, Gayen S, Kang C, Yuan Z, Shi PY. Membrane topology and function of dengue virus NS2A protein. J Virol. 2013;87(8):4609–4622. doi: 10.1128/JVI.02424-12
  • 35. Nemésio H, Villalaín J. Membrane interacting regions of Dengue virus NS2A protein. J Phys Chem B. 2014;118(34):10142–10155. doi: 10.1021/jp504911r
  • 36. Nasar S, Rashid N, Iftikhar S. Dengue proteins with their role in pathogenesis, and strategies for developing an effective anti-dengue treatment: A review. J Med Virol. 2020;92(8):941–955. doi: 10.1002/jmv.25646
  • 37. Xie X, Zou J, Puttikhunt C, Yuan Z, Shi PY. Two distinct sets of NS2A molecules are responsible for dengue virus RNA synthesis and virion assembly. J Virol. 2015;89(2):1298–1313. doi: 10.1128/JVI.02882-14
  • 38. Xie X, Zou J, Zhang X, Zhou Y, Routh AL, Kang C, et al. Dengue NS2A Protein Orchestrates Virus Assembly. Cell Host Microbe. 2019;26(5):606–622.e8. doi: 10.1016/j.chom.2019.09.015
  • 39. Dowd KA, DeMaso CR, Pierson TC. Genotypic Differences in Dengue Virus Neutralization Are Explained by a Single Amino Acid Mutation That Modulates Virus Breathing. MBio. 2015;6(6):e01559–15. doi: 10.1128/mBio.01559-15
  • 40. De Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nature Reviews Genetics. 2013;14(4):249–261. doi: 10.1038/nrg3414
  • 41. Shen W, Li Y. A novel algorithm for detecting multiple covariance and clustering of biological sequences. Scientific Reports. 2016;6(1):1–8. doi: 10.1038/srep30425
  • 42. Pensar J, Puranen S, Arnold B, MacAlasdair N, Kuronen J, Tonkin-Hill G, et al. Genome-wide epistasis and co-selection study using mutual information. Nucleic Acids Res. 2019;47(18):e112. doi: 10.1093/nar/gkz656
  • 43. Wang C, Katzelnick LC, Montoya M, Hue KDT, Simmons CP, Harris E. Evolutionarily successful Asian 1 dengue virus 2 lineages contain one substitution in envelope that increases sensitivity to polyclonal antibody neutralization. The Journal of Infectious Diseases. 2016;213(6):975–984. doi: 10.1093/infdis/jiv536
  • 44. Sharma S, Varani G. NMR structure of Dengue West Nile viruses stem-loop B: A key cis-acting element for flavivirus replication. Biochem Biophys Res Commun. 2020. doi: 10.1016/j.bbrc.2020.07.115
  • 45. Katzelnick LC, Coloma J, Harris E. Dengue: knowledge gaps, unmet needs, and research priorities. The Lancet Infectious Diseases. 2017;17(3):e88–e100. doi: 10.1016/S1473-3099(16)30473-X
  • 46. Kumar SRP, Patil JA, Cecilia D, Cherian SS, Barde PV, Walimbe AM, et al. Evolution, dispersal and replacement of American genotype dengue type 2 viruses in India (1956-2005): selection pressure and molecular clock analyses. J Gen Virol. 2010;91(Pt 3):707–720. doi: 10.1099/vir.0.017954-0
  • 47. Twiddy SS, Farrar JJ, Vinh Chau N, Wills B, Gould EA, Gritsun T, et al. Phylogenetic relationships and differential selection pressures among genotypes of dengue-2 virus. Virology. 2002;298(1):63–72. doi: 10.1006/viro.2002.1447
  • 48. Lambrechts L, Fansiri T, Pongsiri A, Thaisomboonsuk B, Klungthong C, Richardson JH, et al. Dengue-1 virus clade replacement in Thailand associated with enhanced mosquito transmission. J Virol. 2012;86(3):1853–1861. doi: 10.1128/JVI.06458-11
  • 49. Zhang C, Mammen MP Jr, Chinnawirotpisan P, Klungthong C, Rodpradit P, Monkongdee P, et al. Clade replacements in dengue virus serotypes 1 and 3 are associated with changing serotype prevalence. J Virol. 2005;79(24):15123–15130. doi: 10.1128/JVI.79.24.15123-15130.2005
  • 50. Adams B, Holmes EC, Zhang C, Mammen MP Jr, Nimmannitya S, Kalayanarooj S, et al. Cross-protective immunity can account for the alternating epidemic pattern of dengue virus serotypes circulating in Bangkok. Proc Natl Acad Sci U S A. 2006;103(38):14234–14239. doi: 10.1073/pnas.0602768103
  • 51. Salje H, Rodríguez-Barraquer I, Rainwater-Lovett K, Nisalak A, Thaisomboonsuk B, Thomas SJ, et al. Variability in Dengue Titer Estimates from Plaque Reduction Neutralization Tests Poses a Challenge to Epidemiological Studies and Vaccine Development. PLoS Neglected Tropical Diseases. 2014;8(6):e2952. doi: 10.1371/journal.pntd.0002952
  • 52. Chen SY, Feng Z, Yi X. A general introduction to adjustment for multiple comparisons. J Thorac Dis. 2017;9(6):1725. doi: 10.21037/jtd.2017.05.34
  • 53. Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; 2016.

Decision Letter 0

Ana Fernandez-Sesma, Anice C Lowen

6 Jan 2022

Dear Dr. Katzelnick,

Thank you very much for submitting your manuscript "Beneath the surface: Amino acid variation underlying two decades of dengue virus antigenic dynamics in Bangkok, Thailand" for consideration at PLOS Pathogens. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

The reviewers appreciated the attention to an important topic and highlighted the potential significance of the data. However, weaknesses were identified that need to be addressed prior to publication. As emphasized by reviewers 2 and 3, experimental validation of newly identified antigenic sites is needed to fully support the conclusions of the work. While important for both E and NS2A, such validation is critical to explain the unexpected role of NS2A in viral antigenicity suggested by the observational data.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Anice C. Lowen

Associate Editor

PLOS Pathogens

Ana Fernandez-Sesma

Section Editor

PLOS Pathogens

Kasturi Haldar

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0001-5065-158X

Michael Malim

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0002-7699-2064

***********************

Reviewer's Responses to Questions

Part I - Summary

Please use this section to discuss strengths/weaknesses of study, novelty/significance, general execution and scholarship.

Reviewer #1: Huang et al. use a dataset of paired full genome dengue virus sequences with PRNT assays to measure the antigenic distance between pairs of viruses using antigenic cartography, and modify a previously published model to estimate the effects of amino acid changes that contribute to those antigenic distances. They find that sites in the E protein that are within 3 amino acids of previously identified antibody footprints may contribute to antigenicity, as well as sites in NS2A. The method the authors use to look at combinations of sites in NS2A while controlling for association with E is clever. This paper builds nicely off of previous reports of dengue antigenic dynamics, and provides new and interesting data about the roles of substitutions beyond the E protein in antigenic variation. Overall, I think that the results are sound and that the authors' findings are novel and interesting. However, the paper would benefit from more information in the Methods, and clarification of a few points throughout the manuscript to make the paper more readily understandable for the virology audience of this journal.

Reviewer #2: Huang et al. utilized a large dataset of genome sequences and antigenic information (Katzelnick et al. Science 2021) to evaluate the genetic determinants of dengue virus antigenic diversification. They found that 77 of 295 positions with residue variability in the E protein conferred antigenic effects, with only 22 of them (~28%) mapping to known epitopes, thus expanding the number of residues involved in antibody recognition/responses. This information is very interesting and could inform vaccine development. By examining the role of the other 9 dengue virus proteins, they found that the nonstructural (NS) protein NS2A presented a signal for the antigenic diversity detected at the antibody level with neutralization assays. They performed different analyses and tested different hypotheses to show that the role of antigenic diversification of NS2A is not linked to similar ancestries on the genome. The groups collaborating on this study are leaders in dengue epidemiology and immunity and the data are of interest, but there is no real explanation of how this NS protein is involved in antibody recognition and virus neutralization. It has been shown that NS2A plays a role in virus RNA replication and potentially in the evasion of the interferon response, but there is no study showing that antibodies are directed to the NS2A from dengue virus, making it difficult to develop a coherent explanation for these results.

Reviewer #3: Huang et al. examine the genetic variation among dengue viruses (DENV) from historic samples in Thailand that span several years. The authors examine the relationships between amino acid residue variation within and outside the Envelope protein, a target of neutralizing antibodies, in potentially modulating the neutralizing activity of antibodies. Overall, the authors report several interesting, hypothesis-generating observations. However, their findings fall short as they do not validate any of their descriptive analyses.

**********

Part II – Major Issues: Key Experiments Required for Acceptance

Please use this section to detail the key new experiments or modifications of existing experiments that should be absolutely required to validate study conclusions.

Generally, there should be no more than 3 such required experiments or major modifications for a "Major Revision" recommendation. If more than 3 experiments are necessary to validate the study conclusions, then you are encouraged to recommend "Reject".

Reviewer #1: (No Response)

Reviewer #2: 1. The data implicating NS2A in dengue virus antigenic diversification are interesting but speculative. The authors are from multiple established and well-funded laboratories and should be able to provide experimental data rather than “inviting assessment of these effects in vitro” or stating that "it would be interesting to see how the effects compare when introducing these substitutions into other genetic backgrounds experimentally."

2. Of those 55 residues that do not map to known epitopes but are predicted by the model to be involved in antigenic diversity (Fig 2), the authors found that “30 were within 3 sites from known epitopes”, suggesting a potential role in antibody recognition. This reviewer could not find how the authors calculated the distance from residues mapping to known epitopes. Was this calculated using linear sequences or structural information from the E protein? Please provide a better description in the methods section of how this analysis was carried out. Linear sequence information might not be the best predictor of the role of these residues in antibody recognition, as “distant” residues could be “close” when the structural information is considered. Again, the authors have all the resources to provide experimental data to confirm whether those 55 residues that do not map to known epitopes are involved in antigenic diversity.

3. The association of the NS2A protein with antibody recognition and virus antigenic diversification could be linked to interactions between these two proteins, E and NS2A, at the replication level. The authors could use phylogenetic methods and additional sequences from other studies to determine whether this signal is associated with co-evolution.
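
Reviewer #2's question about how "within 3 sites" was measured can be made concrete with a small sketch. The Python below assumes the linear-sequence reading (distance is the difference between residue positions), which the source leaves open, and pairs it with a generic permutation test of the kind implied by "random assignments of effects to sites". All function names and the toy site numbers are hypothetical and are not taken from the study.

```python
import random

def near_epitope(site, epitope_sites, window=3):
    """True if `site` is within `window` positions of any epitope site
    along the linear sequence (an epitope site itself has distance 0)."""
    return any(abs(site - e) <= window for e in epitope_sites)

def count_near(sites, epitope_sites, window=3):
    """Number of `sites` that fall within the window of some epitope site."""
    return sum(near_epitope(s, epitope_sites, window) for s in sites)

def permutation_p_value(variable_sites, effect_sites, epitope_sites,
                        window=3, n_perm=2000, seed=0):
    """One-sided p-value: fraction of random site sets (same size as
    `effect_sites`, drawn from `variable_sites`) that are at least as
    close to epitopes as the observed set."""
    rng = random.Random(seed)
    observed = count_near(effect_sites, epitope_sites, window)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(variable_sites, len(effect_sites))
        if count_near(sample, epitope_sites, window) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data, not the study's sites.
variable_sites = list(range(1, 101))
epitope_sites = [10, 11, 50]
effect_sites = [8, 12, 47, 53, 90]
observed, p_value = permutation_p_value(variable_sites, effect_sites, epitope_sites)
```

A structure-based alternative would replace `abs(site - e)` with a three-dimensional distance between residues, which is the distinction the reviewer raises.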

Reviewer #3: 1. Have the authors validated any of the Envelope AA residues that are outside of the mAb footprints with respect to having an impact on neutralizing antibodies? Are there viral isolates available or are there recombinant viruses that can be used to validate some of their findings in neutralization assays with specific monoclonal antibodies? There are several hits from EDI and EDII that came up on their nonzero effect size. While the computational data is interesting and potentially compelling, it would be good to validate the data with these well characterized mAbs: 1F4, 14C10, 2D22, and 5J7, EDE1-2B2, and EDE1-2C8.

2. It’s unclear how amino acid residues in NS2 would modulate antigenicity of dengue viruses. While the authors show statistically significant data in their nonzero sum size model, and speculate in the discussion of likely mechanisms underlying these mutations and their interactions with capsid and prM, these amino acid residues need to be validated through the generation of mutant NS2 viruses to demonstrate if the reversion of the major NS2 hits have a differential phenotype in terms of 1) antibody immune evasion, 2) viral infectivity, or 3) global conformational changes in antigenicity.

3. Are the authors correcting for multiple comparisons in their statistical analyses? It’s not clear from their methods if this is being done. As some of their p values are borderline significant, I suspect they will not be significant after correcting for multiple comparisons as they should do for rigor.
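
For readers unfamiliar with the adjustment Reviewer #3 asks about, the following sketch shows two standard corrections, Bonferroni and Benjamini-Hochberg, in plain Python. It is illustrative only: the source does not state which procedure, if any, the authors applied, and the p-values below are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from four tests.
p_values = [0.001, 0.03, 0.04, 0.2]
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate, so borderline results are more likely to survive it.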

**********

Part III – Minor Issues: Editorial and Data Presentation Modifications

Please use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity.

Reviewer #1: 1. I applaud the authors for the brevity of their manuscript. However, the methods section was quite short, and at times difficult to decipher exactly what the authors did. I would suggest adding the following pieces of information into the methods to help readers who are not familiar with Bell et al.

- The authors should define what the hyperparameters are and why they are set to those values.

- The authors should make clear why 10% of measurements are being withheld. I assumed that this was because 90% of the measurements were used as training data, leaving the remaining 10% as test data, but a sentence explicitly clarifying this would be helpful.

- I had to read the Methods section of Bell et al to fully understand their model, and I would guess that other readers of PLOS Pathogens would need to do the same. In Bell et al, Dij is connected to dm, vi, and pj, which represent virus avidity, serum potency, and the titer drop between viruses. Seeing the explicit connection of Dij to these values made it easier to understand how the effects of each individual mutation were estimated in the model, and the authors should add it. Currently, it is difficult to figure out how each individual effect is being estimated, given that the only parameter present is Dij, which represents (as I understand it) the sum of all mutations' effects. I suggest the authors add more explicit definitions in their model, including the connection of Dij to dm, vi, and pj.

2. I was a tad confused in the manuscript about how exactly the predictions they perform were being done. From my understanding, the authors built these antigenic maps, then estimated the effects of individual amino acid changes on those distances. However, the authors then describe predicting antigenic distances. Does this mean that the authors estimated antigenic distances with antigenic cartography, then estimated the effects of each individual amino acid change using the modified Bell et al model, then used that information to predict the combined antigenic effect of all amino acids for the strains that did not have PRNT data? Did the authors do this separately for each individual protein sequentially? A paragraph in the methods about how exactly these predictions were done, on which strains, and using data from which genes/ORFs would be helpful.

3. For the last paragraph in the first section, there isn't any data shown. It would be good to add the actual data as a plot showing the correlation between models fitted to E and observed distances.

4. The authors write on line 76, "The model identified 394 nonzero effect substitutions positioned on 77 of the 295 sites...". Later, on line 85, they write "158 positions in the E protein contribute to epitopes of characterized anti-DENV mAbs while 336 positions...". Are the authors referring to amino acid sites in 1 part, and nucleotides in the other? Are they referring to different proteins? I was confused about why the denominator for the number of sites on E is different in these 2 sentences.

5. Figure 3 is quite blurry and a bit difficult to read. Figure 3d especially is difficult to interpret because all of the points are overlapping. Perhaps a histogram would help in showing the bimodal distribution? As is, every site looks the same, and it is impossible to distinguish how many sites have 0 vs. 1 effects.

6. In Figure 2c, how do the authors interpret that their model estimated 0 effects for 1/4 of the known epitopes? Similarly, it seems like their model was equally likely to estimate 0 vs. non-0 effects for known epitopes. Why do they think this is?

7. In sections 110-115, the authors describe that individual gene trees match full genome trees, which would make sense if there is little recombination in dengue viruses. It would be nice to explicitly acknowledge whether dengue viruses recombine, add a reference, and directly acknowledge how their test accounts for that.
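
Reviewer #1's reading of the withheld 10% (fit on 90% of virus pairs, evaluate on the rest) matches the abstract's description of 100 estimations scored by RMSE on held-out pairs. A minimal sketch of that evaluation loop follows. The fitted model here is a deliberately simple stand-in (distance proportional to the number of amino acid differences), not the authors' modified Bell et al. model, and all names and data are hypothetical.

```python
import math
import random

def rmse(predicted, observed):
    """Root-mean-squared error between two equal-length sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )

def fit_slope(x, y):
    """Least-squares slope through the origin: y is modeled as beta * x."""
    denom = sum(v * v for v in x)
    return sum(a * b for a, b in zip(x, y)) / denom if denom else 0.0

def repeated_holdout(pairs, n_rounds=100, holdout_frac=0.10, seed=0):
    """pairs: list of (n_differences, antigenic_distance) tuples.
    Each round withholds a random `holdout_frac` of pairs, fits on the
    remainder, and records RMSE on the withheld pairs."""
    rng = random.Random(seed)
    n_test = max(1, round(len(pairs) * holdout_frac))
    scores = []
    for _ in range(n_rounds):
        shuffled = pairs[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        beta = fit_slope([x for x, _ in train], [y for _, y in train])
        scores.append(rmse([beta * x for x, _ in test], [y for _, y in test]))
    return scores

# Hypothetical toy data: distance is exactly 0.5 units per difference.
pairs = [(n, 0.5 * n) for n in range(1, 41)]
scores = repeated_holdout(pairs)
```

Comparing the distribution of `scores` between models fitted to different proteins is one way to read the RMSE comparisons described in the abstract.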

Reviewer #2: 1. Huang et al. found 394 substitutions with nonzero effect on 77 residues from the E protein. Notably, only 22 of them (~28%) mapped to known epitopes. While this information is presented in Fig 2, the exact location is missing. This data is very interesting and could inform other studies, so authors should provide a supplementary table with the list of all those 77 residues and describe which ones map on known epitopes.

2. The model predicted that over 2/3 (52/74) of the residues that presented variability and mapped to known epitopes were not involved in antigenic differences (zero effect size, Fig 2). Please specify whether those residues are (mostly) associated with residues mapped with non-human mAbs.

3. The authors should provide a better description of how the 348 viruses isolated (18% of the total) were selected for this study. This reviewer needed to go back to the recently published paper from this group (Katzelnick et al. Science 2021) to gather more information on the distribution of serotypes and genotypes for this study. This could be included as an additional panel for Fig 1.

4. Figure S2 should be plotted to summarize the data. It is hard to determine which proteins of dengue are the most variable. Moreover, the authors stated that “295 site on the E protein” presented residue diversity in the Thai dataset; however, it is difficult to determine that number from the current figure. Please consider including this (revised) data as part of the main manuscript.

5. Lines 96-97: I believe the authors meant “41 residues (78.8%)”

6. Lines 172-173: Please provide the reference.

Reviewer #3: 1. The first sentence in the first abstract is inaccurate: “Neutralizing antibodies are important correlates of protection against dengue virus (DENV) infections.” What is known is that neutralizing antibodies are associated with protection from severe DENV disease. However, it is not known if neutralizing antibodies can prevent subclinical viral infections that are asymptomatic. The authors should change the sentence for factual accuracy or provide conclusive data that states otherwise.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example, see PLOS Biology: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Decision Letter 1

Ana Fernandez-Sesma, Anice C Lowen

5 Apr 2022

Dear Dr. Katzelnick,

We are pleased to inform you that your manuscript 'Beneath the surface: Amino acid variation underlying two decades of dengue virus antigenic dynamics in Bangkok, Thailand' has been provisionally accepted for publication in PLOS Pathogens.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Pathogens.

Best regards,

Anice C. Lowen

Associate Editor

PLOS Pathogens

Ana Fernandez-Sesma

Section Editor

PLOS Pathogens

Kasturi Haldar

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0001-5065-158X

Michael Malim

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0002-7699-2064

***********************************************************

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Part I - Summary

Please use this section to discuss strengths/weaknesses of study, novelty/significance, general execution and scholarship.

Reviewer #1: The authors have been quite responsive to reviewer comments. The additional analysis of restricting to human-derived mAbs is a nice one that helps explain the discordance between the model and the previously identified epitope sites. The triplet analysis is clever and a nice addition as well. All of my comments have been addressed, and I am happy to recommend the manuscript for publication. While reading, I found a couple of typos, which I will include here:

line 64: "Finally, probe our virus set..." I think this is missing a "we"?

line 77: "..distribution of estimated..." I think this should be either "distributions" or "the distribution"

Reviewer #2: The authors addressed most of my comments, but they failed to provide empirical evidence for two of the major comments: (i) how the NS2A protein affects virus antigenicity and (ii) whether the new sites described in the E protein play a role in resistance to neutralization. While it is true that the reverse genetic system for dengue viruses is intractable, this system has been successfully used to demonstrate how differences in the E protein affect overall neutralization titers (examples: Messer et al. J Virol. 2016:5090-5097; Messer et al. PLoS Negl Trop Dis. 2012:e1486). Nevertheless, while single point mutations are highly desirable to test the findings described here, the authors have the isolated viruses used for neutralization testing and they could have tested other biological properties in an attempt to explain their observations for the involvement of NS2A in antigenicity (e.g. differences in cellular binding [Lo et al. PLoS One. 2016:e0166474], replication kinetics, glycosylation, defective interfering particle abundance), and tested viruses carrying different mutations in the E protein with a panel of mAbs to assess the role of the newly described E residues in dengue antigenicity. If any of these experiments are feasible, the authors should provide a better biological explanation in the discussion for their observations regarding the NS2A protein.

Reviewer #3: My comments and questions have been properly addressed and answered. I recommend that this manuscript be accepted and published without delay.

**********

Part II – Major Issues: Key Experiments Required for Acceptance

Please use this section to detail the key new experiments or modifications of existing experiments that should be absolutely required to validate study conclusions.

Generally, there should be no more than 3 such required experiments or major modifications for a "Major Revision" recommendation. If more than 3 experiments are necessary to validate the study conclusions, then you are encouraged to recommend "Reject".

Reviewer #1: (No Response)

Reviewer #2: (No Response)

Reviewer #3: none

**********

Part III – Minor Issues: Editorial and Data Presentation Modifications

Please use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

Reviewer #3: none

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Acceptance letter

Ana Fernandez-Sesma, Anice C Lowen

27 Apr 2022

Dear Dr. Katzelnick,

We are delighted to inform you that your manuscript, "Beneath the surface: Amino acid variation underlying two decades of dengue virus antigenic dynamics in Bangkok, Thailand," has been formally accepted for publication in PLOS Pathogens.

We have now passed your article on to the PLOS Production Department, which will complete the rest of the pre-publication process. All authors will receive a confirmation email upon publication.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or type-setting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Pearls, Reviews, Opinions, etc...) are generated on a different schedule and may not be made available as quickly.

Soon after your final files are uploaded, the early version of your manuscript, if you opted to have an early version of your article, will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Pathogens.

Best regards,

Kasturi Haldar

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0001-5065-158X

Michael Malim

Editor-in-Chief

PLOS Pathogens

orcid.org/0000-0002-7699-2064

Associated Data


    Supplementary Materials

    S1 Fig. Time-calibrated maximum likelihood phylogenies of virus isolates.

    Isolates were collected from Queen Sirikit National Institute of Child Health (QSNICH) between 1994 and 2014. Viruses selected for antigenic characterization are marked as orange circles.

    (PDF)

    S2 Fig. Virus-specific intercepts fitted using E protein sequences.

    a) Distributions and b) variation in virus-specific intercepts estimated using E protein sequences across the 100 estimations. Gray horizontal lines represent the mean intercepts across viruses for each of the estimations. c) Boxplot illustrating the amount of distance attributable to measurement variability across 100 synthetic samples. Distances were divided by two to represent the per-virus contribution. Thick lines denote the means (black) and medians (orange).

    (PDF)

    S3 Fig. Relationship between observed antigenic distance and antigenic distance predicted by the substitution model.

    a) when effects were fitted to envelope protein sequences (E) and b) when effects were fitted to E concatenated with 62 nonzero effect sites in nonstructural protein 2A (NS2A).

    (PDF)

    S4 Fig. Association between effect sites and known epitopes of neutralizing antibodies.

    a) Number and percentage of sites with and without effects by whether or not they are part of known epitopes. Odds ratios were calculated either considering epitopes of both human-derived monoclonal antibodies (hmAb) and murine-derived monoclonal antibodies (mmAb) or restricting to hmAb epitopes only. Defining neighborhoods of known epitopes as positions within N sites away (linear distance), the probability of nonzero effect sites being within the neighborhood at random (red) is contrasted against the proportion of variable sites that were within the neighborhood (gray): b) known epitopes for either hmAb or mmAb, c) known epitopes for hmAb, and d) known epitopes for mmAb outside of hmAb epitopes. N = 0 was when the neighborhood was exactly at the reported epitope positions. e, f, g) Analogous analyses for the respective epitope sets, but with neighborhoods defined as being within X angstroms of known epitopes (3-dimensional spatial distance). X = 0 was when the neighborhood was exactly at the reported epitope positions.

    (PDF)

    S5 Fig. Proportion of estimations in which substitutions showed nonzero effect.

    a) Substitutions in envelope protein (E) only, ordered by the proportion of the 100 estimations in which they showed nonzero effect. The substitutions identified at our threshold of 95% were highly similar to those identified at the maximum stringency of 100%: 372/394 substitutions (94.4%) were retained, and involvement was retained in 76/77 (99%) of the sites. b) In the analysis where E was concatenated to the 62 nonstructural protein 2A (NS2A) sites which consistently showed nonzero effects in our site sampling analysis, 292/304 substitutions (96.1%) in the NS2A sites remained nonzero at a threshold of 100%. Involvement was retained in 62/62 (100%) of the sites. Proportions corresponding to nonzero effect substitutions reported in our study (threshold of 95%) are colored red.

    (PDF)

    S6 Fig. Substitutions with non-zero effect sizes in NS2A.

    Median effect size of substitutions across the 100-fold Monte Carlo cross-validations shown as points, 95% interquartile range as whiskers. Points are colored by locations of the sites: ER lumen (green), transmembrane (yellow), or cytosol (blue). Locations of the sites and domain annotations were taken from [34].

    (PDF)

    S7 Fig. Distribution of nonzero effect sites across NS2A segments.

    a) Total number of sites in each segment (hollow), number of variable sites (filled black), and number of sites estimated to have nonzero effects (filled red). b) Probability that at least this number of nonzero effect sites would be associated with each segment at random. Amino acid positions of the segments are shown in parentheses.

    (PDF)

    S8 Fig. Density of coevolving residue pairs detected by fastcov.

    Density values were scaled to maximum value of one. Distributions of nonzero effect substitutions (red) and site-specific Wu-Kabat variability coefficient (gray) of the respective proteins are shown on top (nonstructural protein 2A, NS2A) and side (envelope protein, E).

    (PDF)

    S9 Fig. Density of coevolving nucleotide pairs detected by SpydrPick.

    a) Density of nucleotide positions with mutual information (MI) values greater than the 99th percentile of MI values between pairs throughout the DENV genome. Density was scaled to a maximum value of one. The thin rectangle corresponds to the coevolution relationship between the E gene (y-axis) and sites throughout the genome. The thick rectangle highlights the relationship between the E gene and the NS2A gene. b) Density plot expanding the highlighted region in panel (a).

    (PDF)

    S10 Fig. Relationship between difference in antigenic distance observed in virus triplets and effect size estimates from the substitution model.

    Shown separately for substitutions located in epitopes of human-derived monoclonal antibodies (hmAb), E domain I/II/III but outside of known epitopes (EDI/II/III), E stem/anchor domain, and nonstructural protein 2A (NS2A). Points are the medians of the observations/estimates. Lines are 95% interquartile ranges.

    (PDF)

    S11 Fig. Effects of substitutions in footprints of human-derived mAb (hmAb).

    Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified.

    (PDF)

    S12 Fig. Observable effects of substitution differ within the same serotype pair.

    a) Distribution of difference in antigenic distance, ΔDm, for E:M160K substitution including all triplets with the same serotype pair (DENV2, DENV2) and the resultant p-value shown in comparison to b) median and 95% interquartile range of ΔDm shown separately for each virus j involved in the virus triplets and their respective p-values.

    (PDF)

    S13 Fig. Distribution of virus-specific difference in antigenic distance on the phylogeny.

    Median difference in antigenic distance, ΔDm, specific to each virus j involved in the virus triplets shown in S12 Fig are colored on the phylogeny. Points are shown as solid circles for p-values ≤ 0.05 and as hollow triangles otherwise.

    (PDF)

    S14 Fig. Effects of substitutions in EDI/II/III but outside of known mAb epitopes.

    a) Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified. b) Distribution of difference in antigenic distance for substitutions with p-value ≤ 0.1 colored by serotypes of the virus pairs.

    (PDF)

    S15 Fig. Effects of substitutions in the stem/anchor domain of E.

    Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified.

    (PDF)

    S16 Fig. Effects of substitutions in nonstructural protein 2A (NS2A).

    a) Difference in antigenic distance observed between pairs of viruses separated by the specific substitution and antigenic distance observed in respective effectively identical viruses without the substitution (control viruses). Thick lines show median and 95% interquartile range (IQR) for triplets of all serotype pairs combined. Thin lines show the median and 95%IQR for each serotype pair identified. b) Distribution of difference in antigenic distance for substitutions with p-value ≤ 0.1 colored by serotypes of the virus pairs.

    (PDF)

    S1 File. Nonzero effect substitutions in envelope protein (E).

    (CSV)

    S2 File. Nonzero effect substitutions in nonstructural protein 2A (NS2A).

    (CSV)
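
The S8 Fig caption refers to a site-specific Wu-Kabat variability coefficient without restating how it is computed. As a minimal sketch, assuming the standard definition (number of sequences times the number of distinct residues at a position, divided by the count of the most common residue), it could be computed from an amino acid alignment as below. The function names, the toy alignment, and the choice to skip gap (`-`) and unknown (`X`) characters are illustrative assumptions, not taken from the authors' code.

```python
from collections import Counter


def wu_kabat(column):
    """Wu-Kabat variability for one alignment column.

    variability = N * k / n_top, where N is the number of residues
    considered, k the number of distinct residues, and n_top the count
    of the most common residue. A fully conserved column scores 1.0.
    """
    # Assumption: gaps and unknown residues are ignored.
    residues = [r for r in column if r not in "-X"]
    if not residues:
        raise ValueError("column has no usable residues")
    counts = Counter(residues)
    n_total = len(residues)
    n_distinct = len(counts)
    n_top = counts.most_common(1)[0][1]
    return n_total * n_distinct / n_top


def variability_profile(sequences):
    """Per-site Wu-Kabat variability for equal-length aligned sequences."""
    return [wu_kabat(column) for column in zip(*sequences)]


if __name__ == "__main__":
    # Hypothetical four-sequence alignment, three sites long.
    toy_alignment = ["MKT", "MRT", "MKS", "MKT"]
    print(variability_profile(toy_alignment))
```

A conserved site scores 1, and the score grows both with the number of distinct residues observed and as the most common residue becomes less dominant.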

    Attachment

    Submitted filename: DengueAntigenicGenetics_ResponseToReviewers.pdf

    Data Availability Statement

    The authors confirm that all data underlying the findings are fully available without restriction. Data and code for the analyses are held in Zenodo (https://doi.org/10.5281/zenodo.5615512).


    Articles from PLoS Pathogens are provided here courtesy of PLOS
