PLOS Neglected Tropical Diseases. 2023 Mar 27;17(3):e0011141. doi: 10.1371/journal.pntd.0011141

The Baikal subtype of tick-borne encephalitis virus is evident of recombination between Siberian and Far-Eastern subtypes

Grigorii A Sukhorukov 1,*, Alexey I Paramonov 2, Oksana V Lisak 2, Irina V Kozlova 2, Georgii A Bazykin 1,3,*, Alexey D Neverov 4,5,*, Lyudmila S Karan 5
Editor: Wen-Ping Guo 6
PMCID: PMC10079218  PMID: 36972237

Abstract

Tick-borne encephalitis virus (TBEV) is a flavivirus which causes an acute or sometimes chronic infection that frequently has severe neurological consequences, and is a major public health threat in Eurasia. TBEV is genetically classified into three distinct subtypes; however, at least one group of isolates, the Baikal subtype, also referred to as “886-84-like”, challenges this classification. Baikal TBEV is a persistent group which has been repeatedly isolated from ticks and small mammals in the Buryat Republic, Irkutsk and Trans-Baikal regions of Russia for several decades. One case of meningoencephalitis with a lethal outcome caused by this subtype has been described in Mongolia in 2010. While recombination is frequent in Flaviviridae, its role in the evolution of TBEV has not been established. Here, we isolate and sequence four novel Baikal TBEV samples obtained in Eastern Siberia. Using a set of methods for inference of recombination events, including a newly developed phylogenetic method allowing for formal statistical testing for such events in the past, we find robust support for a difference in phylogenetic histories between genomic regions, indicating recombination at origin of the Baikal TBEV. This finding extends our understanding of the role of recombination in the evolution of this human pathogen.

Author summary

Tick-borne encephalitis is a serious and frequently deadly infectious disease. It is caused by a virus of the same name with genome composed of single-stranded RNA. The known genomes of this virus fall into three large regional groups: Europe, Siberia, or the Russian Far East. These groups have originated from a common ancestor several hundred or thousand years ago and were assumed to have evolved independently since then. This study shows that a previously described group of viruses obtained in the vicinity of Lake Baikal in Russia have a mosaic genome: some parts of it are more closely related to those of the Siberian group, while others, to the Far Eastern group. Such a pattern probably arose through recombination–a process during which a cell infected with two distinct viruses produces “hybrid” viral progeny carrying genetic material from both parents. While recombination is frequent in other RNA viruses, it has not been previously described for the tick-borne encephalitis virus. These findings show that mixture of genetic information from distinct sources can contribute to genetic diversity of this group of viruses, and potentially accelerate their adaptation.

Introduction

The Tick-borne encephalitis virus (TBEV) is a viral species, member of the Flavivirus genus, a genus of the Flaviviridae family of viruses with relatively short single-stranded positive-sense RNA genomes [1]. The approximately 11000 bp long TBEV mRNA encodes a single polyprotein which is cleaved by both viral and host machinery into three viral capsid forming proteins (C, PrM, E) and seven non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B and NS5) [2]. Like many other flaviviruses, TBEV persists in a complex lifecycle of arthropod vectors and mammalian hosts, and human infections are a transmission dead-end [3]. The human form of the infection manifests as mild fever, but in some cases is followed by severe neurological impairments or death [4].

Phylogenetic analysis routinely subdivides all TBEV variants into three distinct subtypes which used to have a strong geographical association. Within Russia, the European subtype was mainly present in the European part but could be encountered all the way to Eastern Siberia; the Far-Eastern subtype occupied the Far East of Russia; while the Siberian subtype was present in the North-Western and Central Russia and predominant in Urals, Western and Eastern Siberia [5], [6]. In recent decades, however, these distribution patterns have become disrupted [7,8] which has been attributed to climate change and increasing human influence on virus migration [9], [10]. Moreover, several novel distinct groups of variants were discovered [11,12]. One of them is the Baikal subtype, which was recently proposed as a novel TBEV subtype candidate [13,14].

Recombination is a major contributor to viral diversity in many studied flaviviruses [15–17]. However, the complexity of TBEV transmission pathways together with a high diversity of hosts has been previously taken to imply that recombination in TBEV is unlikely, and the overall high conservation of TBEV makes it hard to detect if it is present [18]. Nonetheless, as the number of sequenced TBEV variants increased, rare recombination events between different TBEV variants have been proposed [13,19,20]. However, evidence for recombination has been controversial, with each subsequent work disproving previous results while proposing new putative recombination events. The comprehensive study by Bertrand et al. [21] attributed this to the limitations of the viral recombination detection methods when applied to TBEV. Their in-depth simulation established that, to be reliably detectable, the recombinant genomic segments need to be long (>1000 bp) or come from distant viral lineages [21].

Therefore, to argue for the presence of recombination in TBEV, one needs to find a variant with a pronounced signal of recombination. Here, we study the Baikal subtype of TBEV. Baikal TBEV variants have been collected for nearly 30 years in large numbers in areas where variants of Siberian and Far-Eastern subtypes are present as well. Currently, there are 22 variants of the Baikal subtype isolated between 1983 and 1990 in the collection of the Federal State Public Scientific Institution “Scientific Center for Family Health and Human Reproduction Problems” (Irkutsk, Russia) [22]. An additional 6 variants isolated between 1999 and 2010 are in the collection of the Irkutsk Anti-Plague Research Institute of Siberia and Far East (Irkutsk, Russia). These variants were isolated from Ixodes persulcatus, Myodes rutilus, Myodes glareolus and Microtus gregalis collected in Irkutsk region, Buryat Republic and Trans-Baikal Territory of Eastern Siberia. In each of these territories, all TBEV subtypes (European, Siberian and Far-Eastern) coexist and are routinely collected [23]. Mixed infections by TBEV variants of different subtypes, usually Siberian and Far-Eastern, are well described [24]. In particular, TBEV belonging to two distinct subtypes was found in the brain tissue of deceased patients, in the blood samples of ill patients, and in infecting ticks. Notably, among the 10 described individual polytypic samples, 4 were isolated in Irkutsk and Trans-Baikal regions. Among these 4 polytypic samples, 3 simultaneously contained TBEV of Siberian and Far-Eastern subtypes, while the remaining one was a mixture of European and Siberian subtypes [25–27].

Molecular probes show that the Baikal TBEV variants carry fingerprint amino acids unique to Far-Eastern and Siberian subtypes [14]. The unconventional properties of viruses from the Baikal subtype were first discovered in serotyping, when variants of this group showed high antigenic cross-reactivity with variants of Siberian and Far-Eastern subtypes in neutralization tests. Furthermore, the variant 886–84 displayed equivalent affinity to all TBEV subtypes in the agar diffusion precipitation reaction with cross-adsorbed variants-specific serum. All this evidence supports the plausibility of recombination in TBEV in general, and in particular, of recombination involving different subtypes, and makes the Baikal subtype a likely candidate for being recombinant. We hypothesize that the TBEV Baikal subtype has arisen as a result of an ancient recombination event between variants belonging to Siberian and Far-Eastern subtypes.

Many methods for detection of past recombination are available, a number of which are implemented in the RDP package [28]. Most of these methods evaluate the strength of a recombination signal by visualizing the relatedness of the query sequence to other sequences in different regions of the multiple sequence alignment, but ignore the potential diversity of the putative parental variants by uniting them in a consensus sequence, and/or do not implement a formal way to test for statistical difference in relatedness between regions. To address both these shortcomings, we designed a novel approach based on the Grouping Scan analysis (GS) [29]. GS analysis reconstructs the phylogenies in a sliding window along the sequence alignment, and compares the phylogenetic placement of a putative recombinant sequence relative to predefined clades of sequences between different positions of the sliding window. More exactly, for each window, it calculates a score reflecting how deeply the candidate recombinant sequence is embedded into the clades formed by other predefined groups of sequences. Recombination can be inferred if different genomic segments provide conflicting phylogenetic positions for the query sequence, as evidenced by high values of the GS score placing it into different clades. In the present study, we developed a GS-based pipeline that allows us to test for differences in placement of the query sequence between alignment windows, formally assessing the statistical support for recombination. By applying the commonly used methods of the RDP4 package as well as the newly developed GS-based method, we found strong statistically robust support for the hypothesis of the recombinant nature of the TBEV Baikal subtype.

Materials and methods

Viral isolation and culturing

Four variants belonging to the Baikal subtype were isolated from ticks (Ixodes persulcatus) and small mammals (Myodes rufocanus) collected between 1984 and 1990 in Barguzinsky and Bichursky districts of Buryat Republic, Eastern Siberia. Data on isolation and cultivation of the studied variants are provided in the S1 Table.

Genome amplification and sequencing

Viral RNA was extracted from 100 μl of cell culture supernatant fluid using a commercial kit Viral RNA mini kit (Qiagen, Germany) according to the manufacturer’s instructions. The RNA template was reverse transcribed using the Reverta-L kit (Central Research Institute of Epidemiology, Moscow, Russia). We used RNA isolated from cell culture supernatant fluid for sequencing. The purified PCR products were sequenced bidirectionally using BigDye Terminator v1.1 Cycle Sequencing kit (Thermo Fisher Scientific, Austin, TX, USA) on Applied Biosystems 3500xL Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). The primers used for sequencing are provided in the S2 Table. The sequences were deposited in NCBI GenBank under the following accession numbers: MT708809, MT708810, MT708811, MT708812.

Alignment preparation

All available TBEV and Omsk hemorrhagic fever virus (OHFV) sequences containing the full protein-coding region were downloaded from the GenBank repository on 11.04.2018. The four Baikal TBEV variants assembled from Sanger sequencing reads were added to this set. Sequence alignment was constructed with Mafft v. 7.408 with default parameters [30]. Sequences with gaps in coding regions and identical sequences were discarded from the alignment. The alignment was trimmed to begin with the main open reading frame [31]. The full alignment contained 11 sequences belonging to the Baikal subtype, 32 sequences of the Siberian subtype, 90 sequences of the Far-Eastern subtype, 47 sequences of the European subtype, and 3 sequences of OHFV. The sequences of OHFV, a different species from the Flavivirus genus [1], were added to the alignment to root the reconstructed phylogeny. A maximum likelihood tree was reconstructed on the basis of the whole-genome alignment using MEGA 7 [32] under the GTR+G model with 200 bootstrap trials, and visualized with iTOL [33] (Fig 1).

Fig 1. Reconstructed maximum likelihood whole-genome phylogeny for TBEV variants with 200 bootstrap replicates.


Blue circles indicate bootstrap support of corresponding branches. Leaves are colored by group: red, Siberian subtype of TBEV (Sib); blue, Far-Eastern subtype of TBEV (FE); green, European subtype of TBEV (EU); black, Omsk hemorrhagic fever virus (OHFV) used as the outgroup species; orange, Baikal subtype of TBEV; gray, TBEV variants 2871 and 178–79 excluded from the GS analysis. Newly sequenced Baikal TBEV variants are marked with purple dots.

Recombination analysis

To search for recombination, we used the three methods available in the RDP4 package [28]: TOPAL, SIMPLOT and BootScan [34–36] under the default parameters, as well as the newly developed method based on the Grouping Scan analysis for recombination (GS) from the Simple Sequence Editor (SSE) v.1.3 [29]. For this, we assigned all sequences to five groups: OHFV, European, Far-Eastern, Siberian and Baikal. This assignment was performed according to annotation if available, or otherwise according to their grouping into clades with annotated sequences. The few errors in annotations of subtypes were corrected based on the phylogeny in Fig 1. For the methods of the RDP4 package, a consensus sequence was constructed for each group using the simple majority rule. For the GS analysis, a consensus sequence was constructed just for the Baikal subtype, while all sequences from the remaining groups were retained.
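The simple majority rule used for consensus construction can be illustrated with a short Python sketch. This is not the RDP4 or SSE implementation; the function name `majority_consensus` and the tie-breaking rule (the character seen first in the column wins) are assumptions made for illustration only.

```python
from collections import Counter


def majority_consensus(seqs):
    """Return the per-column most frequent character of aligned sequences."""
    if not seqs:
        raise ValueError("no sequences given")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*seqs):
        # most_common(1) yields the most frequent character; on a tie the
        # character encountered first in the column is kept (an assumption).
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)
```

For example, `majority_consensus(["ACGT", "ACGA", "TCGA"])` takes the majority character in each of the four columns.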

To formalize model choice for the GS analysis, we ran ModelFinder in IQTree2 on each of 10 random fragments of the TBEV sequence alignment, each 250 nucleotides long to match the window length of GS, as well as on the JCF2 and JCF3 fragments (see below). In all 12 tests, the best-fitting model was one of the following: TN, TIM2, TIM3, or GTR. All these models involve more parameters than the most parameter-rich model implemented in GS, i.e., Kimura’s 2-parameter model (K80); we therefore used GS with the K80 model, the closest available option. We used a sliding window of size 250 and a step of 40. For a comparison of a query sequence and a clade, the GS score is defined in [29] as a function of y, the number of sequences in the group, and Ni, the number of nodes separating the query sequence from the i-th sequence of the group on the unrooted tree. Low GS scores correspond to a high distance between the query sequence and the corresponding clade; a GS score of 0.5 corresponds to a sister position of a sequence to the clade of interest; and the GS score is > 0.5 for a sequence nested within another clade, tending asymptotically to some value < 1 as the number of nodes separating the sequence from the most recent common ancestor of the clade of interest increases.
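The sliding-window layout described above (window size 250, step 40) can be sketched in Python as follows. The function name `sliding_windows` is hypothetical, and dropping a trailing partial window is an assumption; the way SSE handles the end of the alignment is not stated here.

```python
def sliding_windows(alignment_length, size=250, step=40):
    """Return half-open (start, end) column ranges of full-length windows."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # Only windows that fit entirely inside the alignment are produced;
    # a shorter trailing window is dropped (an assumption of this sketch).
    return [(start, start + size)
            for start in range(0, alignment_length - size + 1, step)]
```

Each returned pair can be used directly as a slice on every aligned sequence, for example `seq[start:end]`.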

Estimation of statistical significance of recombination events

For each predefined group of sequences, GS reports grouping score statistics which are calculated for positions of the sliding window along the coordinates of the alignment. We refer to the plot of these values along the alignment as the GS curve. For an alignment window, a GS score greater than 0.5 for one of the groups indicates that, in the phylogenetic tree reconstructed from this window, the query sequence descends from the most recent common ancestor of this group; in other words, the query sequence is more closely related to sequences from this group than to other sequences. For each window, the set of GS scores for different clades indicates how likely it is to be associated with each of these clades, and the differences in these sets of scores between windows, manifested in rises and falls in the GS curve, suggest that these windows may have different evolutionary histories. However, these scores are not directly interpretable as evidence for recombination or lack thereof.

To address this, we designed a statistic, GS score area under curve (GSAUCt), defined as the total area under the GS curve above some threshold GS score value of t. We consider different threshold values of t upwards of 0.5; the value of 0.5 corresponds to the query sequence being the sister to the clade of interest. As the GSAUC statistic characterizes the alignment as a whole, its high values indicate that the query sequence is related to the clade of interest at least in some of the genomic regions. If such elevated relatedness is observed for multiple different clades at distinct alignment regions, resulting in high GSAUC values in multiple comparisons, this suggests that segments of the query sequence differ in their evolutionary history, indicating recombination.
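As a minimal sketch of the GSAUCt statistic, the following Python function sums the part of the GS curve lying above the threshold t. The name `gsauc` and the rectangle-rule integration (each window contributes its excess over t multiplied by the step size) are assumptions; the exact integration scheme used in the authors' scripts is not described in the text.

```python
def gsauc(scores, threshold, step=40):
    """Area between the GS curve and the threshold line, above the line only.

    scores    -- GS scores, one per sliding-window position
    threshold -- GS score threshold t (0.5 corresponds to a sister position)
    step      -- distance between consecutive window positions, in columns
    """
    # Windows at or below the threshold contribute nothing to the area.
    return step * sum(max(score - threshold, 0.0) for score in scores)
```

Evaluating `gsauc` over a grid of thresholds above 0.5 gives one statistic per threshold, which is the quantity that is then compared with its null distribution.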

To obtain the null distribution of GSAUCt, we use a permutation procedure, randomly reshuffling all alignment columns in 250 replicates and calculating GSAUCt for each such reshuffled alignment. While each reshuffled alignment as a whole has the same phylogeny as the original alignment, phylogenies obtained for individual positions of the sliding window generally differ between the original and the reshuffled datasets, resulting in a different value of GSAUCt. The p-value is then calculated as the percentile in the distribution of simulated GSAUCt values corresponding to the observed GSAUCt. The logic behind this is the following. Reshuffling dissolves all non-independence between neighboring sites, including that caused by similarity of phylogenetic histories. Informally, significant GSAUC implies that genomic segments differ in the degree of phylogenetic affinity they demonstrate towards the considered subtype. While such significance for one of the subtypes may be due to reasons other than the presence of recombination (e.g., differences in conservation between segments), observation of such differences for multiple subtypes has to imply that different genomic windows are differentially affined to different subtypes, implying recombination.
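The permutation procedure can be sketched in Python as below. The `statistic` argument is a hypothetical callable standing in for the full "run GS, then compute GSAUCt" step, which in the study is performed by the SSE program. Taking the p-value as the fraction of reshuffled replicates whose statistic is at least as large as the observed one is an assumption about how the percentile is computed.

```python
import random


def shuffle_columns(alignment, rng):
    """Return the alignment with its columns in random order.

    All rows are permuted identically, so each column stays intact.
    """
    order = list(range(len(alignment[0])))
    rng.shuffle(order)
    return ["".join(row[i] for i in order) for row in alignment]


def permutation_p_value(observed, null_values):
    """Fraction of null replicates with a statistic >= the observed value."""
    return sum(v >= observed for v in null_values) / len(null_values)


def permutation_test(alignment, statistic, n_replicates=250, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = random.Random(seed)
    observed = statistic(alignment)
    null = [statistic(shuffle_columns(alignment, rng))
            for _ in range(n_replicates)]
    return observed, permutation_p_value(observed, null)
```

Under this definition, with 250 replicates the smallest non-zero p-value is 1/250 = 0.004, so a reported p-value of 0.0 means that no reshuffled alignment reached the observed statistic.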

To implement the described procedure, the GS analysis had to be automated. The automation was done on Windows Server 2007 OS with 64 cores and 128 GB of memory allocated and was controlled by a Python script using the pywinauto package v. 0.6.5 [19]. Up to thirty instances of GS could be launched in parallel under this setting. Python scripts used for calculation of GSAUC and alignments are available at https://github.com/gregoruar/tbev_rec.

Testing GSAUC on an in silico recombination

To test whether the GSAUC analysis is able to detect ancient recombination events, we used a simulated recombination. For this, we generated an artificial multiple sequence alignment that corresponded to the observed recombination events in the real data. We used ALISIM, an evolutionary sequence simulator that is a part of IqTree 2.2.0 or later versions [37]. ALISIM generates random sequences on the tips of the provided phylogenetic tree using the provided evolutionary model. To model recombination, we first reconstructed the best phylogenies, the best evolutionary models and their parameters for the genome fragments with the strongest signal of recombinations in our data: JCF2, JCF3 and the remaining alignment (non_JCF2_JCF3) (see Results). Using these trees and models, we generated an artificial alignment for each genome fragment.

The positions of the subtypes in the obtained best IqTree trees largely matched the BEAST topologies shown in Fig 4, with the topology of the non_JCF2_JCF3 matching that of JCF1. The inferred evolutionary models for fragments JCF2, JCF3 and non_JCF2_JCF3 were “TIM2e+G4”, “TIM2e+I+G4” and “GTR+F+I+I+R3” correspondingly. Finally, we combined these alignments into a complete artificial multiple alignment by inserting the columns of fragment alignments to their original positions in the initial alignment.
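The final assembly step, placing the simulated fragment alignments at their original column positions, can be sketched in Python as follows. The function name `splice_fragments` and the data layout (dictionaries mapping sequence names to strings) are hypothetical, and the sketch simplifies the procedure by overwriting columns of a full-length background alignment rather than interleaving separately simulated pieces.

```python
def splice_fragments(background, fragments):
    """Overwrite column ranges of a background alignment with fragments.

    background -- dict: sequence name -> full-length aligned sequence
    fragments  -- list of (start, end, dict name -> fragment sequence),
                  with half-open 0-based column ranges
    """
    result = {name: list(seq) for name, seq in background.items()}
    for start, end, fragment in fragments:
        for name, seq in fragment.items():
            if len(seq) != end - start:
                raise ValueError("fragment length does not match its range")
            # Replace the columns of this range with the simulated fragment.
            result[name][start:end] = seq
    return {name: "".join(chars) for name, chars in result.items()}
```

Because every fragment must have exactly the length of the range it replaces, the assembled alignment keeps the coordinates of the initial alignment.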

Fig 4. Reconstructed Bayesian phylogenies of TBEV based on the characteristic fragments.


Each panel is based on the alignment segment of the corresponding JCF.

We analysed the generated artificial sequence alignment by GS analysis and calculated GSAUC for the GS scores. Both modeled recombination events are detected by GSAUC reliably (S1 Fig).

Genealogy reconstruction with BEAST

We used the results of the GS analysis to pick the candidate segments of the alignment characteristic of distinct phylogenetic histories (joint characteristic fragments, JCFs). As JCF1, we picked a segment of the alignment representative of the whole-genome phylogeny, namely, the NS1 gene (2329–3388 bp). To choose the segments representative of alternative phylogenies, we reasoned that a single window might exceed the 0.5 GS threshold due to random phylogenetic noise, but this is less likely for multiple adjacent windows. Therefore, for a fragment to be picked as a part of a JCF, we required it to exceed the 0.5 threshold in at least two adjacent windows. There were exactly three such segments in our alignment: the two segments for the Siberian subtype (5520–5770 and 6640–6970 bp, with respectively two and three adjacent windows exceeding the threshold), which we concatenated as JCF2, representing a total of 5.7% of the viral genome; and the single segment for the Far-Eastern subtype (7810–8160, with 8 such adjacent windows; Fig 2D), which we classified as JCF3, representing 3.4% of the viral genome. JCF2 and JCF3 cover the highest GS score peaks in the alignment (> 0.65; Fig 2D). Each JCF was then subjected to the analysis of phylogeny (Fig 2D).
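The rule of requiring at least two adjacent windows above the 0.5 threshold can be sketched in Python as follows. The function name `runs_above` is hypothetical, and the sketch returns window indices only; mapping them back to alignment coordinates is left out because the exact boundary convention is not stated in the text.

```python
def runs_above(scores, threshold=0.5, min_run=2):
    """Return (first, last) window indices of qualifying runs.

    A run is a maximal stretch of consecutive windows whose GS score
    exceeds the threshold; runs shorter than min_run are discarded.
    """
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last window.
    if start is not None and len(scores) - start >= min_run:
        runs.append((start, len(scores) - 1))
    return runs
```

Isolated single windows above the threshold are ignored, which reflects the reasoning that a lone peak may be due to random phylogenetic noise.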

Fig 2. Evidence for recombination at origin of the Baikal subtype.


Each analysis compared five consensus sequences: of the Baikal subtype (used as the query), and of the four other subtypes. (A) TOPAL/DSS score, reflecting the difference in shapes of the phylogenies between the adjacent alignment windows (dark gray), against the background of the same values calculated for bootstrap replicates of simulated non-recombinant sequences (light gray); the two dashed lines indicate the expected 99% and 95% confidence intervals of expected scores [36]. (B) SIMPLOT score, reflecting the nucleotide sequence distance between the query and the four consensus sequences [35]; the Far-Eastern subtype is the closest to the Baikal subtype throughout most of the alignment, but the Siberian subtype is more closely related in some of the alignment regions. (C) Bootscan plot is based on reconstructing the neighbor joining trees for each sequence window, and calculating the bootstrap support for the clade uniting the query subtype and each of the four other subtypes [34]; again, the clade uniting the Baikal subtype with the Far-Eastern subtype has the highest support for the bulk of the alignment, but the clade uniting it with the Siberian subtype has a higher support for some regions. (D) GS analysis. Joint characteristic fragments (JCFs) used for subsequent phylogenetic reconstruction are shown below the GS scores plot. The 0.5 threshold is shown as a dashed gray line. For (A)-(C), default parameters were used, namely, sliding window of length 200, step 20 (for (A), step 10 + smoothing step 10). For (D), we used the sliding window length of 250, and step size of 40.

For all described parts of the alignment, genealogy was reconstructed by BEAST v.1.10.4 [38]. The same priors were used for each JCF. We performed a formal model selection for BEAST analysis using IQtree [39]. The best-fit model was GTR+F+I+G4 for JCF1, TIM2e+G4 for JCF2, and TIM+F+I+G4 for JCF3. For all these models, the closest model among those implemented in BEAST was GTR+Gamma, which we used. Uncorrelated lognormal clocks were modeled as lognormal distributions of substitution rates among lineages with the mean of 10⁻⁴. The prior for the age of the root was modeled by normal distribution with a mean of 5000 years and standard deviation of 1000 years. Calibrated Yule model with default parameters was used as the prior on the tree shape best representing the heterogeneity of the dataset. The phylogeny of JCF2 was reconstructed in a single run with the substitution model and mutation rate being unlinked. Each BEAST analysis was performed for 50 million generations with 10% of samples discarded at burn-in. All BEAST runs were checked for convergence by controlling for effective sampling size (ESS values > 200). Visualization was done in ggtree R package [40].

Results

The Baikal subtype is evident of recombination

We characterized the genetic diversity of TBEV by phylogenetic analysis. The obtained TBEV phylogeny (Fig 1) consists of the three main genetic groups with high bootstrap support corresponding to the accepted classification of the virus into three subtypes: European, Siberian and Far-Eastern [5]. Our four newly sequenced isolates cluster together with other variants belonging to the Baikal subtype, forming a well-supported sister clade to the Far-Eastern subtype. Similar to other studies [41], the variants we obtained underwent several rounds of mouse and tissue culture propagation, potentially introducing new mutations; however, the bulk of the divergence of this clade was in its ancestral branch (Fig 1), indicating that the contribution of such mutations was negligible if present at all. Overall, the obtained whole-genome phylogeny conforms with the previous results [21]. Variants TBEV-178-79 (GenBank ID EF469661) and TBEV-2871 (GenBank ID MF774565) were found to be remote outgroups to the Far-Eastern and Siberian subtypes respectively (Fig 1). Variant 178–79 is hypothesized to be the sole representative of a distinct TBEV subtype [42], and variant 2871 was acknowledged to form a new lineage in Siberian TBEV subtype [11]. As these variants are sole representatives of their clades, we have no way to exclude cross-contamination or additional recombination events involving them; therefore, we excluded them from subsequent analysis.

To formally ask whether the Baikal subtype descends from a recombination event, we used a series of tests. First, we applied three alignment-based tests available in the RDP4 package: TOPAL, SIMPLOT and BootScan to consensus sequences of the five studied groups. All three tests support recombination between the Far-Eastern and the Siberian groups at the origin of the 886-84-like group (Fig 2). Specifically, both SIMPLOT (Fig 2B) and BootScan (Fig 2C) indicate that while the Baikal subtype is more closely related to the Far-Eastern subtype in most alignment regions, it is more closely related to the Siberian subtype in some others, notably, around position 6750. Besides, around position 8000, the Baikal subtype clusters with the Far-Eastern subtype more reliably, and at a lower sequence distance, than in surrounding positions. Both the vicinity of position 6750 and of position 8000 demonstrate an abrupt change in the shapes of the phylogenies between the adjacent positions, as evidenced by high TOPAL/DSS scores (Fig 2A).

Second, to better understand the history of recombination events at the origin of the Baikal subtype, we used the GS analysis. GS revealed a signal of recombination for the Baikal subtype (Fig 2D): for some segments of the alignment, GS score was above 0.5 for Siberian subtypes, while for others, it was above 0.5 for Far-Eastern subtypes, suggesting that it has descended from a recombination event involving these two subtypes.

Statistical support for recombination

The results of the GS analysis indicate that different segments of the 886-84-like genome are nested within different subtypes, suggesting recombination. To formally test this, for each of the four predefined groups (European, Far-Eastern, Siberian TBEV subtypes and OHFV), we tested the hypothesis that the 886-84-like clade is equally deeply embedded within the considered subtype across all genomic windows. If this hypothesis is rejected for more than one subtype, this would imply that different segments of the 886-84-like genomes are more closely related to different subtypes.

As the test statistic, we calculated the GSAUCs for the consensus sequence of the Baikal TBEV clade. High values of GSAUCt for a threshold t > 0.5 imply a high degree of relatedness, for some genomic segments, between the query sequence and the clade of interest. A comparison with a single clade can result in significant GSAUCt values for chance reasons such as unevenness of the substitution rate (e.g. due to differences in mutation rate or selective constraint) across the query genome. However, such unevenness cannot result in unexpectedly high GSAUCt values in comparisons with multiple different clades. Instead, such a pattern has to imply that segments of the query sequence differ in their relatedness to different clades, implying recombination.

To statistically assess the increase in a GSAUCt value, we calculated the p-values by comparing the actual GSAUCt value to its null-distribution (see Methods). In order to explore the whole GS score curve, we performed this calculation for different GS thresholds above 0.5.

Significantly (p<0.05) elevated GSAUCt values are observed in the comparisons of the 886-84-like sequence with the Far-Eastern and the Siberian TBEV subtypes (Fig 3). The significance in these comparisons depends on the threshold chosen. Generally, the ranges of the threshold values for which GSAUCt is significant reflect the ranges of GS scores that are outlying compared to the baseline in Fig 2D. On the one hand, for the Far-Eastern subtype, we observe low p-values (down to 0.0) for the threshold values ranging between ~0.72 and ~0.93, indicating elevated GSAUCt values for these thresholds. These thresholds correspond to the elevated GS score in JCF3 (Fig 2D). On the other hand, for the Siberian subtype, we observe low p-values (down to 0.0) for threshold values up to ~0.67, corresponding to the elevated GS score in JCF2 (Fig 2D). The non-monotonic dependence of p-values on the threshold is unsurprising: a p-value of 1 is expected both for low thresholds (as the baseline GS score is rather uniform across the genome) and for high thresholds (as the GS score never reaches values above ~0.95, Fig 2D). For the European subtype, the GSAUCt values are somewhat elevated for low threshold values (<0.55); however, this elevation is not statistically significant (p = 0.069), and the fact that it is only observed for threshold values close to 0.5 suggests that it is due to noise. The OHFV group has a high p-value for all threshold values, as expected for the outgroup.

Fig 3. Phylogenetic evidence for recombinant origin of the Baikal subtype.


The plots show the dependence of the p-value for the GSAUC statistic on the GS score threshold (with threshold increments of 0.01), relative to each of the four predefined groups (the three TBEV subtypes and OHFV). While no GS score thresholds result in p<0.05 for the European subtype and the OHFV (an outgroup species), for the Far-Eastern and Siberian subtypes, some of the GS score thresholds correspond to p<0.05, indicating that both these subtypes contributed to the Baikal subtype.

Evolutionary history of the Baikal subtype

To study the origin of the putative recombinant fragments identified with the GS analysis, we reconstructed their genealogy using BEAST. For this analysis, we selected three groups of alignment columns (joint characteristic fragments, JCFs) each corresponding to one of the three patterns in the GS analysis: low GS scores for all comparisons (positions 2329–3338, JCF1); high GS scores for grouping with the Siberian subtype (concatenated positions 5520–5770 and 6640–6970, JCF2); and high GS scores for grouping with the Far-Eastern subtype (positions 7810–8160, JCF3). The resulting phylogeny (Fig 4) contains posterior probability (PP) of the nodes that determine the phylogenetic position of the Baikal subtype; PP and heights of other nodes are available in S2 Fig and S3 Table.

The genealogy of the non-recombinant JCF1 has the same structure as the ML tree for the whole genome: the Baikal subtype branches off from the most recent ancestor of the Far-Eastern subtype, being the sister group to all Far-Eastern sequences. This topology is supported by the high posterior probability (PP) of three nodes: uniting the Baikal subtype into a clade (PP > 0.99, node I of S2 Fig), uniting the Far Eastern sequences into a clade (PP > 0.99, node F), and indicating that these two clades are in a sister relationship (PP > 0.99, node C).

By contrast, in the phylogenetic tree reconstructed from JCF2, the Baikal subtype is the sister group to all Siberian sequences. Again, this relationship is supported robustly by the high PP of three nodes: uniting the Baikal subtype into a clade (PP > 0.99, node I), uniting the Siberian sequences into a clade (PP > 0.99, node E), and indicating that these are sister clades (PP > 0.99, node D).

Finally, in the phylogeny of JCF3, the Baikal subtype is embedded in the Far-Eastern subtype. The clustering of all Baikal TBEV sequences into a group and of the Far-Eastern sequences into a group are each supported by high PP (PP > 0.99 for node I and PP > 0.99 for node J, respectively). The Baikal TBEV clade is nested within the Far-Eastern clade, although the support for this nesting is lower (PP = 0.61 for node G and PP = 0.77 for node H, two successive nodes).

Discussion

The probability of a recombination event and the ease with which it can be detected both depend on the evolutionary distance between the parental lineages. Recombination between closely related variants is expected to be more frequent because such variants are more likely to circulate at the same place and time, exchange of genetic segments between them is mechanistically easier, and such exchange is less likely to incur a fitness cost due to epistatic interactions between sites; however, it is harder to detect because its footprint on sequences is weaker [21]. All methods for detection of recombination, including our newly developed GSAUC statistic, are expected to have more power for more distantly related sequences. Most previous efforts to detect recombination in TBEV have focused on within-subtype recombination, and evidence for it has been controversial [13,18–21]. Here, we address a putative recombination event involving two major TBEV subtypes, Siberian and Far-Eastern, and provide evidence that at least one recombination event between these subtypes occurred at the origin of the Baikal subtype of TBEV.

These findings have the following limitations. First, our inference of recombination is based on an incongruence of phylogenetic trees for different alignment regions. While homologous recombination is the most likely source of such incongruence, other factors can also contribute to it, including non-uniform substitution rates between sequence regions and convergent evolution caused by biased mutation patterns or the action of natural selection. Such convergence would have to affect multiple sites, both synonymous and nonsynonymous, making this possibility unlikely; still, it is possible. Second, our analysis is based on just 11 samples of the Baikal subtype of TBEV. Including additional samples may reveal a more complicated history, including additional recombination events. This can be addressed by more extensive sampling of the Baikal subtype as well as other related subtypes. Third, the current methods for recombination detection are unable to precisely pinpoint the recombination breakpoint, particularly if the recombination was old. Fourth, the temporal signal of the inferred recombination events is weak, complicating their dating (see below).

With these limitations in mind, our claim for recombination at the origin of the Baikal subtype is based on two types of phylogenetic evidence, both with robust statistical support. First, using existing methods as well as the newly developed GSAUC statistic, we show that fragments of the TBEV genome differ in which other clade they are most similar to. The methods used to show this are based on different approaches, indicating that the support for recombination is unlikely to be a methodological artifact. Second, using Bayesian phylogenetics, we show that these fragments cluster with different TBEV subtypes.

The GS plot indicates that the segments involved in recombination are relatively short (Fig 2D). Therefore, it is critical to integrate its signal over sequence segments. The developed GSAUC statistic provides a framework for such integration. Similar to the original GS analysis which underlies it, it is based on the idea that, unless the recombination events were too frequent, adjacent genomic positions tend to share phylogenetic history. Therefore, even if the signal is partially eroded by subsequent evolution, we would expect higher cumulative GS scores relative to some predefined clades than would be expected for a reshuffled genome sequence.
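The reasoning above, that clustered signal survives windowed integration while reshuffled signal does not, can be sketched with a permutation test. The Python code below is a simplified stand-in and not the authors' GSAUC implementation (their scripts are listed under Data Availability). It assumes a per-site signal in place of the GS score, a sliding-window mean in place of the windowed GS calculation, and the area of the curve above a threshold as the test statistic; all function names and the toy signal are hypothetical. The p-value is taken as the plain fraction of permutations that reach the observed area, which is one way a value of exactly 0.0 can arise.

```python
import numpy as np

def windowed_score(site_signal, window):
    """Sliding-window mean of a per-site signal (stand-in for the GS score)."""
    kernel = np.ones(window) / window
    return np.convolve(site_signal, kernel, mode="valid")

def area_above_threshold(scores, threshold):
    """Area between the score curve and the threshold, counting only the excess."""
    return float(np.clip(scores - threshold, 0.0, None).sum())

def permutation_p_value(site_signal, window, threshold, n_perm=200, seed=0):
    """Compare the observed area with areas obtained after shuffling site order."""
    rng = np.random.default_rng(seed)
    observed = area_above_threshold(windowed_score(site_signal, window), threshold)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(site_signal)  # destroys adjacency of sites
        null_area = area_above_threshold(windowed_score(shuffled, window), threshold)
        if null_area >= observed:
            hits += 1
    return observed, hits / n_perm

# Toy signal (hypothetical): uniform baseline with mean 0.5 plus one block
# of 100 adjacent high-signal sites, mimicking a short recombinant segment.
site_signal = np.tile([0.0, 1.0], 500)
site_signal[400:500] = 1.0

observed, p = permutation_p_value(site_signal, window=50, threshold=0.8)
print(observed, p)
```

In this sketch, a threshold that the windowed score never reaches gives an observed area of zero and therefore a p-value of 1, in line with the behavior described for high thresholds in the Results.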

The GS plots allow us to single out putative regions with different evolutionary histories, JCF1, JCF2 and JCF3, which can then be analyzed phylogenetically. While other regions could have also been involved in recombination, these are the genomic segments for which the evidence for recombination is the most robust (Fig 2D).

Remarkably, the reconstructed phylogeny for JCF1, JCF2 and JCF3 strongly supports the hypothesis that the Baikal subtype originated through recombination. In particular, the tree of JCF1 features the Baikal subtype as a sister group to the Far-Eastern subtype with very high posterior support, while the tree of JCF2 strongly backs the placement of this group with the Siberian subtype. The discrepancy between these trees implies recombination between the Siberian and Far-Eastern subtypes of TBEV at the origin of the Baikal subtype. The tree of JCF3, the genomic region with very high GS similarity to the Far-Eastern subtype, shows deep embedding of the Baikal TBEV clade into the Far-Eastern subtype, although with moderate posterior support. The difference between the JCF1- and JCF3-based phylogenies may indicate an additional, more recent recombination event between the ancestor of the Baikal subtype and the Far-Eastern subtype.

When did these events take place? In theory, evolutionary events can be dated using a molecular clock analysis. Such inference is complicated by the apparent weakness of the temporal signal in the TBEV genetic data [43,44]. For example, Deviatkin et al. [43] did not observe a positive correlation between sampling date and root-to-tip distance for divergence at the scale above major TBEV subtypes, although such a correlation has been observed within most subtypes (Far-Eastern, European and two out of three Siberian subclades). Moreover, even in the presence of a temporal signal, precise dating of evolutionary events based on three phylogenetic trees with different topologies and evolutionary rates is challenging. With these caveats in mind, we attempt to roughly date the major events in the evolutionary history of the Baikal TBEV clade. To cross-reference the branching events in different trees, we use the fact that the last common ancestor (LCA) of all considered TBEV subtypes should coincide between them. Indeed, its timing is similar (3731–4546 y.a.) between all three trees (S2 Fig).
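The temporal-signal check mentioned above, a regression of root-to-tip distance on sampling date, can be sketched as follows. This is a generic Python illustration and not the analysis of Deviatkin et al. [43]; the function `root_to_tip_regression` and the toy dates and distances are hypothetical. A positive slope with a high correlation suggests a usable clock signal; under that condition the slope estimates the substitution rate and the x-intercept estimates the date of the root.

```python
import numpy as np

def root_to_tip_regression(sampling_years, root_to_tip):
    """Least-squares fit of root-to-tip distance against sampling year."""
    x = np.asarray(sampling_years, dtype=float)
    y = np.asarray(root_to_tip, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # distance = slope * year + intercept
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation
    # The x-intercept (distance = 0) is only meaningful for a positive slope.
    root_year = -intercept / slope if slope > 0 else None
    return slope, r, root_year

# Toy data (hypothetical): four tips with a perfectly clock-like signal.
years = [1950, 1970, 1990, 2010]
distances = [0.010, 0.012, 0.014, 0.016]  # substitutions per site

slope, r, root_year = root_to_tip_regression(years, distances)
print(slope, r, root_year)
```

When the correlation is absent or negative, as reported for divergence above the level of major subtypes, the x-intercept carries no dating information, which is why the function returns `None` in that case.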

Comparison of the phylogenies constructed from JCF1, JCF2 and JCF3 suggests the following timeline (S2 Fig). The Far-Eastern and Siberian subtypes diverged approximately 676–1434 y.a. The LCA of the Far-Eastern subtype and the Baikal TBEV clade dates to 688 y.a., according to the JCF1 tree, and the LCA of the Siberian subtype and the Baikal TBEV clade dates to 973 y.a., according to the JCF2 tree. This implies that the first recombination event involving the Far-Eastern and Siberian subtypes occurred within this time interval. According to the JCF3 tree, the LCA of the Far-Eastern subtype and the Baikal TBEV clade dates to 235 y.a., implying that the second putative recombination event, involving the Baikal TBEV clade and the Shenjang lineage within the Far-Eastern subtype, occurred between 688 and 235 y.a. The LCA of all known representatives of the Baikal TBEV clade dates to 83–119 y.a. The 95% HPD intervals on these dates are very wide (S2 Fig), and therefore the dates should only be considered rough estimates.

In conclusion, we detect an instance of recombination in the history of the TBEV Baikal subtype. This inference is supported by multiple methods, including a newly developed GSAUC method based on the GS statistic, which allows the signal of recombination to be measured in a statistically rigorous way. This discovery concludes a long debate regarding the possibility of recombination in the evolution of TBEV. The role of such events in the origin of other TBEV variants remains an important subject for future studies.

Supporting information

S1 Table. Isolation and cultivation of the studied strains.

(PDF)

S2 Table. Primers used for sequencing.

(PDF)

S3 Table. Heights and posterior probabilities of select nodes shown in S2 Fig from the reconstructed Bayesian phylogeny of TBEV JCFs.

(PDF)

S1 Fig. GSAUC reliably detects a simulated recombination event.

(A) GS analysis of the artificially generated alignment with recombination; (B) GSAUC curves corresponding to the GS analysis.

(TIF)

S2 Fig. Nodes from the reconstructed Bayesian phylogeny of TBEV JCFs for which heights and posterior probabilities are presented in S3 Table.

(TIF)

Data Availability

Python scripts used for calculation of GSAUC and alignments are available at https://github.com/gregoruar/tbev_rec. All other relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work was funded by the Russian Science Foundation (project no. 21-74-20160 to G.A.B.). A.D.N. was partially supported by the HSE University Basic Research Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Walker PJ, Siddell SG, Lefkowitz EJ, Mushegian AR, Adriaenssens EM, Alfenas-Zerbini P, et al. Recent changes to virus taxonomy ratified by the International Committee on Taxonomy of Viruses (2022). Arch Virol. 2022. Aug 23. doi: 10.1007/s00705-022-05516-5
  • 2. Lindqvist R, Upadhyay A, Överby A. Tick-Borne Flaviviruses and the Type I Interferon Response. Viruses. 2018. Jun 21;10(7):340. doi: 10.3390/v10070340
  • 3. Mlera L, Bloom ME. The Role of Mammalian Reservoir Hosts in Tick-Borne Flavivirus Biology. Front Cell Infect Microbiol. 2018. Aug 28;8:298. doi: 10.3389/fcimb.2018.00298
  • 4. Valarcher J.F., Hägglund S., Juremalm M., Blomqvist G., Renström L., Zohari S., et al. Tick-borne encephalitis. Rev Sci Tech Int Off Epizoot. 2015;34(2):453–66. doi: 10.20506/rst.34.2.2371
  • 5. Ecker M, Allison SL, Meixner T, Heinz FX. Sequence analysis and genetic classification of tick-borne encephalitis viruses from Europe and Asia. J Gen Virol. 1999. Jan 1;80(1):179–85. doi: 10.1099/0022-1317-80-1-179
  • 6. Zlobin VI, Belikov SI, Dzhioev YuP, Demina TV, Kozlova IV. Molecular epidemiology of tick-borne encephalitis. Vopr Virusol. 2007;52(6):6–13.
  • 7. Yoshii K, Song JY, Park S beom, Yang J, et al. Tick-borne encephalitis in Japan, Republic of Korea and China. Nat Publ Group. 2019;1751.
  • 8. Kovalev SY, Kokorev VS, Belyaeva IV. Distribution of Far-Eastern tick-borne encephalitis virus subtype strains in the former Soviet Union. J Gen Virol. 2010. Dec 1;91(12):2941–6. doi: 10.1099/vir.0.023879-0
  • 9. Estrada-Peña A, Ayllón N, de la Fuente J. Impact of Climate Trends on Tick-Borne Pathogen Transmission. Front Physiol. 2012;3. doi: 10.3389/fphys.2012.00064
  • 10. Pfeffer M, Dobler G. Emergence of zoonotic arboviruses by animal trade and migration. Parasit Vectors. 2010;3(35):15. doi: 10.1186/1756-3305-3-35
  • 11. Tkachev SE, Chicherina GS, Golovljova I, Belokopytova PS, Tikunov AYu, Zadora OV, et al. New genetic lineage within the Siberian subtype of tick-borne encephalitis virus found in Western Siberia, Russia. Infect Genet Evol. 2017. Dec;56:36–43. doi: 10.1016/j.meegid.2017.10.020
  • 12. Dai X, Shang G, Lu S, Yang J, Xu J. A new subtype of eastern tick-borne encephalitis virus discovered in Qinghai-Tibet Plateau, China. Emerg Microbes Infect. 2018. Dec;7(1):1–9.
  • 13. Norberg P, Roth A, Bergström T. Genetic recombination of tick-borne flaviviruses among wild-type strains. Virology. 2013. Jun;440(2):105–16. doi: 10.1016/j.virol.2013.02.017
  • 14. Kozlova IV, Demina TV, Tkachev SE, Doroshchenko EK, Lisak OV, Verkhozina MM, et al. Characteristics of the baikal subtype of tick-borne encephalitis virus circulating in eastern siberia. Acta Biomed Sci. 2018. Jul 28;3(4):53–60.
  • 15. Perez-Ramirez G, Diaz-Badillo A, Camacho-Nuez M, Cisneros A, Munoz M. Multiple recombinants in two dengue virus, serotype-2 isolates from patients from Oaxaca, Mexico. BMC Microbiol. 2009;9(1):260. doi: 10.1186/1471-2180-9-260
  • 16. Han JF, Jiang T, Ye Q, Li XF, Liu ZY, Qin CF. Homologous recombination of Zika viruses in the Americas. J Infect. 2016. Jul;73(1):87–8. doi: 10.1016/j.jinf.2016.04.011
  • 17. Carney J, Daly JM, Nisalak A, Solomon T. Recombination and positive selection identified in complete genome sequences of Japanese encephalitis virus. Arch Virol. 2012. Jan;157(1):75–83. doi: 10.1007/s00705-011-1143-4
  • 18. Twiddy SS, Holmes EC. The extent of homologous recombination in members of the genus Flavivirus. Gen Virol. 2003;429–40. doi: 10.1099/vir.0.18660-0
  • 19. Yun SM, Kim SY, Ju YR, Han MG, Jeong YE, Ryou J. First complete genomic characterization of two tick-borne encephalitis virus isolates obtained from wild rodents in South Korea. Virus Genes. 2011. Jun;42(3):307–16. doi: 10.1007/s11262-011-0575-y
  • 20. Bertrand Y, Töpel M, Elväng A, Melik W, Johansson M. First Dating of a Recombination Event in Mammalian Tick-Borne Flaviviruses. Martin DP, editor. PLoS ONE. 2012. Feb 22;7(2). doi: 10.1371/journal.pone.0031981
  • 21. Bertrand YJK, Johansson M, Norberg P. Revisiting Recombination Signal in the Tick-Borne Encephalitis Virus: A Simulation Approach. Alvisi G, editor. PLOS ONE. 2016. Oct 19;11(10). doi: 10.1371/journal.pone.0164435
  • 22. Kozlova IV. TBEV strain collection [Internet]. 2017. Available from: http://www.ckp-rf.ru/ckp/478258/.
  • 23. Kozlova IV, Verhozina MM, Demina TV, Dzhioev YuP, Tkachev SE, Karan LS, et al. Genetic and Biological Properties of the Original Group of Tick-Borne Encephalitis Virus Strains Circulating in Eastern Siberia. Epidemiol Vaccine Prev. 2012;3(64):14–25.
  • 24. Pogodina V.V., Karan L.S., Kolyasnikova N.M., Gerasimov S.G., Levina L.S., Bochkova N.G., et al. Polytypic strains in genofund of tick-borne encephalitis virus. Vopr Virusol. 2012;57(3):30–6.
  • 25. Karan LS, Braslavskaya SI, Myazin AE. Development Of Amplification Technology-Based Methods For Tick-Borne Encephalitis Virus Detection And Genotyping. Vopr Virusol. 2007;6:17–22.
  • 26. Pogodina VV, Karan LS, Kolyasnikova NM. Evolution Of Tick-Borne Encephalitis And A Problem Of Evolution Of Its Causative Agent. Vopr Virusol. 2007;5:16–20.
  • 27. Pogodina VV, Karan LS, Kolyasnikova NM. Polytipovie schtammi v genofonde virusa kleschevogo encefalita. Vopr Virusol. 2012;3:30–6.
  • 28. Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. RDP4: Detection and analysis of recombination patterns in virus genomes. Virus Evol. 2015. Mar;1(1). doi: 10.1093/ve/vev003
  • 29. Simmonds P. SSE: a nucleotide and amino acid sequence analysis platform. BMC Res Notes. 2012. Dec;5(1):50. doi: 10.1186/1756-0500-5-50
  • 30. Katoh K, Standley DM. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol Biol Evol. 2013. Apr 1;30(4):772–80. doi: 10.1093/molbev/mst010
  • 31. Černý J, Selinger M, Palus M, Vavrušková Z, Tykalová H, Bell-Sakyi L, et al. Expression of a second open reading frame present in the genome of tick-borne encephalitis virus strain Neudoerfl is not detectable in infected cells. Virus Genes. 2016. Jun;52(3):309–16. doi: 10.1007/s11262-015-1273-y
  • 32. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016. Jul;33(7):1870–4. doi: 10.1093/molbev/msw054
  • 33. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016. Jul 8;44(1):242–5. doi: 10.1093/nar/gkw290
  • 34. Martin DP, Posada D, Crandall KA, Williamson C. A Modified Bootscan Algorithm for Automated Identification of Recombinant Sequences and Recombination Breakpoints. AIDS Res Hum Retroviruses. 2005. Jan;21(1):98–102. doi: 10.1089/aid.2005.21.98
  • 35. Lole KS, Bollinger RC, Paranjape RS, Gadkari D, Kulkarni SS, Novak NG, et al. Full-Length Human Immunodeficiency Virus Type 1 Genomes from Subtype C-Infected Seroconverters in India, with Evidence of Intersubtype Recombination. J Virol. 1999. Jan 1;73(1):152–60. doi: 10.1128/JVI.73.1.152-160.1999
  • 36. McGuire G, Wright F. TOPAL 2.0: improved detection of mosaic sequences within multiple alignments. Bioinformatics. 2000. Feb 1;16(2):130–4. doi: 10.1093/bioinformatics/16.2.130
  • 37. Ly-Trong N, Naser-Khdour S, Lanfear R, Minh BQ. AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era. Crandall K, editor. Mol Biol Evol. 2022. May 3;39(5). doi: 10.1093/molbev/msac092
  • 38. Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian Phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012. Aug;29(8):1969–73. doi: 10.1093/molbev/mss075
  • 39. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Teeling E, editor. Mol Biol Evol. 2020. May 1;37(5):1530–4. doi: 10.1093/molbev/msaa015
  • 40. Yu G, Smith DK, Zhu H, Guan Y, Lam TT. ggtree: an r package for visualization and annotation of phylogenetic trees with their covariates and other associated data. McInerny G, editor. Methods Ecol Evol. 2017. Jan;8(1):28–36.
  • 41. Adelshin RV, Sidorova EA, Bondaryuk AN, Trukhina AG, Sherbakov DYu, White III RA, et al. “886-84-like” tick-borne encephalitis virus strains: Intraspecific status elucidated by comparative genomics. Ticks Tick-Borne Dis. 2019. Aug;10(5):1168–72. doi: 10.1016/j.ttbdis.2019.06.006
  • 42. Demina TV, Dzhioev YuP, Verkhozina MM, Kozlova IV, Tkachev SE, Plyusnin A, et al. Genotyping and characterization of the geographical distribution of tick-borne encephalitis virus variants with a set of molecular probes. J Med Virol. 2010. May;82(6):965–76. doi: 10.1002/jmv.21765
  • 43. Deviatkin AA, Karganova GG, Vakulenko YA, Lukashev AN. TBEV Subtyping in Terms of Genetic Distance. Viruses. 2020. Oct 31;12(11):1240. doi: 10.3390/v12111240
  • 44. Clark JJ, Gilray J, Orton RJ, Baird M, Wilkie G, Filipe A da S, et al. Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses. Holbrook MR, editor. PLoS Negl Trop Dis. 2020. Sep 14;14(9). doi: 10.1371/journal.pntd.0008133
PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0011141.r001

Decision Letter 0

Samuel V Scarpino, Wen-Ping Guo

17 Jul 2022

Dear Dr. Bazykin,

Thank you very much for submitting your manuscript "The recently identified long-living lineage of the tick-borne encephalitis virus is a result of an ancient recombination event" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Please revise the manuscript according to the formatting requirements, which can be found at https://journals.plos.org/plosntds/s/submission-guidelines

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Wen-Ping Guo

Academic Editor

PLOS Neglected Tropical Diseases

Samuel Scarpino

Section Editor

PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analyses used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: Revisions ...

Pg. 6 Par. 2 Line 1: How did the authors determine which model (i.e., Kimura’s 2-parameter model) best fit the applied TBEV sequence alignment?

Comments & Suggestions ...

Methods: Inclusion of an additional positive control or engineered in silico model for TBEV recombination (similar to the cited Bertrand et al. reference) may further strengthen the robustness of the proposed GSAUC method. Does this method detect artificially-introduced recombination events? Furthermore, how do the alternative RDP4 methodologies perform as a baseline in these same scenarios?

Reviewer #2: - The authors excluded the two most divergent variants of the Siberian and Far-East genotypes (TBEV-178-79 (GenBank ID EF469661) and TBEV-2871 (GenBank ID MF774565)) from the presented analysis. Why not improve the power of the analysis, given the otherwise limited virus sampling and low genomic variation that hindered prior analyses? If these viruses are indeed recombinants, as the authors suspect, any evidence in support of this assertion will strengthen the main conclusion of the study. If the inclusion of these two sequences affects the main conclusion of this study, this result could be used to reveal critical dependencies of the recombination analysis and call for further expansion of genomic exploration of the TBEV natural diversity. One way or another, confidence in the conclusions, original or revised, would be improved.

- Consensus sequence versus all sequences. The authors used consensus sequences of genotypes (groups) for recombination analysis without providing a rationale and without acknowledging that this may decrease the power of the analysis. Besides, there are two other issues with this choice: a) no details are provided about how the consensus sequences were generated, including the use of weighting or another method to correct for uneven virus sampling (even if the two most divergent sequences were excluded); b) despite a consensus sequence being used for each group, the GS score calculation relies on analysis of all sequences in each group (p. 6), which is confusing.

- Provide a Supplementary Table detailing sequences used in the study

Reviewer #3: Objectives of the study are clearly articulated, the design is appropriate.

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: Revisions ...

Pg. 13 Par. 3: Several indicated node and LCA values for the JCF1 and JCF2 phylogenies (e.g., nodes C and D) are inconsistent with the referenced supplemental Figure S1 infographics.

Figure S1: Based on the JCF2 phylogeny, the associated metadata for Node C appears to be incorrectly listed in the JCF1 versus intended JCF2 column. In addition, similar metadata for Node D is listed in the JCF2 column despite annotation on the JCF1 phylogeny. Relative to Figure 5, these discrepancies may stem from the apparent swap of the labeled JCF1 and JCF2 phylogenies in Figure S1.

Comments & Suggestions ...

Pg. 9 Par. 2 Line 3-5 (and Figure 2): Expand upon the provided text to clarify how the results of the applied RDP4 package analytics: TOPAL, SIMPLOT, and BootScan support signatures of Far Eastern and Siberian genotype recombination. Inclusion of these tests provides an analytical foundation the authors build upon in the proposed GSAUC method, and readers may be unfamiliar with interpretation of these results. In brief, what particular features of the TOPAL, SIMPLOT, and BootScan output support (or potentially contradict) the results of the applied GSAUC method? Minimally, expansion of the descriptive text in the Figure 2 legend would be highly informative (such as the labeled thresholds and gray-scaled line series in the TOPAL/DSS score panel).

Figure 3: Based on the provided inclusion criteria and the results detailed in Figures 3 and 4, GS scores for the Siberian genotype exceeded the defined 0.5 threshold with GSAUC statistic p-values < 0.05 in the encoded capsid and prM-E gene segments. Why did the authors disqualify these signals as putative joint characteristic fragments (JCFs)?

Reviewer #2: - Provide a chapter describing four new TBEVs and their genome sequences and relationship with known TBEVs. Acknowledge the mouse and tissue culture propagation of TBEV as sources of extra mutations. Its (limited?) scale and impact on downstream inferences should be discussed in the Discussion section.

- p. 9: detail the results presented in Fig. 2, e.g. plot distribution, peaks, agreement between different methods, type of evidence, etc.

- The time-based analysis should be moved from Discussion p.13 to the Results if retained in the paper at all, which is uncertain. The lack of the root-to-tip regression signal and the high uncertainty of the HPD ranges (see below) call into question the reliability of this analysis. The choice of the outgroup and the testing of evolutionary models should be substantiated.

- Move Fig. 3 GS and genomic plots to the bottom and top of Fig. 2, respectively. This will facilitate comparison and reveal agreement between results of different methods.

- Figure 4: explain why the plots look like continuous rather than discrete functions, although the underlying analysis uses a sliding window with a discrete step. Could the function be extended below 0.5? Could a log scale be used on the Y axis and the p-value cut-offs of 0.05 and 0.01 be specified?

- What is the topology of the phylogeny based on the entire genome minus JCF2 and JCF3? Does it match the topology of the Fig. 1 and JCF1 trees? If so, why not use it instead of JCF1, whose selection is a bit arbitrary?

- Provide sequence alignment(s) to illustrate and support recombination

Reviewer #3: The results are clearly and completely presented.

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Comments & Suggestions ...

Discussion: Sukhorukov et al. might consider addressing how this new methodology can be applied more broadly in their field. Is the GSAUC method limited to the detection of ancestral recombination events (due to underlying model assumptions/parameters), or can this tool be applied to more recent evolutionary scenarios in other viral systems? Inclusion of additional text would highlight inappropriate applications and strengthen readership impact/utility.

Reviewer #2: - See Methods

- The manuscript is dominated by the use of “recombination” without acknowledgement that the obtained results are about the incongruence of trees for different genomic regions, with homologous recombination being one of the possible explanations of the observed patterns. Accordingly, the word “recombination” should be used carefully and seldom, and other explanations for the obtained results should be presented and, if discarded, explicit reasoning should be provided. The same applies to the title and conclusions.

- Limitations of the study are yet to be specified

Reviewer #3: The conclusions are correct and supported by the data presented.

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: Revisions ...

Pg. 4 Par. 2 Line 17: Provide the expanded definition for “AUC” in the “GSAUC” acronym.

Figure 1: Consider editing the applied color-scheme for the bootstrap legend to allow better visibility/step differentiation (such as an orange-blue scale). Currently, it is difficult (partially due to size) to discriminate between the green lines (0.95 and 1.00) and find the bright yellow (0.90) colored lines.

Comments & Suggestions ...

Figure/Table Descriptions: In general, provision of more descriptive text for applied data series, axes, thresholds, color-schemes, icons, and numerical labels will aid in orienting readers and data interpretation.

Reviewer #2: Introduction.

- Note that most analysed TBEV sequences are just sequences; they are “variants” but not “strains”, a designation reserved for well-characterized variants.

- Up-to-date taxonomy of the Flaviviridae including how it must be spelled and written is here (10.1099/jgv.0.000672)

- The presented analysis concerns a flavivirus species, which includes all variants of TBEV and this species should be specified. Consult https://doi.org/10.1371/journal.ppat.1009318 for understanding differences between viruses and virus species.

- TBEV genome is mRNA that is translated to produce polyprotein, cleaved by virus and host proteases to mature proteins.

- Describe the basis of the TBEV genotyping, e.g. genome region, method, criterion etc

- Describe evidence for homologous recombination in TBEV (genome regions, genotypes) and causes of uncertainty of these findings.

- p.3. Provide a reference for the observation of a patient infected by multiple TBEV genotypes.

- The “In the agar…” sentence does not make sense

Discussion

- p.11: discuss “evidence for it [recombination] has been controversial”. Show how the controversial points were addressed in this study, especially the role of the new statistic and why it is superior to the Topal statistic, for instance.

- p.11: make explicit reference to tree nodes (A, B, C, etc) in Fig. S1, when discussing the results.

- p.12: based on the Results section, it is a misstatement to list the JCF1 region as being of possible recombinant origin; this should be corrected

Others:

- Specify genome fractions occupied by JCF2 and JCF3

- Could you say that JCF2 and JCF3 are the only regions of possible recombinant origins, given the analyzed dataset?

- Illustrations and elsewhere: make clear difference between TBEV genotypes of the same virus species and the outgroup, which belongs to another flavivirus species.

- Figures 2-4: specify query

- Fig. S1: a) Results for JCF1 and JCF2 may have been swapped relative to Fig. 5 and the Fig. S1-associated Table; b) the 95% HPD intervals for many nodes indicate high uncertainty, which needs to be discussed with respect to the time-based inferences; c) some nodes in the three phylogenies must have different names (e.g. node F for JCF3 versus JCF1 and JCF2)

Reviewer #3: Specific comments:

1. Title: It would be good to change the title to better reflect the obtained results. For example, the Baikal TBEV genotype is a result of a recombination between Siberian and Far-Eastern TBEV strains.

2. Abstract: The Baikal and Himalayan genotypes are already considered as new TBEV genotypes.

3. Abstract: TBEV samples – better: TBEV strains

4. Introduction: Flavivirus genus is not the only genus of the family Flaviviridae

5. The reference number 1 is not appropriate. Please, cite any recent review or ICTV release.

6. “short single-stranded positive-sense RNA viruses” – unusual wording

7. “encodes a single transcript that is translated” – better: encodes a single open reading frame that is translated

8. It is not clear how the non-structural proteins are involved in transmission of the virus.

9. Introduction second paragraph: Far East (capital F)

10. Page 3 and further: in the scientific literature the name “Baikalian subtype/genotype” is already well established instead of “886-84-like”. The “886-84-like” name is not well known in the Western literature and most papers use just the name “Baikalian”. This reviewer strongly encourages the use of the name Baikalian genotype throughout the manuscript.

11. Page 3 end of the second paragraph: a reference is missing

12. Page 3 third paragraph: not “reaction of neutralization” but “neutralization test” or “neutralization assay”

13. Page 4 second paragraph: … study, we developed….

14. Figure 2: please, prepare a better figure legend. It is not good to say only “results from the analysis”. The legend should be self-explanatory; i.e., describe what was the purpose, what was done and what is shown. The same for Figure 2 and 5.

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: In this manuscript, Sukhorukov et al. introduce a newly developed method to infer ancestral recombination events in the tick-borne encephalitis virus (TBEV) genome. Here, the authors implemented a recombination model (i.e., GSAUC method) to specifically detect and validate evolutionary signals of recombination events within the TBEV 886-84-like lineage; in particular, the evidence for two proposed recombination signatures was outlined between the ancestral Far Eastern (FE) and Siberian genotypes and within the more recent Far Eastern Shenjang lineage and 886-84-like clade.

This document is well-written. In particular, the implemented GSAUC method computational model, hypothesis testing, and data interpretation/validation are all outlined in an intuitive framework which will allow readers to reproduce or translate the proposed method via the provided GitHub link and referenced software packages. In preparation, Sukhorukov et al. could consider further validation of the robustness of their testing algorithm using engineered in silico data models or additional, comparative software packages/code repositories.

Reviewer #2: Sukhorukov et al. present new genome sequences of four TBEV collected in the Baikal area of Russia over six years and phylogenomic analysis of TBEV sequences available in 2018 along with the newly sequenced genomes. Using diverse software and an original statistical test, developed for this study, they found support for conflicting evolutionary histories of different genome regions encoded by a TBEV group known as 886-84-like cluster (genotype), including the new genome sequences. The authors proposed that an ancestor of this cluster emerged as a result of ancient recombination(s) between ancestors of two main TBEV genotypes, Siberian and Far-East.

The conducted analysis is informative (after addressing the criticisms), although hardly definitive in resolving the uncertainty concerning the role of homologous recombination in the evolution of TBEV, contrary to the authors' claim.

Reviewer #3: Sukhorukov et al. submitted a manuscript titled “The recently identified long-living lineage of the tick-borne encephalitis virus is a result of an ancient recombination event” for peer review in PLoS Neglected Tropical Diseases. The manuscript identified that the newly described Baikalian TBEV subtype may be a result of a recombination event between Siberian and Far-Eastern TBEV strains. This discovery is of interest and provides the first clear and statistically solid evidence of recombination in TBEV. However, the title and several other issues listed above need to be addressed before the manuscript can be considered acceptable for publication.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: MinorRevision_ReviewerComments_20220706.docx

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0011141.r003

Decision Letter 1

Samuel V Scarpino, Wen-Ping Guo

20 Nov 2022

Dear Dr. Bazykin,

Thank you very much for submitting your manuscript "The Baikal subtype of tick-borne encephalitis virus is a result of recombination between Siberian and Far-Eastern subtypes" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Wen-Ping Guo

Academic Editor

PLOS Neglected Tropical Diseases

Samuel Scarpino

Section Editor

PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods

-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?

-Is the study design appropriate to address the stated objectives?

-Is the population clearly described and appropriate for the hypothesis being tested?

-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?

-Were correct statistical analysis used to support conclusions?

-Are there concerns about ethical or regulatory requirements being met?

Reviewer #1: In this revised manuscript, Sukhorukov et al. comprehensively address all provided reviewer comments in addition to suggestions for further supporting evidence/analytics. In particular, expansion of the methods section and figure/table caption text will help orient and guide readers through the logic of the presented data analyses.

Reviewer #3: (No Response)

--------------------

Results

-Does the analysis presented match the analysis plan?

-Are the results clearly and completely presented?

-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #1: Minor Revisions:

Pg. 10 Par. 3 Title: (grammatical) Alternatively, consider “The Baikal subtype is evidence of recombination.”

Pg. 13 Par. 4 Line 3: Replace the reference to “node F” with “node J.” There is no node F in the revised JCF3 phylogeny (see additional Figure S2 table revision below).

Reviewer #3: (No Response)

--------------------

Conclusions

-Are the conclusions supported by the data presented?

-Are the limitations of analysis clearly described?

-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?

-Is public health relevance addressed?

Reviewer #1: Minor Revisions:

Pg. 15 Par. 2 Line 1: How do the mentioned Figure 3 plots indicate putative recombination segment length? Do the authors mean to reference Figure 2D instead where JCF1, JCF2, and JCF3 are annotated relative to genetic position?

Pg. 16 Par 2 Line 2: No LCA value of 678 is provided in the Figure S2 table. Is the value 676 (rounded from 675.5) for the respective node B median height intended?

Reviewer #3: (No Response)

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #1: Minor Revisions:

Pg. 4 Par. 3 Line 2: (grammatical) Consider “Most of these methods evaluate the strength...” instead.

Table S1: How do three asterisks differ or encapsulate the provided one and two asterisk descriptions? Alternatively, consider use of numerical superscript annotation (e.g., 1, 2, 3, etc.) instead.

Figure S2: Replace all “886-84-like” references with “Baikal subtype” to be consistent with the rest of the manuscript (if applicable).

Figure S2 (table): Remove the entry for node F in the JCF3 column. This information appears to have been moved to node J in this column based on the revised supplemental figure.

Comments & Suggestions:

Figures: Streamline the subtype-specific color-schemes between all figures. For example, the colors applied in Figure S1A do not match those applied in all the other figures.

Figure 2 Description: Provide more specific text than “in some of the alignment regions” or “for some regions” where indicated. For example, the authors may consider highlighting how indicated Joint characteristic fragments (JCFs) annotated in the GS analysis (2D) correspond to positional segments which demonstrate graphical shifts in genetic distance (2B) and/or bootstrap support (2C) to specific subtypes as compared to surrounding positional segments.

Reviewer #3: (No Response)

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: No additional major revisions or new analyses are recommended. Provided revisions and comments are minor with a few data/graphical inconsistencies indicated within the manuscript text and figures/tables.

Reviewer #3: The authors addressed all my comments. I have no further comments on this submission.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

References

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0011141.r005

Decision Letter 2

Samuel V Scarpino, Wen-Ping Guo

6 Feb 2023

Dear Dr. Bazykin,

We are pleased to inform you that your manuscript 'The Baikal subtype of tick-borne encephalitis virus is evident of recombination between Siberian and Far-Eastern subtypes' has been provisionally accepted for publication in PLOS Neglected Tropical Diseases.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Wen-Ping Guo

Academic Editor

PLOS Neglected Tropical Diseases

Samuel Scarpino

Section Editor

PLOS Neglected Tropical Diseases

***********************************************************

PLoS Negl Trop Dis. doi: 10.1371/journal.pntd.0011141.r006

Acceptance letter

Samuel V Scarpino, Wen-Ping Guo

21 Mar 2023

Dear Dr. Bazykin,

We are delighted to inform you that your manuscript, "The Baikal subtype of tick-borne encephalitis virus is evident of recombination between Siberian and Far-Eastern subtypes," has been formally accepted for publication in PLOS Neglected Tropical Diseases.

We have now passed your article onto the PLOS Production Department who will complete the rest of the publication process. All authors will receive a confirmation email upon publication.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or type-setting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Editorial, Viewpoint, Symposium, Review, etc...) are generated on a different schedule and may not be made available as quickly.

Soon after your final files are uploaded, the early version of your manuscript will be published online unless you opted out of this process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Shaden Kamhawi

co-Editor-in-Chief

PLOS Neglected Tropical Diseases

Paul Brindley

co-Editor-in-Chief

PLOS Neglected Tropical Diseases

Associated Data

    Supplementary Materials

    S1 Table. Isolation and cultivation of the studied strains.

    (PDF)

    S2 Table. Primers used for sequencing.

    (PDF)

    S3 Table. Heights and posterior probabilities of select nodes shown in S2 Fig from the reconstructed Bayesian phylogeny of TBEV JCFs.

    (PDF)

    S1 Fig. GSAUC reliably detects simulated recombination event.

    (A) GS analysis of the artificially generated alignment with recombination; (B) GSAUC curves corresponding to the GS analysis.

    (TIF)

    S2 Fig. Nodes from the reconstructed Bayesian phylogeny of TBEV JCFs for which heights and posterior probabilities are presented in S3 Table.

    (TIF)

    Attachment

    Submitted filename: MinorRevision_ReviewerComments_20220706.docx

    Attachment

    Submitted filename: review_answer_PLOS.docx

    Attachment

    Submitted filename: review_answer_2.docx

    Data Availability Statement

    Python scripts used for calculation of GSAUC and alignments are available at https://github.com/gregoruar/tbev_rec. All other relevant data are within the manuscript and its Supporting Information files.


    Articles from PLOS Neglected Tropical Diseases are provided here courtesy of PLOS
