ABSTRACT
Begomoviruses (family Geminiviridae, genus Begomovirus) significantly hamper crop production and threaten food security around the world. The frequent emergence of new begomovirus genotypes is facilitated by high mutation frequencies and the propensity to recombine and reassort. Homologous recombination has been especially implicated in the emergence of novel cassava mosaic begomovirus (CMB) genotypes, which cause cassava mosaic disease (CMD). Cassava (Manihot esculenta) is a staple food crop throughout Africa and an important industrial crop in Asia, two continents where production is severely constrained by CMD. The CMD species complex is comprised of 11 bipartite begomovirus species with ample distribution throughout Africa and the Indian subcontinent. While recombination is regarded as a frequent occurrence for CMBs, a revised, systematic assessment of recombination and its impact on CMB phylogeny is currently lacking. We assembled data sets of all publicly available, full-length DNA-A (n = 880) and DNA-B (n = 369) nucleotide sequences from the 11 recognized CMB species. Phylogenetic networks and complementary recombination detection methods revealed extensive recombination among the CMB sequences. Six out of the 11 species descended from unique interspecies recombination events. Estimates of recombination and mutation rates revealed that all species experience mutation more frequently than recombination, but measures of population divergence indicate that recombination is largely responsible for the genetic differences between species. Our results support that recombination has significantly impacted the CMB phylogeny and has driven speciation in the CMD species complex.
IMPORTANCE Cassava mosaic disease (CMD) is a significant threat to cassava production throughout Africa and Asia. CMD is caused by a complex comprised of 11 recognized virus species exhibiting accelerated rates of evolution, driven by high frequencies of mutation and genetic exchange. Here, we present a systematic analysis of the contribution of genetic exchange to cassava mosaic virus species-level diversity. Most of these species emerged as a result of genetic exchange. This is the first study to report the significant impact of genetic exchange on speciation in a group of viruses.
KEYWORDS: evolution, geminivirus, homologous recombination, mutation, speciation
INTRODUCTION
Viruses in the Geminiviridae family are major constraints to agricultural crop production and pose serious threats to global food security, especially those in the genus Begomovirus (1). Begomoviruses are dicot-infecting, whitefly-transmitted pathogens that severely limit many economically important crops in tropical and subtropical regions around the world (2). Begomovirus genomes consist of either one (monopartite) or two (bipartite) circular single-stranded DNA (ssDNA) genetic segments, each independently encapsidated in twinned, quasi-icosahedral particles (3). There are 424 established begomovirus species in the 2019 International Committee on Taxonomy of Viruses (ICTV) master species list, the largest number of species for any virus genus. The frequent emergence of begomovirus genotypes and persistence of begomovirus disease epidemics is facilitated by increased agricultural trade of infected plant materials, the spread of polyphagous whitefly vector biotypes (1, 4, 5), and the accelerated rate of begomovirus evolution that stems from the vast amount of genetic diversity and the consequent adaptive potential found within populations (6).
Genetic diversity is generated by a combination of mutations and genetic exchange processes (i.e., recombination and reassortment). While mutations are the fundamental source of genetic variation, genetic exchange fuels diversity by combining extant mutations from distinct genomes to produce new haplotypes. Begomoviruses have high mutation frequencies (7) and substitution rates (comparable to those of RNA viruses) (8, 9) that independently enable the efficient exploration of both sequence space and adaptive landscapes under changing environmental conditions. However, recombination has also been extensively documented among begomoviruses and is implicated in the diversification of different disease complexes affecting a variety of crops (10–13). Recombination and reassortment can introduce significant variation in a single event and profoundly impact virus evolution by preventing the accumulation of deleterious mutations (14, 15) and potentially allowing access to novel phenotypes that would be difficult to attain by mutation alone. Some phenotypic modifications associated with genetic exchange in viruses include the modulation of virulence, novel strain emergence, evasion of host immunity, and antiviral resistance (16, 17). Therefore, examining patterns of viral genetic exchange is critical to understanding virus evolution and can help inform the development of control strategies.
Cassava mosaic begomoviruses (CMBs) are the causative agents of cassava mosaic disease (CMD), which frequently limits crop production of a staple food for ∼800 million people around the world (18). In 2019, Africa was the leading continent in terms of cassava production, accounting for over 63% of the 303 million tons produced, followed by Asia with 28% (https://www.fao.org/faostat/en/%23data/QC/). While the general resiliency of cassava against droughts and its tolerance of poor soil conditions have led to its widespread adoption in these regions, its susceptibility to CMD presents a major biotic constraint on production on these two continents. There are 11 identified species in the CMD species complex (Table 1). Nine CMB species are found in Africa: African cassava mosaic virus, African cassava mosaic Burkina Faso virus, Cassava mosaic Madagascar virus, South African cassava mosaic virus, East African cassava mosaic virus, East African cassava mosaic Cameroon virus, East African cassava mosaic Kenya virus, East African cassava mosaic Malawi virus, and East African cassava mosaic Zanzibar virus. Two additional CMB species, Indian cassava mosaic virus and Sri Lankan cassava mosaic virus, have been found exclusively in Asia. The African CMB species are extensively distributed throughout sub-Saharan Africa (19) and are one of the largest threats to cassava yield, accounting for up to $2.7 billion in annual losses (20). Although initial reports placed the Asian CMBs solely on the Indian subcontinent, SLCMV has expanded its distribution in recent years from India and Sri Lanka into Cambodia, Vietnam, Thailand, and China (21–24).
TABLE 1.
Continent | CMB species | Isolate abbreviation | Sample size (DNA-A) | Sample size (DNA-B) |
---|---|---|---|---|
Africa | African cassava mosaic Burkina Faso virus | ACMBFV | 4 | 1 |
Africa | African cassava mosaic virus | ACMV | 311 | 103 |
Africa | South African cassava mosaic virus | SACMV | 132 | 96 |
Africa | East African cassava mosaic virus | EACMV | 228 | 56 |
Africa | East African cassava mosaic Cameroon virus | EACMCV | 28 | 9 |
Africa | East African cassava mosaic Kenya virus | EACMKV | 114 | 67 |
Africa | East African cassava mosaic Malawi virus | EACMMV | 15 | 1 |
Africa | East African cassava mosaic Zanzibar virus | EACMZV | 18 | 13 |
Africa | Cassava mosaic Madagascar virus | CMMGV | 1 | 1 |
Africa | Total | | 851 | 347 |
Asia | Sri Lankan cassava mosaic virus | SLCMV | 19 | 12 |
Asia | Indian cassava mosaic virus | ICMV | 10 | 10 |
Asia | Total | | 29 | 22 |
CMB genomes are bipartite, comprised of two circular segments of similar size (∼2.8 kb), which are referred to as DNA-A and DNA-B. On the virion-sense strand of the ssDNA genome, DNA-A has two partially overlapping genes that encode the coat (AV1) and precoat (AV2) proteins. The complementary strand of DNA-A encodes the replication-associated protein (AC1), the transcriptional activator protein (AC2), a replication enhancer (AC3), and an RNA-silencing suppressor (AC4). The DNA-B segment encodes two proteins, a nuclear shuttle protein in the virion sense (BV1) and a movement protein in the complementary sense (BC1) (25, 26). Although genetically distinct, both segments share a region (CR) of ∼200 nucleotides (nt) that includes a stem-loop structure with the conserved nonanucleotide TAATATTAC where rolling-circle replication is initiated. Additionally, the CR contains several regulatory elements, including multiple copies of cis-elements, known as iterons, which are binding sites for the replication-associated protein (27).
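As an illustration of how the conserved nonanucleotide can be located in a circular segment, the following minimal Python sketch searches a sequence while allowing matches that span the arbitrary linearization point of a database record. It is not part of the analyses reported here; the function name `find_circular` and any sequences passed to it are hypothetical.

```python
from typing import List

NONANUCLEOTIDE = "TAATATTAC"  # conserved origin motif named in the text


def find_circular(genome: str, motif: str = NONANUCLEOTIDE) -> List[int]:
    """Return 0-based start positions of `motif` in a circular sequence."""
    genome = genome.upper()
    n = len(genome)
    if n == 0 or len(motif) > n:
        return []
    # Append the first len(motif) - 1 bases so that matches spanning the
    # linearization point (end of record back to position 0) are found.
    extended = genome + genome[: len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1:
        hits.append(start)
        start = extended.find(motif, start + 1)
    return hits
```

Because a circular genome has no natural first base, a motif that appears to be missing from a linear record may simply straddle the point at which the record was opened; the wrap-around search accounts for that case.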
Analyses from field samples have revealed that both CMB segments are frequently evolving through homologous recombination (“recombination” is presumed to be homologous recombination in this work) (28–34). Most notably, recombination contributed to the emergence of a highly virulent hybrid of ACMV and EACMV isolates known as EACMV-Uganda (EACMV-UG) that caused severe disease outbreaks in East and Central Africa in the 1990s (35, 36). Due to the frequent characterization of emergent recombinants and the fact that distinct CMBs are commonly found infecting the same plant (37–39), recombination is regarded as a widespread phenomenon that significantly impacts CMB biodiversity and evolution.
Here, we present a systematic analysis of recombination and its influence on the evolution of the CMD species complex. By applying several recombination analysis tools to data sets of publicly available CMB sequences, we mapped a complex recombination history where interspecies recombination events correlated with the emergence of most (6/11) CMB species. While mutation was estimated to occur more often than recombination in all our data sets, our findings support interspecies recombination as the main driver of diversity at a macroevolutionary scale.
RESULTS
A total of 880 full-length DNA-A sequences and 369 DNA-B sequences from the 11 established CMB species were downloaded from NCBI GenBank in July 2019 (Table 1). The DNA-A isolates were classified based on the begomovirus 91% nucleotide identity species demarcation threshold. Pairwise nucleotide identity comparisons (see File S1 in the supplemental material) resulted in the reassignment of four isolates previously identified as EACMV sequences to EACMCV (accession numbers AY211887, AY795983, JX473582, and MG250164). Because the species definition does not extend to the DNA-B segment, DNA-B sequences were identified according to their species designation in GenBank (DNA-B segments are typically classified based on DNA-A sequences isolated from the same host sample or by highest nucleotide identity to an extant DNA-B sequence when no corresponding DNA-A sequence is available). Our data sets are imbalanced with respect to genomic segment (DNA-B is less frequently sampled than DNA-A) and geography (the sample size was larger for African CMBs than Asian CMBs). We present results for DNA-A followed by DNA-B.
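To make the demarcation rule concrete, the following is a minimal Python sketch of threshold-based species assignment. It is not the procedure used to generate File S1: it assumes that sequences are already aligned to equal length, it skips columns in which either sequence has a gap (one of several possible conventions), and the function names and reference labels are hypothetical.

```python
from typing import Dict, Optional

SPECIES_THRESHOLD = 0.91  # begomovirus DNA-A species demarcation threshold


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical bases over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumed convention: ignore gapped columns
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


def assign_species(query: str, references: Dict[str, str]) -> Optional[str]:
    """Return the label of the most similar reference if it meets the threshold."""
    best_name = None
    best_identity = 0.0
    for name, ref in references.items():
        identity = pairwise_identity(query, ref)
        if identity > best_identity:
            best_name, best_identity = name, identity
    return best_name if best_identity >= SPECIES_THRESHOLD else None
```

A query that falls below 91% identity to every reference returns `None`, which in practice would flag a candidate new species or a sequence requiring closer inspection.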
Likely recombinant origin for 6 of 11 CMB species.
Since recombination is a major contributor to begomovirus evolution, standard phylogenetic approaches cannot fully recapitulate the evolutionary history of CMBs. Therefore, we used a split-network analysis to examine evolutionary relationships within the CMB phylogeny. The network (Fig. 1A) showed most sequences in tight clusters based on the 11 species. Some divergent isolates were found near the main clusters in the SACMV, EACMKV, and EACMV clades, suggestive of phylogenetic conflict and, potentially, recombination causing the divergence in those sequences. Multiple edges connecting branches of SLCMV and ICMV isolates indicate complicated patterns of recombination among the Asian CMBs, consistent with previous reports (40). The highly reticulate structure of the network implies an extensive history of recombination, both within and between species.
To further explore and characterize recombination among the CMB DNA-A sequences, the all-species alignment (n = 880) was analyzed using RDP4. An initial scan did not detect recombination between the Asian and African sequences. We split these sequences into two data sets (African, n = 851; Asian, n = 29) with the rationale that reducing the number of gaps in the alignments would improve accuracy of recombination detection. We performed RDP4 analysis on the two multiple alignments separately (with stringent settings, described in Materials and Methods) and identified a total of 24 high-confidence recombination events (Table 2), 16 for the African CMB data set and 8 for the Asian data set. Six unique events were supported in all representatives of individual species (depicted in Fig. 1B), suggesting a recombinant origin for 6 out of the 11 species: ACMBFV, EACMCV, EACMKV, EACMMV, EACMZV, and SLCMV. We refer to these events as macroevolutionary based on the hypothesis, discussed below, that the recombination events led to the original splitting of each relevant species cluster from “parent” species clusters. Most of these events have been reported previously, except for the one associated with SLCMV. Similarity plots for all 24 high-confidence events using the best candidate parental sequences identified by RDP4 are presented in File S2 (Fig. S1 to S24).
TABLE 2.
Event type | Event no. | Recombinant | Region begin | Region end | Major parent | Minor parent | Method^c | P value^d |
---|---|---|---|---|---|---|---|---|
Macroevolutionary | 1 | ACMBFV | 1730 | 71 | ACMV | Unknown (ToLCCMV)^b | GBMCST | 6.8 × 10^−7 |
Macroevolutionary | 2 | EACMCV | 1066 | 1834 | EACMV | Unknown | RGMCST | 3.1 × 10^−9 |
Macroevolutionary | 3 | EACMKV | 38 | 1076 | SACMV | EACMCV | RGMCST | 1.2 × 10^−18 |
Macroevolutionary | 4 | EACMMV | 1986 | 16 | SACMV | EACMCV | RGMST | 3.1 × 10^−9 |
Macroevolutionary | 5 | EACMZV | 2085 | 2762 | EACMKV | CMMGV | RGMCST | 2.9 × 10^−16 |
Macroevolutionary | 6 | SLCMV | 1346^a | 2736^a | ICMV | Unknown | RBMCST | 1.2 × 10^−5 |
Other | 7 | EACMV-UG | 544 | 1008 | EACMV | ACMV | RGBMCST | 6.2 × 10^−9 |
Other | 8 | SACMV | 501 | 906^a | SACMV | EACMCV | RGBMCST | 3.1 × 10^−9 |
Other | 9 | SACMV | 133 | 445 | SACMV | ACMV | RBMCST | 5.0 × 10^−8 |
Other | 10 | SACMV | 510 | 1103 | SACMV | CMMGV | RGBMCST | 1.2 × 10^−3 |
Other | 11 | EACMKV | 1839 | 2776 | EACMV | SACMV | RGBMCST | 9.7 × 10^−13 |
Other | 12 | EACMKV | 550 | 1053 | EACMKV | SACMV | RBMCST | 2.4 × 10^−13 |
Other | 13 | EACMKV | 1099^a | 1840 | SACMV | Unknown (EACMCV^b) | RGBMCST | 3.1 × 10^−9 |
Other | 14 | EACMKV | 591 | 1156 | EACMV | CMMGV | RGBMCST | 4.8 × 10^−13 |
Other | 15 | EACMKV | 766 | 1045^a | EACMKV | SACMV | RGBMCST | 1.4 × 10^−5 |
Other | 16 | EACMV | 1619 | 2081 | EACMV | SACMV | RGMCST | 2.9 × 10^−6 |
Other | 17 | EACMCV | 19 | 186^a | EACMV | Unknown | RGBMCST | 3.8 × 10^−3 |
Other | 18 | ICMV | 1770 | 15 | SLCMV | ICMV | RGBMCST | 2.5 × 10^−2 |
Other | 19 | ICMV | 1339 | 1869 | ICMV | SLCMV | RGBMCST | 2.9 × 10^−7 |
Other | 20 | SLCMV | 130 | 1305 | SLCMV | ICMV | RGBMCST | 1.1 × 10^−3 |
Other | 21 | SLCMV | 1339 | 2734 | ICMV | SLCMV | RGBMCST | 7.4 × 10^−16 |
Other | 22 | ICMV | 6 | 280 | ICMV | Unknown | RGBMCST | 4.8 × 10^−4 |
Other | 23 | SLCMV | 16^a | 516^a | SLCMV | Unknown | RGBMCST | 9.0 × 10^−5 |
Other | 24 | ICMV | 2719 | 329 | ICMV | ICMV | RGMCST | 6.7 × 10^−3 |
^a Actual breakpoint is undetermined; most likely overprinted by subsequent recombination event.
^b BLASTn result with highest percent identity to fragment.
^c R, RDP; G, GeneConv; B, Bootscan; M, MaxChi; C, Chimaera; S, SisScan; T, 3SEQ.
^d Corresponds to the underlined method, which displayed the highest P value (i.e., weakest evidence against the null hypothesis) under the Bonferroni-corrected 0.05 threshold.
As in Tiendrébéogo et al. (32), ACMBFV was identified as a recombinant of ACMV with a recombinant fragment spanning most of the AC1 open reading frame (ORF), the entire AC4 ORF, and a portion of the CR. Despite RDP4 choosing CMMGV as the minor parent for this event in our analysis, low nucleotide identity (<80%) within the recombinant region makes it an unlikely parental sequence (Fig. S1). BLAST analysis of the recombinant portion identified a tomato leaf curl Cameroon virus (ToLCCMV) sequence as the closest relative currently in GenBank, which is consistent with the previous report. The EACMCV and EACMZV macroevolutionary recombination events (events 2 and 5) corroborate results from previous recombination analyses where they were characterized as recombinants through pairwise sequence comparisons (29, 30). No significant virus donor was identified for the EACMCV recombinant fragment, but its major parent was likely EACMV. EACMMV has been described as an EACMV-like recombinant (28, 41), yet RDP4 suggested SACMV as the most likely major parent in our analysis. The conflict between these results is most likely due to the very high degree of similarity between the regions covering AC3, AC2, and the 3′ end of AC1 in SACMV and EACMV (Fig. 2A), which suggests a shared evolutionary history for that region among the two species. Since high similarity can confound recombination detection, it becomes hard to unambiguously detect correct breakpoints and potential parental sequences. However, analysis with similarity plots showing a drop-off in similarity at one of the boundaries of this region between EACMV and EACMMV points to SACMV being the more likely major parental species (scenario 1, illustrated in Fig. 2B). The high-sequence-similarity region also affects candidate parent sequence identification for EACMKV (Fig. 2C, discussed below).
Curiously, the single available CMMGV sequence, which has been previously characterized as recombinant (33), did not display any putative recombinant regions within its genome. It was reported that CMMGV had minor fragments donated by both SACMV and EACMZV-like sequences. However, a close examination using the distance plot and phylogenetic tree construction tools in RDP4 revealed that only one SACMV sequence (the minor parent identified by Harimalala et al. [33] and the first SACMV isolate ever fully sequenced [42]; accession number AF155806) had high similarity in the AV1 recombinant region with CMMGV (event 10; Fig. S10), whereas all other SACMV genomes did not. As a result, it seems more plausible that CMMGV acted as a donor to that single SACMV isolate. In the case of the second minor fragment, our RDP4 analysis suggested that CMMGV was the donor virus and EACMZV the recipient (event 5; Fig. S5), contrary to what was argued previously (33). At the moment, we cannot distinguish the direction in which the fragment was donated, so there is no definitive evidence as to whether the sole CMMGV isolate represents a recombinant species.
The results showed frequent recombination between ICMV and SLCMV, which made it difficult to resolve the recombination profiles within the Asian CMB data set. This issue is suggested by the statistically undetermined breakpoints in the SLCMV species-wide event involving ICMV and an unknown minor parent (event 6; Fig. S6), which points to likely overprinting by subsequent recombination events. A total of 16 out of 19 SLCMV sequences were predicted to be descendants of this event. However, the three remaining SLCMV isolates (accession numbers AJ314737, KP455484, and AJ890226) showed evidence of a similar event between ICMV and an SLCMV-like isolate with almost the same breakpoints (event 21; Fig. S21). Altogether, these results indicate a recombinant origin for SLCMV.
Other high-confidence DNA-A events confirm previously described recombinants.
In addition to the six macroevolutionary events, 18 other events were detected in the DNA-A data sets (Table 2). Among these events, the most well-represented event was that of the frequently studied EACMV-UG recombinant (event 7; Fig. S7), found in 97 of the 228 EACMV sequences. Of the 10 other nonmacroevolutionary events in the African data set, most were associated with either EACMKV or SACMV as the recombinants (5 and 3 events, respectively). Recombinants with evidence of events 8, 9, 12, 13, 14, 15, and 17 were collected in one of the most comprehensive CMB sampling studies to date, which took place in Madagascar (34). The EACMKV isolate in event 13 (accession number KJ888083) presented an interesting case, as it was classified as EACMKV by having 91.02% nucleotide identity to only one other EACMKV isolate (accession number KJ888079; File S1, Table S1), suggesting that the divergence caused by the event leaves the sequence only narrowly satisfying the criterion for classification as EACMKV. A BLAST analysis revealed an EACMCV isolate from Madagascar (accession number KJ888077) as a highly similar recombinant donor, which was not identified by RDP4 as a parent despite being present in the data set.
Recombination event 11 (detected only in EACMKV accession number JF909125) could conflict with the recombinant origin for all other EACMKVs, as all EACMKV sequences can match both this and the profile suggested in event 3 (Fig. 2B). We maintain that event 3 is the more likely origin of the EACMKV species on the basis that it was detected in all EACMKV sequences in our RDP4 analysis. However, the alternative recombinant origin where an EACMV sequence acts as the major parent (event 11) is consistent with the first characterization of an EACMKV isolate (31) and remains a possibility for all other isolates. This highlights once more the challenge in characterizing events involving SACMV and EACMV-like sequences due to their region of high similarity (Fig. 2A). No evidence of recombination was found among EACMZV and EACMMV isolates.
Despite having a smaller pool of sequences, 4 genetic exchanges (events 18 to 21) between SLCMV and ICMV were identified in the Asian data set, indicating a recombination-prone history between the species (illustrated in Fig. 1).
Mutation occurs more frequently than recombination within the DNA-A segment of all CMB species.
We estimated nucleotide diversity (π) within all species (except for CMMGV and ACMBFV, which each had fewer than 5 sequences; Table 1) as a measure of standing genetic diversity. Nucleotide diversity for all species was within the same order of magnitude and ranged from 0.012 (for EACMMV) to 0.074 (for ICMV) (Table 3). No associations were observed between diversity and sample size. Additionally, we estimated per-generation, population-scaled rates of recombination (ρ) and mutation (θ) to assess the frequency of recombination within each species relative to mutation (ρ/θ). We further tested for the presence of recombination by calculating correlations between estimates of linkage disequilibrium (r²) and physical distance (d) and used a likelihood permutation test (LPT) of recombination (Table 3) with LDhat (43).
TABLE 3.
Clade | No. of sequences | π | S | Correlation between r² and d | LPT P value | ρ | θW | ρ/θW |
---|---|---|---|---|---|---|---|---|
ACMV | 311 | 0.033 | 1317 | −0.041 | 0 | 46 | 208.54 | 0.22 |
SACMV | 132 | 0.014 | 863 | −0.104 | 0 | 28 | 158.17 | 0.18 |
EACMV | 228 | 0.057 | 1249 | −0.072 | 0 | 6 | 208.02 | 0.029 |
EACMCV | 28 | 0.048 | 712 | −0.176 | 0 | 4 | 182.97 | 0.022 |
EACMKV | 114 | 0.043 | 986 | −0.188 | 0 | 13 | 185.72 | 0.070 |
EACMZV | 18 | 0.031 | 352 | −0.036 | 0 | 2 | 102.34 | 0.020 |
EACMMV | 15 | 0.012 | 146 | −0.01 | 0.68 | 0 | 44.9 | NA |
SLCMV | 19 | 0.024 | 349 | −0.147 | 0 | 2 | 99.85 | 0.020 |
ICMV | 10 | 0.074 | 533 | −0.043 | 0 | 4 | 188.41 | 0.021 |
π, nucleotide diversity (average number of pairwise differences per site for samples within a clade). S, number of segregating sites. r², square of the correlation coefficient between sites. d, physical distance. LPT, likelihood permutation test for the presence of recombination. ρ, population-scaled recombination rate. θW, Watterson’s infinite-sites estimator of the population-scaled mutation rate (θ).
The correlation between r² and d was negative across all data sets, consistent with the expectation that linkage disequilibrium decays with increasing distance in the presence of recombination. The LPT indicated recombination in all species except EACMMV, which was consistent with the ρ = 0 estimate for that species. Across all populations, mutation occurred more frequently than recombination, as shown by ρ/θ ratios below 1 (typically <0.03). Interestingly, the highest ρ/θ value was observed for ACMV (0.22), which was involved in three interspecies recombination events detected thus far (events 1, 7, and 9) but in none within the species. SACMV and EACMKV were the other two clades with higher contributions of recombination and were also the two most featured species in our RDP4 results for the African CMB sequence alignment.
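For readers less familiar with these summary statistics, the sketch below computes π, S, and Watterson's estimator in their textbook forms, with θW = S / a_n and a_n = 1 + 1/2 + ... + 1/(n − 1). It is an illustrative Python sketch under simplifying assumptions (sequences already aligned, gapped positions skipped) and is not the LDhat-based procedure used for Table 3; all function names are hypothetical. Applied to the ACMV row (S = 1317, n = 311), the textbook formula gives θW of approximately 208.5, close to the tabulated value, and dividing ρ = 46 by that value gives a ρ/θW of approximately 0.22.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """pi: mean per-site difference over all pairs of aligned sequences."""
    per_pair = []
    for a, b in combinations(seqs, 2):
        # Assumed convention: skip columns where either sequence has a gap.
        columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if columns:
            per_pair.append(sum(x != y for x, y in columns) / len(columns))
    if not per_pair:
        raise ValueError("need at least two sequences with comparable sites")
    return sum(per_pair) / len(per_pair)


def segregating_sites(seqs: Sequence[str]) -> int:
    """S: number of alignment columns with more than one non-gap base."""
    return sum(len(set(col) - {"-"}) > 1 for col in zip(*seqs))


def watterson_theta(s: int, n: int) -> float:
    """theta_W = S / a_n, where a_n = 1 + 1/2 + ... + 1/(n - 1)."""
    if n < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n))
    return s / a_n
```

Note that θW in this form is expressed per segment rather than per site, which is why the tabulated values are in the hundreds whereas π is a per-site quantity.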
Sequence divergence between DNA-A recombinant species and their hypothesized major parents suggests interspecies recombination as the major contributor to phylogenetic divergence.
The average number of pairwise nucleotide differences per site within species (π) and between recombinant and predicted major parental species (DXY) were estimated in sliding windows to assess the effect of the macroevolutionary recombination events on phylogenetic divergence (Fig. 3). In every comparison, there was a pronounced increase over the genome-wide average of DXY in regions associated with macroevolutionary recombination events. This suggests appreciably different evolutionary histories in those regions compared to the rest of the genome, which supports species-wide recombination events as drivers of greater divergence than mutational and other minor recombination events. A noticeable peak in DXY within the CR and 5′ end of AV2 in the EACMCV-EACMV comparison was observed (Fig. 3). This region was detectably recombinant in one EACMCV sequence (event 17; accession number KJ888049). Close examination of the alignment in this region suggested that 23 of the remaining 27 EACMCV sequences have an undetected recombination event in this region but likely with different breakpoints from event 17. All samples with evidence of the undetected event and event 17 were sampled in West Africa, Comoros, or Madagascar, while the three EACMCV isolates without a recombination event in that part of the genome were sampled in East Africa. This supports the hypothesis that EACMCV originated in East Africa and acquired a second recombinant fragment in the West African isolates (41), and it is possible that the West African genotype has now been introduced to the Comoros and Madagascar. The uncharacterized event and event 17 clearly have contributed to the divergence within EACMCV (as evidenced by a spike in EACMCV nucleotide diversity; Fig. 3) and between EACMV and EACMCV. Similarly, a downstream increase in DXY and π for EACMV within the AV1 3′ end was observed, corresponding to the region of the EACMV-UG recombination event (event 7).
While these are examples of how small recombination events have contributed to the phylogenetic divergence between species, our results show that the larger, ancestral interspecies recombination events are the driving force behind evolutionary divergence at the CMB species level.
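The sliding-window comparison described above can be sketched as follows. This is a minimal Python illustration, not the software or settings used to produce Fig. 3: it assumes gap-free aligned sequences, treats the alignment as linear (windows do not wrap around the circular genome), and uses hypothetical function names and arbitrary window and step sizes.

```python
from typing import List, Sequence, Tuple


def dxy(pop_x: Sequence[str], pop_y: Sequence[str], start: int, end: int) -> float:
    """Mean per-site difference over all X-Y sequence pairs within [start, end)."""
    width = end - start
    total = 0.0
    for a in pop_x:
        for b in pop_y:
            diffs = sum(x != y for x, y in zip(a[start:end], b[start:end]))
            total += diffs / width
    return total / (len(pop_x) * len(pop_y))


def sliding_dxy(
    pop_x: Sequence[str], pop_y: Sequence[str], window: int, step: int
) -> List[Tuple[int, float]]:
    """Return (window start, DXY) for each full window along the alignment."""
    length = len(pop_x[0])
    return [
        (s, dxy(pop_x, pop_y, s, s + window))
        for s in range(0, length - window + 1, step)
    ]
```

A window in which DXY rises well above the genome-wide average is the pattern expected where one population carries a fragment of different ancestry, which is how the recombinant regions stand out in the comparisons above.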
Fewer high-confidence DNA-B recombination events.
Due to the high levels of divergence between DNA-B isolates, an all-species DNA-B alignment was difficult to construct. Therefore, we split the DNA-B sequences into three broad groups, EACMV-like (EACMV, SACMV, EACMKV, EACMMV, EACMZV, and EACMCV) plus CMMGV (n = 243), ICMV-SLCMV (n = 22), and ACMV-ACMBFV (n = 104), and conducted phylogenetic network (Fig. 4A) and RDP4 (Table 4) analyses on each group separately. For the EACMV-like group, we observe a network with sporadic reticulations indicating some recombination. The DNA-B sequences from most species do not form monophyletic clades, with isolates from EACMV, EACMKV, and SACMV spread out around the network. Isolates from EACMCV, which have been reported as clearly distinct from the rest of the EACMV-like DNA-B segments (44), are separated from the center of the network by long branches, indicating large genetic distances between them and the rest of the EACMV-like DNA-B segments. Similarly, a long branch separates CMMGV from all other clusters.
TABLE 4.
Event no. | Recombinant | Region begin | Region end | Major parent | Minor parent | Methods^b | P value^c |
---|---|---|---|---|---|---|---|
B1 | CMMGV | 1564 | 2730 | EACMKV | Unknown | RGBMCST | 1.5 × 10^−10 |
B2 | SLCMV | 2596 | 2714 | ICMV | Unknown (SLCMV DNA-A)^a | RGMCST | 1.5 × 10^−5 |
B3 | EACMKV, EACMV | 861 | 1650 | EACMV | EACMV | RMCST | 8.2 × 10^−4 |
B4 | EACMZV | 504 | 1535 | EACMZV | EACMZV | RGBMCST | 7.8 × 10^−3 |
B5 | EACMV | 2740 | 2113 | EACMV | EACMZV | RGMST | 2.8 × 10^−2 |
B6 | SACMV | 2228 | 2292 | SACMV | Unknown | RGBMCT | 2.7 × 10^−2 |
B7 | EACMKV | 1123 | 1458 | EACMKV | EACMCV | RGBMCST | 8.8 × 10^−10 |
B8 | ICMV | 1871 | 2527 | ICMV | Unknown | RGBMCST | 3.7 × 10^−2 |
B9 | ICMV | 46 | 265 | ICMV | Unknown | RGMCT | 4.7 × 10^−11 |
^a BLASTn result with highest percent identity to fragment.
^b R, RDP; G, GeneConv; B, Bootscan; M, MaxChi; C, Chimaera; S, SisScan; T, 3SEQ.
^c Corresponds to the underlined method, which displayed the highest P value (i.e., weakest evidence against the null hypothesis) under the Bonferroni-corrected 0.05 threshold.
The ICMV-SLCMV network is more compact than the EACMV-like group, signifying a higher degree of genetic similarity between all isolates. All the SLCMVs are closely related to one another, and the branches for both SLCMV and ICMV isolates show some reticulation. The ACMV-ACMBFV sequences are also genetically very similar and have the least reticulation of the three networks (Fig. 4A).
A total of 9 recombination events were identified in the DNA-B data sets: 6 events in the EACMV-like group and 3 events in the ICMV-SLCMV group (designated B1 to B9) (Table 4). No events were detected in the ACMV-ACMBFV sequences. Similarity plots for all 9 high-confidence events using the best candidate parental sequences identified by RDP4 are presented in File S2 (Fig. S25 to S33).
Of the 9 events, 2 could be considered ancestral clade-founding events (Fig. 4B). We refer to these events as clade-founding rather than macroevolutionary to emphasize that this classification is distinct from the DNA-A-based species definition. Event B1 is associated with CMMGV (Fig. S25), which had an EACMKV isolate as a closely related major parent and a recombinant fragment from an unknown virus that spanned most of BC1 and the 5′ portion of the CR. This event was previously reported (33). Event B2 was associated with SLCMV (Fig. S26), where all 12 sequences had evidence of the event. In this event, ICMV was observed as a major parent with a fragment in the 5′-CR from an unknown parent. From a BLAST analysis, we identified that the fragment most likely originated from an SLCMV DNA-A sequence. This event has been described before and is believed to explain the evolution of SLCMV from a putative monopartite begomovirus, where an SLCMV-like sequence captured an ICMV DNA-B segment by donating the Rep-binding iteron sequences necessary for replication (45).
In addition to these two well-supported clade-founding events, there are two other recombination events that may have had a similar impact. Event B3 is observed in 8 different sequences classified as either EACMKV or EACMV, which suggests recombination was the mechanism of emergence for this small circulating clade. Event B7 is another example of an event that possibly led to the emergence of a small clade, identified in two EACMKV isolates (accession numbers JF909228 and JF909227) collected in the Seychelles archipelago (46). A recombination event related to the distinct EACMCV DNA-B clade was not substantiated under the stringent settings used in this analysis, but a recombinant profile for EACMCV DNA-B is referenced in Fondong et al. (29).
Mutation occurs more frequently than recombination within CMB DNA-B groups.
Nucleotide diversity and rates of mutation and recombination were estimated for ACMV-ACMBFV, EACMCV, and ICMV-SLCMV groupings (Table 5). The high degree of similarity within these groups justifies them being defined as individual populations for these analyses. The other species were not included in this analysis because of our inability to define meaningful populations. Nucleotide diversity estimates for the ACMV-ACMBFV DNA-B cluster were higher (0.067) than for ACMV DNA-A (0.033) (Table 3). The same was observed for the EACMCV DNA-A and DNA-B segments (0.048 and 0.088, respectively). However, we estimate a slightly higher standing genetic variation in the ICMV DNA-A sequences (0.074) than in the ICMV-SLCMV DNA-B group (0.062), indicating comparable levels of variability among the examined sequences.
TABLE 5.
Clade | No. of sequences | π | S | Correlation (r2, d) | LPT P value | ρ | θW | ρ/θW |
---|---|---|---|---|---|---|---|---|
ACMV-ACMBFV | 104 | 0.067 | 1335 | −0.009 | 0.003 | 28 | 193.31 | 0.145 |
EACMCV | 9 | 0.088 | 711 | −0.005 | 0 | 3 | 261.60 | 0.011 |
ICMV-SLCMV | 22 | 0.062 | 881 | −0.036 | 0 | 2 | 241.68 | 0.008 |
π, nucleotide diversity (average number of pairwise differences per site for samples within a clade). S, number of segregating sites. Correlation (r2, d), correlation between r2, the square of the correlation coefficient between sites, and d, the physical distance between those sites. LPT, likelihood permutation test for the presence of recombination. ρ, population-scaled recombination rate. θW, Watterson’s infinite-sites estimator of the population-scaled mutation rate (θ).
We found evidence of linkage disequilibrium decay in all three data sets using the r2 measure, and the LPT indicated the presence of recombination in all groups. The ρ/θ ratio ranged from 0.008 to 0.145, showing that mutation is much more frequent than recombination for DNA-B. Overall, these within-group results for DNA-B (Table 5) were very similar to those for DNA-A (Table 3).
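The linkage disequilibrium decay reported above can be illustrated with a minimal Python sketch. It computes r2 between every pair of biallelic alignment columns and then the Pearson correlation between r2 and the distance between the columns; a negative value corresponds to decay with distance. This is only an illustration under stated assumptions (gap-free biallelic columns, linear rather than circular distance, hypothetical function names) and is not the PAIRWISE implementation used in the study.

```python
import math
from itertools import combinations


def biallelic_sites(alignment):
    """Return (position, column) for gap-free columns with exactly two nucleotide states."""
    sites = []
    for pos in range(len(alignment[0])):
        col = [seq[pos].upper() for seq in alignment]
        states = set(col)
        if states <= set("ACGT") and len(states) == 2:
            sites.append((pos, col))
    return sites


def r_squared(col_a, col_b):
    """Squared correlation coefficient between two biallelic columns."""
    n = len(col_a)
    a_ref, b_ref = col_a[0], col_b[0]
    p_a = sum(x == a_ref for x in col_a) / n
    p_b = sum(y == b_ref for y in col_b) / n
    p_ab = sum(x == a_ref and y == b_ref for x, y in zip(col_a, col_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def pearson(xs, ys):
    """Pearson correlation; returns None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def ld_distance_correlation(alignment):
    """Correlation between r2 and physical distance over all pairs of biallelic sites."""
    r2_values, distances = [], []
    for (pos_i, col_i), (pos_j, col_j) in combinations(biallelic_sites(alignment), 2):
        r2_values.append(r_squared(col_i, col_j))
        distances.append(pos_j - pos_i)  # assumption: linear, not circular, distance
    return pearson(r2_values, distances)
```

In a toy alignment such as `["AAA", "AAT", "TTA", "TTT"]`, the two adjacent, perfectly linked columns give r2 = 1, while pairs involving the third column give r2 = 0, so the correlation with distance is negative.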
DISCUSSION
Recombination is an important and pervasive mechanism that contributes significantly to plant virus evolution (47) and is broadly documented among begomovirus species (10–13, 16). Our updated recombination profile of all sampled CMB full genomic segments to date reveals widespread intra- and interspecies recombination. A variety of complementary recombination analyses indicate that the majority of CMB species (6/11) have a recombinant origin. For the first time, we show a recombinant origin of SLCMV DNA-A, which likely descended from genetic exchange between an ICMV-like isolate and an unidentified begomovirus DNA-A segment. Surprisingly, our analysis did not support a recombinant origin of the single isolate of CMMGV, although it had been considered previously to be the product of genetic exchange between a major parent distantly related to CMBs and minor parents SACMV and EACMZV (33). Instead, our analyses consider CMMGV a parental virus, contributing to the creation of EACMZV and an SACMV recombinant (42). Although no macroevolutionary signals of recombination were detected in SACMV, ACMV, CMMGV, EACMV, and ICMV, it is possible that events associated with their emergence occurred so long ago that the distinguishing patterns of polymorphism created by recombination have been erased by subsequent mutations and cannot be detected. In the case of SACMV, an argument has been made for it having a recombinant origin based on molecular analyses and phylogenetic incongruencies observed in different parts of the genome where the AV1 ORF and CR resemble tomato yellow leaf curl virus isolates, the AC2, AC3, and AC1 3′ end are closely related to EACMV, and the 5′ end of the AC1/AC4 ORF portion seems to have a distinct evolutionary history (41, 42, 48, 49). Regardless of the undetectable contribution of recombination to all CMB DNA-As, we have strong evidence for recombination leading to speciation in the majority of currently defined CMB species.
Although parentage cannot be definitively established in some cases, fragments derived from EACMKV and SACMV lineages seem to have a high propensity for interspecies recombination (34). We also observe frequent recombination between both Asian CMB species, which supports past reporting of ICMV and SLCMV as a highly recombinogenic pair (50). Unsurprisingly, no recombination was detected between isolates originating in Asia and those from Africa. At this moment, there are no reports of Asian CMBs infecting cassava crops in Africa, and there is only one study where an African CMB (i.e., EACMZV) has been sampled in cassava crops in Asia, specifically in the West Asian country of Oman (51), where ICMV and SLCMV have never been identified. While we lack experimental or field evidence that Asian and African CMB species have the ability to recombine and produce viable viral progeny, it is probable that these viruses have had limited opportunities to coinfect the same host plant. However, ACMV isolates have been recovered in cotton crops in the South Asian country of Pakistan, and ACMV recombinant fragments have been detected in cotton-infecting begomoviruses, even though cassava is not cultivated there (52). This suggests an exchange of CMB species between Africa and southern Asia and hints at the role of alternative plant hosts in the emergence of interspecies recombinant begomoviruses (53). CMD management presents a unique challenge in that infected stem cuttings and rootstocks can mediate the long-distance spread of CMBs (1), as has been documented with the introduction of SLCMV into China (23, 24). Continuous CMB surveillance efforts are needed to ensure endemic viral species are not spread across continents via international trade and are not given the opportunity to spread and potentially recombine with native begomoviruses. 
Countries heavily involved in agricultural trade, such as Oman, should receive special attention, as they can become a sink for divergent begomovirus species and potentially a hub for the emergence of novel recombinant begomoviruses (54).
While mutation is more frequent, retained recombination events are more significant.
Estimates of the DNA-A relative rates of recombination and mutation (ρ/θ) show that mutations occur more often than recombination within all the analyzed clades, which is consistent with previous analysis based on the Rep and CP genes of other begomovirus species (55). Notably, while no recombination was readily detected in any of the ACMV sequences with RDP4, the LPT detected a signal of recombination, and our ACMV DNA-A data set had the highest frequency of genetic exchange relative to mutation. Since recombination signals were detected within the ACMV sequences, we interpret these results collectively to mean most ACMV recombination is intraspecific (illustrated in Fig. 1), which is difficult to detect with the methods used by RDP4. The lack of ACMV recombinants involving other CMB species, which has been mentioned in the literature (20, 41), might be explained by potential genome incompatibility and selection against mosaic sequences where donor fragments from divergent CMBs could disrupt intragenomic interactions and gene coadaptation (56, 57).
Mutation is clearly more frequent than recombination within all CMB species, confirming the conclusions of previous studies that show that the genetic diversity in begomovirus populations is predominantly shaped by mutation (10, 55). Some of the diversity we attribute to mutation could be introduced by recombination between nearly identical viruses, and, indeed, recombination between identical or nearly identical viruses, even if very frequent (58), will never be detected by any of our methods. However, the relative contributions of these processes to the evolution of CMBs are not necessarily a function of their frequency. The DXY sliding-window plots reveal that single recombination events are correlated with most of the divergence between putative parental species and their recombinant progeny species (Fig. 3). We conclude that the higher rates of mutation relative to recombination on a microevolutionary scale are not reflective of the influence of recombination at the macroevolutionary scale, where interspecies recombination is the driving force behind the emergence of the majority of CMB species. While reports of begomovirus species that have emerged through recombination are common (59–63), this is the first time a systematic analysis of recombination and its contribution to species diversity has been performed within all known species of a begomovirus disease complex.
Although “speciation” does not directly apply to DNA-B sequences, it is clear that intersegment recombination has played a significant role in the evolution of viruses such as SLCMV. Indeed, the transreplicational capture of divergent DNA-B segments/satellite molecules by DNA-A sequences via recombination involving Rep-binding sites is a documented mechanism that has led to new associations resulting in different disease phenotypes (45, 64, 65). Additionally, it represents a plausible explanation for the potential evolutionary transition to bipartite begomoviruses from monopartite ancestors (44). Ultimately, this phenomenon has contributed and continues to contribute to the genomic modularity of begomoviruses, which in turn can influence their evolvability.
The phylogenetic networks among the DNA-B groups (Fig. 4) were comparatively less reticulated than the DNA-A network (Fig. 1), and fewer high-confidence recombination events and recombinants were detected. Despite these comparisons, it is not yet clear if begomovirus DNA-B sequences are more or less prone to recombination than DNA-A more broadly. The expectation is that the genomic structure of DNA-B segments (where there are no overlapping genes and a larger proportion of noncoding regions relative to the highly overlapping and mostly coding DNA-A segments) imposes fewer selective constraints on recombinants than in DNA-A sequences (44, 66) and can tolerate greater nucleotide diversity, which we do observe (Tables 3 and 5). However, there are several factors that might explain why fewer recombinants were detected among the DNA-B data sets. No reliable, complete alignments were obtained due to the high divergence of DNA-B segments, hindering our ability to characterize recombination events using RDP4. RDP4 analyses were also explicitly set up to be conservative and to test for intrasegment recombination in this study, so additional events involving DNA-A and DNA-B segments might have been missed. Additionally, CMB DNA-B sequences are infrequently sampled compared to DNA-A sequences (Table 1), which reduces our ability to detect recombination both by having a lower number of exemplar parental sequences in our data sets and by having fewer representative DNA-B sequences. A recent study suggests that recombination occurs more frequently in DNA-B segments of New World begomoviruses relative to their cognate DNA-As, but this pattern may be virus specific (67). Moreover, previous research suggests that DNA-A and DNA-B sequences have been subjected to different evolutionary pressures, which have resulted in distinct evolutionary histories for the two segments, with further segment-specific differences found between New World and Old World begomoviruses (44). 
Regardless of the absolute rate of recombination differences that may exist between DNA-A and DNA-B segments, ρ/θ values for groups of DNA-B sequences follow the trend of DNA-A segments, which suggests that mutations occur more frequently than recombination events.
Viral recombination events are often deleterious (68–70). However, previous studies of begomovirus recombination have shown that there is a subset of fit recombinants that can be generated in the laboratory (71, 72) and be observed in nature (73). Recombination can additionally recover functional full-length genomes from populations with defective geminiviruses (70, 74, 75). Since begomovirus phenotypes associated with recombination include altered disease severity (76, 77), host range expansion (77, 78), and resistance breaking (79, 80), recombination is also a major contributor to the epidemic potential of these viruses. Consequently, recombination is a markedly important evolutionary mechanism with epidemiological implications for begomovirus emergence. Knowledge about mechanistic patterns and selective determinants of fit CMB recombinants should be incorporated in the development of antiviral strategies to reduce the likelihood of the emergence of virulent recombinants. This is especially important in the context of breeding CMD-resistant cassava varieties, which has been the most effective approach for disease control to date (81).
Species constructed on sequence divergence are ripe for speciation by recombination.
It should be noted that recombination as a driver of speciation is also a function of the way the community defines species in Begomovirus. Current taxonomy guidelines state that a begomovirus species is defined as a group of DNA-A isolates sharing ≥91% pairwise nucleotide sequence identity, and any new isolate is assigned to a species if it shares at least 91% nucleotide sequence identity to any one isolate from that species (82). As we increase surveying efforts of natural CMB biodiversity with improving sequencing techniques, a larger fraction of the tolerated sequence diversity within each species will be found. It will be increasingly unlikely that sufficient mutations will accumulate quickly enough to create >9% sequence diversity without any intermediates being sampled. These cataloged intermediates then shift the goal posts for speciation, as a novel species would have to have >9% sequence divergence from them. Consequently, recombination may be the main way to obtain enough genetic variation to cross the species demarcation threshold for begomoviruses and therefore is the likely predominant mechanism of speciation for the entire genus.
Virus speciation is often discussed in terms of ecological factors, where host specificities and virus-host interactions lead to the evolution of diverged lineages that develop into different viral species (83–86). Under this model, frequent recombination homogenizes viral diversity, and only when recombination is limited do lineages diversify (87, 88). Here, by zooming in on the CMD species complex, we provide evidence that diversity at the species level can be predominantly shaped by recombination as well. Recombination has also been implicated as a direct mechanism of speciation in other virus groups, e.g., members of the Tombusviridae (formerly of the abolished Luteoviridae family) (89), Bromoviridae (90), Reoviridae (91), and Papillomaviridae (92). Additionally, recombination has shaped some deep phylogenetic relationships among viruses. Within Geminiviridae, most genera have emerged from ancient intergeneric recombination events (93–98). For higher taxonomic ranks, it is apparent that the origins of multiple families within Cressdnaviricota, including Geminiviridae (99, 100), can be traced to independent recombination events involving prokaryotic plasmids and diverse plant and animal RNA viruses (101). These and other recombination events in deep phylogeny have led to both modular patterns of virus evolution and polyphyletic groupings across the Baltimore classifications (102).
The general trend of speciation via recombination for CMBs might not be true for species in other virus families. For instance, a recent review of potyviruses found recombination to be common within populations but uncommon as a mechanism of speciation (103). In picornaviruses, whose species demarcation is defined by a significant degree of amino acid identity, it has been concluded that recombination limits speciation and members of distinct species based on current taxonomic schemes are so diverged that they are generally presumed to be incompatible (104). Our contrasting results are likely due to the narrow way that novel species are determined in begomoviruses (<91% nucleotide identity for the DNA-A segment). However, as more viral groups move to nucleotide identity as species demarcation criteria as a way to integrate the wealth of viral diversity known from genetic sequences alone (105), our conclusions from CMBs may prove more broadly applicable.
MATERIALS AND METHODS
CMB sequence data sets.
Two data sets comprising all full-length DNA-A and DNA-B nucleotide sequences corresponding to the 11 recognized CMB species were downloaded from the GenBank database via the NCBI Taxonomy Browser (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi) in July 2019. The 11 species analyzed here are African cassava mosaic virus, African cassava mosaic Burkina Faso virus, Cassava mosaic Madagascar virus, South African cassava mosaic virus, East African cassava mosaic virus, East African cassava mosaic Cameroon virus, East African cassava mosaic Kenya virus, East African cassava mosaic Malawi virus, East African cassava mosaic Zanzibar virus, Sri Lankan cassava mosaic virus, and Indian cassava mosaic virus. The corresponding virus isolate abbreviations are in Table 1. All sequences were organized to begin at the nick site of the conserved nonanucleotide motif at the origin of replication (5′-TAATATT//AC-3′).
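The reorganization of each circular sequence to begin at the nick site can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the study. It assumes that the input is the virion-sense strand, that the first occurrence of the motif marks the origin of replication, and that the first base of the rotated sequence is the A immediately 3′ of the nick; the function name is hypothetical.

```python
NONANUCLEOTIDE = "TAATATTAC"
NICK_OFFSET = 7  # the nick falls between TAATATT and AC


def rotate_to_nick_site(seq: str) -> str:
    """Rotate a circular sequence so that it starts at the base 3' of the nick.

    Assumptions: the virion-sense strand is supplied and the first
    occurrence of the motif is the one at the origin of replication.
    """
    seq = seq.upper()
    n = len(seq)
    # Append the first bases so a motif spanning the linearization point is found.
    doubled = seq + seq[: len(NONANUCLEOTIDE) - 1]
    idx = doubled.find(NONANUCLEOTIDE)
    if idx == -1:
        raise ValueError("nonanucleotide motif not found on this strand")
    start = (idx + NICK_OFFSET) % n
    return seq[start:] + seq[:start]
```

Under these assumptions, every rotated sequence starts with AC and ends with TAATATT, which gives all genomes a common coordinate origin before alignment.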
Alignments and sample classification.
All multiple-sequence alignments were constructed using the MUSCLE method (106) as implemented in MEGA X (107) and manually corrected using AliView v1.26 (108). Multiple alignments have been archived as Zenodo records: 11 species DNA-A alignment (https://zenodo.org/record/4029589), CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV DNA-B alignment (https://zenodo.org/record/3965023), ACMV and ACMBFV DNA-B alignment (https://zenodo.org/record/3964979), and ICMV and SLCMV DNA-B alignment (https://zenodo.org/record/3964977).
A pairwise nucleotide identity matrix was calculated for complete DNA-A sequences using SDT v1.2 (109) and was used to assign each DNA-A sequence to a viral species according to the ICTV-approved begomovirus species demarcation threshold of ≥91% DNA-A identity (82). For DNA-B sequences, the species assignment listed in GenBank was used; species definitions for DNA-B are less distinct, as discussed in the text.
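The assignment rule can be illustrated with a short Python sketch. It is not SDT, which computes identities from pairwise alignments; here identity is simply the fraction of matching sites between two rows of an existing alignment, with gapped positions skipped. The function names and the dictionary layout are hypothetical. A query is assigned to a species when it shares at least 91% identity with any one isolate of that species.

```python
def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical sites among aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable sites")
    return matches / compared


def assign_species(query: str, references: dict, threshold: float = 0.91) -> list:
    """Return the species with at least one isolate at or above the identity threshold.

    `references` maps a species name to a list of aligned DNA-A sequences.
    An empty list means that no existing species meets the threshold.
    """
    hits = []
    for species, isolates in references.items():
        if any(pairwise_identity(query, iso) >= threshold for iso in isolates):
            hits.append(species)
    return sorted(hits)
```

An empty result marks a candidate for a new species under the demarcation criterion, and a result with more than one name flags an isolate that bridges two species and needs manual review.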
Phylogenetic network analysis.
Phylogenetic networks, which can capture conflicting phylogenetic signals, such as those caused by recombination, were inferred from the alignments using the distance-based Neighbor-Net method (110) implemented in SplitsTree4 v4.14 (111). Distances were corrected with a GTR + G model of sequence evolution using base frequencies, rate heterogeneity, and gamma shape parameters estimated with jModelTest v2.1.6 (112) on the CIPRES Gateway on XSEDE.
Recombination detection and similarity plots.
Putative recombinants and major and minor “parents” within the data sets were determined using the RDP (113), GeneConv (11), Bootscan (114), MaxChi (115), Chimaera (116), SiScan (117), and 3Seq (118) recombination detection methods implemented on the RDP4 v4.100 suite (119). The terms major parent and minor parent are used by RDP4 to refer to sequences that have contributed the larger and smaller fractions, respectively, to the recombinant and are regarded as closest relatives to the true isolates involved in the event. Analyses were performed with default settings while also enabling Chimaera and 3Seq for primary scan, and a Bonferroni-corrected P value cutoff of 0.05 was used. Only events supported by at least five of the seven methods were considered high-confidence events. Breakpoint positions, putative recombinants, and parental sequences were evaluated and manually adjusted when necessary using the available breakpoint cross-checking tools and phylogenetic tree construction methods available in RDP4. RDP4 results files have been archived as Zenodo records: RDP4 results for ACMBFV, ACMV, CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV DNA-A, https://zenodo.org/record/4592854; RDP4 results for ICMV and SLCMV DNA-A, https://zenodo.org/record/4592926; RDP4 results for EACMV-like plus CMMGV DNA-B (7 “species”), https://zenodo.org/record/3965029; RDP4 results for ACMV and ACMBFV DNA-B, https://zenodo.org/record/3975834; and RDP4 results for ICMV and SLCMV DNA-B, https://zenodo.org/record/3975838. Events were considered macroevolutionary recombination events when all members of a designated species had evidence of said event. A BLASTn analysis of the nonredundant nucleotide database at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) was performed to identify the species whose members have sequences most similar to the “unknown” recombinant fragments in our data sets.
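The two filtering rules described above (support from at least five of the seven methods, and presence of the event in all members of a species) can be expressed as a small Python sketch. The data structures are hypothetical and do not correspond to the RDP4 output format; the P values are assumed to be already Bonferroni corrected.

```python
METHODS = ("RDP", "GENECONV", "Bootscan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def supporting_methods(p_values: dict, alpha: float = 0.05) -> set:
    """Methods whose corrected P value for an event falls below the cutoff.

    `p_values` maps a method name to a P value, or to None when the
    method did not detect the event.
    """
    return {
        method
        for method, p in p_values.items()
        if method in METHODS and p is not None and p < alpha
    }


def is_high_confidence(p_values: dict, min_methods: int = 5) -> bool:
    """True when at least `min_methods` of the seven methods support the event."""
    return len(supporting_methods(p_values)) >= min_methods


def is_macroevolutionary(event_recombinants, species_members) -> bool:
    """True when every member of the species carries evidence of the event."""
    members = set(species_members)
    return bool(members) and members <= set(event_recombinants)
```

For example, an event detected by five methods with corrected P values below 0.05 passes the first filter, and it is classified as macroevolutionary only if the set of recombinant sequences covers every isolate assigned to the species.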
Based on RDP4 results, similarity analyses comparing the recombinants to their putative parental sequences were performed using SimPlot v3.5.1 (120). All plots were done using the Kimura 2-parameter distance model with a sliding window size of 100 and a step size of 10. For the similarity plots in Fig. 2, analyses were done by comparing 50% consensus sequences of all members of the compared species (except in the case of Fig. 2C, where the best candidate sequence was used). Similarity plots for events in Tables 2 and 4 were made by comparing the best candidate recombinant and parent sequences reported by RDP4 and are presented in File S2 in the supplemental material (Fig. S1 to S33).
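A sliding-window similarity profile with the settings given above (window of 100, step of 10, Kimura 2-parameter distance) can be sketched in Python as follows. This is an illustration only, not the SimPlot implementation. It assumes that similarity is reported as 1 minus the Kimura 2-parameter distance and that positions with gaps or ambiguous bases are skipped; the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str):
    """Kimura 2-parameter distance between two aligned sequences, or None if undefined."""
    n = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        return None
    p, q = transitions / n, transversions / n
    arg1, arg2 = 1 - 2 * p - q, 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        return None  # saturated window, distance undefined
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)


def similarity_profile(query: str, reference: str, window: int = 100, step: int = 10):
    """Return (window midpoint, similarity) pairs along the alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must come from the same alignment")
    length = len(query)
    profile = []
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        d = k2p_distance(query[start:end], reference[start:end])
        profile.append(((start + end) // 2, None if d is None else 1 - d))
    return profile
```

Plotting one profile per putative parent against the same recombinant makes the breakpoints visible as positions where the two similarity curves cross.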
Estimates for the relative rates of recombination and mutation, linkage disequilibrium correlations with distance, and likelihood permutation tests of recombination were determined. LDhat v2.2 (43) was used to infer composite likelihood estimates (CLEs) of population-scaled recombination rates and estimates of population-scaled mutation rates with the PAIRWISE and CONVERT packages, respectively. This program uses an extension of Hudson’s composite-likelihood method (121), which estimates the population recombination rate by combining the coalescent likelihoods of all pairwise comparisons of segregating sites. The extension in LDhat allows for a finite-sites mutation model, which makes it appropriate for sets of sequences with high mutation rates such as the ones found in viral genomes.
CONVERT was used with all default settings. While using PAIRWISE, a gene conversion model with an average tract length of 500 nucleotides (nt) was fitted, and a precomputed likelihood lookup table for per-site θ = 0.01 with a maximum ρ of 100 and a 101-point grid was used to obtain the CLEs of ρ. Since precomputed likelihood lookup tables for data sets larger than n = 100 are not available, the COMPLETE package was used with GNU parallel (https://zenodo.org/record/3903853) to generate a likelihood lookup table for per-site θ = 0.01 that can accommodate data sets of up to 320 sequences, for use with the larger data sets in this study. The file is available as “LDhat coalescent likelihood lookup table for 320 sequences with theta = 0.01” (https://zenodo.org/record/3934350). The ρ/θW estimate was obtained as the relative rate of recombination and mutation in the history of the samples within each analyzed clade. Additionally, PAIRWISE was used to obtain the correlation between estimates of linkage disequilibrium (r2) and physical distance (d) and to test for the presence of recombination using the likelihood permutation test (LPT) developed by McVean et al. (43).
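For orientation, the relative rate can be illustrated with the standard infinite-sites form of Watterson’s estimator, θW = S / a_n, where S is the number of segregating sites and a_n is the sum of 1/i for i from 1 to n − 1. The Python sketch below is not the LDhat code, and the value reported by CONVERT may not match this form in every case; the function names are hypothetical.

```python
def watterson_theta(segregating_sites: int, n_sequences: int) -> float:
    """Watterson's infinite-sites estimator for the whole alignment (not per site)."""
    if n_sequences < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / a_n


def rho_theta_ratio(rho: float, theta: float) -> float:
    """Relative rate of recombination to mutation."""
    return rho / theta


# Worked example with the EACMCV DNA-B row of Table 5 (n = 9, S = 711, rho = 3).
theta_eacmcv = watterson_theta(711, 9)
ratio_eacmcv = rho_theta_ratio(3, theta_eacmcv)
print(round(theta_eacmcv, 2), round(ratio_eacmcv, 3))
```

With these inputs the sketch gives θW ≈ 261.60 and ρ/θW ≈ 0.011, the values listed for EACMCV in Table 5. A ratio well below 1 indicates that mutation events are estimated to outnumber recombination events in the history of the sample.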
Standing genetic diversity and divergence between parental and recombinant species.
The per-site standing genetic diversity of each species was assessed by calculating nucleotide diversity, π (122), which is the average number of pairwise nucleotide differences per site between sequences within a clade. To obtain absolute measures of divergence between recombinant and parental species, per-site DXY (122) was calculated. DXY refers to the average number of pairwise differences between sequences from two clades while excluding all intraclade comparisons. It is calculated as
$$D_{XY} = \frac{1}{n_X n_Y} \sum_{i=1}^{n_X} \sum_{j=1}^{n_Y} d_{ij}$$

where, in two clades, X and Y, containing nX and nY sequences, respectively, dij measures the number of nucleotide differences between the ith haplotype from X and the jth haplotype from Y. All per-site estimates were obtained with DnaSP v6.12 (123). When estimating nucleotide diversity, gaps/missing information were excluded only in pairwise comparisons. For sliding window analyses, a sliding window size of 100 nt (including gaps) and a step size of 10 nt were used.
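The definitions of π and DXY can be made concrete with a minimal Python sketch. It is not the DnaSP implementation and may differ from it in how sites are counted. It assumes aligned sequences of equal length and excludes gaps and missing data only within each pairwise comparison; the function names are hypothetical.

```python
from itertools import combinations, product

MISSING = set("-N?")


def per_site_differences(a: str, b: str):
    """Proportion of differing sites among positions without gaps or missing data."""
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in MISSING or y in MISSING:
            continue  # pairwise exclusion of gaps and missing data
        compared += 1
        diffs += x != y
    return diffs / compared if compared else None


def _mean(values):
    values = [v for v in values if v is not None]
    if not values:
        raise ValueError("no comparable sequence pairs")
    return sum(values) / len(values)


def nucleotide_diversity(clade):
    """Per-site pi: mean pairwise difference over all pairs within one clade."""
    return _mean(per_site_differences(a, b) for a, b in combinations(clade, 2))


def dxy(clade_x, clade_y):
    """Per-site DXY: mean pairwise difference over between-clade pairs only."""
    return _mean(per_site_differences(a, b) for a, b in product(clade_x, clade_y))
```

Applying `dxy` to successive 100-nt windows with a 10-nt step yields the kind of sliding-window profile used to compare recombinant species with their putative parents.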
Data availability.
As noted above, multiple alignments, RDP4 results, and the generated LDhat likelihood lookup table have been deposited as Zenodo records. Alignments: 11 species DNA-A alignment (https://zenodo.org/record/4029589); CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV DNA-B alignment (https://zenodo.org/record/3965023); ACMV and ACMBFV DNA-B alignment (https://zenodo.org/record/3964979); and ICMV and SLCMV DNA-B alignment (https://zenodo.org/record/3964977). RDP4 results for DNA-A: ACMBFV, ACMV, CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV DNA-A (https://zenodo.org/record/4592854); and ICMV and SLCMV DNA-A (https://zenodo.org/record/4592926). RDP4 results for DNA-B: EACMV-like plus CMMGV DNA-B (7 “species”) (https://zenodo.org/record/3965029); ACMV and ACMBFV DNA-B (https://zenodo.org/record/3975834); and ICMV and SLCMV DNA-B (https://zenodo.org/record/3975838). LDhat: coalescent likelihood lookup table for 320 sequences with theta = 0.01 (https://zenodo.org/record/3934350).
ACKNOWLEDGMENTS
We are grateful to all the research groups that have shared sequence data via GenBank and the farmers who enabled collection of those data. We thank members of the Duffy laboratory at Rutgers University and our colleagues at the Hanley-Bowdoin laboratory at North Carolina State University for feedback and critical readings of the manuscript. We also thank the staff of the Office of Advanced Research Computing (OARC) at Rutgers for access to and maintenance of the Amarel cluster. We thank R. Muldowney (Rutgers SEBS ITS) for assistance with technical computing.
This work was supported by U.S. NSF award OIA-1545553 to S.D. and an HHMI Gilliam Fellowship for Advanced Study for A.C.B. D.D. was partly supported by the Aresty Undergraduate Research Scholars Program at Rutgers University.
Footnotes
Supplemental material is available online only.
Contributor Information
Siobain Duffy, Email: duffy@sebs.rutgers.edu.
Anne E. Simon, University of Maryland, College Park
REFERENCES
- 1. Rojas MR, Macedo MA, Maliano MR, Soto-Aguilar M, Souza JO, Briddon RW, Kenyon L, Rivera Bustamante RF, Zerbini FM, Adkins S, Legg JP, Kvarnheden A, Wintermantel WM, Sudarshana MR, Peterschmitt M, Lapidot M, Martin DP, Moriones E, Inoue-Nagata AK, Gilbertson RL. 2018. World management of geminiviruses. Annu Rev Phytopathol 56:637–677. 10.1146/annurev-phyto-080615-100327.
- 2. Seal SE, vandenBosch F, Jeger MJ. 2006. Factors influencing begomovirus evolution and their increasing global significance: implications for sustainable control. CRC Crit Rev Plant Sci 25:23–46. 10.1080/07352680500365257.
- 3. Zerbini FM, Briddon RW, Idris A, Martin DP, Moriones E, Navas-Castillo J, Rivera-Bustamante R, Roumagnac P, Varsani A, Consortium IR. 2017. ICTV virus taxonomy profile: Geminiviridae. J Gen Virol 98:131–133. 10.1099/jgv.0.000738.
- 4. Jones RAC. 2009. Plant virus emergence and evolution: origins, new encounter scenarios, factors driving emergence, effects of changing world conditions, and prospects for control. Virus Res 141:113–130. 10.1016/j.virusres.2008.07.028.
- 5. Navas-Castillo J, Fiallo-Olivé E, Sánchez-Campos S. 2011. Emerging virus diseases transmitted by whiteflies. Annu Rev Phytopathol 49:219–248. 10.1146/annurev-phyto-072910-095235.
- 6. Elena SF, Fraile A, Garcia-Arenal F. 2014. Chapter three. Evolution and emergence of plant viruses, p 161–191. In Maramorosch K, Murphy FA (ed), Advances in Virus Research, vol 88. Academic Press, Cambridge, MA.
- 7. Ge L, Zhang J, Zhou X, Li H. 2007. Genetic structure and population variability of tomato yellow leaf curl China virus. J Virol 81:5902–5907. 10.1128/JVI.02431-06.
- 8. Duffy S, Holmes EC. 2008. Phylogenetic evidence for rapid rates of molecular evolution in the single-stranded DNA begomovirus tomato yellow leaf curl virus. J Virol 82:957–965. 10.1128/JVI.01929-07.
- 9. Duffy S, Holmes EC. 2009. Validation of high rates of nucleotide substitution in geminiviruses: phylogenetic evidence from East African cassava mosaic viruses. J Gen Virol 90:1539–1547. 10.1099/vir.0.009266-0.
- 10. Lima ATM, Sobrinho RR, Gonzalez-Aguilera J, Rocha CS, Silva SJC, Xavier CAD, Silva N, Duffy S, Zerbini FM. 2013. Synonymous site variation due to recombination explains higher genetic variability in begomovirus populations infecting non-cultivated hosts. J Gen Virol 94:418–431. 10.1099/vir.0.047241-0.
- 11. Padidam M, Sawyer S, Fauquet CM. 1999. Possible emergence of new geminiviruses by frequent recombination. Virology 265:218–225. 10.1006/viro.1999.0056.
- 12. Lefeuvre P, Martin DP, Hoareau M, Naze F, Delatte H, Thierry M, Varsani A, Becker N, Reynaud B, Lett J-M. 2007. Begomovirus “melting pot” in the south-west Indian Ocean islands: molecular diversity and evolution through recombination. J Gen Virol 88:3458–3468. 10.1099/vir.0.83252-0.
- 13. Graham AP, Martin DP, Roye ME. 2010. Molecular characterization and phylogeny of two begomoviruses infecting Malvastrum americanum in Jamaica: evidence of the contribution of inter-species recombination to the evolution of malvaceous weed-associated begomoviruses from the Northern Caribbean. Virus Genes 40:256–266. 10.1007/s11262-009-0430-6.
- 14. Muller HJ. 1964. The relation of recombination to mutational advance. Mutat Res 1:2–9. 10.1016/0027-5107(64)90047-8.
- 15. Keightley PD, Otto SP. 2006. Interference among deleterious mutations favours sex and recombination in finite populations. Nature 443:89–92. 10.1038/nature05049.
- 16. Martin DP, Biagini P, Lefeuvre P, Golden M, Roumagnac P, Varsani A. 2011. Recombination in eukaryotic single stranded DNA viruses. Viruses 3:1699–1738. 10.3390/v3091699.
- 17. Simon-Loriere E, Holmes EC. 2011. Why do RNA viruses recombine? Nat Rev Microbiol 9:617–626. 10.1038/nrmicro2614.
- 18. Howeler R, Lutaladio N, Thomas GS. 2013. Save and grow: cassava. A guide to sustainable production intensification. Food and Agriculture Organization of the United Nations, Rome, Italy. http://www.fao.org/3/i3278e/i3278e.pdf.
- 19. Jacobson AL, Duffy S, Sseruwagi P. 2018. Whitefly-transmitted viruses threatening cassava production in Africa. Curr Opin Virol 33:167–176. 10.1016/j.coviro.2018.08.016.
- 20. Patil BL, Fauquet CM. 2009. Cassava mosaic geminiviruses: actual knowledge and perspectives. Mol Plant Pathol 10:685–701. 10.1111/j.1364-3703.2009.00559.x.
- 21. Wang HL, Cui XY, Wang XW, Liu SS, Zhang ZH, Zhou XP. 2016. First report of Sri Lankan cassava mosaic virus infecting cassava in Cambodia. Plant Dis 100:1029. 10.1094/PDIS-10-15-1228-PDN.
- 22. Uke A, Hoat TX, Quan MV, Liem NV, Ugaki M, Natsuaki KT. 2018. First report of Sri Lankan cassava mosaic virus infecting cassava in Vietnam. Plant Dis 102:2669. 10.1094/PDIS-05-18-0805-PDN.
- 23. Wang D, Yao XM, Huang GX, Shi T, Wang GF, Ye J. 2019. First report of Sri Lankan cassava mosaic virus infected cassava in China. Plant Dis 103:1437. 10.1094/PDIS-09-18-1590-PDN.
- 24. Siriwan W, Jimenez J, Hemniam N, Saokham K, Lopez-Alvarez D, Leiva AM, Martinez A, Mwanzia L, Becerra Lopez-Lavalle LA, Cuellar WJ. 2020. Surveillance and diagnostics of the emergent Sri Lankan cassava mosaic virus (Fam. Geminiviridae) in Southeast Asia. Virus Res 285:197959. 10.1016/j.virusres.2020.197959.
- 25. Fondong VN. 2013. Geminivirus protein structure and function. Mol Plant Pathol 14:635–649. 10.1111/mpp.12032.
- 26. Hanley-Bowdoin L, Bejarano ER, Robertson D, Mansoor S. 2013. Geminiviruses: masters at redirecting and reprogramming plant processes. Nat Rev Microbiol 11:777–788. 10.1038/nrmicro3117.
- 27. Argüello-Astorga GR, Ruiz-Medrano R. 2001. An iteron-related domain is associated to Motif 1 in the replication proteins of geminiviruses: identification of potential interacting amino acid-base pairs by a comparative approach. Arch Virol 146:1465–1485. 10.1007/s007050170072.
- 28. Zhou X, Robinson DJ, Harrison BD. 1998. Types of variation in DNA-A among isolates of East African cassava mosaic virus from Kenya, Malawi and Tanzania. J Gen Virol 79:2835–2840. 10.1099/0022-1317-79-11-2835.
- 29. Fondong VN, Pita JS, Rey ME, de Kochko A, Beachy RN, Fauquet CM. 2000. Evidence of synergism between African cassava mosaic virus and a new double-recombinant geminivirus infecting cassava in Cameroon. J Gen Virol 81:287–297. 10.1099/0022-1317-81-1-287.
- 30. Maruthi MN, Seal S, Colvin J, Briddon RW, Bull SE. 2004. East African cassava mosaic Zanzibar virus–a recombinant begomovirus species with a mild phenotype. Arch Virol 149:2365–2377. 10.1007/s00705-004-0380-1.
- 31. Bull SE, Briddon RW, Sserubombwe WS, Ngugi K, Markham PG, Stanley J. 2006. Genetic diversity and phylogeography of cassava mosaic viruses in Kenya. J Gen Virol 87:3053–3065. 10.1099/vir.0.82013-0.
- 32. Tiendrébéogo F, Lefeuvre P, Hoareau M, Harimalala MA, De Bruyn A, Villemot J, Traoré VSE, Konaté G, Traoré AS, Barro N, Reynaud B, Traoré O, Lett J-M. 2012. Evolution of African cassava mosaic virus by recombination between bipartite and monopartite begomoviruses. Virol J 9:67. 10.1186/1743-422X-9-67.
- 33. Harimalala M, Lefeuvre P, de Bruyn A, Tiendrébéogo F, Hoareau M, Villemot J, Ranomenjanahary S, Andrianjaka A, Reynaud B, Lett JM. 2012. A novel cassava-infecting begomovirus from Madagascar: cassava mosaic Madagascar virus. Arch Virol 157:2027–2030. 10.1007/s00705-012-1399-3.
- 34.De Bruyn A, Harimalala M, Zinga I, Mabvakure BM, Hoareau M, Ravigné V, Walters M, Reynaud B, Varsani A, Harkins GW, Martin DP, Lett J-M, Lefeuvre P. 2016. Divergent evolutionary and epidemiological dynamics of cassava mosaic geminiviruses in Madagascar. BMC Evol Biol 16:182. 10.1186/s12862-016-0749-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Zhou XP, Liu Y, Calvert L, Munoz C, Otim-Nape W, Robinson D, Harrison B. 1997. Evidence that DNA-A of a geminivirus associated with severe cassava mosaic disease in Uganda has arisen by interspecific recombination. J Gen Virol 78:2101–2111. 10.1099/0022-1317-78-8-2101. [DOI] [PubMed] [Google Scholar]
- 36.Pita JS, Fondong VN, Sangaré A, Otim-Nape GW, Ogwal S, Fauquet CM. 2001. Recombination, pseudorecombination and synergism of geminiviruses are determinant keys to the epidemic of severe cassava mosaic disease in Uganda. J Gen Virol 82:655–665. 10.1099/0022-1317-82-3-655. [DOI] [PubMed] [Google Scholar]
- 37.Zinga I, Chiroleu F, Legg J, Lefeuvre P, Komba EK, Semballa S, Yandia SP, Mandakombo NB, Reynaud B, Lett J-M. 2013. Epidemiological assessment of cassava mosaic disease in Central African Republic reveals the importance of mixed viral infection and poor health of plant cuttings. Crop Prot 44:6–12. 10.1016/j.cropro.2012.10.010. [DOI] [Google Scholar]
- 38.Harimalala M, Chiroleu F, Giraud-Carrier C, Hoareau M, Zinga I, Randriamampianina JA, Velombola S, Ranomenjanahary S, Andrianjaka A, Reynaud B, Lefeuvre P, Lett J-M. 2015. Molecular epidemiology of cassava mosaic disease in Madagascar. Plant Pathol 64:501–507. 10.1111/ppa.12277. [DOI] [Google Scholar]
- 39.Mulenga RM, Legg JP, Ndunguru J, Miano DW, Mutitu EW, Chikoti PC, Alabi OJ. 2016. Survey, molecular detection, and characterization of geminiviruses associated with cassava mosaic disease in Zambia. Plant Dis 100:1379–1387. 10.1094/PDIS-10-15-1170-RE. [DOI] [PubMed] [Google Scholar]
- 40.Rothenstein D, Haible D, Dasgupta I, Dutt N, Patil BL, Jeske H. 2006. Biodiversity and recombination of cassava-infecting begomoviruses from southern India. Arch Virol 151:55–69. 10.1007/s00705-005-0624-8. [DOI] [PubMed] [Google Scholar]
- 41.Ndunguru J, Legg JP, Aveling TAS, Thompson G, Fauquet CM. 2005. Molecular biodiversity of cassava begomoviruses in Tanzania: evolution of cassava geminiviruses in Africa and evidence for East Africa being a center of diversity of cassava geminiviruses. Virol J 2:21. 10.1186/1743-422X-2-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Berrie LC, Rybicki EP, Rey MEC. 2001. Complete nucleotide sequence and host range of South African cassava mosaic virus: further evidence for recombination amongst begomoviruses. J Gen Virol 82:53–58. 10.1099/0022-1317-82-1-53. [DOI] [PubMed] [Google Scholar]
- 43.McVean G, Awadalla P, Fearnhead P. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231–1241. 10.1093/genetics/160.3.1231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Briddon RW, Patil BL, Bagewadi B, Nawaz-Ul-Rehman MS, Fauquet CM. 2010. Distinct evolutionary histories of the DNA-A and DNA-B components of bipartite begomoviruses. BMC Evol Biol 10:97. 10.1186/1471-2148-10-97. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Saunders K, Salim N, Mali VR, Malathi VG, Briddon R, Markham PG, Stanley J. 2002. Characterisation of Sri Lankan cassava mosaic virus and Indian cassava mosaic virus: evidence for acquisition of a DNA B component by a monopartite begomovirus. Virology 293:63–74. 10.1006/viro.2001.1251. [DOI] [PubMed] [Google Scholar]
- 46.De Bruyn A, Villemot J, Lefeuvre P, Villar E, Hoareau M, Harimalala M, Abdoul-Karime AL, Abdou-Chakour C, Reynaud B, Harkins GW, Varsani A, Martin DP, Lett J-M. 2012. East African cassava mosaic-like viruses from Africa to Indian ocean islands: molecular diversity, evolutionary history and geographical dissemination of a bipartite begomovirus. BMC Evol Biol 12:228. 10.1186/1471-2148-12-228. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Pagán I. 2018. The diversity, evolution and epidemiology of plant viruses: a phylogenetic view. Infect Genet Evol 65:187–199. 10.1016/j.meegid.2018.07.033. [DOI] [PubMed] [Google Scholar]
- 48.Berrie LC, Palmer KE, Rybicki EP, Hiyadat SH, Maxwell DP, Rey MEC. 1997. A new isolate of African cassava mosaic virus in South Africa. African J Root Tuber Crop 2:49–52. [Google Scholar]
- 49.Berrie LC, Palmer KE, Rybicki EP, Rey MEC. 1998. Molecular characterisation of a distinct South African cassava infecting geminivirus. Arch Virol 143:2253–2260. 10.1007/s007050050457. [DOI] [PubMed] [Google Scholar]
- 50.Borah BK, Dasgupta I. 2012. PCR-RFLP analysis indicates that recombination might be a common occurrence among the cassava infecting begomoviruses in India. Virus Genes 45:327–332. 10.1007/s11262-012-0770-5. [DOI] [PubMed] [Google Scholar]
- 51.Khan AJ, Akhtar S, Al-Matrushi AM, Fauquet CM, Briddon RW. 2013. Introduction of East African cassava mosaic Zanzibar virus to Oman harks back to Zanzibar, the capital of Oman. Virus Genes 46:195–198. 10.1007/s11262-012-0838-2. [DOI] [PubMed] [Google Scholar]
- 52.Nawaz-Ul-Rehman MS, Briddon RW, Fauquet CM. 2012. A melting pot of old world begomoviruses and their satellites infecting a collection of Gossypium species in Pakistan. PLoS One 7:e40050. 10.1371/journal.pone.0040050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.García-Arenal F, Zerbini FM. 2019. Life on the edge: geminiviruses at the interface between crops and wild plant hosts. Annu Rev Virol 6:411–433. 10.1146/annurev-virology-092818-015536. [DOI] [PubMed] [Google Scholar]
- 54.Khan AJ, Mansoor S, Briddon RW. 2014. Oman: a case for a sink of begomoviruses of geographically diverse origins. Trends Plant Sci 19:67–70. 10.1016/j.tplants.2013.11.004. [DOI] [PubMed] [Google Scholar]
- 55.Lima ATM, Silva JCF, Silva FN, Castillo-Urquiza GP, Silva FF, Seah YM, Mizubuti ESG, Duffy S, Zerbini FM. 2017. The diversification of begomovirus populations is predominantly driven by mutational dynamics. Virus Evol 3:vex005. 10.1093/ve/vex005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Martin DP, Lefeuvre P, Varsani A, Hoareau M, Semegni J-Y, Dijoux B, Vincent C, Reynaud B, Lett J. 2011. Complex recombination patterns arising during geminivirus coinfections preserve and demarcate biologically important intra-genome interaction networks. PLoS Pathog 7:e1002203–e1002214. 10.1371/journal.ppat.1002203. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Escriu F, Fraile A, García-Arenal F. 2007. Constraints to genetic exchange support gene coadaptation in a tripartite RNA virus. PLoS Pathog 3:e8. 10.1371/journal.ppat.0030008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Jeske H, Lütgemeier M, Preiss W. 2001. DNA forms indicate rolling circle and recombination-dependent replication of Abutilon mosaic virus. EMBO J 20:6158–6167. 10.1093/emboj/20.21.6158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Chen L-F, Rojas M, Kon T, Gamby K, Xoconostle-Cazares B, Gilbertson RL. 2009. A severe symptom phenotype in tomato in Mali is caused by a reassortant between a novel recombinant begomovirus (Tomato yellow leaf curl Mali virus) and a betasatellite. Mol Plant Pathol 10:415–430. 10.1111/j.1364-3703.2009.00541.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Lozano G, Trenado HP, Valverde RA, Navas-Castillo J. 2009. Novel begomovirus species of recombinant nature in sweet potato (Ipomoea batatas) and Ipomoea indica: taxonomic and phylogenetic implications. J Gen Virol 90:2550–2562. 10.1099/vir.0.012542-0. [DOI] [PubMed] [Google Scholar]
- 61.Kumar Y, Hallan V, Zaidi AA. 2011. Chilli leaf curl Palampur virus is a distinct begomovirus species associated with a betasatellite. Plant Pathol 60:1040–1047. 10.1111/j.1365-3059.2011.02475.x. [DOI] [Google Scholar]
- 62.Xie Y, Zhao L, Jiao X, Jiang T, Gong H, Wang B, Briddon RW, Zhou X. 2013. A recombinant begomovirus resulting from exchange of the C4 gene. J Gen Virol 94:1896–1907. 10.1099/vir.0.053181-0. [DOI] [PubMed] [Google Scholar]
- 63.Kesumawati E, Okabe S, Homma K, Fujiwara I, Zakaria S, Kanzaki S, Koeda S. 2019. Pepper yellow leaf curl Aceh virus: a novel bipartite begomovirus isolated from chili pepper, tomato, and tobacco plants in Indonesia. Arch Virol 164:2379–2383. 10.1007/s00705-019-04316-8. [DOI] [PubMed] [Google Scholar]
- 64.Jovel J, Reski G, Rothenstein D, Ringel M, Frischmuth T, Jeske H. 2004. Sida micrantha mosaic is associated with a complex infection of begomoviruses different from Abutilon mosaic virus. Arch Virol 149:829–841. 10.1007/s00705-003-0235-1. [DOI] [PubMed] [Google Scholar]
- 65.Haq QMI, Rouhibakhsh A, Ali A, Malathi VG. 2011. Infectivity analysis of a blackgram isolate of Mungbean yellow mosaic virus and genetic assortment with MYMIV in selective hosts. Virus Genes 42:429–439. 10.1007/s11262-011-0591-y. [DOI] [PubMed] [Google Scholar]
- 66.Ho ES, Kuchie J, Duffy S. 2014. Bioinformatic analysis reveals genome size reduction and the emergence of tyrosine phosphorylation site in the movement protein of new world bipartite begomoviruses. PLoS One 9:e111957. 10.1371/journal.pone.0111957. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Xavier CAD, Godinho MT, Mar TB, Ferro CG, Sande OFL, Silva JC, Ramos-Sobrinho R, Nascimento RN, Assunção I, Lima GSA, Lima ATM, Zerbini FM. 2021. Evolutionary dynamics of bipartite begomoviruses revealed by complete genome analysis. Mol Ecol 10.1111/mec.15997. [DOI] [PubMed] [Google Scholar]
- 68.Rokyta DR, Wichman HA. 2009. Genic incompatibilities in two hybrid bacteriophages. Mol Biol Evol 26:2831–2839. 10.1093/molbev/msp199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Davino S, Napoli C, Dellacroce C, Miozzi L, Noris E, Davino M, Accotto GP. 2009. Two new natural begomovirus recombinants associated with the tomato yellow leaf curl disease co-exist with parental viruses in tomato epidemics in Italy. Virus Res 143:15–23. 10.1016/j.virusres.2009.03.001. [DOI] [PubMed] [Google Scholar]
- 70.Monjane AL, Martin DP, Lakay F, Muhire BM, Pande D, Varsani A, Harkins G, Shepherd DN, Rybicki EP. 2014. Extensive recombination-induced disruption of genetic interactions is highly deleterious but can be partially reversed by small numbers of secondary recombination events. J Virol 88:7843–7851. 10.1128/JVI.00709-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Vuillaume F, Thébaud G, Urbino C, Forfert N, Granier M, Froissart R, Blanc S, Peterschmitt M. 2011. Distribution of the phenotypic effects of random homologous recombination between two virus species. PLoS Pathog 7:e1002028. 10.1371/journal.ppat.1002028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Urbino C, Regragui ZF, Granier M, Peterschmitt M. 2020. Fitness advantage of inter-species TYLCV recombinants induced by beneficial intra-genomic interactions rather than by specific mutations. Virology 542:20–27. 10.1016/j.virol.2020.01.002. [DOI] [PubMed] [Google Scholar]
- 73.Fiallo-Olivé E, Trenado HP, Louro D, Navas-Castillo J. 2019. Recurrent speciation of a tomato yellow leaf curl geminivirus in Portugal by recombination. Sci Rep 9:1332. 10.1038/s41598-018-37971-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Etessami P, Watts J, Stanley J. 1989. Size reversion of African cassava mosaic virus coat protein gene deletion mutants during infection of Nicotiana benthamiana. J Gen Virol 70:277–289. 10.1099/0022-1317-70-2-277. [DOI] [PubMed] [Google Scholar]
- 75.van der Walt E, Rybicki EP, Varsani A, Polston JE, Billharz R, Donaldson L, Monjane AL, Martin DP. 2009. Rapid host adaptation by extensive recombination. J Gen Virol 90:734–746. 10.1099/vir.0.007724-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Hou YM, Gilbertson RL. 1996. Increased pathogenicity in a pseudorecombinant bipartite geminivirus correlates with intermolecular recombination. J Virol 70:5430–5436. 10.1128/JVI.70.8.5430-5436.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.García-Andrés S, Monci F, Navas-Castillo J, Moriones E. 2006. Begomovirus genetic diversity in the native plant reservoir Solanum nigrum: evidence for the presence of a new virus species of recombinant nature. Virology 350:433–442. 10.1016/j.virol.2006.02.028. [DOI] [PubMed] [Google Scholar]
- 78.Lefeuvre P, Moriones E. 2015. Recombination as a motor of host switches and virus emergence: geminiviruses as case studies. Curr Opin Virol 10:14–19. 10.1016/j.coviro.2014.12.005. [DOI] [PubMed] [Google Scholar]
- 79.Briddon RW, Akbar F, Iqbal Z, Amrao L, Amin I, Saeed M, Mansoor S. 2014. Effects of genetic changes to the begomovirus/betasatellite complex causing cotton leaf curl disease in South Asia post-resistance breaking. Virus Res 186:114–119. 10.1016/j.virusres.2013.12.008. [DOI] [PubMed] [Google Scholar]
- 80.Belabess Z, Peterschmitt M, Granier M, Tahiri A, Blenzar A, Urbino C. 2016. The non-canonical tomato yellow leaf curl virus recombinant that displaced its parental viruses in southern Morocco exhibits a high selective advantage in experimental conditions. J Gen Virol 97:3433–3445. 10.1099/jgv.0.000633. [DOI] [PubMed] [Google Scholar]
- 81.Fondong VN. 2017. The search for resistance to cassava mosaic geminiviruses: how much we have accomplished, and what lies ahead. Front Plant Sci 8:408. 10.3389/fpls.2017.00408. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Brown JK, Zerbini FM, Navas-Castillo J, Moriones E, Ramos-Sobrinho R, Silva JCF, Fiallo-Olivé E, Briddon RW, Hernández-Zepeda C, Idris A, Malathi VG, Martin DP, Rivera-Bustamante R, Ueda S, Varsani A. 2015. Revision of Begomovirus taxonomy based on pairwise sequence comparisons. Arch Virol 160:1593–1619. 10.1007/s00705-015-2398-y. [DOI] [PubMed] [Google Scholar]
- 83.Meyer JR, Dobias DT, Medina SJ, Servilio L, Gupta A, Lenski RE. 2016. Ecological speciation of bacteriophage lambda in allopatry and sympatry. Science 354:1301–1304. 10.1126/science.aai8446. [DOI] [PubMed] [Google Scholar]
- 84.Saxenhofer M, Schmidt S, Ulrich RG, Heckel G. 2019. Secondary contact between diverged host lineages entails ecological speciation in a European hantavirus. PLoS Biol 17:e3000142. 10.1371/journal.pbio.3000142. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Simmonds P, Aiewsakun P, Katzourakis A. 2019. Prisoners of war—host adaptation and its constraints on virus evolution. Nat Rev Microbiol 17:321–328. 10.1038/s41579-018-0120-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Chaikeeratisak V, Birkholz EA, Prichard AM, Egan ME, Mylvara A, Nonejuie P, Nguyen KT, Sugie J, Meyer JR, Pogliano J. 2021. Viral speciation through subcellular genetic isolation and virogenesis incompatibility. Nat Commun 12:342. 10.1038/s41467-020-20575-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Hendrix RW, Smith MCM, Burns RN, Ford ME, Hatfull GF. 1999. Evolutionary relationships among diverse bacteriophages and prophages: all the world’s a phage. Proc Natl Acad Sci U S A 96:2192–2197. 10.1073/pnas.96.5.2192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Breitbart M, Rohwer F. 2005. Here a virus, there a virus, everywhere the same virus? Trends Microbiol 13:278–284. 10.1016/j.tim.2005.04.003. [DOI] [PubMed] [Google Scholar]
- 89.Pagán I, Holmes EC. 2010. Long-term evolution of the Luteoviridae: time scale and mode of virus speciation. J Virol 84:6177–6187. 10.1128/JVI.02160-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Codoñer FM, Elena SF. 2008. The promiscuous evolutionary history of the family Bromoviridae. J Gen Virol 89:1739–1747. 10.1099/vir.0.2008/000166-0. [DOI] [PubMed] [Google Scholar]
- 91.Yang Y, Gaspard G, McMullen N, Duncan R. 2020. Polycistronic genome segment evolution and gain and loss of FAST protein function during fusogenic orthoreovirus speciation. Viruses 12:702. 10.3390/v12070702. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Borvet F, Bravo IG, Willemsen A. 2020. Papillomaviruses infecting cetaceans exhibit signs of genome adaptation following a recombination event. Virus Evol 6:veaa038. 10.1093/ve/veaa038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Rybicki EP. 1994. A phylogenetic and evolutionary justification for three genera of Geminiviridae. Arch Virol 139:49–77. 10.1007/BF01309454. [DOI] [PubMed] [Google Scholar]
- 94.Briddon RW, Bedford ID, Tsai JH, Markham PG. 1996. Analysis of the nucleotide sequence of the treehopper-transmitted geminivirus, tomato pseudo-curly top virus, suggests a recombinant origin. Virology 219:387–394. 10.1006/viro.1996.0264. [DOI] [PubMed] [Google Scholar]
- 95.Varsani A, Shepherd DN, Dent K, Monjane AL, Rybicki EP, Martin DP. 2009. A highly divergent South African geminivirus species illuminates the ancient evolutionary history of this family. Virol J 6:36. 10.1186/1743-422X-6-36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Briddon RW, Heydarnejad J, Khosrowfar F, Massumi H, Martin DP, Varsani A. 2010. Turnip curly top virus, a highly divergent geminivirus infecting turnip in Iran. Virus Res 152:169–175. 10.1016/j.virusres.2010.05.016. [DOI] [PubMed] [Google Scholar]
- 97.Heydarnejad J, Keyvani N, Razavinejad S, Massumi H, Varsani A. 2013. Fulfilling Koch’s postulates for beet curly top Iran virus and proposal for consideration of new genus in the family Geminiviridae. Arch Virol 158:435–443. 10.1007/s00705-012-1485-6. [DOI] [PubMed] [Google Scholar]
- 98.Varsani A, Roumagnac P, Fuchs M, Navas-Castillo J, Moriones E, Idris A, Briddon RW, Rivera-Bustamante R, Murilo Zerbini F, Martin DP. 2017. Capulavirus and Grablovirus: two new genera in the family Geminiviridae. Arch Virol 162:1819–1831. 10.1007/s00705-017-3268-6. [DOI] [PubMed] [Google Scholar]
- 99.Koonin EV, Ilyina TV. 1992. Geminivirus replication proteins are related to prokaryotic plasmid rolling circle DNA replication initiator proteins. J Gen Virol 73:2763–2766. 10.1099/0022-1317-73-10-2763. [DOI] [PubMed] [Google Scholar]
- 100.Krupovic M, Ravantti JJ, Bamford DH. 2009. Geminiviruses: a tale of a plasmid becoming a virus. BMC Evol Biol 9:112. 10.1186/1471-2148-9-112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Kazlauskas D, Varsani A, Koonin EV, Krupovic M. 2019. Multiple origins of prokaryotic and eukaryotic single-stranded DNA viruses from bacterial and archaeal plasmids. Nat Commun 10:3425. 10.1038/s41467-019-11433-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Koonin EV, Dolja VV, Krupovic M, Varsani A, Wolf YI, Yutin N, Zerbini FM, Kuhn JH. 2020. Global organization and proposed megataxonomy of the virus world. Microbiol Mol Biol Rev 84:e00061-19. 10.1128/MMBR.00061-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Gibbs AJ, Hajizadeh M, Ohshima K, Jones RAC. 2020. The potyviruses: an evolutionary synthesis is emerging. Viruses 12:132. 10.3390/v12020132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Lukashev AN. 2010. Recombination among picornaviruses. Rev Med Virol 20:327–337. 10.1002/rmv.660. [DOI] [PubMed] [Google Scholar]
- 105.Simmonds P, Adams MJ, Benkő M, Breitbart M, Brister JR, Carstens EB, Davison AJ, Delwart E, Gorbalenya AE, Harrach B, Hull R, King AMQ, Koonin EV, Krupovic M, Kuhn JH, Lefkowitz EJ, Nibert ML, Orton R, Roossinck MJ, Sabanadzovic S, Sullivan MB, Suttle CA, Tesh RB, van der Vlugt RA, Varsani A, Zerbini FM. 2017. Virus taxonomy in the age of metagenomics. Nat Rev Microbiol 15:161–168. 10.1038/nrmicro.2016.177. [DOI] [PubMed] [Google Scholar]
- 106.Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. 10.1093/nar/gkh340. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol 35:1547–1549. 10.1093/molbev/msy096. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Larsson A. 2014. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30:3276–3278. 10.1093/bioinformatics/btu531. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Muhire BM, Varsani A, Martin DP. 2014. SDT: a virus classification tool based on pairwise sequence alignment and identity calculation. PLoS One 9:e108277. 10.1371/journal.pone.0108277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Bryant D, Moulton V. 2004. Neighbor-Net: an agglomerative method for the construction of phylogenetic networks. Mol Biol Evol 21:255–265. 10.1093/molbev/msh018. [DOI] [PubMed] [Google Scholar]
- 111.Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23:254–267. 10.1093/molbev/msj030. [DOI] [PubMed] [Google Scholar]
- 112.Darriba D, Taboada GL, Doallo R, Posada D. 2012. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods 9:772. 10.1038/nmeth.2109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Martin D, Rybicki E. 2000. RDP: detection of recombination amongst aligned sequences. Bioinformatics 16:562–563. 10.1093/bioinformatics/16.6.562. [DOI] [PubMed] [Google Scholar]
- 114.Salminen MO, Carr JK, Burke DS, McCutchan FE. 1995. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res Hum Retroviruses 11:1423–1425. 10.1089/aid.1995.11.1423. [DOI] [PubMed] [Google Scholar]
- 115.Smith JM. 1992. Analyzing the mosaic structure of genes. J Mol Evol 34:126–129. 10.1007/BF00182389. [DOI] [PubMed] [Google Scholar]
- 116.Posada D, Crandall KA. 2001. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci U S A 98:13757–13762. 10.1073/pnas.241370698. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Gibbs MJ, Armstrong JS, Gibbs AJ. 2000. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16:573–582. 10.1093/bioinformatics/16.7.573. [DOI] [PubMed] [Google Scholar]
- 118.Boni MF, Posada D, Feldman MW. 2007. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics 176:1035–1047. 10.1534/genetics.106.068874. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. 2015. RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol 1:vev003. 10.1093/ve/vev003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Lole KS, Bollinger RC, Paranjape RS, Gadkari D, Kulkarni SS, Novak NG, Ingersoll R, Sheppard HW, Ray SC. 1999. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol 73:152–160. 10.1128/JVI.73.1.152-160.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Hudson RR. 2001. Two-locus sampling distributions and their application. Genetics 159:1805–1817. 10.1093/genetics/159.4.1805. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Nei M, Li WH. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci U S A 76:5269–5273. 10.1073/pnas.76.10.5269. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Rozas J, Ferrer-Mata A, Sanchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sanchez-Gracia A. 2017. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol Biol Evol 34:3299–3302. 10.1093/molbev/msx248. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
As noted above, the multiple sequence alignments, RDP4 results, and the generated LDhat likelihood lookup table have been deposited as Zenodo records. DNA-A alignment: 11-species DNA-A alignment (https://zenodo.org/record/4029589). DNA-B alignments: CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV (https://zenodo.org/record/3965023); ACMV and ACMBFV (https://zenodo.org/record/3964979); ICMV and SLCMV (https://zenodo.org/record/3964977). RDP4 results for DNA-A: ACMBFV, ACMV, CMMGV, EACMCV, EACMV, EACMKV, EACMMV, EACMZV, and SACMV (https://zenodo.org/record/4592854); ICMV and SLCMV (https://zenodo.org/record/4592926). RDP4 results for DNA-B: EACMV-like plus CMMGV (7 “species”) (https://zenodo.org/record/3965029); ACMV and ACMBFV (https://zenodo.org/record/3975834); ICMV and SLCMV (https://zenodo.org/record/3975838). LDhat: coalescent likelihood lookup table for 320 sequences with theta = 0.01 (https://zenodo.org/record/3934350).