Journal of Virology. 2024 May 16;98(6):e00513-24. doi: 10.1128/jvi.00513-24

The genomic and phylogenetic analysis of Marseillevirus cajuinensis raises questions about the evolution of Marseilleviridae lineages and their taxonomical organization

Bruna Luiza de Azevedo 1, Victória Fulgêncio Queiroz 1, Isabella Luiza Martins de Aquino 1, Talita Bastos Machado 1, Felipe Lopes de Assis 1, Erik Reis 1, João Pessoa Araújo Júnior 2, Leila Sabrina Ullmann 2, Philippe Colson 3,4,5, Gilbert Greub 6, Frank Aylward 7,8, Rodrigo Araújo Lima Rodrigues 1, Jônatas Santos Abrahão 1
Editor: Kristin N Parent 9
PMCID: PMC11237802  PMID: 38752754

ABSTRACT

Marseilleviruses (MsV) are a group of viruses that compose the Marseilleviridae family within the Nucleocytoviricota phylum. They have been found in different samples, mainly in freshwater. MsV are classically organized into five phylogenetic lineages (A/B/C/D/E), but the current taxonomy does not fully represent all the diversity of the MsV lineages. Here, we describe a novel strain isolated from a Brazilian saltwater sample named Marseillevirus cajuinensis. Based on genomics and phylogenetic analyses, M. cajuinensis exhibits a 380,653-bp genome that encodes 515 open reading frames. Additionally, M. cajuinensis encodes a transfer RNA, a feature that is rarely described for Marseilleviridae. Phylogeny suggests that M. cajuinensis forms a divergent branch within the MsV lineage A. Furthermore, our analysis suggests that the common ancestor for the five classical lineages of MsV diversified into three major groups. The organization of MsV into three main groups is reinforced by a comprehensive analysis of clusters of orthologous groups, sequence identities, and evolutionary distances considering several MsV isolates. Taken together, our results highlight the importance of discovering new viruses to expand the knowledge about known viruses that belong to the same lineages or families. This work proposes a new perspective on the Marseilleviridae lineages organization that could be helpful to a future update in the taxonomy of the Marseilleviridae family.

IMPORTANCE

Marseilleviridae is a family of viruses whose members were mostly isolated from freshwater samples. In this work, we describe the first Marseillevirus isolated from saltwater samples, which we called Marseillevirus cajuinensis. Most of M. cajuinensis genomic features are comparable to other Marseilleviridae members, such as its high number of unknown proteins. On the other hand, M. cajuinensis encodes a transfer RNA, which is a gene category involved in protein translation that is rarely described in this viral family. Additionally, our phylogenetic analyses suggested the existence of, at least, three major Marseilleviridae groups. These observations provide a new perspective on Marseilleviridae lineages organization, which will be valuable in future updates to the taxonomy of the family since the current official classification does not capture all the Marseilleviridae known diversity.

KEYWORDS: Marseillevirus, Marseilleviridae, lineages, phylogeny, taxonomy, evolution

INTRODUCTION

Some years after the discovery of mimivirus (1), the Marseilleviridae family became the second described group of large/giant viruses able to infect amoebae. The first Marseilleviridae isolate was obtained from water samples collected in a cooling tower in Paris, France (2). Typically, members of the Marseilleviridae family exhibit icosahedral particles with an average diameter of approximately 250 nm. The discovery of the first Marseillevirus (MsV) paved the way for the description of several new isolates. Amoebae are likely the hosts of marseilleviruses in the environment, and Acanthamoeba is the genus that has been used in the laboratory to isolate these viruses. Most isolates were obtained from freshwater samples, including Lausannevirus, Tunisvirus, Melbournevirus, Cannes 8 virus, Noumeavirus, Brazilian marseillevirus, and Tokyovirus (3–9). Samples from other environments have also been tested, leading to the discovery of new Marseilleviridae isolates. Golden marseillevirus and Insectomime virus, for example, were isolated from golden mussels and internal organs of insect larvae, respectively (10, 11). In addition to the isolates already described, metagenomic studies have reported five divergent Marseilleviridae sequences detected in sediments from a hydrothermal vent (Loki’s Castle) in the Atlantic Ocean (12). These discoveries in different types of environments, such as deep ocean regions, have expanded our understanding of the ecology of these viruses and of the locations where they can be isolated.

Marseilleviridae genomes are composed of circular double-stranded DNA molecules that range in size from 348 to 404 kbp (13). The G-C content ranges from 42.9% to 44.8%, and the number of predicted genes varies from 386 to 491 (13). Most of each Marseillevirus genome encodes uncharacterized proteins, which is a general characteristic of giant viruses. In addition, Marseilleviruses do not have the diversity of genes involved in protein translation observed in other giant viruses, such as those of the Mimiviridae family (14). Although genes encoding translation factors have been described, genes encoding transfer RNAs (tRNA) and aminoacyl-tRNA synthetases are uncommon in Marseillevirus genomes (13, 15). tRNAs are currently described only in Tokyovirus and in the Loki’s Castle metagenomic sequences (8, 12). Like other viruses of the Nucleocytoviricota phylum, Marseilleviruses have highly mosaic genomes, which means that they carry sets of genes of multiple origins (2).

The International Committee on Taxonomy of Viruses (ICTV) classifies all members of the Marseilleviridae family within the Nucleocytoviricota phylum, Pimascovirales order (16, 17). Currently, Marseilleviridae is composed of one genus (Marseillevirus), which includes two species: Marseillevirus marseillevirus and Senegalvirus marseillevirus. Two other species in the family, Lausannevirus and Tunisvirus, are not assigned to any genus (16). However, a new organization and nomenclature for Marseillevirus-related taxa was recently proposed and is awaiting approval for official publication by the ICTV. Phylogenetic analyses based on DNA polymerase proteins revealed the existence of five different Marseillevirus phylogenetic lineages, named A, B, C, D, and E. Lineage A is represented by the first isolate, Marseillevirus marseillevirus (2). Lineage B is represented by Lausannevirus, the second Marseilleviridae isolate (3). Lineages A and B have the largest number of isolates. Lineage C is represented by Tunisvirus and Insectomime virus (5, 17), while lineages D and E are composed of the Brazilian isolates Brazilian marseillevirus and Golden marseillevirus, respectively (9, 11). Taking this into account, the known diversity of Marseilleviridae members is not completely represented by the official taxonomy of the group. For example, the representatives of lineages D and E are not yet considered.

Almost 15 years after the discovery of the first Marseillevirus, it has become apparent that the isolation of new viruses is important to elucidate the evolutionary history of this family. In this study, we report the discovery of Marseillevirus cajuinensis, a new isolate obtained from a saltwater sample from the Northeast coast of Brazil. This discovery paved the way for a genomic and phylogenetic characterization that suggested the existence of at least three major consistent Marseilleviridae groups. Analyses involving Clusters of Orthologous Groups (COGs), relative evolutionary distance (RED), and average amino acid/nucleotide identity (AAI and ANI) reinforce this three-group organization and help to establish parameters for a future taxonomic organization of Marseilleviridae into three genera and different species. These findings provide a new perspective on Marseilleviridae lineages, which will be valuable in evolutionary studies and future updates to the taxonomy of the family.

RESULTS

Marseillevirus cajuinensis: a new amoebae-infecting virus isolated from saltwater

From a saltwater sample collected on the Northeast coast of Brazil, we identified a new viral isolate able to infect Acanthamoeba castellanii. Transmission electron microscopy (TEM) images of infected amoebae showed icosahedral particles similar in shape to members of Marseilleviridae (Fig. 1A). With TEM analysis, it was also possible to observe some general morphological aspects of the isolate’s replication cycle. For example, we observed viral factories (VF) with an electron-lucent aspect that occupied a large part of the cell cytoplasm (Fig. 1B, highlighted in pink). The images showed several viral particles in different maturation stages inside the VF (Fig. 1C, black arrow). Amorphous structures were also observed inside VFs (Fig. 1B through D), some of them horseshoe shaped (Fig. 1C and D, yellow arrows), resembling the crescent precursors of viral particles found in poxvirus and mimivirus viral factories (18, 19). We also observed giant vesicles harboring several viral particles at the end of the replication cycle. One of these giant vesicles reached a size of 5 µm (Fig. 1E, red arrow).

Fig 1.

Transmission electron microscopy images showing morphological characteristics of the new isolate inside Acanthamoeba castellanii cells. (A) Several icosahedral Marseillevirus-like particles. (B) Acanthamoeba castellanii cell containing a viral factory occupying a large portion of the cell. Viral factory boundaries are highlighted in pink. (C) A closer view of a portion of Fig. 1B, showing a VF containing particles in different maturation stages (black arrow) and horseshoe-shaped structures (yellow arrow). (D) Another amoeba cell containing a VF filled with amorphous and horseshoe-shaped structures (yellow arrows). (E) Giant vesicle (red arrow) harboring several viral particles in a final stage of the cycle.

Besides TEM, the viral sample was analyzed by genome sequencing for a more accurate identification of the virus. Next-generation sequencing generated 225,574 reads that were assembled into a single scaffold of 380,653 bp with 166× depth. When compared with the National Center for Biotechnology Information (NCBI) database, the genome sequence best matched a member of the Marseilleviridae family, with 79% average nucleotide identity to Tokyovirus. Since we could confirm that the isolate is a member of Marseilleviridae, we named it Marseillevirus cajuinensis.
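
The relationship between read count, genome length, and depth reported above can be checked with a back-of-envelope calculation. The sketch below assumes that every read contributes to the assembly and derives the mean read length implied by the reported numbers; that read length is not stated in the text, and the function names are illustrative only.

```python
def mean_depth(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average depth = total sequenced bases / genome length."""
    return n_reads * mean_read_len / genome_len


def implied_read_length(n_reads: int, depth: float, genome_len: int) -> float:
    """Invert the depth formula to get the mean read length it implies."""
    return depth * genome_len / n_reads


# Values reported in the text.
N_READS = 225_574
GENOME_LEN = 380_653
DEPTH = 166

# Derived value (an assumption-laden estimate, not a reported figure).
read_len = implied_read_length(N_READS, DEPTH, GENOME_LEN)
print(f"implied mean read length: {read_len:.0f} bp")
```

If some reads were discarded or did not map, the true mean read length would differ from this estimate.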

Marseillevirus cajuinensis genome

The circular double-stranded DNA molecule that composes the M. cajuinensis genome has a G-C content of 45.24%. A total of 515 open reading frames (ORFs) were predicted, encoding proteins with sizes ranging from 50 to 1,520 amino acids. GenBank sequence database searches suggested that 40 of the 515 proteins encoded by the M. cajuinensis genome have functions related to DNA replication, recombination, and repair (Fig. 2A). This category includes a chaperone, a DNA topoisomerase, different nucleases, helicases, histones, and the DNA polymerase protein, which is commonly used as a marker for phylogeny. Other notable categories include signal transduction regulation and miscellaneous (Fig. 2A). The former primarily comprises serine/threonine protein kinases, while the latter consists of proteins whose function cannot be reliably predicted due to the presence of non-specific domains and/or repeats, such as ankyrin repeat-containing proteins and zinc finger proteins. The M. cajuinensis genome also encodes proteins involved in different metabolic processes, such as lipases and proteases. Furthermore, the major capsid protein (MCP) and the A32-like packaging ATPase are important proteins found in the virion structure and morphogenesis category.
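
For readers unfamiliar with the G-C content metric, a minimal sketch of how it is typically computed from a nucleotide sequence follows. The toy fragment is hypothetical and is not taken from the M. cajuinensis genome; ignoring ambiguous bases is one common convention, assumed here.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


toy_fragment = "ATGCGCNNTTAAGGCC"  # hypothetical example sequence
print(f"G-C content: {gc_content(toy_fragment):.2f}%")
```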

Fig 2.

Genomic analysis of the Marseillevirus cajuinensis genome. (A) Number of M. cajuinensis proteins according to the function predicted during genome annotation. n = 515. (B) Comparison between the number of M. cajuinensis ORFans and their sizes when using different gene prediction tools. (C) Percentage of Marseillevirus cajuinensis proteins that presented best hit (BLASTp) with the five different Marseilleviridae lineages currently described (A–E). n = 515.

More than half (59%) of the 515 M. cajuinensis proteins were classified as uncharacterized (Fig. 2A). Another 38 proteins were considered ORFans since they had no hits with any other sequence in the databases used (Fig. 2A). Since several Marseilleviridae isolates have already been described, this could be considered a high number of ORFans for a new isolate. However, by analyzing the ORFans’ sizes, we observed that most of them (27/38) correspond to short polypeptides ranging from 50 to 100 amino acids (Fig. 2B). The apparently high number of ORFans and their short sizes were intriguing. To confirm these results, a new gene prediction was performed using a different program, Prodigal, instead of GeneMarkS. In this new analysis, Prodigal predicted 493 proteins (larger than 50 amino acids), 22 fewer than the 515 predicted by GeneMarkS. Interestingly, the number of predicted ORFans decreased considerably: instead of 38, the new prediction returned only 9 ORFans. When analyzing the size of these newly predicted ORFans, we noted that the number of ORFans larger than 150 amino acids increased, while the number of ORFans composed of fewer than 100 amino acids was considerably lower (Fig. 2B). Here, the predictions generated by GeneMarkS were selected for all the analyses in this work because this tool has been the most consistently used for Marseilleviruses in previous works (2, 4–6, 9, 10, 17, 20).
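
The size comparison described above amounts to binning ORFan lengths into size classes for each gene predictor. The following is a minimal sketch of that binning; the class boundaries (50–100, 101–150, >150 amino acids) are an assumption chosen to match the thresholds mentioned in the text, and the length list is hypothetical.

```python
from collections import Counter


def size_class(length_aa: int) -> str:
    """Assign a protein length (in amino acids) to an assumed size class."""
    if length_aa < 50:
        return "<50"  # below the prediction cutoff mentioned in the text
    if length_aa <= 100:
        return "50-100"
    if length_aa <= 150:
        return "101-150"
    return ">150"


def summarize(lengths: list[int]) -> Counter:
    """Count how many proteins fall into each size class."""
    return Counter(size_class(n) for n in lengths)


toy_lengths = [52, 75, 98, 100, 120, 151, 300]  # hypothetical ORFan lengths
print(summarize(toy_lengths))

# Share of short ORFans reported for the GeneMarkS prediction (27 of 38).
short_fraction = 27 / 38
print(f"short ORFans: {100 * short_fraction:.1f}%")
```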

To identify which organisms provided the best hits for M. cajuinensis proteins, we analyzed the results of the BLASTp searches hit by hit. For most (73.2%) M. cajuinensis proteins, the best hits in the GenBank database were Marseillevirus lineage A sequences (MsV A) (Fig. 2C). This suggests that M. cajuinensis could have a closer evolutionary relationship with lineage A, which can be confirmed by phylogeny. This best-hit analysis also showed one M. cajuinensis protein (P86) whose best hit was a sequence encoded by a bacterium (Fig. 2C). P86 was the only protein whose best hit was not a sequence from another member of the family Marseilleviridae.
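
A best-hit tally of this kind can be sketched as a simple count over a protein-to-origin mapping. The example below assumes the best hit of each protein has already been assigned a label (a lineage letter or another taxon); the protein names and labels are hypothetical and do not reproduce the data behind Fig. 2C.

```python
from collections import Counter


def best_hit_percentages(best_hits: dict[str, str]) -> dict[str, float]:
    """Percentage of proteins whose best hit falls in each labeled group."""
    counts = Counter(best_hits.values())
    total = len(best_hits)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical mapping: protein identifier -> origin of its best BLASTp hit.
toy_best_hits = {
    "prot1": "A",
    "prot2": "A",
    "prot3": "A",
    "prot4": "B",
    "prot5": "Bacteria",
}

for label, pct in sorted(best_hit_percentages(toy_best_hits).items()):
    print(f"{label}: {pct:.1f}%")
```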

Genome synteny analysis of different members of Marseilleviridae lineages highlighted the relatedness between M. cajuinensis and previously described lineage A members. The organization of the M. cajuinensis genome blocks, separated and colored according to similarity, is more similar to that of the representative member of lineage A than to that of viruses from other lineages (Fig. 3). This analysis also shows that all the genomes analyzed have a region that appears to be more conserved, corresponding to approximately the last third of the genome length (Fig. 3). This conserved region in Marseilleviridae genomes was described previously and called a “core region” (21). A more detailed synteny analysis containing different Marseillevirus isolates from the five lineages can be found in Supplementary Figure 1 at https://www.giantviruses.com/sup-material-of-papers/sup-material-the-genomic-and-phylogenetic-analysis-of-marseillevirus-cajuinensis-raises-questions-about-the-evolution-of-marseilleviridae-lineages-and-their-taxonomical-organization.

Fig 3.

Genome synteny analysis of Marseilleviridae isolates representing the five currently described lineages and Marseillevirus cajuinensis. Each line represents the sequence of a different virus, which is identified in the legend on the left. The letters A, B, C, D, and E indicate the respective phylogenetic lineages of each analyzed virus. Blocks of the same color indicate similar regions between sequences. The areas without any colored blocks represent regions exclusive to that virus, that is, which do not show similarity with other viruses used in the analysis. Note: As they have a circular topology, the sequences were adjusted to start from the MCP aiming to facilitate interpretation of this figure. Marseillevirus marseillevirus was used as the reference genome.

Marseillevirus cajuinensis’ translation-related genes and detection of tRNAs in different Marseilleviruses

The genomic analysis showed that M. cajuinensis encodes four different translation factors. No aminoacyl-tRNA synthetase genes were found. Additionally, a search for transfer RNA sequences in the M. cajuinensis genome was performed using two different programs (Aragorn and tRNAscan-SE). No tRNA sequence was found in the M. cajuinensis genome by tRNAscan-SE. However, Aragorn detected one tRNA sequence (tRNA-Gln-CTG) that has a 1,379-nucleotide intron.

Considering all the known giant/large amoeba-infecting viruses, tRNA encoding is not commonly described for Marseilleviridae isolates nor for other families from Pimascovirales. In contrast, tRNAs have already been described in groups phylogenetically related to the Pimascovirales order, such as cedratviruses and orpheoviruses (22). Because the two programs used to predict tRNAs gave different results, we searched for tRNAs in genomes from the five classical Marseilleviridae lineages that had complete sequences available in GenBank (March 2023). For this, we used both Aragorn and tRNAscan-SE. Aragorn allows its parameters to be changed to consider or ignore the presence of introns, and we performed the analysis under both conditions.

We detected tRNA sequences in Tokyovirus, Marseillevirus marseillevirus, Melbournevirus, Insectomime virus, Tunisvirus, and Golden marseillevirus using Aragorn with the parameter that allows intron detection. When intron detection was disabled, tRNAs were detected only in Tokyovirus, as the two sequences encoded by this virus do not have introns. Using tRNAscan-SE, tRNAs were also detected only in the Tokyovirus sequence. Interestingly, tRNAscan-SE detected three tRNA sequences in Tokyovirus, while Aragorn detected only two (see Supplementary Figure 2 at https://www.giantviruses.com/sup-material-of-papers/sup-material-the-genomic-and-phylogenetic-analysis-of-marseillevirus-cajuinensis-raises-questions-about-the-evolution-of-marseilleviridae-lineages-and-their-taxonomical-organization). Of all the tRNA sequences detected in the different Marseilleviruses, only two of the Tokyovirus tRNAs had been described previously (8). Although rarely described in Marseilleviruses of the five previously reported lineages, tRNAs have already been described in the sequences detected by metagenomics in samples from Loki’s Castle. Although genomes assembled from metagenomes should be considered with caution, one of these sequences, called LCMAC202, was reported to encode 26 types of tRNA (12).

Phylogeny of different Nucleocytoviricota conserved proteins raises questions about Marseilleviridae lineages organization

To better elucidate the evolutionary relationship of M. cajuinensis with other Marseilleviridae members, phylogenetic analyses were performed using protein sequences that are considered conserved in Nucleocytoviricota. This set of conserved proteins includes the DNA polymerase, the A32-like packaging ATPase, and the VLT3-like late transcription factor, which were used as markers to construct phylogenetic trees (Fig. 4). It is noteworthy that the topology within lineages or genera varies not only among Marseilleviruses but also among other amoebal viruses, depending on the gene analyzed. Virus evolution is modular, with each gene subject to its own combination of selective pressures.

Fig 4.

Marseilleviridae phylogeny using different conserved protein sequences. (A) Phylogeny based on DNA polymerase sequences. (B) Phylogenetic tree based on the A32-like packaging ATPase sequences. (C) Phylogeny based on the amino acid sequence of the VLT3-like late transcription factor sequences. (D) Concatenated sequence tree based on DNA polymerase, A32-like packaging ATPase, and VLT3-like late transcription factor sequences. Marseillevirus cajuinensis sequence is labeled in the tree with a pink disk. The trees were built using the maximum likelihood method, with statistical support based on 1,000 replicates (bootstrap). The best model, selected by IQtree (ModelFinder), for the trees was VT + F + I + G4 for (A), LG + G4 for (B), LG + I + G4 for (C), and VT + F + I + G4 for (D). The trees shown in A, B, and C were rooted on Iridoviridae branch as an outgroup. Concatenated tree was rooted at the midpoint. The tree scale bars represent the number of amino acid substitutions per site.

Across all phylogenies, M. cajuinensis groups together with sequences from Marseilleviruses of lineage A but represents a more divergent branch within this group (Fig. 4). Similar results were described for Tokyovirus within lineage A in previous works (8). It is noteworthy that M. cajuinensis and Tokyovirus cluster together in a separate branch in the VLT3-like tree, but this is not the case in all constructed trees, including the concatenated one. It is important to note that the divergence of M. cajuinensis within lineage A is comparable with the divergence that separates two different lineages (C and D). This raises questions about which criteria should be used to define a new lineage or a new genus within the Marseilleviridae family. For example, based on the divergence between the lineage C and D branches, Brazilian marseillevirus is currently considered a different lineage. By the same reasoning, M. cajuinensis, and even Tokyovirus, could also be considered new lineages.

In addition to this question, it is also possible to observe the clear divergence of the common ancestor of the five classical Marseilleviridae lineages into three major groups: one that groups the current members of lineage A, corresponding to the current genus Marseillevirus; another that groups members of the current lineages B-C-D; and finally Golden marseillevirus (lineage E) in a third group. Notably, lineage E is closer to the B-C-D branch than to the Marseillevirus genus (lineage A) but still presents a high divergence within its clade. Indeed, this same topology can be observed in a concatenated phylogenetic tree based on the three conserved sequences previously analyzed individually (Fig. 4D).

Marseillevirus cajuinensis expands the pangenome of Marseilleviridae isolates

To understand the impact of the M. cajuinensis isolation on the pangenome and core genome of Marseilleviridae isolates, we searched for Clusters of Orthologous Groups of proteins shared between complete genome sequences of the isolates available in GenBank. This made it possible to analyze the pangenome and core genome of isolated members of the Marseilleviridae family after including M. cajuinensis (Fig. 5), as well as the sharing of COGs between lineages. The pangenome of Marseilleviridae isolates increases from 598 to 1,626 COGs when 13 new isolates are added to the analysis (Fig. 5). The first inserted sequence was Golden marseillevirus because it is the most divergent member of the five lineages. As expected, the number of total COGs (pangenome) increases with each new virus added. This shows that the Marseilleviridae pangenome is still expanding and that the discovery of additional viruses is warranted and important, as it will consequently lead to the discovery of new genes.

Fig 5.

Analysis of the pangenome and core genome of Marseilleviridae isolates, including Marseillevirus cajuinensis. The curves indicate the variation in the number of Clusters of Orthologous Groups as new sequences are inserted in the analysis. White boxes at the bottom of the gray bars indicate the number of coding sequences (CDS) for each virus. Note: The number of CDS is based on the new gene prediction performed exclusively for this analysis.

On the other hand, the number of COGs shared by all the analyzed viruses (core genome) is 182 (Fig. 5). The graph analysis suggests that the core genome of Marseilleviridae isolates appears to have reached a plateau, suggesting that the gene content essential for the existence of these viruses is already relatively well defined, although the functions of most of these genes still need to be deciphered.
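
The pangenome and core genome curves can be understood as a running union and a running intersection of the COG sets of each genome, in the order in which genomes are added. The sketch below illustrates this with hypothetical genomes and COG identifiers; it does not reproduce the values in Fig. 5, and the clustering step that defines the COGs is assumed to have been done beforehand.

```python
def accumulation(genomes: list[tuple[str, set[str]]]) -> list[tuple[str, int, int]]:
    """Return (genome, pangenome size, core genome size) after each addition."""
    pan: set[str] = set()
    core: set[str] | None = None
    rows = []
    for name, cogs in genomes:
        pan |= cogs  # pangenome: every COG seen so far
        core = set(cogs) if core is None else core & cogs  # shared by all so far
        rows.append((name, len(pan), len(core)))
    return rows


# Hypothetical genomes, listed in the order of insertion.
toy_genomes = [
    ("virus_1", {"cogA", "cogB", "cogC"}),
    ("virus_2", {"cogA", "cogB", "cogD"}),
    ("virus_3", {"cogA", "cogE"}),
]

for name, pan_size, core_size in accumulation(toy_genomes):
    print(f"{name}: pangenome={pan_size}, core={core_size}")
```

The pangenome size can only grow or stay constant as genomes are added, while the core genome size can only shrink or stay constant, which is why a plateau in the latter is informative.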

A detailed analysis of COGs shared between different Marseilleviruses reinforces the organization of Marseilleviridae in three major groups

In addition to the pangenome and core genome analyses, our data indicate the number of COGs that are shared between the members of the five classical Marseilleviridae lineages (Fig. 6). This analysis shows that M. cajuinensis (Fig. 6A, VI, red arrow) has 65 singletons, that is, clusters of proteins that are found only in its sequence. Among the members of lineage A (Fig. 6A, red disks), M. cajuinensis (VI, red arrow) and Tokyovirus (I) have the greatest numbers of singletons. This reinforces the assumption that both viruses are the most divergent members of the lineage, corroborating the phylogeny results mentioned above. Altogether, the members of lineage A analyzed here share a total of 370 COGs that are unique to their lineage (Fig. 6A, red disks).

Fig 6.

The sharing of Clusters of Orthologous Groups among different viruses belonging to the five classical lineages of Marseilleviruses. (A) Network showing the distribution of COGs among viruses. White circles represent COGs. The colored circles represent the analyzed viruses, separated by color according to their respective phylogenetic lineages. Roman numerals individually identify each virus analyzed, while Arabic numbers indicate the number of COGs contained in each cluster. Note: The number of proteins obtained through GeneMarkS is indicated in parentheses for each virus. Marseillevirus cajuinensis is highlighted by a red arrow, Golden marseillevirus is highlighted by an orange arrow, Brazilian marseillevirus is highlighted by a gray arrow, and the sharing of COGs between lineages A, B, C, and D is highlighted by an asterisk. (B) Graph detailing the number of exclusive COGs shared between combinations of different classical Marseilleviridae lineages. The black bars correspond to combinations of lineages that are not represented in Fig. 6A, while the gray bars correspond to combinations already represented in Fig. 6A. Note: Some of the lineage combinations were not represented in Fig. 6A because they often overlap in the network, making the analysis difficult.

Among the other Marseilleviridae lineages analyzed, the four members of lineage B (Fig. 6A, blue disks) share 137 COGs that are absent in other lineages. Similarly, the lineage C members (Fig. 6A, purple disks) share 133 COGs that are exclusive to their lineage. On the other hand, the only known member of lineage D, Brazilian marseillevirus, has 52 singletons (Fig. 6A, gray arrow). This virus is described as composing its own lineage; however, in this analysis, it presents a smaller number of exclusive COGs than M. cajuinensis and Tokyovirus, which are considered members of the same phylogenetic lineage (lineage A). Golden marseillevirus, the only known representative member of lineage E, has 294 singletons (Fig. 6A, orange arrow). The number of exclusive COGs shared between the different lineages is detailed in Fig. 6B, which presents the sharing of COGs between lineages in a simpler way. Lineages A and E (A + E), for example, share only eight COGs. This number decreases when comparing the other lineages with lineage E. Lineages B and E (B + E) share only four COGs, while lineages C and E (C + E), and D and E (D + E) share only COGs. Together, lineages A, B, C, and D (A + B + C + D) share 89 COGs that are not found in lineage E (Fig. 6A, asterisk). Thus, these data reinforce the divergence of lineage E among Marseilleviridae and support its assignment to a distinct group of the family.
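
Counts of exclusive COGs per lineage combination, such as those plotted in Fig. 6B, can be obtained by grouping COGs according to the exact set of lineages in which they occur. The sketch below assumes each COG has already been mapped to the lineages that contain it; whether "contained in a lineage" means any member or all members is left to the caller, and the COG identifiers are hypothetical.

```python
from collections import Counter


def exclusive_sharing(cog_to_lineages: dict[str, set[str]]) -> Counter:
    """Count COGs by the exact combination of lineages that contain them."""
    return Counter("+".join(sorted(lineages)) for lineages in cog_to_lineages.values())


# Hypothetical COG-to-lineage assignments.
toy_cogs = {
    "cog1": {"A", "E"},
    "cog2": {"A", "E"},
    "cog3": {"A", "B", "C", "D"},
    "cog4": {"E"},
}

for combination, n in sorted(exclusive_sharing(toy_cogs).items()):
    print(f"{combination}: {n}")
```

Because each COG is counted under exactly one combination, a COG found in A, B, C, and D does not also contribute to the A + B count.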

Using the data obtained in COGs analysis described above, a hierarchical clustering phenophyletic tree was constructed based on the presence and absence of COGs in the analyzed sequences (Fig. 7). In this figure, it was possible to observe a topology very similar to what was observed in the phylogenetic trees described in this work. Representatives of lineage A are organized into a group (group I) that represents the current genus Marseillevirus and apart from representatives of lineages B, C, D (group II), and E (group III) (Fig. 7).
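
A presence/absence clustering of this kind can be sketched with a pairwise distance between COG profiles followed by agglomerative clustering. The example below uses the Jaccard distance and average linkage, both of which are assumptions made for illustration; the text does not state which distance or linkage method was used, and the profiles are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """1 - |intersection| / |union| of two COG presence sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(profiles: dict[str, set]) -> list[tuple[tuple, tuple, float]]:
    """Agglomerative clustering; returns merges as (left, right, distance)."""
    clusters = [(name,) for name in sorted(profiles)]

    def dist(c1: tuple, c2: tuple) -> float:
        # Average of all pairwise distances between members of the two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard_distance(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [tuple(sorted(c1 + c2))]
    return merges


# Hypothetical COG profiles (integers stand in for COG identifiers).
toy_profiles = {
    "a1": {1, 2, 3, 4},
    "a2": {1, 2, 3, 5},
    "b1": {1, 6, 7, 8},
    "b2": {1, 6, 7, 9},
    "e1": {10, 11, 12, 13},
}

for left, right, d in average_linkage(toy_profiles):
    print(f"merge {left} + {right} at distance {d:.3f}")
```

In this toy input, the genome that shares no COGs with the others joins last, which mirrors the kind of pattern that would place a highly divergent genome in its own group.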

Fig 7.

Hierarchical clustering tree considering the presence and absence of Clusters of Orthologous Groups in different viruses of Marseilleviridae family. Marseillevirus cajuinensis is labeled in the tree with a pink circle. Scale bar represents arbitrary values that express the evolutionary distance based on presence-absence of COGs.

Within lineage A, the viruses are organized into two subgroups: one composed of M. cajuinensis and Tokyovirus, and the other composed of the remaining lineage A viruses (Fig. 7). The same happens in group II, which contains three subgroups corresponding to the members of lineages B, C, and D (Fig. 7). These results reinforce what was observed in the phylogenetic and genomic analyses. They also reinforce the questions raised about the classical organization of Marseilleviridae viruses in five lineages. For example, if the three branches (lineages B, C, and D) that compose group II are represented by viruses originally classified into three different lineages, then Marseillevirus cajuinensis and Tokyovirus could arguably be classified in their own lineages. On the other hand, the different branches that compose the classical Marseilleviridae lineages could also be considered members of major groups.

Relative evolutionary distance and average amino acid/nucleotide identity analyses suggest the organization of Marseilleviridae isolates in three putative genera

To quantify all these observations and make them more consistent, an RED analysis was performed. Thus, we considered three groups for this analysis based on the three major clades observed in DNA polymerase phylogeny: group I (lineage A), group II (lineages B, C, D), and group III (lineage E) (Fig. 8A). DNA polymerase phylogeny was selected because it is the main phylogenetic marker for Nucleocytoviricota, but the same topology was observed for all phylogenies (Fig. 4) and for the hierarchical clustering tree as well (Fig. 7).

Fig 8.

(A) DNA polymerase phylogenetic tree illustrating the three-group proposal for Marseilleviridae. RED values are indicated for groups I and II, and the numbers are consistent with the genus level for taxonomy. Group III does not have a RED value because there is only one virus in lineage E. The tree was built using the maximum likelihood method, with statistical support based on 1,000 replicates (bootstrap). The best model, selected by IQtree (ModelFinder), for the tree was VT + F + I + G4. The tree was rooted on the Iridoviridae branch as an outgroup. The tree scale bar represents the number of amino acid substitutions per site. Note: MAGs, metagenome-assembled genomes. (B) Average nucleotide identity and average amino acid identity analysis of Marseilleviridae. Fourteen Marseilleviruses are grouped based on a similarity matrix composed of ANI (left heatmap) and AAI (right heatmap). The three viral groups are indicated. ANI <75% was set to zero. ANI values ranged from 77 to 100, and AAI values ranged from 53 to 100.

RED values vary from 0 to 1, and threshold values for different taxonomic levels in Nucleocytoviricota were defined previously (23). The previously reported RED values for genus ranged from 0.69 to 0.995 (23). The present analysis showed that group I (lineage A) had a RED value of 0.86, while group II (lineages B, C, and D) had a RED value of 0.83 (Fig. 8A). These numbers are consistent with values expected for the genus level (23). Because group III (lineage E) is composed of a single genome (Golden marseillevirus), RED analysis cannot be performed for it. However, when the Golden marseillevirus sequence is included in group II (lineages B, C, and D), the RED value decreases to 0.78. Although this value would still classify group II as a genus while including Golden marseillevirus, the lower RED value makes group II less consistent when compared with group I. Thus, excluding lineage E from group II and considering it as a separate group was the most consistent choice here.
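To make the RED scale concrete, the sketch below computes RED on a toy rooted tree following the usual definition: the root has RED 0, every tip has RED 1, and an internal node n with parent RED p receives p + (d/u)(1 − p), where d is the branch length from the parent to n and u is the mean distance from the parent to the tips descending from n. The tree, node names, and branch lengths are hypothetical, and the R package used in this study may differ in implementation details.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    name: str
    length: float = 0.0                      # branch length to the parent
    children: List["Node"] = field(default_factory=list)

def tip_distances(node: Node) -> List[float]:
    """Distances from `node` to each tip that descends from it."""
    if not node.children:
        return [0.0]
    out = []
    for child in node.children:
        out.extend(child.length + d for d in tip_distances(child))
    return out

def compute_red(root: Node) -> Dict[str, float]:
    """Assign RED to every node: root = 0, tips = 1, internal nodes in between."""
    red = {root.name: 0.0}
    stack = [root]
    while stack:
        parent = stack.pop()
        p = red[parent.name]
        for child in parent.children:
            if not child.children:
                red[child.name] = 1.0
            else:
                dists = tip_distances(child)
                # Mean distance from the parent to the tips below `child`.
                u = child.length + sum(dists) / len(dists)
                red[child.name] = p + (child.length / u) * (1.0 - p)
            stack.append(child)
    return red

# Hypothetical ultrametric tree: ((tip_a:1, tip_b:1)clade_x:1, tip_c:2)root
toy_tree = Node("root", children=[
    Node("clade_x", 1.0, [Node("tip_a", 1.0), Node("tip_b", 1.0)]),
    Node("tip_c", 2.0),
])
red = compute_red(toy_tree)
print(red["clade_x"])
```

In this toy tree the internal node sits halfway between the root and its tips, so its RED is 0.5; clades that diverged more recently relative to the root have values closer to 1.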

Additionally, sequences of the Marseillevirus isolates previously analyzed in this work were submitted to an average nucleotide identity analysis. The ANI analysis delineated three main groups of Marseilleviruses, considering an ANI cutoff >75% (Fig. 8B). This result corroborates our phylogenetic analyses and hierarchical clustering of COGs, revealing the existence of three distinct groups in Marseilleviridae, possibly corresponding to three distinct genera. Within groups, we can use pairwise ANI >95% to define viral species, as used for members of Imitervirales (24). In this case, we can define eight viral species: three belong to group I (one of them corresponding to Marseillevirus cajuinensis), four belong to group II, and only one, consisting of the Golden marseillevirus isolate, belongs to group III. An average amino acid identity analysis was also performed. The same three groups could be observed in the bidirectional AAI analysis, with AAI >65% (Fig. 8B). Moreover, in the AAI estimation, we clearly saw eight putative viral species, with AAI >95%.
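The way pairwise identity cutoffs translate into groups can be sketched with a simple single-linkage grouping: two genomes fall into the same group whenever a chain of pairwise values above the cutoff connects them. The genome labels and identity values below are invented, and single linkage is an assumption made for illustration; the text does not state how groups were derived from the heatmaps.

```python
def cluster_by_threshold(names, identity, cutoff):
    """Group names connected by pairwise identity values strictly above `cutoff`."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in identity.items():
        if value > cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical pairwise ANI values (%); pairs below 75% are recorded as 0.
names = ["A1", "A2", "B1", "B2", "E1"]
ani = {
    ("A1", "A2"): 96.0, ("B1", "B2"): 80.0,
    ("A1", "B1"): 0.0, ("A1", "B2"): 0.0, ("A2", "B1"): 0.0, ("A2", "B2"): 0.0,
    ("A1", "E1"): 0.0, ("A2", "E1"): 0.0, ("B1", "E1"): 0.0, ("B2", "E1"): 0.0,
}

genus_like = cluster_by_threshold(names, ani, 75.0)
species_like = cluster_by_threshold(names, ani, 95.0)
print(genus_like)
print(species_like)
```

With these toy values the 75% cutoff yields three groups and the 95% cutoff yields four, mirroring how a looser cutoff recovers broader groups and a stricter one splits them into putative species.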

DISCUSSION

In this work, we describe the isolation of M. cajuinensis from a saltwater sample from the Northeast coast of Brazil. Although members of the family Marseilleviridae have already been isolated from samples of different sources (2, 10, 11, 17, 25), most of them were obtained from freshwater or mud from rivers and lakes (2–4, 6–8). To our knowledge, this is the first time that a Marseilleviridae member has been isolated from ocean water, although they have already been detected in this type of environment through metagenomic analyses (12).

TEM images revealed that the replication cycle of M. cajuinensis shares characteristics with other members of the Marseilleviridae family, such as the presence of giant vesicles by the end of the cycle (26). These vesicles are important structures for the replication cycle, as the viruses can be released from the cell inside them, in which case they are referred to as “expelled vesicles” (27, 28). This mechanism is important to ensure more effective entry of particles into another amoeba to start a new cycle, corresponding to a sort of Trojan horse strategy. Alternatively, such structures may be related to the exocytosis of the viral progeny (26) and to their increased resistance to harsh environments while outside amoebae, waiting for new hosts. The presence of amorphic and horseshoe-like structures inside the viral factory (Fig. 1C and D) requires further investigation, and more detailed analyses of the replication cycle are needed to infer a biological function for these structures. It is known that endosomal membranes are recruited in the initial stages of viral factory formation by Marseilleviridae isolates. These recruited membranes are involved in the formation of the internal membranes that compose viral capsids (26). Therefore, it can be hypothesized that the M. cajuinensis VF structures have been recruited from some cellular component to perform a function that is not yet known.

After analyzing the genome of M. cajuinensis, we observed that its genome size, GC content, and number of genes are compatible with those described for other members of the Marseilleviridae family (2, 3, 5, 9, 11, 13). The functional categorization of proteins suggests that more than half of the M. cajuinensis genome encodes proteins with unknown functions. This is a very common characteristic among giant amoeba viruses and reinforces the need for new studies aimed at elucidating the functions of these proteins (1, 29, 30). Also, we detected 38 ORFans in a first gene prediction, but most of them have sizes that range from 50 to 100 amino acids. This raises doubt as to whether these sequences are real proteins or artifacts of the main gene prediction protocol used in this study. When using a second gene prediction tool, the number of ORFans decreased to 9. This considerable difference in results might be due to the lack of updates to prediction tools, which currently are mostly indicated for prokaryotes, eukaryotes, and viruses that have smaller and less complex genomes. The number of ORFs was also affected when different parameters and tools for prediction were used. In addition, we detected a gene that encodes a tRNA, which is not commonly described in members of the family Marseilleviridae (13, 14). By analyzing sequences from other Marseilleviridae members using different programs and parameters, we detected tRNAs in five Marseilleviruses that, to our knowledge, have not been previously described as encoding such sequences. The absence of tRNA detection might be attributed to the use of a single algorithm in most previous analyses. Our current analysis shows a difference in the sensitivity of tRNA detection, with ARAGORN being more sensitive than the tRNAscan-SE algorithm tested in parallel. This is likely due to the different tRNA search models and parameters used by each tool (31, 32). For example, ARAGORN does not depend on the specification of a taxonomic lineage as a parameter to achieve maximum search sensitivity, whereas tRNAscan-SE does (31). Thus, current gene prediction algorithms lack updates that consider the singularities of giant viruses. This highlights the importance of systematically using more than one algorithm for ORF and tRNA prediction, as they can complement each other and can stimulate deeper investigations.
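The recommendation to combine predictors can be illustrated with a short sketch that reconciles the calls of two tools and reports which predictions are shared and which are tool-specific. The coordinates below are invented, the variable names are placeholders rather than real tool outputs, and matching calls by strand and coordinate overlap is a simplifying assumption.

```python
def overlaps(call_1, call_2):
    """Two (start, end, strand) calls match if on the same strand and overlapping."""
    return (call_1[2] == call_2[2]
            and call_1[0] <= call_2[1]
            and call_2[0] <= call_1[1])

def compare_calls(calls_a, calls_b):
    """Split the predictions of tool A and tool B into shared and tool-specific."""
    shared = [a for a in calls_a if any(overlaps(a, b) for b in calls_b)]
    only_a = [a for a in calls_a if a not in shared]
    only_b = [b for b in calls_b if not any(overlaps(a, b) for a in calls_a)]
    return {"shared": shared, "only_a": only_a, "only_b": only_b}

# Hypothetical tRNA calls as (start, end, strand); not results from this study.
tool_a_calls = [(1200, 1272, "+"), (50310, 50383, "-")]
tool_b_calls = [(1198, 1272, "+")]

summary = compare_calls(tool_a_calls, tool_b_calls)
print(summary)
```

Calls found by only one tool are natural candidates for manual inspection, which is the kind of deeper investigation that a multi-tool approach can prompt.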

Comparisons between M. cajuinensis and members of the Marseillevirus genus, conducted through genomic and phylogenetic analyses, showed that they all group within lineage A. Despite this, M. cajuinensis and Tokyovirus form a divergent branch within this lineage. Phylogenetic analyses also showed that the common ancestor of the five classical Marseilleviridae lineages in fact diversified into three main branches, which we refer to as group I (lineage A), group II (lineages B, C, and D), and group III (lineage E). Based on parameters such as RED, AAI, and ANI, it is possible to suggest that these three groups could be considered three genera. Taxonomically, group I currently corresponds to the Marseillevirus genus, and group II corresponds to the recently proposed Losanna genus (which includes the L. lausannense and L. tunisiense species). Group III and some members of the other proposed groups (e.g., Brazilian marseillevirus) remain to be officially assigned taxonomically. Also, AAI and ANI analyses suggested the organization of Marseilleviridae isolates into eight species (see Supplementary Figure 3 at https://www.giantviruses.com/sup-material-of-papers/sup-material-the-genomic-and-phylogenetic-analysis-of-marseillevirus-cajuinensis-raises-questions-about-the-evolution-of-marseilleviridae-lineages-and-their-taxonomical-organization), according to the percentage of sequence identity between them. This could be helpful for classifying the isolates that have not yet been officially considered.

The analyses of COGs shared between different lineages highlight conserved and variable COGs in each lineage. An in-depth analysis to understand the function of each protein belonging to the clusters was not performed here, and most of these proteins might not have a known function. However, it is possible to hypothesize that conserved COGs might represent genes that are important to the lineage, possibly inherited from their common ancestor. Conversely, the proteins belonging to clusters that vary among lineages might have different origins. The COG presence/absence analysis complemented the phylogeny and reinforced both the greater divergence of Marseillevirus cajuinensis within lineage A and the organization of the classical Marseilleviridae lineages into three groups. However, it is worth mentioning that defining the number of Marseilleviridae lineages/genera is a big challenge and might be treacherous, as it depends on the methods and the viruses considered in the analysis. Taking that into account, it is clear that there is a need to continue efforts to obtain new isolates, as the Marseilleviridae pangenome is still open. The questions raised about the number of lineages within the Marseilleviridae family reflect the impact of new virus discovery on taxonomists’ perspectives, as these new strains add new information to the analyses, sometimes leading to different tree topologies. Such new topologies call for updating phylogenetic organization and taxonomy to ensure that genus and species taxonomic levels better reflect the real diversity in a given taxon, rather than being biased by sampling mostly specific habitats, such as freshwater. For this taxonomic update to happen, it is necessary to first establish the parameters needed to classify these viruses into new species or genera. Thus, this work represents a contribution to shape future updates in Marseilleviridae taxonomy.

MATERIALS AND METHODS

Viral isolation, multiplication, purification, and titration

The isolate was obtained through the collection of saltwater samples at Cajueiro da Praia city, located in Piaui state (Northeast coast of Brazil). The protocol was based on the inoculation of the collected samples on 96-well plates containing Acanthamoeba castellanii cells (33). The inoculated wells that presented cytopathic effects (i.e., rounding cells and cellular lysis) had their content collected and analyzed through transmission electron microscopy. After confirming the isolation, the virus was inoculated at a multiplicity of infection (MOI) of 0.01 in cell culture flasks containing 1.4 × 10⁷ Acanthamoeba castellanii cells and 35 mL of peptone-yeast extract-glucose (PYG) medium, supplemented with penicillin (100 U/mL; Cellofarm, Brazil), streptomycin (100 µg/mL; Sigma-Aldrich, USA), and amphotericin B (0.25 µg/mL; Cultilab, Brazil). The cells were incubated at 32°C. Non-infected cells maintained in the same conditions were used as control. When viral-induced cytopathic effects were observed, the flask’s content was collected. This content was filtered through 0.45 µm pores, and then it was ultracentrifuged (36,000 × g) in a 25% sucrose cushion for 2 hours. The pellet containing purified viral particles was homogenized in 300 µL of phosphate-buffered saline (PBS 1×). All the viral titers were obtained and calculated using the end-point method (34).
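The two calculations implied in this paragraph, the amount of virus needed for a given MOI and the 50% end point of a titration, can be sketched as follows. The end-point function follows the classical proportional-distance interpolation of the cited method (34); the well counts are invented, and the function names are placeholders rather than part of any published protocol.

```python
def infectious_units_needed(moi, n_cells):
    """Infectious units required to infect n_cells at the given MOI."""
    return moi * n_cells

def log10_endpoint(log10_dilutions, infected, totals):
    """log10 of the 50% end-point dilution.

    Dilutions must be ordered from most to least concentrated.
    """
    uninfected = [t - i for i, t in zip(infected, totals)]
    n = len(infected)
    # Wells infected at this dilution or at any more dilute one.
    cum_inf = [sum(infected[k:]) for k in range(n)]
    # Wells uninfected at this dilution or at any more concentrated one.
    cum_uninf = [sum(uninfected[:k + 1]) for k in range(n)]
    pct = [100.0 * ci / (ci + cu) for ci, cu in zip(cum_inf, cum_uninf)]
    for k in range(n - 1):
        if pct[k] >= 50.0 > pct[k + 1]:
            # Proportional distance between the two bracketing dilutions.
            prop = (pct[k] - 50.0) / (pct[k] - pct[k + 1])
            step = log10_dilutions[k + 1] - log10_dilutions[k]
            return log10_dilutions[k] + prop * step
    raise ValueError("50% end point is not bracketed by the tested dilutions")

# MOI of 0.01 on 1.4e7 cells, as stated in the text.
units = infectious_units_needed(0.01, 1.4e7)

# Hypothetical titration: 8 wells per dilution from 10^-1 to 10^-4.
endpoint = log10_endpoint([-1, -2, -3, -4], [8, 6, 2, 0], [8, 8, 8, 8])
print(units, endpoint)
```

With these toy counts the end point falls at a dilution of 10^-2.5, that is, a titer of 10^2.5 50% infectious doses per inoculated volume.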

Transmission electron microscopy

To analyze the morphology of isolated viral particles, the samples were prepared for TEM. First, 7 × 10⁶ A. castellanii cells, cultured in 25 mL of PYG medium, were inoculated with the virus at an MOI of 0.01. Once cytopathic effects were observed, we performed two consecutive washes with 0.1 M sodium phosphate buffer, and we subsequently fixed the cells for 2 hours at room temperature under rotation in an orbital mixer. The fixation solution consisted of 0.1 M sodium phosphate buffer and 2.5% glutaraldehyde. Following this initial fixation step, the cells underwent a secondary fixation with 2% osmium tetroxide before being embedded in Epon resin. This resin allowed ultramicrotomy, and the 60-nm-thick sections were then examined using a transmission electron microscope (Spirit Biotwin FEI-120 kV) at the Center of Microscopy of the Federal University of Minas Gerais (CM-UFMG).

Sequencing, assembly, and annotation

The purified virus was sequenced using an Illumina MiSeq instrument with a paired-end library prepared using the Illumina DNA Prep Kit (Illumina Inc., San Diego, CA, USA). The FastQC program was used for quality control of the obtained reads, and the per base sequence quality was considered satisfactory (phred >28). The reads were trimmed using the Trimmomatic tool (35). De novo genome assembly was performed using the SPAdes 3.12 program with default parameters (36, 37). The obtained scaffold was compared with sequences from the NCBI database, using BLASTn (database: nr/nt; expect threshold: 10⁻³). Open reading frames were predicted with the GeneMarkS tool and Prodigal (38, 39), considering only proteins that were longer than 50 amino acids. Additionally, tRNA-coding sequences (CDSs) were predicted using ARAGORN (parameters - type: tRNA; allow intron: yes and no (alternately); topology: circular; strand: both) and tRNAscan-SE (parameters - sequence source: general tRNA model; search mode: default; genetic code for tRNA isotype prediction: universal) (31, 32). ORFs were annotated using BLASTp (expect threshold: 10⁻³) against the NCBI non-redundant protein sequence (nr) database to search for similar sequences. The functional categorization of predicted proteins was carried out based on the Nucleo-Cytoplasmic Virus Orthologous Groups (40, 41).
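The length criterion applied to predicted proteins can be sketched as a small filter over a protein FASTA file. The parser, the function names, and the toy sequences below are illustrative assumptions and do not reproduce the output format of any specific gene predictor.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of identifier -> sequence."""
    records, name, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records

def filter_by_length(records, min_exclusive=50):
    """Keep proteins longer than `min_exclusive` residues (stop symbol ignored)."""
    return {n: s for n, s in records.items()
            if len(s.rstrip("*")) > min_exclusive}

# Hypothetical predictions: one 120-residue and one 40-residue protein.
toy_fasta = ">orf_1\n" + "M" + "A" * 119 + "\n>orf_2\n" + "M" + "K" * 39 + "\n"
kept = filter_by_length(read_fasta(toy_fasta))
print(sorted(kept))
```

Only the longer toy protein passes the filter; raising or lowering the cutoff changes the ORF count, which is one reason counts differ between prediction protocols.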

Synteny and phylogenetic analysis

To perform synteny analyses, genome sequences of different MsV isolates were obtained from the NCBI GenBank database. Only complete genomes from isolated Marseilleviruses (excluding those built from metagenomic data) that were available in GenBank as of March 2023 were selected. As they have a circular topology, the sequences were manually curated to start from the major capsid protein gene to facilitate image interpretation. After curating the sequences, synteny analysis was performed using the MAUVE program, with default parameters (42). The following genome sequences were retrieved from GenBank and then analyzed: Tokyovirus (NC_030230.1); Marseillevirus marseillevirus (GU071086.1); Cannes 8 virus (KF261120.1); Marseillevirus Shanghai (MG827395.1); Melbournevirus (KM275475.1); Kurlavirus (KY073338.1); Lausannevirus (HQ113105.1); Noumeavirus (KX066233.1); Port-miou virus (KT428292.1); Insectomime (HG428764.1); Tunisvirus (KF483846.1); Brazilian MsV (KT752522.1); Golden MsV (KT835053.1).
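Re-anchoring a circular genome at a chosen gene amounts to rotating the sequence. A minimal sketch is shown below, assuming the start coordinate of the anchor gene is already known from the annotation; the sequence and coordinate are invented, and the curation in this study was done manually.

```python
def rotate_circular(sequence, new_start):
    """Rotate a circular sequence so that 0-based index new_start becomes position 0."""
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must fall inside the sequence")
    return sequence[new_start:] + sequence[:new_start]

# Hypothetical 12-bp genome with the anchor gene starting at index 6.
genome = "AAACCCGGGTTT"
anchor_start = 6
rotated = rotate_circular(genome, anchor_start)
print(rotated)
```

The rotation preserves length and base composition. If the anchor gene lies on the reverse strand, the sequence would additionally need to be reverse-complemented, which this sketch does not handle.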

Phylogenetic trees were constructed using the IQtree software (version 1.6.12) with 1,000 bootstrap replicates as branch support (43). To prepare the data sets for alignment, a search for similar sequences was performed using the NCBI non-redundant protein sequences (nr) database and BLASTp with an expect threshold of 10⁻³. Sequence alignment was performed using the MUSCLE algorithm (44). The best-fit substitution models were determined using the ModelFinder algorithm within IQtree (45). Finally, the resulting phylogenetic trees were visualized and edited using MEGA X software (46).

Relative evolutionary distance analyses were performed using phylogenies constructed according to the parameters mentioned above. RED values were calculated using the R package “castor” (47), and the thresholds for taxonomic levels were defined as described in the Results section, based on previous work (23).

Pangenome and COGs analysis

All complete MsV sequences that were previously obtained for synteny analyses from GenBank were also subjected to a new gene prediction using GeneMarkS (38). The amino acid sequences of each predicted CDS were analyzed using the ProteinOrtho software (parameters - selfblast, identity: 30%, coverage: 50%, and e-value of 10⁻⁵) (48). The output files generated by ProteinOrtho were used to analyze the pangenome and core genome of isolated MsV. Also, output files of orthologous proteins were used to compare the number of Clusters of Orthologous Groups that are shared between the studied viruses and to construct a hierarchical clustering based on the presence and absence of COGs in different MsV. To analyze the sharing of COGs between different MsV, a network representation was constructed using the Gephi 0.10 software. For this, data obtained in the ProteinOrtho analysis were used to create spreadsheets containing the “nodes” (viruses and COGs) and the “edges” (presence of COGs in each virus). The network representation was built using an algorithm based on attraction and repulsion forces (Force Atlas). To perform the COG presence and absence analysis, a binary file was generated, and a phenetic tree was created in the MultiExperiment Viewer program, version 4.9.0, using the hierarchical clustering algorithm and the Pearson correlation as distance metric (49).
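The steps described above (pangenome and core genome counts, a binary presence/absence table, and node/edge lists for a network) can be sketched from a table of orthologous groups. The group identifiers and virus labels are invented, and counting every group toward the pangenome is an assumption, since the treatment of singletons depends on how the orthology output is configured.

```python
def summarize_groups(groups, viruses):
    """Summarize orthologous groups.

    groups: dict mapping a group id to the set of viruses holding a member.
    Returns pangenome size, core genome size, binary matrix, and edge list.
    """
    group_ids = sorted(groups)
    pan_size = len(group_ids)
    # Core groups are those present in every virus analyzed.
    core_size = sum(1 for g in group_ids if set(viruses) <= groups[g])
    # One row per virus, one column per group (1 = present, 0 = absent).
    matrix = {v: [1 if v in groups[g] else 0 for g in group_ids] for v in viruses}
    # One edge per (virus, group) presence, suitable for a bipartite network.
    edges = [(v, g) for g in group_ids for v in sorted(groups[g])]
    return pan_size, core_size, matrix, edges

# Hypothetical orthologous groups for three viruses.
viruses = ["v1", "v2", "v3"]
groups = {
    "cog_a": {"v1", "v2", "v3"},
    "cog_b": {"v1", "v2"},
    "cog_c": {"v3"},
}
pan_size, core_size, matrix, edges = summarize_groups(groups, viruses)
print(pan_size, core_size, len(edges))
```

The binary matrix is the kind of input a hierarchical clustering would take, while the edge list (together with the virus and group names as nodes) is the kind of table a network tool would import.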

Average nucleotide and amino acid identities

Whole-genome average nucleotide identity analysis was performed using FastANI (50) implemented on the Galaxy server (https://usegalaxy.eu/), on the complete genomes of 14 Marseilleviruses obtained from the NCBI GenBank database. ANI <75% was considered 0. Average amino acid identity was calculated using reciprocal best hits (two-way AAI) between the protein data sets of two Marseilleviruses, considering an identity cutoff of 20%. AAI was estimated using the AAI calculator (http://enve-omics.ce.gatech.edu/aai/).
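The idea behind a two-way AAI can be sketched as follows: best hits are taken in each direction, only reciprocal pairs are kept, pairs below the identity cutoff are discarded, and the remaining identities are averaged. The hit tables below are invented, ranking hits by bit score and averaging the two directions are assumptions, and the online calculator used in this study may apply additional filters.

```python
def best_hits(hits):
    """Keep the highest-scoring hit per query.

    hits: iterable of (query, subject, percent_identity, bit_score).
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best

def two_way_aai(hits_ab, hits_ba, min_identity=20.0):
    """Mean identity over reciprocal best hits at or above min_identity."""
    best_ab, best_ba = best_hits(hits_ab), best_hits(hits_ba)
    identities = []
    for a, (b, ident_ab, _) in best_ab.items():
        back = best_ba.get(b)
        if back is not None and back[0] == a:
            mean_ident = (ident_ab + back[1]) / 2.0
            if mean_ident >= min_identity:
                identities.append(mean_ident)
    if not identities:
        return None
    return sum(identities) / len(identities)

# Hypothetical hits between proteomes A and B.
hits_ab = [("a1", "b1", 90.0, 200), ("a1", "b2", 40.0, 50),
           ("a2", "b2", 60.0, 120), ("a3", "b3", 15.0, 20)]
hits_ba = [("b1", "a1", 90.0, 200), ("b2", "a2", 60.0, 120),
           ("b3", "a3", 15.0, 20)]

aai = two_way_aai(hits_ab, hits_ba)
print(aai)
```

In this toy case two reciprocal pairs pass the 20% cutoff and the third is discarded, so the estimate is the mean of 90% and 60%.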

ACKNOWLEDGMENTS

We would like to thank all colleagues from Grupo de Estudo e Prospecção de Vírus Gigantes (GEPVIG) and from Laboratório de Vírus of Universidade Federal de Minas Gerais (UFMG). Also, we thank Centro de Microscopia of UFMG for the contribution of microscopy images.

We thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), and FAPEMIG (Fundação de Amparo à Pesquisa do estado de Minas Gerais) for research funding.

This research is registered at SISGEN and SISBIO.

J.S.A., J.P.A.J., and R.A.L.R. are CNPq researchers.

P.C., G.G., and J.S.A. are contributors of the ICTV Marseillevirus study group (2016-2023).

Contributor Information

Jônatas Santos Abrahão, Email: jonatas.abrahao@gmail.com.

Kristin N. Parent, Michigan State University, East Lansing, Michigan, USA

DATA AVAILABILITY

The M. cajuinensis genome sequence is available in GenBank under accession number OR991738.

REFERENCES

1. Scola BL, Audic S, Robert C, Jungang L, de Lamballerie X, Drancourt M, Birtles R, Claverie J-M, Raoult D. 2003. A giant virus in amoebae. Science 299:2033–2033. doi: 10.1126/science.1081867
2. Boyer M, Yutin N, Pagnier I, Barrassi L, Fournous G, Espinosa L, Robert C, Azza S, Sun S, Rossmann MG, Suzan-Monti M, La Scola B, Koonin EV, Raoult D. 2009. Giant Marseillevirus highlights the role of amoebae as a melting pot in emergence of chimeric microorganisms. Proc Natl Acad Sci U S A 106:21848–21853. doi: 10.1073/pnas.0911354106
3. Thomas V, Bertelli C, Collyn F, Casson N, Telenti A, Goesmann A, Croxatto A, Greub G. 2011. Lausannevirus, a giant amoebal virus encoding histone doublets. Environ Microbiol 13:1454–1466. doi: 10.1111/j.1462-2920.2011.02446.x
4. Aherfi S, Pagnier I, Fournous G, Raoult D, La Scola B, Colson P. 2013. Complete genome sequence of Cannes 8 virus, a new member of the proposed family “Marseilleviridae”. Virus Genes 47:550–555. doi: 10.1007/s11262-013-0965-4
5. Aherfi S, Boughalmi M, Pagnier I, Fournous G, La Scola B, Raoult D, Colson P. 2014. Complete genome sequence of Tunisvirus, a new member of the proposed family Marseilleviridae. Arch Virol 159:2349–2358. doi: 10.1007/s00705-014-2023-5
6. Doutre G, Philippe N, Abergel C, Claverie J-M. 2014. Genome analysis of the first Marseilleviridae representative from Australia indicates that most of its genes contribute to virus fitness. J Virol 88:14340–14349. doi: 10.1128/JVI.02414-14
7. Fabre E, Jeudy S, Santini S, Legendre M, Trauchessec M, Couté Y, Claverie J-M, Abergel C. 2017. Noumeavirus replication relies on a transient remote control of the host nucleus. Nat Commun 8:15087. doi: 10.1038/ncomms15087
8. Takemura M. 2016. Morphological and taxonomic properties of Tokyovirus, the first Marseilleviridae member isolated from Japan. Microbes Environ 31:442–448. doi: 10.1264/jsme2.ME16107
9. Dornas FP, Assis FL, Aherfi S, Arantes T, Abrahão JS, Colson P, La Scola B. 2016. A Brazilian marseillevirus is the founding member of a lineage in family Marseilleviridae. Viruses 8:76. doi: 10.3390/v8030076
10. Colson P, Fancello L, Gimenez G, Armougom F, Desnues C, Fournous G, Yoosuf N, Million M, La Scola B, Raoult D. 2013. Evidence of the megavirome in humans. J Clin Virol 57:191–200. doi: 10.1016/j.jcv.2013.03.018
11. Dos Santos RN, Campos FS, Medeiros de Albuquerque NR, Finoketti F, Côrrea RA, Cano-Ortiz L, Assis FL, Arantes TS, Roehe PM, Franco AC. 2016. A new marseillevirus isolated in Southern Brazil from Limnoperna fortunei. Sci Rep 6:35237. doi: 10.1038/srep35237
12. Bäckström D, Yutin N, Jørgensen SL, Dharamshi J, Homa F, Zaremba-Niedwiedzka K, Spang A, Wolf YI, Koonin EV, Ettema TJG. 2019. Virus genomes from deep sea sediments expand the ocean megavirome and support independent origins of viral gigantism. mBio 10:e02497-18. doi: 10.1128/mBio.02497-18
13. Sahmi-Bounsiar D, Rolland C, Aherfi S, Boudjemaa H, Levasseur A, La Scola B, Colson P. 2021. Marseilleviruses: an update in 2021. Front Microbiol 12:648731. doi: 10.3389/fmicb.2021.648731
14. Rodrigues RAL, da Silva LCF, Abrahão JS. 2020. Translating the language of giants: translation-related genes as a major contribution of giant viruses to the virosphere. Arch Virol 165:1267–1278. doi: 10.1007/s00705-020-04626-2
15. Abrahão JS, Araújo R, Colson P, La Scola B. 2017. The analysis of translation-related gene set boosts debates around origin and evolution of mimiviruses. PLoS Genet 13:e1006532. doi: 10.1371/journal.pgen.1006532
16. Current ICTV taxonomy release. ICTV. Available from: https://ictv.global/taxonomy. Retrieved 01 Oct 2022.
17. Boughalmi M, Pagnier I, Aherfi S, Colson P, Raoult D, La Scola B. 2013. First isolation of a Marseillevirus in the Diptera Syrphidae Eristalis tenax. Intervirology 56:386–394. doi: 10.1159/000354560
18. Andrade A, Rodrigues RAL, Oliveira GP, Andrade KR, Bonjardim CA, La Scola B, Kroon EG, Abrahão JS. 2017. Filling knowledge gaps for mimivirus entry, uncoating, and morphogenesis. J Virol 91:e01335-17. doi: 10.1128/JVI.01335-17
19. Maruri-Avidal L, Weisberg AS, Moss B. 2013. Association of the vaccinia virus A11 protein with the endoplasmic reticulum and crescent precursors of immature virions. J Virol 87:10195–10206. doi: 10.1128/JVI.01601-13
20. Chatterjee A, Kondabagil K. 2017. Complete genome sequence of Kurlavirus, a novel member of the family Marseilleviridae isolated in Mumbai, India. Arch Virol 162:3243–3245. doi: 10.1007/s00705-017-3469-z
21. Blanca L, Christo-Foroux E, Rigou S, Legendre M. 2020. Comparative analysis of the circular and highly asymmetrical Marseilleviridae genomes. Viruses 12:1270. doi: 10.3390/v12111270
22. Queiroz VF, Carvalho J, de Souza FG, Lima MT, Santos JD, Rocha KLS, de Oliveira DB, Araújo JP, Ullmann LS, Rodrigues RAL, Abrahão JS. 2023. Analysis of the genomic features and evolutionary history of pithovirus-like isolates reveals two major divergent groups of viruses. J Virol 97:e0041123. doi: 10.1128/jvi.00411-23
23. Aylward FO, Moniruzzaman M, Ha AD, Koonin EV. 2021. A phylogenomic framework for charting the diversity and evolution of giant viruses. PLOS Biol 19:e3001430. doi: 10.1371/journal.pbio.3001430
24. Aylward FO, Abrahão JS, Brussaard CPD, Fischer MG, Moniruzzaman M, Ogata H, Suttle CA. 2023. Taxonomic update for giant viruses in the order Imitervirales (phylum Nucleocytoviricota). Arch Virol 168:283. doi: 10.1007/s00705-023-05906-3
25. Doutre G, Arfib B, Rochette P, Claverie J-M, Bonin P, Abergel C. 2015. Complete genome sequence of a new member of the Marseilleviridae recovered from the brackish submarine spring in the Cassis Port-Miou Calanque, France. Genome Announc 3:e01148-15. doi: 10.1128/genomeA.01148-15
26. Arantes TS, Rodrigues RAL, Dos Santos Silva LK, Oliveira GP, de Souza HL, Khalil JYB, de Oliveira DB, Torres AA, da Silva LL, Colson P, Kroon EG, da Fonseca FG, Bonjardim CA, La Scola B, Abrahão JS. 2016. The large Marseillevirus explores different entry pathways by forming giant infectious vesicles. J Virol 90:5246–5255. doi: 10.1128/JVI.00177-16
27. Greub G, Raoult D. 2004. Microorganisms resistant to free-living amoebae. Clin Microbiol Rev 17:413–433. doi: 10.1128/CMR.17.2.413-433.2004
28. Greub G, Raoult D. 2002. Crescent bodies of Parachlamydia acanthamoeba and its life cycle within Acanthamoeba polyphaga: an electron micrograph study. Appl Environ Microbiol 68:3076–3084. doi: 10.1128/AEM.68.6.3076-3084.2002
29. Boratto PVM, Oliveira GP, Machado TB, Andrade A, Baudoin J-P, Klose T, Schulz F, Azza S, Decloquement P, Chabrière E, Colson P, Levasseur A, La Scola B, Abrahão JS. 2020. Yaravirus: a novel 80-nm virus infecting Acanthamoeba castellanii. Proc Natl Acad Sci U S A 117:16579–16586. doi: 10.1073/pnas.2001637117
30. Legendre M, Bartoli J, Shmakova L, Jeudy S, Labadie K, Adrait A, Lescot M, Poirot O, Bertaux L, Bruley C, Couté Y, Rivkina E, Abergel C, Claverie J-M. 2014. Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology. Proc Natl Acad Sci U S A 111:4274–4279. doi: 10.1073/pnas.1320670111
31. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152
32. Chan PP, Lowe TM. 2019. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol Biol 1962:1–14. doi: 10.1007/978-1-4939-9173-0_1
33. Machado TB, de Aquino ILM, Abrahão JS. 2022. Isolation of giant viruses of Acanthamoeba castellanii. Curr Protoc 2:e455. doi: 10.1002/cpz1.455
34. Reed LJ, Muench H. 1938. A simple method of estimating fifty per cent endpoints. Am J Epidemiol 27:493–497. doi: 10.1093/oxfordjournals.aje.a118408
35. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170
36. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021
37. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. 2020. Using spades de novo assembler. Curr Protoc Bioinformatics 70:e102. doi: 10.1002/cpbi.102
38. Besemer J, Borodovsky M. 2005. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res 33:W451–W454. doi: 10.1093/nar/gki487
39. Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi: 10.1186/1471-2105-11-119
40. Yutin N, Wolf YI, Raoult D, Koonin EV. 2009. Eukaryotic large nucleo-cytoplasmic DNA viruses: clusters of orthologous genes and reconstruction of viral genome evolution. Virol J 6:223. doi: 10.1186/1743-422X-6-223
41. Rodrigues RAL, Queiroz VF, Ghosh J, Dunigan DD, Van Etten JL. 2022. Functional genomic analyses reveal an open pan-genome for the chloroviruses and a potential for genetic innovation in new isolates. J Virol 96:e0136721. doi: 10.1128/JVI.01367-21
42. Darling ACE, Mau B, Blattner FR, Perna NT. 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res 14:1394–1403. doi: 10.1101/gr.2289704
43. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300
44. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. doi: 10.1093/nar/gkh340
45. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589. doi: 10.1038/nmeth.4285
46. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol 35:1547–1549. doi: 10.1093/molbev/msy096
47. Louca S, Doebeli M. 2018. Efficient comparative phylogenetics on large trees. Bioinformatics 34:1053–1055. doi: 10.1093/bioinformatics/btx701
48. Lechner M, Findeiss S, Steiner L, Marz M, Stadler PF, Prohaska SJ. 2011. Proteinortho: detection of (Co-)orthologs in large-scale analysis. BMC Bioinformatics 12:124. doi: 10.1186/1471-2105-12-124
49. Howe E, Holton K, Nair S, Schlauch D, Sinha R, Quackenbush J. 2010. MeV: MultiExperiment viewer, p 267–277. In Ochs MF, Casagrande JT, Davuluri RV (ed), Biomedical informatics for cancer research. Springer US, Boston, MA.
50. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. 2018. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun 9:5114. doi: 10.1038/s41467-018-07641-9



Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
