Journal of Virology. 2023 Jul 3;97(7):e00411-23. doi: 10.1128/jvi.00411-23

Analysis of the Genomic Features and Evolutionary History of Pithovirus-Like Isolates Reveals Two Major Divergent Groups of Viruses

Victória F Queiroz a, João Victor R P Carvalho a, Fernanda G de Souza a, Maurício T Lima a, Juliane D Santos b, Kamila L S Rocha b, Danilo B de Oliveira b, João Pessoa Araújo Jr c, Leila Sabrina Ullmann c,d, Rodrigo A L Rodrigues a, Jônatas S Abrahão a
Editor: Colin R Parrish e
PMCID: PMC10373538  PMID: 37395647

ABSTRACT

New representatives of the phylum Nucleocytoviricota have been rapidly described in the last decade. Despite this, not all viruses of this phylum are allocated to recognized taxonomic families, as is the case for orpheovirus, pithovirus, and cedratvirus, which form the proposed family Pithoviridae. In this study, we performed comprehensive comparative genomic analyses of 8 pithovirus-like isolates, aiming to understand their common traits and evolutionary history. Structural and functional genome annotation was performed de novo for all the viruses, which served as a reference for pangenome construction. The synteny analysis showed substantial differences in genome organization between these viruses, with very few and short syntenic blocks shared between orpheovirus and its relatives. It was possible to observe an open pangenome with a significant increase in the slope when orpheovirus was added, alongside a decrease in the core genome. Network analysis placed orpheovirus as a distant and major hub with a large fraction of unique clusters of orthologs, indicating a distant relationship between this virus and its relatives, with only a few shared genes. Additionally, phylogenetic analyses of strict core genes shared with other viruses of the phylum reinforced the divergence of orpheovirus from pithoviruses and cedratviruses. Altogether, our results indicate that although pithovirus-like isolates share common features, this group of ovoid-shaped giant viruses presents substantial differences in gene contents, genomic architectures, and the phylogenetic history of several core genes. Our data indicate that orpheovirus is an evolutionarily divergent viral entity, suggesting its allocation to a different viral family, Orpheoviridae.

IMPORTANCE Giant viruses that infect amoebae form a monophyletic group named the phylum Nucleocytoviricota. Despite being genomically and morphologically very diverse, the taxonomic categories of some clades that form this phylum are not yet well established. With advances in isolation techniques, the speed at which new giant viruses are described has increased, escalating the need to establish criteria to define the emerging viral taxa. In this work, we performed a comparative genomic analysis of representatives of the putative family Pithoviridae. Based on the dissimilarity of orpheovirus from the other viruses of this putative family, we propose that orpheovirus be considered a member of an independent family, Orpheoviridae, and suggest criteria to demarcate families consisting of ovoid-shaped giant viruses.

KEYWORDS: giant virus, orpheovirus, Pithoviridae, Orpheoviridae, pangenome, cedratvirus, evolution, pithovirus

INTRODUCTION

The phylum Nucleocytoviricota is a diverse, monophyletic taxon composed of double-stranded-DNA viruses (1). This phylum includes viruses that infect different metazoans and protists (2). In this group, viruses that infect protists are commonly referred to as giant viruses, and in recent decades, this term has been used mainly to refer to the viruses of this phylum that infect amoebae (3–9). Amid the many distinct characteristics of giant viruses, the size and complexity of their genomes are some of the most striking. Their genomes can exceed 2 million base pairs and encode more than a thousand genes, including some involved in several metabolic processes that were hitherto uncommon in the virosphere (10–14). For example, klosneuviruses and tupanviruses have robust translational machinery, expressing several tRNAs, aminoacyl-tRNA synthetases (aaRSs) for all amino acids, and several translation factors, a more complete machinery than that of some cellular organisms (11, 15). Furthermore, in different groups of giant viruses, most genes are unique and have little or no significant similarity to previously described sequences in databases and therefore are classified as ORFans (6, 13, 16–23).

Currently, the phylum Nucleocytoviricota contains 11 families recognized by the International Committee on Taxonomy of Viruses (ICTV), namely, Phycodnaviridae, Allomimiviridae, Mesomimiviridae, Schizomimiviridae, Mimiviridae, Ascoviridae, Iridoviridae, Marseilleviridae, Mamonoviridae, Asfarviridae, and Poxviridae (https://ictv.global/taxonomy). However, with the improvement of isolation techniques and the intensification of prospecting assays, many groups of viruses that infect amoebae have been discovered and are still not allocated to any recognized taxonomic family (13, 18–22, 24–26). Among the emerging taxa is the putative family Pithoviridae. To date, this family comprises 3 pithovirus isolates, 9 cedratvirus isolates, and a single isolate of orpheovirus. These viruses have atypical virion morphologies, exhibiting long, ovoid-shaped particles morphologically distinguishable by a delivery portal usually observed by transmission electron microscopy (4, 6, 17, 27–33). In addition to the delivery portals, the sizes and composition of the virus genomes also differ within the group (6, 28, 34). Additionally, phylogenetic reconstructions of conserved genes show that orpheovirus is evolutionarily divergent from the clade that includes cedratviruses and pithoviruses (17).

Given the large phylogenetic breadth of this phylum and with new giant viruses being discovered rapidly, emerging taxa within this viral group will become increasingly common, making it harder to define the phylogenetic relationships. This raises the challenging task of establishing criteria to define these emerging viral taxa (35). Therefore, this study helps to elucidate aspects of the evolution, biology, and diversity of these viruses and their unique traits. Herein, we describe the functional pangenome based on the reannotation of the isolated viruses that compose the pithovirus-like group and phylogenetic reconstructions of the genes that compose the core genome of this group, along with those of other viruses of the phylum. Our results reinforce the high divergence of orpheovirus from the other viruses within the group, suggesting that it should be considered a new independent family.

RESULTS

Structural and functional genomic characterization of orpheovirus.

Orpheovirus is the only representative of a divergent branch of the proposed family Pithoviridae. To start the analyses presented in this work, the virus was produced, and its genome was sequenced to verify the stability and integrity of the sample. All orpheovirus reads had a Phred quality score above 30. The genome was assembled de novo into a single contig of 1,473,655 base pairs with a coverage of 74.49. A total of 1,520 open reading frames (ORFs) were predicted by GeneMarkS software. In addition, ARAGORN software indicated two tRNAs, a tyrosine tRNA and an alanine tRNA. The predicted proteins were separated into six categories according to their length in amino acids. The majority of the proteins predicted for orpheovirus ranged between 201 and 500 amino acids (Fig. 1A). The 245 proteins composed of fewer than 51 amino acids (~16% of the predicted proteins) were retained for annotation and subsequent analysis.

FIG 1.

Characterization of orpheovirus protein-coding genes. (A) Numbers of predicted proteins for each size category. The relative numbers of ORFans and proteins with uncharacterized and characterized functions are differentiated by color. aa, amino acids. (B) Functional annotation of the proteins with a defined function.

Genome annotation was performed using 3 different platforms, Diamond, HHpred, and InterProScan. Altogether, 823 predicted proteins had a hit on at least one platform; as a result, the ORFans represented ~46% of the predicted genes. ORFans accounted for the majority of proteins composed of up to 100 amino acids and were also a significant part of those between 101 and 500 amino acids. Only six proteins with more than 500 amino acids were ORFans, and there were no ORFans composed of more than 1,000 amino acids (Fig. 1A).
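The ORFan count described above is a set difference: a protein is an ORFan when none of the three platforms returns a hit for it. The following is a minimal sketch of that bookkeeping, not the authors' pipeline. The protein identifiers and per-platform hit sets are made up for illustration; only the totals (1,520 predicted proteins, 823 with at least one hit) come from the text.

```python
def find_orfans(all_proteins, hits_by_platform):
    """Return the proteins that have no hit on any annotation platform."""
    annotated = set().union(*hits_by_platform.values())
    return set(all_proteins) - annotated


# Toy example with hypothetical identifiers (not real orpheovirus genes).
proteins = {"p1", "p2", "p3", "p4"}
hits = {
    "diamond": {"p1"},
    "hhpred": {"p1", "p2"},
    "interproscan": set(),
}
toy_orfans = find_orfans(proteins, hits)

# Totals reported in the text for orpheovirus.
predicted = 1520
with_hit = 823
orfan_fraction = (predicted - with_hit) / predicted

print(sorted(toy_orfans))
print(f"ORFan fraction: {orfan_fraction:.0%}")
```

With the reported totals, the fraction rounds to the ~46% given above.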

Functional annotation alone does not always indicate a biological function. In many cases, the best hit was with hypothetical proteins, families with unknown functions, transmembrane or cytoplasmic domains, etc. In such cases, there is no way to predict a biological function only with annotation procedures; therefore, these proteins were classified as uncharacterized. Among the annotated proteins, the uncharacterized function category corresponded to most proteins smaller than 100 amino acids. For the majority of the proteins above 100 amino acids and all proteins over 1,000 amino acids, a biological function was indicated (Fig. 1A).

Overall, most proteins with an indicated biological function were involved in signal transduction regulation, but this category was less prevalent in proteins greater than 500 amino acids and did not appear in those greater than 1,000 amino acids. Proteins larger than 1,000 amino acids were also not involved in nucleotide metabolism, although most of these proteins were associated with transcription and RNA processing and DNA replication, recombination, and repair. Proteins with fewer than 51 amino acids had no biological function related to protein metabolism or translation or carbohydrate metabolism, but all other proteins of different sizes had such functions. Proteins involved in transcription and RNA processing, miscellaneous, other metabolic functions, and DNA replication, recombination, and repair were found in all size categories (Fig. 1B).

Comparative genomic analysis of the pithovirus-like group.

In contrast to other members of the pithovirus-like group, orpheovirus has a completely different genome organization (Fig. 2). The whole-genome alignment of pithoviruses showed similar genomic arrangements. Strong synteny could also be observed in each cedratvirus lineage, among the three cedratviruses that compose lineage A (cedratvirus zaza IHUMI, cedratvirus A11, and cedratvirus lausannensis) and the two that compose lineage B (Brazilian cedratvirus IHUMI and cedratvirus kamchatka). Despite the clear asynteny between pithovirus and cedratvirus, when orpheovirus was added to the analysis, it differed greatly in terms of genome size, composition, and genetic organization.

FIG 2.

Genomic synteny analysis. Comparison of genome organization between members of the pithovirus-like group. The schematic whole-genome-alignment diagram was obtained using Mauve software. The vertical red line indicates orpheovirus.

For the sake of consistency, a new prediction and a new annotation were performed for all other members of the group whose complete genomes were deposited in public databases, using the same strategy employed for orpheovirus. After reannotation, when compared to the other members, orpheovirus was the representative with the highest percentage of ORFans in its genome, followed by the first described pithovirus (pithovirus sibericum), with ORFans corresponding to ~35% of the genome, and the first described cedratvirus (cedratvirus A11), with ~33% (Fig. 3). The smaller percentages of ORFans observed in other cedratviruses and in pithovirus massiliensis were explained by homology found with the genes that were considered ORFans in the first isolates. Therefore, there were decreases in the percentages of ORFans, but the percentages of proteins with uncharacterized function remained, with little variation.

FIG 3.

Comparison of the functional annotations of members of the pithovirus-like group. New predictions and annotations were made for all viruses using the same parameters. Functional categories are represented in percentages, with 100% equivalent to all ORFs predicted by GeneMarkS software.

Proteins with characterized biological functions corresponded to approximately 50% of the genome of most viruses except for pithovirus massiliensis and orpheovirus, in which they represented approximately 40%. Of these, in all viruses, the category comprising the greatest number of proteins was signal transduction regulation, followed by miscellaneous and other metabolic functions, depending on the virus. Notably, a higher percentage of proteins involved in translation was observed for orpheovirus than for the other relatives (Fig. 3).

Pangenome evolution and functional characterization.

To better understand which genes are conserved in this group of viruses, the 8 viruses were gathered, and a pangenomic analysis was performed. The pangenome was constructed using OrthoFinder software by adding the genomes one by one. The analysis of the 8 viruses revealed that a total of 6,275 genes were grouped into 2,586 clusters of orthologous groups of genes (COGs), clusters of paralogous genes, or unique genes (singletons) as components of the pithovirus-like group pangenome (Fig. 4).

FIG 4.

Evolution of the pangenome of the pithovirus-like group. Sequential inclusion of viral representatives indicates an open pangenome and a small core genome for this group of viruses. Blue bars indicate the number of CDSs for each virus. The y axis shows the numbers of CDSs and cumulative COG numbers after the inclusion of a new genome (pangenome) and the decrease in the number of conserved genes (core genome). The x axis shows the viruses.

The viral pangenome increased steadily with the addition of new cedratviruses, and this increase became more pronounced when the first pithovirus of the analysis was added. The addition of orpheovirus alone, however, contributed almost a thousand COGs to the pangenome. Likewise, the core genome decreased sharply. At the end of the analysis of the 5 cedratviruses, the core genome was composed of 360 COGs. With the addition of the 2 pithoviruses, this number decreased to 130 COGs, and with the inclusion of orpheovirus, this number dropped to 52 conserved COGs, corresponding to 4.8% of the pangenome (Fig. 4). A total of 1,691 COGs were singletons, meaning that they had only one gene with no paralogs and were unique to a specific virus.
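The accumulation curves in Fig. 4 can be thought of as a running union (pangenome) and a running intersection (core genome) over the COG sets of the viruses, taken in the order in which the genomes are added. The sketch below illustrates this idea with toy data. It assumes that each virus has already been reduced to a set of COG identifiers, and the virus names and COG labels are hypothetical. It is not the OrthoFinder procedure itself.

```python
def pangenome_curves(cog_sets):
    """Track pangenome and core genome sizes as genomes are added in order.

    cog_sets: ordered list of (virus_name, set_of_COG_ids).
    Returns a list of (virus_name, pangenome_size, core_size).
    """
    pan = set()
    core = None
    curve = []
    for name, cogs in cog_sets:
        pan |= cogs                                        # union grows
        core = set(cogs) if core is None else core & cogs  # intersection shrinks
        curve.append((name, len(pan), len(core)))
    return curve


# Hypothetical toy input: three viruses and seven COGs.
toy = [
    ("virus_A", {"c1", "c2", "c3"}),
    ("virus_B", {"c1", "c2", "c4"}),
    ("virus_C", {"c1", "c5", "c6", "c7"}),
]
curve = pangenome_curves(toy)
for name, pan_size, core_size in curve:
    print(name, pan_size, core_size)
```

In this toy series the third genome plays the role of a divergent isolate: it adds several new COGs at once and removes one from the core.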

The network graph shows how the COGs were shared between the viruses of the analysis. Among the 1,074 COGs found in orpheovirus, 52 formed the core genome. Thirteen were shared between orpheovirus and both pithoviruses, 2 were shared with pithovirus massiliensis, 9 were shared with all cedratviruses, and 6 were shared with different groups of cedratviruses. Notably, 992 COGs were unique to orpheovirus, the equivalent of 92.3% of all orpheovirus COGs and 38.3% of the pangenome. Of these, 805 were singletons and 187 were clusters of paralogous genes. The unshared genes caused orpheovirus to form a distant cluster in the network analysis (Fig. 5).
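As a consistency check, the shared and unique COG counts given above can be tallied against the orpheovirus total. The short sketch below uses only numbers stated in the text. The dictionary labels are paraphrases chosen for illustration.

```python
orpheovirus_cogs = 1074
pangenome_cogs = 2586

# COGs that orpheovirus shares with other viruses, as listed in the text.
shared = {
    "core genome (all 8 viruses)": 52,
    "both pithoviruses": 13,
    "pithovirus massiliensis": 2,
    "all cedratviruses": 9,
    "different groups of cedratviruses": 6,
}
unique = 992                      # COGs found only in orpheovirus
singletons, paralog_clusters = 805, 187

total_shared = sum(shared.values())
unique_of_orpheovirus = unique / orpheovirus_cogs
unique_of_pangenome = unique / pangenome_cogs
shared_of_orpheovirus = total_shared / orpheovirus_cogs

print(total_shared, total_shared + unique)
print(f"{unique_of_orpheovirus:.0%} {unique_of_pangenome:.0%} {shared_of_orpheovirus:.0%}")
```

The shared and unique counts add up to the 1,074 orpheovirus COGs, and the shared portion is below 10% of that total.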

FIG 5.

COG sharing among the pithovirus-like group. A bipartite network graph connecting the 2,586 COGs to the 8 viruses included in the pangenome analysis. Larger nodes correspond to the viruses. Orpheovirus nodes are colored purple, pithoviruses blue, and cedratviruses yellow. Smaller gray nodes correspond to the COGs. The 52 COGs that compose the core genome are circled. The graph was generated using a force-based algorithm.

The functional annotation of the COGs was performed manually, keeping the most informative functional category as the final functional annotation of each COG. Among all COGs that formed the pangenome, 34% were composed of ORFans, and 27.5% had uncharacterized functions, resulting in a 61.5% “dark” pangenome. COGs listed as miscellaneous corresponded to 6.5%, a category that includes proteins whose in silico functional annotation alone cannot establish a specific function, such as protein kinases. Among the categories with better established functions, the most extensive was signal transduction regulation, corresponding to ~11% of the COGs; another 11.6% were associated with metabolic functions. COGs related to DNA replication, recombination, and repair processes, including the DNA polymerase B cluster, corresponded to 4.4%, and those involved in virion structure and morphogenesis, including the major capsid protein (MCP) cluster, corresponded to only 0.4%. Finally, approximately 3% of the COGs were related to transcription and RNA processing, 0.8% to translation, 0.2% to integration and transposition, and 0.3% to virus-host interaction (see Fig. S1 posted at https://5c95043044c49.site123.me/sup-materials/sup-material-the-analysis-of-genomic-features-and-evolutionary-history-of-pithovirus-like-isolates-reveals-the-existence-of-two-major-divergent-groups-of-viruses).

The 52 COGs that compose the core genome were distributed in nine functional categories. The majority were involved in transcription and RNA processing, for a total of 16 proteins, equivalent to 30.7% of the core genome. Next, 14 proteins were related to DNA replication, recombination, and repair (26.9%). Nine (17.3%) COGs were hypothetical proteins, many containing a transmembrane region. These conserved proteins may have important functions in viral biology but are still listed as uncharacterized. Six COGs were associated with nucleotide metabolism. Two (3.8%) COGs each were related to translation and to virion structure and morphogenesis, and only one (1.9%) COG each was associated with signal transduction regulation, lipid metabolism, and protein metabolism. The main characteristics of these viruses are summarized in Table 1.

TABLE 1.

Main genomic characteristics of the isolated viruses of the pithovirus-like group

Virus | Genome size (bp) | % GC | Coding density (%) | No. of CDS | No. of COGs | % ORFans | No. of tRNAs | Host | Delivery portal
Pithovirus sibericum | 610,033 | 35.80 | 69 | 532 | 488 | 34.58 | 0 | Acanthamoeba sp. | Single cork
Pithovirus massiliensis | 683,254 | 35.40 | 67 | 667 | 552 | 22.03 | 0 | Acanthamoeba sp. | Single cork
Cedratvirus A11 | 589,068 | 42.59 | 84 | 763 | 655 | 32.76 | 1 | Acanthamoeba sp. | Double cork
Cedratvirus lausannensis | 575,161 | 42.78 | 86 | 752 | 656 | 14.76 | 0 | Acanthamoeba sp. | Double cork
Cedratvirus zaza | 560,887 | 42.73 | 86 | 717 | 631 | 9.62 | 0 | Acanthamoeba sp. | Double cork
Brazilian cedratvirus | 460,038 | 42.88 | 89 | 677 | 551 | 20.97 | 0 | Acanthamoeba sp. | Double cork
Cedratvirus kamchatka | 466,767 | 40.62 | 85 | 647 | 515 | 15.76 | 1 | Acanthamoeba sp. | Double cork
Orpheovirus | 1,473,655 | 24.99 | 70 | 1,520 | 1,074 | 45.78 | 2 | V. vermiformis | Ostiole
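Several summary figures quoted elsewhere in the text can be recomputed directly from Table 1. The sketch below does so for the total number of CDSs in the pangenome analysis, the genome size of orpheovirus relative to the largest of its relatives, and the mean GC content of each group. The values are copied from the table. Grouping the viruses by name is a convenience chosen for this example.

```python
from statistics import mean

# (virus, genome size in bp, GC %, number of CDSs), copied from Table 1.
table1 = [
    ("Pithovirus sibericum", 610_033, 35.80, 532),
    ("Pithovirus massiliensis", 683_254, 35.40, 667),
    ("Cedratvirus A11", 589_068, 42.59, 763),
    ("Cedratvirus lausannensis", 575_161, 42.78, 752),
    ("Cedratvirus zaza", 560_887, 42.73, 717),
    ("Brazilian cedratvirus", 460_038, 42.88, 677),
    ("Cedratvirus kamchatka", 466_767, 40.62, 647),
    ("Orpheovirus", 1_473_655, 24.99, 1520),
]

total_cds = sum(cds for _, _, _, cds in table1)

orpheo_size = next(size for name, size, _, _ in table1 if name == "Orpheovirus")
largest_relative = max(size for name, size, _, _ in table1 if name != "Orpheovirus")
size_ratio = orpheo_size / largest_relative

cedrat_gc = mean(gc for name, _, gc, _ in table1 if "cedratvirus" in name.lower())
pitho_gc = mean(gc for name, _, gc, _ in table1 if "pithovirus" in name.lower())

print(total_cds, round(size_ratio, 2), round(cedrat_gc, 2), round(pitho_gc, 2))
```

The CDS total matches the 6,275 genes that entered the pangenome analysis, and the orpheovirus genome is a little more than twice the size of the largest pithovirus genome.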

Characterization and evolution of the core genome.

To achieve a more robust pangenomic analysis, we performed the same procedure using Proteinortho software. At the established thresholds, the pangenome constructed in Proteinortho was composed of 3,359 COGs and the core genome of 30 COGs, all of which also formed the core genome in the OrthoFinder analyses. The average identities of the proteins that comprised the core genome COGs ranged from 29.4% to 45.3% (mean value = 33.5%) for cedratvirus and from 24.3% to 43.3% (mean value = 33.9%) for pithovirus. The highest mean identity found in the core genome was in orpheovirus gene 25, which encodes a protein involved in nucleotide metabolism (Fig. 6).

FIG 6.

Range of similarities of orpheovirus genes present in the strict core genome to those of members of the phylum Nucleocytoviricota. Heatmap of the average identity of all the representatives found, ranging from 22% (yellow) to 60% (blue).

To identify which orpheovirus genes present in the 30 COGs that formed the strict core genome were also found in other viruses of the Nucleocytoviricota phylum, a BLASTp search was performed. Among the other taxa within the phylum, the closest phylogenetically related family, Marseilleviridae, shared the most genes, with 21 shared genes, followed by the order Imitervirales and the family Phycodnaviridae, with 17 shared genes each. Furthermore, 5 COGs of the core genome were present only in representatives of the pithovirus-like group within the phylum. Of these, four were involved in transcription and RNA processing (mRNA decapping complex subunit 2, transcription initiation factor, RNA polymerase II, and RNA polymerase III) and the other had an uncharacterized function. DNA polymerase B was the only gene shared among all taxa, with an average identity of 28% (Fig. 6).

To assess the evolutionary history of each protein that composed the strict core genome, individual phylogenetic analyses were performed. In 27 trees, orpheovirus formed a divergent branch of the group (Fig. 7A; see Fig. S2 to S28 posted at https://5c95043044c49.site123.me/sup-materials/sup-material-the-analysis-of-genomic-features-and-evolutionary-history-of-pithovirus-like-isolates-reveals-the-existence-of-two-major-divergent-groups-of-viruses). This topology was maintained even in the phylogenetic tree of the transcription elongation factor TFIIS gene, where the average identities of the orpheovirus gene with those of other giant virus groups (iridovirus, imitervirus, phycodnavirus, faustovirus, asfarvirus, and mamonovirus) were higher than with those of cedratvirus and pithovirus, and it is a gene known for breaking well-established clades (1, 36). Only in three cases did orpheovirus become an even more external taxon, causing the last common ancestor between orpheovirus, cedratvirus, and pithovirus to also be shared with other groups of giant viruses (Fig. 7B to D). In addition, phylogenetic trees were constructed including viruses and cellular organisms from the three domains of life when found. Only the RNA polymerase III and transcription initiation factor genes did not have hits with genes from organisms other than representatives of the pithovirus-like group. Consistent with the previously observed results, in 27 trees, the topology was maintained, with orpheovirus clustering with pithovirus and cedratvirus, and in the trees of DNA oxidative demethylase, DNA repair exonuclease, and hypothetical protein (gene 21) genes, orpheovirus was an even more external taxon (data not shown).

FIG 7.

Phylogeny of Nucleocytoviricota based on amino acid sequences encoded by the genes present in the strict core genome. (A) DNA polymerase family B (gene 1; clandestinovirus was removed from the analyses); (B) Uracil-DNA glycosylase (gene 9); (C) DNA oxidative demethylase (gene 12); (D) Hypothetical protein (gene 21). Scale bars indicate the rates of evolution. Only bootstrap values >50 are shown. Orpheovirus is highlighted in purple, and pithoviruses and cedratviruses in yellow.

DISCUSSION

Despite their late discovery, giant viruses are one of the most abundant groups of viruses in nature; therefore, with adequate methods, it has been possible to isolate them from different types of samples collected worldwide (5, 11, 18–20, 23, 24, 29, 37–39). Even with several representatives described, the wide detection of these viruses by metagenomics indicates that there are many more to be isolated (40). Nevertheless, the diversity of this group may be underestimated, since a virus with only a few homologs in databases has already been isolated (16). Moreover, considering that the origin and diversification of this phylum began before the origin of modern eukaryotes, as these new representatives are isolated, there may also be changes in already defined taxa, making it necessary to identify criteria for defining the taxa (1).

In this work, the analyses were carried out using only the pithovirus and cedratvirus isolates with complete genomes deposited in public databases, but other representatives have already been detected by metagenomics (35, 41). Recent metagenomic work shows that in permafrost samples, there is a great diversity of unknown viral subgroups and clades of the Nucleocytoviricota phylum, mainly of divergent sequences of the pithovirus-like group (36). With new isolates, the differences that orpheovirus presents within the clade will probably be reinforced. Even with the advancement of bioinformatics, some characteristics can be observed only after isolation, including the host cell and particle morphology.

First, all cedratviruses and pithoviruses isolated to date infect amoebae of the genus Acanthamoeba, whereas orpheovirus infects only Vermamoeba vermiformis (4, 6, 17, 27–32). Even with in silico evidence of a possible increase in the host range for this group of viruses, thus far, the only giant viruses of amoebae to our knowledge proven capable of infecting more than one host genus are the tupanviruses (11). In addition to the hosts, orpheovirus also differs from pithoviruses and cedratviruses in the morphology of its particles. Although all viruses are nonicosahedral, the delivery portal of orpheovirus has an ostiole structure, similar to that observed for pandoraviruses and very different from the corks present in pithovirus and cedratvirus (6, 28, 34, 42).

Despite the large virion size that these viruses have, pithoviruses and cedratviruses are exceptions to the allometric scaling law and have small genomes in comparison to their particles, but the same does not occur with orpheovirus (31). Its genome is more than two times larger, exceeding 1.4 Mb, and while cedratvirus and pithovirus have GC contents of approximately 42.3% and 35.6%, respectively, orpheovirus has a GC content of approximately 24.9%. The larger genome of orpheovirus reflects its greater number of genes, maintaining a coding density close to that observed for pithovirus but lower than that observed for cedratvirus.

Among the viruses annotated in this work, orpheovirus has the most ORFans. In many studies, predicted proteins smaller than 50 amino acids are usually disregarded in genome analyses (4, 17, 18, 22, 25). Even considering these proteins, which are mostly ORFans, the annotation performed in this work reduced the number of orpheovirus ORFans by 20%, indicating the importance of using different methods to achieve comprehensive gene annotation (17). In the pangenome network, the ORFans comprise a large part of the exclusive genes of each virus. Among the viruses analyzed, orpheovirus has the most unique genes, which leads to an abrupt increase in the pangenome slope, unlike for other pangenomes published, which mostly have a more tenuous and continuous increase (43–48).

The total pangenome is large, formed by 2,586 unique or clustered genes in OrthoFinder. There is an increase to 3,359 COGs in Proteinortho, a larger pangenome than those observed for other groups of giant viruses, even the ones with more viruses included in the analyses, but smaller than that observed for pandoraviruses, the viruses that have the largest genomes in the virosphere (3, 13). The core genome represents 4.8% of the pangenome considering the OrthoFinder construction, decreasing to 0.89% of the pangenome in the Proteinortho construction, the smallest core genome of isolated viruses to our knowledge. These differences might be explained by the fact that OrthoFinder uses both gene similarities and phylogenetic analyses to construct the orthologous groups, resulting in a more stringent clustering method than that used by Proteinortho (49, 50). In general, the categories that composed the core genome in the two programs did not change much, but two functional categories did not appear in the Proteinortho results. The functions that disappeared include a protein kinase of the signal transduction regulation category and the virion membrane protein and major capsid protein of the virion structure and morphogenesis category. It is known that the major capsid protein of these viruses is divergent from those of the other giant viruses, and not only MCP but all three of these proteins have a degree of divergence within the group itself (36, 51). In some groups of giant viruses, it is common for representatives to present paralogies for genes, such as the MCP gene; however, in both core genome analyses, no duplicated genes were found in the clusters (16, 52). Although signal transduction regulation, other metabolic functions, and miscellaneous are the categories that encompass most of the pangenome, in the strict core genome, these categories were not observed, indicative of the high specificity of these genes to their respective viruses. In contrast to observations for chlorovirus, there was no proportional correspondence in the numbers of COGs that constituted the different functional categories of the pangenome and the core genome (44).

The phylum Nucleocytoviricota is very diverse, and the small number of genes shared between its members made it possible to make phylogenetic reconstructions indicating monophyly. In this work, phylogenetic analyses of all the genes of the core genome were carried out to check whether different shared genes would produce different tree topologies. In most cases, it was observed that orpheovirus maintained a long branch clearly divergent from pithovirus and cedratvirus. Long branches are considered markers highlighting the emergence of new viral families or lineages of giant viruses (17). Additionally, in works in which the phylogenies include metagenome-assembled genomes, the formation of new clades occurs not only for orpheovirus but also for other taxa (35, 41, 51, 53).

Although prospecting and metagenomic studies have advanced over the years, considerably speeding up the identification of new lineages of giant viruses, information regarding their diversity, distribution, genomic content, and role in nature remains scarce. Revisiting published works and public data to standardize and update information about these viruses is extremely important to elucidate these aspects and support further studies.

Conclusions.

Taken together, our analyses reveal more elements that highlight the divergent evolution of orpheovirus. In addition to its different host, delivery portal, and genome size, its low GC content, and its high number of ORFans, the genome of orpheovirus is also asyntenic compared to those of the closest relatives. Moreover, when orpheovirus is added to the pangenome analysis, there is a steep increase in the pangenome slope, and less than 10% of orpheovirus COGs are shared with other viruses, which leads to a significant decrease in the size of the core genome. Furthermore, the phylogenetic analyses show that for all the shared genes, the great divergence of this virus from pithovirus and cedratvirus is maintained or increased. Altogether, our results indicate that orpheovirus has the potential to be the founding member of a new family, Orpheoviridae. Nevertheless, further prospecting and characterization studies are essential to determine the best criteria to define emerging viral taxa and gain a broader view of the virosphere.

MATERIALS AND METHODS

Viral production, purification, and titration.

Orpheovirus IHUMI-LCC2 was described by Bernard La Scola’s group from Aix Marseille Université. To produce the viral stocks (SISGEN number A2291C9), five 300-cm² glass flasks containing 2.5 × 10⁷ Vermamoeba vermiformis cells (ATCC CDC19) cultivated in peptone-yeast-glucose (PYG) broth medium supplemented with penicillin (500 mg/mL; Schering-Plough, Brazil), streptomycin (100 mg/mL; Sigma-Aldrich, USA), and amphotericin B (25 mg/mL; Cristalia, São Paulo, Brazil) were infected with orpheovirus at a multiplicity of infection (MOI) of 0.01 and incubated at 29°C until the observation of typical cytopathic effect (CPE). Next, the flasks’ contents were collected and subjected to three cycles of freezing and thawing to release viral particles still trapped inside intact cells. The lysate was then centrifuged at 1,200 × g for 10 min to remove cell debris. The supernatant was collected, placed over a 40% sucrose (Merck, Germany) cushion, and centrifuged twice at 36,000 × g for 1 h. The pellet was resuspended in phosphate-buffered saline (PBS), and the purified viruses were stored at −20°C. The titer was obtained by the endpoint method, calculated according to Reed-Muench (54), and expressed as the number of 50% tissue culture infective doses (TCID50) per milliliter.
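For readers unfamiliar with the Reed-Muench calculation, the sketch below shows the standard interpolation on a hypothetical dilution series. The text does not report the dilution scheme, the number of wells, or the inoculum volume, so all of these are assumptions made for illustration.

```python
import math


def reed_muench_log10_tcid50(log10_dilutions, infected, totals):
    """Reed-Muench 50% endpoint for one dilution series.

    log10_dilutions: ordered from most concentrated to most dilute,
                     e.g. [-1, -2, -3, -4] for a 10-fold series.
    infected, totals: infected wells and total wells at each dilution.
    Returns log10(TCID50) per inoculum volume.
    """
    uninfected = [t - i for i, t in zip(infected, totals)]
    n = len(infected)
    # A well infected at a given dilution is assumed to be infected at every
    # more concentrated dilution, and the reverse for uninfected wells.
    cum_inf = [sum(infected[k:]) for k in range(n)]
    cum_uninf = [sum(uninfected[:k + 1]) for k in range(n)]
    pct = [100.0 * a / (a + b) for a, b in zip(cum_inf, cum_uninf)]
    for k in range(n - 1):
        if pct[k] >= 50.0 > pct[k + 1]:
            # Proportionate distance between the two bracketing dilutions.
            pd = (pct[k] - 50.0) / (pct[k] - pct[k + 1])
            step = log10_dilutions[k] - log10_dilutions[k + 1]
            endpoint = log10_dilutions[k] - pd * step
            return -endpoint
    raise ValueError("50% endpoint is not bracketed by the dilution series")


# Hypothetical plate: 10-fold dilutions, 8 wells each, 0.1 mL inoculum per well.
log_titer = reed_muench_log10_tcid50([-1, -2, -3, -4], [8, 6, 2, 0], [8, 8, 8, 8])
inoculum_ml = 0.1
log_titer_per_ml = log_titer - math.log10(inoculum_ml)
print(f"10^{log_titer_per_ml:.2f} TCID50/mL")
```

In this toy series the cumulative infection rate falls from 80% to 20% between the 10⁻² and 10⁻³ dilutions, so the endpoint lies halfway between them on the log scale.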

Genome sequencing and assembly.

To confirm the nature of the sample, the genome of orpheovirus was sequenced twice by the paired-end method using Illumina MiSeq equipment, and the libraries were prepared with the Illumina DNA prep kit (Illumina, Inc., San Diego, CA, USA). Read quality was assessed with the FastQC program, and the HEADCROP operation in Trimmomatic (55) was used to trim the initial bases of the reads. The genome was assembled de novo using SPAdes 3.12 software with the default parameters (56, 57). Both sequencing data sets were used for the assembly: two R1 and two R2 files in fastq format were passed to the assembly program on the command line.
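As a rough illustration of how two paired-end data sets can be given to a single assembly, the sketch below builds (but does not run) the command lines. The file names, the output directory, the number of bases cropped, and the `trimmomatic` executable name are assumptions, since none of them is stated in the text; `HEADCROP:<n>` is a Trimmomatic step, and `--pe<k>-1`/`--pe<k>-2` are the SPAdes options for the forward and reverse reads of paired-end library k.

```python
def trimmomatic_headcrop_cmd(r1, r2, prefix, headcrop):
    """Paired-end Trimmomatic call that removes `headcrop` bases from the read start.

    The wrapper name `trimmomatic` is an assumption; some installs use `java -jar`.
    """
    return [
        "trimmomatic", "PE", r1, r2,
        f"{prefix}_R1_paired.fastq", f"{prefix}_R1_unpaired.fastq",
        f"{prefix}_R2_paired.fastq", f"{prefix}_R2_unpaired.fastq",
        f"HEADCROP:{headcrop}",
    ]

def spades_multi_library_cmd(libraries, outdir):
    """SPAdes call with default parameters and one --pe<k> pair per library."""
    cmd = ["spades.py"]
    for k, (r1, r2) in enumerate(libraries, start=1):
        cmd += [f"--pe{k}-1", r1, f"--pe{k}-2", r2]
    return cmd + ["-o", outdir]

# Hypothetical file names for the two sequencing runs.
runs = [("run1_R1.fastq", "run1_R2.fastq"), ("run2_R1.fastq", "run2_R2.fastq")]
assembly_cmd = spades_multi_library_cmd(runs, "orpheovirus_assembly")
print(" ".join(assembly_cmd))
```

The resulting lists can be passed to `subprocess.run` once the tools are installed and the placeholder values are replaced with real ones.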

In addition, the genome of pithovirus massiliensis was assembled from the four scaffolds available in the NCBI database (accession numbers LT598836.1, LT598837.1, LT598838.1, and LT598839.1) with MeDuSa software (58), using pithovirus sibericum as a reference.

Genome synteny and gene prediction and annotation.

The nucleotide sequences of the complete genomes used in this work were aligned with Mauve (59). Open reading frames (ORFs) of the 8 viruses were predicted with the GeneMarkS tool (sequence type: prokaryotic) (60). Additionally, tRNA coding sequences were predicted using ARAGORN (61). All of the predicted ORFs were annotated using three different platforms: Diamond, with an E value of <10⁻⁵ and identity of 40% as thresholds, against the NCBI nonredundant protein sequence (Nr.gz 2022-05-15 00:58 125G) database, excluding from the search the viruses of families discovered after the query; the HHpred server (62), to predict functions and/or structures for ORFs (databases PDB_mmCIF30_14_Apr and SCOPe70_2.07), considering as valid those hits with probabilities greater than 80% or E values equal to or smaller than 1; and InterProScan (63, 64), where the annotation was performed manually during November and December of 2021. Functional annotation of genes and COGs was performed according to Rodrigues and colleagues (44).
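The hit-acceptance criteria described above can be expressed as simple filters. This is a minimal sketch, not the authors' in-house scripts: it assumes Diamond output in the default 12-column BLAST tabular format, treats the 40% identity threshold as inclusive, and uses hypothetical ORF and protein identifiers.

```python
DIAMOND_FIELDS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)

def parse_hits(lines):
    """Yield one dict per tab-separated hit line, with numeric fields converted."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        row = dict(zip(DIAMOND_FIELDS, line.split("\t")))
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        yield row

def passes_diamond_thresholds(row, max_evalue=1e-5, min_identity=40.0):
    """E value below 1e-5 and identity of at least 40% (inclusive by assumption)."""
    return row["evalue"] < max_evalue and row["pident"] >= min_identity

def hhpred_hit_is_valid(probability, evalue):
    """Valid if probability is above 80% or the E value is at most 1."""
    return probability > 80.0 or evalue <= 1.0

# Hypothetical hits: only orf1 meets both Diamond thresholds.
example = [
    "orf1\tprotA\t55.0\t200\t90\t0\t1\t200\t1\t200\t1e-30\t150",
    "orf2\tprotB\t35.0\t180\t117\t0\t1\t180\t1\t180\t1e-20\t90",
    "orf3\tprotC\t60.0\t50\t20\t0\t1\t50\t1\t50\t1e-3\t40",
]
kept = [r["qseqid"] for r in parse_hits(example) if passes_diamond_thresholds(r)]
print(kept)  # ['orf1']
```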

Pangenome and network construction.

To perform the pangenomic analyses, protein prediction files were clustered using OrthoFinder 2.5.4 software with an MCL inflation parameter equal to 4 (49, 65). Moreover, to reinforce the core genome, the Proteinortho tool was also used (50) based on the reciprocal best-hit strategy, using an amino acid sequence identity of 30%, a sequence coverage of 50%, and an E value of <10⁻⁵ as thresholds.
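Once proteins are grouped into clusters of orthologs, pangenome and core-genome sizes follow from set operations as genomes are added one at a time. The sketch below assumes a mapping from each genome to the set of cluster identifiers it contains (singletons included); the genome names and cluster identifiers are toy values, not results from this study.

```python
def pan_core_curve(clusters_by_genome, order):
    """Return (genome, pangenome size, core size) after each genome is added."""
    pan = set()
    core = None
    curve = []
    for genome in order:
        clusters = set(clusters_by_genome[genome])
        pan |= clusters  # pangenome: union of all clusters seen so far
        core = clusters if core is None else core & clusters  # core: intersection
        curve.append((genome, len(pan), len(core)))
    return curve

# Toy input: the third genome shares few clusters and carries many unique ones.
toy_clusters = {
    "virus_A": {"c1", "c2", "c3"},
    "virus_B": {"c1", "c2", "c4"},
    "virus_C": {"c1", "c5", "c6", "c7"},
}
toy_curve = pan_core_curve(toy_clusters, ["virus_A", "virus_B", "virus_C"])
for genome, pan_size, core_size in toy_curve:
    print(genome, pan_size, core_size)
```

In the toy data, adding the divergent third genome raises the pangenome from 4 to 7 clusters while the core drops from 2 to 1, which is the qualitative pattern of a steeper pangenome slope with a shrinking core.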

To obtain a general picture of COG sharing among the members of the pithovirus-like group, a bipartite network was built using Gephi (66). The layout was generated using a force-based algorithm.
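A bipartite network of this kind has two node types, viruses and clusters of orthologs, with an edge wherever a virus contains a cluster. The sketch below builds such an edge list and writes it as a CSV with `Source` and `Target` columns, a layout Gephi can import; the input mapping, the names, and the helper functions are illustrative assumptions rather than the procedure used by the authors.

```python
import csv
import io
from collections import Counter

def bipartite_edges(clusters_by_genome):
    """One (genome, cluster) edge for every cluster present in a genome."""
    return sorted(
        (genome, cluster)
        for genome, clusters in clusters_by_genome.items()
        for cluster in clusters
    )

def unique_cluster_counts(edges):
    """Count, per genome, the clusters that no other genome shares."""
    degree = Counter(cluster for _, cluster in edges)
    return dict(Counter(g for g, cluster in edges if degree[cluster] == 1))

def write_edge_csv(edges, handle):
    """Write the edge list with a Source,Target header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)

# Toy data: virus_B has more unique clusters than virus_A.
network_input = {"virus_A": {"c1", "c2"}, "virus_B": {"c1", "c3", "c4"}}
edges = bipartite_edges(network_input)
buffer = io.StringIO()
write_edge_csv(edges, buffer)
print(buffer.getvalue())
print(unique_cluster_counts(edges))
```

Cluster nodes with a single edge correspond to unique clusters, so a genome with many of them appears as a hub surrounded by leaves in a force-based layout.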

Core-genome-sharing and similarity analyses were performed using BLASTP. Orpheovirus genes were aligned against their homologous sequences in cedratviruses and pithoviruses with BLASTP, using the new gene prediction files. For the other viruses of the phylum, BLASTP was performed against the NCBI database with an expect threshold (E value) of 0.00001. For each gene, the identity and coverage values of the first 100 target sequences were computed and averaged for heatmap representation.
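The per-gene averaging used for the heatmap can be sketched as follows. The input format, (gene, identity, coverage) tuples in the rank order returned by BLASTP, and the example values are assumptions made for illustration.

```python
from collections import defaultdict

def mean_identity_coverage(hits, max_hits=100):
    """Average identity and coverage over the first `max_hits` hits of each gene.

    hits: iterable of (gene, identity, coverage), already in BLASTP rank order.
    """
    kept = defaultdict(list)
    for gene, identity, coverage in hits:
        if len(kept[gene]) < max_hits:
            kept[gene].append((identity, coverage))
    return {
        gene: (
            sum(i for i, _ in values) / len(values),
            sum(c for _, c in values) / len(values),
        )
        for gene, values in kept.items()
    }

# Hypothetical hits for two genes.
example_hits = [("gene_1", 40.0, 80.0), ("gene_1", 60.0, 100.0), ("gene_2", 30.0, 50.0)]
heatmap_values = mean_identity_coverage(example_hits)
print(heatmap_values)  # {'gene_1': (50.0, 90.0), 'gene_2': (30.0, 50.0)}
```

Genes with fewer than 100 hits are averaged over the hits that exist, which is one possible convention; the text does not say how such cases were handled.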

Phylogenetic analysis.

For phylogenetic analysis, the predicted genes of orpheovirus that composed the strict core genome were used. Amino acid sequences were obtained from the National Center for Biotechnology Information database using the BLASTp program with an E value threshold of 0.00001 and aligned using the MUSCLE algorithm. When the sequences of pithovirus and cedratvirus were not found using BLASTp (genes 12, 26, and 28), they were added manually from the prediction file based on the clusters generated in Proteinortho. After these sequences were aligned, the best-fit substitution models were selected by the ModelFinder algorithm implemented in IQ-TREE (67). The phylogenetic trees were reconstructed using IQ-TREE software (version 1.6.12) (68), using the maximum-likelihood method and 1,000 bootstrap replicates. Finally, the phylogenetic trees were visualized and edited using iTOL (69).
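For orientation, a tree inference of this kind could be launched per alignment with a command like the one built below. This is a sketch under assumptions: the alignment file name is hypothetical, and the text does not say whether standard (`-b`) or ultrafast (`-bb`) bootstrap was used, so the choice is left as a parameter. In IQ-TREE 1.6, `-m MFP` runs ModelFinder and then infers the tree with the selected model.

```python
def iqtree_cmd(alignment, replicates=1000, ultrafast=False):
    """Build an IQ-TREE 1.6 command: ModelFinder selection plus bootstrap.

    ultrafast=False uses standard nonparametric bootstrap (-b);
    ultrafast=True uses the ultrafast bootstrap approximation (-bb).
    """
    bootstrap_flag = "-bb" if ultrafast else "-b"
    return ["iqtree", "-s", alignment, "-m", "MFP", bootstrap_flag, str(replicates)]

# Hypothetical alignment of one strict-core gene.
tree_cmd = iqtree_cmd("core_gene_aligned.fasta")
print(" ".join(tree_cmd))
```

The resulting tree file in Newick format can then be uploaded to iTOL for display.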

Data availability.

The genome sequence of orpheovirus IHUMI-LCC2 is available on the GenBank website under accession number NC_036594. In this study, we resequenced this virus and confirmed that the publicly available sequence is accurately deposited.

ACKNOWLEDGMENTS

We thank our colleagues from Laboratório de Vírus—UFMG for their technical support, and we thank the Microscopy Center of UFMG. Special thanks to Agnello Picorelli, Lucas Marioza, Gabriel Carreta, and Matheus Galantine for helping with the development of in-house annotation scripts.

We acknowledge financial support from Rede Vírus—Ministério da Ciência, Tecnologia e Inovações (MCTI), Câmara Pox—405249/2022-5. We thank Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), grant number 88882.348380/2010-1, Fundação de Amparo à Pesquisa do estado de Minas Gerais (FAPEMIG), Programas Institutos Nacionais de Ciência e Tecnologia (INCT), grant number 406441/2022-7, chamada 58/2022, and Pró-Reitorias de Pesquisa e Pós-Graduação of UFMG.

J.S.A. and J.P.A.J. are CNPq researchers.

Contributor Information

Rodrigo A. L. Rodrigues, Email: rodriguesral07@gmail.com.

Jônatas S. Abrahão, Email: jonatas.abrahao@gmail.com.

Colin R. Parrish, Cornell University Baker Institute for Animal Health

REFERENCES

  • 1. Guglielmini J, Woo A, Krupovic M, Forterre P, Gaia M. 2019. Diversification of giant and large eukaryotic dsDNA viruses predated the origin of modern eukaryotes. Proc Natl Acad Sci USA 116:19585–19592. doi: 10.1073/pnas.1912006116.
  • 2. Koonin EV, Dolja VV, Krupovic M, Varsani A, Wolf YI, Yutin N, Zerbini FM, Kuhn JH. 2020. Global organization and proposed megataxonomy of the virus world. Microbiol Mol Biol Rev 84:e00061-19. doi: 10.1128/MMBR.00061-19.
  • 3. Aherfi S, Andreani J, Baptiste E, Oumessoum A, Dornas FP, Andrade AdS, Chabriere E, Abrahao J, Levasseur A, Raoult D, La Scola B, Colson P. 2018. A large open pangenome and a small core genome for giant pandoraviruses. Front Microbiol 9:1486. doi: 10.3389/fmicb.2018.01486.
  • 4. Andreani J, Aherfi S, Bou Khalil J, Di Pinto F, Bitam I, Raoult D, Colson P, La Scola B. 2016. Cedratvirus, a double-cork structured giant virus, is a distant relative of pithoviruses. Viruses 8:300. doi: 10.3390/v8110300.
  • 5. Boyer M, Yutin N, Pagnier I, Barrassi L, Fournous G, Espinosa L, Robert C, Azza S, Sun S, Rossmann MG, Suzan-Monti M, La Scola B, Koonin EV, Raoult D. 2009. Giant Marseillevirus highlights the role of amoebae as a melting pot in emergence of chimeric microorganisms. Proc Natl Acad Sci USA 106:21848–21853. doi: 10.1073/pnas.0911354106.
  • 6. Legendre M, Bartoli J, Shmakova L, Jeudy S, Labadie K, Adrait A, Lescot M, Poirot O, Bertaux L, Bruley C, Couté Y, Rivkina E, Abergel C, Claverie JM. 2014. Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology. Proc Natl Acad Sci USA 111:4274–4279. doi: 10.1073/pnas.1320670111.
  • 7. Okamoto K, Miyazaki N, Song C, Maia FRNC, Reddy HKN, Abergel C, Claverie JM, Hajdu J, Svenda M, Murata K. 2017. Structural variability and complexity of the giant Pithovirus sibericum particle revealed by high-voltage electron cryo-tomography and energy-filtered electron cryo-microscopy. Sci Rep 7:1–12. doi: 10.1038/s41598-017-13390-4.
  • 8. Piacente F, De Castro C, Jeudy S, Molinaro A, Salis A, Damonte G, Bernardi C, Abergel C, Tonetti MG. 2014. Giant virus Megavirus chilensis encodes the biosynthetic pathway for uncommon acetamido sugars. J Biol Chem 289:24428–24439. doi: 10.1074/jbc.M114.588947.
  • 9. Rodrigues RAL, Abrahão JS, Drumond BP, Kroon EG. 2016. Giants among larges: how gigantism impacts giant virus entry into amoebae. Curr Opin Microbiol 31:88–93. doi: 10.1016/j.mib.2016.03.009.
  • 10. Silva LCF, Rodrigues RAL, Oliveira GP, Dornas FP, La Scola B, Kroon EG, Abrahão JS. 2019. Microscopic analysis of the tupanvirus cycle in Vermamoeba vermiformis. Front Microbiol 10:671. doi: 10.3389/fmicb.2019.00671.
  • 11. Abrahão J, Silva L, Silva LS, Khalil JYB, Rodrigues R, Arantes T, Assis F, Boratto P, Andrade M, Kroon EG, Ribeiro B, Bergier I, Seligmann H, Ghigo E, Colson P, Levasseur A, Kroemer G, Raoult D, La Scola B. 2018. Tailed giant Tupanvirus possesses the most complete translational apparatus of the known virosphere. Nat Commun 9:749. doi: 10.1038/s41467-018-03168-1.
  • 12. Andrade A, Rodrigues RAL, Oliveira GP, Andrade KR, Bonjardim CA, La Scola B, Kroon EG, Abrahão JS. 2017. Filling knowledge gaps for mimivirus entry, uncoating, and morphogenesis. J Virol 91:e01335-17. doi: 10.1128/JVI.01335-17.
  • 13. Philippe N, Legendre M, Doutre G, Couté Y, Poirot O, Lescot M, Arslan D, Seltzer V, Bertaux L, Bruley C, Garin J, Claverie JM, Abergel C. 2013. Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. Science 341:281–286. doi: 10.1126/science.1239181.
  • 14. Raoult D, Audic S, Robert C, Abergel C, Renesto P, Ogata H, La Scola B, Suzan M, Claverie JM. 2004. The 1.2-megabase genome sequence of Mimivirus. Science 306:1344–1350. doi: 10.1126/science.1101485.
  • 15. Hussein Bajrai L, Mougari S, Andreani J, Baptiste E, Delerce J, Raoult D, Ibraheem Azhar E, La Scola B, Levasseur A. 2019. Isolation of Yasminevirus, the first member of Klosneuvirinae isolated in coculture with Vermamoeba vermiformis, demonstrates an extended arsenal of translational apparatus components. J Virol 94:e01534-19. doi: 10.1128/JVI.01534-19.
  • 16. Boratto PVM, Oliveira GP, Machado TB, Andrade ACSP, Baudoin J-P, Klose T, Schulz F, Azza S, Decloquement P, Chabrière E, Colson P, Levasseur A, La Scola B, Abrahão JS. 2020. Yaravirus: a novel 80-nm virus infecting Acanthamoeba castellanii. Proc Natl Acad Sci USA 117:16579–16586. doi: 10.1073/pnas.2001637117.
  • 17. Andreani J, Khalil JYB, Baptiste E, Hasni I, Michelle C, Raoult D, Levasseur A, La Scola B. 2017. Orpheovirus IHUMI-LCC2: a new virus among the giant viruses. Front Microbiol 8:2643. doi: 10.3389/fmicb.2017.02643.
  • 18. Bajrai L, Benamar S, Azhar E, Robert C, Levasseur A, Raoult D, La Scola B. 2016. Kaumoebavirus, a new virus that clusters with faustoviruses and Asfarviridae. Viruses 8:278. doi: 10.3390/v8110278.
  • 19. Legendre M, Lartigue A, Bertaux L, Jeudy S, Bartoli J, Lescot M, Alempic J-M, Ramus C, Bruley C, Labadie K, Shmakova L, Rivkina E, Couté Y, Abergel C, Claverie J-M. 2015. In-depth study of Mollivirus sibericum, a new 30,000-y-old giant virus infecting Acanthamoeba. Proc Natl Acad Sci USA 112:E5327–E5335. doi: 10.1073/pnas.1510795112.
  • 20. Andreani J, Khalil JYB, Sevvana M, Benamar S, Di Pinto F, Bitam I, Colson P, Klose T, Rossmann MG, Raoult D, La Scola B. 2017. Pacmanvirus, a new giant icosahedral virus at the crossroads between Asfarviridae and faustoviruses. J Virol 91:e00212-17. doi: 10.1128/JVI.00212-17.
  • 21. Reteno DG, Benamar S, Khalil JB, Andreani J, Armstrong N, Klose T, Rossmann M, Colson P, Raoult D, La Scola B. 2015. Faustovirus, an Asfarvirus-related new lineage of giant viruses infecting amoebae. J Virol 89:6585–6594. doi: 10.1128/JVI.00115-15.
  • 22. Rolland C, Andreani J, Sahmi-Bounsiar D, Krupovic M, La Scola B, Levasseur A. 2021. Clandestinovirus: a giant virus with chromatin proteins and a potential to manipulate the cell cycle of its host Vermamoeba vermiformis. Front Microbiol 12:715608. doi: 10.3389/fmicb.2021.715608.
  • 23. Yoshikawa G, Blanc-Mathieu R, Song C, Kayama Y, Mochizuki T, Murata K, Ogata H, Takemura M. 2019. Medusavirus, a novel large DNA virus discovered from hot spring water. J Virol 93:2130–2148. doi: 10.1128/JVI.02130-18.
  • 24. Christo-Foroux E, Alempic J-M, Lartigue A, Santini S, Labadie K, Legendre M, Abergel C, Claverie J-M. 2020. Characterization of Mollivirus kamchatka, the first modern representative of the proposed Molliviridae family of giant viruses. J Virol 94:e01997-19. doi: 10.1128/JVI.01997-19.
  • 25. Geballa-Koukoulas K, Abdi S, La Scola B, Blanc G, Andreani J. 2021. Pacmanvirus S19, the second Pacmanvirus isolated from sewage waters in Oran, Algeria. Microbiol Resour Announc 10:e00693-21. doi: 10.1128/MRA.00693-21.
  • 26. Geballa-Koukoulas K, Andreani J, La Scola B, Blanc G. 2021. The Kaumoebavirus LCC10 genome reveals a unique gene strand bias among “extended Asfarviridae.” Viruses 13:148. doi: 10.3390/v13020148.
  • 27. Levasseur A, Andreani J, Delerce J, Bou Khalil J, Robert C, La Scola B, Raoult D. 2016. Comparison of a modern and fossil Pithovirus reveals its genetic conservation and evolution. Genome Biol Evol 8:2333–2339. doi: 10.1093/gbe/evw153.
  • 28. Silva LKDS, Andrade ACDSP, Dornas FP, Rodrigues RAL, Arantes T, Kroon EG, Bonjardim CA, Abrahão JS. 2018. Cedratvirus getuliensis replication cycle: an in-depth morphological analysis. Sci Rep 8:4000. doi: 10.1038/s41598-018-22398-3.
  • 29. Boudjemaa H, Andreani J, Bitam I, La Scola B. 2020. Diversity of amoeba-associated giant viruses isolated in Algeria. Diversity 12:215. doi: 10.3390/d12060215.
  • 30. Bertelli C, Mueller L, Thomas V, Pillonel T, Jacquier N, Greub G. 2017. Cedratvirus lausannensis—digging into Pithoviridae diversity. Environ Microbiol 19:4022–4034. doi: 10.1111/1462-2920.13813.
  • 31. Rodrigues RAL, Andreani J, Andrade ACDSP, Machado TB, Abdi S, Levasseur A, Abrahão JS, La Scola B. 2018. Morphologic and genomic analyses of new isolates reveal a second lineage of cedratviruses. J Virol 92:e00372-18. doi: 10.1128/JVI.00372-18.
  • 32. Jeudy S, Rigou S, Alempic JM, Claverie JM, Abergel C, Legendre M. 2020. The DNA methylation landscape of giant viruses. Nat Commun 11:1–12. doi: 10.1038/s41467-020-16414-2.
  • 33. Alempic JM, Lartigue A, Goncharov AE, Grosse G, Strauss J, Tikhonov AN, Fedorov AN, Poirot O, Legendre M, Santini S, Abergel C, Claverie JM. 2023. An update on eukaryotic viruses revived from ancient permafrost. Viruses 15:564. doi: 10.3390/v15020564.
  • 34. Souza F, Rodrigues R, Reis E, Lima M, La Scola B, Abrahão J. 2019. In-depth analysis of the replication cycle of Orpheovirus. Virol J 16:158. doi: 10.1186/s12985-019-1268-8.
  • 35. Aylward FO, Moniruzzaman M, Ha AD, Koonin EV. 2021. A phylogenomic framework for charting the diversity and evolution of giant viruses. PLoS Biol 19:e3001430. doi: 10.1371/journal.pbio.3001430.
  • 36. Rigou S, Santini S, Abergel C, Claverie JM, Legendre M. 2022. Past and present giant viruses diversity explored through permafrost metagenomics. Nat Commun 13:1–13. doi: 10.1038/s41467-022-33633-x.
  • 37. Andrade ACDSP, Arantes TS, Rodrigues RAL, Machado TB, Dornas FP, Landell MF, Furst C, Borges LGA, Dutra LAL, Almeida G, Trindade GDS, Bergier I, Abrahão W, Borges IA, Cortines JR, de Oliveira DB, Kroon EG, Abrahão JS. 2018. Ubiquitous giants: a plethora of giant viruses found in Brazil and Antarctica. Virol J 15:22. doi: 10.1186/s12985-018-0930-x.
  • 38. La Scola B, Audic S, Robert C, Jungang L, de Lamballerie X, Drancourt M, Birtles R, Claverie J-M, Raoult D, Bruley C, Garin J, Claverie J-M, Abergel C. 2003. A giant virus in amoebae. Science 299:2033. doi: 10.1126/science.1081867.
  • 39. Endo H, Blanc-Mathieu R, Li Y, Salazar G, Henry N, Labadie K, de Vargas C, Sullivan MB, Bowler C, Wincker P, Karp-Boss L, Sunagawa S, Ogata H. 2020. Biogeography of marine giant viruses reveals their interplay with eukaryotes and ecological functions. Nat Ecol Evol 4:1639–1649. doi: 10.1038/s41559-020-01288-w.
  • 40. Schulz F, Abergel C, Woyke T. 2022. Giant virus biology and diversity in the era of genome-resolved metagenomics. Nat Rev Microbiol 20:721–736. doi: 10.1038/s41579-022-00754-5.
  • 41. Bäckström D, Yutin N, Jørgensen SL, Dharamshi J, Homa F, Zaremba-Niedwiedzka K, Spang A, Wolf YI, Koonin EV, Ettema TJG. 2019. Virus genomes from deep sea sediments expand the ocean megavirome and support independent origins of viral gigantism. mBio 10:e02497-18. doi: 10.1128/mBio.02497-18.
  • 42. Dornas FP, Khalil JYB, Pagnier I, Raoult D, Abrahão J, La Scola B. 2015. Isolation of new Brazilian giant viruses from environmental samples using a panel of protozoa. Front Microbiol 6:1086. doi: 10.3389/fmicb.2015.01086.
  • 43. Sahmi-Bounsiar D, Rolland C, Aherfi S, Boudjemaa H, Levasseur A, La Scola B, Colson P. 2021. Marseilleviruses: an update in 2021. Front Microbiol 12:648731. doi: 10.3389/fmicb.2021.648731.
  • 44. Rodrigues RAL, Queiroz VF, Ghosh J, Dunigan DD, Van Etten JL. 2021. Functional genomic analyses reveal an open pan-genome for the chloroviruses and a potential for genetic innovation in new isolates. J Virol 96:e01367-21. doi: 10.1128/JVI.01367-21.
  • 45. Assis FL, Franco-Luiz APM, dos Santos RN, Campos FS, Dornas FP, Borato PVM, Franco AC, Abrahao JS, Colson P, La Scola B. 2017. Genome characterization of the first mimiviruses of lineage C isolated in Brazil. Front Microbiol 8:2562. doi: 10.3389/fmicb.2017.02562.
  • 46. Wang Z, Jia L, Li J, Liu H, Liu D. 2020. Pan-genomic analysis of African swine fever virus. Virol Sin 35:662–665. doi: 10.1007/s12250-019-00173-6.
  • 47. Benamar S, Reteno DGI, Bandaly V, Labas N, Raoult D, La Scola B. 2016. Faustoviruses: comparative genomics of new Megavirales family members. Front Microbiol 7:3. doi: 10.3389/fmicb.2016.00003.
  • 48. Assis FL, Bajrai L, Abrahao JS, Kroon EG, Dornas FP, Andrade KR, Boratto PVM, Pilotto MR, Robert C, Benamar S, La Scola B, Colson P. 2015. Pan-genome analysis of Brazilian lineage A amoebal mimiviruses. Viruses 7:3483–3499. doi: 10.3390/v7072782.
  • 49. Emms DM, Kelly S. 2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20:1–14. doi: 10.1186/s13059-019-1832-y.
  • 50. Lechner M, Findeiss S, Steiner L, Marz M, Stadler PF, Prohaska SJ. 2011. Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics 12:124. doi: 10.1186/1471-2105-12-124.
  • 51. Schulz F, Roux S, Paez-Espino D, Jungbluth S, Walsh DA, Denef VJ, McMahon KD, Konstantinidis KT, Eloe-Fadrosh EA, Kyrpides NC, Woyke T. 2020. Giant virus diversity and host interactions through global metagenomics. Nature 578:432–436. doi: 10.1038/s41586-020-1957-x.
  • 52. Boratto PVM, Arantes TS, Silva LCF, Assis FL, Kroon EG, La Scola B, Abrahão JS. 2015. Niemeyer virus: a new mimivirus group A isolate harboring a set of duplicated aminoacyl-tRNA synthetase genes. Front Microbiol 6:1256. doi: 10.3389/fmicb.2015.01256.
  • 53. Moniruzzaman M, Martinez-Gutierrez CA, Weinheimer AR, Aylward FO. 2020. Dynamic genome evolution and complex virocell metabolism of globally-distributed giant viruses. Nat Commun 11:1–11. doi: 10.1038/s41467-020-15507-2.
  • 54. Reed LJ, Muench H. 1938. A simple method of estimating fifty per cent endpoints. Am J Epidemiol 27:493–497. doi: 10.1093/oxfordjournals.aje.a118408.
  • 55. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 56. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021.
  • 57. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. 2020. Using SPAdes de novo assembler. Curr Protoc Bioinforma 70:e102. doi: 10.1002/cpbi.102.
  • 58. Bosi E, Donati B, Galardini M, Brunetti S, Sagot MF, Lió P, Crescenzi P, Fani R, Fondi M. 2015. MeDuSa: a multi-draft based scaffolder. Bioinformatics 31:2443–2451. doi: 10.1093/bioinformatics/btv171.
  • 59. Darling AE, Mau B, Perna NT. 2010. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5:e11147. doi: 10.1371/journal.pone.0011147.
  • 60. Besemer J, Borodovsky M. 2005. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res 33:W451–W454. doi: 10.1093/nar/gki487.
  • 61. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152.
  • 62. Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33:W244–W248. doi: 10.1093/nar/gki408.
  • 63. Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R, Mulder J, Attwood TK, Bairoch A, Bateman A, Binns D, Bradley P, Bork P, Bucher P. 2005. InterProScan: protein domains identifier. Nucleic Acids Res 33:W116–W120. doi: 10.1093/nar/gki442.
  • 64. Blum M, Chang H, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, Mi H, Natale DA, Necci M, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A, Finn RD. 2020. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res 49:344–354.
  • 65. Waite DW, Liefting L, Delmiglio C, Chernyavtseva A, Ha HJ, Thompson JR. 2022. Development and validation of a bioinformatic workflow for the rapid detection of viruses in biosecurity. Viruses 14:2163. doi: 10.3390/v14102163.
  • 66. Bastian M, Heymann S, Jacomy M. 2009. Gephi: an open source software for exploring and manipulating networks. Proc Int AAAI Conf Web Social Media 3:361–362. doi: 10.1609/icwsm.v3i1.13937.
  • 67. Kalyaanamoorthy S, Minh BQ, Wong TKF, Von Haeseler A, Jermiin LS. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589. doi: 10.1038/nmeth.4285.
  • 68. Nguyen LT, Schmidt HA, Von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300.
  • 69. Letunic I, Bork P. 2021. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 49:W293–W296. doi: 10.1093/nar/gkab301.


Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
