Skip to main content
The Journal of General Virology logoLink to The Journal of General Virology
. 2017 Aug 31;98(9):2368–2378. doi: 10.1099/jgv.0.000902

Polycipiviridae: a proposed new family of polycistronic picorna-like RNA viruses

Ingrida Olendraite 1,, Nina I Lukhovitskaya 1,, Sanford D Porter 2, Steven M Valles 2, Andrew E Firth 1,*
PMCID: PMC5656759  PMID: 28857036

Abstract

Solenopsis invicta virus 2 is a single-stranded positive-sense picorna-like RNA virus with an unusual genome structure. The monopartite genome of approximately 11 kb contains four open reading frames in its 5′ third, three of which encode proteins with homology to picornavirus-like jelly-roll fold capsid proteins. These are followed by an intergenic region, and then a single long open reading frame that covers the 3′ two-thirds of the genome. The polypeptide translation of this 3′ open reading frame contains motifs characteristic of picornavirus-like helicase, protease and RNA-dependent RNA polymerase domains. An inspection of public transcriptome shotgun assembly sequences revealed five related apparently nearly complete virus genomes isolated from ant species and one from a dipteran insect. By high-throughput sequencing and in silico assembly of RNA isolated from Solenopsis invicta and four other ant species, followed by targeted Sanger sequencing, we obtained nearly complete genomes for four further viruses in the group. Four further sequences were obtained from a recent large-scale invertebrate virus study. The 15 sequences are highly divergent (pairwise amino acid identities of as low as 17 % in the non-structural polyprotein), but possess the same overall polycistronic genome structure, which is distinct from all other characterized picorna-like viruses. Consequently, we propose the formation of a new virus family, Polycipiviridae, to classify this clade of arthropod-infecting polycistronic picorna-like viruses. We further propose that this family be divided into three genera: Chipolycivirus (2 species), Hupolycivirus (2 species) and Sopolycivirus (11 species), with members of the latter infecting ants in at least 3 different subfamilies.

Keywords: Picornavirales, insect, RNA virus, ant, taxonomy

Abbreviations

Hel, helicase; IRES, internal ribosome entry site; LneV-1, Lasius neglectus virus 1; LniV-1, Lasius niger virus 1; MsaV-1, Myrmica scabrinodis virus 1; ORF, open reading frame; Pro, protease; RdRp, RNA-dependent RNA polymerase; SINV-2, Solenopsis invicta virus 2; SINV-4, Solenopsis invicta virus 4; SNP, single nucleotide polymorphism; TSA, transcriptome shotgun assembly.

Introduction

The order Picornavirales currently comprises the families Dicistroviridae, Iflaviridae, Marnaviridae, Picornaviridae and Secoviridae, and the unassigned genera Bacillarnavirus and Labyrnavirus. Members of the order are characterized by (i) a positive-sense RNA genome, usually with a 5′ covalently linked VPg (virus protein, genome linked) and a 3′ poly(A) tail, (ii) a polyprotein gene expression strategy with cleavage mainly mediated by viral protease(s), (iii) a structural protein module containing three jelly-roll capsid protein domains which form small non-enveloped icosahedral virions with pseudo T=3 symmetry and (iv) a non-structural protein module containing a superfamily III helicase (or NTPase), a 3C-like chymotrypsin-like protease and a superfamily I RNA-dependent RNA polymerase (RdRp), encoded sequentially in that order [1]. In contrast, the families Caliciviridae and Potyviridae (for example), although termed picorna-like (and informally grouped into a ‘picorna-like superfamily’) are excluded from the Picornavirales order for various reasons; for example, caliciviruses encode a single jelly-roll capsid protein and virions have true T=3 symmetry, while potyviruses have an unrelated capsid protein and filamentous virions, and encode a superfamily II instead of a superfamily III helicase.

Solenopsis invicta virus 2 (SINV-2) is a positive-sense single-stranded RNA virus that infects the red imported fire ant, Solenopsis invicta Buren, an invasive species in the southern USA [2]. Replicating virus was detected in the larval and adult stages of S. invicta [3]. While field infection rates are rather low for S. invicta [4, 5], there are distinct fitness costs for founding queens infected with the virus [6]. Colonies established from SINV-2-infected queens produced less brood and had longer claustral periods. SINV-2 infection was also associated with significant up-regulation of global gene expression in S. invicta queens, including immune response genes [6]. The monopartite genome was originally found to contain three open reading frames (ORFs) in its 5′ third and a single long ORF in its 3′ two-thirds, though we show here that there are actually four consecutive 5′ ORFs, one having been overlooked previously as a result of a sequencing error. The polypeptide encoded by the 3′ ORF contains motifs characteristic of a superfamily III helicase, a 3C-like chymotrypsin-related protease and a superfamily I RdRp [2]. A phylogenetic analysis of RdRp sequences placed SINV-2 within the ‘picorna-like superfamily’, but outside of established virus families [7]. Isometric particles with a diameter of ~33 nm were only found in ants testing positive for SINV-2 by RT-PCR [2].

By searching the NCBI transcriptome shotgun assembly (TSA) database, we identified six related sequences, of which five were derived from ant samples. To further explore the diversity and prevalence of this group of viruses, we performed high-throughput sequencing of RNA derived from Solenopsis invicta and four UK ant species. This led to the identification of four additional viruses. During the course of this work, 1445 new viruses were identified via high-throughput sequencing [8], of which four exhibited similar genome organizations to SINV-2. We performed a phylogenetic and comparative genomic analysis of these SINV-2-like virus sequences. All have a characteristic genome organization, with four consecutive 5′ proximal ORFs and one long 3′ ORF. ORFs 1, 3 and 4 are predicted to encode jelly-roll fold capsid proteins, while ORF2 encodes a protein of unknown function. Ant-infecting members of the group apparently have an additional 5′ ORF (ORF2b) overlapping the 5′ end of ORF2 and encoding a small protein containing a predicted transmembrane domain. The 3′ ORF5 encodes helicase, protease and RdRp domains, is presumed to encode a VPg and potentially another protein between the helicase and protease, and is likely to encode one or more additional proteins upstream of the helicase. The unusual genome organization and phylogenetic distinctness of this group of viruses suggests they should be classified into a new virus family, for which we propose the name Polycipiviridae (polycistronic picorna-like viruses). The characteristic complement of picornavirus-like protein domains suggests that the Polycipiviridae should be included within the Picornavirales order.

Results

Identification of SINV-2-like viruses

The SINV-2 genome was first sequenced by Valles et al. [2]. Suspicious of a large apparently non-coding gap following ORF1, we resequenced the SINV-2 genome. This resulted in the correction of a UGA stop codon to a UGG tryptophan codon, after which it became apparent that the region following ORF1 is actually occupied by an ORF. The reannotated SINV-2 ORFs are shown in Fig. 1. The new SINV-2 sequence has been deposited in GenBank under accession MF041813.1, and has 143 nucleotide differences relative to the previous SINV-2 sequence EF428566.1.

Fig. 1.

Fig. 1.

Genome map of SINV-2. The genome is represented by a black line. ORFs are indicated in pale blue with vertical offsets indicating the reading frame (−1, 0, +1) relative to ORF5. IGR, intergenic region.

Further high-throughput sequencing of Solenopsis invicta RNA samples revealed the presence of a new SINV-2-like virus, which we named Solenopsis invicta virus 4 (SINV-4). A complete genome was obtained by Sanger sequencing and has been deposited in GenBank with the accession number MF041808.1. SINV-4 has the same genome organization (Fig. 2), and 31 % amino acid identity to SINV-2 in the ORF5 polypeptide.

Fig. 2.

Fig. 2.

Genome maps of proposed polycipiviruses. ORFs are represented by coloured rectangles with vertical offsets indicating reading frames relative to the Hel-Pro-RdRp ORF. Green rectangles indicate ORFs encoding jelly-roll capsid protein domains as predicted by HHpred; hatched green lines indicate uncertain capsid domain identifications. Pink rectangles indicate a conserved overlapping ORF encoding a transmembrane helix as predicted by TMHMM. A dark green rectangle in the F. exsecta TSA map indicates an additional ORF, 3a. Replication protein domains, identified by their conserved characteristic motifs, are indicated as Hel (helicase), Pro (protease) and RdRp (RNA-dependent RNA polymerase). TSA indicates sequences obtained from the NCBI Transcriptome Shotgun Assembly database. Two horizontal lines (on the right side) separate the three proposed genera.

Next we queried the NCBI Transcriptome Shotgun Assembly (TSA) database using tblastn with the SINV-2 ORF5 polypeptide as the query and eukaryotes as the target taxonomic group. Amongst the top matching ‘hits’, long sequences (>10 kb) were identified, and their longest ORF extracted, translated and queried against NCBI RefSeq virus genome sequences. Sequences whose reciprocal best match was SINV-2 were retained, while other sequences (typically with top matches to iflavirus or dicistrovirus sequences) were discarded. This resulted in five long sequences with reciprocal best matches to SINV-2. An additional long sequence was obtained by merging the partly overlapping fragments LH935078.1 and LH935077.1. Five of the long sequences derive from ant RNA-Seq libraries (order Hymenoptera, family Formicidae; LA858223.1 and LA866448.1 from Monomorium pharaonis, LI526777.1 from Lasius neglectus, LI719284.1 from Linepithema humile and the combined LH935078.1+LH935077.1 from Formica exsecta). The sixth sequence, KA182589.1, derives from a Chironomus riparius RNA-Seq library (order Diptera; family Chironomidea). Sequence attributes are recorded in Table 1 and the genome organizations are depicted in Fig. 2. A number of shorter partial genome sequences were also identified, but are not discussed further here.

Table 1. Viruses in the proposed family Polycipiviridae.

Sequence name Accession no. Host species or source Sequence length§, nt 5′ UTR length|| ORF coordinates¶ 3′ UTR length||
ORF1 ORF2b ORF2 ORF3 ORF4 IGR length ORF5
Start End Start End Start End Start End Start End Start End
SINV-2 EF428566.1
MF041813.1
Solenopsis invicta 11303** 301 302 1078 1079 1309 1075 1878 1871 2647 2644 3792 662 4455 10916 387
M. pharaonis TSA LA858223.1 rev Monomorium pharaonis 11259** 255 256 984 985 1221 981 1763 1753 2547 2544 3692 658 4351 10812 447
LniV-1 MF041812.1 Lasius niger 11092 134 135 899 900 1157 896 1705 1695 2507 2507 3970 687 4658
L. neglectus TSA LI526777.1 Lasius neglectus 11854** 366 367 1131 1132 1392 1128 1940 1930 2742 2742 4181 657 4839 11375 479
L. humile TSA LI719284.1 Linepithema humile 11245** 258 259 1011 1015 1305 1011 1817 1804 2664 2661 3872 385 4258 10860 385
M. pharaonis TSA LA866448.1 Monomorium pharaonis 12160** 230 231 1064 1069 1434 1065 1898 1891 2697 2702 4018 759 4778 11719 441
Shuangao insect virus 8 KX883910.1 Insects mix (1)† 12155 224 225 1046 1050 1415 1046 1882 1875 2693 2690 4033 768 4802 11755 400
SINV-4 MF041808.1 Solenopsis invicta 12077** 198 199 1065 1069 1434 1065 1898 1891 2697 2697 4034 734 4769 11662 415
MsaV-1 MF041810.1 Myrmica scabrinodis 11872** 228 229 1062 1111 1482 1059 1973 1970 2773 2776 4134 438 4573 11430 442
LneV-1 MF041809.1 Lasius neglectus 11821** 207 208 1053 1099 1422 1050 1946 1943 2782 2772 4259 336 4596 11402 419
F. exsecta TSA LH935078.1 rev
LH935077.1
Formica exsecta 11899 206 207 1046 1080 1388 1043 1876# 2079 2879 2882 4300 386 4687 11535 364
Hubei picorna-like virus 81 KX884540.1 Procambarus clarkia 8116 84 81 1019 357 1377 8006 110
Hubei picorna-like virus 81 KX883940.1 Insects mix (2)‡ 10315 189 190 867 857 1540 1537 2157 2154 3098 353 3452 10078 237
Ch. riparius TSA KA182589.1 Chironomus riparius 11462 365 366 1109 1109 1627 1630 2418 2421 3413 384 3798 11144 318
Hubei picorna-like virus 82 KX883688.1 Spiders mix* 11081 279 280 1023 1023 1751 1748 2722 2719 3723 442 4166 10939 142

*Spiders mix (Neoscona nautica, Parasteatoda tepidariorum, Plexippus setipes, Pirata sp., Araneae sp.).

†Insects mix (1) (Abraxas tenuisuffusa, Hermetia illucens, Chrysopidae sp., Coleoptera sp., Psychoda alternata, Diptera sp., Stratiomyidae sp.).

‡Insects mix (2) (Paramercion melanotum, Paracercion calamorum, Ceriagrion auranticum, Brachydiplax chalybea, Orthetrum albistylum, Pseudothemis zonata, Chironomus sp.).

§Excluding polyA (if present).

||UTRs are likely to be incomplete in some species.

¶ORF coordinates (including stop codon).

#The F. exsecta TSA sequence has an additional ORF (ORF3a; 1873 - 2082) between ORF2 and ORF3.

**Sequence extends to 3′ polyA.

Given that SINV-2-like viruses appeared to be most prevalent in ants, we obtained samples from four different ant species (Lasius flavus, L. niger, L. neglectus and Myrmica scabrinodis) collected in Cambridge, UK, and performed two rounds of high-throughput RNA sequencing. Contigs were assembled using Trinity [9, 10] and Velvet [11], and SINV-2-like sequences were identified with blastx.

In the first round, we sequenced L. niger and L. flavus RNA samples using both small-RNA sequencing to enrich for 21–22 nt virus-derived RNAs that are expected to be produced as a result of the insect RNA interference (RNAi) anti-viral defence pathway, and standard RNA-Seq. In this case, good contig assemblies were not obtained for the small-RNA samples and no further small-RNA sequencing was performed. For the long RNA-Seq of L. niger, but not L. flavus, we identified one SINV-2-like sequence. In the second round, we performed long RNA-Seq for RNA obtained from M. scabrinodis, L. neglectus and a second L. flavus sample, and identified SINV-2-like sequences for M. scabrinodis and L. neglectus. Morphological ant species identifications were confirmed by blastn of cytochrome C oxidase subunit I (COI) sequences obtained from the transcriptome assemblies against the NCBI non-redundant (nr) nucleotide database.

The three SINV-2-like sequences were verified by Sanger sequencing and extended where possible by 5′ and 3′ RACE (see below). Sequence attributes are recorded in Table 1 and the genome organizations are depicted in Fig. 2. Hereafter we refer to these viruses as Lasius niger virus 1 (LniV-1), Lasius neglectus virus 1 (LneV-1) and Myrmica scabrinodis virus 1 (MsaV-1), with the GenBank accession numbers MF041812.1, MF041809.1 and MF041810.1, respectively.

Using Bowtie 2 [12], we mapped RNA-Seq reads (>30 nt) back to the assembled virus genomes to assess coverage and variation. LniV-1, LneV-1 and MsaV-1 had mean coverage values of 6.4, 457 and 1348, respectively. For the two viruses with high coverage (Fig. 3), we identified single nucleotide polymorphisms (SNPs) that were present in >10 % of reads. In MsaV-1, we found seven SNPs in coding regions, all non-synonymous and with frequencies just over 10 %. In LneV-1, we found 87 SNPs, of which 62 were synonymous – 5/22 in ORF1, 3/3 in ORF2, 7/7 in ORF3, 8/8 in ORF4 and 39/47 in ORF5.

Fig. 3.

Fig. 3.

RNA sequencing coverage of the MsaV-1 and LneV-1 genomes. The total coverage at each nucleotide position is indicated in red (positive sense) or blue (negative sense; note the different axis scale). Black lines show the mean coverage in a 1000-nt sliding window.

During the course of the above work, 1445 new RNA viruses were identified via high-throughput sequencing of RNA samples obtained from a variety of invertebrates [8]. The authors identified the most similar previously published virus sequence for each new virus, four of which were found to have highest similarity to SINV-2. These four sequences derive not from ants, but from a mixed sample of spiders (KX883688.1), a mixed insect sample including members of the orders Odonata and Diptera (KX883940.1), the crayfish Procambarus clarkia (KX884540.1) and a mixed insect sample including members of the orders Lepidoptera, Diptera, Neuroptera and Coleoptera (KX883910.1), with respectively 19, 21, 21 and 30 % amino acid sequence identity to SINV-2 in the ORF5 polypeptide. The sequences from spiders, Odonata/Diptera and crayfish have low abundance (<0.1 % of sample non-rRNA), leading the authors to suggest the possibility that they may derive from parasites or gut contents (which in these cases might include insects) instead of the target organisms. Sequence attributes are recorded in Table 1 and the genome organizations are depicted in Fig. 2.

In summary, we identified a total of 15 SINV-2-like sequences (Table 1), comprising a correction of the original SINV-2 sequence, SINV-4, six sequences from the NCBI TSA database, three sequences from UK ant species and four additional recently published sequences. Fourteen of the sequences appear to represent complete or nearly complete virus genomes.

Genome organization of SINV-2-like viruses

Each full-length sequence had five main ORFs (Fig. 2). The stop and start codons of consecutive ORFs in the 5′ region have closely spaced (often overlapping) stop and start codons (Table 1). We also identified a shorter sixth ORF (termed ORF2b) in some members of the group (Fig. 2). ORF2b overlaps the 5′ end of ORF2 in the +1 frame relative to ORF2. The Formica exsecta TSA contains an additional short ORF (ORF3a) between ORFs 2 and 3. Hubei picorna-like virus 81 (accession number KX884540; [8]) appears to be incomplete, lacking ORFs 1 and 2 and most of 3. Sequences with ORF2b cluster phylogenetically (Fig. 4; see below) and all are ant-associated, except for Shuangao insect virus 8, which was derived from an ‘insect mix’ that was not reported to contain members of the family Formicidae [8]. However, insects within this sample may include predators of ants; thus the sequence could potentially have originated from an ant (or other) host.

Fig. 4.

Fig. 4.

Phylogenetic tree of the proposed family Polycipiviridae. ORF5 amino acid sequences were aligned with muscle, and a Bayesian Markov chain Monte Carlo-based phylogenetic tree was produced with MrBayes. All posterior probabilities were equal to 1.00. The tree was mid-point rooted and visualized with FigTree. TSA indicates sequences obtained from the NCBI Transcriptome Shotgun Assembly database. Proposed Polycipiviridae genera – Chipolycivirus, Hupolycivirus and Sopolycivirus – are delineated with boxes. For sopolyciviruses, the host ant subfamilies are indicated on the right-hand side.

Most of the 15 sequences had some amount of the 5′ and 3′ untranslated regions (UTRs) present, though, given the variability in UTR lengths, we assume that several were incomplete. The 5′ UTR sequences were frequently >200 nt and ranged up to 366 nt in length. Eight sequences extend to the poly(A) tail and may therefore be assumed to have been 3′-complete. For these sequences, the 3′ UTR lengths ranged from 385 to 479 nt, excluding the poly(A) sequence. The length of the intergenic region between ORFs 4 and 5 ranged from 336 to 768 nt. To assess 5′-completeness we compared 5′ nucleotide sequences between different species (Figure S1, available in the online Supplementary Material). A number of polycipivirus sequences exhibit a predicted stable simple stem–loop structure close to the 5′ end. In SINV-4, SINV-2 and the L. neglectus TSA, this is preceded by an AU-rich tract with 5′-terminal UUU. This 5′ end similarity between these highly divergent sequences suggests that this represents the true 5′ end of the genome. Relative to these, the MsaV-1 and LneV-1 sequences appear to be missing approximately 5 and 23 nt, respectively, from their 5′ ends. For the LniV-1 sequence we were unable to obtain complete 5′ or 3′ UTR sequences.

For each ORF, functions were predicted with HHpred [13] using the Pfam [14] and PDB [15] databases. For most sequences, ORFs 1, 3 and 4 were predicted to encode picornavirus-like capsid proteins (Fig. 2). For the most divergent sequences – Hubei picorna-like viruses 81 and 82, and the C. riparius TSA – ORF1 was predicted to encode a picornavirus-like capsid protein, but for ORF4 this was only predicted for Hubei picorna-like virus 81, and for ORF3 it was not predicted for any of the three. However, for ORFs 3 and 4, HHpred [16, 17] found significant sequence alignments (E-values <0.0001; alignment lengths ranging from 112 to 245 amino acids) between an alignment of the 11 sequences from the SINV-4/SINV-2 clade and a query alignment of the C. riparius TSA and Hubei picorna-like virus 82, or the single Hubei picorna-like virus 81 sequence, suggesting that these ORFs encode homologous proteins across all 14 sequences. For all sequences, ORF5 was predicted to encode helicase (Hel), protease (Pro) and RNA-dependent RNA polymerase (RdRp) domains. ORFs 2 and 2b had no HHpred matches; however, a transmembrane domain was predicted using TMHMM [18] in the middle of all ORF2b amino acid sequences. HHpred indicated homology between ORF2 of the SINV-4/SINV-2 clade and ORF2 of Hubei picorna-like virus 81 (E-value <10−8; alignment length 126 amino acids), but homology to ORF2 of the C. riparius TSA/Hubei picorna-like virus 82 clade was less certain (E-value=0.029; alignment length 165 amino acids). The Formica exsecta TSA ORF3a polypeptide also had no significant HHpred matches.

Following Koonin and Dolja [19], we identified characteristic Hel, Pro and RdRp motifs in the ORF5 polypeptide sequences. All sequences contained the three superfamily III helicase motifs (Fig. S2). The protease domain is less well conserved across picorna-like viruses, with only three very short characteristic motifs corresponding to the catalytic triad – H, D and C (within a quite conserved GxCG motif) [19]. Positive-sense RNA viruses, and in particular picorna-like viruses, usually have chymotrypsin-like cysteine proteases. In the SINV-2-like sequences, however, the protease has a serine (S) at the corresponding active site, GxSG (Fig. S3). We were able to find all eight conserved RdRp motifs (Fig. S4), with the motifs most closely matching superfamily I RdRps (a group that also includes the RdRps of picornaviruses, potyviruses, sobemoviruses and nidoviruses [19]). In two sequences (the C. riparius TSA and Hubei picorna-like virus 82), the usually very well-conserved GDD in motif VI (also often called motif C) was replaced with ADD.

Phylogeny of SINV-2-like viruses

Based on the genome structure and the identified protein domains, SINV-2-like sequences may be classified within the order Picornavirales. To test for monophyly, we obtained amino acid sequences for the conserved ‘core’ region of the RdRp from viruses in the order Picornavirales from the supplementary material of [7], appended the equivalent region from the 15 SINV-2-like sequences with RdRp coverage, realigned the sequences with muscle [20] and generated a Bayesian Markov chain Monte Carlo-based phylogenetic tree using MrBayes [21]. The SINV-2-like sequences form a monophyletic (albeit highly divergent) group (Fig. 5).

Fig. 5.

Fig. 5.

Phylogenetic tree for polycipiviruses and representative members of the order Picornavirales. Core RdRp amino acid sequences from representative Picornavirales viruses were obtained from Koonin et al. [7] and combined with the equivalent regions from the 15 sequences in Table 1 with RdRp coverage. Sequences were aligned with muscle, and a Bayesian Markov chain Monte Carlo-based phylogenetic tree was produced. Posterior probabilities are indicated for family root nodes, and elsewhere if P<1.00. Abbreviations: ALSV, apple latent spherical virus; BBWV1, broad bean wilt virus 1; CPMV, cowpea mosaic virus; CRLV, cherry rasp leaf virus; CrPV, cricket paralysis virus; DCV, Drosophila C virus; DWV, deformed wing virus; EMCV, encephalomyocarditis virus; FMDV, foot-and-mouth disease virus; GFLV, grapevine fanleaf virus; HaRNAV, Heterosigma akashiwo RNA virus; HAV, hepatitis A virus; HplV-81, Hubei picorna-like virus 81; HplV-82, Hubei picorna-like virus 82; HRV1A, human rhinovirus 1A; IFV, infectious flacherie virus; LneV-1, Lasius neglectus virus 1; LniV-1, Lasius niger virus 1; MsaV-1, Myrmica scabrinodis virus 1; NIMV, navel orange infectious mottling virus; PnPV, Perina nuda picorna-like virus; PV, polio virus; PYFV, parsnip yellow fleck virus; RTSV, rice tungro spherical virus; SDV, satsuma dwarf virus; ShiV-8, Shuangao insect virus 8; SINV-2, Solenopsis invicta virus 2; SINV-4, Solenopsis invicta virus 4; TRSV, tobacco ringspot virus; TrV, Triatoma virus; TSV, Taura syndrome virus.

Discussion

We have identified and sequenced four new viruses from ant species. Together with SINV-2, 6 sequences recovered from TSA databases and 4 recent additions to GenBank, these 15 sequences form a distinct group of arthropod-infecting viruses with a characteristic genome organization. Members have a polyadenylated positive-sense RNA genome that encodes three related picornavirus-like jelly-roll capsid domains, and a non-structural polyprotein containing superfamily III helicase, 3C-like chymotrypsin-related protease and superfamily I RdRp domains, characteristic of members of the order Picornavirales. Thus we propose that they should be classified into a new family, Polycipiviridae (polycistronic picorna-like viruses), within the order Picornavirales. The virion morphology (small icosahedral particles) observed for SINV-2 [2] is also consistent with this placement. Like the dicistronic dicistroviruses (family Dicistroviridae) and the bipartite cheraviruses, sadwaviruses and comoviruses (family Secoviridae), the polycipivirus coding sequences are split into separate non-structural and structural protein modules. In contrast to these other groups, the polycipivirus structural protein module is further split into separate 5′ ORFs rather than depending on polyprotein expression of multiple jelly-roll domains from a single ORF.

Although polycipiviruses form a distinct clade, there is considerable diversity among the 15 available sequences (some of the ORF5 pairwise amino acid identities are as low as 17 %), indicating that the family should be split into a number of genera. We propose the following groupings of the currently available sequences (see Fig. 4): Sopolycivirus (from Solenopsis invicta polycipivirus) to include SINV-2, SINV-4 and nine related sequences, all of which contain ORF2b; Hupolycivirus (from Hubei picorna-like virus 81 polycipivirus) to include the two Hubei picorna-like virus 81 sequences; and Chipolycivirus (from Chironomus riparius polycipivirus) to include the C. riparius TSA and Hubei picorna-like virus 82, both of which contain the GDD to ADD substitution in RdRp motif VI.

The polycipivirus 3C-like protease is unusual in that it contains a serine residue at its active site (serine protease), whereas the majority of Picornavirales 3C-like proteases have a cysteine residue at this location (cysteine protease). The only other currently known exceptions are the sole member of the Picornavirales family, Marnaviridae, and some members of genus Nepovirus in the family Secoviridae [1, 22]. The ‘picorna-like’ astroviruses, sobemoviruses and poleroviruses also have chymotrypsin-like serine proteases, and cellular homologues have serine at the active site [23, 24].

The most conserved protein of positive-sense RNA viruses is the RdRp. In polycipiviruses, we were able to identify all eight signature motifs of the superfamily I RdRp. Unusually, however, in two of the sequences the highly conserved GDD of motif VI was replaced with ADD. Motif VI is responsible for magnesium ion coordination. The first aspartate (D) is mainly responsible for the coordination and cannot be substituted. There is, however, potential for flexibility in the third position, and even more flexibility in the first position (reviewed in [25]). Indeed, the glycine (G) has been experimentally substituted with six different amino acids, and with some substitutions the polymerase remains active. Substitution with alanine (ADD) in tobacco vein mottling virus (family Potyviridae), polio virus (family Picornaviridae) or hepatitis C virus (family Flaviviridae) results in an in vitro RdRp activity of 5 to 12 % of wild-type activity [26–28], although when the same mutation was introduced in encephalomyocarditis virus (family Picornaviridae) the RdRp was inactive in vitro [29]. Positive-sense and double-stranded RNA viruses have a strong preference for GDD, although nidoviruses (order Nidovirales) and some, but not all, hypoviruses (family Hypoviridae) have SDD at this site. On the other hand, non-segmented negative-sense RNA viruses generally have GDN, whereas segmented negative-sense RNA viruses have SDD, and the reverse transcriptases of retroviruses have MDD [30–33]. Unusually, members of the positive-sense RNA virus family Permutotetraviridae and the double-stranded RNA virus family Birnaviridae have a permuted polymerase domain, with motif VI occurring between motifs III and IV; the GDD sequence is still present in permutotetraviruses but is substituted with ADN in birnaviruses, and mutation of ADN to GDD results in an almost complete loss of activity [34, 35]. In summary, therefore, even though the ADD in motif VI of two polycipivirus sequences is a deviation from the expected GDD, it is still a plausible variation. The fact that it was observed in two independent sequences (which also cluster phylogenetically and are relatively distant from the other polycipivirus sequences; Fig. 4) indicates that ADD is a real variant and not a result of sequencing error.

Where investigated, Picornavirales species have been found to harbour a VPg protein covalently linked to the 5′ end of their genome that primes RNA synthesis during genome replication. The Picornavirales VPg is typically <5 kDa and normally encoded between Hel and Pro [1]. Due to their small size and lack of structural domains, divergent VPg proteins are not easily recognizable by sequence homology, and we were not able to definitively identify a VPg domain in polycipivirus sequences. Nonetheless, there is a large region of unassigned function between Hel and Pro that we presume encodes a VPg, and probably (due to the size of the region) another protein of unknown function. A further one or two additional proteins are likely encoded upstream of Hel in the ORF5 polyprotein. Due to the high divergence between polycipiviruses and picorna-like viruses whose cleavage sites have been characterized, we were unable to definitively predict the polyprotein cleavage sites.

As in other Picornavirales species, gene expression in polycipiviruses likely depends on internal ribosome entry site (IRES)-mediated initiation. Consistent with this, the lengthy 5′ UTRs typically contained a number of AUG codons that would be expected to inhibit 5′ end-dependent scanning. Consequently, we suppose the 5′ UTR to contain an IRES to direct ribosome initiation at the ORF1 start codon. Similarly, we suppose the long intergenic region between ORFs 4 and 5 to contain a second IRES to direct ribosome initiation at the ORF5 start codon. A similar situation occurs in dicistroviruses (family Dicistroviridae), where the translation of structural and non-structural polyproteins is directed by separate IRESes (although in that case the non-structural polyprotein ORF is 5′ proximal). Although we attempted to define the potential IRES RNA structure in silico by means of RNA-folding algorithms and inter-species comparisons, the results to date have been inconclusive. The translation mechanism of the additional 5′ ORFs (ORFs 2 to 4 and, where present, 2b) remains uncertain; however, close spacing of the stop and start codons of consecutive ORFs suggests a ribosome reinitiation mechanism.

Additional work will be required to confirm the composition and structure of virus particles, the presence and sequence of a 5′-linked VPg protein, the non-structural polyprotein cleavage sites and products, and the gene expression mechanisms of this unusual family of arthropod-infecting picorna-like viruses.

Ten of the 11 members of the proposed Polycipiviridae genus Sopolycivirus appear to infect ant species, while the eleventh member (Shuangao insect virus 8) is apparently an insect virus whose host has yet to be identified (Fig. 4). These viruses have been isolated from ants across several continents and in four out of the five ant species targeted in this study. Moreover, they have been identified in three different ant subfamilies, and three individual ant species (Solenopsis invicta, Monomorium pharaonis and Lasius neglectus) were found to play host to more than one divergent viruses in the group, suggesting a long evolutionary history between Sopolycivirus species and ants. Although we cannot rule out other hosts, and our own sampling has been biased towards discovering new ant-infecting members, it seems possible that genus Sopolycivirus may be an ant-specific clade.

Methods

SINV-4 identification and sequencing

Adult worker Solenopsis invicta ants were collected from 46 nests in the eastern portion of the state of Formosa, Argentina, and returned to quarantine in Gainesville, Florida, United States. Total RNA was extracted from 10 to 15 live worker ants from each colony using Trizol (Invitrogen) and the PureLink purification kit (Ambion). Total RNA (10 µg per group) was submitted to GE Healthcare (Los Angeles, CA, USA) for mRNA purification, library preparation and Illumina RNA sequencing (MiSeq). Sequences were aligned to the S. invicta genome (GenBank accession AEAQ01000001.1) and non-matching sequences were compared to the UniProt annotated Swiss-Prot protein sequence database (download date 14 November 2014). The unmatched sequences were assembled (Vector NTI, Invitrogen) and a unique sequence with significant similarity to SINV-2 was identified by blastx analysis. RT-PCR with gene-specific oligonucleotide primers revealed that this sequence was present in some colonies of S. invicta in the USA. Total RNA from USA S. invicta colonies containing the sequence was extracted and used as template for cDNA synthesis, subsequent PCR and RACE (5′ and 3′) to obtain the entire genome sequence of this new virus, SINV-4, by Sanger sequencing. For 5′ RACE, the 5′ RACE System for Rapid Amplification of cDNA Ends, version 2.0 (Invitrogen, Carlsbad, CA, USA) was used. cDNA was first synthesized with oligonucleotide p1476 (5′-TGGAATTCCAGAATTTTCTAAGGTTCCCATATTAGT), followed by PCR with the gene-specific primer p1474 (5′-TGAATTCCAGGTAACGCTTGAACCATTGGT) and the abridged anchor primer (Invitrogen). For 3′ RACE, the GeneRacer kit (Invitrogen) was used. cDNA was synthesized with the GeneRacer Oligo dT primer. PCR was subsequently completed with the GeneRacer 3′ primer and the gene-specific primer, p1536 (5′-ATGGCTGTTGCTGACATGTTATG CATTATGTT).

LniV-1, LneV-1 and MsaV-1 identification and sequencing

Adult L. niger, L. neglectus, M. scabrinodis and L. flavus ants were collected from individual nests in Cambridge, UK; all ants were workers except the sequencing round 1 L . flavus sample, for which queens were used. Total RNA was extracted from 10 to 20 adult workers or 5 queens using Trizol (Invitrogen) following the manufacturer’s instructions. RNA was treated with DNase (Promega, RQ1 RNase-free DNase). For standard RNA-Seq libraries, ribosomal RNA was depleted using the Ribo-Zero kit (Illumina), and the remaining RNA subjected to alkaline hydrolysis, followed by acrilamide gel purification of RNA bands within the size range 35–50 nt for L. niger and 70–85 for the other four species. For small-RNA sequencing, RNA bands within the size range 18–30 nt were acrylamide gel-purified from total RNA after DNase treatment. All library amplicons were constructed using a small RNA cloning strategy [36, 37] and sequenced (single-end, 75 nt) using the NextSeq500 platform (Illumina) at the DNA Sequencing Facility (Department of Biochemistry, University of Cambridge). High-throughput sequencing data were deposited in ArrayExpress (http://www.ebi.ac.uk/arrayexpress) under the accession number E-MTAB-5781. RNA-Seq reads were trimmed using the FASTX-Toolkit and assembled using the Trinity (v 2.3.2) and Velvet (v 1.2.10) de novo assemblers [10, 11]. Using blastx, the assembled contigs were compared to a database of polypeptide sequences derived from SINV-2, SINV-4 and the SINV-2-like sequences identified in the NCBI TSA database. Contigs that mapped to SINV-2-like proteins with E-value <10−6 and length >300 nt of coding sequence were retained, manually joined where possible, and used to design primers for Sanger sequencing.

Gaps were filled and the entire genomes sequenced by Sanger sequencing and terminal sequences obtained by 5′ and 3′ RACE, as follows. Two µg of total ant RNA was treated with proteinase K and then recovered by acid phenol/chloroform extraction and used for 5′- and 3′-RACE using the SMARTer RACE 5′/3′ kit (Clontech) according to the manufacturer's instructions. Gene-specific primers for 5′ and 3′ RACE were located within 1000 nt of the expected 5′ and 3′ ends of the virus genome. The resulting clones (12–19 for each of MsaV-1 5′ and 3′, LneV-1 5′ and 3′ and LniV-1 5′, and 4 for LniV-1 3′) in the pRACE vector were sequenced with M13 universal primers. The rest of the viral genomes were PCR-amplified as eight partially overlapping DNA fragments using specific primers and the same cDNA templates as were generated for the 5′/3′ RACE PCRs. These fragments were cloned into the pJET1.2 vector (Thermo Scientific) and sequenced with pJET1.2-specific primers.

Computational analysis

Sequences were processed using EMBOSS [38] and analysed using blast [39], HHpred [13] (using the PDB and Pfam databases) and Phyre 2 [40] (analysis performed between November 2016 and May 2017). Comparison of amino acid sequences/alignments between different polycipivirus clades was performed using HHpred in the align two sequences/alignments mode during 23–28 June 2017. Amino acid sequences were aligned using muscle v 3.8.31 [20] and phylogenetic trees were estimated using the Bayesian Markov chain Monte Carlo method implemented in MrBayes v 3.2.6 [21], sampling across the default set of fixed amino acid rate matrices, with 10 million generations, and discarding the first 25 % as burn-in. The trees were visualized using FigTree v 1.4.3. Pairwise amino acid identities were calculated based on pairwise ORF5 muscle alignments. To calculate the coverage and polymorphism frequencies, Bowtie 2 (v 2.3.0, [12]), using default parameters, was used to map raw NextSeq sequencing reads back to the LniV-1, LneV-1 and MsaV-1 genomes.

Accession numbers, vouchers, and new family submission

We have submitted a proposal to the International Committee on Taxonomy of Viruses (ICTV) to name a new family Polycipiviridae containing the genera Sopolycivirus, Hupolycivirus and Chipolycivirus in the order Picornavirales. Sequences for SINV-4, SINV-2, LneV-1, MsaV-1 and LniV-1 have been submitted to NCBI with the accession numbers MF041808, MF041813, MF041809, MF041810 and MF041812. Voucher specimens of the S. invicta ant species from which SINV-4 was originally sequenced are retained at USDA-ARS, Gainesville, Florida, USA.

Funding information

This work was supported by a Wellcome Trust grant [106207] and a European Research Council (ERC) European Union Horizon 2020 research and innovation programme grant [646891] to A.  E.  F.

Acknowledgements

L. A. Calcaterra and the Fundación para el Estudio de Especies Invasivas (FuEDEI) in Hurlingham, Argentina are thanked for their long-term and ongoing assistance with collecting and studying fire ants in Argentina. We thank Phillip Buckham-Bonnett for valuable assistance with ant localization, collection and identification in the UK, John P.  Carr for many helpful discussions and strategic assistance and the Cambridge University Botanic Garden for access for sample collection.

Conflicts of interest

The authors declare that there are no conflicts of interest.

Ethical statement

No specific permits were required to collect field specimens because they did not occur in locations that are protected in any way. The field collections did not involve endangered or protected species.

Supplementary Data

Supplementary File 1

References

  • 1.Le Gall O, Christian P, Fauquet CM, King AM, Knowles NJ, et al. Picornavirales, a proposed order of positive-sense single-stranded RNA viruses with a pseudo-T=3 virion architecture. Arch Virol. 2008;153:715–727. doi: 10.1007/s00705-008-0041-x. [DOI] [PubMed] [Google Scholar]
  • 2.Valles SM, Strong CA, Hashimoto Y. A new positive-strand RNA virus with unique genome characteristics from the red imported fire ant, Solenopsis invicta. Virology. 2007;365:457–463. doi: 10.1016/j.virol.2007.03.043. [DOI] [PubMed] [Google Scholar]
  • 3.Hashimoto Y, Valles SM. Infection characteristics of Solenopsis invicta virus 2 in the red imported fire ant, Solenopsis invicta. J Invertebr Pathol. 2008;99:136–140. doi: 10.1016/j.jip.2008.06.006. [DOI] [PubMed] [Google Scholar]
  • 4.Valles SM, Varone L, Ramírez L, Briano J. Multiplex detection of Solenopsis invicta viruses -1, -2, and -3. J Virol Methods. 2009;162:276–279. doi: 10.1016/j.jviromet.2009.07.019. [DOI] [PubMed] [Google Scholar]
  • 5.Valles SM, Oi DH, Porter SD. Seasonal variation and the co-occurrence of four pathogens and a group of parasites among monogyne and polygyne fire ant colonies. Biological Control. 2010;54:342–348. doi: 10.1016/j.biocontrol.2010.06.006. [DOI] [Google Scholar]
  • 6.Manfredini F, Shoemaker D, Grozinger CM. Dynamic changes in host-virus interactions associated with colony founding and social environment in fire ant queens (Solenopsis invicta) Ecol Evol. 2016;6:233–244. doi: 10.1002/ece3.1843. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Koonin EV, Wolf YI, Nagasaki K, Dolja VV. The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups. Nat Rev Microbiol. 2008;6:925–939. doi: 10.1038/nrmicro2030. [DOI] [PubMed] [Google Scholar]
  • 8.Shi M, Lin XD, Tian JH, Chen LJ, Chen X, et al. Redefining the invertebrate RNA virosphere. Nature. 2016;540:539–543. doi: 10.1038/nature20167. [DOI] [PubMed] [Google Scholar]
  • 9.Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–652. doi: 10.1038/nbt.1883. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–1512. doi: 10.1038/nprot.2013.084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–829. doi: 10.1101/gr.074492.107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33:W244–W248. doi: 10.1093/nar/gki408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44:D279–D285. doi: 10.1093/nar/gkv1344. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Alva V, Nam SZ, Söding J, Lupas AN. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res. 2016;44:W410–W415. doi: 10.1093/nar/gkw348. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005;21:951–960. doi: 10.1093/bioinformatics/bti125. [DOI] [PubMed] [Google Scholar]
  • 18.Krogh A, Larsson B, Von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–580. doi: 10.1006/jmbi.2000.4315. [DOI] [PubMed] [Google Scholar]
  • 19.Koonin EV, Dolja VV. Evolution and taxonomy of positive-strand RNA viruses: implications of comparative analysis of amino acid sequences. Crit Rev Biochem Mol Biol. 1993;28:375–430. doi: 10.3109/10409239309078440. [DOI] [PubMed] [Google Scholar]
  • 20.Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, et al. MrBayes 3.2: efficient bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61:539–542. doi: 10.1093/sysbio/sys029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Isogai M, Tatuto N, Ujiie C, Watanabe M, Yoshikawa N. Identification and characterization of blueberry latent spherical virus, a new member of subgroup C in the genus Nepovirus. Arch Virol. 2012;157:297–303. doi: 10.1007/s00705-011-1177-7. [DOI] [PubMed] [Google Scholar]
  • 23.Gorbalenya AE, Koonin EV, Blinov VM, Donchenko AP. Sobemovirus genome appears to encode a serine protease related to cysteine proteases of Picornaviruses. FEBS Lett. 1988;236:287–290. doi: 10.1016/0014-5793(88)80039-5. [DOI] [PubMed] [Google Scholar]
  • 24.Gorbalenya AE, Donchenko AP, Blinov VM, Koonin EV. Cysteine proteases of positive strand RNA viruses and chymotrypsin-like serine proteases. FEBS Lett. 1989;243:103–114. doi: 10.1016/0014-5793(89)80109-7. [DOI] [PubMed] [Google Scholar]
  • 25.O'Reilly EK, Kao CC. Analysis of RNA-dependent RNA polymerase structure and function as guided by known polymerase structures and computer predictions of secondary structure. Virology. 1998;252:287–303. doi: 10.1006/viro.1998.9463. [DOI] [PubMed] [Google Scholar]
  • 26.Jablonski SA, Luo M, Morrow CD. Enzymatic activity of poliovirus RNA polymerase mutants with single amino acid changes in the conserved YGDD amino acid motif. J Virol. 1991;65:4565–4572. doi: 10.1128/jvi.65.9.4565-4572.1991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Hong Y, Hunt AG. RNA polymerase activity catalyzed by a potyvirus-encoded RNA-dependent RNA polymerase. Virology. 1996;226:146–151. doi: 10.1006/viro.1996.0639. [DOI] [PubMed] [Google Scholar]
  • 28.Lohmann V, Körner F, Herian U, Bartenschlager R, Herian U, Ko F. Biochemical properties of hepatitis C virus NS5B RNA-dependent RNA polymerase and identification of amino acid sequence motifs essential for enzymatic activity. J Virol. 1997;71:8416–8428. doi: 10.1128/jvi.71.11.8416-8428.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Sankar S, Porter AG. Point mutations which drastically affect the polymerization activity of encephalomyocarditis virus RNA-dependent RNA polymerase correspond to the active site of Escherichia coli DNA polymerase I. J Biol Chem. 1992;267:10168–10176. [PubMed] [Google Scholar]
  • 30.Biswas SK, Nayak DP. Mutational analysis of the conserved motifs of influenza A virus polymerase basic protein 1. J Virol. 1994;68:1819–1826. doi: 10.1128/jvi.68.3.1819-1826.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Sánchez AB, de La Torre JC, de TJC, Ab S. Genetic and biochemical evidence for an oligomeric structure of the functional L polymerase of the prototypic Arenavirus lymphocytic choriomeningitis virus. J Virol. 2005;79:7262–7268. doi: 10.1128/JVI.79.11.7262-7268.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Zamoto-Niikura A, Terasaki K, Ikegami T, Peters CJ, Makino S. Rift valley fever virus L protein forms a biologically active oligomer. J Virol. 2009;83:12779–12789. doi: 10.1128/JVI.01310-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Te Velthuis AJ. Common and unique features of viral RNA-dependent polymerases. Cell Mol Life Sci. 2014;71:4403–4420. doi: 10.1007/s00018-014-1695-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Letzel T, Mundt E, Gorbalenya AE. Evidence for functional significance of the permuted C motif in Co2+-stimulated RNA-dependent RNA polymerase of infectious bursal disease virus. J Gen Virol. 2007;88:2824–2833. doi: 10.1099/vir.0.82890-0. [DOI] [PubMed] [Google Scholar]
  • 35.Gorbalenya AE, Pringle FM, Zeddam JL, Luke BT, Cameron CE, et al. The palm subdomain-based active site is internally permuted in viral RNA-dependent RNA polymerases of an ancient lineage. J Mol Biol. 2002;324:47–62. doi: 10.1016/S0022-2836(02)01033-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Cook S, Chung BY, Bass D, Moureau G, Tang S, et al. Novel virus discovery and genome reconstruction from field RNA samples reveals highly divergent viruses in dipteran hosts. PLoS One. 2013;8:e80720. doi: 10.1371/journal.pone.0080720. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Guo H, Ingolia NT, Weissman JS, Bartel DP. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature. 2010;466:835–840. doi: 10.1038/nature09267. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16:276–277. doi: 10.1016/S0168-9525(00)02024-2. [DOI] [PubMed] [Google Scholar]
  • 39.Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
  • 40.Kelley LA, Sternberg MJ. Protein structure prediction on the web: a case study using the Phyre server. Nat Protoc. 2009;4:363–371. doi: 10.1038/nprot.2009.2. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary File 1

Articles from The Journal of General Virology are provided here courtesy of Microbiology Society

RESOURCES