Journal of Virology. 2011 Aug;85(15):7900–7911. doi: 10.1128/JVI.00645-11

Genomic and Proteomic Analysis of Invertebrate Iridovirus Type 9

Chun K Wong 1, Vivienne L Young 1, Torsten Kleffmann 2, Vernon K Ward 1,*
PMCID: PMC3147941  PMID: 21632757

Abstract

Iridoviruses (IV) are nuclear cytoplasmic large DNA viruses that are receiving increasing attention as sublethal pathogens of a range of insects. Invertebrate iridovirus type 9 (IIV-9; Wiseana iridovirus) is a member of the major phylogenetic group of iridoviruses for which there is very limited genomic and proteomic information. The genome is 205,791 bp, has a G+C content of 31%, and contains 191 predicted genes; approximately 20% of the genome consists of repeat sequences, which are located predominantly within coding regions. The repeated sequences include genes encoding 11 proteins with helix-turn-helix motifs and genes encoding related tandem repeat amino acid sequences. Of the 191 proteins encoded by IIV-9, 108 are most closely related to orthologs in IIV-3 (Chloriridovirus genus), and 114 of the 126 IIV-3 genes have orthologs in IIV-9. In contrast, only 97 of 211 IIV-6 genes have orthologs in IIV-9. There is almost no conservation of gene order between IIV-3, IIV-6, and IIV-9. Phylogenetic analysis using a concatenated sequence of 26 core IV genes confirms that IIV-3 is more closely related to IIV-9 than to IIV-6, despite being from a different genus of the Iridoviridae. An interaction between IIV and small RNA regulatory systems is supported by the prediction of seven putative microRNA (miRNA) sequences combined with XRN exonuclease, RNase III, and double-stranded RNA binding activities encoded on the genome. Proteomic analysis of IIV-9 identified 64 proteins in the virus particle and, when combined with infected cell analysis, confirmed the expression of 94 viral proteins. This study provides the first full-genome and consequent proteomic analysis of group II IIV.

INTRODUCTION

Iridoviruses (IV) are members of the nucleocytoplasmic large DNA viruses (NCLDV) (19). They possess a linear double-stranded DNA (dsDNA) genome with circular permutation and terminal redundancy (6, 13), and replication of the viral genome includes distinct nuclear and cytoplasmic phases (12). The genomes are encapsidated within an icosahedral shell ranging between 120 and 180 nm in diameter and composed predominantly of a 50-kDa major capsid protein (MCP). The invertebrate iridoviruses (IIV), studied by cryo-electron microscopy, have 2-nm-diameter surface fibrils (23, 42); for invertebrate iridovirus type 6 (IIV-6), these fibrils extend from the 3-fold rotational axis of the 1,460 hexameric capsomers found in the virus particle (43). IV are divided into 5 genera (Table 1), with members of three genera infecting poikilothermic vertebrates and members of the Iridovirus and Chloriridovirus genera infecting invertebrates. The Chloriridovirus genus has only one member, IIV-3 (mosquito iridovirus), and the primary defining differences between the Chloriridovirus and Iridovirus genera are particle sizes of approximately 180 and 135 nm, respectively, and the mosquito host range restriction of IIV-3 (4).

Table 1.

Fully sequenced genomes from vertebrate and invertebrate iridoviruses

Genus and virusa Genome size (bp) % G+C ORFb Coding density (%) Protein size range (aa) GenBank accession no. Reference
Iridovirus
    IIV-9 205,791 31 191 90 50–2,051 GQ918152 This study
    IIV-6 212,482 29 211c 85 40–2,432 AF303741 Jakob et al. (20)
Chloriridovirus
    IIV-3 191,132 48 126 68 60–1,377 DQ643392 Delhon et al. (5)
Lymphocystivirus
    LCDV-1 102,653 29 110 82 40–1,199 L63545 Tidona and Darai (35)
    LCDV-C 186,250 27 240 67 40–1,193 AY380826 Zhang et al. (46)
Ranavirus
    TFV 105,057 55 105 94 40–1,294 AF389451 He et al. (17)
    ATV 106,332 54 96 79 32–1,294 AY150217 Jancovich et al. (21)
    FV-3 105,903 55 98 80 50–1,293 AY548484 Tan et al. (34)
    STIV 105,890 55 105 80 40–1,294 EU627010 Huang et al. (18)
    SGIV 140,131 49 162 98 41–1,268 AY521625 Song et al. (32)
    GIV 139,793 49 120 83 62–1,268 AY666015 Tsai et al. (36)
Megalocytivirus
    ISKNV 111,362 55 124 93 40–1,208 AF371960 He et al. (16)
    RBIV 112,080 53 118 86 50–1,253 AY532606 Do et al. (7)
    RSIV 112,414 53 93 86–1,309 BD143114 Kurita et al. (25)
    OSGIV 112,636 54 121 91 40–1,168 AY894343 Lu et al. (26)
a IIV-3, invertebrate iridescent virus type 3; IIV-6, invertebrate iridescent virus type 6; LCDV-1, lymphocystis disease virus 1; LCDV-C, lymphocystis disease virus, China strain; TFV, tiger frog virus; ATV, ambystoma tigrinum virus; FV-3, frog virus 3; STIV, soft-shelled turtle iridovirus; SGIV, Singapore grouper iridovirus; GIV, grouper iridovirus; ISKNV, infectious spleen and kidney necrosis virus; RBIV, rock bream iridovirus; RSIV, red sea bream iridovirus; OSGIV, orange spotted grouper iridovirus.

b Essentially nonoverlapping ORF encoding a minimum length of 40 to 62 aa.

c Revised annotation of Eaton et al. (8).

The vertebrate IV cause disease in fish, amphibians, and reptiles and have received considerable attention due to their effects upon aquaculture. In contrast, the IIV cause predominantly subpathogenic infections, and their consequently limited utility for pest control has meant that less is known about IIV. Of particular importance has been the recent study of Bromenshenk et al. (3) linking colony collapse disorder in honey bees to coinfection with Nosema and an unidentified iridovirus(es). A strong correlation was established; however, the identity of the IV was not determined, at least in part due to a lack of IIV genomic information. In addition, the refraction of light by assemblies of IIV particles offers new opportunities in materials development (23, 28) that would benefit from more information on the virus particle and its constituents. The roles of viral proteins, such as the surface fiber, in iridescence are unknown, and the proteins and functional activities associated with the virus particle remain to be elucidated. Central to this is the need for information on IIV genomes and the proteomic analysis of the virus particle.

Fourteen iridovirus species have been fully sequenced (Table 1), with multiple members of the Ranavirus, Lymphocystivirus, and Megalocytivirus genera providing a comprehensive coverage of these vertebrate genera of IV. Vertebrate IV genomes range from 105 kbp for tiger frog virus (17) to 186 kbp for lymphocystis disease virus, China strain (LCDV-C) (46). The Ranavirus and Megalocytivirus species have G+C contents of approximately 50%, while the Lymphocystivirus species have G+C contents of less than 30%. There is a consistent lack of genome colinearity between IV except with very closely related isolates, although all IV sequenced to date possess a core cohort of 26 conserved genes (8). In contrast to the vertebrate IV, the only fully sequenced IIV are IIV-6 (Chilo iridovirus [CIV]) (20) and IIV-3 (mosquito chloriridovirus [MIV]) (5). IIV-6 is the type species of the Iridovirus genus, with a genome of 212 kbp and a G+C content of 29%; however, phylogenetic studies show that IIV-6 belongs in a clade distant from that of most iridoviruses (Fig. 1A) (38). IIV-3, with a genome of 191 kbp and a G+C content of 48%, represents a different genus that may be more closely related to members of the Iridovirus genus than its placement in a separate genus suggests.

Fig. 1.

Phylogenetic trees of iridoviruses. (A) Alignment of the genus Iridovirus based upon a partial major capsid protein sequence as described in the work of Webby and Kalmakoff (38). The Chloriridovirus IIV-3 and the Lymphocystivirus LCDV-1 are included. (B) Alignment of representatives of the five iridovirus genera Chloriridovirus (IIV-3), Iridovirus (IIV-6, IIV-9), Lymphocystivirus (LCDV-1), Ranavirus (SGIV), and Megalocytivirus (ISKNV) based upon a concatenated amino acid sequence of the 26 core iridovirus genes. Bootstrap support from 1,000 iterations is indicated for all branches with at least 70% support.

To date, only limited sequence information is available from members of the major clade of IIV, defined as group II iridoviruses by Williams and Cory (41), and genome analysis of a member of this clade would provide information on the relationships between disparate IIV. IIV-9 (Wiseana iridovirus [WIV]), a representative of the major clade, was isolated in New Zealand from larvae of the pasture pest Wiseana spp. (Lepidoptera: Hepialidae) (9). The mechanism of transmission of this virus is unknown, though the presence of this virus in damp and cryptic habitats is consistent with many other IIV (40), and suggestions of vector transmission have been made, though not confirmed. Like most invertebrate iridoviruses, IIV-9 replicates in larvae of the greater wax moth Galleria mellonella upon injection, and heavily infected larvae display typical iridescence upon accumulation of paracrystalline arrays of virus particles within infected tissues (9). IIV-9 also replicates in Spodoptera frugiperda (Sf9, Sf21) cells, albeit at the restricted temperature of 21°C. IIV-9 is a member of the major clade of IIV, as determined by partial major capsid protein phylogeny (38).

This study presents the complete genomic sequence of IIV-9 and uses this information for proteomic analysis of IIV-9's encoded proteins in purified virus particles and within infected cells. Analysis of the genome indicates that IIV-9 is more closely related to IIV-3 than to IIV-6 and provides the first complete genome from the major clade of invertebrate iridoviruses.

MATERIALS AND METHODS

IIV-9 purification, DNA extraction, and sequencing.

Sf21 cells were grown in SF900II serum-free medium (Invitrogen, Auckland, New Zealand) and infected with dilutions of a field isolate of IIV-9 that had been passaged repeatedly through G. mellonella larvae. Infected cells were incubated for 5 days at 21°C under an agarose overlay and stained with neutral red. Individual plaques were picked and passaged once in cell culture. One plaque isolate was randomly selected, propagated in G. mellonella, and purified on sucrose gradients as described previously (23). Genomic DNA was extracted by phenol-chloroform extraction (37), and 50 μl (100 ng μl−1) of genomic DNA in deionized water was sequenced using the Roche/454 GS FLX High Throughput Sequencing Service provided by the Department of Anatomy and Structural Biology, University of Otago. All contig junctions were determined by sequencing of available restriction fragment clones or by PCR. Briefly, primers were designed near the termini of contigs and used with primers on adjacent contigs to generate PCR products directly from genomic DNA using the Expand high-fidelity PCR kit (Roche Diagnostics, Auckland, New Zealand). The PCR products were either sequenced directly at the Allan Wilson Sequencing Centre, Palmerston North, New Zealand, on an ABI 3730 automated sequencer or cloned into pGEM-T Easy (Promega Corp., Madison, WI) prior to sequencing. Sequence conflicts, long repeats, and long runs of single nucleotides were confirmed by PCR and sequencing of the region in question. All ABI 3730-generated sequences were edited in SeqMan (DNAStar) for sequence quality prior to use.

Sequence analysis.

Newbler Assembler software (454 Life Sciences, Branford, CT) was used to assemble data into unordered and unoriented contigs (default settings). The contigs were exported to the SeqMan program in the Lasergene suite of DNA analysis programs (DNAStar, Madison, WI) and reassembled into a draft alignment using the SeqMan assembler (match size, 12; minimum match percentage, 80%; minimum sequence length, 100; maximum number of added gaps per kb in the contig, 70; maximum number of added gaps per kb in the sequence, 70; maximum register shift difference, 70; last group considered, 2; gap penalty, 0.00; gap length penalty, 0.70). All contigs were aligned to generate a draft alignment with a minimum match percentage of 95%. PCR primers were designed using PrimerDesign (DNAStar). An in silico analysis of the restriction profile of the complete genome was performed using GeneQuest (DNAStar), and results were compared to published restriction profiles of IIV-9 genomic DNA as a confirmation of the assembly profile.
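The in silico restriction check described above can be illustrated with a minimal sketch. This is not the GeneQuest procedure used in this study; the function name `fragment_sizes`, the toy sequence, and the choice of EcoRI (GAATTC, cut after the first G) are assumptions made purely for illustration, and the sequence is treated as linear.

```python
def fragment_sizes(seq, site, cut_offset):
    """Return fragment lengths (largest first) after cutting a linear
    sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the cut falls.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Toy sequence (hypothetical) with two EcoRI sites.
toy = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
sizes = fragment_sizes(toy, "GAATTC", 1)
print(sizes)
```

Comparing such a predicted fragment list with fragment sizes read from a gel is the kind of global consistency check the text describes; a circular or terminally redundant molecule would need the first and last fragments handled differently.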

Tandem repeats within the IIV-9 genome were identified using Tandem Repeats Finder (2), with the match, mismatch, and indel parameters set to 2, 7, and 7, respectively. The minimum alignment score was set at 50, with a maximum period size of 2,000 bases. Direct, inverted, and dyad repeats were identified using GeneQuest (DNAStar) with an unlimited loop size. The minimum period sizes set for direct, inverted, and dyad repeats were 25 bp, 25 and 50 bp, and 16 bp, respectively. Dot plot analysis was performed to identify DNA repeat clusters using MegAlign (DNAStar), with a window size of 50 bp and a 75% match.
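The windowed comparison that underlies a self-against-self dot plot can be sketched as follows. This is a naive illustration, not the MegAlign implementation; the function name `dot_plot_hits` and the toy repeat unit are hypothetical, and only the 50-bp window and 75% match threshold are taken from the text.

```python
def dot_plot_hits(seq, window=50, min_match=0.75):
    """Return (i, j) window start positions (i < j) within one sequence
    whose windows are identical at >= min_match of their positions."""
    hits = []
    n = len(seq) - window + 1
    for i in range(n):
        for j in range(i + 1, n):
            same = sum(a == b for a, b in zip(seq[i:i + window], seq[j:j + window]))
            if same / window >= min_match:
                hits.append((i, j))
    return hits


# Toy example (hypothetical): one 62-bp unit repeated twice in tandem.
unit = "ATGGCTAGCTTAGGCATCGATCCGTAACGTTAGCCTAGGATCCATTGCAAGTCCGATAGCTA"
hits = dot_plot_hits(unit + unit)
print((0, len(unit)) in hits)
```

The quadratic scan is adequate only for short toy sequences; a genome-scale dot plot would need word indexing or a dedicated tool.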

Open reading frames (ORF) encoding proteins with a minimum size of 50 amino acids (aa) and that contained a start codon were designated using SeqBuilder (DNAStar). All designated ORF were named with “orf” followed by numbers corresponding to their position and a forward/reverse (right [R]/left [L], respectively) designation to indicate their orientation. ORF that fell completely within a larger ORF were excluded. Heavily overlapping open reading frames where the most likely ORF could not be determined were given the same number but different orientations. All designated IIV-9 open reading frames were exported from SeqBuilder (DNAStar) to EditSeq (DNAStar), and BLASTP analysis of the predicted amino acid sequences was performed for each open reading frame. Amino acid identity to the closest BLASTP match was determined using MegAlign (DNAStar). Analysis of IIV-9 ORF function was performed via the ExPASy Proteomics Server and included InterProScan, SignalP, and PredictProtein. Protein repeats were identified using the XSTREAM prediction server (27), and subsequent alignments of protein repeats were generated using MegAlign.
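The ORF designation criteria can be illustrated with a simplified six-frame scan. This sketch is an assumption-laden stand-in for SeqBuilder, not a description of it: the function names are hypothetical, only ATG is treated as a start codon, and the exclusion of ORF nested within larger ORF is not implemented. Coordinates are 1-based on the forward strand, with reverse (L) ORF reported from the higher to the lower position.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=50):
    """Return (start, end, strand, aa_length) for ATG-initiated ORF.

    aa_length counts codons from the ATG up to, but not including, the stop.
    """
    orfs = []
    n = len(seq)
    for strand, s in (("R", seq), ("L", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG after the previous stop
                elif start is not None and codon in STOPS:
                    aa = (i - start) // 3
                    if aa >= min_aa:
                        if strand == "R":
                            orfs.append((start + 1, i + 3, "R", aa))
                        else:                      # map back to forward-strand coordinates
                            orfs.append((n - start, n - i - 2, "L", aa))
                    start = None
    return orfs


# Toy sequence (hypothetical): ATG + 50 alanine codons + stop, with short flanks.
toy = "CC" + "ATG" + "GCT" * 50 + "TAA" + "GG"
print(find_orfs(toy))
```

With these conventions the protein length equals the nucleotide span divided by three, minus one for the stop codon.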

A phylogenetic tree was constructed based on the alignment of the 26 core gene amino acid sequences found in IIV-9, IIV-6, IIV-3, Singapore grouper iridovirus (SGIV), lymphocystis disease virus 1 (LCDV-1), and infectious spleen and kidney necrosis virus (ISKNV) using MegAlign (DNAStar), with bootstrap trials set at 1,000. All core gene-encoded proteins were combined as one continuous amino acid sequence in the same gene order prior to assembly. This was compared to the partial major capsid protein tree as described in the work of Webby and Kalmakoff (38).
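The concatenation step can be sketched as below, under the assumption that each virus's core-gene proteins are held in a dictionary keyed by a shared gene label. The function name, gene labels, and sequences are hypothetical placeholders; alignment and bootstrapping would still be done in a dedicated program.

```python
def concatenate_core_genes(proteins_by_virus, gene_order):
    """Join each virus's core-gene proteins into one sequence,
    using the same gene order for every virus."""
    concatenated = {}
    for virus, proteins in proteins_by_virus.items():
        missing = [g for g in gene_order if g not in proteins]
        if missing:
            raise KeyError(f"{virus} lacks core genes: {missing}")
        concatenated[virus] = "".join(proteins[g] for g in gene_order)
    return concatenated


# Placeholder data: two hypothetical core genes and invented sequences.
toy_proteins = {
    "IIV-9": {"geneA": "MKT", "geneB": "MLV"},
    "IIV-3": {"geneB": "MIV", "geneA": "MRT"},
}
joined = concatenate_core_genes(toy_proteins, ["geneA", "geneB"])
print(joined)
```

Fixing the gene order before joining matters because gene order is not conserved between these genomes, so concatenating in genomic order would misalign homologous proteins.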

The complete genome was scanned for miRNA coding regions using VMir (14, 33), and possible miRNA coding sequences were further analyzed by MiPred (22). All images were generated using Microsoft PowerPoint and/or Adobe Photoshop CS4.

MS analysis.

Purified IIV-9 virions or infected Sf21 cells harvested 24 h postinfection were denatured in SDS-PAGE sample buffer, and proteins were separated on individual 10% SDS-PAGE gels using standard techniques. The gels were stained with Coomassie G250 and protein lanes cut into five (for liquid chromatography coupled with electrospray ionization linear ion trap [LC-ESI LTQ] Orbitrap tandem mass spectrometry [MS/MS] analyses of IIV-9 virions and infected Sf21 cells) or eight (for LC–matrix-assisted laser desorption ionization–tandem time of flight [MALDI TOF/TOF] analysis of IIV-9 virions) equally sized fractions. Fractions were subjected to in-gel protein digestion with trypsin essentially by following the protocol of Shevchenko et al. (30), using a liquid handling robotic workstation (DigestPro MSi; Intavis AG, Cologne, Germany). Each digested fraction was concentrated using a centrifugal vacuum concentrator and reconstituted in a 10-μl aqueous solution of 2% (vol/vol) acetonitrile (ACN) supplemented with either 0.1% (vol/vol) trifluoroacetic acid (TFA) for LC-MALDI TOF/TOF analyses or 0.2% formic acid for LC-ESI LTQ Orbitrap analyses.

Structural proteins from purified IIV-9 virions were analyzed by LC-MALDI TOF/TOF MS and LC-ESI LTQ Orbitrap MS/MS, and proteins from infected Sf21 cells were analyzed by LC-ESI LTQ Orbitrap MS/MS according to the details of methods described in the supplemental material.

Peak lists were processed through the 4000 series Explorer software (Applied Biosystems, MA) for MALDI TOF/TOF data and the Proteome Discoverer 1.1 software (Thermo Scientific, San Jose, CA) for all ESI LTQ Orbitrap data using the software's default settings. All peak lists were then searched with an in-house Mascot server (version 2.1.0; Matrix Science) against an amino acid sequence database combining all predicted and translated IIV-9 ORF and all entries from the NCBI nonredundant sequence database, matching the taxa Lepidoptera and Drosophila melanogaster (downloaded January 2011; 355,290 sequence entries). Mascot search settings allowed for full tryptic peptides with up to 3 missed cleavage sites and variable modifications of carbamidomethyl (C), oxidation (M), and pyroglutamic acid (E, Q). The precursor and fragment mass tolerances were set to ±10 ppm and 0.8 Da for LTQ Orbitrap data and 75 ppm and 0.4 Da for TOF/TOF data. To evaluate the false-discovery rate (FDR), all peak lists were searched against a decoy database using identical search settings. The decoy database was built using the decoy database tool at the Trans-Proteomic Pipeline (TPP; Seattle Proteome Center), comprising the reversed sequence entries of the aforementioned combined database. The FDR was calculated by determining the number of false-positive peptide hits from the decoy search versus the number of peptide identifications from the true search using the same Mascot score as a significance threshold.
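One reading of the decoy-based FDR estimate described above is the ratio of decoy hits to target hits at a common score threshold. The sketch below assumes that reading; the function name and the score lists are invented for illustration and do not come from the Mascot output of this study.

```python
def false_discovery_rate(target_scores, decoy_scores, threshold):
    """Estimate FDR as decoy hits / target hits above one score threshold."""
    target_hits = sum(score > threshold for score in target_scores)
    decoy_hits = sum(score > threshold for score in decoy_scores)
    if target_hits == 0:
        return 0.0
    return decoy_hits / target_hits


# Invented ion scores for illustration only.
target = [55, 62, 41, 38, 90, 47]
decoy = [12, 43, 30]
fdr = false_discovery_rate(target, decoy, threshold=40)
print(fdr)
```

Other estimators exist (for example, doubling the decoy count when target and decoy entries are searched together), so the formula should be matched to the search strategy actually used.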

Only peptide hits with an individual ion score of >40 (Mascot significance threshold at a P of <0.05) were accepted as significant identifications. This resulted in an FDR of <0.02 for all searches. A significant protein identification required at least two significant peptide hits covering different sequences of the protein. In addition, a protein that was identified by a single peptide-based protein identification in one experiment (IIV-9 particles analyzed by LC-MALDI TOF/TOF or LC-ESI LTQ Orbitrap MS/MS or infected cells analyzed by LC-ESI LTQ Orbitrap MS/MS) that was also confirmed by a different peptide identification covering another sequence stretch in one of the other experiments was considered a significant multipeptide identification.
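The protein-level acceptance rule can be approximated by pooling significant peptide hits across experiments and requiring at least two distinct peptides per protein. This is a simplified sketch: the function name and hit records are hypothetical, and "different sequences" is approximated by distinct peptide strings, so overlapping peptides are not merged.

```python
def significant_proteins(peptide_hits, min_score=40):
    """peptide_hits: iterable of (experiment, protein, peptide, ion_score).

    A protein is accepted when at least two distinct peptides score above
    min_score, whether they come from one experiment or from several.
    """
    peptides = {}
    for _experiment, protein, peptide, score in peptide_hits:
        if score > min_score:
            peptides.setdefault(protein, set()).add(peptide)
    return {protein for protein, seqs in peptides.items() if len(seqs) >= 2}


# Invented hit records for illustration only.
toy_hits = [
    ("virion_maldi", "orf010R", "AAGLK", 61),
    ("virion_maldi", "orf010R", "TTPVR", 55),
    ("virion_orbitrap", "orf068L", "GGSEK", 48),
    ("cells_orbitrap", "orf068L", "LLDNR", 52),
    ("cells_orbitrap", "orf020R", "VVATK", 70),
    ("cells_orbitrap", "orf020R", "VVATK", 66),
    ("virion_maldi", "orf034L", "QQPLK", 35),
]
accepted = significant_proteins(toy_hits)
print(sorted(accepted))
```

In this toy set, orf020R is rejected because both hits are the same peptide, and orf034L is rejected because its only hit falls below the score threshold.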

Nucleotide sequence accession number.

The IIV-9 genome has been deposited in GenBank under accession number GQ918152.

RESULTS AND DISCUSSION

Genome assembly and properties.

Sequencing of the IIV-9 genome using a 454 FLX sequencer generated 20,734 sequences totaling 5,597,884 bases, with reads split 50.4% and 49.6% between the two orientations, for an average coverage of 27-fold. The initial Newbler assembly generated 3 large and 10 small contigs that were subsequently assembled into a single contiguous sequence by targeted PCR-based cloning and sequencing. The initial contig boundaries were defined by repeat sequences that the assembler was unable to resolve. The genome was shown to be 205,791 bp in size, with a G+C content of 31% (Table 1). This genome size compares to estimates of 192.5 and 222.6 kbp obtained from restriction profiles using standard (37) and pulsed-field gel (39) electrophoresis, respectively. Based upon an estimated 4.7% terminal redundancy in the IIV-9 genome (39), this equates to approximately 9.7 kbp of redundant sequence. Due to the high A+T content in the genome and the challenge of resolving long single base runs using 454 technology, a total of 34 PCR-based clones were generated to resolve 57 potential base conflicts. All base calls were inspected visually and resolved as necessary. In silico restriction endonuclease profiles were compared to experimentally derived restriction endonuclease profiles (37) to confirm correct global assembly of the genomic sequence (data not shown).
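The coverage and terminal redundancy figures follow directly from the numbers reported above, as the short calculation below shows. The variable names are illustrative, and the mean read length is a derived quantity that is not stated in the text.

```python
genome_bp = 205_791                   # assembled genome size
total_read_bases = 5_597_884          # bases generated by 454 sequencing
reads = 20_734                        # number of sequences
terminal_redundancy_fraction = 0.047  # estimated terminal redundancy (39)

coverage = total_read_bases / genome_bp                  # average fold coverage
mean_read_length = total_read_bases / reads              # derived, not reported
redundant_bp = terminal_redundancy_fraction * genome_bp  # redundant sequence

print(f"coverage: {coverage:.1f}-fold")
print(f"mean read length: {mean_read_length:.0f} bases")
print(f"terminal redundancy: {redundant_bp / 1000:.1f} kbp")
```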

Genome analysis identified a range of complex repeat sequences, including tandem, direct, dyad, and inverted repeats. The percentage of repeat sequences in the genome is dependent upon the stringency of parameters employed and ranges from 20 to 23% of the genome. The largest repeat identified is 3.4 copies of a 1,002-bp repeat between nucleotides 69621 and 73024, with a 75% consensus match. Identification of repeat sequences on the genome is illustrated by dot plot analysis (Fig. 2A). The repeats highlighted in the boxed region of the genome shown in Fig. 2A represented 10 of the contigs generated in the initial sequence assembly.
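The copy number quoted for the largest repeat can be checked from its coordinates and period, assuming the coordinates are inclusive. The variable names below are illustrative.

```python
repeat_start = 69_621   # first nucleotide of the repeat region
repeat_end = 73_024     # last nucleotide of the repeat region
period_bp = 1_002       # length of one repeat unit

span_bp = repeat_end - repeat_start + 1   # inclusive span of the region
copies = span_bp / period_bp

print(f"{span_bp} bp spans {copies:.1f} copies")
```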

Fig. 2.

Sequence repeats and IIV-9 versus IIV-3 gene parity plot analysis. (A) The IIV-9 genome was compared against itself by dot plot analysis to identify repeats within the IIV-9 genome. The major clusters of sequence repeats are boxed. (B) The IIV-9 and IIV-3 gene orders were compared by parity plot analysis. Where three or more genes are contiguous in both genomes, regardless of orientation, they have been boxed. Numbers on the x and y axes represent ORF numbers.

IIV-9 ORF and their predicted protein products.

Analysis of the complete genome predicted 191 predominantly nonoverlapping ORF encoding proteins of 50 aa or more in length with an AUG start codon, with the genome displaying a coding density of 90% (Fig. 3). The genome shows a bias in that 63% of genes were oriented in the forward direction. In conjunction with the genome analysis, we conducted a proteomic analysis, first to confirm expressed ORF and second to establish the first profile of expressed IIV-9 ORF in both isolated virions and infected insect cells. Of the total of 191 ORF, 94 were identified in either isolated virions (64 ORF detected) or infected insect cells (72 ORF detected), with 42 being expressed in both (Table 2; see also Tables S1 to S3 in the supplemental material). The number of proteins detected in isolated virions is broadly comparable to the 44 proteins identified in a previous proteomic study of SGIV particles (31). Open reading frames that are discussed in the following paragraphs are marked with a superscript “p” if their expression has been confirmed by proteomics of isolated virions or with a superscript “i” if they were confirmed to be protein products in infected insect cells.
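The counts of expressed ORF are related by simple inclusion-exclusion, which also gives the numbers unique to each sample type. The variable names are illustrative; the virion-only and cell-only counts are derived here rather than quoted from the text.

```python
virion = 64          # ORF detected in isolated virions
infected_cells = 72  # ORF detected in infected insect cells
both = 42            # ORF detected in both samples

either = virion + infected_cells - both  # total distinct ORF confirmed
virion_only = virion - both              # derived
cell_only = infected_cells - both        # derived

print(either, virion_only, cell_only)
```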

Fig. 3.

Open reading frame map of the IIV-9 genome. The 205,791-bp IIV-9 genome is represented as a solid line, and predicted open reading frames are indicated with arrows. Arrows representing genes in the forward (right) direction are stippled, and those in the reverse (left) direction are open. The 26 core IV genes are indicated with bold ORF numbering. Arrows with a bold outline are HTH 7 domain-containing orthologs of orf091L of IIV-3, and those with a broken outline are orthologs of IIV-6 468L. orf061R/L and -139R/L are almost fully overlapping genes facing in opposite directions and have been represented as both R and L forms.

Table 2.

IIV-9 predicted open reading frames

IIV-9 ORFa Nucleotide positions Length (aa) Best match(es)b [IIV protein(s)c, GenBank accession no., BLASTP score, % aa identityd] Predicted motif and/or functione
001R 30–1223 397 IIV-3 004R; IIV-6 067R YP_654576 466 62.6
002R 1238–1516 92 Signal peptide
003R 1741–1980 79
004L 2999–2037 320 IIV-3 005L YP_654577 62 31.7 Signal peptide, RING finger
005R 3158–4705 515 IIV-3 006R; IIV-6 118L YP_654578 582 56.9 NCLDV membrane protein
006L 6016–4757 419 IIV-6 468L* NP_149463 285 44.9 Helix-turn-helix 7 motif
007R 6133–6459 108 IIV-6 248R; PBCV N288R NP_149711 70 44.1 Transmembrane
008R 6587–7456 289 IIV-6 404L NP_149867 102 32.5
009Lf 8013–7519 164 IIV-22 15.9 kDa; IIV-3 15R P25097 134 48.6 15.9-kDa protein, 5′ MCP gene
010R 8235–9689 484 IIV-9-MCP; IIV-1 MCP; IIV-3 014L; IIV-6 274L O39163 987 100.0 MCP
011L 9967–9782 61
012R 10245–11540 431 MAR 344; IIV-6 273R NP_149736 108 47.0
013R 11545–11865 106 Transmembrane
014L 12780–11935 281 IIV-6 219L; IIV-3 036R,091L NP_149682 239 51.8
015R 13053–13316 87 IIV-3 013L YP_654585 84 55.8 Signal peptide
016R 13458–16700 1080 IIV-3 035R; IIV-6 179R YP_654607 1154 51.2 Tyr protein kinase-like domain
017R 16743–17156 137 IIV-3 055R; IIV-6 349L YP_654627 148 54.1 TF IIS C-terminal domain
018R 17457–17726 89 IIV-3 019R YP_654591 55 52.8 Bro-N domain
019L 19158–17881 425 IIV-3 069L; IIV-6 198R YP_654641 389 49.9
020R 19213–20130 305 Yersinia ruckeri chitinase ZP_04617184 352 57.2 Chitinase, family 18 glycohydrolase
021R 20183–20626 147 IIV-3 057L YP_654629 44 25.6
022R 20680–21762 360 IIV-3 056L; IIV-6 287R; MAR 339 YP_654628 301 44.7 Putative phosphodiesterase
023L 23370–21841 509 IIV-6 380R; IIV-3 010L,011L NP_149843 367 45.0 Serine/threonine protein kinase
024R 23480–23920 146 IIV-6 293R NP_149756 123 44.9
025R 24044–25165 373 IIV-3 012R; IIV-6 302L YP_654584 322 47.6 C2H2 Zn finger
026R 25236–26552 438 IIV-6 468L*; IIV-3 093 NP_149463 306 44.9 Helix-turn-helix 7 motif
027L 26619–27086 155 IIV-3 085L; IIV-6 325L YP_654657 188 58.4 Signal peptide
028R 27029–27229 66 Transmembrane
029R 27260–28093 277 IIV-3 054L YP_654626 243 48.0
030Rf 28156–28518 120 IIV-3 102R; IIV-6 122R YP_654674 119 55.0
031R 28627–29910 427 IIV-3 047R; IIV-6 337L YP_654619 446 66.3 Transmembrane
032Lf 31068–30394 224 IIV-3 021L YP_654593 194 55.6 C3HC4 RING finger/BIR
033L 31730–31113 205 IIV-3 022L YP_654594 142 45.4 Transmembrane
034L 32649–31816 277 IIV-3 101R; IIV-6 142R YP_654673 430 75.8 RNase III
035L 34117–32813 434 IIV-6 468L* NP_149463 295 43.8 Helix-turn-helix 7 motif
036Lf 34736–34176 186 IIV-3 104L; IIV-6 355R YP_654676 271 66.7 Phosphatase
037R 34846–35577 243 IIV-3 105R; IIV-6 359L YP_654677 330 65.3
038L 37133–35655 492 IIV-6 159L,219L,261R,443R; IIV-3 091L,036R NP_149622 135 33.3
039L 38695–37220 491 IIV-6 159L,219L,261R,443R; IIV-3 091L NP_149622 151 36.1
040R 38826–40250 474 IIV-3 106R; IIV-6 030L YP_654678 629 63.9 ATP-dependent exo-DNase α subunit
041R 40283–41143 286 Transmembrane
042L 41817–41188 209 IIV-3 071L; IIV-6 259R YP_654643 285 68.4 Transmembrane
043R 42393–42890 165 IIV-3 020R; IIV-6 196R YP_654592 195 59.9 Thioredoxin domain/isomerase
044L 43528–42923 201 IIV-3 097L; IIV-6 170L; MAR 216 YP_654669 228 55.0 Holliday junction resolvase
045Rf 43660–44226 188 PBCV N269L XP_973701 137 44.4 Putative dUTPase
046R 44338–45663 441 IIV-6 468L*; IIV-3 093L NP_149463 294 42.0 Helix-turn-helix 7 motif
047L 45971–45708 87 Transmembrane
048R 46162–47862 566 IIV-3 059L; IIV-6 012L YP_654631 729 63.9 XRN 5′–3′ exonuclease
049R 48016–48504 162
050R 48602–49111 169 IIV-3 032R YP_654604 43 31.2
051L 49744–49310 144 IIV-3 058R; IIV-6 391R YP_654630 200 65.3
052L 50273–49851 140 IIV-6 413R; IIV-3 021L RING/U box motif
053R 50475–51242 255 IIV-3 060L; PBCV A193L YP_654632 254 56.0 Proliferating cell nuclear antigen
054R 51304–52656 450 IIV-6 468L* NP_149463 303 45.4 Helix-turn-helix 7 motif
055L 55623–52687 978 IIV-3 087L; IIV-6 022L YP_654659 1258 63.5 DEAD/H motif/putative NTPase
056R 55805–56563 252 IIV-3 070L; IIV-6 306R YP_654642 240 59.7 SWIB/MDM2 domain
057R 56642–57964 440 Acanthamoeba polyphaga mimivirus L12 YP_142366 152 26.7
058L 59618–58074 514 IIV-3 098L; IIV-6 493L YP_654670 627 60.9 Serine/threonine protein kinase
059R 59726–61090 454 IIV-3 039R; IIV-6 393L YP_654611 501 55.3
060Rf 61210–61602 130 Ixodes scapularis RNA-binding protein; IIV-6 340R EEC17992 53 26.7 dsRNA binding protein
061L 61781–61599 60
061R 61662–61832 56
062R 61866–62225 119 IIV-3 041R; IIV-6 453L YP_654613 152 59.7 Thioredoxin domain
063R 62315–63559 414 IIV-6 420R* NP_149883 204 33.9
064Lf 64363–63656 235 Transmembrane
065R 64427–66082 551 IIV-3 038R; IIV-6 098R YP_654610 589 53.2
066L 66388–66197 63 IIV-3 043R; IIV-6 010R YP_654615 101 67.2 Transmembrane
067L 69088–66401 894 IIV-3 091L; IIV-6 443R,261R,396L YP_654663 447 43.0
068L 75310–69158 2050 IIV-3 091L; IIV-6 443R,261R YP_654663 256 33.9 Transmembrane
069R 75461–77572 703 IIV-3 074L; IIV-6 268L YP_654646 617 50.5
070R 77864–80311 815 IIV-16-RNR; IIV-3 065R; IIV-6 085L AAY24450.1 559 71.4 RNR large-chain precursor/intein
071R 80408–81727 439 IIV-6 468L*; IIV-3 093L NP_149463 289 43.7 Helix-turn-helix 7 motif
072L 82690–81764 308 IIV-3 091L,036R,008L; IIV-6 219L,443R YP_654663 114 40.0
073Lf 83243–82767 158 IIV-3 042R; IIV-6 136R YP_654614 192 59.2
074L 84548–83364 394 IIV-3 079L; IIV-6 282R YP_654651 535 69.8 Poxvirus very late transcription factor
075R 84711–85385 224 IIV-3 080R YP_654652 181 43.8 NUDIX hydrolase domain
076R 85649–86833 394 Mimivirus L5, L12, R821, R865, L754, R433 protein YP_142359 151 30.3 Bro domain
077R 86853–87038 61
078L 87331–87053 92 Signal peptide
079R 87349–88107 252 IIV-3 088R; IIV-6 075L YP_654660 410 77.9 NTPase domain
080L 88717–88181 178 Histones Q27443 55 17.5 H4 and H3 histone domains
081R 88803–89180 125 Pseudoalteromonas haloplanktis TAC125 YP_339369 92 48.0 GIY-YIG endonuclease
082L 90202–89231 323 IIV-3 044L YP_654616 333 51.7 Protein kinase domain
083R 90412–90696 94 IIV-3 045R YP_654617 127 70.0
084R 90821–92098 425 IIV-6 229L; IIV-3 046R NP_149692 389 47.6
085R 92175–92822 215 IIV-6 378R,232R; IIV-3 100L NP_149841 216 69.4 2-Cys adaptor domain
086L 93855–92863 330 IIV-3 099R; IIV-6 329R YP_654671 249 48.0
087R 94000–95811 603 IIV-3 019R; IIV-6 420R YP_654591 318 56.4 Bro-N domain
088L 96116–95925 63 Transmembrane
089R 96461–99868 1135 IIV-3 086L; IIV-6 045L YP_654658 1401 62.0 DNA topoisomerase II
090R 99858–100511 217 IIV-3 064L YP_654636 88 27.2
091L 101269–100583 228 IIV-3 063R; IIV-6 309L YP_654635 170 47.6
092R 101417–102835 472 IIV-6 420R*; IIV-3 019R,093L NP_149883 283 40.8
093L 103109–102885 74
094L 104606–103173 477 IIV-3 061R; IIV-6 467R YP_654633 346 39.5
095R 104681–105583 300 Bombyx mori thymidylate synthase XP_001033394 322 52.8 Thymidylate synthase
096R 105651–107114 487 IIV-3 019R,093R; IIV-6 420R* YP_654591 248 39.5
097R 107167–108060 297 IIV-3 028R YP_654600 191 39.7
098R 108107–108679 190 IIV-3 029R; IIV-6 143R YP_654601 231 55.0 Deoxyribonucleoside kinase
099L 109129–108725 134 IIV-3 030L YP_654602 109 42.3
100R 109185–109688 167 IIV-3 031R; IIV-6 115R YP_654603 84 32.4
101R 109843–110592 249 IIV-3 032R YP_654604 139 33.5
102R 110660–110956 98 Macaca mulatta regulatory subunit XP_001097695 60 35.4 Protein phosphatase 1C binding
103R 110998–112287 429 IIV-6 468L*; IIV-3 093L NP_149463 288 43.6 Helix-turn-helix 7 motif
104L 112901–112320 193 IIV-3 033L; IIV-6 307L YP_654605 222 61.5 Signal peptide, transmembrane
105Rf 113487–114305 272 IIV-3 034R; IIV-6 077L YP_654606 164 38.6 C3H1 Zinc finger
106R 114438–116879 813 IIV-3 094L; IIV-6 050L YP_654666 509 34.9
107R 117308–117733 141 IIV-3 053L YP_654625 143 51.1
108L 118009–117770 79
109R 118083–119972 629 IIV-3 052L; IIV-6 205R YP_654624 387 41.3 NAD-dependent DNA ligase
110R 120070–121347 425 IIV-3 007R YP_654579 415 52.6
111L 121746–121513 77 Signal peptide
112L 123866–121758 702 IIV-3 091L; IIV-6 443R,261R,396L,219L,159L YP_654663 226 43.1
113R 123945–127328 1127 IIV-3 009R; IIV-6 428L YP_654581 1801 77.2 DNA-dependent RNA Pol subunit 2
114R 127384–127998 204 IIV-6 404L NP_149867 238 60.5
115L 129575–128067 502 IIV-3 051L; IIV-6 213R YP_654623 141 46.3
116Rf 129723–133142 1139 IIV-3 120R; IIV-6 037L YP_654692 1503 66.1 DNA polymerase
117R 133208–133576 122 IIV-6 049L NP_149512 88 41.5 Transmembrane
118R 133691–135031 446 IIV-6 468L* NP_149463 277 39.7 Helix-turn-helix 7 motif
119R 135057–135410 117
120R 135520–138408 962 IIV-3 121R; IIV-6 184R YP_654693 1342 69.2 Helicase/primase
121R 138595–138990 131 Amsacta moorei entomopoxvirus AMV-075 NP_064857 107 45.8
122L 139310–139029 93 IIV-3 126R YP_654698 91 48.4 Transmembrane
123L 140261–139410 283 IIV-3 125R YP_654697 271 49.5
124L 144216–140356 1286 IIV-6 443R,261R,396L; IIV-3 091L NP_149906 587 47.5
125L 144906–144316 196 IIV-3 124R YP_654696 112 42.3
126R 144925–145308 127 IIV-3 123L YP_654695 80 39.5
127L 145500–145342 52 Transmembrane
128R 145520–145729 69 IIV-3 117L YP_654689 35 34.4
129R 145818–148631 936 IIV-1 L96; IIV-3 084L; IIV-6 232R P22856 927 70.0 OTU-like cysteine protease
130R 148675–149193 172 IIV-3 083L; IIV-6 358L YP_654655 95 34.7
131R 149308–150282 324 IIV-6 420R* NP_149883 179 36.5
132R 150254–150514 86
133R 150722–151150 142 IIV-3 082L YP_654654 82 34.5
134L 151813–151349 154 IIV-3 096R; IIV-6 347L YP_654668 108 40.5 ErvI/Alr sulfhydryl oxidase domain
135R 151908–152948 346 Acyrthosiphon pisum metalloprotein; IIV-3 095L XP_001945941 189 38.5 Matrix metalloproteinase
136L 154120–153011 369 IIV-3 091L,036R,008L; IIV-6 443R,219L,317L YP_654663 132 34.0
137L 154906–154187 239 IIV-3 067L; IIV-6 197R YP_654639 264 54.0 Protein tyrosine phosphatase
138R 155025–156068 347 IIV-3 078R; IIV-6 244L YP_654650 402 56.4 Phosphodiesterase domain
139L 156398–156222 58
139R 156283–156492 69
140L 157089–156538 183 IIV-3 073R; IIV-6 234R YP_654645 134 48.0 Transmembrane
141R 157589–158050 153 IIV-3 072L; IIV-6 374R YP_654644 188 60.1
142R 158110–159399 429 IIV-6 468L*; IIV-3 093L NP_149463 333 48.0 Homeodomain
143L 162815–159438 1125 IIV-3 016R; IIV-6 295L YP_654588 1079 49.9
144R 162892–164241 449 IIV-6 468L* NP_149463 308 45.2 Homeodomain
145Rf 164430–165746 438 IIV-6 161L; IIV-3 109L,108L NP_149624 342 44.0 Helicase
146R 165859–166065 68 IIV-6 212L,211L NP_149675 40 39.7
147R 166109–166318 69 IIV-6 388R*; IIV-3 093L NP_149851 43 42.9
148L 167577–166534 347
149L 168222–167611 203 IIV-3 066L; IIV-6 357R YP_654638 102 30.9 Transmembrane
150R 168376–170697 773 IIV-3 113L; IIV-6 155L,149L YP_654685 809 53.7
151L 171154–170834 106 IIV-3 112R; IIV-6 466R YP_654684 101 48.1 Transmembrane
152L 171684–171214 156 IIV-3 111R; IIV-6 414L YP_654683 203 63.7 NUDIX hydrolase domain
153R 171761–172324 187 IIV-3 001R; IIV-6 395R YP_654573 104 47.3
154L 173033–172515 172 IIV-3 092R; IIV-6 454R YP_654664 206 61.9 RPB5 domain
155L 173580–173080 166 Burkholderia oklahomensis EO147; MAR 217 ZP_02357920 80 31.9 dNMP kinase
156L 174825–173689 378 Dictyostelium discoideum AX4 XP_636066 88 24.3
157R 175059–175748 229 IIV-3 032R YP_654604 94 32.3
158R 175933–176637 234 Ixodes scapularis E3 UBQ ligase EEC07169 48 24.7 RING finger
159R 176791–177318 175 IIV-3 018L; IIV-6 415R YP_654590 157 50.6
160R 177923–179023 366 IIV-3 076L; IIV-6 369L YP_654648 372 52.2 XPG-like protein (excision repair)
161R 179110–179382 90
162R 179441–179710 89
163L 180006–179749 85
164L 181405–180044 453 A. polyphaga mimivirus L5,L12 YP_142359 176 30.6 Bro-N domain
165R 181529–185557 1342 IIV-3 090L; IIV-6 176R YP_654662 1964 72.3 DNA-dependent RNA Pol II large subunit
166L 186055–185801 84 IIV-3 089L YP_654661 44 48.8
167L 186230–186060 56 Transmembrane
168R 186298–187545 415 IIV-6 420R* NP_149883 181 33.8
169L 188173–187583 196 IIV-3 068R; IIV-6 401R YP_654640 343 84.7 HMG box
170Rf 188324–188803 159 Apis mellifera XP_624869 121 43.4 Dual-specificity phosphatase
171R 188941–189651 236 IIV-3 116R YP_654688 46 20.2
172L 190186–189704 160 IIV-3 119R YP_654691 73 48.0
173R 190308–190547 79 IIV-6 420R,200R NP_149883 55 40.5
174L 190957–190700 85 IIV-3 115R; IIV-6 342R YP_654687 97 64.5
175R 191065–191487 140 IIV-3 114L YP_654686 51 27.2 Signal peptide
176R 191553–192140 195 IIV-3 081L YP_654653 97 34.7 FasI domain
177R 192245–193678 477 IIV-3 024R; IIV-6 361L,224L YP_654596 539 56.3 Cathepsin
178R 193729–194130 133
179R 194134–194400 88
180L 195002–194424 192 CfDEF NPV 110; Eppo NPV 102 NP_932719 182 46.1
181R 195132–196025 297 IIV-3 017R; IIV-6 335L YP_654589 320 60.4
182R 196333–197700 455 IIV-6 468L* NP_149463 282 40.0 Helix-turn-helix 7, homeodomain
183R 197801–198733 310 IIV-3 107R; IIV-6 117L YP_654679 232 54.4 Transmembrane
184L 199094–198795 99 IIV-3 023R YP_654595 145 68.4
185R 199157–199636 159 IIV-3 050L YP_654622 162 63.6
186L 202115–199674 813 IIV-3 049R YP_654621 95 18.0
187R 202220–203323 367 IIV-3 048L; IIV-6 376L YP_654620 581 72.2 RNR small subunit
188R 203536–204051 171 IIV-3 025R; IIV-6 111R YP_654597 87 33.5 Transmembrane
189R 204093–204767 224 IIV-3 026R; IIV-6 350L YP_654598 296 67.6
190R 204824–205276 150 IIV-3 027R; IIV-6 157L YP_654599 73 31.8 C3HC4 RING finger
191L 205760–205347 137
a

IIV-9 open reading frame number. Proteins identified by the proteomic experiments are indicated by bold ORF numbers for the purified IIV-9 particles and by underlined ORF numbers for infected cells. Note that the protein products of 071R and 182R were identified by the same set of peptides and can therefore not be distinguished unambiguously by the proteomic analysis.

b

Most closely related gene by BLASTP analysis.

c

Matching IIV proteins, with the first-listed protein being the most closely related. Non-IIV proteins with a high similarity score to an IIV-9 protein by BLASTP analysis are indicated. NCLDV species abbreviations are PBCV, Paramecium bursaria chlorella virus, and MAR, Marseilles virus. CfDEF NPV, Choristoneura fumiferana nucleopolyhedrovirus; Eppo NPV, Epiphyas postvittana nucleopolyhedrovirus.

d

Amino acid percent identity for most closely related protein. * indicates similarity to the cluster of proteins encoded by IIV-6 468L and its homologous genes in IIV-6.

e

TF IIS, transcription factor IIS; BIR, baculovirus inhibitor of apoptosis protein repeat; Bro, baculovirus repeated ORF; NUDIX, nucleoside diphosphate linked.

f

Protein identified by a single peptide hit.
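
The position and length columns of Table 2 are linked by simple arithmetic, which gives a quick consistency check on any row. The sketch below assumes that the coordinates are inclusive, that L-strand ORF list the higher coordinate first, and that the reported length excludes the stop codon. The function name `orf_length_aa` is introduced here for illustration only.

```python
def orf_length_aa(first: int, second: int) -> int:
    """Amino acid length of an ORF from inclusive genome coordinates.

    Assumption: the span includes the stop codon, which is not counted.
    """
    span_nt = abs(second - first) + 1
    if span_nt % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span_nt // 3 - 1


# Rows copied from Table 2: (ORF, first coordinate, second coordinate, reported aa).
rows = [
    ("096R", 105651, 107114, 487),
    ("099L", 109129, 108725, 134),
    ("113R", 123945, 127328, 1127),
    ("191L", 205760, 205347, 137),
]

for name, first, second, reported in rows:
    print(name, orf_length_aa(first, second), reported)
```

For each of these rows the computed length equals the value listed in the table.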

The genome orientation and gene designation were defined by the start codon of the IIV-9 orf001R ortholog of the first conserved iridovirus core gene in mosquito IIV-3 (orf004R [5]). Four short ORF (orf007, -088, -061, -139) that were represented by dual heavily overlapping ORF in opposite orientations could not be resolved as forward or reverse by bioinformatics analysis alone. orf007 and -088 were subsequently designated orf007Rp and -088Lp based on the identification of their protein products by the proteomic analysis (Table 2; see also Table S1 in the supplemental material). The remaining two ORF could not be resolved, and hence, both overlapping ORF were designated orf061R and -061L or -139R and -139L, respectively.

The majority of IIV ORF have no predicted function. However, a wide range of predicted proteins showed similarity to proteins involved in nucleotide metabolism and DNA replication. These include enzymes required for deoxyribonucleotide synthesis, such as thymidylate synthase (095R), dUTPase (045R), deoxyribonucleoside kinase (098R), and both the large and small subunits of ribonucleotide reductase (070Ri, 187Ri). The last two have been confirmed as expressed proteins in infected cells. There are two putative NUDIX hydrolase proteins (075R and 152L), and these may play an important role in regulating nucleotides in the host cell. Delhon et al. (5) postulated that the IIV-3 ortholog of 075R (IIV-3 080R) might act similarly to the vaccinia virus NUDIX ortholog (with a D10R mutation) and function as a repressor of transcription and translation. The reported presence of an intein in the large subunit of ribonucleotide reductase (orf070Ri) was confirmed (10).

Forty-four genes that could be postulated to have a role in DNA metabolism or DNA replication were identified. These include viral DNA ligase (109R), DNA polymerase (116R), helicase/primase (120R,145R), PCNA (053R), endonuclease (081R), DNA exonuclease (040R), topoisomerase II (089Ri), and phosphodiesterase (022Ri) genes. However, only topoisomerase II (089Ri) and phosphodiesterase (022Ri) have been identified in infected cells. Genes encoding putative chromatin-binding regions, such as SWID/MDM2 (056Rpi) and HMG box domains (169Lpi), are also present. Although it is not known if these affect the host or viral genome structure, the identification of both proteins in isolated virions suggests a possible association with the viral genome.

This study is the first to identify a putative chitinase gene (020Ri) in an IIV. Analysis of this chitinase indicates that it is a member of the family 18 glycohydrolases (exochitinases) and is most closely related to the chitinase of a bacterial pathogen of fish, Yersinia ruckeri (57% identity), and to the chitinases of other bacteria and slime molds. Baculovirus chitinases, along with cathepsin, have been shown to be important in facilitating the release of virus from the host (15). Despite the IIV-9 chitinase displaying less than 20% identity to baculovirus chitinases, the presence of a viral cathepsin (177Rpi) in IIV-9 may reflect similar roles of chitinase and cathepsin, acting in concert to degrade the insect, thereby facilitating viral release and dissemination from the host insect (15). Both enzymes were identified in IIV-9-infected insect cells by our proteomic analysis.

Expressed orf180Lpi also encodes a protein with strong similarity to baculovirus genes, possessing 46% identity and 66% similarity to orf110 of Choristoneura fumiferana nucleopolyhedrovirus (CfDEF NPV). A related gene is also present in IIV-6 (422L), and alignment of CfDEF NPV orf110, Epiphyas postvittana NPV orf102, and the IIV-9 and IIV-6 proteins shows the presence of a highly conserved pan-caspase DEVD cleavage site. It is not known if this exploits caspase activity for processing or if it might regulate apoptosis. A further enzyme identified in IIV-9 is a putative ErvI/augmenter of liver regeneration (ALR) sulfhydryl oxidase (134Lp). This protein is common in large cytoplasmic DNA viruses (29), and in common with poxviruses, this was found in the IIV-9 virus particle. The role of this enzyme activity is unclear but has been postulated to work in concert with glutaredoxin or thioredoxin systems for regulating cytoplasmic disulfide bonds and protein folding. IIV-9 encodes two proteins with putative thioredoxin domains, 043Rp and 062Rpi, both of which were identified in the virus particle.

The repeat sequences identified in the genome are located predominantly in coding regions. This is reflected in the presence of multiple copies of closely related genes on the genome (see Fig. S1 and S3 in the supplemental material). IIV-9 orf006Lpi, -026R, -035L, -046Ri, -054Ri, -071Ri, -103R, -118R, -142R, -144Ri, and -182Ri form one cluster of paralogs, as reflected in amino acid identities ranging from 60 to 84% between the encoded proteins (see Fig. S1 in the supplemental material). With InterProScan, all of these proteins were predicted to contain helix-turn-helix 7 (HTH 7 [Pfam 02796]) motifs and/or the more stable homeodomain motifs that are involved in DNA binding, with a wide spectrum of roles, ranging from transcription regulation to DNA repair (1). These proteins may have a role in the regulation of viral gene expression or viral genome replication, with an array of closely related proteins being involved in sequence-specific fine-tuning of the viral gene cascade. An alternative role could be in the resolution of the branched complexes generated during IV genome replication, although it is not clear why so many copies would be required.

In addition, IIV-9 orf063R, -131R, and -168R (53 to 79% identity) form a less well conserved cluster of proteins (Fig. 3; see also Fig. S2 in the supplemental material) whose genes display some motif conservation to orf092R and -096R (46% identical). No motifs or predicted functions were identified for this cluster of repeated genes, but by BLAST analysis, they were distantly related to the same helix-turn-helix cluster of genes identified above.

Analysis of the orf068Lp protein, which corresponds to the double-boxed repeat highlighted in Fig. 2A, with the XSTREAM protein tandem repeat finder (27) identified 4.8 copies of a 131-aa repeat at aa 104 to 731 (Fig. 4A) and an immediately adjacent region of 14.4 copies of an 84-aa repeat at aa 714 to 1922 (Fig. 4B). The respective C- and N-terminal flanks of the repeats overlap. The orf067Lpi protein possesses a repeat related to that illustrated in Fig. 4A (Fig. 4C). Both proteins have a high level of predicted β-sheet composition, and both were identified in the virus particle. The high β-sheet structure is similar to what is found in a range of fiber structures, such as bacteriophage fibers and tubulin. BLAST analysis of a single copy of the 131-aa repeat identified a very weak match between a short sequence located around the conserved PDATT motif and bacteriophage fiber proteins (data not shown). However, the location of orf067Lpi and orf068Lp in the particle is unknown, and hence a possible role in the surface fibril cannot be confirmed. An ortholog of this protein is present in IIV-3 (091L) and IIV-6 (443R) but is not conserved in vertebrate IV, which would be consistent with the lack of surface fibrils on the vertebrate IV.

Fig. 4.

Tandem protein repeats in the proteins of orf068L and -067L. The repeat regions from aa 104 to 731 (A) and 714 to 1922 (B) of the orf068L protein are shown with residues matching the consensus (shaded). Underlined residues are the same amino acids. (C) The repeat sequence within the orf067L protein is shown aligned with the orf068L protein repeat shown in panel A (boxed underneath). The starred Y's are the same residue (to orient the 068L and 067L repeats).
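
The copy numbers reported for the orf068L protein agree with the repeat coordinates, as a short calculation shows. This is only an arithmetic check on the published numbers, assuming inclusive residue coordinates. It does not reproduce the XSTREAM algorithm, and the helper name `copies` is hypothetical.

```python
def copies(first_aa: int, last_aa: int, period: int) -> float:
    """Number of repeat copies spanned by an inclusive residue range."""
    return (last_aa - first_aa + 1) / period


repeat_a = copies(104, 731, 131)   # 131-aa repeat (Fig. 4A)
repeat_b = copies(714, 1922, 84)   # 84-aa repeat (Fig. 4B)
overlap = 731 - 714 + 1            # residues shared by the two repeat regions

print(round(repeat_a, 1), round(repeat_b, 1), overlap)
```

The two regions span 628 and 1,209 residues, which round to 4.8 and 14.4 copies, and they share 18 residues where the flanks overlap.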

Relationship to other viruses.

Of the 191 ORF predicted in IIV-9, 108 were most closely related to IIV-3 ORF (Table 2), indicating that IIV-9 is more closely related to the chloriridovirus IIV-3 than to IIV-6. Analysis of IIV-3 shows that 114 of the 126 ORF in IIV-3 (5) were identified as having an ortholog in IIV-9. In contrast, IIV-6 (Iridovirus genus) has 211 ORF, as defined by Eaton et al. (8), of which only 97 have orthologs in IIV-9. A total of 88 ORF are common to all 3 fully sequenced IIV. Interestingly, of the 45 ORF without an ortholog in other IV, 23 encoded proteins that were smaller than 100 amino acids, of which four (002R, 088L, 093L, 111L) were confirmed as expressed proteins by the proteomic analysis. In contrast to the high level of conserved genes between IIV genomes, there is a very low level of conservation in gene order. Comparison to IIV-3 with a gene parity plot (Fig. 2B) indicates only 5 clusters of 3 or more genes, with the largest conservation of gene order and orientation being a cluster of 5 genes represented by IIV-9 097R to 101R and IIV-3 028R to 032R (Table 2). IIV-9 and IIV-6 genomes possess no more than two genes that are conserved in order in any one cluster.
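
The largest conserved cluster can be recovered directly from the ortholog assignments in Table 2. The following is a minimal sketch, not the gene parity plot used for Fig. 2B: it groups consecutive IIV-9 ORF whose first-listed IIV-3 orthologs are also consecutive. The function name `collinear_runs` and the choice of ORF 096 to 105 as input are assumptions made for illustration, and orientation is not checked.

```python
from typing import List, Tuple

Pair = Tuple[int, int]

# (IIV-9 ORF number, first-listed IIV-3 ortholog number) from Table 2, ORF 096 to 105.
# 102R and 103R are omitted because IIV-3 is not their first-listed match.
PAIRS: List[Pair] = [
    (96, 19), (97, 28), (98, 29), (99, 30), (100, 31), (101, 32), (104, 33), (105, 34),
]


def collinear_runs(pairs: List[Pair]) -> List[List[Pair]]:
    """Split ortholog pairs into runs that are adjacent in both genomes."""
    runs: List[List[Pair]] = []
    current: List[Pair] = [pairs[0]]
    direction = 0  # becomes +1 or -1 once a run has two members
    for prev, cur in zip(pairs, pairs[1:]):
        step = cur[1] - prev[1]
        extends = (
            cur[0] - prev[0] == 1        # neighbours in IIV-9
            and step in (1, -1)          # neighbours in IIV-3
            and direction in (0, step)   # same direction as the rest of the run
        )
        if extends:
            current.append(cur)
            direction = step
        else:
            runs.append(current)
            current, direction = [cur], 0
    runs.append(current)
    return runs


longest = max(collinear_runs(PAIRS), key=len)
print(len(longest), longest[0], longest[-1])
```

On this subset the longest run has five members, starting at IIV-9 097R (IIV-3 028R) and ending at IIV-9 101R (IIV-3 032R).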

The 26 core genes previously identified as being conserved in all iridoviruses (8) were identified in IIV-9, and consistent with other genes in the genome, no conservation of gene order was apparent for these conserved genes. Phylogenetic analysis of the coding sequences for all 26 genes collated as a concatenated protein sequence for each of IIV-9, IIV-6, IIV-3, SGIV, LCDV-1, and ISKNV (Fig. 1B) shows the clear separation of the vertebrate and invertebrate IV. In addition, the core set provides strong evidence for the main features of the phylogenetic trees established for the partial MCP sequence (Fig. 1A) (38), with IIV-6 being in a separate clade and IIV-3 being more closely related to the major clade of IIV than its current taxonomic position in a separate genus suggests.

There were 36 NCLDV genes identified in the genome, including nine conserved orthologs found in all NCLDV (19) and seven that were present in all four families but that are missing from some lineages within those families. IIV-9 057R, 076R, and 164L were most closely related to predicted proteins of unknown function from Acanthamoeba polyphaga mimivirus.

miRNA coding prediction.

The analysis of miRNA shows an increasing complexity of viral interactions with this posttranscriptional control system, including the control of host and viral genes. Roles have included the establishment of latency and the avoidance of mammalian immune responses, as well as manipulation of the cellular environment to facilitate replication. Examples of viral miRNA to date have predominantly focused upon viruses that are relatively slow in their replication, such as the herpesviruses, and upon the role of virally encoded miRNA in latency. Because IIV have a nuclear replication phase, are relatively slow growing, and often have a range of sublethal effects upon their host insect, they are potential candidates for encoding miRNA.

Combined analysis for pre-miRNA sequences using VMir and MiPred generated seven possible pre-miRNAs (Table 3). All were located within open reading frames of unknown function, and five had no predicted motifs observable. Three putative pre-miRNAs were identified in the same orientation as the associated ORF, and four were in the opposite orientation. The absence of host cell sequence information makes identification of potential host target genes unfeasible. An XRN exonuclease gene (048R) is predicted on the genome of IIV-9, with orthologs in IIV-3 and IIV-6. This enzyme has a role in the processing of miRNA, in particular, the degradation of mature miRNA (24); hence, even if IIV genomes do not encode miRNAs, there is a strong likelihood that IIV interact with small noncoding RNA systems within the cells that they infect.

Table 3.

Predicted pre-miRNA sequences in the IIV-9 genome (c)

miRNA (a) Start site (b) Apex nucleotide Sequence (5′–3′) (length [bp]) VMir score % G+C MiPred MFE P value MiPred % conf
MR171 11099 1134 CAAAGUCGACUCUUCCACUCGGAAAUGUGAAUGGUUUCCGAGUAAAAGCUGAAGAAAUGACUUUG (65) 130 42 −23.6 0.001 63
MR529 31179 31212 AAUGGGGUGUGUGAUGGAAUUGGAUUACCCACCCUUAUUUUAGGGUGGGUAAUUUUGUAUUACACUUUUGA (71) 181 38 −31 0.001 75
MR598 35790 35822 GGGUUGUGGGAGAGCCAACUGGAUCUACAUAUGUAGAUCCAGCUUGGGAUACAUCUACAGC (61) 127 48 −27.9 0.001 66
MD653 37793 37835 UGAGUAUUAUCAGCUUUUACCAAAGAUGCUGGAUCUACAAUUAACGUUGGACUAGCACCAGCUGAUAAACUUAAA (75) 127 36 −21.6 0.001 63
MR1469 89360 89395 UUUGGUAUUAUGUUGCACUUUUUAACCUUGAUGAAAUAUCCUUUUUUACGAGAAGAAGAAGUGCUCGAGUAUCA (74) 150 32 −20.2 0.005 63
MD2135 128453 128488 GGUGAUGUAAUCUGUGGAAUAACUCUGUUUAGUUUUUUUAGACAUUUGUUCAACAAAGAUUAAAUCGUCACCGU (74) 141 34 −22.9 0.001 69
MR2250 135129 135164 CGCCAUUAUAAUCAUUUUUAUAGUGGAUAGACUCCAAACUAUCAUUGUUCAAAUGAUUAUAAGAACCGG (69) 148 30 −18.6 0.001 64
a

R indicates that pre-miRNA is derived from the reverse genomic strand, and D indicates the direct genomic strand. The number refers to the pre-miRNA identified from the initial screen of the entire genomic sequence by VMir.

b

First nucleotide position of the pre-miRNA in the IIV-9 genome.

c

The minimum free energy (MFE), P value, and percent confidence (conf) were determined by MiPred.
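
A pre-miRNA is expected to fold into a stem-loop, so its 5′ and 3′ ends should be able to base pair. The sketch below illustrates that property on the MR171 sequence from Table 3 by counting consecutive Watson-Crick pairs inward from the two ends. It is a crude illustration only: VMir and MiPred score candidates with their own folding and classification models, which are not reproduced here, and the function name `terminal_stem_length` is hypothetical.

```python
# Watson-Crick pairs only; G-U wobble pairs are deliberately not counted.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def terminal_stem_length(rna: str) -> int:
    """Count consecutive Watson-Crick pairs formed by the 5' and 3' ends."""
    n = 0
    while n < len(rna) // 2 and (rna[n], rna[-1 - n]) in WC_PAIRS:
        n += 1
    return n


# MR171 sequence copied from Table 3.
MR171 = "CAAAGUCGACUCUUCCACUCGGAAAUGUGAAUGGUUUCCGAGUAAAAGCUGAAGAAAUGACUUUG"

print(len(MR171), terminal_stem_length(MR171))
```

The sequence is 65 nucleotides long, as listed in the table, and its first seven bases pair with its last seven before the first non-Watson-Crick position.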

The presence of an RNase III gene (orf034L) that has also been identified in IIV-3, IIV-6, and vertebrate IV (45), and a dsRNA binding protein (060R; IIV-6 340R), supports a role for noncoding RNA in IV replication. RNase III was identified in purified virus particles and infected cells by our proteomic experiments (Table 2; see also Tables S1 and S3 in the supplemental material) and was previously identified in particles of SGIV (31), confirming that this protein is produced and, hence, likely to have a role in IV replication. miRNA has also been predicted for soft-shelled turtle iridovirus (STIV) (18), and the presence of miRNAs has recently been confirmed for SGIV (44).

Conclusions.

IIV-9 is a member of the major IIV (group II) clade, and the complete genome provides insight into the relationships within and between IIV genera. The apparent close relationship to IIV-3, a virus from a separate genus, and the more distant relationship to IIV-6 have been confirmed through full-genome analysis. The genome encodes a wide range of proteins for which there is no functional prediction, and many of these are found in the complex virus particle. The presence of paralog proteins on the genome is a major contributor to the high incidence of repeat sequences associated with the genome, unlike with IIV-3, where the repeats are more likely to be in noncoding regions (5). The clustering of repeats within predominantly the β-sheet proteins suggests that these proteins may form filamentous structures that are associated with the virus particle and, as such, are candidates for the surface fibrils identified on IIV-9 particles. As for other IV, a large number of proteins are predicted to be involved in nucleotide regulation and genome replication, consistent with a life cycle that includes DNA replication in both the cytoplasm and nucleus and the branched concatemeric replication strategy of IV, which requires resolution of complex genome structures (11).

Supplementary Material

[Supplemental material]

ACKNOWLEDGMENT

This research was supported by the University of Otago.

Footnotes

Supplemental material for this article may be found at http://jvi.asm.org/.

Published ahead of print on 1 June 2011.

REFERENCES

  • 1. Aravind L., Anantharaman V., Balaji S., Babu M. M., Iyer L. M. 2005. The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol. Rev. 29:231–262
  • 2. Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573–580
  • 3. Bromenshenk J. J., et al. 2010. Iridovirus and microsporidian linked to honey bee colony decline. PLoS One 5:e13181.
  • 4. Chinchar V. G., et al. 2005. Iridoviridae, p. 145–162 In Fauquet C. M., Mayo M. A., Maniloff J., Desselberger U., Ball L. A. (ed.), Virus taxonomy. Eighth report of the International Committee on Taxonomy of Viruses. Elsevier Academic Press, San Diego, CA
  • 5. Delhon G., et al. 2006. Genome of invertebrate iridescent virus type 3 (mosquito iridescent virus). J. Virol. 80:8439–8449
  • 6. Delius H., Darai G., Flugel R. M. 1984. DNA analysis of insect iridescent virus 6: evidence for circular permutation and terminal redundancy. J. Virol. 49:609–614
  • 7. Do J. W., et al. 2004. Complete genomic DNA sequence of rock bream iridovirus. Virology 325:351–363
  • 8. Eaton H. E., et al. 2007. Comparative genomic analysis of the family Iridoviridae: re-annotating and defining the core set of iridovirus genes. Virol. J. 4:11.
  • 9. Fowler M., Robertson J. S. 1972. Iridescent virus-infection in field populations of Wiseana-Cervinata (Lepidoptera-Hepialidae) and Witlesia sp. (Lepidoptera-Pyralidae) in New Zealand. J. Invertebr. Pathol. 19:154–155
  • 10. Goodwin T. J., Butler M. I., Poulter R. T. 2006. Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes. BMC Biol. 4:38.
  • 11. Goorha R. 1982. Frog virus 3 DNA replication occurs in two stages. J. Virol. 43:519–528
  • 12. Goorha R., Murti G., Granoff A., Tirey R. 1978. Macromolecular synthesis in cells infected by frog virus 3. VIII. The nucleus is a site of frog virus 3 DNA and RNA synthesis. Virology 84:32–50
  • 13. Goorha R., Murti K. G. 1982. The genome of frog virus 3, an animal DNA virus, is circularly permuted and terminally redundant. Proc. Natl. Acad. Sci. U. S. A. 79:248–252
  • 14. Grundhoff A., Sullivan C. S., Ganem D. 2006. A combined computational and microarray-based approach identifies novel microRNAs encoded by human gamma-herpesviruses. RNA 12:733–750
  • 15. Hawtin R. E., et al. 1997. Liquefaction of Autographa californica nucleopolyhedrovirus-infected insects is dependent on the integrity of virus-encoded chitinase and cathepsin genes. Virology 238:243–253
  • 16. He J. G., et al. 2001. Complete genome analysis of the mandarin fish infectious spleen and kidney necrosis iridovirus. Virology 291:126–139
  • 17. He J. G., et al. 2002. Sequence analysis of the complete genome of an iridovirus isolated from the tiger frog. Virology 292:185–197
  • 18. Huang Y., et al. 2009. Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae. BMC Genomics 10:224.
  • 19. Iyer L. M., Aravind L., Koonin E. V. 2001. Common origin of four diverse families of large eukaryotic DNA viruses. J. Virol. 75:11720–11734
  • 20. Jakob N. J., Muller K., Bahr U., Darai G. 2001. Analysis of the first complete DNA sequence of an invertebrate iridovirus: coding strategy of the genome of Chilo iridescent virus. Virology 286:182–196
  • 21. Jancovich J. K., et al. 2003. Genomic sequence of a ranavirus (family Iridoviridae) associated with salamander mortalities in North America. Virology 316:90–103
  • 22. Jiang P., et al. 2007. MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res. 35:W339–W344
  • 23. Juhl S., et al. 2006. Assembly of Wiseana iridovirus: viruses for colloidal photonic crystals. Adv. Funct. Mater. 16:1086–1094
  • 24. Kim Y. K., Heo I., Kim V. N. 2010. Modifications of small RNAs and their associated proteins. Cell 143:703–709
  • 25. Kurita J., Nakajima K., Hirono I., Aoki T. 2002. Complete genome sequencing of Red Sea bream iridovirus (RSIV). Fish. Sci. 68:1113–1115
  • 26. Lu L., et al. 2005. Complete genome sequence analysis of an iridovirus isolated from the orange-spotted grouper, Epinephelus coioides. Virology 339:81–100
  • 27. Newman A. M., Cooper J. B. 2007. XSTREAM: a practical algorithm for identification and architecture modeling of tandem repeats in protein sequences. BMC Bioinformatics 8:382.
  • 28. Radloff C., Vaia R. A., Brunton J., Bouwer G. T., Ward V. K. 2005. Metal nanoshell assembly on a virus bioscaffold. Nano Lett. 5:1187–1191
  • 29. Senkevich T. G., White C. L., Koonin E. V., Moss B. 2000. A viral member of the ERV1/ALR protein family participates in a cytoplasmic pathway of disulfide bond formation. Proc. Natl. Acad. Sci. U. S. A. 97:12068–12073
  • 30. Shevchenko A., et al. 1996. Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc. Natl. Acad. Sci. U. S. A. 93:14440–14445
  • 31. Song W., Lin Q., Joshi S. B., Lim T. K., Hew C. L. 2006. Proteomic studies of the Singapore grouper iridovirus. Mol. Cell. Proteomics 5:256–264
  • 32. Song W. J., et al. 2004. Functional genomics analysis of Singapore grouper iridovirus: complete sequence determination and proteomic analysis. J. Virol. 78:12576–12590
  • 33. Sullivan C. S., Grundhoff A. 2007. Identification of viral microRNAs. Methods Enzymol. 427:3–23
  • 34. Tan W. G., Barkman T. J., Chinchar V. G., Essani K. 2004. Comparative genomic analyses of frog virus 3, type species of the genus Ranavirus (family Iridoviridae). Virology 323:70–84
  • 35. Tidona C. A., Darai G. 1997. The complete DNA sequence of lymphocystis disease virus. Virology 230:207–216
  • 36. Tsai C. T., et al. 2005. Complete genome sequence of the grouper iridovirus and comparison of genomic organization with those of other iridoviruses. J. Virol. 79:2010–2023
  • 37. Ward V. K., Kalmakoff J. 1987. Physical mapping of the DNA genome of insect iridescent virus type 9 from Wiseana spp. larvae. Virology 160:507–510
  • 38. Webby R., Kalmakoff J. 1998. Sequence comparison of the major capsid protein gene from 18 diverse iridoviruses. Arch. Virol. 143:1949–1966
  • 39. Webby R. J., Kalmakoff J. 1999. Comparison of the major capsid protein genes, terminal redundancies, and DNA-DNA homologies of two New Zealand iridoviruses. Virus Res. 59:179–189
  • 40. Williams T. 2008. Natural invertebrate hosts of iridoviruses (Iridoviridae). Neotrop. Entomol. 37:615–632
  • 41. Williams T., Cory J. S. 1994. Proposals for a new classification of iridescent viruses. J. Gen. Virol. 75:1291–1301
  • 42. Yan X., et al. 2000. Structure and assembly of large lipid-containing dsDNA viruses. Nat. Struct. Biol. 7:101–103
  • 43. Yan X., et al. 2009. The capsid proteins of a large, icosahedral dsDNA virus. J. Mol. Biol. 385:1287–1299
  • 44. Yan Y., et al. 2011. Identification of a novel marine fish virus, Singapore grouper iridovirus-encoded microRNAs expressed in grouper cells by Solexa sequencing. PLoS One 6:e19148.
  • 45. Zenke K., Kim K. H. 2008. Functional characterization of the RNase III gene of rock bream iridovirus. Arch. Virol. 153:1651–1656
  • 46. Zhang Q. Y., Xiao F., Xie J., Li Z. Q., Gui J. F. 2004. Complete genome sequence of lymphocystis disease virus isolated from China. J. Virol. 78:6982–6994
