Abstract
Analysis of the R2 retrotransposons from multiple silkmoth and fruitfly species has revealed three segments that contain conserved RNA secondary structures. These conserved structures play important roles in the propagation of the R2 element, including R2 RNA processing and transposon integration into the host genome, as well as a likely role in translation. Two of the structured regions comprise protein binding sites: one is located in the 3′ UTR and the other is in the 5′ UTR close to the putative start of the R2 open reading frame (ORF). The 3′ structure was deduced from chemical mapping and sequence comparison. The 5′ structure, which contains an unusual pseudoknot, was determined using a combination of chemical mapping, oligonucleotide binding, NMR and sequence analysis. The third structure occurs at the 5′ end of the R2 RNA and is responsible for self-cleavage of the 5′ end of the element from a 28S ribosomal RNA co-transcript. A structure for this fragment was proposed based on motif searching and sequence comparison. There is remarkable similarity in sequence and structure to the hepatitis delta virus (HDV) ribozyme. Seed alignments for the 5′ structure and the R2 ribozyme, containing representative sequences and consensus structures, have been submitted to the Rfam database.
Key words: Bombyx mori, Drosophila, pseudoknot, secondary structure, retrotransposon, ribozyme
Overview
R2 elements are non-long terminal repeat (non-LTR) retrotransposons that insert site-specifically into the host organism's 28S ribosomal RNA (rRNA) genes.1 To date these molecular parasites have been identified in arthropods, vertebrates, echinoderms, flatworms and hydra.2 Once an R2 element has inserted into a 28S gene, that gene is no longer capable of producing functional rRNA; however, the ribosomal DNA locus itself typically comprises hundreds of genes. The “ecological landscape” for propagation of R2 elements is thus complex: the R2 element must produce enough copies to maintain itself, but overinsertion by R2 is detrimental to the host, so the elements must also contend with host mechanisms that suppress R2 replication.1
R2 elements were originally discovered in 1977,3 and many aspects of their biology have been deduced over subsequent years using silkmoths and fruitflies as model systems. As a retrotransposon, the R2 element reproduces itself through an RNA intermediate: the R2 RNA, which possesses a single ORF for the R2 protein as well as cis-regulatory signals necessary for insertion into the host genome. The R2 RNA is initially co-transcribed with host ribosomal RNA by RNA polymerase I and is processed out of this primary transcript by an encoded self-cleaving ribozyme that frees the 5′ end.4 The R2 ribozyme has marked similarity in sequence and secondary structure to the hepatitis delta virus ribozyme (Fig. 1),4 the latter being essential to the maturation of active viral segments through self-cleavage. The processed RNA would lack the 5′ cap structure normally associated with mRNA, which suggests that initiation of R2 translation occurs through a non-canonical mechanism. Further support for this model is the lack of a conserved initiation codon upstream of the identifiably conserved R2 ORF.5 In silkmoth R2 RNAs, an unusual pseudoknot structure is present roughly 50–100 nucleotides upstream of the putative R2 ORF start site (Fig. 2). The location of this structure and the utilization of RNA pseudoknots as internal ribosome entry sites (IRES) for some viral RNAs6 suggest that this structure is important for R2 protein synthesis.5
Figure 1.
Comparison of secondary structures of HDV ribozyme and R2 ribozyme. The HDV ribozyme structure is taken from the crystal structure.16 The catalytic residue is boxed in red. The R2 ribozyme structure is shown for D. melanogaster and is annotated for mutations observed in other Drosophila R2 sequences: blue bases undergo compensatory mutations (double point mutations that preserve pairing), while green bases indicate consistent mutations (single point mutations that preserve pairing).
Figure 2.
Cartoon of the 5′ structured region of the R2 RNA from silkmoths. Four conserved RNA hairpins and a pseudoknot are shown. The putative open reading frame starts downstream of the pseudoknot and second conserved hairpin. The three colored boxes indicate conserved amino acid sequences.
The R2 RNA interacts with the R2 protein during insertion into the host genome. The mechanism by which the R2 element inserts has been well documented for Bombyx mori and proceeds via an ordered sequence of DNA cleavage and synthesis.7,8 The process was named target-primed reverse transcription (TPRT) and is now the accepted mechanism used by non-LTR retrotransposons (also referred to as LINE-like elements). Two copies of the R2 protein differentially bind the R2 RNA at specific sites:9,10 either at the 3′ untranslated region (UTR) or towards the 5′ end including the R2 pseudoknot structure. The protein-RNA complex binds specifically on either side of the host 28S rDNA target site. The 3′ target bound R2 protein creates a nick, releasing a 3′ OH which is used to prime reverse transcription of the R2 RNA template into cDNA; this is accomplished via the reverse transcriptase domain of this R2 subunit.7 The 5′ target bound protein cleaves the opposite DNA strand10 and then proceeds to act as a DNA-dependent DNA polymerase in second strand synthesis, reading the newly made cDNA.11
RNA Structures
R2 ribozyme.
The R2 element was long assumed to be co-transcribed with the rRNA gene unit, but how R2 was processed out of this co-transcript was unknown. Using RNA generated in standard in vitro reactions, it was shown that the first 184 nucleotides at the 5′ end of the R2 element from Drosophila simulans were capable of rapid self-cleavage of the 28S-R2 co-transcript. Comparisons of possible secondary structures to the five known classes of self-cleaving ribozymes revealed that R2 RNA has remarkable similarities to the hepatitis delta virus (HDV) ribozyme (Fig. 1).4 Both ribozymes are characterized by a double pseudoknot with five base-paired regions (P1 to P4 and P1.1). Helices, rather than sequences, are conserved in these regions, as revealed by compensatory mutations across many Drosophila species. The only major difference observed between the two structures is that the R2 ribozyme J1/2 joining segment is over 100 nucleotides longer than in the HDV ribozyme. In vitro experiments with the J1/2 region excised from the R2 ribozyme revealed that these sequences are not necessary for self-cleavage.4
Remarkably, many of the residues surrounding the catalytic core are conserved between the R2 and HDV ribozymes. These residues are predominantly located in loop region 3 (L3) and in the region joining P2 and P4 (J4/2). This conservation is surprising considering that mutagenesis experiments suggested that, with the exception of the catalytic nucleotide (boxed C in Fig. 1), none of these nucleotides appears to be absolutely necessary for cleavage.12 Because R2 elements have been vertically transmitted since the origin of the Drosophila genus, their evolution can be followed over this time period. Nucleotide comparison among multiple species revealed that the R2 ribozyme underwent numerous changes, but the catalytic domain was conserved.4 Ongoing analysis of the R2s from other arthropod species, including B. mori, reveals similar ribozyme structures which are able to self-cleave (unpublished data). The presence of an HDV-like ribozyme at the 5′ end of R2 RNA transcripts is likely a remarkable instance of convergent evolution of RNA structure.
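The distinction between compensatory and consistent mutations used to support conserved helices (defined in the Figure 1 legend) can be illustrated with a minimal Python sketch. It is not the procedure used in the cited work; the function name `classify_pair_change` is hypothetical, and the sketch assumes that G-U wobble pairs count as paired alongside Watson-Crick pairs.

```python
# Pairs treated as "paired" in this sketch (assumption: Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair_change(ref_pair, other_pair):
    """Classify how a base pair in another sequence differs from a reference pair.

    Each argument is a two-letter string: the 5' base followed by its 3' partner.
    """
    ref = tuple(ref_pair.upper())
    other = tuple(other_pair.upper())
    if ref not in PAIRS:
        raise ValueError("reference positions are not paired")
    if other == ref:
        return "unchanged"
    if other not in PAIRS:
        return "disruptive"  # pairing is lost in the other sequence
    # Count how many of the two positions differ from the reference.
    changed = sum(a != b for a, b in zip(ref, other))
    # Both positions changed: compensatory; one position changed: consistent.
    return "compensatory" if changed == 2 else "consistent"


if __name__ == "__main__":
    print(classify_pair_change("GC", "AU"))  # both bases differ, pairing kept
    print(classify_pair_change("GC", "GU"))  # one base differs, pairing kept
```

Applied column by column to an alignment with a consensus structure, a count of compensatory and consistent changes per helix gives a rough measure of how strongly comparative data support that helix.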
R2 5′ region.
The R2 5′ region also serves as a binding site for R2 protein and may be important in translation initiation. The role of this region in R2 retrotransposition has been studied in the silkmoth B. mori. The region comprises four conserved RNA hairpins and a pseudoknot (Fig. 2). The structurally conserved regions are separated by regions with no structural conservation. The roughly 330 nt R2 5′ region contains a transition from encoding primarily RNA structure to encoding protein (Fig. 2).5
The importance of the 5′ region was discovered by noting a persistent RNA “contamination” in preparations of R2 protein. This RNA is strongly bound by the R2 protein and has important roles in inducing one of the two R2 proteins to bind DNA downstream of the insertion site and conduct the second half of the TPRT integration reaction.10
A secondary structure for the B. mori 5′ region was first proposed based on single sequence free energy minimization, guided by constraints from chemical mapping and oligonucleotide hybridization experiments.13 A fragment of 74 nt is unusually insensitive to structural probing.5 A pseudoknot proposed for this fragment is supported by NMR spectra.14 When additional 5′ sequences from related silkmoth species were obtained, comparative sequence analysis coupled with structure probing confirmed the pseudoknot and identified another four conserved structural elements (Figs. 2 and 3).5 As shown in Figure 3, the pseudoknot secondary structure is conserved across silkmoth R2 sequences and is well supported by substitutions that preserve base pairing. This pseudoknot has an unusual helical stacking arrangement, as deduced from NMR:14 rather than having helical stacking across the two pseudoknot helices, the smaller of the two hairpins (P5) stacks on the larger pseudoknot helix, P4 (Fig. 3). The 5′ untranslated region of some fruitfly R2s can also form a pseudoknot structure similar to the pseudoknots in silkmoth R2s. The fruitfly pseudoknot is located in the large J1/2 region of the ribozyme structure, which is not needed for self-cleavage activity.
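In secondary structure terms, a pseudoknot is present whenever two base pairs (i, j) and (k, l) cross, that is, i < k < j < l. The following minimal Python sketch checks a list of base pairs for such crossings. The function names and the toy coordinates are hypothetical and do not correspond to the R2 pseudoknot positions.

```python
from itertools import combinations


def crossing_pairs(base_pairs):
    """Return every pair of base pairs (i, j), (k, l) that satisfy i < k < j < l."""
    # Orient each pair as (5' index, 3' index) and sort by 5' index.
    normalized = sorted(tuple(sorted(bp)) for bp in base_pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(base_pairs):
    """True if at least two base pairs cross, i.e., the structure is not fully nested."""
    return bool(crossing_pairs(base_pairs))


if __name__ == "__main__":
    # Toy example (hypothetical coordinates): a simple nested hairpin.
    hairpin = [(1, 10), (2, 9), (3, 8)]
    # Toy example (hypothetical coordinates): two helices whose strands interleave.
    knot = [(1, 12), (2, 11), (7, 18), (8, 17)]
    print(has_pseudoknot(hairpin), has_pseudoknot(knot))
```

A check of this kind only identifies the topology; it says nothing about helical stacking, which in the case above was deduced from NMR.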
Figure 3.
Secondary structure of the R2 pseudoknot. The B. mori sequence, with only 54% average pairwise sequence identity to the other silkmoth R2 pseudoknots, is shown. The color annotation of mutations is as in Figure 1.
The R2 pseudoknot model, as well as its location in the R2 mRNA, may shed light on one of the mysteries of R2 retrotransposon biology: how the R2 protein is translated. The R2 pseudoknot occurs downstream of the un-capped R2 mRNA 5′ end and multiple in-frame stop codons. As part of the larger R2 5′ structured region, which spans the junction of the protein coding and non-coding RNA regions, the pseudoknot is positioned near the putative ORF start site.
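An average pairwise sequence identity of the kind quoted in this legend can be computed from aligned sequences as sketched below. Several definitions of identity are in use. This sketch assumes that columns gapped in both sequences are ignored and that a gap opposite a base counts as a mismatch, which may differ from the definition behind the 54% figure. The function names are hypothetical and the sequences in the usage example are toy data.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Fraction of compared alignment columns in which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # assumption: columns gapped in both sequences are ignored
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a base never matches
    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


def average_identity_to_reference(reference, others):
    """Mean pairwise identity of one aligned sequence to each of the others."""
    if not others:
        raise ValueError("need at least one other sequence")
    return sum(pairwise_identity(reference, o) for o in others) / len(others)


if __name__ == "__main__":
    # Toy aligned sequences, not R2 data.
    print(average_identity_to_reference("GGCA", ["GGCU", "GGCA"]))
```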
R2 3′ region.
The 3′ untranslated region of the R2 transcript is bound by the R2 protein and is involved in the initial DNA cleavage and reverse transcription step. The TPRT reaction only occurs with RNAs containing these 3′ sequences. Secondary structure models have been proposed for both fruitflies and silkmoths based on sequence comparison, free energy minimization and chemical mapping data.15 Within each of the groups, the secondary structures are supported by compensating changes that appear to be clade specific. The secondary structure for B. mori is shown in Figure 4. While the secondary structures of the 3′ untranslated regions of the fruitfly R2 elements show no obvious similarity to the silkmoth RNA structures, the R2 protein from B. mori can utilize the fruitfly 3′ RNA in a TPRT reaction.15
Figure 4.
Secondary structure of the 3′ structured region of R2 RNA from B. mori. The color annotation of mutations is as in Figure 1.
Materials and Methods
All sequences described were taken from previous publications: the R2 5′ end from Kierzek et al.,5 the R2 ribozyme from Eickbush and Eickbush,4 and the R2 3′ UTR from Ruschak et al.15 For the R2 ribozyme, sequences and secondary structures were aligned manually; the other alignments were taken from references 5 and 15.
Concluding Remarks
RNA secondary structures play important roles in the propagation and maintenance of R2 retrotransposons. Analysis of RNA secondary structure has yielded functional and mechanistic insights into important R2 biological processes such as genome integration, 5′ end processing and translation. Continued analysis of R2 RNA will likely reveal additional roles for RNA structures. Structural knowledge will help to shed light on some of the continuing mysteries of R2 retrotransposon biology: e.g., specifics on how R2 RNAs bypass canonical translation initiation, how the R2 RNA 3′ end is processed, and how R2 protein binds both the 3′ and 5′ ends of the R2 RNA transcript.
Acknowledgments
This work was supported by NIH grants GM22939 (D.H.T.) and GM42790 (T.H.E.).
References
- 1. Eickbush TH. R2 and related site-specific non-LTR retrotransposons. In: Craig N, Craigie R, Gellert M, Lambowitz A, editors. Mobile DNA II. Washington DC: American Society of Microbiology Press; 2002. pp. 813–825.
- 2. Kojima KK, Kuma K, Toh H, Fujiwara H. Identification of rDNA-specific non-LTR retrotransposons in Cnidaria. Mol Biol Evol. 2006;23:1984–1993. doi: 10.1093/molbev/msl067.
- 3. Wellauer PK, Dawid IB. The structural organization of ribosomal DNA in Drosophila melanogaster. Cell. 1977;10:193–212. doi: 10.1016/0092-8674(77)90214-8.
- 4. Eickbush DG, Eickbush TH. R2 retrotransposons encode a self-cleaving ribozyme for processing from an rRNA cotranscript. Mol Cell Biol. 2010;30:3142–3150. doi: 10.1128/MCB.00300-10.
- 5. Kierzek E, Christensen SM, Eickbush TH, Kierzek R, Turner DH, Moss WN. Secondary structures for 5′ regions of R2 retrotransposon RNAs reveal a novel conserved pseudoknot and regions that evolve under different constraints. J Mol Biol. 2009;390:428–442. doi: 10.1016/j.jmb.2009.04.048.
- 6. Kanamori Y, Nakashima N. A tertiary structure model of the internal ribosome entry site (IRES) for methionine-independent initiation of translation. RNA. 2001;7:266–274. doi: 10.1017/s1355838201001741.
- 7. Luan DD, Korman MH, Jakubczak JL, Eickbush TH. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell. 1993;72:595–605. doi: 10.1016/0092-8674(93)90078-5.
- 8. Yang J, Malik HS, Eickbush TH. Identification of the endonuclease domain encoded by R2 and other site-specific, non-long terminal repeat retrotransposable elements. Proc Natl Acad Sci USA. 1999;96:7847–7852. doi: 10.1073/pnas.96.14.7847.
- 9. Christensen SM, Eickbush TH. R2 target-primed reverse transcription: ordered cleavage and polymerization steps by protein subunits asymmetrically bound to the target DNA. Mol Cell Biol. 2005;25:6617–6628. doi: 10.1128/MCB.25.15.6617-6628.2005.
- 10. Christensen SM, Ye J, Eickbush TH. RNA from the 5′ end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site. Proc Natl Acad Sci USA. 2006;103:17602–17607. doi: 10.1073/pnas.0605476103.
- 11. Kurzynska-Kokorniak A, Jamburuthugoda VK, Bibillo A, Eickbush TH. DNA-directed DNA polymerase and strand displacement activity of the reverse transcriptase encoded by the R2 retrotransposon. J Mol Biol. 2007;374:322–333. doi: 10.1016/j.jmb.2007.09.047.
- 12. Nehdi A, Perreault JP. Unbiased in vitro selection reveals the unique character of the self-cleaving antigenomic HDV RNA sequence. Nucleic Acids Res. 2006;34:584–592. doi: 10.1093/nar/gkj463.
- 13. Kierzek E, Kierzek R, Moss WN, Christensen SM, Eickbush TH, Turner DH. Isoenergetic penta- and hexanucleotide microarray probing and chemical mapping provide a secondary structure model for an RNA element orchestrating R2 retrotransposon protein function. Nucleic Acids Res. 2008;36:1770–1782. doi: 10.1093/nar/gkm1085.
- 14. Hart JM, Kennedy SD, Mathews DH, Turner DH. NMR-assisted prediction of RNA secondary structure: identification of a probable pseudoknot in the coding region of an R2 retrotransposon. J Am Chem Soc. 2008;130:10233–10239. doi: 10.1021/ja8026696.
- 15. Ruschak AM, Mathews DH, Bibillo A, Spinelli SL, Childs JL, Eickbush TH, Turner DH. Secondary structure models of the 3′ untranslated regions of diverse R2 RNAs. RNA. 2004;10:978–987. doi: 10.1261/rna.5216204.
- 16. Ferre-D'Amare AR, Zhou K, Doudna JA. Crystal structure of a hepatitis delta virus ribozyme. Nature. 1998;395:567–574. doi: 10.1038/26912.




