Abstract
R2 non-long-terminal-repeat retrotransposable elements integrate into a precise location in the 28S rRNA genes of arthropods. The purified protein encoded by R2 can cleave the 28S gene target site and use the 3′ hydroxyl group generated by this cleavage to prime reverse transcription of its own RNA, a process called target-primed reverse transcription. An integration system is described here in which components from the R2 element of the silkmoth, Bombyx mori, are injected into the preblastoderm embryo of Drosophila melanogaster. Silkmoth R2 sequences were readily detected in the 28S rRNA genes of the surviving adults as well as in the genes of their progeny. The 3′ junctions of these insertions were similar to those seen in our in vitro assays, as well as those from endogenous R2 retrotransposition events. The 5′ junctions of the insertions originally contained major deletions of both R2 and 28S gene sequences, a problem overcome by the inclusion of upstream 28S gene sequences at the 5′ end of the injected RNA. The resulting 5′ junctions suggested a recombination event between the cDNA and the upstream target sequences. This in vivo integration system should help determine the mechanism of R2 retrotransposition and be useful as a delivery system to integrate defined DNA sequences into the rRNA genes of organisms.
Non-long-terminal-repeat (non-LTR) retrotransposable elements are a widespread and abundant class of eukaryotic mobile elements. Direct evidence for the mechanism of non-LTR retrotransposition has been obtained from studies of R2, an element which exhibits extraordinary insertion specificity for the 28S rRNA genes of its arthropod host (2, 3). The single open reading frame (ORF) of the R2 element from the silkmoth, Bombyx mori, was expressed in bacteria and found to encode a sequence-specific endonuclease (37). In vitro studies revealed that the purified R2 protein was capable of synthesizing a cDNA copy of its own RNA transcript directly onto the 28S target site (27). As shown in Fig. 1, this mechanism involves the formation of a specific nick on the noncoding strand of the target 28S DNA. The exposed 3′ hydroxyl group is then used to prime reverse transcription, a process termed target-primed reverse transcription (TPRT), before cleavage of the coding strand. Only RNA molecules containing the 250 bp 3′ untranslated region (3′ UTR) of the R2 element can support the TPRT reaction (25). This reaction has many similarities to the mechanism used by mobile group II introns to insert into unoccupied target sites (intron homing) (5, 41).
FIG. 1.
Diagram of the TPRT model for R2 retrotransposition. In the first step of the reaction the R2 protein cleaves the noncoding (primer) strand of the target site and uses the released 3′ end to prime reverse transcription. After reverse transcription, cleavage of the coding (nonprimer) strand occurs. The R2 element does not have RNase H activity (27); thus, removal of the RNA template after reverse transcription is conducted by the cellular machinery. It is not known whether attachment of the cDNA to the upstream target sequences and synthesis of the second DNA strand of the element is also catalyzed by the R2 protein or is dependent upon the cellular DNA repair machinery. Thick lines, DNA target sequences; thin line, RNA template; dashed lines, synthesized first and second DNA strands of the new insertion.
Studies of non-LTR integration mechanisms have also been conducted with the L1 elements found in mammals (6, 8, 29). These studies have suggested that L1 elements are also likely to use a TPRT mechanism of retrotransposition; however, the endonuclease cleavage and RNA binding of the L1-encoded proteins are considerably less sequence specific than those of the R2 protein. This indiscriminate choice of RNA templates and insertion sites by L1 elements has significantly shaped the human genome. Nearly 30% of the human genome can be attributed to the reverse transcription of L1 and various other RNA templates (19, 20, 33).
Although the mechanism by which the R2 protein cleaves the DNA target and initiates the TPRT reaction has been extensively studied (25, 26, 39, 40), little is known of the subsequent step in the integration process: attachment of the R2 sequences to the upstream 28S gene target site. Analysis of endogenous R2 5′ junctions from a variety of arthropods has suggested that 5′ attachment is accomplished either by a recombination event or by a jump (template switch) of the reverse transcriptase from the RNA template to the upstream DNA target site (3, 10). In either case, complete integration of an R2 element is likely to be highly dependent upon the cell's DNA repair machinery.
An R2 protein-R2 RNA complex can find and cleave the 28S gene target when incubated with total genomic DNA in vitro (39). In addition, a 1,000-fold excess of nonspecific RNA does not prevent the R2 reverse transcriptase from recognizing the 3′ end of its own transcript for TPRT (25). It thus seemed reasonable that an R2 protein-RNA complex injected into an intact cell would find its 28S rRNA gene target site and initiate the TPRT reaction. Such an in vivo system could supply any required cellular DNA repair machinery in trans, allowing complete R2 integration reactions to occur. In this report, we describe such an integration system in which RNP complexes containing R2 protein and R2 RNA from B. mori are injected into Drosophila melanogaster preblastoderm stage (1 to 2 h) embryos. R2 integration events in the 28S rDNA genes of the surviving adult flies and their progeny were detected by PCR amplification assays. This injection system has enabled us to monitor which RNA sequences may be needed at the 5′ end of the R2 transcript to complete the TPRT reaction. This approach should eventually make it possible to study engineered R2 elements introduced into their normal location within the nucleolus.
MATERIALS AND METHODS
Synthesis of the R2 RNA templates.
The clone used to synthesize the mini-R2–28S RNA was constructed by separate PCR amplification of the 5′ and 3′ ends of an R2Bm insertion in B. mori DNA. The 3′ end of the construct was amplified by using 5′-CTAAGTCGACTTGGTTGAGCCTTGCACAG-3′ to prime synthesis within the R2 element (starting 255 bp from the 3′ end) and 5′-CTGCAAGCTTGCTAGATAGTAGATAGG-3′ to prime synthesis in the downstream 28S gene sequence (ending 81 bp downstream of the R2 insertion site). The PCR product was cloned into the SalI and HindIII sites of the pBSIISK− (pBluescript) vector. The 5′ half of the construct was amplified by using 5′-CTAACTCGAGGAGTCTCTAGTCGATAG-3′ to prime synthesis in the upstream 28S gene (starting 495 bp upstream of the R2 site) and 5′-CTAAGTCGACCGTTCTAAGGCGGCACT-3′ to prime synthesis within the R2 element (ending 470 bp from the 5′ end), and inserted into the XhoI and SalI sites of the above clone. The clone used to synthesize the micro-R2–28S RNA was constructed by PCR amplification of the 5′ end of an R2Bm insertion from B. mori DNA by using 5′-CTATGTCGACGGTACCCAGATTAAGACGAC-3′ to prime synthesis in the upstream 28S gene (starting 170 bp upstream of the R2 site) and 5′-CATTGGTACCTCAGCTCAGAACTGGCACGG-3′ to prime synthesis in the R2 element (ending 50 bp from the 5′ end). This PCR product was inserted into the KpnI and SalI sites of construct R2Bm249 (25).
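The cloning steps above rely on restriction sites carried in the 5′ tails of the PCR primers. As a quick consistency check, the sketch below scans each primer for the recognition sequences of the four enzymes named in the text. The primer sequences are copied from this section and the recognition sequences are the standard ones for SalI, HindIII, XhoI, and KpnI; the dictionary labels and the helper name `site_positions` are our own illustrative choices, not part of the original protocol.

```python
# Standard recognition sequences of the enzymes named in the cloning steps.
SITES = {
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
}

# Primer sequences (5'->3') as given in the text. Labels are illustrative only.
PRIMERS = {
    "mini 3' half, R2 side": "CTAAGTCGACTTGGTTGAGCCTTGCACAG",
    "mini 3' half, 28S side": "CTGCAAGCTTGCTAGATAGTAGATAGG",
    "mini 5' half, 28S side": "CTAACTCGAGGAGTCTCTAGTCGATAG",
    "mini 5' half, R2 side": "CTAAGTCGACCGTTCTAAGGCGGCACT",
    "micro, 28S side": "CTATGTCGACGGTACCCAGATTAAGACGAC",
    "micro, R2 side": "CATTGGTACCTCAGCTCAGAACTGGCACGG",
}


def site_positions(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based start of first occurrence} for sites in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


found = {label: site_positions(seq) for label, seq in PRIMERS.items()}
for label, hits in found.items():
    print(f"{label}: {hits}")
```

In this scan every primer carries at least one of the named sites, and the first site always begins after four 5′ nucleotides, which is consistent with the sites having been added as primer tails for directional cloning.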
For in vitro transcription of RNA, plasmid HR4 (27) was linearized with XmnI, plasmid mini-R2–28S with HindIII, and micro-R2–28S with EcoRI. Vector RNA was synthesized from pBSIISK− linearized with ApaLI. DNA templates used for the synthesis of HR4/10R and HR4/1A RNA were generated by PCR amplification of the HR4 plasmid by using the universal −40 primer and specific primers to generate DNA with the appropriate 3′ ends as previously described (25, 26). After restriction digestion or PCR amplification, the DNA was treated with proteinase K, extracted with phenol-chloroform, and ethanol precipitated. RNAs with 5′ methyl-G caps were synthesized with the T7 mMessage mMachine kit (Ambion). One microgram of linearized template DNA was transcribed in a 20-μl volume containing a ratio of cap analog [m7G(5′)ppp(5′)G] to GTP of 4:1, as well as the standard concentrations of T7 polymerase and remaining ribonucleotides. Reactions were carried out at 37°C for 2 h. The template DNA was degraded with 2 U of DNase I for 15 min and phenol extracted. After isopropyl alcohol precipitation, the RNA was resuspended in 10 μl of 10 mM Tris-HCl (pH 8)–1 mM EDTA, and the concentration was estimated on agarose gels.
Preparation of R2 protein-RNA for microinjection.
R2 protein was purified from Escherichia coli JM109/pR260 as previously described (39). For the injection, a 200-μl aliquot of the DNA-cellulose column eluate (approximately 3 μg of protein) was mixed with 10 to 15 μg of in vitro-transcribed R2 RNA and 120 U of RNasin (Pharmacia). The mixture was dialyzed at 4°C for 2 h against injection buffer containing 5 mM KCl, 0.1 mM dithiothreitol, and 0.1 mM NaPO4 (pH 7.5). It was then concentrated threefold by spinning in a Centricon-50 column at 6,000 rpm for 15 min and stored on ice throughout the injection. Only negligible reductions in the yield of somatic integration events were observed after storage of the RNP complex in this form for several days. The final concentration of RNP complex in the injection was approximately 50 μg/ml.
Microinjection and Drosophila maintenance.
Injections were performed as described by Spradling (34) with minor modifications. Preblastoderm embryos of the w1118 strain (15) from 1-h collections on apple juice plates were dechorionated in 50% Clorox, followed by extensive washing with water. The embryos were then aligned on a second plate with their posterior ends near the edge of a coverslip. After removal of the coverslip, the embryos were transferred to a microscope slide containing a small amount of glue made by dissolving double-stick Scotch tape in heptane (31). The embryos were then dehydrated in a box containing Drierite for 7 to 10 min, covered with halocarbon oil, and injected with the protein-RNA complex by using a Narishige Microinjector. After injection, embryos were allowed to develop on apple juice-agarose plates overnight at 18°C in an oxygen box. Yeast was then added, and the plates were left at 25°C for 2 days. Larvae were subsequently transferred to vials containing standard cornmeal medium.
Preparation of flies for PCR amplification.
Adult flies were prepared for PCR as described by Gloor and Engels (12). Individual flies were placed into 0.5-ml tubes and mashed for 5 to 10 s with a pipette tip containing 50 μl of squishing buffer (10 mM Tris HCl, pH 8.2; 1 mM EDTA; 25 mM NaCl; and 200 μg of proteinase K per ml). After the fly was mashed, the buffer was expelled from the tip, mixed with the crushed carcass, and incubated at 37°C for 30 min. The proteinase digestion was stopped by heating the tube to 95°C for 2 min. To screen for germ line events, groups of 10 flies were mashed in 0.5-ml tubes containing 10 μl of squishing buffer; an additional 400 μl of buffer was then added. After the 30-min incubation, the reaction was stopped by heating the tube to 95°C for 3 min. For both the somatic and germ line assays, 1 μl of the crude DNA preparation was used directly in each 25-μl PCR reaction.
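The volumes above determine how much fly material enters each PCR. The following sketch works through that arithmetic, assuming that extraction is uniform and that the volume contributed by the fly tissue itself is negligible (both are simplifying assumptions of ours); the function name is hypothetical.

```python
def fly_equivalents_per_pcr(n_flies: int, buffer_ul: float, template_ul: float = 1.0) -> float:
    """Fly equivalents carried into one PCR, assuming a uniform crude extract."""
    return n_flies * template_ul / buffer_ul


# Somatic assay: one fly in 50 ul of buffer, 1 ul used per reaction.
somatic = fly_equivalents_per_pcr(n_flies=1, buffer_ul=50)

# Germ line screen: 10 flies in 10 ul + 400 ul of buffer, 1 ul used per reaction.
germ_line_pool = fly_equivalents_per_pcr(n_flies=10, buffer_ul=10 + 400)
per_fly_in_pool = germ_line_pool / 10

print(f"somatic assay: {somatic:.4f} fly equivalents per PCR")
print(f"germ line pool: {germ_line_pool:.4f} total, {per_fly_in_pool:.4f} per fly")
```

Under these assumptions each pooled fly contributes roughly one-eighth as much template as a fly in the single-fly assay, which fits with the germ line screen targeting insertions present in every cell of a positive animal.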
Nested PCR and sequencing of the 5′ and 3′ junctions of the R2Bm insertions.
For each round of PCR, one primer was specific to the Drosophila rDNA locus (either the upstream or downstream 28S gene or the R1 element of Drosophila), while the second primer was specific to the R2Bm element (Fig. 2). To amplify the 3′ junction of the R2Bm insertions in a previously uninserted 28S gene, the 28S(+700R) primer (5′-AAGAGCCGACATCGAAGGATC-3′) and the R2Bm(−250) primer (5′-TTGGTTGAGCCTTGCACAG-3′) were used for first-round amplification. The 28S(+350R) primer (5′-CTCGTGATACTTTGATC-3′) and the R2Bm(−190) primer (5′-AGCTCGCTCCCTTGGC-3′) were used for second-round amplification. To amplify the R2 insertion in a 28S gene already containing an R1 insertion, the R1Dm(+350R) primer (5′-CAGTCCAGCAATCGTATGCTCG-3′) and the R1Dm(+100R) primer (5′-TTGCGCACCACTTCCACGGAAC-3′) were used in the first and second rounds of PCR, respectively, in conjunction with the R2Bm(−250) and R2Bm(−190) primers described above. To obtain the 5′ junctions, the 28S(−500) primer (5′-CCAATATCCGCAGCTGG-3′) and the R2Bm(−3R) primer (5′-TCATCGCCGGATCATC-3′) were used for the first-round amplification, and Dm(−270) and R2Bm(−260R) (5′-CCAAGGGAGCGAGCTCC-3′) were used for second-round amplification. Only first-round PCR amplifications were conducted in screens for germ line events. To guard continually against possible PCR artifacts or DNA contamination, every fourth to seventh fly tested was a fly that had not been injected. For junction sequences, the second-round PCR products were cloned into mp18T2 (4), and individual clones were sequenced.
FIG. 2.
Diagram of the R2 insertion site and the location of PCR primers used to detect R2Bm insertions. Shown at the top of the figure is a diagram of the D. melanogaster rDNA unit with the location of the R1 and R2 insertion sites indicated. Shown at the bottom are diagrams of a complete HR4 sequence inserted into a 28S gene with or without an R1 insertion. Location and orientation of the PCR primers are indicated with the arrows. All primers were arranged as nested pairs to monitor somatic insertions, with only the external set used to identify germ line events. R2Bm primers are identified by their position relative to the 3′ end of the element; 28S primers are identified relative to the R2 insertion site, and R1Dm primers are identified relative to the 5′ end of a full-length element. Oligonucleotides oriented to prime opposite that of the 28S gene transcript are indicated with the letter R. Dark boxes, rRNA gene sequences; thin line, spacer regions of the rDNA unit; open boxes, R2Bm sequences; stippled box, endogenous R1Dm element.
RESULTS
Establishment of an in vivo integration system.
Our decision to establish a heterologous integration system, in which RNP complexes encoded by the B. mori R2 element are injected into embryos of D. melanogaster, was based on several perceived advantages. First, this system would enable us to take advantage of the procedures developed for the injection of P elements into the preblastoderm embryos of D. melanogaster (32, 35). Injection into the continuous cytoplasm at the posterior end of a preblastoderm stage embryo allows material to be incorporated into multiple embryonic cells, including the primordial germ cells, during subsequent cellularization of the embryo. This approach has enabled the development of a number of DNA-mediated transposable element transformation systems in addition to the P element (9, 24, 30). A second advantage of injecting B. mori R2 components into D. melanogaster is that PCR primers and hybridization probes readily differentiate all segments of a B. mori R2 insertion (R2Bm) from the endogenous D. melanogaster R2 elements (R2Dm) already present in the ribosomal DNA (rDNA) locus. A homologous system would be difficult to assay since no strains of B. mori or D. melanogaster have been identified that lack endogenous R2 elements (17, 38). Finally, the critical 28S gene sequences required for R2 endonuclease recognition are identical in all eukaryotes. Therefore, if injected B. mori RNP complexes functioned in the heterologous cells of D. melanogaster, then they are likely to function in the cells of most other insects.
To conduct the embryo injections, R2 protein purified from E. coli (27) was mixed with a four- to sixfold molar excess of R2 RNA, concentrated, and injected into the posterior end of preblastoderm embryos (see Materials and Methods). Our initial injection experiments utilized RNAs corresponding to the 800-nucleotide (nt) RNA transcript (HR4) used in our in vitro studies (27). This RNA contains the final 550 nt encoding the R2 ORF and the 250-nt 3′ UTR. We assayed for the insertion of R2Bm sequences into the 28S rRNA genes of the surviving animals by PCR amplification as shown in Fig. 2. For each amplification, one primer was complementary to sequences within the 3′ UTR of the R2Bm element, while the second primer was complementary to DNA sequences either upstream or downstream of the 28S gene insertion site. Only a small fraction of the cells in the surviving animals contained the R2Bm integrations; thus, the assays used two rounds of PCR amplification with nested primers. Both larval and adult stages were initially tested for R2Bm integrations. Because insertions were identified in the greatest number and most reliably with adult DNA, all studies in this report are based on the assays with adult tissues.
We initially assayed for the 3′ junctions of R2Bm elements with the D. melanogaster 28S gene sequences. Approximately 15% of the rDNA units in the w1118 strain of D. melanogaster used in the injections already contain an R2 element insertion and were unavailable for integration. The remaining D. melanogaster rDNA units are approximately equally divided between rDNA units with no insertions and units already containing an R1 element insertion. R1 is a distantly related non-LTR retrotransposon which inserts into the 28S rRNA gene at a site 74 bp downstream of the R2 insertion site (16). Adult flies which survived the injections were tested with PCR primers complementary to the downstream 28S gene sequences and with primers complementary to sequences near the 5′ end of the R1Dm elements (Fig. 2).
Typical results from two of our initial injection experiments are shown in Table 1. In experiment 1, the injected HR4 RNA was synthesized without a 5′ methyl-G cap. Of the 32 adults tested, 13 (41%) contained R2Bm sequences inserted into rDNA units without R1 insertions, and three (9%) had R2Bm insertions in rDNA units already containing an R1 element. As will be discussed below, the R2Bm insertions generated in this injection experiment contained extensive deletions of their 5′ sequences, presumably due to the degradation inside the embryo of the uncapped HR4 RNA. Therefore, in all subsequent experiments, RNA was synthesized containing methyl-G-capped 5′ ends (see Materials and Methods). When capped HR4 RNA-R2 protein complexes were injected (Table 1, experiment 2), the frequency of adults with insertions was found to be 53% for uninserted units and 30% for R1 inserted units. A significant fraction of the adults contained R2Bm insertions in rDNA units both with and without an R1 insertion, indicating that multiple R2Bm insertions had occurred in the same embryo.
TABLE 1.
R2Bm integration frequencies^a
Expt | RNA | No. of flies tested | Uninserted 28S genes (no.) | Uninserted 28S genes (%) | R1-inserted 28S genes (no.) | R1-inserted 28S genes (%) | Total (no.) | Total (%)
---|---|---|---|---|---|---|---|---
1 | HR4* | 32 | 13 | 41 | 3 | 9 | 15 | 47
2 | HR4 | 30 | 16 | 53 | 9 | 30 | 19 | 63
3 | HR4 (no protein) | 40 | 0 | 0 | 0 | 0 | 0 | 0
4 | Vector RNA | 40 | 0 | 0 | 0 | 0 | 0 | 0
5 | HR4/10R | 62 | 24 | 39 | 17 | 27 | 28 | 45
6 | HR4/1A | 48 | 29 | 60 | 20 | 42 | 29 | 60
^a R2 insertions were assayed separately in 28S genes with no previous insertion (uninserted) or in 28S genes already containing an R1 insertion (R1-inserted). The total percentage is not additive because some flies had insertions into both classes of rDNA units. Vector-specific primers were used in experiment 4. *, injections were carried out with R2 protein complexed with 5′-methyl-G-capped RNA in all experiments except experiment 1.
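The percentages in Table 1, and the reason the totals are not additive, can be reproduced from the counts alone. The sketch below recomputes each percentage and uses inclusion-exclusion to derive the number of flies that scored positive in both classes of rDNA unit; that derived column is our own calculation and does not appear in the table, and the variable names are illustrative.

```python
# (experiment, flies tested, uninserted-positive, R1-inserted-positive, total positive)
TABLE_1 = [
    (1, 32, 13, 3, 15),
    (2, 30, 16, 9, 19),
    (3, 40, 0, 0, 0),
    (4, 40, 0, 0, 0),
    (5, 62, 24, 17, 28),
    (6, 48, 29, 20, 29),
]


def pct(count: int, n: int) -> int:
    """Percentage of tested flies, rounded to the nearest integer."""
    return round(100 * count / n)


summary = {}
for expt, n, uninserted, r1_inserted, total in TABLE_1:
    # Flies positive in both classes are counted twice in the class columns,
    # so: both = uninserted + R1-inserted - total (inclusion-exclusion).
    both = uninserted + r1_inserted - total
    summary[expt] = {
        "pct": (pct(uninserted, n), pct(r1_inserted, n), pct(total, n)),
        "both": both,
    }
    print(expt, summary[expt])
```

For experiment 2, for example, 16 + 9 - 19 = 6 flies carried insertions in both classes of unit, in line with the observation that multiple R2Bm insertions occurred in the same embryo.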
To demonstrate that the injected R2 protein was responsible for initiating the integration events scored, HR4 RNA was injected in the absence of R2 protein (experiment 3). No R2Bm integrations were observed. To demonstrate that only R2 RNA sequences are utilized in the TPRT reactions catalyzed by this injected R2 protein, the R2 protein was complexed with RNA corresponding to vector sequences (experiment 4). Again no integrations were observed, indicating that both the injected R2 protein and R2 RNA are required for the in vivo integration events.
Isolation of germ line events.
To ensure that the DNA products derived from the PCR amplifications represented stable R2Bm integrations into the rDNA locus of D. melanogaster, injected embryos which survived to adulthood (G0 generation) were mass mated, and individual females were allowed to lay eggs. From each of these lines, 20 G1 progeny (two groups of 10 flies) were tested by PCR for R2Bm integrations. Because all cells of a positive G1 fly should contain the R2Bm integration, only single-round PCR assays were necessary. The frequency of germ line insertions was variable and difficult to estimate because of the low numbers of animals tested. In our most extensive experiment, five G1 positives were obtained from the progeny of 56 G0 females. However, based on several smaller data sets this experiment probably yielded a higher than average number of germ line events. From lines in which positive G1 progeny were found, additional sibling pairs were allowed to lay eggs before being tested by PCR. The G2 progeny of the G1-positive pairs were pairwise mated and monitored by PCR, and the process was continued until homozygous lines were obtained. By this means, a total of 11 germ line events have been recovered as separate lines from a total of 14 lineages initially scored as positive. Southern blots of three representative lines probed with the 3′ UTR sequences of R2Bm are shown in Fig. 3A. Two of the lines (HR4-1 and 10R36) gave rise to restriction fragments predicted for an 800-nt segment of the R2Bm element inserted into a 28S gene of D. melanogaster (Fig. 3B). Southern and subsequent PCR analysis of the third line (10R21) indicated that insertion of this R2Bm sequence into the R2 site of a 28S gene was accompanied by a large deletion of the upstream rDNA sequences (Fig. 3B). As will be discussed below, deletions of upstream 28S gene sequences were seen in many of the integration events resulting from HR4 RNA. 
Of the 11 germ line insertions analyzed to date, 5 were found to be insertions into the rDNA locus on the X chromosome and 6 were inserted into the rDNA locus on the Y chromosome.
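Because the germ line frequency rests on small numbers, it may help to see how wide the uncertainty is. The sketch below computes a Wilson score interval for the most extensive experiment (5 positives among the progeny of 56 G0 females). The Wilson interval is our choice of method, not one used in the original analysis, and treating each G0 female as an independent trial is an assumption.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


point_estimate = 5 / 56
low, high = wilson_interval(5, 56)
print(f"germ line frequency: {point_estimate:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

The interval spans several-fold around the point estimate, which illustrates why the frequency is described as difficult to estimate from the animals tested.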
FIG. 3.
Genomic blot analysis of three germ line events resulting from the HR4 injections compared to wild type (w1118). (A) Approximately 2 μg of adult DNA was digested with the restriction enzyme indicated, and the DNA was separated on a 1% agarose gel, transferred to nitrocellulose paper, and hybridized with a 32P-labeled probe. The probe was the R2Bm 3′ UTR generated from clone pBmR2-249 and labelled by random priming (25). DNA size markers are shown at the left. (B) Diagram of the D. melanogaster rDNA repeat, with the location of the EcoRI (E), ClaI (C), and HindIII (H) restriction sites as shown. Lines 10R36 and HR4-1 gave rise to restriction fragments consistent with the insertion of an 800-bp fragment of R2Bm into a typical rDNA unit. Line 10R21 gave rise to restriction fragments indicating that a large deletion of the rDNA unit had accompanied the insertion (stippled box below the diagram).
R2Bm insertions outside the 28S genes were not recovered.
While the endogenous insertion of R2 elements is extremely specific, several R2 elements in the B. mori genome have been found outside the rDNA units (38). In most instances these insertions have found a target site with sequence identity to the 28S target site. The sequences of these non-rDNA R2 insertions revealed numerous substitutions and deletions compared to the uniform population of R2 elements in the rDNA loci, indicating that they represent infrequent events that slowly accumulate mutations over time. Because the high concentrations of R2 RNP complexes in our injection experiments might result in a lower specificity for the 28S gene insertion site, we attempted to identify R2Bm integration events that had occurred outside the rDNA units. The insertion of elements outside the rDNA locus will only be a complication if they occurred in lineages used for future transcription and retrotransposition analyses; thus, we conducted this search in the flies used to isolate the germ line events described in the previous section. The G1 progeny of the 56 G0 females were screened with PCR primers that were both complementary to the injected sequences. Positive products were obtained only in those lines already known to contain insertions within the rDNA locus. Southern blots of these lines revealed only single insertions in the rDNA loci (see Fig. 3).
We conclude that the R2 injections give rise to the frequent integration of R2Bm sequences into the 28S genes of D. melanogaster, with few (if any) insertions occurring outside the rDNA units. As will be shown below, the insertion mechanism used in these germ line events appears to be similar to that used in the somatic events scored in the G0 flies. Because of the greater difficulty in isolating large numbers of germ line events, most of the integrations characterized in this report are somatic events isolated from the adult tissues of the injected animals.
3′ junctions of the integrated R2Bm elements.
To determine the exact location of the R2Bm insertions within the 28S gene we separately cloned and sequenced the amplified DNA corresponding to the 3′ junction of R2Bm elements from individual flies. A total of 39 3′ junctions obtained from four different injections of HR4 RNA are shown in Fig. 4A. Most junctions were of somatic R2Bm insertion events into previously uninserted rDNA units; however, insertions into rDNA units already containing an R1Dm insertion (*), as well as several germ line insertion events (+), are also included. R2 elements in all species insert into the identical site in the 28S gene (3), which corresponds to the initial nick observed in the in vitro TPRT reaction (27). The 28S sequence immediately downstream of this site is TAGCCAA. As shown in Fig. 4A, all R2Bm insertions obtained from our injections occurred within 2 bp of this site. Twelve insertions were precisely at the R2 site, six occurred 2 bp upstream of the R2 site, and twenty-one insertions occurred 2 bp downstream of this site. It should be noted that because of the ambiguity in the origin of an A nucleotide, these latter insertions could also be interpreted as 1 bp downstream of the R2 site.
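The distribution of insertion positions described above can be tallied directly. This short sketch uses only the counts stated in the text; offsets are expressed relative to the R2 site, with the ambiguous downstream class recorded at +2 bp (it could equally be read as +1 bp, as noted above). The variable names are ours.

```python
from collections import Counter

# Offset (bp) of the 3' junction relative to the R2 site -> number of junctions.
junction_offsets = Counter({0: 12, -2: 6, +2: 21})

total_junctions = sum(junction_offsets.values())
fractions = {offset: n / total_junctions for offset, n in junction_offsets.items()}

for offset in sorted(fractions):
    print(f"{offset:+d} bp: {junction_offsets[offset]:2d} junctions ({fractions[offset]:.0%})")
print("total:", total_junctions)
```

The three classes account for all 39 sequenced junctions, with the downstream class making up just over half of them.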
FIG. 4.
R2Bm 3′ junctions obtained with the various HR4 RNA templates. (A) Junctions obtained with HR4 RNA. (B) Junctions obtained when the HR4 RNA contained downstream 28S sequences (HR4/10R). (C) Junctions obtained when the HR4 RNA was missing the last three nucleotides (HR4/1A). Shown at the top of each panel is the 3′ end of the RNA used in the injection, with the dashed vertical line indicating the normal border between R2 and 28S gene sequences. At the bottom of each panel are the junction sequences derived from randomly selected animals which gave rise to PCR products. 28S gene sequences are shown to the right of the solid vertical line, R2Bm sequences are shown to the left of the vertical line, with shaded nucleotides representing sequences not predicted on the basis of simple reverse transcription of the RNA. Three junctions in panel C are derived from initiations within the HR4 R2 template. The number of animals with each junction are shown at the right. ∗, R2Bm elements inserted into rDNA units already containing an R1 element; +, R2Bm junctions obtained from germ line events.
All 39 sequenced integration events indicated that reverse transcription initiated near the 3′ end of the HR4 RNA. The HR4 RNA used in the injection experiments was designed to end with the sequence GAAAA. However, the T7 RNA polymerase used to generate this RNA by runoff transcription frequently adds an additional A; thus, about half of the RNA molecules end in GAAAAA (27). As shown in Fig. 4A, most of the integrations end with four or five As, suggesting that TPRT began at the 3′-terminal nucleotide of the HR4 template. In the remaining insertions, reverse transcription either started 1 to 3 bp internal to the 3′ end or contained additional A nucleotides not present at the 3′ end of the HR4 RNA template. The sequence variation at these 3′ junctions is highly similar to that seen in our in vitro TPRT reactions (27). The additional nucleotides are believed to be a result of nontemplated additions by the R2 RT to the target site before the enzyme fully engages the HR4 RNA template.
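The reasoning above sorts 3′ junctions by the number of terminal A residues on the R2 side. A minimal sketch of that sorting rule follows. The function names and category labels are hypothetical, the inputs are illustrative strings rather than sequences from Fig. 4, and the rule ignores nontemplated nucleotides other than A.

```python
def trailing_a_count(r2_side: str) -> int:
    """Number of consecutive A residues at the 3' end of the R2 side of a junction."""
    return len(r2_side) - len(r2_side.rstrip("A"))


def classify_3prime_end(r2_side: str) -> str:
    """Classify a junction by its A run, given templates ending GAAAA or GAAAAA."""
    n = trailing_a_count(r2_side)
    if n in (4, 5):
        # Matches one of the two template ends. A run of four could also arise
        # from a GAAAAA template primed 1 nt internally; the two are not separable.
        return "terminal initiation"
    if n < 4:
        return "internal initiation"
    return "additional A residues"


# Illustrative inputs only: the two template ends plus a shorter and a longer run.
for example in ("GAAAA", "GAAAAA", "GAA", "GAAAAAAA"):
    print(example, "->", classify_3prime_end(example))
```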
Our previous in vitro data indicated that the efficiency and accuracy of the TPRT reaction are significantly affected by changes at the extreme 3′ end of the RNA template. Addition of 28S rRNA sequences downstream of the R2 sequences reduced the efficiency of the reaction but increased the accuracy with which the reverse transcriptase can initiate synthesis of the cDNA at the precise R2-28S gene junction (26). Deletion of nucleotides from the end of the HR4 template also reduced the efficiency of the TPRT reaction, and nearly all products contained nontemplated nucleotides at their 3′ junctions. To determine whether the injection assay would give analogous results in vivo, two RNA templates were tested. One template contained 10 nt of downstream 28S gene sequences (HR4/10R), while the second lacked the last three nucleotides of the HR4 RNA (HR4/1A). As shown in Table 1, the efficiency at which we were able to recover R2 integrations by using these RNAs (experiments 5 and 6) was similar to that obtained with HR4 RNA. The 3′ junction sequences obtained from these injections are shown in Fig. 4B and C. The sequences at these junctions were the same as those seen in our in vitro assay (see reference 26; Fig. 3). In the case of the HR4/10R construct, 20 of 23 sequenced events contained a precise 3′ junction. In the case of the HR4/1A construct, almost all integrated products contained nontemplated additions varying from 1 to 10 nt in length. These nontemplated nucleotides were usually homopolymeric runs of As or Ts. Three clones derived from the HR4/1A RNA had initiated reverse transcription from a location within the HR4 template. These junctions are very similar to those seen in our in vitro assays with HR4/1A (see reference 25; Fig. 4).
We conclude that the initiation of the TPRT reactions in our embryo injections was similar to that scored in vitro with the same RNA templates. In both assays, the addition of downstream 28S gene sequences increased the accuracy of the TPRT reaction, while nontemplated nucleotides were added if the R2 RNA template was deleted at its 3′ end. The major difference between the in vitro and in vivo assays was that in the in vivo assays cleavage of the target DNA frequently occurred 2 bp upstream and 2 (or 1) bp downstream of the target site defined by endogenous R2 elements in arthropods (3). We have not detected significant cleavage at these upstream or downstream sites in vitro.
5′ junctions of the integrated R2Bm elements.
In the TPRT reactions characterized in vitro we did not observe attachment of the synthesized cDNA to the DNA upstream of the target site (27). Our embryo injection system, on the other hand, requires that the integrated products survive until assayed either 10 days later in adults or weeks later in the progeny of these adults. Therefore, some means of attaching the 5′ end of the R2Bm sequence to the upstream target sequences must have occurred or the chromosome would have been lost. We cloned the 5′ junctions of the R2Bm elements by using the nested PCR primers shown in Fig. 2. Because R1 insertions are located downstream of the R2 site, all R2Bm insertions could be assayed with the same set of upstream 28S gene primers.
Analysis of the 5′ junctions from one of our initial injections with uncapped HR4 RNA (Table 1, experiment 1) revealed only PCR products approximately 300 bp in length, which was much shorter than the 800-bp fragment expected for the integration of a complete HR4 reverse transcript (Fig. 2). A summary of the sequence of four of these short PCR products is shown in Fig. 5A. The amplified DNAs correspond to insertions of R2Bm sequences containing only the 3′ UTR of the element. Because our in vitro results suggested that, once initiated, the R2 reverse transcriptase rapidly extends to the end of the RNA template, we assumed that the injected HR4 RNA had been largely degraded by cellular nucleases with only the 3′ UTR protected by the bound R2 protein (25). To better stabilize the injected RNA within the embryo, we synthesized HR4 RNA containing 5′ methyl-G caps (Table 1, experiment 2). The majority of the 5′-junction PCR products obtained with this capped RNA were also approximately 300 bp in length. However, in about 30% of the animals with R2Bm insertions, PCR fragments were seen that varied from 600 to 800 bp. The sequences of the longer PCR products from 11 such animals, as well as shorter PCR products sometimes found in these same animals, are shown in Fig. 5B. The 5′ ends of three germ line events are also shown in Fig. 5B. It should be noted that, while we have often sequenced both the 5′ and 3′ junctions of R2Bm insertions from the same fly, because multiple insertions can occur in these flies, only in the case of the germ line events do we know which junctions represent the ends of the same insertion.
FIG. 5.
R2Bm 5′ junctions obtained with the HR4 RNA injections. (A) Junctions obtained with uncapped RNA. (B) Junctions obtained with capped RNA. Almost all junctions contained deletions of both the 28S and R2Bm sequences. Boxes to the left of the vertical lines represent the 28S gene deletions, with the precise length of each deletion indicated by the numbers to the right of each box. Boxes to the right of the vertical lines represent the lengths of the R2Bm insertions, with shaded areas representing the 3′ UTR. The exact lengths of the R2Bm deletions are indicated by the numbers within or to the left of these boxes. Extra nucleotides at the junction that were not derived from either the 28S gene or HR4 RNA are given between the vertical lines. +, junctions obtained from three germ line events.
Only 2 of the 19 R2Bm insertions shown in Fig. 5 represented events resulting from the integration of a complete copy of the HR4 RNA template at the target site. The remaining insertions contained either short (2- to 142-bp) or long (422- to 550-bp) deletions of the HR4 sequences. 5′ truncations are common in R2 elements of certain species and in general represent one of the most diagnostic features of non-LTR retrotransposable elements. However, the 5′ junctions generated by our injection experiments were not like those seen in endogenous R2 elements found in either B. mori or D. melanogaster (3, 10). In particular, the R2Bm integrations derived from the injections contained large deletions of the upstream 28S gene sequences. Deletions of the 28S gene are detected at the 5′ end of endogenous R2 elements, but these deletions are usually only a few base pairs in length and seldom extend to more than 30 bp (3, 10). Indeed, the sample of 28S deletions presented in Fig. 5 represents an underestimate of the deletions associated with R2Bm insertions, since approximately one-fourth of the injected animals that contained an R2Bm insertion based on the 3′ junction PCR assays did not give rise to an amplifiable 5′ junction. In these instances, it is likely that the 28S deletion that accompanied the insertion included the region 270 bp upstream of the insertion site that anneals to the PCR primer. The deletions of upstream sequences accompanying the germ line events were similar to the somatic events shown in Fig. 5B (three sequences indicated with a “+”). In the cases of the other eight germ line events, PCR assays indicated that five contained deletions of at least 80 bp and three had deletions more than 270 bp, with the largest deletion including almost an entire rDNA unit (see Fig. 3).
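The size reasoning used above to interpret the 5′-junction PCR products can be written out as simple arithmetic. The following is a minimal sketch, not the assay itself: it assumes that the full-length product is 800 bp and that the primer-annealing region lies about 270 bp upstream of the insertion site (both values taken from the text), and that deletions simply subtract from the product length. The function and parameter names are hypothetical, and the exact primer positions of Fig. 2 are not reproduced.

```python
from typing import Optional

# Values taken from the text; treated here as approximate constants.
FULL_LENGTH_PRODUCT_BP = 800      # product expected for a complete HR4 reverse transcript
PRIMER_REGION_UPSTREAM_BP = 270   # primer-annealing region upstream of the insertion site


def predicted_product_bp(del_28s: int, del_r2: int, extra_nt: int = 0) -> Optional[int]:
    """Predict the 5'-junction PCR product size (hypothetical helper).

    del_28s  -- bp deleted from the upstream 28S gene
    del_r2   -- bp missing from the 5' end of the inserted R2Bm sequence
    extra_nt -- nontemplated nucleotides at the junction
    Returns None when the 28S deletion reaches the primer-annealing
    region, in which case no 5' junction would be amplified.
    """
    if min(del_28s, del_r2, extra_nt) < 0:
        raise ValueError("lengths must be non-negative")
    if del_28s >= PRIMER_REGION_UPSTREAM_BP:
        return None
    return FULL_LENGTH_PRODUCT_BP - del_28s - del_r2 + extra_nt


if __name__ == "__main__":
    print(predicted_product_bp(0, 0))      # complete insertion
    print(predicted_product_bp(20, 500))   # long R2Bm deletion, short 28S deletion
    print(predicted_product_bp(300, 10))   # primer site lost
```

Under these assumptions, an insertion retaining little more than the 3′ UTR yields a product near 300 bp, and any 28S deletion that extends into the primer region produces no 5′-junction product even though the 3′ junction can still be detected.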
Based on these results, we suggest that the R2 protein in our injection assay is incapable of attaching the 5′ end of the HR4 sequence to the upstream target DNA. The junctions we observed presumably resulted from DNA repair processes ligating the free ends of the broken chromosome. We suggest that the long (420- to 575-bp) deletions of R2Bm sequences result from degradation of the injected RNA, while the shorter (2- to 150-bp) deletions of R2Bm sequences occur after reverse transcription, just prior to ligation of the two chromosomal ends.
Injections with a mini-R2Bm–28S cotranscript.
RNA transcripts of R2 elements are rare and appear to be cotranscribed with the 28S genes (21). Consistent with a cotranscription model, we have been unable to identify a promoter at the 5′ end of R2 elements (11). We therefore generated a pBluescript construct which enabled us to synthesize a model 28S-R2 cotranscript for injection into embryos. A full-length cotranscript would be nearly 8,000 nt in length and unlikely to fold into its proper conformation in vitro. Therefore, we generated a mini-R2Bm–28S construct (Fig. 6A) containing 480 nt of the upstream 28S gene, the 450-nt 5′ UTR and 250-nt 3′ UTR of the R2Bm element, and 80 nt of the downstream 28S gene. The flanking 28S gene sequences in this construct comprise domain V of the 28S rRNA (14). Based on the predicted secondary structure of this domain, the 28S sequences downstream of the R2 insertion site should base pair with sequences near the 5′ end of the transcript. It should be noted that these flanking 28S sequences were derived from B. mori. The level of nucleotide identity between the 28S genes of D. melanogaster and B. mori exceeds 95% in the regions immediately upstream and downstream of the R2 insertion site, as well as at the extreme 5′ end of domain V. However, the region from 170 to 400 nt upstream of the R2 site corresponds to the expansion region of domain V, where the B. mori and D. melanogaster 28S sequences exhibit little sequence similarity. The PCR primers used to amplify the 5′ R2Bm junctions generated in this injection experiment are either outside of domain V (first primer) or within the expansion domain (second primer) and thus would only anneal to the D. melanogaster 28S gene (see Fig. 2).
FIG. 6.
Full-length R2Bm 5′ junctions obtained with the mini-R2Bm–28S RNA injections. (A) Diagram of the injected mini-R2Bm–28S RNA. The RNA contains that portion of the B. mori 28S sequence corresponding to domain V (14), as well as the entire 5′ and 3′ UTRs of an R2Bm element. Domain V of the B. mori and D. melanogaster 28S genes are highly similar in sequence except for the expansion domain ending 170 bp upstream of the R2 insertion site (dotted region). (B) Diagram of the full-length R2Bm junctions obtained with the mini-R2Bm–28S RNA. Only the 170-bp region immediately upstream of the R2 insertion site is diagrammed, with the nucleotide differences between the B. mori and D. melanogaster sequences indicated. For each R2 element the origin of the 28S sequences upstream of the insertion is indicated as being derived from B. mori (RNA template, open box) or D. melanogaster (DNA target, shaded box). One of the insertions contained a 23-bp tandem duplication of the 28S sequences immediately upstream of the insertion site. The only other variation at the 5′ junction was a 1-bp deletion of the 28S gene in one insertion (solid box, three lines from the bottom). The number of animals containing each insertion type is indicated by the numbers on the right.
Of the 324 flies injected with the 5′-capped mini-R2Bm–28S RNA and tested by PCR for R2Bm 5′ junctions, 108 were scored as positive. Of these 108 flies, 20 gave rise to PCR products that were the size predicted for a precise integration of a full-length mini-R2 element into the target site. The sequences of the 17 insertions recovered by cloning are summarized in Fig. 6B. With two exceptions, these sequences revealed precise 5′ junctions with no deletions in either the R2Bm or the 28S gene sequences. Of the two exceptions, one contained a 23-bp duplication of the upstream 28S gene (discussed below), and the second contained a 1-bp deletion common in endogenous insertions of R2Bm (3). The 170-bp region of the 28S gene immediately upstream of the R2 insertion site contains eight nucleotide differences between B. mori and D. melanogaster. These eight differences enabled us to determine whether the 28S sequences upstream of these R2Bm insertions corresponded to the D. melanogaster 28S target sequences or were derived by reverse transcription from the B. mori template RNA. Seven of the 17 full-length insertions contained upstream 28S gene sequences corresponding solely to those derived from the D. melanogaster target site (shaded bar). The remaining 10 clones contained various levels of upstream B. mori sequences introduced via the injected RNA. Thus, some means of recombination between the B. mori and D. melanogaster 28S gene sequences appears to be responsible for the attachment of the 5′ end of the R2Bm insertions.
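The logic of using the eight species-specific nucleotide differences to assign the origin of the upstream 28S sequence can be illustrated with a short sketch. The positions and bases below are hypothetical placeholders (the actual differences are diagrammed in Fig. 6B and are not listed in the text), and the function names and category labels are ours, not part of the original analysis.

```python
from typing import Dict, List, Tuple

# Hypothetical diagnostic sites, ordered from far upstream to near the insertion site:
# (distance upstream of the R2 insertion site in bp, B. mori base, D. melanogaster base)
DIAGNOSTIC_SITES: List[Tuple[int, str, str]] = [
    (160, "A", "G"), (140, "C", "T"), (120, "G", "A"), (95, "T", "C"),
    (70, "A", "T"), (50, "G", "C"), (30, "C", "A"), (10, "T", "G"),
]


def call_origins(clone_bases: Dict[int, str]) -> str:
    """Return one letter per site: B (B. mori RNA template), D (D. melanogaster target), ? (neither)."""
    calls = []
    for position, bm_base, dm_base in DIAGNOSTIC_SITES:
        base = clone_bases[position]
        if base == bm_base:
            calls.append("B")
        elif base == dm_base:
            calls.append("D")
        else:
            calls.append("?")
    return "".join(calls)


def classify(calls: str) -> str:
    """Summarize the pattern of origins along the 170-bp upstream region."""
    if "?" in calls:
        return "ambiguous"
    if set(calls) == {"D"}:
        return "target only"
    if set(calls) == {"B"}:
        return "template throughout"
    switches = sum(1 for a, b in zip(calls, calls[1:]) if a != b)
    return "single switch" if switches == 1 else "patchwork"


if __name__ == "__main__":
    clone = {160: "G", 140: "T", 120: "A", 95: "C", 70: "A", 50: "G", 30: "C", 10: "T"}
    calls = call_origins(clone)
    print(calls, classify(calls))
```

In this scheme a clone whose sites switch once from D. melanogaster to B. mori identity on approaching the insertion site corresponds to an insertion carrying a variable length of template-derived 28S sequence, whereas a clone scored as "target only" corresponds to the seven insertions with solely D. melanogaster upstream sequence.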
Minimum sequences required for precise integrations: the micro-R2Bm–28S construct.
If the attachment of the 5′ end of the R2Bm sequences to the upstream DNA target involves simple recombination between the homologous sequences present at the target site and on the synthesized cDNA, then this attachment may not require that the RNA template mimic a complete 28S-R2 cotranscript. The minimum sequences required of the injected RNA to support insertion could include only a short region of the upstream 28S sequences at the 5′ end to stimulate recombination and the 3′ UTR of the R2 element to bind the R2 protein and initiate TPRT. To test this possibility the micro-R2Bm–28S construct shown in Fig. 7A was generated. This construct contained only the 170-nt 28S upstream region that is similar between B. mori and D. melanogaster, 50 nt of the R2Bm 5′ UTR, and the 3′ UTR sequences.
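For orientation, the segment lengths stated for the two constructs can be tallied as follows. This is a simple sketch that counts only the segments named in the text (the mini-R2Bm–28S segments are those given in the preceding section); any vector-derived nucleotides or other sequences present on the actual transcripts are not included, and the dictionary and function names are hypothetical.

```python
from typing import Dict

# Segment lengths in nucleotides, as stated in the text.
MINI_R2BM_28S: Dict[str, int] = {
    "upstream 28S": 480,
    "R2Bm 5' UTR": 450,
    "R2Bm 3' UTR": 250,
    "downstream 28S": 80,
}
MICRO_R2BM_28S: Dict[str, int] = {
    "upstream 28S": 170,
    "R2Bm 5' UTR (first 50 nt)": 50,
    "R2Bm 3' UTR": 250,
}


def total_nt(construct: Dict[str, int]) -> int:
    """Sum the listed segment lengths of a construct."""
    return sum(construct.values())


if __name__ == "__main__":
    for name, construct in (("mini", MINI_R2BM_28S), ("micro", MICRO_R2BM_28S)):
        print(name, total_nt(construct), "nt")
```

On this accounting the listed segments of the micro construct amount to well under half the length of those of the mini construct, with the difference coming from the nonsimilar expansion region, most of the 5′ UTR, and the downstream 28S sequence.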
FIG. 7.
Full-length R2Bm 5′ junctions obtained with the micro-R2Bm–28S RNA injections. (A) Diagram of the injected micro-R2Bm–28S RNA. The RNA contains 170 nt of upstream B. mori 28S gene sequences, 50 nt from the 5′ end of the R2 5′ UTR, and the 250-nt 3′ UTR. (B) Diagram of the full-length R2Bm junctions obtained with the micro-R2Bm–28S RNA. As in Fig. 6, the 170-bp region upstream of the R2 insertion site is shown, with the nucleotide differences between the B. mori and D. melanogaster sequences indicated. An additional G residue difference was introduced via PCR during the generation of the micro-R2Bm–28S clone. For each R2 element the origin of the 28S sequences upstream of the insertion is indicated as being derived from B. mori (RNA template, open box) or D. melanogaster (DNA target, shaded box). One of the 5′ junctions contained 25 bp of 28S downstream sequences at the insertion site, presumably resulting in a 25-bp target site duplication. The number of animals containing each insertion type is indicated by the numbers on the right.
The results from the injection of the capped micro-R2Bm–28S construct were similar in many respects to those seen with the mini-R2Bm–28S construct. Of the 272 flies tested by PCR for the presence of a 5′ junction, 70 were scored as positive for a somatic integration event. This 28% recovery of flies with insertions (and the 33% recovery with the mini-R2Bm–28S RNA in the preceding section) is somewhat lower than that observed for the HR4 constructs shown in Table 1. It is not known whether this reduced recovery is due to minor variations in the activity of the isolated protein or to the greater difficulty of the R2 protein transporting an RNA containing 28S gene sequences back into the nucleolus of a cell. In spite of this lower total level of insertion, the fraction of full-length insertions was much higher. Remarkably, of the 70 integration events scored, 58 appeared to be full-length events. The near absence of R2Bm insertions containing large 5′ truncations suggestive of RNA degradation also suggests that the micro-R2Bm–28S RNA was more stable within the injected embryo. This greater resistance to degradation may be the result of the binding of cellular proteins (e.g., ribosomal proteins), the secondary structure of the upstream 28S sequences, or the binding of this RNA by the R2 protein itself.
The PCR products from 24 individual flies scored as having a full-length insert were cloned and sequenced. As shown in Fig. 7B the sequences of these junctions revealed that all but one were precise. The single junction that was not precise contained a duplication of downstream 28S gene sequences at the 5′ end of the insertion. This insertion contained in effect a 25-bp target site duplication. There were two differences between the junctions generated by the mini-R2Bm–28S construct (Fig. 6) and those generated by the micro-R2Bm–28S construct (Fig. 7). First, in the micro-R2Bm–28S injection the 170-bp upstream region in 8 of the 24 junctions contained the entire or nearly entire sequence derived from the injected RNA. In the case of the mini-R2Bm–28S construct, only two of the insertions contained this entire region from the injected RNA. Second, two clones had short regions of the B. mori 28S sequence embedded within the target 28S gene sequences from the D. melanogaster target site. While the number of cases is low, these differences suggest a slightly different mechanism for the recombination that attaches the 5′ end of the cDNA resulting from these RNAs to the target site.
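The proportions being compared for the two constructs follow directly from the reported counts. The sketch below simply recomputes them; the helper name is hypothetical, and no statistical test is implied.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    # mini-R2Bm-28S RNA: 108 of 324 flies positive; 20 of the 108 apparently full length
    print("mini, flies positive:", percent(108, 324))
    print("mini, full length among positives:", percent(20, 108))
    # micro-R2Bm-28S RNA: 58 of 70 scored events apparently full length
    print("micro, full length among positives:", percent(58, 70))
    # junctions carrying the entire (or nearly entire) 170-bp region from the injected RNA
    print("micro, entire region from RNA:", percent(8, 24))
    print("mini, entire region from RNA:", percent(2, 17))
```

With 24 and 17 sequenced junctions, respectively, the difference in the last two proportions rests on small samples, which is why it is treated only as suggestive.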
DISCUSSION
We have shown that RNP complexes derived from the R2 element of B. mori are capable of inserting into their 28S rRNA gene target site when injected into Drosophila preblastoderm embryos. The efficiency of these integrations was such that 28 to 63% of the surviving animals contained R2Bm insertions in their adult tissues. The high efficiency of these R2 integrations was also evident in the significant fraction of flies recovered with more than one somatic insertion. The remarkable specificity of endogenous R2 retrotransposition events for the 28S gene was duplicated in these injection experiments since no evidence was found for R2Bm insertions outside the 28S target sites.
Selection and cleavage of the target site.
Is the ability of the R2 protein to recognize the DNA sequences of the target site the only factor responsible for this insertion specificity? When incubated in large excess (∼100-fold), purified R2 protein is capable of finding the 28S target in purified genomic DNA (39). While similar molar excesses of protein to target site are probably present in our injection experiments, it is not known whether the chromosomal protein bound to genomic DNA within a cell helps or hinders the DNA recognition process. Secondly, the long history of R2 insertion into arthropod rDNA units (2) suggests that they may have evolved, or acquired, a nucleolar localization signal. Such signals have been found for both proteins and small RNAs that must enter the nucleolus (22, 36). The presence of a nucleolar localization signal can be tested in future injection experiments by introducing the 28S gene target site outside the nucleolus and monitoring R2Bm insertions at these sites relative to the endogenous 28S target sites within the nucleolus. The effects of chromatin proteins on the ability of the R2 endonuclease to find the target site can also be assayed in vitro by comparing the ability of purified or assembled chromatin to serve as a target site relative to that of naked DNA.
The 3′ junctions of the R2Bm insertions resulting from the injections of various RNA templates (Fig. 4) indicate that the initial steps in the in vivo R2 integration reaction are similar to those defined for the in vitro TPRT reaction. If the 3′ end of the injected RNA template terminates at the junction of the R2 element with the 28S gene (HR4), then reverse transcription usually initiates at the 3′-terminal nucleotide of the RNA (27). However, both initiation at internal nucleotides and the addition of nontemplated nucleotides are sometimes found. If the injected RNA template contains downstream 28S gene sequences at its 3′ end (HR4/10R), then the reverse transcriptase initiates synthesis at the precise junction between the R2 element and 28S gene sequences (26). Finally, if the injected RNA template has the last few nucleotides of the R2 element deleted (HR4/1A), then reverse transcriptase adds a series of nontemplated nucleotides to the target site before engaging the RNA template (25).
The only qualitative difference detected between the initial steps of the in vitro and in vivo TPRT reactions was the surprising finding that, in vivo, cleavage of the DNA strand used to prime reverse transcription was sometimes 2 bp upstream or 1 to 2 bp downstream of the site of cleavage seen in vitro. (This was not due to a sequence difference between the 28S genes of B. mori and D. melanogaster since these sequences are identical in the 50-bp region surrounding the target site.) We have sequenced the 3′ junctions of nearly 200 R2 elements from a variety of arthropods and have found that all but one corresponded to the precise location of the initial cleavage site as determined in our in vitro experiments (3, 7, 23). The single exception was found in D. sechellia, where an insertion was 2 bp upstream of the normal site (7). The most likely explanation for the variation in cleavage site generated by the injected RNP is the chromatin structure of the DNA target site within the embryo. It is possible that when endogenous R2 elements retrotranspose, the rDNA units are in a more accessible (transcriptionally active) conformation. The injections are conducted prior to cellularization of the embryo, when the rDNA units are not being transcribed (28). We can approach the question of the effects of chromatin structure on the selection of the cleavage site by again using isolated chromatin, or short DNA molecules with positioned nucleosomes, as targets for our in vitro TPRT reaction.
Attachment of the R2 5′ end.
While the steps involved in the integration of the 3′ end of R2 have been elucidated, little is known about the means by which the 5′ end of non-LTR elements is attached to the DNA target during retrotransposition. The only clear characteristic of these insertions is that the elements are frequently 5′ truncated. Such 5′ truncations suggest either that attachment occurs before the reverse transcriptase has reached the 5′ end of the RNA template or that reverse transcription can occur from incomplete RNA transcripts. The uniformity of the 5′ junctions of endogenous R2 elements varies considerably in different species. In species such as B. mori, most junctions are identical. In Drosophila species the R2 5′ junctions frequently contain short deletions of the 28S gene and/or the insertion of nontemplated sequences (3, 10). 5′-truncated R2 elements have junctions similar to those of full-length elements, suggesting that no specific 5′ sequences are required for integration (10). Therefore, the initial R2 RNA templates we tested in our injection assay contained only sequences corresponding to the 3′ end of the R2 element, i.e., those sequences required for RNA binding and the initiation of TPRT. Unfortunately, the large 28S gene deletions associated with these RNAs were not characteristic of the endogenous R2 elements of D. melanogaster or of B. mori. We suggest that these 5′ junctions were generated by a cellular repair process which reseals the cleaved chromosome. Extensive studies have been conducted of the cellular responses to chromosome breakage induced by rare cutting endonucleases (reviewed in references 13 and 18). In a variety of organisms, including D. melanogaster (1), the cut appears to be enlarged and repaired by direct end joining. The 5′ ends of the R2Bm insertions resulting from the HR4 RNA injections appear to be similar to those reported in these studies.
Because R2 elements are believed to be expressed as cotranscripts with the 28S gene, we also tested R2 RNA templates that contained flanking 28S gene sequences. A construct that might mimic an R2 transcript embedded in the 28S rRNA sequence, mini-R2Bm–28S RNA (Fig. 6), resulted in about 20% of the insertions with precise 5′ junctions. Based on the nucleotide differences between the D. melanogaster 28S gene and the B. mori 28S sequences on the RNA template, it was shown that in these precise junctions a variable length of the upstream 28S gene sequences was derived from the RNA template rather than the target DNA sequences (Fig. 6). One possible model to explain this “coconversion” is diagrammed in Fig. 8A. Homologous recombination is postulated to occur between the newly made cDNA and the target DNA sequences upstream of the insertion site. For simplicity, the recombination in Fig. 8A is shown occurring on the target DNA after second-strand cleavage. Alternatively this recombination could occur at the chromosomal nick before second-strand cleavage (see Fig. 1). Because the 28S sequences in the mini-R2Bm–28S RNA included an expansion segment that is not similar between B. mori and D. melanogaster, only the 170-bp region immediately upstream of the cleavage site can anneal to the cDNA. DNA repair of this annealed complex, which can undergo branch migration over this 170-bp region, would explain the coconversion-like gradient seen.
FIG. 8.
Possible recombination models to explain the 5′ junctions generated by the mini- and micro-R2Bm–28S RNAs. (A) Reverse transcription proceeds beyond the 170-bp region of similarity between the RNA and the DNA target in the mini-R2Bm–28S RNA. The cDNA strand anneals to the upper strand of the target DNA. Branch migration within the 170-bp region of similarity accounts for the gradient of B. mori sequences accompanying the insertions seen in Fig. 6. (B) Reverse transcription proceeds to the end of the micro-R2Bm–28S RNA. The cDNA strand displaces the lower DNA strand of the target site, accounting for the greater proportion of insertions with the entire 170-bp region derived from B. mori (Fig. 7). DNA repair can use either the upper or lower strand to repair the mismatched bases, accounting for the patches seen in some insertions.
A more minimal RNA construct was next tested containing only the R2 3′ UTR and the 170 bp of upstream 28S gene sequences, which are similar between B. mori and D. melanogaster (Fig. 7). This micro-R2Bm–28S RNA resulted in more than 80% precise 5′ junctions, suggesting it is a better template for these injection experiments. As shown in Fig. 8B, the higher frequency of precise junctions and a greater percentage of these junctions that contained the entire upstream region derived from B. mori can be explained if the newly made cDNA strand from this RNA template more stably displaces the lower DNA strand of the upstream target site. The patchwork pattern of D. melanogaster and B. mori 28S sequences seen in two of the integrations (Fig. 7B) clearly adds support to a heteroduplex intermediate between the cDNA and the upstream DNA target site. It is not clear whether the R2 protein actively contributes to the formation of such a heteroduplex.
Do endogenous R2 retrotransposition events also use homologous recombination to attach the 5′ end of the element to the target site? Homologous recombination like that shown in Fig. 8A or B would lead to 5′ sequence uniformity; thus, it can explain those species such as B. mori with uniform 5′ junctions (see reference 3 for various examples). What about species with R2 elements containing variable 5′ junctions? Based on an analysis of R2 elements in those species with extensive 5′ variation, we have postulated a template jump of the reverse transcriptase from the R2 RNA template onto the 28S DNA at the cleavage site (3). Such template jumps can explain short deletions of the DNA target and insertion of nontemplated sequences at the precise R2-28S boundary. A template jump model also explains two other types of sequence duplications found at the 5′ junctions of R2 elements. If the template jump to the cleaved DNA target site occurs a short distance after the reverse transcriptase has passed the R2-28S junction of the RNA template, then a tandem duplication of the 28S gene is generated. We have detected such tandem duplications in several arthropod species, including 24- to 26-bp duplications in ca. 10% of the R2 elements in B. mori (3). The second type of sequence variation found associated with R2 insertions in certain species is target site duplications. To explain these duplications, we suggest that cleavage of the upper DNA strand can sometimes occur downstream of the lower-strand cleavage (instead of the normal location 2 bp upstream of the lower-strand site). A template switch onto a target site in which the upper strand is cleaved downstream of the insertion site would result in a target site duplication (3).
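The two kinds of duplication predicted by the template jump model can be made concrete with a toy string model. This is a sketch under simplifying assumptions, not a description of the enzymology: sequences are written in the sense (upper-strand) orientation, the normal 2-bp offset between the two cleavage sites is ignored, and all sequences, function names, and parameters are hypothetical.

```python
def precise_insertion(upstream: str, r2: str, downstream: str) -> str:
    """Insertion with a precise 5' junction: upstream 28S, R2, downstream 28S."""
    return upstream + r2 + downstream


def tandem_duplication(upstream: str, r2: str, downstream: str, read_through: int) -> str:
    """Template jump after the reverse transcriptase has copied `read_through` nt
    of 28S sequence past the R2-28S junction of a cotranscript.

    The copied 28S nucleotides are joined to the intact upstream target DNA,
    so the last `read_through` bp of the upstream 28S gene appear twice in tandem.
    """
    if not 0 <= read_through <= len(upstream):
        raise ValueError("read_through out of range")
    copied_28s = upstream[len(upstream) - read_through:]
    return upstream + copied_28s + r2 + downstream


def target_site_duplication(upstream: str, r2: str, downstream: str, offset: int) -> str:
    """Upper-strand cleavage `offset` bp downstream of the lower-strand site.

    The first `offset` bp of downstream 28S sequence remain attached to the
    upstream DNA and therefore flank the insertion on both sides.
    """
    if not 0 <= offset <= len(downstream):
        raise ValueError("offset out of range")
    return upstream + downstream[:offset] + r2 + downstream


if __name__ == "__main__":
    up, r2, down = "ACGTACGT", "xxxx", "TTGGCC"   # toy sequences; lowercase marks R2
    print(precise_insertion(up, r2, down))
    print(tandem_duplication(up, r2, down, read_through=3))
    print(target_site_duplication(up, r2, down, offset=2))
```

In this representation a read-through of 24 to 26 nt would reproduce the tandem 28S duplications described for B. mori, and a nonzero offset places downstream 28S sequence at the 5′ end of the insertion, as expected for a target site duplication.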
While the 5′ junctions generated with the mini- and micro-R2Bm–28S RNAs were extremely uniform, we did see examples of a 1-bp deletion (Fig. 6), a 23-bp tandem duplication of the 28S gene (Fig. 6), and a 25-bp target site duplication (Fig. 7). Whether this variation results from a template switch or another more complicated recombination-repair mechanism is not known.
We are hopeful that the R2 integration system developed in this report will enable us to study other aspects of the retrotransposition mechanism used by R2 and that it will eventually allow us to introduce engineered (activated) R2 elements into the rDNA loci of D. melanogaster and potentially other arthropod species. A major issue in these studies is whether the inserted R2 element will be transcribed. Northern blots of RNA isolated from our germ line events have indicated that, like endogenous R2 (and R1) insertions, R2Bm sequences inserted into the 28S gene of D. melanogaster are not transcribed (D. Eickbush, unpublished data). However, to maintain their numbers in a species, R2 elements need only be transcribed for short periods in the development of ovaries or testes and not even in each generation. We now should have the tools necessary to directly address these transcription questions.
ACKNOWLEDGMENTS
This work was supported by National Institutes of Health grant GM42790 to T.H.E.
We thank Robert Fleming for sharing his expertise on injections and fly development, Yi Gu for showing us how to conduct the injections, and especially Janet George and Karen Gentile for help in aligning embryos. We thank William Burke, Janet George, Cesar Perez-Gonzalez, Harmit Malik, and Jin Yang for comments and encouragement.
REFERENCES
- 1. Bellaiche Y, Mogila V, Perrimon N. I-SceI endonuclease, a new tool for studying DNA double-strand break repair mechanism in Drosophila. Genetics. 1999;152:1037–1044. doi: 10.1093/genetics/152.3.1037.
- 2. Burke W D, Malik H S, Lathe W C, Eickbush T H. Are retrotransposons long term hitchhikers? Nature. 1998;392:141–142. doi: 10.1038/32330.
- 3. Burke W D, Malik H S, Jones J P, Eickbush T H. The domain structure and retrotransposition mechanism of R2 elements are conserved throughout arthropods. Mol Biol Evol. 1999;16:502–511. doi: 10.1093/oxfordjournals.molbev.a026132.
- 4. Burke W D, Müller F, Eickbush T H. R4, a non-LTR retrotransposon specific to the large subunit rRNA gene of nematodes. Nucleic Acids Res. 1995;23:4628–4634. doi: 10.1093/nar/23.22.4628.
- 5. Cousineau B, Smith D, Lawrence-Cavanagh S, Mueller J E, Yang J, Mills D, Manias D, Dunny G, Lambowitz A M, Belfort M. Retrohoming of a bacterial group II intron: mobility via complete reverse splicing, independent of homologous DNA recombination. Cell. 1998;94:451–462. doi: 10.1016/s0092-8674(00)81586-x.
- 6. Dhellin O, Maestre J, Heidmann T. Functional differences between the human LINE retrotransposon and retroviral reverse transcriptases for in vivo mRNA reverse transcription. EMBO J. 1997;16:6590–6602. doi: 10.1093/emboj/16.21.6590.
- 7. Eickbush D G, Eickbush T H. Vertical transmission of the retrotransposable elements R1 and R2 during the evolution of the Drosophila melanogaster species subgroup. Genetics. 1995;139:671–684. doi: 10.1093/genetics/139.2.671.
- 8. Feng Q, Moran J V, Kazazian H H, Boeke J D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916. doi: 10.1016/s0092-8674(00)81997-2.
- 9. Garza D, Medhora M, Koga A, Hartl D L. Introduction of the transposable element mariner into the germline of Drosophila melanogaster. Genetics. 1991;128:303–310. doi: 10.1093/genetics/128.2.303.
- 10. George J A, Burke W D, Eickbush T H. Analysis of the 5′ junctions of R2 insertions with the 28S gene: implications for non-LTR retrotransposition. Genetics. 1996;142:853–863. doi: 10.1093/genetics/142.3.853.
- 11. George J A, Eickbush T H. Conserved features at the 5′ end of Drosophila R2 retrotransposable elements: implications for transcription and translation. Insect Mol Biol. 1999;8:3–10. doi: 10.1046/j.1365-2583.1999.810003.x.
- 12. Gloor G B, Engels W R. Single fly DNA preps for PCR. Drosophila Inform Service. 1992;71:148–149.
- 13. Haber J E. In vivo biochemistry: physical monitoring of recombination induced by site-specific endonucleases. Bioessays. 1995;17:609–620. doi: 10.1002/bies.950170707.
- 14. Hancock J M, Tautz D, Dover G A. Evolution of the secondary structures and compensatory mutations of the ribosomal RNAs of Drosophila melanogaster. Mol Biol Evol. 1988;5:393–414. doi: 10.1093/oxfordjournals.molbev.a040501.
- 15. Hazelrigg T, Levis R, Rubin G M. Transformation of white locus DNA in Drosophila: dosage compensation, zeste interaction, and position effects. Cell. 1984;36:469–481. doi: 10.1016/0092-8674(84)90240-x.
- 16. Jakubczak J L, Xiong Y, Eickbush T H. Type I (R1) and Type II (R2) ribosomal DNA insertions of Drosophila melanogaster are retrotransposable elements closely related to those of Bombyx mori. J Mol Biol. 1990;212:37–52. doi: 10.1016/0022-2836(90)90303-4.
- 17. Jakubczak J L, Zenni M K, Woodruff R C, Eickbush T H. Turnover of R1 (Type I) and R2 (Type II) retrotransposable elements in the ribosomal DNA of Drosophila melanogaster. Genetics. 1992;131:129–142. doi: 10.1093/genetics/131.1.129.
- 18. Jasin M. Genetic manipulation of genomes with rare-cutting endonucleases. Trends Genet. 1996;12:224–228. doi: 10.1016/0168-9525(96)10019-6.
- 19. Jurka J. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proc Natl Acad Sci USA. 1997;94:1872–1877. doi: 10.1073/pnas.94.5.1872.
- 20. Kazazian H H, Moran J V. The impact of L1 retrotransposition on the human genome. Nat Genet. 1998;19:19–24. doi: 10.1038/ng0598-19.
- 21. Kidd S J, Glover D M. Drosophila melanogaster ribosomal DNA containing type II insertions is variably transcribed in different strains and tissues. J Mol Biol. 1981;151:645–662. doi: 10.1016/0022-2836(81)90428-9.
- 22. Lange T S, Ezrokhi M, Borovjagin A V, Rivera-Leon R, North M T, Gerbi S A. Nucleolar localization elements of Xenopus laevis U3 small nucleolar RNA. Mol Biol Cell. 1998;9:2973–2985. doi: 10.1091/mbc.9.10.2973.
- 23. Lathe W C, III, Eickbush T H. A single lineage of R2 retrotransposable elements is an active, evolutionarily stable component of the Drosophila rDNA locus. Mol Biol Evol. 1997;14:1232–1241. doi: 10.1093/oxfordjournals.molbev.a025732.
- 24. Loukeris T G, Livadaras I, Arca B, Zabalou S, Savakis C. Gene transfer into the medfly, Ceratitis capitata, with a Drosophila hydei transposable element. Science. 1995;270:2002–2005. doi: 10.1126/science.270.5244.2002.
- 25. Luan D D, Eickbush T H. RNA template requirements for target DNA-primed reverse transcription by the R2 retrotransposable element. Mol Cell Biol. 1995;15:3882–3891. doi: 10.1128/mcb.15.7.3882.
- 26. Luan D D, Eickbush T H. Downstream 28S gene sequences on the RNA template affect the choice of primer and the accuracy of initiation by the R2 reverse transcriptase. Mol Cell Biol. 1996;16:4726–4734. doi: 10.1128/mcb.16.9.4726.
- 27. Luan D D, Korman M H, Jakubczak J L, Eickbush T H. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell. 1993;72:595–605. doi: 10.1016/0092-8674(93)90078-5.
- 28. McKnight S L, Miller O L, Jr. Ultrastructural patterns of RNA synthesis during early embryogenesis of Drosophila melanogaster. Cell. 1976;8:305–319. doi: 10.1016/0092-8674(76)90014-3.
- 29. Moran J V, Holmes S E, Naas T P, DeBerardinis R J, Boeke J D, Kazazian H H. High frequency retrotransposition in cultured mammalian cells. Cell. 1996;87:917–927. doi: 10.1016/s0092-8674(00)81998-4.
- 30. O'Brochta D A, Warren W D, Saville K J, Atkinson P W. Hermes, a functional non-drosophilid insect gene vector from Musca domestica. Genetics. 1996;142:907–914. doi: 10.1093/genetics/142.3.907.
- 31.Ramos R G P, Grimwade B G, Wharton K A, Scottgale T N, Artavanis-Tsakonas S. Physical and functional definition of the Drosophila Notch locus by P element transformation. Genetics. 1989;123:337–348. doi: 10.1093/genetics/123.2.337.
- 32.Rubin G M, Spradling A C. Genetic transformation of Drosophila with transposable element vectors. Science. 1982;218:348–353. doi: 10.1126/science.6289436.
- 33.Smit A F A. The origin of interspersed repeats in the human genome. Curr Opin Genet Dev. 1996;6:743–748. doi: 10.1016/s0959-437x(96)80030-x.
- 34.Spradling A C. P element-mediated transformation. In: Roberts D B, editor. Drosophila: a practical approach. Oxford, United Kingdom: IRL Press; 1986. pp. 175–197.
- 35.Spradling A C, Rubin G M. Transposition of cloned P elements into Drosophila germ line chromosomes. Science. 1982;218:341–347. doi: 10.1126/science.6289435.
- 36.Tuteja R, Tuteja N. Nucleolin: a multifunctional major nucleolar phosphoprotein. Crit Rev Biochem Mol Biol. 1998;33:407–436. doi: 10.1080/10409239891204260.
- 37.Xiong Y, Eickbush T H. Functional expression of a sequence-specific endonuclease encoded by the retrotransposon R2Bm. Cell. 1988;55:235–246. doi: 10.1016/0092-8674(88)90046-3.
- 38.Xiong Y, Burke W D, Jakubczak J L, Eickbush T H. Ribosomal DNA insertion elements R1Bm and R2Bm can transpose in a sequence specific manner to locations outside the 28S genes. Nucleic Acids Res. 1988;16:10561–10573. doi: 10.1093/nar/16.22.10561.
- 39.Yang J, Eickbush T H. RNA-induced changes in the activity of the endonuclease encoded by the R2 retrotransposable element. Mol Cell Biol. 1998;18:3455–3465. doi: 10.1128/mcb.18.6.3455.
- 40.Yang J, Malik H S, Eickbush T H. Identification of the endonuclease domain encoded by R2 and other site-specific, non-long terminal repeat retrotransposable elements. Proc Natl Acad Sci USA. 1999;96:7847–7852. doi: 10.1073/pnas.96.14.7847.
- 41.Zimmerly S, Guo H, Perlman P S, Lambowitz A M. Group II intron mobility occurs by target DNA-primed reverse transcription. Cell. 1995;82:545–554. doi: 10.1016/0092-8674(95)90027-6.