Abstract
The hammerhead ribozyme was originally discovered in subviral plant pathogens and was subsequently also found in a few other genomic locations. Using a secondary structure–based descriptor, we have searched publicly accessible sequence databases for new examples of type III hammerhead ribozymes. The more than 60,000 entries fulfilling the descriptor were filtered with respect to folding and stability parameters that were experimentally validated. This resulted in a set of 284 unique motifs, of which 124 represent database entries of known hammerhead ribozymes from subviral plant pathogens and A. thaliana. The remaining 160 are novel ribozyme candidates in 50 different eukaryotic genomes. With a few exceptions, the ribozymes were found either in repetitive DNA sequences or in introns of protein coding genes. Our data, which are complementary to a study by De la Peña and García-Robles in 2010, indicate that the hammerhead is the most abundant small endonucleolytic ribozyme, which, in view of the lack of sequence conservation beyond the essential nucleotides, has likely evolved independently in different organisms.
Keywords: catalytic RNA, hammerhead ribozyme, thermodynamics, bioinformatics
INTRODUCTION
Autocatalytic cleavage of RNA comes naturally in different flavors, and is executed by the hairpin, hammerhead, hepatitis delta virus (HDV), or the Neurospora VS ribozymes and by the bacterial glmS motif (Prody et al. 1986; Sharmeen et al. 1988; Hampel and Tritz 1989; Wu et al. 1989; Saville and Collins 1990; Winkler et al. 2004). Despite the substantial differences in the groups involved in catalysis (Fedor 2009), all of the family members promote the scission of a conventional 5′-3′ phosphodiester bond (or the reverse ligation reaction).
As the first of these ribozymes, the hammerhead was found in subviral plant pathogens (Prody et al. 1986), where its self-cleavage reaction serves in the processing of multimeric replication intermediates. Formally, the structure of the hammerhead ribozyme (Fig. 1A) is composed of three helices that surround a catalytic core of 11 conserved nucleotides (Uhlenbeck 1987). In addition to these requirements of a minimal hammerhead, all natural examples of this motif feature loops or bulges in the helical arms I and II. These interact and thereby allow for catalysis at physiological Mg2+ concentrations (De la Peña et al. 2003; Khvorova et al. 2003; Penedo et al. 2004).
The hammerhead ribozyme is also involved in the processing of transcripts of satellite DNA in eukaryotic genomes, for example, in the newt Triturus carnifex (Epstein and Gall 1987), Dolichopoda cave crickets (Rojas et al. 2000), or the blood fluke Schistosoma mansoni (Ferbeyre et al. 1998). More recently, examples of the hammerhead were also found at discrete loci, for example, in Arabidopsis thaliana (Przybilski et al. 2005) or as a split version in several mammals (Martick et al. 2008).
In this study, we have set out to investigate the presence of this ribozyme in the publicly accessible sequence databases, for which we have used thermodynamic parameters to identify likely true catalysts in a large data set of potential ribozyme motifs. In line with recent publications (De la Peña and García-Robles 2010a,b), we find that this catalytic RNA motif is considerably more widespread than anticipated originally.
RESULTS AND DISCUSSION
Searching ribozymes in genomic sequences
In view of the significant increase of sequence availability, we have set out to search the databases for novel examples of the hammerhead ribozyme. To do so, we have designed a program pipeline that allows for the automated identification of sequences from a large data set, which fulfill the criteria defined by a given descriptor. Here, we have used a descriptor (Fig. 1B,C) that identifies hammerhead ribozymes of type III, with the open end in helix III and covalently closed helices I and II. These ribozymes predominantly have been found so far in the subviral plant pathogens (Hammann and Westhof 2007). With that descriptor, we have screened the DNA sequences of 174 eukaryotic, 616 bacterial, and 1371 viral entities from the Ensembl (release 53) databases and several other sources. As a control, we have included the currently 2877 entries in the subviral database (Rocheleau and Pelchat 2006) that contains more than 100 unique type III hammerhead sequences from satellite RNAs and viroids (Tabler and Tsagris 2004).
Within an overall sequence space of 1.52 × 10¹¹ nucleotides (nt), we have identified more than 60,000 primary hits that correspond to the conditions set by our descriptor. The vast majority (95.1%) of these originated from eukaryotic genomes, some 1.8% from bacterial, and a negligible fraction (0.03%) from viral genomes. The remaining 3% originate from viroids and plant virus satellite RNAs that served as an internal positive control. This indicates that the descriptor maps the motif properly.
For any descriptor used in such a process, the likelihood of finding a hit in a random sequence can be calculated as the product of the likelihoods of its conserved sequences and helical elements, assuming an equal distribution of the four nucleotides. The descriptor used here features a minimal pattern that statistically is expected to occur once in a random sequence of 2.7 × 10¹¹ nt. Because of the variable loop nucleotides and helix lengths, however, the overall likelihood of finding a hit is higher. To assess this, we have generated several random sequences of 2.2 × 10⁸ nt, which we searched with our descriptor (Fig. 1B,C). On average, we find eight hits per strand, indicating an overall likelihood of 3.63 × 10⁻⁸ per nucleotide. In the searched sequence space, we thus find about 10 times more hits than the 5518 hits that are predicted for a random occurrence.
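The arithmetic behind this estimate can be retraced in a few lines. The sketch below uses only the figures quoted above; the variable and function names are ours, and the number of primary hits is taken as the lower bound of 60,000, so the resulting ratio is a lower bound as well.

```python
# Figures quoted in the text; all names are illustrative only.
RANDOM_SEQ_LENGTH_NT = 2.2e8        # length of each random test sequence
HITS_PER_STRAND = 8                 # average descriptor hits per strand
SEARCHED_SPACE_NT = 1.52e11         # total sequence space that was screened
PRIMARY_HITS_LOWER_BOUND = 60_000   # "more than 60,000" primary hits


def hit_likelihood(hits: float, length_nt: float) -> float:
    """Empirical per-nucleotide likelihood of a descriptor hit."""
    return hits / length_nt


def expected_random_hits(likelihood: float, space_nt: float) -> float:
    """Hits expected by chance in a sequence space of the given size."""
    return likelihood * space_nt


likelihood = hit_likelihood(HITS_PER_STRAND, RANDOM_SEQ_LENGTH_NT)
expected = expected_random_hits(likelihood, SEARCHED_SPACE_NT)
enrichment = PRIMARY_HITS_LOWER_BOUND / expected

print(f"likelihood per nt:    {likelihood:.3g}")
print(f"expected random hits: {expected:.0f}")
print(f"observed / expected:  at least {enrichment:.1f}")
```

With the unrounded likelihood, the expectation comes out at roughly 5527 hits; the value of 5518 quoted above corresponds to the rounded likelihood of 3.63 × 10⁻⁸. Either way, the observed number of primary hits exceeds the random expectation by about one order of magnitude.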
Filtering primary data
In order to identify likely hammerhead ribozyme candidates that will adopt the typical structure required for cleavage activity, we have used several filter steps (Fig. 2). To this end, we combined different programs in a pipeline that allowed us to automatically calculate folding energies of two secondary structures using Mfold (Zuker 2003) for each candidate: one in which the sequence was allowed to fold into the structure of minimum free energy ($\Delta G^{0}_{37^{\circ}\mathrm{C,free}}$), and one in which the sequence was constrained to fold into the hammerhead motif ($\Delta G^{0}_{37^{\circ}\mathrm{C,motif}}$) (Fig. 1B). In a first step, we removed from the large set of primary hits all sequences that displayed $\Delta G^{0}_{37^{\circ}\mathrm{C,free}}$ > −10 kcal/mol, as their helices would likely be too weak to maintain a folded structure. The resulting set of sequences has the potential to adopt the desired ribozyme structure, but no more than that. Thus, to identify those sequences that also will adopt this structure, we selected only those motifs that preferred to fold into the hammerhead ribozyme structure and thus conform to

$$\Delta\Delta G^{0} = \Delta G^{0}_{37^{\circ}\mathrm{C,motif}} - \Delta G^{0}_{37^{\circ}\mathrm{C,free}} \le 0\ \mathrm{kcal/mol} \qquad (1)$$
This resulted in 858 sequences that correspond to 284 unique motifs that are likely to be true hammerhead ribozymes. Of the latter, 124 are known hammerhead ribozymes, 122 unique motifs from subviral plant pathogens and two from A. thaliana (Prody et al. 1986; Przybilski et al. 2005). As the unique motifs that exist in the subviral plant pathogens were found, this observation indicates that the applied filtering steps are appropriate to identify true hammerhead ribozymes. In analogy, we propose that the remaining 160 unique motifs (Supplemental Table S1) are novel candidates of this catalytic RNA. They originate from 50 eukaryotic species (Table 1), next to three bacteria (Azorhizobium caulinodans, Chloroflexus aggregans, Mycobacterium vanbaalenii). Some of these had also been found in the recent study by De la Peña and García-Robles (2010b). To our surprise, no hammerhead ribozyme was identified in any of 1371 viruses.
TABLE 1.
In vitro kinetic analysis to validate the filtering steps
Although the observation that we can find known hammerhead ribozymes indicates already that the applied filter steps are appropriate, we have set out to validate the computational selection strategy experimentally. For this purpose, we have analyzed a total of 12 primary hits that we found in the genome of the clawed frog Xenopus tropicalis. All fulfilled the descriptor (Fig. 1B,C), but only half matched the filtering criteria (Fig. 2). Using recursive PCR, we created their DNA templates from synthetic DNA oligonucleotides. All 12 sequences were synthesized by in vitro transcription, and full-length transcripts were recovered from denaturing polyacrylamide gels.
Natural hammerhead ribozymes engage in tertiary interactions between the sequence elements in arms I and II (De la Peña et al. 2003; Khvorova et al. 2003; Penedo et al. 2004). Only when these interactions take place can catalytic activity be observed under physiological magnesium ion concentrations, in the low millimolar range. For this reason, we tested all 12 RNA species in controlled kinetic analyses at a Mg2+ ion concentration of 2 mM. Under these conditions, none of the six tested sequences with $K_{\mathrm{free}} > K_{\mathrm{motif}}$, that is, those that do not conform to Equation 1, displayed ribozyme cleavage activity (data not shown). Most of these sequences also had shown no sign of self-cleavage activity during the in vitro transcription, that is, at elevated Mg2+ ion concentrations, indicating that they genuinely do not adopt a hammerhead fold, as predicted by Mfold (Zuker 2003). In contrast to these differently folded sequences, the other six motifs, which all conform to Equation 1, showed in vitro ribozyme cleavage when tested under physiological Mg2+ ion concentrations, as exemplified for the motif Xetr8 (Fig. 3). This indicates that the applied filter steps are appropriate to separate sequences with catalytic activity from a large set of potential hammerhead ribozymes.
Annotation of the novel ribozyme sequences
An analysis of the genomic locations of the newly identified type III hammerhead ribozyme candidates indicates that they predominantly reside in repetitive DNA sequences, in intergenic regions, or, frequently, within introns of protein coding genes; for the majority of the latter, expression data are available (Supplemental Table S1). Organisms with a large number of hammerhead ribozymes within satellite DNA have been described before, as summarized in Przybilski et al. (2005), and since a substantial fraction of the new motifs was found in Hydra magnipapillata, the freshwater polyp appears to be a new member of this group.
A majority of the motifs identified here, however, has a single occurrence in a given genome only, either intergenically or residing in introns of protein coding genes (Supplemental Table S1). For the presence of hammerhead ribozymes in introns, cellular functions can be readily envisaged, as they might be involved in the nucleotide-exact processing of trans-acting regulatory RNA molecules, as has been shown, for example, for the processing of microRNAs from introns (Kim and Kim 2007). Similar to our observation, intronic hammerhead ribozymes were also reported recently by De la Peña and García-Robles (2010a).
Schistosomes are known to harbor hammerhead ribozymes in their repetitive satellite DNA sequences (Ferbeyre et al. 1998), similar to other organisms, like crickets and several amphibians, including newts. Those motifs, however, are of type I, which can be considered as a circular permutation of the ribozyme structure with a covalently closed helix III and an open helix I. The number of type I motifs in the blood fluke was recently shown to be enormously high (De la Peña and García-Robles 2010b). Together with that study, our data indicate that in S. mansoni and also Schistosoma japonicum, type I and III motifs can share the same genomic location. Both studies also indicate that several hammerhead ribozymes in intergenic regions can be associated with transposable elements.
Conclusions
For a long time, the small endonucleolytic ribozymes were considered to be an oddity, despite important evolutionary implications (Salehi-Ashtiani and Szostak 2001). In recent years, however, several studies have shown that such catalytic RNA motifs are considerably more widespread than anticipated, with HDV-like ribozymes discovered in humans (Salehi-Ashtiani et al. 2006) and, more recently, in a variety of other genomes (Webb et al. 2009). New examples of the hammerhead ribozyme were also found in a plant genome (Przybilski et al. 2005), and a split version was discovered in several mammals (Martick et al. 2008). Very recently, this has been dramatically extended by the finding of hammerhead ribozyme motifs in an unprecedented number of eukaryotic genomes (De la Peña and García-Robles 2010a,b; this study). This indicates that the hammerhead ribozyme motif is considerably more widely distributed in nature than was anticipated earlier, and that it is the most frequently occurring small nucleolytic ribozyme. Structure-based approaches have proven particularly useful, as has been shown also in earlier studies (Przybilski et al. 2005; Webb et al. 2009; De la Peña and García-Robles 2010b). The computational pipeline that comprises a series of filtering steps was important, as it allowed the separation of likely ribozyme motifs from the background of large data sets. An alignment of the 160 motifs identified in this study displayed an enormous heterogeneity, in terms of both sequence and length (data not shown). Together with the variability of their genomic locations, this supports the view that hammerhead ribozymes have multiple origins, as postulated from in vitro selection experiments (Salehi-Ashtiani and Szostak 2001).
MATERIALS AND METHODS
Bioinformatics
To search for novel hammerhead ribozyme motifs, we have downloaded the available genomes from the Ensembl (release 53) databases (Flicek et al. 2010) and selected other publicly accessible databases, including the subviral database (Rocheleau and Pelchat 2006) and the databases of NCBI. To allow for a secondary structure–based search, RNAbob 2.1 (Eddy 2005) was used with a descriptor designed for a type III hammerhead (Fig. 1C). To gain an overview of the distribution of these motifs, RNAbob was chosen because it allows for fast searches, although it will not find motifs embedded within one another. For each of the more than 60,000 primary hits that were found by this approach, two ΔG values were calculated with Mfold 3.5 (Zuker 2003): either without constraints to obtain $\Delta G^{0}_{37^{\circ}\mathrm{C,free}}$ or with the calculated constraints that force the sequences into the given structure to obtain $\Delta G^{0}_{37^{\circ}\mathrm{C,motif}}$. The following adjustments were then carried out. As a basal filter, we discarded any entry with a $\Delta G^{0}_{37^{\circ}\mathrm{C,free}}$ larger than −10 kcal/mol. We then added a fixed term of −0.5 kcal/mol to all $\Delta G^{0}_{37^{\circ}\mathrm{C,motif}}$ values to compensate for calculation fluctuations of Mfold, calculated the term $\Delta\Delta G^{0} = \Delta G^{0}_{37^{\circ}\mathrm{C,motif}} - \Delta G^{0}_{37^{\circ}\mathrm{C,free}}$, and considered only entries with $\Delta\Delta G^{0} \le 0$ kcal/mol. Finally, all hits that have the same sequence within an organism were filtered out to obtain a unique result set of candidates.
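The filtering logic described above can be sketched as follows. This is a minimal illustration and not the pipeline that was actually used: the `Hit` record, the function names, and the example values are hypothetical, the two ΔG values are assumed to have been computed beforehand with Mfold, and the final step is interpreted as keeping one copy of each sequence per organism.

```python
from dataclasses import dataclass

BASAL_CUTOFF = -10.0  # kcal/mol; entries with a larger dG_free are discarded
MFOLD_OFFSET = -0.5   # kcal/mol; fixed term added to every dG_motif value


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one primary hit (energies in kcal/mol)."""
    organism: str
    sequence: str
    dg_free: float   # unconstrained minimum free energy
    dg_motif: float  # energy when constrained to the hammerhead fold


def delta_delta_g(hit: Hit) -> float:
    """ddG = (dG_motif + offset) - dG_free, as described in the text."""
    return (hit.dg_motif + MFOLD_OFFSET) - hit.dg_free


def filter_hits(hits):
    """Apply the basal filter, the ddG <= 0 criterion, and deduplication."""
    seen = set()
    kept = []
    for hit in hits:
        if hit.dg_free > BASAL_CUTOFF:   # basal stability filter
            continue
        if delta_delta_g(hit) > 0:       # free fold preferred over the motif
            continue
        key = (hit.organism, hit.sequence)
        if key in seen:                  # same sequence, same organism
            continue
        seen.add(key)
        kept.append(hit)
    return kept


# Made-up example values, for illustration only.
example_hits = [
    Hit("organism A", "SEQ1", -12.0, -11.8),  # kept: ddG is about -0.3
    Hit("organism A", "SEQ1", -12.0, -11.8),  # dropped: duplicate
    Hit("organism A", "SEQ2", -8.0, -8.0),    # dropped: dG_free > -10
    Hit("organism B", "SEQ3", -15.0, -13.0),  # dropped: ddG is about +1.5
    Hit("organism B", "SEQ1", -12.0, -11.8),  # kept: different organism
]
candidates = filter_hits(example_hits)
print([(h.organism, h.sequence) for h in candidates])
```

Applying the −0.5 kcal/mol term inside `delta_delta_g` keeps the raw Mfold values untouched in the record, so the tolerance can be changed without recomputing the folds.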
In vitro transcription
Templates for in vitro transcription were generated synthetically using partially overlapping primers in recursive PCR reactions, as described earlier (Przybilski et al. 2005). To ensure efficient in vitro transcription, GGG, GGC, or GCG was inserted directly after the T7 promoter sequence. Transcription reactions were carried out using T7 RNA polymerase (Milligan et al. 1987) in 40 mM Tris/HCl (pH 8.0), 5 mM MgCl2, 2 mM spermidine, and 0.01% Triton X-100. ATP, GTP, and CTP were present in the reaction at a final concentration of 0.5 mM each, and UTP was at a final concentration of 0.1 mM, supplemented with traces of [α-³²P]UTP. Full-length transcripts were purified from 15% denaturing polyacrylamide gels containing 7 M urea and were visualized by PhosphorImaging. RNA was eluted from gel slices in 40% formamide, 0.7% SDS/1×TE by shaking overnight. Eluted RNA was extracted with phenol, precipitated, washed, and finally dissolved in 10 mM Tris/HCl (pH 7.5) and 25 mM NaCl, supplemented with 0.1 mM EDTA to prevent self-cleavage.
Cleavage assay of the hammerhead ribozyme sequences
The samples were refolded by incubation for 2 min at 80°C, followed by snap-cooling on ice. Cleavage reactions were started at 37°C by adding MgCl2 to an effective concentration of 2 mM, taking into consideration the EDTA concentration of the refolding step. Reactions were stopped at suitable time points by adding RNA loading solution (95% formamide and 50 mM EDTA at pH 8.0). After heating to 95°C, the samples were resolved on 15% polyacrylamide gels containing 7 M urea. Cleavage products were quantified by PhosphorImager analysis. $k_{\mathrm{obs}}$ values were calculated by fitting the data to the equation $F(t) = F_{0} + F_{\infty}(1 - e^{-kt})$ (Stage-Zimmermann and Uhlenbeck 1998).
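As an illustration of how such a fit can be carried out, the following sketch estimates F0, F∞, and k from a cleavage time course using only the Python standard library. It is not the fitting procedure that was used for the data in this study: for each trial value of k the model is linear in F0 and F∞, so these two are obtained by ordinary least squares, and k is found by repeatedly narrowing a log-spaced grid. The function names, the search range for k, and the time course are assumptions made for this example, with time taken in minutes.

```python
import math


def model(t, f0, f_inf, k):
    """Fraction cleaved: F(t) = F0 + F_inf * (1 - exp(-k * t))."""
    return f0 + f_inf * (1.0 - math.exp(-k * t))


def linear_fit_for_k(times, fractions, k):
    """For a fixed k, least-squares F0 and F_inf plus the residual sum of squares."""
    xs = [1.0 - math.exp(-k * t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(fractions) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, fractions))
    f_inf = sxy / sxx
    f0 = mean_y - f_inf * mean_x
    sse = sum((model(t, f0, f_inf, k) - y) ** 2 for t, y in zip(times, fractions))
    return f0, f_inf, sse


def fit_kobs(times, fractions, k_min=1e-4, k_max=10.0, rounds=6, points=50):
    """Estimate (F0, F_inf, k) by narrowing a log-spaced grid of k values."""
    lo, hi = math.log(k_min), math.log(k_max)
    best = None
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        for i in range(points):
            k = math.exp(lo + i * step)
            f0, f_inf, sse = linear_fit_for_k(times, fractions, k)
            if best is None or sse < best[3]:
                best = (f0, f_inf, k, sse)
        centre = math.log(best[2])
        lo, hi = centre - step, centre + step  # zoom in around the best k
    return best[:3]


# Synthetic, noise-free time course (made-up parameters, time in minutes).
times = [0, 0.5, 1, 2, 4, 8, 16, 32]
fractions = [model(t, 0.02, 0.7, 0.3) for t in times]
f0_fit, f_inf_fit, k_fit = fit_kobs(times, fractions)
print(f"F0 = {f0_fit:.3f}, F_inf = {f_inf_fit:.3f}, k_obs = {k_fit:.3f} per min")
```

A general nonlinear least-squares routine would serve equally well; the grid-based variant is shown only because it needs no external libraries. The time course must contain at least two distinct time points, otherwise the linear step is undefined.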
SUPPLEMENTAL MATERIAL
Supplemental material can be found at http://www.rnajournal.org.
ACKNOWLEDGMENTS
We thank the Heisenberg research group ribogenetics and T.R. Cech for comments on the manuscript and H.U. Göringer, W. Nellen, and F. Pfeifer for support. This work was supported by Deutsche Forschungsgemeinschaft (grant HA3459/3) and by a Heisenberg stipend to C.H. (grant HA3459/5).
Footnotes
Article published online ahead of print. Article and publication date are at http://www.rnajournal.org/cgi/doi/10.1261/rna.2429911.
REFERENCES
- De la Peña M, Gago S, Flores R 2003. Peripheral regions of natural hammerhead ribozymes greatly increase their self-cleavage activity. EMBO J 22: 5561–5570
- De la Peña M, Garcia-Robles I 2010a. Intronic hammerhead ribozymes are ultraconserved in the human genome. EMBO Rep 11: 711–716
- De la Peña M, Garcia-Robles I 2010b. Ubiquitous presence of the hammerhead ribozyme motif along the tree of life. RNA 16: 1943–1950
- Eddy S 2005. RNABOB 2.1. ftp://selab.janelia.org/pub/software/rnabob/
- Epstein LM, Gall JG 1987. Self-cleaving transcripts of satellite DNA from the newt. Cell 48: 535–543
- Fedor MJ 2009. Comparative enzymology and structural biology of RNA self-cleavage. Annu Rev Biophys 38: 271–299
- Ferbeyre G, Smith JM, Cedergren R 1998. Schistosome satellite DNA encodes active hammerhead ribozymes. Mol Cell Biol 18: 3880–3888
- Flicek P, Aken BL, Ballester B, Beal K, Bragin E, Brent S, Chen Y, Clapham P, Coates G, Fairley S, et al. 2010. Ensembl's 10th year. Nucleic Acids Res 38: D557–D562
- Hammann C, Westhof E 2007. Searching genomes for ribozymes and riboswitches. Genome Biol 8: 210 doi: 10.1186/gb-2007-8-4-210
- Hampel A, Tritz R 1989. RNA catalytic properties of the minimum (-)sTRSV sequence. Biochemistry 28: 4929–4933
- Hertel KJ, Pardi A, Uhlenbeck OC, Koizumi M, Ohtsuka E, Uesugi S, Cedergren R, Eckstein F, Gerlach WL, Hodgson R, et al. 1992. Numbering system for the hammerhead. Nucleic Acids Res 20: 3252
- Khvorova A, Lescoute A, Westhof E, Jayasena SD 2003. Sequence elements outside the hammerhead ribozyme catalytic core enable intracellular activity. Nat Struct Biol 10: 708–712
- Kim YK, Kim VN 2007. Processing of intronic microRNAs. EMBO J 26: 775–783
- Martick M, Scott WG 2006. Tertiary contacts distant from the active site prime a ribozyme for catalysis. Cell 126: 309–320
- Martick M, Horan LH, Noller HF, Scott WG 2008. A discontinuous hammerhead ribozyme embedded in a mammalian messenger RNA. Nature 454: 899–902
- Milligan JF, Groebe DR, Witherall GW, Uhlenbeck OC 1987. Oligoribonucleotide synthesis using T7 RNA polymerase and synthetic DNA templates. Nucleic Acids Res 15: 8783–8798
- Penedo JC, Wilson TJ, Jayasena SD, Khvorova A, Lilley DM 2004. Folding of the natural hammerhead ribozyme is enhanced by interaction of auxiliary elements. RNA 10: 880–888
- Prody GA, Bakos JT, Buzayan JM, Schneider IR, Bruening G 1986. Autolytic processing of dimeric plant virus satellite RNA. Science 231: 1577–1580
- Przybilski R, Hammann C 2007. The tolerance to exchanges of the Watson/Crick basepair in the hammerhead ribozyme core is determined by surrounding elements. RNA 13: 1625–1630
- Przybilski R, Gräf S, Lescoute A, Nellen W, Westhof E, Steger G, Hammann C 2005. Functional hammerhead ribozymes naturally encoded in the genome of Arabidopsis thaliana. Plant Cell 17: 1877–1885
- Rocheleau L, Pelchat M 2006. The Subviral RNA Database: A toolbox for viroids, the hepatitis delta virus and satellite RNAs research. BMC Microbiol 6: 24
- Rojas AA, Vazquez-Tello A, Ferbeyre G, Venanzetti F, Bachmann L, Paquin B, Sbordoni V, Cedergren R 2000. Hammerhead-mediated processing of satellite pDo500 family transcripts from Dolichopoda cave crickets. Nucleic Acids Res 28: 4037–4043
- Salehi-Ashtiani K, Szostak JW 2001. In vitro evolution suggests multiple origins for the hammerhead ribozyme. Nature 414: 82–84
- Salehi-Ashtiani K, Luptak A, Litovchick A, Szostak JW 2006. A genomewide search for ribozymes reveals an HDV-like sequence in the human CPEB3 gene. Science 313: 1788–1792
- Saville BJ, Collins RA 1990. A site-specific self-cleavage reaction performed by a novel RNA in Neurospora mitochondria. Cell 61: 685–696
- Sharmeen L, Kuo MY, Dinter-Gottlieb G, Taylor J 1988. Antigenomic RNA of human hepatitis delta virus can undergo self-cleavage. J Virol 62: 2674–2679
- Stage-Zimmermann TK, Uhlenbeck OC 1998. Hammerhead ribozyme kinetics. RNA 4: 875–889
- Tabler M, Tsagris M 2004. Viroids: Petite RNA pathogens with distinguished talents. Trends Plant Sci 9: 339–348
- Uhlenbeck OC 1987. A small catalytic oligoribonucleotide. Nature 328: 596–600
- Webb CH, Riccitelli NJ, Ruminski DJ, Luptak A 2009. Widespread occurrence of self-cleaving ribozymes. Science 326: 953
- Winkler WC, Nahvi A, Roth A, Collins JA, Breaker RR 2004. Control of gene expression by a natural metabolite-responsive ribozyme. Nature 428: 281–286
- Wu H-N, Lin YJ, Lin FP, Makino S, Chang M-F, Lai MMC 1989. Human hepatitis δ virus RNA subfragments contain an autocleavage activity. Proc Natl Acad Sci 86: 1831–1835
- Zuker M 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31: 3406–3415