Abstract
In Chlamydomonas reinhardtii, 259 tRNA genes were identified and classified into 49 tRNA isoaccepting families. By constructing phylogenetic trees, we determined the evolutionary history for each tRNA gene family. The majority of the tRNA sequences are more closely related to their plant counterparts than to their animal ones. Northern experiments also permitted us to show that at least one member of each tRNA isoacceptor family is transcribed and correctly processed in vivo. A short stretch of T residues known to be a signal for termination of polymerase III transcription was found downstream of most tRNA genes. This allowed us to propose that the vast majority of the tRNA genes are expressed and to confirm that numerous tRNA genes separated by short spacers are indeed cotranscribed. Interestingly, in silico analyses and hybridization experiments show that the cellular tRNA abundance is correlated with the number of tRNA genes and is adjusted to the codon usage to optimize translation efficiency. Finally, we studied the origin of SINEs, short interspersed elements related to tRNAs, whose presence in Chlamydomonas is exceptional. Phylogenetic analysis strongly suggests that tRNAAsp-related SINEs originate from a prokaryotic-type tRNA either horizontally transferred from a bacterium or originally present in mitochondria or chloroplasts.
A survey of the Chlamydomonas reinhardtii sequence project has resulted in the identification of 259 tRNA genes, allowing the decoding of every sense codon using the classical codon–anticodon wobble rules (Merchant et al. 2007). The C. reinhardtii genome is not fully sequenced but, as stated by the Joint Genome Institute, the depth of sequence coverage for the main genome scaffolds is estimated at ∼12.8×. Thus, the 259 tRNA genes are expected to represent nearly all of the total tRNA gene complement. The presence in the Chlamydomonas nuclear genome of both key plant and animal protein genes is one of the most exciting conclusions drawn from the sequencing and annotation project. However, the origin of noncoding RNAs such as tRNAs has not been studied yet. Several Northern experiments performed on a whole-cell tRNA fraction from Chlamydomonas and using as probes oligonucleotides specific to different higher plant tRNAs failed to give hybridization signals (C. Remacle and L. Maréchal-Drouard, unpublished data), indicating that at least some Chlamydomonas tRNA sequences are different from their higher-plant counterparts. Furthermore, and in contrast to higher plants, a selenocysteine (Sec) protein insertion machinery was found in C. reinhardtii. A selenocysteine tRNASec was found to be expressed in this green alga (Rao et al. 2003) and the corresponding single gene was identified within the framework of the annotation project (Merchant et al. 2007). Thus, this tRNASec represents the first nonanimal eukaryotic tRNASec. To determine whether the Chlamydomonas tRNA genes are evolutionarily closer to plants or to animals, we traced the evolutionary origin of most Chlamydomonas tRNA genes by constructing phylogenetic trees. We then provided several in silico and experimental lines of evidence that most of these tRNA genes are expressed from mono- or polycistronic operons and correctly processed in vivo.
As a consequence of the numerous gene duplications, the isoaccepting tRNA gene copy number varies from 1 to 17 copies. Furthermore, there is a codon bias in C. reinhardtii, with some codons highly used in highly expressed genes. In several organisms such as Escherichia coli, yeast, and Caenorhabditis elegans or in chloroplasts, an adjustment of tRNA population to the codon usage exists (Bulmer 1987; Pfitzinger et al. 1987; Duret 2000). We therefore asked the question whether a coadaptation between tRNA gene number, tRNA abundance, and codon usage occurs in this green alga. In silico analysis as well as hybridization experiments showed that also in Chlamydomonas, there is likely a gene dosage effect so as to adapt the tRNA population to an optimal cytosolic translation. Finally, tRNAs are at the origin of tRNA-related short interspersed elements (SINEs). SINE retroelements are widespread among eukaryotic organisms and have an important impact on the structure, evolution, and expression of the host genome. Remarkably for a unicellular microorganism, six SINE families were identified in Chlamydomonas (Jurka et al. 2005; Merchant et al. 2007). Because retroelements are one of the major sources of genome biodiversity, understanding their evolution is important. To get more insight on the origin of Chlamydomonas tRNA-related SINEs, phylogenies were constructed. The data obtained permitted us to propose that a prokaryotic tRNA is at the origin of tRNAAsp-related SINE families.
MATERIALS AND METHODS
Computational methods:
All available nucleic acid sequence data for C. reinhardtii were used. The majority of the nuclear sequence data were produced by the U.S. Department of Energy Joint Genome Institute (http://www.jgi.doe.gov/). All tRNA genes discussed here may be obtained at http://genome.jgi-psf.org/Chlre3/Chlre3.home.html. In this study, the most representative sequence for each isoaccepting tRNA gene family was chosen. This information was used to design oligonucleotide probes specific for the corresponding tRNAs (see supplemental Table 2). The cytosolic codon usage of C. reinhardtii has been obtained through the website http://www.kazusa.or.jp/codon/. For phylogenetic trees, the tRNA sequences from a set of organisms (Figure 1) were downloaded from the Genomic tRNA Database (http://lowelab.ucsc.edu/GtRNAdb/). Two tRNA sets were added: Cyanidioschyzon merolae and Dictyostelium discoideum; they were downloaded and extracted from their own websites (http://merolae.biol.s.u-tokyo.ac.jp/ and http://dictybase.org/, respectively). For each amino acid type, duplicated sequences were removed and alignments of tRNA sequences were done using the TFAM package (Ardell and Andersson 2006). Only the COVEMF program was used: this program aligns a set of sequences to an existing alignment and takes into account both primary and secondary tRNA structures. All sites in the alignments that contained too many variations or large numbers of gaps were excluded, especially in the variable loop because they could not be reliably aligned. Phylogenetic trees were produced using the PHASE2.0 package (Hudelot et al. 2003) and the Markov chain Monte Carlo (MCMC) method. This package is designed for rRNA and tRNA sequences and their conserved secondary structure. As recommended by the authors we used a gamma distribution with four rate categories and we executed three different runs with variable random seeds to check reproducibility and obtain similar consensus trees. 
The branch lengths were calculated on the consensus tree topology with the maximum-likelihood method. The trees were drawn using the Treedyn package (http://www.treedyn.org/).
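The two preprocessing steps described above (removing duplicated sequences, then excluding alignment sites with large numbers of gaps) can be sketched as follows. This is only a minimal illustration and not the TFAM or PHASE workflow itself: the function names and the 50% gap cutoff are assumptions, since the text does not state the threshold that was applied.

```python
def remove_duplicates(sequences):
    """Return the sequences with exact duplicates dropped, keeping first-seen order."""
    return list(dict.fromkeys(sequences))


def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    `alignment` is a list of equal-length strings in which '-' marks a gap.
    The 0.5 default is an illustrative assumption, not a value from the paper.
    """
    if not alignment:
        return []
    n_rows = len(alignment)
    kept_columns = [
        column
        for column in zip(*alignment)
        if column.count("-") / n_rows <= max_gap_fraction
    ]
    if not kept_columns:
        # Every column was filtered out: return one empty string per sequence.
        return [""] * n_rows
    # Transpose the kept columns back into one string per sequence.
    return ["".join(row) for row in zip(*kept_columns)]


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["AC-G", "AC-G", "A--G", "ACTG"]
unique_rows = remove_duplicates(toy_alignment)
filtered_rows = drop_gappy_columns(unique_rows)
```

In the toy alignment, the third column is a gap in two of the three unique sequences and is therefore dropped, whereas the second column (one gap in three) is retained.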
Strain and growth conditions:
The strain of C. reinhardtii used in this work is the cell-wall-less mutant cw15 mt+. Cells were routinely grown under mixotrophic conditions, i.e., under light (4000 lux) on Tris-acetate phosphate (TAP) medium (Harris 1989).
Total tRNA isolation:
Total tRNA extract was prepared from whole cells as described in Maréchal-Drouard et al. (1995).
Northern analysis:
Total tRNAs from C. reinhardtii were separated on a 15% denaturing polyacrylamide gel and electrotransferred onto Hybond-N membrane (Amersham, Arlington Heights, IL). Oligonucleotides (see supplemental Table 2) were 5′-end labeled using T4 polynucleotide kinase and [γ-32P]ATP under standard conditions. Hybridization and washing were performed as described in Delage et al. (2003).
RNA labeling and hybridization to oligonucleotides:
Isolated tRNAs were 3′-end labeled using 5′[32P]pCp and RNA ligase (Maréchal-Drouard et al. 1990). Oligonucleotides separated on a 15% denaturing polyacrylamide gel and electrotransferred onto Hybond-N membrane (Amersham) were hybridized against 3′-end-labeled tRNAs, using conditions described above for Northern analysis.
Cloning and sequencing of RT–PCR products:
Transfer RNAs purified on a 15% denaturing polyacrylamide gel (Maréchal-Drouard et al. 1995) were used as a substrate for RT–PCR amplification. Gel-isolated RT–PCR products were cloned into PCR-2-TOPO (Invitrogen, San Diego).
RESULTS AND DISCUSSION
Phylogenetic analysis of the 259 nuclear tRNA genes:
Taking into account 2 families for tRNAAla(CGC) and 2 families for tRNAAla(UGC) (see below), 49 isoacceptor families were identified in Chlamydomonas, allowing the decoding of every sense codon using the classical codon–anticodon wobble hypothesis (Table 1). To determine the evolutionary history for each of these families, phylogenetic trees were constructed. Different situations were observed. Figure 1A shows that tRNACys(GCA) is closely related to Arabidopsis thaliana tRNACys(GCA). The situation is the same for tRNAAsn, tRNAGln(CTG), tRNAGlu, tRNAHis, tRNALys, elongator tRNAMete, tRNAPhe, tRNAPro, and tRNAGly [except for tRNAGly(CCC)] (supplemental Figure 6). All these families appear to be more closely related to the plant tRNA genes than to the animal ones. In contrast, other tRNA genes are more evolutionarily related to animal ones. For example, Chlamydomonas tRNATyr isoacceptors clearly branch among animal and fungus tRNA genes (Figure 1B). For two amino acids, Ile and Leu depending on the isoacceptor tRNA considered, tRNA gene sequences are more related to either the plant sequences [tRNAIle(TAT), tRNALeu(TAA, CAA)] or the animal ones [tRNAIle(AAT), tRNALeu(CAG, AAG, TAG)] (Figure 1C and supplemental Figure 6). The single-copy tRNAIle(GAT) gene (not shown in Table 1) was identified by blast as a plastidial tRNAIle gene. It is part of a 900-nt chloroplastic fragment that was found to be inserted into the nuclear genome (Merchant et al. 2007). For tRNAArg and tRNASer, a few tRNA gene sequences are linked to the plant sequences [tRNAArg(ACG), tRNASer(GCT)], and some others are more related to animal tRNA species [e.g., tRNAArg(CCG, TCG)], although in this latter case the Bayesian posterior probabilities are weak and cannot unambiguously ascertain this conclusion (supplemental Figure 6). The tRNATrp sequences are close to the C. elegans and Arabidopsis sequences. 
Initiator tRNAMeti and tRNAGln(TTG) have slightly diverged from the other eukaryotic sequences (supplemental Figure 6). No reliable phylogenetic tree was constructed from the alignment of tRNAThr sequences. Finally, the case of tRNAAla constitutes an interesting situation. Two groups of tRNAAla sequences were retrieved from the genome sequence. The first group (group I) contains 13, 6, and 1 tRNA isoacceptors with AGC, CGC, and UGC anticodons, respectively; the second group (group II) contains 4 and 3 tRNA isoacceptors with CGC and UGC anticodons, respectively. Whereas within each group, mature tRNA sequences are either identical or differ only by point mutations (in particular at the level of the anticodons), the sequences between the two groups of tRNAAla genes differ significantly (Figure 2A). Phylogenetic analysis shows that they form two distinct branches and thus have independent evolutionary histories (Figure 2B). In the first group, tRNAs are related to higher plant and animal tRNAAla genes whereas tRNAs from the second group form a separate branch on the tree and are more closely related to Saccharomyces cerevisiae and Schizosaccharomyces pombe tRNAAla genes. It is also worth noting that all group I tRNAAla genes have an intron located at position 37/38; by contrast, only 1 of the 7 group II tRNAAla genes has acquired an intron at the same position. Alanine tRNAs with an AGC anticodon are present only in group I but tRNAs with the two other anticodons, CGC and UGC, are found in both groups and thus there is a priori a redundancy of function, leading to the following question: Are both types of tRNAAla genes expressed in Chlamydomonas? To address this question, the expression of all tRNA isoacceptor families was analyzed.
TABLE 1.
| Amino acids | Anticodons (a) | Codons (b) | Gene nos. (a) | % cyto (c) | Probes (d) |
|---|---|---|---|---|---|
| Ala | AGC | GCU–GCC | 13 | 7.24 | A1 |
| | CGC | GCG | 6 + 4 | 4.29 | A1–A2 |
| | UGC | GCA | 3 + 1 | 1.06 | A1–A2 |
| Arg | CCG | CGG | 3 | 1.1 | R1 |
| | ACG | CGU–CGC | 11 | 4.04 | R2 |
| | CCU | AGG | 2 | 0.27 | R3 |
| | UCG | CGA | 1 | 0.19 | R4 |
| | UCU | AGA | 3 | 0.07 | R5 |
| Asn | GGU | AAU–AAC | 7 | 3.09 | N |
| Asp | GUC | GAU–GAC | 11 | 4.83 | D |
| Cys | GCA | UGU–UGC | 7 | 1.39 | C |
| Gln | UUG | CAA | 1 | 0.43 | Q1 |
| | CUG | CAG | 6 | 3.6 | Q2 |
| Glu | UUC | GAA | 1 | 0.28 | E1 |
| | CUC | GAG | 13 | 5.41 | E2 |
| Gly | GCC | GGU–GGC | 17 | 7.27 | G1 |
| | UCC | GGA | 1 | 0.5 | G2 |
| | CCC | GGG | 1 | 0.94 | G3 |
| His | GUG | CAC | 5 | 1.9 | H |
| Ile | AAU | AUU–AUC | 7 | 3.49 | I1 |
| | UAU | AUA | 1 | 0.1 | I2 |
| Leu | CAG | CUG | 10 | 6.48 | L1 |
| | CAA | UUG | 2 | 0.39 | L2 |
| | UAA | UUA | 1 | 0.06 | L3 |
| | UAG | CUA | 1 | 0.26 | L4 |
| | AAG | CUU–CUC | 3 | 1.73 | L5 |
| Lys | CUU | AAG | 11 | 4.42 | K1 |
| | UUU | AAA | 1 | 0.24 | K2 |
| Met (e) | CAU | AUG i | 8 | 0.21 | Mi |
| | CAU | AUG e | 6 | 2.38 | Me |
| Phe | GAA | UUU–UUC | 9 | 3.24 | F |
| SeC | UCA | UGA | 1 | ND | SeC |
| Pro | AGG | CCU–CCC | 13 | 3.65 | P |
| | CGG | CCG | 6 | 2.02 | P |
| | UGG | CCA | 1 | 0.52 | P |
| Ser | AGA | UCU–UCC | 5 | 2.09 | S1 |
| | UGA | UCA | 1 | 0.32 | S2 |
| | CGA | UCG | 5 | 1.63 | S3 |
| | GCU | AGC–AGU | 8 | 2.52 | S4 |
| Thr | AGU | ACU–ACC | 6 | 3.35 | T1 |
| | CGU | ACG | 3 | 1.55 | T2 |
| | UGU | ACA | 2 | 0.41 | T3 |
| Trp | CCA | UGG | 5 | 1.31 | W |
| Tyr | GAC | UAU–UAC | 8 | 2.56 | Y |
| Val | AAC | GUU–GUC | 7 | 2.1 | V1 |
| | CAC | GUG | 10 | 4.61 | V2 |
| | UAC | GUA | 1 | 0.21 | V3 |

(a) Each tRNA isoacceptor family is identified by its anticodon and the number of corresponding genes is given according to Merchant et al. (2007). For tRNAAlaCGC and tRNAAlaUGC, the two gene numbers refer to the two types of sequences (groups I and II, see Figure 2) identified for each isoacceptor.
(b) Minimal potential codon recognition pattern of tRNA isoacceptors listed according to the anticodons in the previous column.
(c) Cytosolic (% cyto) codon usage as obtained from http://www.kazusa.or.jp/codon/. ND, not determined.
(d) Names of probes are according to supplemental Table 2. The plastidial tRNAIleGAT gene inserted into the nuclear genome is not indicated.
(e) For Met codons AUG i and AUG e, i and e stand for initiator and elongator tRNAMet, respectively.
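The "minimal potential codon recognition pattern" in the Codons column of Table 1 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions rather than the procedure used for the annotation: the pairing table `WOBBLE_PARTNERS` is read off the rows of Table 1 (A and G at the wobble position read U and C, C reads G, U reads A), and the function name is hypothetical.

```python
# Watson-Crick partners for the second and third anticodon positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases read by each wobble base (anticodon position 34).
# Assumption: this minimal pattern is inferred from the rows of Table 1.
WOBBLE_PARTNERS = {"A": "UC", "G": "UC", "C": "G", "U": "A"}


def minimal_codons(anticodon):
    """List the codons (5'->3') minimally read by an anticodon written 5'->3'."""
    anticodon = anticodon.upper().replace("T", "U")
    if len(anticodon) != 3:
        raise ValueError("an anticodon must have exactly three nucleotides")
    wobble, middle, last = anticodon
    # The codon is antiparallel to the anticodon: its first two bases pair
    # with the last and middle anticodon bases, respectively.
    prefix = COMPLEMENT[last] + COMPLEMENT[middle]
    return [prefix + third for third in WOBBLE_PARTNERS[wobble]]


ala_codons = minimal_codons("AGC")   # tRNAAla(AGC)
pro_codons = minimal_codons("TGG")   # gene-style notation for tRNAPro(UGG)
```

Accepting T as well as U lets the same function handle both the gene notation, e.g., tRNAPro(TGG), and the RNA notation, e.g., tRNAPro(UGG), used in the text.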
Expression of the tRNA isoacceptor families:
Northern blot experiments performed using as probes oligonucleotides specific to each tRNA species (supplemental Table 2) permitted us to obtain an overview of the expression of all nuclear tRNA gene families from C. reinhardtii (Figure 3A), except tRNAAla and tRNAPro. For these two tRNA isoacceptor families, RT–PCR experiments were additionally performed since sequences mostly differ only at the level of the anticodon (supplemental Table 2). RT–PCR experiments on total C. reinhardtii tRNA fraction followed by cloning and sequence analysis show that tRNAAla of group I (with AGC, UGC, or CGC anticodons) and group II (with UGC or CGC anticodons) as well as tRNAPro (with AGG or CGG anticodons) are indeed expressed and correctly processed (Figure 3B). In particular, for all tRNAs, splicing of introns was effective. It is worth noting that >20 years ago, Tyc et al. (1983) reported that a Chlamydomonas tRNA preparation contains small amounts of tRNA 5′ halves and corresponding 3′ halves. They postulated that Chlamydomonas tRNAAla and tRNAPro genes could have introns and that the tRNA halves they detected may be intermediates of the tRNA splicing process (Tyc et al. 1983). As mentioned above, part of the tRNAAla and tRNAPro gene population indeed contains introns. However, the position of the splicing site (between A37:U38 for tRNAAla and between G37:U38 for tRNAPro) does not correlate with the position of the sites of tRNA half-molecule ligation (between U34:G35 for tRNAAla and between G35:G36 for tRNAPro) found by these authors. The partial RNA sequences they obtained are close to the tRNA gene sequences now available. This discrepancy suggests, contrary to their hypothesis, that the tRNA halves detected in Chlamydomonas do not correspond to true intermediates of tRNA splicing and that the RNA kinase and ligase activities they observed may rather be involved in the tRNA repair. 
Concerning tRNAAla(AGC) and tRNAPro(AGG), all sequences obtained presented a GGC or a GGG anticodon. This means that the A at the wobble position of the anticodon is very likely post-transcriptionally converted to inosine (I) since the reverse transcriptase recognizes I as a G. The presence of I at position 34 has already been reported for several plant tRNAs (Glover et al. 2001). No tRNAPro with a UGG anticodon was found out of 50 sequences. There is only a single-copy gene coding for a tRNAPro(UGG) as compared to 13 and 6 copies for tRNAPro with AGG and CGG anticodons, respectively. Either we did not sequence enough clones or this gene is not transcribed. As a whole, Northern analysis indicates that at least one copy of each tRNA family but one [tRNAPro(UGG)] is expressed. However, it is more difficult to know whether all currently annotated Chlamydomonas tRNA genes are transcribed. In an attempt to answer this question, we used another mark of the expression of eukaryotic tRNA genes. A specific feature of virtually all newly synthesized eukaryotic RNA polymerase III transcripts is the presence of a short poly(U) motif at their 3′ ends. We therefore looked for the presence of a short poly(T) stretch a few nucleotides downstream of Chlamydomonas tRNA genes. For 66% of them, a short stretch of four to five T's was found 1–10 nucleotides downstream of the predicted 3′ end of each tRNA gene (see, for example, supplemental Figure 7, tRNAsVal nos. 1, 4, 6, 7, 10, and 12). In particular, a short stretch of four T's is present 2 nucleotides downstream of the 3′ end of the single tRNAPro(TGG) gene, thus indicating that this gene is likely transcribed by polymerase III. Surprisingly, for one-third of the tRNA gene sequences, the poly(T) stretch is absent. In Chlamydomonas, ∼160 tRNA genes are clustered on the genome. They are associated on the same or opposite DNA strands and can be separated by spacers as short as 3–7 nt (Merchant et al. 2007). 
This unusual genomic organization in Chlamydomonas leads to the hypothesis that a few tRNA genes could be cotranscribed. As shown in Figure 3C for a few scaffolds, a poly(T) motif is mostly present only at the most downstream tRNA gene of each cluster (see also supplemental Figure 7). When tRNA genes are on opposite strands, then a stretch of T's is found in both directions. Thus, these data strongly support that polycistronic precursor tRNAs do exist in Chlamydomonas and explain why a poly(T) motif is not always present downstream of each tRNA gene. Polycistronic tRNA gene units are not the rule in eukaryotic systems and in yeast, animals, and plants, each tRNA gene is usually independently transcribed. Such unusual genomic organization has been reported only in a few cases, in particular in the protozoan Trypanosoma brucei where dicistronic tRNA precursors are found (LeBlanc et al. 1999; Mottram et al. 1991a,b). The presence of a short poly(T) motif at the 3′ end of monocistronic and polycistronic tRNA gene units strongly suggests that most of the predicted tRNA genes are expressed in Chlamydomonas. However, for a few tRNAMet and tRNALys genes, no poly(T) motif was observed, although they are apparently not clustered with other tRNA genes. The most striking exception is the case of tRNAPhe genes. Nine copies of the tRNAPhe(GAA) gene were identified on three independent scaffolds (4, 61, and 85) in the Chlamydomonas genome. None of them is close to other tRNA genes and none of them ends with a stretch of T residues. As a hybridization signal was obtained by a Northern experiment (Figure 3A) and as no other tRNAPhe genes have been predicted, at least some of these 9 copies must be expressed so as to decode the TTT and TTC codons. Kruszka et al. (2003) showed that plant tRNA genes can be cotranscribed with snoRNA genes, but neither we nor Olivier Vallon (Institut de Biologie Physico-Chimique, Paris; O. 
Vallon, personal communication) found snoRNAs in the vicinity of any tRNAPhe genes. Strikingly, however, for 8 of the 9 tRNAPhe genes, a poly(T) stretch can be found ∼75 nt downstream of the 3′ end of each gene (see supplemental Figure 8). It is thus likely that for the tRNAPhe gene family, these poly(T) stretches are important for transcription termination and for the stabilization of transcripts. Why the poly(T) stretches are not linked to the 3′ end of the tRNAPhe genes will need to be clarified in the future. It must also be noted that 6 of 9 of the tRNAPhe genes are found in introns of protein-coding genes (data not shown). Thus we cannot exclude the possibility that some tRNAs are generated during intron splicing, a process known to be essential for the biosynthesis of the snoRNA-coding region (e.g., Vincenti et al. 2007). These data hold true for all the tRNA genes where no poly(T) stretches were found in the close vicinity of their 3′ ends. Thus, to conclude, most nuclear tRNA genes are likely functional in Chlamydomonas.
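The search for a poly(T) stretch downstream of each tRNA gene, as described above, can be expressed as a simple scan of the 3′ flanking sequence. The following is a minimal sketch and not the procedure actually used in this study: the function name, its parameters, and the example sequences are hypothetical, and the defaults (a run of at least four T's starting within 10 nucleotides) merely mirror the figures quoted in the text.

```python
import re


def find_poly_t(downstream, min_run=4, max_offset=10):
    """Return how many nucleotides separate the gene 3' end from the first T run.

    `downstream` is the genomic sequence immediately following the predicted
    3' end of a tRNA gene. Returns None if no run of at least `min_run` T's
    starts within `max_offset` nucleotides of the gene end.
    """
    sequence = downstream.upper().replace("U", "T")
    # re.search returns the leftmost run, so a later start means no closer run.
    match = re.search("T{%d,}" % min_run, sequence)
    if match is None or match.start() > max_offset:
        return None
    return match.start()


# Hypothetical flanking sequences, for illustration only.
close_terminator = "GATTTTCG"               # four T's, 2 nt after the gene
distant_terminator = "ACG" * 25 + "TTTTT"   # T run starting 75 nt downstream

close_offset = find_poly_t(close_terminator)
missed_by_default = find_poly_t(distant_terminator)
found_when_widened = find_poly_t(distant_terminator, max_offset=100)
```

Widening `max_offset` corresponds to the situation reported for the tRNAPhe genes, where the T stretch lies roughly 75 nt from the 3′ end and would be missed by a scan restricted to the first 10 nucleotides.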
The Chlamydomonas La protein:
In a wide range of eukaryotes the poly(U) motif is specifically recognized by an RNA-binding protein called the La protein (Wolin and Cedervall 2002). One of the major functions of the La protein is to stabilize newly synthesized small RNAs. For example, the binding of the La protein to primary tRNA transcripts prevents exonucleolytic nibbling of their 3′ trailer and promotes their endonucleolytic removal. Blast and advanced searches in the Chlamydomonas genome using the recently characterized A. thaliana La protein (Fleurdepine et al. 2007) revealed the presence of three La or La-like proteins. They were subsequently called Cr-La1, Cr-La2, and Cr-La3 (Figure 3D). The N-terminal domain (NTD) of true La proteins is extremely well conserved. It always contains the eukaryotic La motif of ∼60–80 amino acids (PF05383; http://pfam.sanger.ac.uk) at the very beginning of the NTD and closely followed by a typical RNA recognition motif (RRM) (PF00076, http://pfam.sanger.ac.uk/). On the basis of sequence alignments of various La proteins and the Pfam database, an La motif was found on the three Cr-La proteins. However in Cr-La3, the La domain is not located in the NTD but rather in the middle of the protein. The La domain is followed by the typical RRM in Cr-La1. No RRM is retrieved on Cr-La2 using Pfam but such a motif was identified by sequence alignments, suggesting that it has slightly diverged from the consensus RRM sequence. No RRM was found on Cr-La3. The C-terminal domain (CTD) of La proteins is more variable. A second atypical RRM (PF08777) is found in La proteins from all vertebrates and higher plants but is absent in yeast. Here, on the basis of sequence alignments and secondary structure prediction, this second RRM was identified in Cr-La1 but not in Cr-La2 and Cr-La3. 
It is worth mentioning that Cr-La3 contains DM15 domains in the CTD, a motif of unknown function already found in La-like proteins of various organisms, and Cr-La1 contains a nuclear localization signal as the genuine La proteins (Fleurdepine et al. 2007). Altogether, our in silico data provide evidence that Cr-La1 is the bona fide Chlamydomonas La protein whereas Cr-La2 and Cr-La3 are more related to La-like proteins (also called LARP) (Wolin and Cedervall 2002). In Arabidopsis, two “true” La proteins are expressed, although so far only one was shown to fulfill the genuine La function (Fleurdepine et al. 2007). It is worth noting that in contrast to higher plants and as in vertebrates and fungi, the nuclear genome of Chlamydomonas encodes only one true La homolog.
The copy number of nuclear tRNA genes is correlated to the codon usage of the nuclear genome in C. reinhardtii:
The 259 nuclear-encoded tRNA genes are grouped into 49 isoacceptor families (Table 1). As mentioned above, tRNA gene clusters and recent tRNA gene duplications are present on the Chlamydomonas genome. Consequently the isoaccepting tRNA gene copy number varies from 1 copy to 17 copies (Table 1). Codon usage is strongly biased in the nuclear genome of C. reinhardtii; some codons are mostly not used whereas others are very frequent (Table 1). Interestingly, the comparison between codon usage and gene copy number reveals that all the favored codons are decoded by the isoaccepting tRNAs that have the highest gene copy numbers. There is only one exception to this rule, the case of initiator AUG methionine codons where the number of genes is far higher than expected from the abundance of the initiation codon. It is likely that the initiator tRNAMeti has to be abundant to not be a limiting factor in the initiation of translation, as already suggested in chloroplasts (Pfitzinger et al. 1987). This mechanism of coadaptation between tRNA gene number and codon usage seems to be frequent and has already been reported in different organisms such as E. coli and C. elegans (Bulmer 1987; Duret 2000). Furthermore, hybridization of pCp-labeled total tRNAs from C. reinhardtii with oligonucleotides specific to different tRNA isoacceptors permitted us to show that indeed for the studied tRNAs (Figure 4), the most abundant tRNA isoacceptors (e.g., tRNALeuCAG, tRNAValCAC, tRNAGlyGCC, tRNALysCUU, and tRNAGlnCUG) recognize the most frequent codons whereas scarce tRNA isoacceptors (tRNALeuUAG, tRNAValAAC, tRNAGlyUCC, tRNALysUUU, and tRNAGlnUUG) recognize rare codons. In various organisms and chloroplasts, the abundance of specific tRNAs was shown to be directly correlated to the frequency of the cognate codons (Pfitzinger et al. 1987; Dong et al. 1996). 
Here, in Chlamydomonas, in relation to the number of the corresponding genes, it seems that there is also a good adjustment of the tRNA population to the cytosolic codon usage.
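As a rough numerical illustration of this adjustment, the gene copy numbers and cytosolic codon usage values of Table 1 can be compared directly for the ten isoacceptors named above. The sketch below computes a Pearson correlation coefficient; the choice of this statistic is an assumption made for illustration and is not the analysis performed in this study.

```python
from math import sqrt

# (isoacceptor, gene copies, % cytosolic codon usage), taken from Table 1
# for the ten isoacceptors discussed in the paragraph above.
ISOACCEPTORS = [
    ("Leu(CAG)", 10, 6.48),
    ("Leu(UAG)", 1, 0.26),
    ("Val(CAC)", 10, 4.61),
    ("Val(AAC)", 7, 2.10),
    ("Gly(GCC)", 17, 7.27),
    ("Gly(UCC)", 1, 0.50),
    ("Lys(CUU)", 11, 4.42),
    ("Lys(UUU)", 1, 0.24),
    ("Gln(CUG)", 6, 3.60),
    ("Gln(UUG)", 1, 0.43),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


gene_copies = [copies for _, copies, _ in ISOACCEPTORS]
codon_usage = [usage for _, _, usage in ISOACCEPTORS]
correlation = pearson(gene_copies, codon_usage)
```

For this subset the coefficient is strongly positive, in line with the gene dosage effect proposed here; a rank-based statistic would be a reasonable alternative if the relationship is not assumed to be linear.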
tRNA-related SINEs:
SINE retroelements are found in a wide variety of eukaryotes (Kramerov and Vassetzky 2005). These short (<500 bp) nonautonomous elements are transcribed by RNA polymerase III and rely on long interspersed elements (LINEs) for their propagation. Their copy number ranges from a few hundred to more than a million. Most eukaryotic SINEs are ancestrally derived from tRNA genes (and in rare cases from 7SL or 5S RNAs), although the typical tRNA cloverleaf structure is no longer apparent for most SINE RNAs (Sun et al. 2007). SINEs derived from tRNAs usually have a composite structure made of a 5′ tRNA-related portion followed by a tRNA-unrelated portion (Okada and Ohshima 1995; Deragon and Zhang 2006). In all cases, an internal promoter (composed of A and B boxes and recognized by the RNA polymerase III machinery; Arnaud et al. 2001) is present in the SINE tRNA-related portion.
With >200 copies of tRNA-related SINEs [belonging to six families named SINEX-1 to -6_CR, see supplemental Appendix 1 for consensus sequences and Repbase (http://www.girinst.org/repbase/index.html)], C. reinhardtii is the first unicellular organism where such elements have been identified (Jurka et al. 2005; Merchant et al. 2007). SINEX-3_CR and SINEX-4_CR consensus sequences have no significant homologies with other C. reinhardtii SINE consensus sequences and therefore truly represent founder elements of two distinct families. In contrast, SINEX-1_CR and SINEX-2_CR consensus sequences have highly similar tRNA-related portions but divergent tRNA-unrelated portions (see supplemental Figures 9 and 10), suggesting that founder copies of these two families evolved from a common ancestor following the acquisition of a new tRNA-unrelated region (Deragon and Zhang 2006). SINEX-5_CR and SINEX-6_CR consensus sequences are also homologous, suggesting a common origin but nevertheless present patches of divergence (at the beginning of the tRNA-related and -unrelated portions; see supplemental Figures 9 and 10), likely resulting from gene conversion events (Kass et al. 1995; Lenoir et al. 1997). Still SINEX-5 and SINEX-6 were defined as two different families since their consensuses share 75% sequence identity and are therefore under the 80% threshold usually used to include sequences in a given family (Wicker et al. 2007).
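The family assignment rule mentioned above (two consensus sequences belong to the same family when they share at least 80% sequence identity) can be sketched as follows. This is a minimal illustration, assuming that the two sequences have already been aligned to the same length; the function names and the toy sequences are hypothetical and are not the SINEX consensus sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned, equal-length sequences.

    Columns in which both sequences carry a gap ('-') are ignored;
    a gap facing a nucleotide counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def same_family(seq_a, seq_b, threshold=80.0):
    """Apply the 80% identity rule to decide whether two consensuses group together."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical aligned toy sequences, for illustration only.
reference = "ACGTACGTAC"
close_relative = "ACGTACGTTT"    # 8 of 10 positions identical
distant_relative = "ACGTACGGTT"  # 7 of 10 positions identical
```

Under this rule, a pair of consensus sequences sharing 75% identity, as reported for SINEX-5 and SINEX-6, falls below the threshold and is assigned to two different families.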
Since the SINE/LINE partnership is, in some cases, based on common 3′-regions, we looked for LINEs in C. reinhardtii that would share primary sequence homologies in their 3′ ends with SINEs. A putative SINE/LINE partnership between L1-1_CR and SINEX-3_CR has already been described in Repbase (http://www.girinst.org/2004/vol4/issue2/SINEX-3_CR.html; see supplemental Figure 11A for its description). We observed that all other SINE families also possess primary sequence homologies in their 3′ ends with at least one LINE consensus sequence from the RandI family (see supplemental Figure 11B). This suggests that most SINE/LINE partnerships in C. reinhardtii are based on primary sequence homologies and are not of the “relaxed” type (Okada et al. 1997).
Remarkably, the SINE tRNA-related portions from SINEX-3_CR, SINEX-5_CR, and SINEX-6_CR families have apparently kept an authentic tRNA structure as well as an intron between positions 37 and 38 of the tRNA sequences (Figure 5A). The 5′ tRNA-related sequences of SINEX-3_CR, SINEX-5_CR, and SINEX-6_CR copies resemble a tRNAArg(CCG) (a few of them show point mutations in the anticodon leading to ACG, CCA, CCC, and CTG anticodons), a tRNAAsp(AUC), and a tRNAAsp(GUC), respectively. These families contain 40, 11, and 18 members, respectively. The tRNA-related region of 36 SINEX-3_CR copies ends with a genome-encoded CCA, a motif that is post-transcriptionally added in true eukaryotic tRNAs. Two interesting questions can be addressed: Can we trace the origin of the tRNA domains of these SINEs and are these tRNA domains folded in the predicted canonical tRNA cloverleaf structure? By constructing phylogenetic trees, we found that the tRNA-related regions of SINEX-3_CR sequences are closely related to eukaryotic tRNAArg (see supplemental Figure 6). Interestingly, whereas the true Chlamydomonas tRNAsAsp are, as expected, evolutionarily related to the eukaryotic aspartyl tRNAs (and branch with metazoan tRNA sequences), the tRNAAsp(AUC and GUC)-related SINE domains of SINEX-5_CR and SINEX-6_CR are evolutionarily linked to bacterial and organellar aspartyl tRNAs (Figure 5B). On the basis of these observations, we propose that tRNAAsp-related SINEs of Chlamydomonas derive initially from a prokaryotic tRNA that has been horizontally transferred during evolution. We also cannot exclude the possibility that the tRNA at the origin of these two SINE families comes from the plastidial or mitochondrial tRNAAsp integrated in the nuclear genome during evolution.
With a few exceptions (Matsumoto et al. 1986; Lin et al. 2001), the majority of SINEs derived from tRNAs do not fold as their ancestral tRNA molecule and it was recently shown that most tRNA-related SINEs have very similar RNA structures (Rozhdestvensky et al. 2001; Sun et al. 2007). Remarkably, when the intron is spliced, SINEX-3_CR, SINEX-5_CR, and SINEX-6_CR tRNA-related sequences can be folded in the theoretical cloverleaf structure. They present only one or two mismatches in the stems and all invariant and semi-invariant nucleotides are present (Figure 5A). On the one hand, RT–PCR experiments performed with a pair of primers specific to the SINEX-3_CR tRNAArg-related sequence allowed us to amplify the tRNA-related domain of this SINE family but in all 20 clones tested the intron was present (data not shown). This result shows that the folding of the SINE RNA is not compatible with the correct splicing of a pre-tRNA molecule. On the other hand, Northern experiments on a whole-cell tRNA fraction using as probes oligonucleotides specific to SINEX-5_CR and SINEX-6_CR tRNAAsp(AUC)- and tRNAAsp(GUC)-related SINEs gave no signal (data not shown). This demonstrates that no mature tRNA can be expressed from these SINE RNAs, very likely because the correct cloverleaf folding recognized by tRNA-processing enzymes (e.g., RNase P and RNase Z) is not present. As a whole, it is likely that, despite their almost perfect theoretical cloverleaf secondary structure, Chlamydomonas tRNA-related SINE RNAs do not fold as their ancestral molecule.
Conclusion:
On the basis of the data provided here on the evolution and expression of Chlamydomonas tRNA genes, it is clear that this green alga has retained some features of the plant–animal common ancestor, as attested, for instance, by an La protein containing two RRM domains. While the tRNASec gene is specific to the animal lineage, only a few tRNA genes are linked to their animal counterparts and most of them appear to be phylogenetically related to the plant ones. Whether the set of aminoacyl-tRNA synthetases able to recognize these tRNAs has followed the same evolution remains to be determined. In most cases, animal mitochondrial genomes have retained a complete set of tRNA genes, whereas in higher plants one-third to one-half of the mitochondrial tRNAs are nucleus encoded and used in both the cytosolic and the mitochondrial translation machineries (Schneider and Marechal-Drouard 2000). The mitochondrial genome of C. reinhardtii encodes only three tRNA genes and thus the import of nucleus-encoded tRNAs is expected. The fact that Chlamydomonas cytosolic tRNAs resemble their plant counterparts more than their animal ones may be linked to the occurrence of this import process. Finally, we report the presence of several polycistronic tRNA units in C. reinhardtii, which is quite unusual in a eukaryotic organism (Haeusler and Engelke 2006). Why and how Chlamydomonas has developed such a specific mode of tRNA gene expression will need to be further addressed.
Acknowledgments
This work was supported by the French Centre National de la Recherche Scientifique, by the Belgian Fonds de la Recherche Fondamentale Collective (grant nos. 2.4582.05 and 2.4638.05 to C.R.), by a joint France–Wallonie Tournesol grant (to L.M.D. and C.R.), and by a short-term European Molecular Biology Organization fellowship (to E.V.).
References
- Ardell, D. H., and S. G. Andersson, 2006. TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase. Nucleic Acids Res. 34 893–904.
- Arnaud, P., Y. Yukawa, L. Lavie, T. Pelissier, M. Sugiura et al., 2001. Analysis of the SINE S1 Pol III promoter from Brassica; impact of methylation and influence of external sequences. Plant J. 26 295–305.
- Bulmer, M., 1987. Coevolution of codon usage and transfer RNA abundance. Nature 325 728–730.
- Delage, L., A. M. Duchene, M. Zaepfel and L. Marechal-Drouard, 2003. The anticodon and the D-domain sequences are essential determinants for plant cytosolic tRNA(Val) import into mitochondria. Plant J. 34 623–633.
- Deragon, J. M., and X. Zhang, 2006. Short interspersed elements (SINEs) in plants: origin, classification, and use as phylogenetic markers. Syst. Biol. 55 949–956.
- Dong, H., L. Nilsson and C. G. Kurland, 1996. Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates. J. Mol. Biol. 260 649–663.
- Duret, L., 2000. tRNA gene number and codon usage in the C. elegans genome are co-adapted for optimal translation of highly expressed genes. Trends Genet. 16 287–289.
- Fleurdepine, S., J. M. Deragon, M. Devic, J. Guilleminot and C. Bousquet-Antonelli, 2007. A bona fide La protein is required for embryogenesis in Arabidopsis thaliana. Nucleic Acids Res. 35 3306–3321.
- Glover, K. E., D. F. Spencer and M. W. Gray, 2001. Identification and structural characterization of nucleus-encoded transfer RNAs imported into wheat mitochondria. J. Biol. Chem. 276 639–648.
- Haeusler, R. A., and D. R. Engelke, 2006. Spatial organization of transcription by RNA polymerase III. Nucleic Acids Res. 34 4826–4836.
- Harris, E. H., 1989. The Chlamydomonas Sourcebook. Academic Press, San Diego.
- Hudelot, C., V. Gowri-Shankar, H. Jow, M. Rattray and P. G. Higgs, 2003. RNA-based phylogenetic methods: application to mammalian mitochondrial RNA sequences. Mol. Phylogenet. Evol. 28 241–252.
- Jurka, J., O. Kohany, A. Pavlicek, V. V. Kapitonov and M. V. Jurka, 2005. Clustering, duplication and chromosomal distribution of mouse SINE retrotransposons. Cytogenet. Genome Res. 110 117–123.
- Kass, D. H., M. A. Batzer and P. L. Deininger, 1995. Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution. Mol. Cell. Biol. 15 19–25.
- Kramerov, D. A., and N. S. Vassetzky, 2005. Short retroposons in eukaryotic genomes. Int. Rev. Cytol. 247 165–221.
- Kruszka, K., F. Barneche, R. Guyot, J. Ailhas, I. Meneau et al., 2003. Plant dicistronic tRNA-snoRNA genes: a new mode of expression of the small nucleolar RNAs processed by RNase Z. EMBO J. 22 621–632.
- LeBlanc, A. J., A. E. Yermovsky-Kammerer and S. L. Hajduk, 1999. A nuclear encoded and mitochondrial imported dicistronic tRNA precursor in Trypanosoma brucei. J. Biol. Chem. 274 21071–21077.
- Lenoir, A., B. Cournoyer, S. Warwick, G. Picard and J. M. Deragon, 1997. Evolution of SINE S1 retroposons in Cruciferae plant species. Mol. Biol. Evol. 14 934–941.
- Lin, Z., O. Nomura, T. Hayashi, Y. Wada and H. Yasue, 2001. Characterization of a SINE species from vicuna and its distribution in animal species including the family Camelidae. Mamm. Genome 12 305–308.
- Maréchal-Drouard, L., P. Guillemaut, A. Cosset, M. Arbogast, F. Weber et al., 1990. Transfer RNAs of potato (Solanum tuberosum) mitochondria have different genetic origins. Nucleic Acids Res. 18 3689–3696.
- Maréchal-Drouard, L., I. Small, J.-H. Weil and A. Dietrich, 1995. Transfer RNA import into plant mitochondria, pp. 310–327 in Mitochondrial Biogenesis and Genetics, Vol. 260, edited by G. M. Attardi and A. Chomyn. Spectrum Publisher Services, New York.
- Matsumoto, K., K. Murakami and N. Okada, 1986. Gene for lysine tRNA1 may be a progenitor of the highly repetitive and transcribable sequences present in the salmon genome. Proc. Natl. Acad. Sci. USA 83 3156–3160.
- Merchant, S. S., S. E. Prochnik, O. Vallon, E. H. Harris, S. J. Karpowicz et al., 2007. The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318 245–250.
- Mottram, J. C., S. D. Bell, R. G. Nelson and J. D. Barry, 1991a. tRNAs of Trypanosoma brucei. Unusual gene organization and mitochondrial importation. J. Biol. Chem. 266 18313–18317.
- Mottram, J. C., Y. Shafi and J. D. Barry, 1991b. Sequence of a tRNA gene cluster in Trypanosoma brucei. Nucleic Acids Res. 19 3995.
- Okada, N., and K. Ohshima, 1995. Evolution of tRNA-derived SINEs, pp. 61–80 in The Impact of Short Interspersed Elements (SINEs) on the Host Genome, edited by R. Maraia. Landes Company/Springer, Austin, TX.
- Okada, N., I. M. Hamada, I. Ogiwara and K. Ohshima, 1997. SINEs and LINEs share common 3′ sequences: a review. Gene 205 229–243.
- Pfitzinger, H., P. Guillemaut, J. H. Weil and D. T. N. Pillay, 1987. Adjustment of the tRNA population to the codon usage in chloroplasts. Nucleic Acids Res. 15 1377–1386.
- Rao, M., B. A. Carlson, S. V. Novoselov, D. P. Weeks, V. N. Gladyshev et al., 2003. Chlamydomonas reinhardtii selenocysteine tRNA[Ser]Sec. RNA 9 923–930.
- Rozhdestvensky, T. S., A. M. Kopylov, J. Brosius and A. Huttenhofer, 2001. Neuronal BC1 RNA structure: evolutionary conversion of a tRNA(Ala) domain into an extended stem-loop structure. RNA 7 722–730.
- Schneider, A., and L. Marechal-Drouard, 2000. Mitochondrial tRNA import: are there distinct mechanisms? Trends Cell Biol. 10 509–513.
- Sun, F. J., S. Fleurdepine, C. Bousquet-Antonelli, G. Caetano-Anolles and J. M. Deragon, 2007. Common evolutionary trends for SINE RNA structures. Trends Genet. 23 26–33.
- Tyc, K., Y. Kikuchi, M. Konarska, W. Filipowicz and H. J. Gross, 1983. Ligation of endogenous tRNA 3′ half molecules to their corresponding 5′ halves via 2′-phosphomonoester, 3′,5′-phosphodiester bonds in extracts of Chlamydomonas. EMBO J. 2 605–610.
- Vincenti, S., V. De Chiara, I. Bozzoni and C. Presutti, 2007. The position of yeast snoRNA-coding regions within host introns is essential for their biogenesis and for efficient splicing of the host pre-mRNA. RNA 13 138–150.
- Wicker, T., F. Sabot, A. Hua-Van, J. L. Bennetzen, P. Capy et al., 2007. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 8 973–982.
- Wolin, S. L., and T. Cedervall, 2002. The La protein. Annu. Rev. Biochem. 71 375–403.