BMC Genomics. 2025 Jul 19;26:676. doi: 10.1186/s12864-025-11870-w

Evolutionary origin of the frameshift sites in the ribosomal frameshifting genes of Euplotes

Ruan-lin Wang 1,2, Xiao-yan Liu 1,2, Qing-yao Meng 1,2, Xin Zhang 1,2, Yue-jun Fu 1,2, Ai-hua Liang 1,2
PMCID: PMC12275403  PMID: 40684130

Abstract

Background

Programmed ribosomal frameshifting is a translational recoding event in which ribosomes slip forward or backward along the mRNA. Although genes that rely on programmed ribosomal frameshifting for their expression have been found in most organisms, such genes are commonly considered rare. However, previous studies indicated that both +1 and +2 ribosomal frameshifting are frequently required for the expression of genes in ciliates of the genus Euplotes. In this study, we explored the possible evolutionary origin of the frameshift sites by comparative transcriptome and genome analyses.

Results

We sequenced the transcriptomes of different Euplotes octocarinatus strains and performed comparative analyses of the frameshift sites. In total, 147 frameshift sites that were not conserved among the strains were identified. Multiple sequence alignments showed that +1 frameshift sites could be generated by random single-nucleotide insertion, and that +2 frameshift sites could be generated by insertion of ‘TA’ or by single-nucleotide deletion. In addition, frameshift sites tended to be less frequent in highly expressed genes. The distances between indel sites and frameshift sites were generally short. Indel sites located inside protein domains changed significantly fewer amino acids than indel sites located outside domains. Furthermore, we found a putative newly formed frameshift site that exists in the macronucleus but not in the micronucleus of Euplotes woodruffi.

Conclusions

We provide an overview of the evolutionary origin of frameshift sites in Euplotes. Our results suggest that the widespread ribosomal frameshifting in Euplotes may result from the long-term accumulation of indel mutations, and that these indel mutations apparently need to meet certain constraints to be preserved in the Euplotes genome.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-025-11870-w.

Keywords: Ribosomal frameshifting, Euplotes, Frameshift site, Evolutionary origin

Introduction

Ribosomes are cellular factories that catalyze protein synthesis, a strictly regulated process [1]. During the elongation phase, nucleotide triplets of an mRNA are sequentially translated into the amino acid sequence of a protein. Errors in maintaining the reading frame can lead to the formation of truncated or incorrect proteins [2]. To avoid this, cells have evolved sophisticated quality control mechanisms to ensure the fidelity of decoding and reading frame maintenance. Thus, spontaneous frameshifting is highly infrequent [3]. However, in special cases, ribosomes are induced to shift their reading frame upon encountering specific signals encoded in the mRNA. This process is termed programmed ribosomal frameshifting (PRF) [4]. In contrast to spontaneous frameshifting, PRF is a tightly controlled process that occurs with high efficiency. To achieve this efficiency, PRF often requires complex stimulatory signals such as slippery sequences, RNA pseudoknots, proteins, and small molecules such as polyamines [5–8].

PRF has been found in genes from a broad range of organisms, but usually only a handful of genes in a given organism utilize PRF for expression [9–11]. In contrast, unicellular eukaryotes of the genus Euplotes employ PRF at a large scale throughout their genomes [12–16]. This was initially observed in an analysis of the Euplotes gene sequences available in GenBank, which indicated that about 5% of genes required +1 ribosomal frameshifting to produce protein products [17]. Subsequent genome-wide investigations revealed that more than 10% of genes use ribosomal frameshifting in their expression [12, 14, 15]. In addition to its high frequency, ribosomal frameshifting in Euplotes is highly efficient, as revealed by similar ribosome densities upstream and downstream of frameshift sites [14]. Ribosome profiling data also suggested the presence of both +1 and +2 frameshifts in Euplotes, and that some genes have multiple frameshift sites [14].

Although PRF is strictly regulated by various stimulatory signals in other organisms, the sequence signals or stimulators required for frameshifting in Euplotes seem to be weaker [12, 14, 18]. The sole sequence signal needed for frameshifting in Euplotes is either of the two stop codons (UAA or UAG) in the slippery sequence. The RNA secondary structure that acts as a roadblock against the forward movement of the ribosome in other organisms is absent in Euplotes [12, 14]. Furthermore, the identity of the codon preceding the stop codon is important for distinguishing between +1 and +2 frameshifting [14, 19]. As for trans-acting factors, it has been shown experimentally that altering eukaryotic release factor 1 (eRF1) of Euplotes to ignore UGA simultaneously decreases its affinity for UAA and UAG, increasing the probability of frameshifting [20]. Moreover, the eukaryotic translation initiation factor 5A (eIF5A) was also reported to participate in +1 PRF regulation in E. octocarinatus [21]. Thus, the ribosomal frameshifting observed during mRNA translation in Euplotes seems strikingly different from canonical PRF in other organisms. It has been proposed that the default function of stop codons in Euplotes is frameshifting, whereas termination of translation occurs only near the 3′ ends of mRNAs and probably requires additional factors [14, 22].

While the fitness benefits of such frameshifting in Euplotes remain unclear, the accumulation of frameshift sites has been attributed to neutral evolution [18]. However, it is still unclear how this unusually large number of frameshift sites is generated in Euplotes. An initial comparative analysis of the telomerase reverse transcriptase (TERT) genes from different Euplotes species indicated that the +1 frameshift sites were generated by base insertions that create an AAA–TAA motif [23]. Recently, a comparative transcriptome study of eight Euplotes species also found several instances of insertions that disrupt the protein-coding reading frame [18]. However, these two studies offered only a few instances of insertions, and all were +1 frameshift sites. We reasoned that a comparative analysis of transcriptomes from different strains of the same Euplotes species might be more useful for exploring how frameshift sites are generated.

Previously, we performed a genome-wide investigation of PRF genes in E. octocarinatus (strain 69) through genome and transcriptome sequencing [12]. Here, we sequenced the transcriptomes of two additional E. octocarinatus strains (Zam5b-1 and VTN7) and explored how both +1 and +2 frameshift sites are generated. We show that +1 frameshift sites could be generated by single-nucleotide insertion, whereas +2 frameshift sites could be generated either by an insertion of ‘TA’ or by single-nucleotide deletion. Our results also suggest that the expression levels of transcripts containing frameshift sites were significantly lower than those of transcripts without frameshift sites. The indel mutations changed only a short segment of the encoded protein (up to 26 amino acids), and indel sites located inside protein domains changed significantly fewer amino acids than indel sites located outside domains. These may be necessary constraints for indel mutations to be preserved in the Euplotes genome. We also found a newly formed frameshift site that exists in the macronucleus but not in the micronucleus of Euplotes woodruffi.

Materials and methods

Cell culture, RNA isolation, and transcriptome sequencing

E. octocarinatus strains Zam5b-1 and VTN7 were obtained from the University of Pisa, Italy [24]. Both strains were cultured in cell culture flasks containing spring water at 22 °C and fed on the photosynthetic flagellate Chlorogonium elongatum. Before collection, cells were starved for 7 days to remove algal contamination. Cells were then collected by centrifugation (4,000 rpm, 3 min). Total RNA was isolated from asexually growing E. octocarinatus cells using a TRIzol plus purification kit (Thermo Fisher Scientific, Waltham, USA) following the manufacturer’s instructions. Stranded RNA-seq libraries were constructed using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, Massachusetts, USA) and sequenced on an Illumina NovaSeq 6000 at Biomarker Technologies Co., Ltd (Beijing, China).

Data resources

The genome and transcriptome data of E. octocarinatus strain 69 were obtained from our previous study [12]. The macronucleus genome (accession number: JAJLLT000000000) and micronucleus genome (accession number: JAJLLS000000000) of E. woodruffi were downloaded from the GenBank database. The raw RNA-seq data for E. woodruffi were downloaded from the NCBI Sequence Read Archive under accession numbers SRR21815378, SRR21815379, and SRR21815380. The raw micronucleus genome data for E. woodruffi were downloaded from the NCBI Sequence Read Archive under accession numbers SRR13607456 and SRR13607458.

Assembly of transcriptomes and detection of frameshift sites

The transcriptomes were assembled with Trinity (version 2.3.2) [25] using the default parameters. All transcripts were aligned to the macronucleus genome with PASA (version r20140417) [26] to remove algal and bacterial contamination. Frameshift sites were detected using the previously described method [12]. Specifically, all transcripts were aligned against the NCBI non-redundant (nr) protein database using BLASTX (E-value cut-off = 1 × 10⁻⁵). Transcripts with more than one aligned fragment in different reading frames within the same hit protein were extracted from the BLASTX results. A stop codon (TAA or TAG) was then searched for in the initial reading frame, and the ‘T’ or ‘TA’ of the stop codon was manually removed. The resulting artificially frameshifted transcript was aligned to the nr protein database again. If this produced a C-terminally extended protein, the stop codon was regarded as a frameshift site.
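The artificial frameshift step can be illustrated with a minimal Python sketch. It is not the code used in the study, where the removal was done manually; the function names `first_in_frame_stop` and `simulate_frameshift` and the toy sequences are assumptions made for illustration only.

```python
# Illustrative sketch only: names and toy sequences are hypothetical.
STOP_CODONS = ("TAA", "TAG")  # stop codons that act as frameshift signals in Euplotes


def first_in_frame_stop(seq, frame=0):
    """Return the index of the first TAA/TAG codon in the given frame, or -1."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return -1


def simulate_frameshift(seq, stop_index, shift):
    """Mimic a +1 or +2 frameshift by deleting 'T' (shift=1) or 'TA' (shift=2)
    from the start of the stop codon located at stop_index."""
    if shift not in (1, 2):
        raise ValueError("shift must be 1 or 2")
    if seq[stop_index:stop_index + 3] not in STOP_CODONS:
        raise ValueError("no TAA/TAG codon at stop_index")
    return seq[:stop_index] + seq[stop_index + shift:]


# Toy +1 site: ATG AAA TAA GCC -> ATG AAA AAG CC
plus1 = "ATGAAATAAGCC"
site = first_in_frame_stop(plus1)
shifted_plus1 = simulate_frameshift(plus1, site, 1)

# Toy +2 site: ATG TTA TAG GCA -> ATG TTA GGC A
plus2 = "ATGTTATAGGCA"
shifted_plus2 = simulate_frameshift(plus2, first_in_frame_stop(plus2), 2)
```

The edited sequence would then be searched against the protein database again; that BLASTX step is outside the scope of this sketch.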

Identification of non-conserved frameshift sites

To identify non-conserved frameshift sites among the three E. octocarinatus strains, the 100 bp upstream and downstream of each predicted frameshift site were extracted from each strain and aligned against the transcriptome databases of the other two strains using BLASTN (E-value cut-off = 1 × 10⁻²⁰). If the matched length of the best hit differed from the length of the query sequence, the query was checked manually through multiple sequence alignment with Clustal Omega (v1.2.3) [27]. To identify non-conserved frameshift sites between the MAC and MIC genomes of E. woodruffi, the 100 bp upstream and downstream of the predicted frameshift sites were extracted and aligned separately against the MAC and MIC genome databases using BLASTN (E-value cut-off = 1 × 10⁻²⁰). Completely matched sequences were ignored, and the others were checked manually. To further rule out sequencing errors in the third-generation sequencing data, we used BWA (version 0.7.17) to align the highly accurate short reads generated by second-generation Illumina sequencing of the MIC against the candidate sequence. The alignment results were visualized using Integrative Genomics Viewer (IGV) (version 2.18.4).
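As a rough illustration of the flank extraction, the following Python sketch cuts a query window around a predicted site. The function name `extract_flank` is hypothetical, and the choice to include the 3-nt stop codon in the window and to clip at transcript ends are assumptions, since the text does not specify either.

```python
def extract_flank(transcript, stop_index, flank=100):
    """Return up to `flank` bases on each side of the stop codon that starts
    at stop_index, with the 3-nt stop codon itself included (an assumption).
    The window is clipped at the transcript ends."""
    start = max(0, stop_index - flank)
    end = min(len(transcript), stop_index + 3 + flank)
    return transcript[start:end]


# Toy transcript: 150 A's, a TAA stop codon, then 150 C's.
toy = "A" * 150 + "TAA" + "C" * 150
query = extract_flank(toy, 150)
```

Each window would then serve as a BLASTN query against the other strains' transcriptomes.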

PCR validation of the frameshift sites in the genome of E. octocarinatus

Ten randomly selected frameshift sites that differed among the three E. octocarinatus strains were validated by PCR to confirm the transcriptome assemblies. Genomic DNA of the three E. octocarinatus strains was purified as described previously [12] and used as the template for PCR. Ten pairs of species-specific primers (Table S1) were designed to amplify products between 573 bp and 855 bp in length. All PCR products were visualized by electrophoresis on a 1.5% agarose gel, and bands of the expected size were extracted using a TIANgel Midi Purification Kit (TIANGEN, Beijing, China). The purified gel bands were Sanger sequenced by the Beijing Genomics Institute.

Gene expression analysis and domain annotation

For the expression analyses, the expression level of each transcript was calculated in transcripts per million (TPM) with Salmon (v1.4.0) [28] using the default parameters. For domain annotation, the ‘T’ or ‘TA’ within each frameshift site was artificially removed, and the resulting transcripts were translated into amino acid sequences using the GetOrf program in the EMBOSS package [29] with the Euplotid Nuclear Code. The amino acid sequences were then submitted to InterPro [30] for protein domain annotation.
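The effect of the Euplotid Nuclear Code on translation can be sketched in a few lines of Python. This is an illustration rather than a substitute for GetOrf. The codon table is written out from NCBI translation table 10, in which TGA encodes cysteine and only TAA and TAG are stops. The function name `translate` and the toy sequences are assumptions.

```python
from itertools import product

# NCBI translation table 10 (Euplotid Nuclear), codons ordered T, C, A, G
# at each position. It differs from the standard code only at TGA (Cys, not stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate every complete codon in the given frame; '*' marks TAA/TAG
    and 'X' marks codons containing ambiguous bases."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


# TGA is read as cysteine; TAA still appears as '*'.
peptide = translate("ATGTGAAAATAA")

# A toy +1 site before and after artificial removal of the 'T' of TAA.
before = translate("ATGAAATAAGCC")
after = translate("ATGAAAAAGCC")
```

In the toy +1 example, removing the ‘T’ lets translation read through the former stop codon position, which is the property used to prepare sequences for domain annotation.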

Results

Transcriptome sequencing and identification of frameshift sites

Programmed ribosomal frameshifting occurs during mRNA translation. Previously, we performed a comprehensive investigation of PRF genes in E. octocarinatus (strain 69) through genome and transcriptome sequencing [12]. Here, we sequenced the transcriptomes of two additional E. octocarinatus strains (Zam5b-1 and VTN7). Paired-end libraries (2 × 150 bp, Illumina NovaSeq 6000 platform) were constructed and sequenced (Table S2), and all high-quality reads were assembled using Trinity (version 2.3.2). In total, 55,505 and 52,505 transcripts were generated, with an N50 of 1,402 bp and 1,315 bp, for the Zam5b-1 and VTN7 strains, respectively (Table 1).

Table 1.

Comparison of transcriptome features of different E. octocarinatus strains

| Strain | Number of transcripts | Max transcript length (bp) | N50 (bp) | Assembly size (Mb) | Data source |
|---|---|---|---|---|---|
| Zam5b-1 | 55,505 | 13,229 | 1,402 | 60 | This study |
| VTN7 | 52,505 | 12,031 | 1,315 | 53 | This study |
| 69 | 32,353 | 16,868 | 1,578 | 42 | Wang et al. Scientific Reports, 2016 |
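For readers unfamiliar with the N50 statistic reported in Table 1, the sketch below shows one common way to compute it: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative, and the assembler's own statistics script may differ in details.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly: total length 32, so N50 is reached once 16 bases are covered.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```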

A similarity search-based method, which has proven robust for PRF identification in E. octocarinatus strain 69 [12], was used here to detect frameshift sites. Overall, 4,258 and 3,033 putative frameshift sites were identified in the Zam5b-1 and VTN7 strains, respectively. Similar to previous reports [14, 18], there were far more +1 frameshift sites than +2 frameshift sites in all three E. octocarinatus strains (Fig. 1A). Furthermore, 5′-AAA-TAR-3′ (R = A or G) was the most frequent motif of the +1 frameshift sites. For +2 frameshift sites, the motif 5′-NTA-TAR-3′ (N = A, T, G or C) was preferred (Fig. 1). Detailed information on the putative PRF transcripts, including frameshift type, coordinates of the predicted slippery site, slippery sequence, transcript length, GC content and BLASTX E-value, is listed in Tables S3 and S4.

Fig. 1.

Comparative analysis of the properties of frameshift sites in different E. octocarinatus strains. A The number of identified +1 and +2 frameshift sites in three E. octocarinatus strains investigated here. B-D Nucleotide conservation associated with +1 and +2 frameshift sites. Sizes of letters represent sequence conservation of each position. The analysis was based on the alignment of 15 bp upstream and downstream of the predicted frameshift sites

Different origins of +1 and +2 ribosomal frameshifting in Euplotes

All identified frameshift sites were compared with their homologous genes in the other strains. As expected, most frameshift sites were identical across the three E. octocarinatus strains, which indicates that they were generated and preserved in the ancestral genome. However, there were also many non-conserved frameshift sites, i.e. sites where only one or two of the three orthologous genes use frameshifting, including 134 +1 frameshift sites and thirteen +2 frameshift sites (Additional file 1). PCR amplification and sequencing of randomly selected sequences verified that these non-conserved frameshift sites are indeed present in the Euplotes genomes and are not sequencing errors or misassemblies (Figs. 2 and 3).

Fig. 2.

Generation of +1 frameshift sites through random single-nucleotide insertions. The DNA and predicted protein sequences in the vicinity of the frameshift sites are shown (highlighted with a red star in the phylogenetic tree), along with the corresponding regions from the other two E. octocarinatus strains. The Sanger sequencing results are displayed on the right. Asterisks indicate frameshift sites, and the red triangles denote the likely positions of the insertions associated with the generation of the frameshift sites. A-C, Randomly selected examples of different types of single-nucleotide insertions

Fig. 3.

Generation of +2 frameshift sites through “TA” insertions or single-nucleotide deletions. The DNA and predicted protein sequences in the vicinity of the frameshift sites are shown (highlighted with a red star in the phylogenetic tree), along with the corresponding regions from the other two E. octocarinatus strains. The Sanger sequencing results are displayed on the right. Asterisks indicate frameshift sites, and the red triangles denote the likely positions of the insertions or deletions associated with the generation of the frameshift sites. A, Randomly selected examples of “TA” insertions. B, Randomly selected examples of single-nucleotide deletions

The high sequence identity among homologous genes from different strains of the same species makes it easier to detect the DNA change responsible for producing a frameshift site in Euplotes. The 134 +1 frameshifting instances were divided into three groups according to the location of the mutation site relative to the frameshift site. The first group contains 86 instances in which the insertion of a single ‘T’ coincidentally created the frameshift motif, for example, a change from AAA-AAT to AAA-TAA-T (Fig. 2A). Given that frameshifting in Euplotes is highly efficient [14], this type of single-nucleotide insertion probably does not change the encoded protein. The second group contains 39 instances of random single-nucleotide insertion upstream of the frameshift site. This type of insertion disrupted the protein-coding reading frame, which was then restored by frameshifting at a downstream frameshift site (Fig. 2B). Hence, it changes only a short segment of the encoded protein between the insertion site and the frameshift site. The third group contains nine instances of single-nucleotide insertion downstream of the frameshift site that also created the frameshift motif, for example, a change from TTT-TGG to TTT-TAG-G (Fig. 2C). This situation may cause only a single amino acid change at the insertion site.
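The two motif-creating examples given above can be reproduced with a short Python sketch. It only checks whether a single-base insertion introduces an in-frame TAA or TAG codon; it does not model the upstream codon preferences of the motif. The helper names `insert_base` and `in_frame_stops` are hypothetical.

```python
STOP_CODONS = ("TAA", "TAG")


def insert_base(seq, pos, base):
    """Return seq with `base` inserted before index `pos`."""
    return seq[:pos] + base + seq[pos:]


def in_frame_stops(seq, frame=0):
    """Return the start indices of all TAA/TAG codons in the given frame."""
    return [
        i for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


# First-group example: AAA-AAT -> AAA-TAA-T after insertion of 'T'.
group1 = insert_base("AAAAAT", 3, "T")

# Third-group example: TTT-TGG -> TTT-TAG-G after insertion of 'A'.
group3 = insert_base("TTTTGG", 4, "A")
```

Neither original hexamer contains an in-frame stop codon, whereas both products carry one at index 3, directly after the first codon.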

The thirteen +2 frameshift sites can be divided into two groups. In the first group, an insertion of ‘TA’ formed a premature stop codon within a proper +2 frameshift motif (Fig. 3A). This type of mutation does not change the encoded protein. In the second group (three instances), a random single-nucleotide deletion resulted in the formation of a +2 frameshift site downstream (Fig. 3B). This type of mutation can change a short segment of the encoded protein between the deletion site and the frameshift site. In conclusion, +1 and +2 ribosomal frameshift sites in Euplotes have different origins: +1 frameshift sites result from single-nucleotide insertion, whereas +2 frameshift sites result from insertion of ‘TA’ or random single-nucleotide deletion.

Possible constraints enable the indel sites to be preserved in the genome of Euplotes

Frameshift mutations are generally considered to be deleterious in most organisms. Here, we identified indel mutations associated with ribosomal frameshifting in different E. octocarinatus strains, which supports the view that efficient frameshifting at stop codons makes Euplotes genes more resistant to single-nucleotide insertions in protein-coding regions [18]. We further explored which kinds of mutations were more likely to be retained in the Euplotes genome. The expression levels of transcripts with and without frameshift sites were compared. We found that the expression levels of frameshift genes were significantly lower than those of non-frameshift genes in all three E. octocarinatus strains (P-value < 0.05) (Fig. 4). This suggests that mutations causing ribosomal frameshifting tend to be retained in lowly expressed genes, which is consistent with previous reports [14, 18].

Fig. 4.

Expression rates of transcripts with (red) and without (grey) frameshift sites in three E. octocarinatus strains. All three P-values are lower than 0.05, demonstrating significantly lower expression rates in transcripts with frameshift sites. The P-values for difference comparison were calculated by the Mann-Whitney U test
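The Mann-Whitney U test used for this comparison can be sketched in plain Python as follows. This is not the authors' analysis code, and the statistical software they used is not stated. The sketch uses average ranks for ties and a two-sided normal approximation with tie correction but without continuity correction, so its P-values may differ slightly from exact values reported by statistics packages. The function names and toy values are assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided P-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:  # all observations identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Toy TPM-like values, not data from this study.
u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
```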

Furthermore, we analyzed the distances between the mutation sites and the frameshift sites. As shown in Fig. 5A, most mutation sites were close to the frameshift sites, and the distances between them were no more than 78 bp, meaning that at most 26 amino acids (78 bp / 3 bp per codon) could be changed by these indel mutations. A plausible explanation is that a longer distance would change more amino acids, which might affect the function of the encoded protein. We also investigated the location of the mutation sites relative to protein domains (Table S5). Twenty of the 39 mutation sites that change amino acids were located within a domain. However, mutation sites located inside a domain changed significantly fewer amino acids than mutation sites located outside a domain (Fig. 5B). Even when the mutation site with the highest number of changed amino acids was removed, the difference remained significant (P-value = 0.001).

Fig. 5.

Distribution characteristics of the non-conserved frameshift sites. A Distribution of the distances between the putative mutation sites and the frameshift sites. B Difference in the number of changed amino acids between transcripts with mutation sites located inside and outside a domain. ** indicates a significant difference (P-value < 0.01). The P-values were calculated by the Mann-Whitney U test

Continuous accumulation of frameshift sites in the macronucleus of E. woodruffi during vegetative growth

Like other ciliates, Euplotes possess two types of nuclei: a somatic macronucleus (MAC), which is transcriptionally active during vegetative growth, and a micronucleus (MIC), which transmits genetic information across sexual generations [31]. Euplotes reproduce both asexually and sexually during their lifetime. During vegetative growth, cells reproduce asexually by binary fission and the MAC divides amitotically. We hypothesized that long-term asexual reproduction in the laboratory may lead to the accumulation of new frameshift sites in the macronucleus, so that these sites differ between the MAC and the MIC. Recently, both the macronuclear and micronuclear genomes of E. woodruffi have been sequenced [32], which allowed us to test this hypothesis.

Raw RNA-Seq data of E. woodruffi were downloaded from the NCBI Sequence Read Archive and then assembled using Trinity [32]. In total, 56,170 transcripts were generated with an N50 of 575 bp. Using the method described above, we identified 1,567 putative frameshift sites, including 1,485 +1 frameshift sites and 82 +2 frameshift sites, in the transcriptome of E. woodruffi (Table S6). All identified frameshift sites were then aligned with the macronuclear and micronuclear genomes. Of the 1,567 frameshift sites, only one +1 frameshift site differed between the MAC and the MIC (Fig. S1A). To further rule out sequencing errors in the third-generation sequencing data, we used BWA to align the highly accurate short reads generated by second-generation Illumina sequencing of the MIC against this sequence. As shown in Fig. S1B, this single-nucleotide insertion is genuinely present. The insertion of a ‘T’ generates an ATT–TAA-T frameshift motif, which provides direct evidence that Euplotes could accumulate frameshift sites in their macronuclear genomes during continuous asexual reproduction.
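The kind of difference looked for between the MIC and MAC sequences, a single extra nucleotide in an otherwise identical region, can be expressed as a small Python check. It is a simplification of the BLASTN, manual inspection and read-alignment steps described above. The function name `find_single_insertion` is hypothetical, and the toy sequences are chosen only to reproduce the ATT–TAA-T motif, not taken from the E. woodruffi genome.

```python
def find_single_insertion(reference, variant):
    """If `variant` equals `reference` plus exactly one inserted base, return
    (position, base); otherwise return None. Within a run of identical bases
    the placement is ambiguous, and the rightmost equivalent position is
    reported."""
    if len(variant) != len(reference) + 1:
        return None
    i = 0
    while i < len(reference) and reference[i] == variant[i]:
        i += 1
    if reference[i:] == variant[i + 1:]:
        return i, variant[i]
    return None


# Hypothetical MIC-like and MAC-like fragments: ATT-AAT vs ATT-TAA-T.
hit = find_single_insertion("ATTAAT", "ATTTAAT")
```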

Discussion

The high frequency and high efficiency of ribosomal frameshifting in Euplotes make it distinct from other organisms [12, 14, 18]. In most known cases of +1 PRF, including eukaryotic ornithine decarboxylase antizymes and bacterial peptide release factor 2, the frameshift sequence and its occurrence are considerably conserved [33, 34]. For example, among 8,160 bacterial release factor 2 genes surveyed, the canonical ‘CTT TGA’ sequence is present in nearly every prfB frameshift sequence [35]. In contrast, the frameshift sites of Euplotes are extremely diverse and non-conserved. Ribosomal frameshifting is often used by only one of two orthologous genes from different Euplotes species [14]. Here, we found that even different strains of the same Euplotes species have different frameshift sites. It is therefore interesting to explore how such diverse frameshift sites have emerged and persisted in Euplotes.

+1 ribosomal frameshifting was the first type of frameshifting discovered in Euplotes and is the most frequent [14, 36]. Based on an alignment of the TERT gene sequences from seven different Euplotes species, Matthias et al. suggested that random base insertions that generate an AAA–TAA motif, or that bring such a motif into the suitable reading frame, were responsible for generating the frameshift site [23]. A subsequent comparative transcriptome study supported this view by identifying several instances of insertions that disrupt the protein-coding reading frame [18]. Here, we provide additional evidence that random single-nucleotide insertion is the main cause of +1 frameshift sites in Euplotes. By contrast, +2 ribosomal frameshifting was only recently discovered through high-throughput studies, which might be attributed to the small number of cases [14]. Here, we propose for the first time that +2 frameshift sites result from the insertion of ‘TA’ or from random single-nucleotide deletion (Fig. 3). Overall, our results suggest that indel mutation might be the main cause of frameshift sites in Euplotes, which may partly explain why these sites are so poorly conserved. Although there is no evidence yet, it is possible that other types of indel mutation, such as two-nucleotide deletions that generate a frameshift motif or bring such a motif into the proper reading frame, can also lead to the generation of frameshift sites.

In most organisms, indels in protein-coding regions are usually under strong negative selection because of their marked influence on the sequence of the synthesized protein [37, 38]. Euplotes appear to have evolved a remarkable ability that makes them more resistant to indel mutations. Highly efficient frameshifting at out-of-frame stop codons enables Euplotes to restore the reading frame downstream of the mutation site, thereby reducing the adverse effects of indel mutations. However, the current results and previous studies suggest that there are constraints on the types of indels that result in frameshift sites, so that not all indels can be preserved in the Euplotes genome. First, indels seem not to be tolerated within highly expressed genes. A previous study suggested that although ribosomal frameshifting does not affect the accuracy of the synthesized proteins, it reduces the translation speed of the ribosome [14]. Ribosomal frameshifting would therefore be harmful in highly expressed genes. Consistent with this view, we found that frameshift sites were less frequent in highly expressed genes (Fig. 4). Second, the indel site cannot be too far from the frameshift site. An excessively long distance would change a long segment of the encoded protein, which might impair its function. Finally, if the indel site is located in a key domain, the fewer amino acids it changes, the more likely it is to be preserved in the genome.

Programmed ribosomal frameshifting has been shown to play an important role in regulating gene expression in other organisms [39–41]. Given the frequency and non-conserved nature of frameshift sites in Euplotes, it is reasonable to conclude that ribosomal frameshifting is unlikely to be involved in the regulation of gene expression. A more likely scenario is that Euplotes acquired the ability of highly efficient frameshifting at stop codons through a hitherto unknown mechanism, allowing indel mutations to persist in the genome, which has led to the continuous accumulation of frameshift sites even if frameshifting is mildly deleterious. It will be interesting to further explore the key factors that endow Euplotes with highly efficient frameshifting. Stefanov et al. showed that eukaryotic release factor 1 (eRF1) from Euplotes can enhance the frameshifting rate at the premature stop codons found in two human disease-causing genes that contain an internal frameshift within their coding regions [42]. This also underscores the potential of eRF1 or other frameshifting-related regulatory factors from Euplotes for application in clinical gene therapy.

Conclusions

Ciliates of the genus Euplotes possess the most abundant ribosomal frameshifting described so far. Here, we analyzed the transcriptomes of three different E. octocarinatus strains and explored the evolutionary origin of both +1 and +2 frameshift sites. We suggest that frameshift sites may result from the insertion or deletion of one or more nucleotides in the genome. Several possible constraints allow indel sites to be preserved in Euplotes: occurrence in lowly expressed genes, a short distance between the indel site and the frameshift site, and a lower number of changed amino acids when the indel site is located in a domain. In the future, it will be interesting to clarify the key factors that endow Euplotes with the ability of highly efficient frameshifting.

Supplementary Information

Supplementary Material 1. (515.2KB, xlsx)
Supplementary Material 2. (153.2KB, txt)
Supplementary Material 3. (510.2KB, docx)

Acknowledgements

None.

Abbreviations

PRF

programmed ribosomal frameshifting

MAC

macronucleus

MIC

micronucleus

NCBI

National Center for Biotechnology Information

BLAST

Basic Local Alignment Search Tool

TPM

transcripts per million

Authors’ contributions

R.W. conceived the work. X.L., R.W. and Q.M. performed the bioinformatics analyses. X.L. and X.Z. performed the experiments. R.W. and X.L. wrote the manuscript. Y.F. and A.L. reviewed and edited the manuscript. All authors read and approved the final manuscript.

Funding

This project was supported by grants from the National Natural Science Foundation of China (No. 32270447) and the Natural Science Foundation of Shanxi Province (No. 20220302121320) to R. Wang.

Data availability

All RNA-Seq data are available in the China National Center for Bioinformation under BioProject accession number PRJCA032545.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Articles from BMC Genomics are provided here courtesy of BMC
