Small RNAs from mitochondrial genome recombination sites are incorporated into T. gondii mitoribosomes

Sabrina Tetzlaff; Arne Hillebrand; Nikiforos Drakoulis; Zala Gluhic; Sascha Maschmann; Peter Lyko; Susann Wicke; Christian Schmitz-Linneweber

doi:10.7554/eLife.95407

eLife. 2024 Feb 16;13:e95407. doi: 10.7554/eLife.95407

Editors: Marisa Nicolás³, Lori Sussel⁴

PMCID: PMC10948144 PMID: 38363119

Abstract

The mitochondrial genomes of apicomplexans comprise merely three protein-coding genes, alongside a set of thirty to forty genes encoding small RNAs (sRNAs), many of which exhibit homologies to rRNA from E. coli. The expression status and integration of these short RNAs into ribosomes remain unclear, and direct evidence for active ribosomes within apicomplexan mitochondria is still lacking. In this study, we conducted small RNA sequencing on the apicomplexan Toxoplasma gondii to investigate the occurrence and function of mitochondrial sRNAs. To enhance the analysis of sRNA sequencing outcomes, we also re-sequenced the T. gondii mitochondrial genome using an improved organelle enrichment protocol and Nanopore sequencing. It has been established previously that the T. gondii mitochondrial genome comprises 21 sequence blocks that undergo recombination among themselves but that their order is not entirely random. The enhanced coverage of the mitochondrial genome allowed us to characterize block combinations at increased resolution. Employing this refined genome for sRNA mapping, we find that many small RNAs originate from the junction sites between protein-coding blocks and rRNA sequence blocks. Surprisingly, such block border sRNAs were incorporated into polysomes together with canonical rRNA fragments and mRNAs. In conclusion, apicomplexan ribosomes are active within polysomes and are indeed assembled through the integration of sRNAs, including previously undetected sRNAs with merged mRNA-rRNA sequences. Our findings lead to the hypothesis that T. gondii’s block-based genome organization enables the dual utilization of mitochondrial sequences as both messenger RNAs and ribosomal RNAs, potentially establishing a link between the regulation of rRNA and mRNA expression.

Research organism: None

Introduction

Apicomplexan parasites are a group of intracellular protozoan pathogens that can cause infectious diseases, including malaria and toxoplasmosis. Toxoplasmosis is caused by Toxoplasma gondii, one of the most widespread human parasites, with a seroprevalence of more than one-quarter of humans worldwide (Molan et al., 2019). For many experimental questions, T. gondii remains an ideal model system for studying the molecular biology of Apicomplexa (Szabo and Finney, 2017). Given the economic and clinical impact of T. gondii and its apicomplexan relatives, there is continuous interest in distinctive biochemical and cellular features as potential targets for therapeutic intervention.

Apicomplexan cells possess a single mitochondrion that is respirationally active and essential for parasite survival (MacRae et al., 2012; Melo et al., 2000; Seeber et al., 1998; Vercesi et al., 1998). With only three exceptions, all mitochondrial proteins are nuclear-encoded and these nuclear genes contribute strongly to P. berghei and T. gondii cell fitness (Bushell et al., 2017; Sidik et al., 2016). Dozens of apicomplexan mitochondrial genomes have been sequenced (Berná et al., 2021b). These sequences showcase the extreme reductive evolution in apicomplexan mitochondria, setting records for the smallest mitochondrial genomes known to date, ranging in length from 6 to 11 kb (Hikosaka et al., 2013; Oborník and Lukeš, 2015). Mitochondrial genome organization is evolutionarily very variable, with some species having a linear, monomeric genome (Babesia spp.), while others have concatenated arrays of genomes (Hikosaka et al., 2013). The T. gondii mitochondrial genome is composed of 21 repetitive sequence blocks that are organized on multiple DNA molecules of varying lengths and non-random combinations (Berná et al., 2021a; Namasivayam et al., 2021). Apicomplexan mitogenomes code only for three proteins (COB, COXI, COXIII) and have highly fragmented genes for rRNAs (Feagin et al., 2012; Seeber et al., 2020). For example, in P. falciparum, 34 genes for rRNA fragments are scattered across the mitochondrial DNA on both strands of the genome without any linear representation of the full-length large or small subunit rRNA (Feagin et al., 2012). How the multitude of rRNA fragments is assembled into functional mitoribosomes in T. gondii remains unknown.

Since the electron transport chain in mitochondria is essential for the survival of apicomplexans, the expression of mitochondrial genes is likely to be essential as well. In fact, a nuclear-encoded RNA polymerase targeted to mitochondria has been shown to be essential in the blood stages of Plasmodium spp. (Ke et al., 2012; Suplick et al., 1990). Two mitochondrially localized RNA binding proteins from the RAP (RNA-binding domain abundant in apicomplexans) family are also essential for the survival of P. falciparum, although their role in RNA metabolism has not yet been determined (Hollin et al., 2022). In addition, the depletion of nuclear-encoded mitoribosomal proteins of T. gondii (Lacombe et al., 2019; Shikha et al., 2022) and P. falciparum (Ke et al., 2018) led to defects in the assembly of ETC complexes and in parasite proliferation, suggesting that mitochondrial translation is important for parasite survival. Resistance to the antimalarial drug atovaquone in P. falciparum and T. gondii has been linked to mutations in the cob (cytB) gene of mitochondria (McFadden et al., 2000; Srivastava et al., 1999; Syafruddin et al., 1999), further supporting the idea of active, essential translation in apicomplexan mitochondria.

Despite the clear importance of mitogenome expression in Apicomplexa, we lack a comprehensive catalog of transcripts and processing events. Genome-length, polycistronic transcripts are produced in Plasmodium that encompass all three open reading frames (Ji et al., 1996). The existence of full-length mRNAs was further confirmed by cDNA amplifications (Namasivayam et al., 2021) and long-read sequencing of mRNAs in both Plasmodium and Toxoplasma (Lee et al., 2021; Namasivayam et al., 2021). There is no evidence of dedicated transcription initiation for individual protein-coding genes, and it is assumed that mRNAs are processed from the polycistronic precursor (Feagin et al., 2012; Hillebrand et al., 2018; Rehkopf et al., 2000; Suplick et al., 1990), but the mechanism of processing and the required machinery are unknown. Similarly, the production of rRNA fragments is achieved post-transcriptionally by processing long precursor RNAs (Ji et al., 1996). A catalog of small RNAs, mostly rRNA fragments, was described for P. falciparum based on small RNA sequencing (Hillebrand et al., 2018), but in other apicomplexans, including T. gondii, our understanding of rRNA species is limited, and mostly based on predictions through sequence comparisons with Plasmodium (Feagin et al., 2012; Namasivayam et al., 2021).

We used a slightly modified protocol of mitochondria enrichment (Esseiva et al., 2004) to investigate the structure of the mitochondrial genome through long-read sequencing at an unprecedented depth. In parallel, we conducted high-throughput sequencing of small RNAs (<150 nt) and demonstrated that they are incorporated into polysomes. The combination of DNA sequencing results and transcriptome analysis also allowed us to identify previously undetected transcripts, many of which originate from block boundaries and represent fusions of coding and noncoding regions.

Results

T. gondii organellar nucleic acids are enriched by a combination of selective membrane disruption and degradation of nucleo-cytosolic DNA and RNA

The T. gondii nuclear genome contains many insertions of mitochondrial DNA sequences (Gjerde, 2013; Namasivayam et al., 2023; Ossorio et al., 1991). To better distinguish between NUMTs (nuclear DNA sequences that originated from mitochondria) and true mitochondrial sequences, it is helpful to enrich mitochondrial DNA. We modified a cell fractionation method that takes advantage of the differential sterol content in plasma membranes and organellar membranes (Esseiva et al., 2004; Subczynski et al., 2017). We incubated cells with the detergent digitonin, which selectively permeabilizes sterol-rich membranes, leaving mitochondria and other organelles intact. We used a T. gondii strain that constitutively expresses GFP localized to the mitochondrial matrix to track the mitochondria during the purification process (Figure 1A). After permeabilizing the plasma membrane with digitonin, we treated the cells with benzonase to degrade nucleo-cytosolic nucleic acids. Following the inactivation of the benzonase, we lysed the mitochondria with high concentrations of detergents to release the soluble content.

Figure 1. — (A) *T. gondii* tachyzoites were harvested and incubated in a buffer with digitonin for plasma membrane permeabilization. Subsequently, accessible nucleic acids were digested by benzonase. After removal of the benzonase and washing the pellet, the intact organelles were lysed by a high detergent buffer. Soluble nucleic acids were separated by centrifugation and extracted from the supernatant. (B) *T. gondii* cells expressing a mitochondrial-targeted GFP were subjected to organelle enrichment. GFP was tracked by immunoblotting (5% volume of each fraction was analyzed). The GFP signal remained in the pellet and only shifted into the supernatant after the lysis step. In contrast, much of the cytosolic HSP70 is removed during the procedure, indicating specific enrichment of mitochondria. (C) RNA extracted from selected fractions of the organelle enrichment protocol was analyzed by agarose gel electrophoresis. (D) RNA extracted from selected fractions of the organelle enrichment protocol was analyzed on a denaturing 10% PAGE gel and blotted onto a nylon membrane. Radiolabeled DNA oligonucleotide probes were used to detect the mitochondrial rRNA fragments SSUD (from the small subunit of the ribosome) and LSUF (large subunit of the ribosome). Both fragments were found in pellet fractions after digitonin treatment, where they were protected from benzonase digestion. They only shifted to the supernatant after lysis. This demonstrates that the mitochondria stay intact during the procedure.

Figure 1—source data 1. Raw gel and blot images.
Uncropped blots and gels accompanied by images indicating the areas shown in Figure 1B–D with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name. Additionally, for immunoblots light image overlays depicting the membrane outline are provided.

elife-95407-fig1-data1.zip (38.2 MB, zip)

We observed that mitochondrial GFP remained in the pellet fraction (P in Figure 1B) with only a minor signal in the supernatant, indicating that the mitochondria remained largely intact. In contrast, cytosolic HSP70 largely remained in the supernatant after digitonin and benzonase treatment, indicating that the method successfully reduced cytoplasmic proteins. Upon lysis of the mitochondria using Triton X-100 (‘Lysis’ in Figure 1B), the GFP signal was predominantly recovered in the supernatant, demonstrating efficient lysis of the mitochondria and release of soluble components. This fraction still contains HSP70, but slight contamination with cytosol is expected. We also found that cytosolic 18S and 26S rRNA were almost completely degraded, indicating the high efficacy of the benzonase treatment (Figure 1C). In summary, our enhanced protocol for organellar enrichment effectively depletes cytosolic protein and RNA components and enriches mitochondrial macromolecules.

To assess the integrity of mitochondrial RNA following our protocol, we utilized small RNA gel blot hybridization for two known mitochondrial rRNA fragments, SSUD and LSUF (Namasivayam et al., 2021). Our analysis of total nucleic acids in the gel revealed effective removal of major rRNA species by benzonase treatment (Figure 1C), while smaller degradation products accumulated (Figure 1D lanes ‘Benz’ and ‘Lysis’). In contrast, we observed clear bands for both SSUD and LSUF, indicating their protection from benzonase degradation. Notably, we also detected signals for both rRNA fragments in the supernatant of the lysis fraction, confirming successful mitochondrial lysis and recovery of mitochondrial rRNA fragments.

Long-read sequencing of DNA from enriched mitochondrial fractions identifies prevalent sequence block combinations

For the analysis of mitochondrial sRNAs, we needed a reference mitochondrial genome that would represent the existing sequence block combinations. A previous sequencing study used Oxford Nanopore sequencing technology (ONT) to identify combinations of sequence blocks in T. gondii mitochondria (Namasivayam et al., 2021). To improve read depth, we performed ONT sequencing on DNA obtained with our improved organelle enrichment protocol. After filtering against the nuclear genome, we found 86,761 reads that had similarities to the published mitochondrial genome sequence (Namasivayam et al., 2021). This corresponds to 78.5 Mbp, which accounts for 9.2% of the total sequences that we attribute to the nucleus and mitochondria (Supplementary file 1). This is a 42-fold increase in the sequencing depth of the T. gondii mitochondrial genome compared to previous attempts (Supplementary file 1), which can be attributed to the effectiveness of the purification process. The length of mitochondrial reads ranged from 87 nt to 17,424 nt (Supplementary file 2). The GC content of these reads, on average 36.4%, further supports their mitochondrial origin (Berná et al., 2021a). We annotated the reads using published sequence block information (Namasivayam et al., 2021) and confirmed the previously described 21 sequence blocks, designated by letters from A to V (Namasivayam et al., 2021). We searched the sequencing data for repeated sequence elements but found no additional blocks, suggesting that the T. gondii mitochondrial genome is fully covered. We compared our reads with published ONT reads for the T. gondii mitochondrial genome (Berná et al., 2021a; Namasivayam et al., 2021). While smaller reads of our dataset are found in full within longer reads in the published datasets, we do not find any examples of reads that would be full matches between the datasets.
This suggests that although the block content is identical between the different sequencing results, the actual block order has been reshuffled by recombination. This points to rapid mitogenome evolution in T. gondii and corroborates, at deeper sequencing coverage, conclusions previously drawn from more limited read sets (Berná et al., 2021a; Namasivayam et al., 2021).
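The read statistics reported above imply two further quantities that are not stated explicitly: the mean length of a mitochondrial read and the combined nuclear plus mitochondrial yield. The following back-of-the-envelope sketch derives both from the reported numbers only (86,761 reads, 78.5 Mbp, 9.2%); the function and variable names are illustrative and not part of the original analysis.

```python
# Reported values from the ONT run (see text above).
mito_reads = 86_761   # reads with similarity to the mitochondrial genome
mito_bases = 78.5e6   # 78.5 Mbp of mitochondrial sequence
mito_share = 0.092    # 9.2% of all bases attributed to nucleus + mitochondria


def mean_read_length(total_bases, n_reads):
    """Average number of bases per read."""
    return total_bases / n_reads


def implied_total_bases(part_bases, share):
    """Total yield implied by a part and its fractional share of the total."""
    return part_bases / share


mean_len = mean_read_length(mito_bases, mito_reads)
total = implied_total_bases(mito_bases, mito_share)

print(f"mean mitochondrial read length: {mean_len:.0f} nt")
print(f"implied nuclear + mitochondrial yield: {total / 1e6:.0f} Mbp")
```

Under these numbers, the mean mitochondrial read is roughly 900 nt long and the combined yield is on the order of 850 Mbp; both values are derived here and should be checked against Supplementary files 1 and 2.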

The depth of our sequencing enabled us to quantify the genome block combinations. We first asked whether there is a bias in sequential block succession. We observed that the rate of occurrence of blocks varies dramatically in our dataset, with the most frequent block, J, occurring 45,849 times and the least frequent, C, occurring 3,138 times (Figure 2A, Supplementary file 3). This suggests that differences in block combinations can be expected. After counting block combinations, we found that only a small fraction of all possible block combinations are prevalent within the genome (Figure 2—figure supplement 1A, Supplementary file 4), confirming block combinations observed previously (Namasivayam et al., 2021) and adding combination frequencies based on higher read numbers. For example, the most frequent block combination is J-B, which occurs 19,622 times in our reads. In total, we identified 84 combinations, of which 52 occur fewer than 50 times and make up less than 0.06% of the total number of combinations found (Supplementary file 4). Thirteen blocks (R, N, M, F, L, U, I, C, K, D, H, E, T) are always found with the same preceding and following block. The direction of the blocks in these combinations is fixed (for example, the 3’-end of the L block is always fused to the 5’-end of the J block, never to the 3’-end of the J block). This indicates that the genome’s flexibility is limited and that not all block combinations are realized (see also Namasivayam et al., 2021). Variability in genome structure is caused primarily by blocks S, O, Q, B, and P, which have up to three possible neighbors, and by blocks Fp, Kp, V, A, and J, which have more than three possible neighbors. All combinations are well covered in our ONT results and helped to refine block borders relative to previous annotations (Figure 2—figure supplement 1, Figure 2—figure supplement 2).
Using the 32 high-frequency block combinations we found, we generated a map centered on the protein-coding genes (Figure 2B). We will later show that all blocks not encoding proteins express rRNA fragments and are thus called rRNA blocks here. The map is not designed to represent physically existing DNA molecules, since the sequencing technology used cannot detect branched or circular DNAs, nor are the blocks drawn to scale. However, it helps to indicate the complexity of the nonrandom recombination events shaping the genome. For example, when considering the positioning of the three protein-coding genes within this map, it is conspicuous that their three 5'-ends are always flanked by blocks also encoding a protein (Figure 2B), although it is unclear whether this is of functional relevance.
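The counting of block combinations described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes that each ONT read has already been annotated as an ordered list of `(block, strand)` tuples, and all function names are hypothetical. Because a read can derive from either DNA strand, each junction is reduced to a strand-independent key before counting. The default threshold of 50 occurrences mirrors the cut-off mentioned in the text.

```python
from collections import Counter


def flip(strand):
    """Return the opposite strand symbol."""
    return "-" if strand == "+" else "+"


def canonical_junction(left, right):
    """Strand-independent key for the junction left->right.

    Reading A+ B+ on one strand is the same junction as reading
    B- A- on the other strand, so both map to the same key.
    """
    forward = (left, right)
    reverse = ((right[0], flip(right[1])), (left[0], flip(left[1])))
    return min(forward, reverse)


def count_junctions(reads):
    """Count every pair of adjacent blocks across all annotated reads."""
    counts = Counter()
    for blocks in reads:
        for left, right in zip(blocks, blocks[1:]):
            counts[canonical_junction(left, right)] += 1
    return counts


def frequent_junctions(counts, min_count=50):
    """Keep only junctions observed at least min_count times."""
    return {j: n for j, n in counts.items() if n >= min_count}


# Toy input: two reads covering the J-B junction from opposite strands.
toy_reads = [
    [("L", "+"), ("J", "+"), ("B", "+")],
    [("B", "-"), ("J", "-")],
]
toy_counts = count_junctions(toy_reads)
```

In this toy example, the J-B junction is counted twice although the two reads report it in opposite orientations, which is the behavior needed to obtain orientation-aware combination frequencies.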

Figure 2. — (A) Oxford Nanopore sequencing technology (ONT) DNA sequencing results were analyzed for the number of block occurrences. The gray scale indicates how many times each of the mitochondrial sequence blocks was found in the mitochondrial ONT reads. (B) Map of the *T. gondii* mitochondrial genome representing block combinations occurring in mitochondrial ONT reads. The thickness of the connecting lines indicates the absolute number of occurrences for each combination.

Full-length coding regions had been found previously on nanopore reads in T. gondii (Namasivayam et al., 2021). In our improved representation of the mitogenome, we identified a large number of full representations of cob, coxI, and coxIII in our dataset (cob: 1612; coxI: 1404; coxIII: 1487). Thus, of all reads long enough to carry one of the following ORFs, 5.1% contain full-length coxIII, 8.6% contain full-length cob, and 11.1% contain full-length coxI (Supplementary file 5). This may be adequate for the expression of the encoded proteins; however, we cannot currently exclude the possibility of genomic or post-transcriptional block shuffling that could lead to more complete open reading frames (as discussed in Berná et al., 2021b). We next applied approaches established previously to represent biased recombination events based on alternative block combinations (see Figure S12 in Namasivayam et al., 2021) to our improved ONT read set. For example, block S, as part of the coxI coding region, was followed in 80% of its occurrences by the rRNA block R and only in 20% by the next coding block C (Figure 2B, Figure 2—figure supplement 1B). Similarly, block B, as part of the coxIII reading frame, was followed in 66% of all cases by the rRNA block U and less frequently by the terminal block of coxIII, which is block M. Whether such fusions of coding and noncoding blocks are of functional relevance has been unclear so far.
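The successor frequencies quoted above (for example, block S followed by R or by C) can be derived from annotated reads with a few lines of code. The sketch below is illustrative only and assumes the same hypothetical read representation as an ordered list of `(block, strand)` tuples. A block read on the minus strand has its downstream neighbor at the preceding position in the read, which the function takes into account.

```python
from collections import Counter


def successor_fractions(reads, block):
    """Fraction of occurrences in which `block` is followed by each neighbor.

    'Followed' is defined relative to the block's own 5'->3' direction:
    for a '+' occurrence the successor is the next block in the read,
    for a '-' occurrence it is the previous block in the read.
    Occurrences at a read end without a downstream neighbor are skipped.
    """
    counts = Counter()
    for blocks in reads:
        for i, (name, strand) in enumerate(blocks):
            if name != block:
                continue
            if strand == "+" and i + 1 < len(blocks):
                counts[blocks[i + 1][0]] += 1
            elif strand == "-" and i > 0:
                counts[blocks[i - 1][0]] += 1
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()} if total else {}


# Toy input (not real data): four reads with S->R, one read showing S->C
# from the opposite strand.
toy_reads = [[("S", "+"), ("R", "+")]] * 4 + [[("C", "-"), ("S", "-")]]
toy_fractions = successor_fractions(toy_reads, "S")
```

With this toy input, the function returns a fraction of 0.8 for R and 0.2 for C, matching the kind of summary given for block S in the text.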

Small RNA sequencing identifies a comprehensive set of mitochondrial rRNA fragments

For T. gondii mitochondria, rRNA species were previously predicted based on sequence comparisons to Plasmodium rRNA sequences (Namasivayam et al., 2021). We used Illumina-based small RNA sequencing of T. gondii to identify mitochondrial rRNA fragments. We sequenced both the mitochondria-enriched fraction and the input fractions described in Figure 1. In the enriched fraction, we mostly retrieved reads for cytosolic rRNA degradation products and poor coverage of mitochondrial sequences (not shown). This is likely caused by the benzonase step in the protocol, which generates an abundance of small degradation products of cytosolic rRNA (Figure 1D; Figure 1—figure supplement 1). In the input samples, intact full-length rRNAs are removed by selecting small RNAs during library preparation, resulting in less prevalent cytosolic rRNA reads. We bioinformatically removed the remaining cytosolic rRNA reads as well as any reads shorter than 20 nt. Poly-A stretches at the 3’-ends were clipped as well. Finally, we mapped the remaining reads against the mitochondrial genome (Supplementary file 6) using the block combinations identified here by ONT sequencing (Figure 2).
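The read preprocessing described above (clipping 3’ poly-A stretches and discarding reads shorter than 20 nt) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used here: the minimum tail length of four adenosines (`min_tail`) and the order of the two steps (clipping before length filtering) are assumptions, and the function names are hypothetical. Removal of cytosolic rRNA reads is not shown.

```python
import re


def clip_poly_a(seq, min_tail=4):
    """Remove a terminal run of at least min_tail A's from the 3' end."""
    match = re.search(r"A{%d,}$" % min_tail, seq)
    return seq[:match.start()] if match else seq


def preprocess(reads, min_len=20, min_tail=4):
    """Clip 3' poly-A stretches, then drop reads shorter than min_len nt."""
    kept = []
    for seq in reads:
        clipped = clip_poly_a(seq.upper(), min_tail)
        if len(clipped) >= min_len:
            kept.append(clipped)
    return kept


# Toy input: one read with a poly-A tail that stays long enough after
# clipping, and one read that becomes too short.
toy_reads = ["ACGTACGTACGTACGTACGTACAAAAAA", "acgtaaaaa"]
toy_kept = preprocess(toy_reads)
```

Clipping before filtering ensures that the 20 nt threshold applies to the templated part of each read, which seems the more conservative choice when read ends are later used to define transcript termini.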

The results of our mapping study show that the 5'-termini of mitochondrial small RNAs are clearly defined, making it easy to identify their transcript ends (as shown in Figure 3A). In agreement with findings from previous studies on small RNAs from P. falciparum (Hillebrand et al., 2018), we found that the 3'-ends of many transcripts were more variable (as seen in the example of RNA17 in Figure 3A). We identified a total of 34 small RNAs that accumulate in T. gondii mitochondria (Supplementary file 7). Among these, 11 correspond to previously predicted rRNA fragments (Namasivayam et al., 2021). Our sequencing data confirmed that these sequences are expressed and allowed us to refine the exact rRNA fragment ends (Supplementary file 7). Additionally, our small RNA sequencing revealed some larger differences compared to previous transcript predictions. This included a reassignment of SSUF to the opposite strand and also affected the four rRNA fragments LSUF, LSUG, LSUD, and LSUE, which had been predicted as separate transcripts of the large subunit (LSU) of the ribosome (Namasivayam et al., 2021). Our sequencing results suggest that there is an accumulation of transcripts containing LSUF and LSUG regions and LSUD and LSUE regions, respectively. Both of these transcripts were verified via northern blotting (Figure 3—figure supplement 1). The longer transcripts were found to be much more abundant than smaller transcripts that were also detected (Figure 3—figure supplement 1), suggesting that the longer transcripts represent the functional rRNA fragments in T. gondii mitochondria. Knowing the exact sequence of rRNA fragments is crucial for further investigations into the structure of mitoribosomes, as it is key for predictions of secondary structures.
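The contrast between sharply defined 5’-termini and more variable 3’-ends can be quantified by counting where mapped reads begin and end. The sketch below is a simplified illustration: it assumes alignments are given as `(start, end, strand)` with 1-based inclusive coordinates, and it uses the share of reads at the most common terminus as an ad hoc measure of how well an end is defined. Neither the representation nor the measure is taken from the original analysis.

```python
from collections import Counter


def terminus_counts(alignments):
    """Count 5' and 3' terminal positions of aligned reads.

    For plus-strand reads the 5' end is the start coordinate;
    for minus-strand reads it is the end coordinate.
    """
    five, three = Counter(), Counter()
    for start, end, strand in alignments:
        if strand == "+":
            five[start] += 1
            three[end] += 1
        else:
            five[end] += 1
            three[start] += 1
    return five, three


def modal_fraction(counts):
    """Share of reads whose terminus lies at the most common position."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts.most_common(1)[0][1] / total


# Toy input: four reads with an identical 5' end and scattered 3' ends.
toy_alignments = [(100, 141, "+"), (100, 140, "+"), (100, 139, "+"), (100, 141, "+")]
toy_five, toy_three = terminus_counts(toy_alignments)
```

In the toy example, all reads share the same 5’ position (modal fraction 1.0), whereas only half of them share the most common 3’ position (modal fraction 0.5), which is the pattern described for RNA17.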

Figure 3. — (A) Upper three rows: excerpt of mapping results in three strand-specific small RNA sequencing replicates starting from total input RNA from our *T. gondii* organelle RNA preparation. Read depth was counted by the number of reads at each position. Note that the 5’-ends of the RNAs shown here are on the right. Lower four rows: only the terminal nucleotides of each read on the plus and minus strands of the genome are shown. The size and position of genes below the coverage graphs are all drawn to scale. Red bars with lowercase letters indicate the positions of probes used for detecting RNAs in RNA gel blot hybridizations shown in (B). (B) Equal quantities of total input RNA from *T. gondii* and from organelle preparations were loaded onto a denaturing PAGE gel and analyzed by RNA gel blot hybridization with the probes indicated.

Figure 3—source data 1. Raw blot images.
Uncropped blots accompanied by images indicating the areas shown in Figure 3B with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig3-data1.zip (10.7 MB, zip)

Discovery of previously undetected rRNA fragments

sRNAs cover most of the noncoding sequence blocks of the T. gondii genome

Out of the 34 small RNA fragments identified, 23 have not been described previously for T. gondii (see Supplementary file 7 and Figure 3—figure supplement 2). Seventeen of the 23 are homologs of sRNA fragments in the apicomplexans P. falciparum and E. leuckarti (marked in bold in Supplementary file 7). These were named according to their Plasmodium homologs (Feagin et al., 2012). Six further sRNAs are exclusively conserved within cyst-forming Eucoccidians, the closest apicomplexan relatives of T. gondii (last column in Supplementary file 7). We assigned numbers to these sRNAs extending the Plasmodium nomenclature (Feagin et al., 2012). We next asked how many of the 34 sRNAs have homologies to rRNA from E. coli. We found twelve LSU homologs and eleven SSU homologs, in accordance with previous analyses in P. falciparum (Feagin et al., 2012; Hillebrand et al., 2018; Supplementary file 8). Only for the P. falciparum LSU fragment LSUC and the SSU fragment RNA12 were we unable to identify corresponding homologs in T. gondii.

We next analyzed the accumulation of selected RNAs previously undetected in T. gondii for the sequence blocks Kp-K. Transcripts RNA5, RNA17, and RNA29, as well as the already predicted transcripts RNA10 and SSUD, are found on the minus strand (strandedness according to Namasivayam et al., 2021; MN077088.1 - MN077111.1). They are separated by single nucleotides in the mitochondrial genome. In this and most other noncoding blocks, the DNA sequence is almost fully used for transcript production (see Figure 3A, Figure 3—figure supplement 2, Figure 3—figure supplement 3). To validate the sequencing data, we performed RNA gel blot hybridization using probes against RNA17, RNA29, and RNA5 (Figure 3B) and indeed detected transcripts of the expected size (41 nt, 42 nt, and 83 nt, respectively). The transcripts were more abundant in our organelle preparations than in input samples, which indicates that they are of organellar origin and did not originate from NUMTs. The sequencing and RNA gel blot efforts demonstrate that almost the entire sequence of the blocks Kp-K is represented by small RNAs.

Dual use of block borders for sRNA production

Among the small RNAs identified here, there is also a class that was only detectable due to our insights into genome block combinations. Using block combinations for mapping analysis, we identified 15 transcripts that span two blocks, i.e., there are sRNAs at almost half of all 31 block borders. Often, we find sRNAs that share a block border sequence at one end, but differ at the other end depending on the block combination (see Figure 4A, orange boxes). An example of sequences shared at the 5'-end is represented by the pair RNA8 and RNA31, which both start in block Fp but end in different blocks (Figure 4A). An example of block sharing at the 3'-end is the RNA pair RNA1/RNA2. Both RNAs terminate at very similar positions in block Kp but start in different blocks (Figure 4A). Using a probe for the common 3'-part in block Kp, we detected both RNA1 and RNA2 in an RNA gel blot hybridization experiment (Figure 4A and B). Both RNA1 and RNA2 have homology to E. coli LSU in their 3'-portion via block Kp, but only RNA1 has additional homology to E. coli LSU via block I (Figure 4C; Feagin et al., 2012). It is interesting to consider that RNA1 could, therefore, be used in two positions in the large subunit of the ribosome. In conclusion, block combinations can lead to the expression of RNAs in T. gondii that are not found in apicomplexan species with a simpler genome organization (Supplementary file 7).
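Whether a transcript spans a block border can be decided from its coordinates on a reference built from a given block combination. The following sketch is a hypothetical illustration, not the method used here: it assumes 1-based inclusive transcript coordinates and defines `border` as the last position of the upstream block, so that a spanning transcript must include positions on both sides of the border. All names and example coordinates are invented for illustration.

```python
def spans_border(start, end, border, min_overhang=1):
    """True if the interval [start, end] covers both sides of a block border.

    `border` is the last position of the left block; the right block
    begins at border + 1. `min_overhang` is the minimum number of
    nucleotides required on each side.
    """
    left_part = border - start + 1   # nucleotides within the left block
    right_part = end - border        # nucleotides within the right block
    return left_part >= min_overhang and right_part >= min_overhang


def border_spanning(transcripts, border, min_overhang=1):
    """Names of transcripts (name -> (start, end)) that span the border."""
    return sorted(
        name
        for name, (start, end) in transcripts.items()
        if spans_border(start, end, border, min_overhang)
    )


# Toy coordinates (hypothetical): only 'rnaX' crosses the border at position 100.
toy_transcripts = {"rnaX": (90, 130), "rnaY": (101, 130), "rnaZ": (60, 100)}
toy_hits = border_spanning(toy_transcripts, border=100)
```

Applying such a check to every block combination present in the reference is what makes border-spanning sRNAs detectable at all; against a reference with a different block order, the same reads would fail to align across the junction.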

Coding regions contribute to sRNAs at block borders

In addition to the three pairs of overlapping, noncoding RNAs described above (Figure 4A), we also found five RNAs that combine sequences from coding and non-protein-coding blocks (Figure 5A and B). Among them, RNA16, RNA23t, and RNA34 were partially antisense to mRNAs (Figure 5A), and we confirmed the accumulation of RNA34 by RNA gel blot hybridization (Figure 5C). None of the three RNAs had detectable homologies to E. coli rRNA based on simple sequence searches, but structural conservation cannot be ruled out. Mitochondrial rRNA sequences from kinetoplastids and diplonemids show very little sequence conservation but are still part of the mitoribosome (Valach et al., 2023; Ramrath et al., 2018), suggesting that future analyses might uncover hidden rRNA similarities. With the exception of RNA34, the RNAs representing a fusion of non-protein-coding regions and protein-coding regions have homologous sequences in the mitochondrial genome of P. falciparum (Feagin et al., 2012). In P. falciparum, however, none of them are antisense to coding regions. Whether these sRNAs form sRNA:mRNA interactions and whether this has functional consequences remains to be investigated.

Figure 5. — (A) Schematic representation of the genomic position of three RNAs that are partially antisense to coding sequences. (B) Schematic representation of the genomic position of two RNAs that partially overlap with coding sequences. (C) RNA gel blot hybridization of RNA34. Equal amounts of RNA extracts from input and organelle-enriched fractions were analyzed. (D) RNA gel blot hybridization of RNA19 and RNA3 – for details see (C). (E) Alignment of *T. gondii* RNA19 with homologous sequences from other apicomplexan species and from *E. coli*. (F) Alignment of a section of the *coxIII* and *cob* sequences from different apicomplexan species. Alignments were prepared using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). Nucleotides of *T. gondii* RNA19 are marked in bold in both alignments and as a lilac wiggle line above the sequences. The sequences are interrupted by indels in *coxIII* and *cob* of other apicomplexans that have stand-alone RNA19 sequences (e.g. *P. falciparum*). Within the *cob* sequence, the RNA19 sequence covers an annotated start codon (Alday et al., 2017, yellow box; Namasivayam et al., 2021). However, an alternative start codon downstream has been suggested as well (red box, McFadden et al., 2000) and is in fact where sequence conservation starts. The green circle indicates the *cob* 5’ transcript terminus as determined by rapid amplification of cDNA ends (RACE) (see G). (G) Agarose gel analysis of 5’ RACE PCR products was performed to identify the *cob* 5’ transcript end. +RT and -RT indicate the presence or absence of reverse transcriptase, respectively.

Figure 5—source data 1. Raw gel and blot images.
Uncropped blots and gels accompanied by images indicating the areas shown in Figure 5C, D, and G with a red rectangle. In addition, raw scan images are provided.

elife-95407-fig5-data1.zip (4.9 MB, zip)

Two of the small RNAs contained coding sequences in the sense direction (Figure 5B). Both were validated by RNA gel blot hybridization (Figure 5D). RNA19 contains coxIII coding sequence and ends adjacent to the cob coding region, a configuration produced by the combination of block J and block E (Figure 5E). The cob reading frame has two in-frame start codons (Figure 5F). Using rapid amplification of cDNA ends (RACE), we showed that the 5’-end of the cob mRNA is 6 nt upstream of the second start codon and 4 nt downstream of RNA19 (Figure 5F and G). Being only four nucleotides apart, the 3’-end of RNA19 is possibly generated during the formation of the 5’-end of cob. RNA19 is conserved in other apicomplexan mitochondrial genomes (Figure 5E) but is not part of open reading frames in Plasmodium. Homologies and a structural fit to the SSU rRNA of E. coli have been noted for P. falciparum RNA19 (Figure 5E; Feagin et al., 2012). This sequence similarity to rRNA is maintained in T. gondii (Figure 5E), which suggests that despite overlapping with coxIII coding sequence at the J-E block border, RNA19 is functional. It is remarkable that this sequence serves the dual purpose of coding for a protein and for an rRNA. There is only low sequence conservation of the coxIII sequence used by RNA19 (Figure 5F), which might have allowed the evolutionary acquisition of an additional use as an rRNA fragment. In addition to RNA19, a second RNA encompassing coding sequence is RNA3, which is situated at the S-R block border (Figure 5B). Block S is part of the coxI coding region, and its terminal 22 nucleotides contribute to RNA3. The similarity to E. coli 23S rRNA is, however, restricted to the noncoding block R (Figure 5—figure supplement 1). The block S sequence in RNA3 is, therefore, a 5’-extension of this rRNA, and it remains to be determined whether this is of functional consequence for the mitochondrial ribosome. It is noteworthy that the 5’ end of RNA3 is located within a region of the 23S rRNA secondary structure that exhibits low conservation (Feagin et al., 2012), which suggests that its overlap with coding regions may be tolerable.

In sum, these results suggest that at block borders, T. gondii makes dual use of several protein-coding sequence blocks. Depending on the respective block combination, they are either part of the protein-coding sequence or can code for an rRNA fragment.

Mitochondrial small RNAs are part of a large-molecular weight complex

The apicomplexan mitochondrial ribosome with its peculiar organization has not been characterized in any detail. There is strong evidence from genetic and pharmacological studies that translation occurs in the mitochondria of apicomplexans (Alday et al., 2017; Lane et al., 2018; McFadden et al., 2000; Vaidya et al., 1993). Additionally, the presence of a large megadalton complex containing nuclear-encoded ribosomal proteins targeted to mitochondria has been shown by blue native gel electrophoresis (Lacombe et al., 2019). However, a comprehensive catalog of mitoribosomal constituents is missing, the extent to which rRNA fragments are part of mitoribosomes remains unclear, and polysomes have not been identified as a direct readout for active ribosomes. To ascertain the presence of sRNAs within high-molecular weight complexes, we fractionated enriched organellar preparations using sucrose gradient analysis. We employed conditions known to preserve mitochondrial ribosomes (Waltz et al., 2021a) as well as conditions established to dissociate ribosomes (10 mM EDTA, no MgCl₂, 300 mM KCl). Should sRNAs be integral to ribosomes, dissociation of the ribosomes would be expected to result in a migration of the sRNAs towards lower molecular weight fractions.

RNA extracted from the gradient fractions was initially analyzed using ethidium bromide staining in denaturing polyacrylamide urea gels (Figure 6A). The paucity of detectable signals aligns with the expected degradation of abundant cytosolic rRNAs during preparation with benzonase. Notably, only the top fractions of the gradient (fractions 1–6) displayed bands, while the deeper fractions showed no discernible signals, reaffirming the effectiveness of the organellar preparation in eliminating cytosolic contaminants. The observed bands in the top fractions may represent degradation products of prevalent rRNA species, released from ribosomes by benzonase but persisting in the organellar preparation due to their abundance. Subsequent to gel electrophoresis, we transferred the RNA onto membranes and hybridized them with probes targeting seven specific sRNA species. LSUD/E and LSUF/G, along with RNA1, RNA2, and RNA3, were chosen for analysis due to their sequence similarities with E. coli 23S rRNA, as noted by Feagin et al., 2012. Similarly, SSUA and RNA19 were chosen since they resemble parts of E. coli 16S rRNA (Feagin et al., 2012). Additionally, RNA3 and RNA19 were tested as they utilize the coding sequences of coxI and coxIII, respectively. Finally, we also included RNA29, an sRNA not yet assigned to the ribosome based on sequence similarity.

Figure 6. — *T. gondii* organelle-enriched extracts were fractionated by sucrose density gradient centrifugation and subsequently analyzed by RNA gel blot hybridization. Buffer conditions during the experiment were either 30 mM MgCl₂, 100 mM KCl (labeled Mg²⁺ next to the blots), or 10 mM EDTA, 300 mM KCl (labeled EDTA next to the blots). (A) Prior to blotting, the gels were stained with ethidium bromide. Atop the gel images, a schematic representation illustrates the sucrose density gradient. Subsequent panels display a series of RNA gel blot hybridization assays. These assays were conducted on blots derived from the two ethidium bromide-stained gels presented in (A). Following each probe, the blot was stripped of its signal by employing a denaturing buffer. The oligonucleotides in these assays targeted LSUD/E (B), LSUF/G (C), SSUA (D), RNA1 + RNA2 (E), RNA3 (F), RNA19 (G), and RNA29 (H). Percentages in square brackets next to each blot indicate the proportion of signal detected in fractions 7–9 or 6–9 for large subunit (LSU) or small subunit (SSU) small RNAs (sRNAs), respectively, versus signal intensity in fractions 10–14. ‘I’ corresponds to 10% of the lysate prior to gradient loading. The RNA names are color-coded to indicate homology: deep purple for RNAs with homology to *E. coli* rRNA from the large ribosomal subunit, light purple for those from the small subunit, and black for RNAs without previously noted homology.

Figure 6—source data 1. Raw blot images.
Uncropped blots accompanied by images indicating the areas shown in Figure 6A–D with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig6-data1.zip (65.9 MB, zip)

Figure 6—source data 2. Raw blot images.
Uncropped blots accompanied by images indicating the areas shown in Figure 6E–H with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig6-data2.zip (52 MB, zip)

Figure 6—figure supplement 1.

First, we observed the peak signals for all sRNAs in the middle of the gradient, but specifically not in fractions 1–4, where free sRNAs would be expected to localize, akin to the signals observed in the ethidium bromide-stained samples (Figure 6). Second, all tested sRNAs were found to also migrate into the deepest gradient fractions 12–14. Collectively, these observations strongly indicate that the sRNAs in question are predominantly part of high molecular weight complexes. To evaluate the influence of ribosome-dissociative conditions on the fractionation of sRNAs in sucrose gradients, we quantified the sRNA signals in the deeper fractions (10–14) relative to the peak fractions (7–9 for LSU rRNAs and 6–9 for SSU rRNAs). For all sRNAs examined, a shift was observed from the deeper to the earlier fractions under dissociative conditions (Figure 6). This shift indicates that the large molecular weight complexes with which these sRNAs are associated are susceptible to ribosome-dissociative conditions. The most parsimonious conclusion is that all sRNAs tested are rRNAs.
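
The comparison of deep and peak fractions can be illustrated with a small calculation. The following Python sketch is an illustration only: the band intensities are invented placeholder values, not measurements from this study, and the function name `deep_fraction_share` is hypothetical. It expresses the signal in fractions 10–14 as a share of the combined signal in the peak and deep fractions, which is one plausible way to quantify the shift seen under dissociative conditions.

```python
def deep_fraction_share(intensities, peak, deep=(10, 14)):
    """Return the share of signal in the deep fractions relative to peak + deep.

    intensities: dict mapping fraction number (1-14) to band intensity.
    peak, deep: inclusive (first, last) fraction ranges.
    """
    peak_sum = sum(intensities[f] for f in range(peak[0], peak[1] + 1))
    deep_sum = sum(intensities[f] for f in range(deep[0], deep[1] + 1))
    total = peak_sum + deep_sum
    if total == 0:
        raise ValueError("no signal in the selected fractions")
    return deep_sum / total


# Placeholder intensities (arbitrary units) for an LSU sRNA; NOT real data.
mg = {f: 0 for f in range(1, 15)}
mg.update({7: 20, 8: 30, 9: 30, 10: 25, 11: 20, 12: 15, 13: 10, 14: 10})
edta = {f: 0 for f in range(1, 15)}
edta.update({7: 40, 8: 50, 9: 40, 10: 10, 11: 5, 12: 3, 13: 1, 14: 1})

share_mg = deep_fraction_share(mg, peak=(7, 9))      # LSU peak fractions 7-9
share_edta = deep_fraction_share(edta, peak=(7, 9))
print(f"Mg2+: {share_mg:.0%} deep, EDTA: {share_edta:.0%} deep")
```

For an SSU sRNA, the same function would be called with `peak=(6, 9)`. A lower deep-fraction share under EDTA than under Mg²⁺ corresponds to the shift towards earlier fractions described above.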

We then asked whether rRNA fragments with homology to E. coli 23S versus those with homology to 16S rRNA exhibit differential signal distribution. We found that rRNAs presumed to be small subunit rRNAs, namely SSUA and RNA19, displayed signals in fraction 6 (Figure 6D and G), a pattern not observed or less pronounced in presumed LSU rRNAs such as LSUD/E, LSUF/G, RNA1 + 2, and RNA3 (Figure 6B, C, E and F). The SSU-specific signal in fraction 6 persisted under both standard and dissociative conditions, implying that it represents free SSU (Figure 6D and G). We quantified the signals in all fractions under dissociative conditions and compared the profiles of sRNA signals across the gradient (Figure 6—figure supplement 1). This shows that under dissociative conditions, the distribution and peaks differ markedly between the four predicted LSU rRNAs and the two SSU sRNAs (Figure 6—figure supplement 1). The distinct distribution patterns for rRNAs with homology to LSU versus SSU rRNA suggest that the gradient allows individual rRNA fragments to be assigned to the two subunits of the ribosome. Indeed, the previously unclassified RNA29 exhibited a distribution pattern very similar to LSU rRNAs (Figure 6H). It showed markedly less signal in fraction 6 than SSUA and a distribution across the gradient similar to LSUF/G and LSUD/E (Figure 6C and D and Figure 6—figure supplement 1). This pattern indicates that RNA29 likely belongs to the LSU category. Unlike most other rRNAs, RNA29 also showed strong signals in the low molecular weight fractions 1–5 (Figure 6H), hinting at a proportion of unassembled RNA29 in our preparation. Such unassembled rRNA was also observed for RNA2, while the cognate RNA1 was barely detected in lower molecular weight fractions (Figure 6E). It remains to be determined whether different assembly kinetics or degradation rates contribute to the RNA signals in the upper parts of the gradient. Both RNA1 and RNA2 penetrated deep into the gradient and showed a distribution consistent with the LSU pattern. Both RNA1 and RNA2 are sensitive to EDTA treatment. Together, this leads to the conclusion that RNA1 and RNA2 are both assembled into ribosomes.
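
The idea of assigning an unclassified sRNA to a subunit by comparing gradient profiles can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the profiles are invented placeholder values, the function names are hypothetical, and Pearson correlation is merely one assumed way to score the similarity of two distributions across fractions 1–14.

```python
from math import sqrt


def normalize(profile):
    """Scale a profile so that its values sum to 1."""
    total = sum(profile)
    return [x / total for x in profile]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def closest_reference(query, references):
    """Return the best-matching reference name and all correlation scores."""
    q = normalize(query)
    scores = {name: pearson(q, normalize(p)) for name, p in references.items()}
    return max(scores, key=scores.get), scores


# Placeholder signal profiles over fractions 1-14 (arbitrary units); NOT real data.
references = {
    "LSU-like": [0, 0, 0, 0, 1, 2, 8, 12, 10, 6, 4, 3, 2, 1],
    "SSU-like": [0, 0, 0, 1, 3, 10, 11, 8, 5, 3, 2, 2, 1, 1],
}
query = [1, 1, 1, 1, 1, 2, 7, 11, 10, 6, 4, 3, 2, 1]

best, scores = closest_reference(query, references)
print(best, scores)
```

In this toy example the SSU-like reference carries a strong signal in fraction 6, which is the feature that separates the two patterns in the text; a query lacking that signal correlates better with the LSU-like reference.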

We furthermore observed that despite containing sequences derived from mRNAs, both RNA19 and RNA3 are incorporated into ribosomes (Figure 6F and G). For RNA19, several transcripts were found. The largest isoform is mainly found in low-molecular weight fractions. Conversely, the smaller RNA19 transcripts migrate into the high-molecular weight fractions, and this migration is EDTA-sensitive (Figure 6G). Our findings suggest that RNA19 and RNA3 are incorporated into the ribosome.

Mitochondrial mRNAs show an association with high-molecular weight complexes that are sensitive to ribosome-dissociative conditions

We next analyzed the distribution of mitochondrial mRNAs in a sucrose gradient under non-dissociative and dissociative conditions. The extended length of these mRNAs facilitates their detection using qRT-PCR, a method that is impractical for the analysis of many shorter sRNAs. To ensure the gradient exhibited characteristics akin to those used in the sRNA distribution analysis (Figure 6), we concurrently analyzed the sRNA LSUF/G. This confirmed that LSUF/G migrates deep into the gradient and that this migration is sensitive to ribosome-dissociative conditions (see Figures 6C and 7A).

All three mRNAs migrated deep into the gradient, reaching their peak in fractions 11–13, as depicted in Figure 7B–D. Upon treatment with EDTA, the mRNAs shifted towards fractions associated with lower molecular weights. We conclude that mitochondrial mRNAs in T. gondii are populated with ribosomes, which directly supports previous suggestions of active translation in apicomplexan mitochondria.

The nuclear-encoded mitoribosomal protein L11 is found in EDTA-sensitive large complexes

We next sought to analyze how a ribosomal protein is distributed in our sucrose gradient, with the goal of obtaining an additional readout for ribosomes and polysomes and of investigating whether this mirrors the distribution of sRNAs. We chose TGRH88_003980_t1, which was identified in previous in silico screens as a potential mitoribosomal protein L11 (Gupta et al., 2014; Lacombe et al., 2019). We used CRISPR/Cas9 to introduce an HA-tag at the C-terminus of TguL11m by homologous recombination (Figure 8—figure supplement 1A and B). The tagged protein is immunologically detected as a single band, demonstrating the specificity of the tagging experiment (Figure 8—figure supplement 1C). We next performed an immunofluorescence assay (IFA) to check the localization of TguL11m:HA. The HA-signal co-localized with the mitochondrial marker protein TgTom40 (van Dooren et al., 2016), confirming the localization of TguL11m to T. gondii mitochondria (Figure 8A). We next analyzed the distribution of TguL11m:HA in sucrose gradients and found a main peak in fractions 9–11 with signals reaching into the deepest fractions of the gradients (Figure 8B). This mirrors the distribution of sRNAs from the large subunit like LSUD/E (Figure 6B). Upon EDTA treatment, the signal shifts from the deepest fractions (12–14) into lower fractions (8–10), suggesting that we observe polysomes in fractions 11–14 and that the large subunit of the ribosome is found in fractions 8–10.

Figure 8. — (A) *T. gondii* parasites expressing TguL11m:HA were subjected to immunofluorescence assays. These assays were conducted using an anti-HA antibody, visualized in the green channel. The mitochondria within these samples were identified using an antibody against the outer membrane protein Tom40, shown in the magenta channel. (B) For the analysis of HA-tagged mitochondrial ribosomal protein L11 (TguL11m:HA), sucrose density centrifugation was employed. Organelle-enriched extracts from *T. gondii* lines were either untreated (Mg²⁺) or treated with EDTA. Subsequently, these extracts were fractionated via a sucrose gradient. The fractions obtained were analyzed using SDS-PAGE, followed by immunoblot analysis. TguL11m:HA was detected using an antibody against HA. Percentages in square brackets show the proportion of signal detected in fractions 7–9 versus signal intensity in fractions 10–14. 'I' indicates 10% of the input material for the centrifugation.

Figure 8—source data 1. Raw blot images.
Uncropped immunoblots accompanied by images indicating the areas shown in Figure 8B with a red rectangle. In addition, raw scan images and light image overlays depicting the membrane outline are provided.

elife-95407-fig8-data1.zip (5.2 MB, zip)

Figure 8—figure supplement 1.

Discussion

Constraints in the recombination-active mitochondrial DNA of Toxoplasma gondii

The complexity of the T. gondii mitochondrial genome is puzzling, with an extensive variety of sequence block combinations and repetitions. The individual ONT reads are not repeated in the two previously published datasets (discussed in Berná et al., 2021b), and we also did not find an overlap between the data presented here and those published previously (Namasivayam et al., 2021). Thus, the large number of block combinations identified here reinforces and extends previous conclusions that continuous recombination shuffles the blocks (Namasivayam et al., 2021). At first sight, this might be considered a sign of evolutionary tinkering, similar to the mitochondrial idiosyncrasies found in other taxa that have been suggested to be non-adaptive, such as the fragmented mitochondrial genomes of diplonemids (Burger and Valach, 2018) or the repetitive and heterogeneous genomes of plant mitochondria (Kozik et al., 2019). Upon closer scrutiny, however, the reshuffling appears limited to specific block borders and is not random. In fact, the improved sequencing depth presented in this study revealed strong constraints on the allowed block combinations in T. gondii mtDNA. Thus, a limited number of allowed recombination sites generates the variety of actual DNA fragments sequenced. The question remains whether the constraints outlined in the comprehensive model of mitochondrial genome architecture presented here indicate functional features.

The purpose of genome blocks in light of small RNA production

Our small RNA sequencing results revealed that 15 small RNAs span block borders. There are different explanations for this peculiar localization of small RNAs. One possibility is that these RNAs are involved in the DNA recombination and replication process that occurs at block borders. RNA molecules have been found to play important roles in DNA repair at double-stranded breaks of nuclear DNA, both as long and small RNAs (Ohle et al., 2016; Wei et al., 2012). However, it is currently unclear whether RNA:DNA hybrids play a role in mitochondrial DNA recombination and repair (Allkanjari and Baldock, 2021).
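
What it means for an sRNA to span a block border can be made explicit with a short sketch. The Python example below is purely illustrative: the block layout, coordinates, and function name are hypothetical and are not taken from the annotation used in this study. It reports every junction between adjacent blocks on a read that a mapped interval crosses with at least one nucleotide on each side.

```python
def spanned_borders(blocks, start, end, min_overhang=1):
    """List block borders crossed by the half-open interval [start, end).

    blocks: ordered list of (name, block_start, block_end) tuples, half-open
            and contiguous along one read.
    min_overhang: minimum number of nucleotides required on each side.
    """
    borders = []
    for (left, _, border), (right, right_start, _) in zip(blocks, blocks[1:]):
        if right_start != border:
            raise ValueError("blocks must be contiguous")
        if start <= border - min_overhang and end >= border + min_overhang:
            borders.append((f"{left}-{right}", border))
    return borders


# Hypothetical layout of three blocks on one read (coordinates are invented;
# "X" is a made-up block name).
read_blocks = [("J", 0, 120), ("E", 120, 300), ("X", 300, 380)]

print(spanned_borders(read_blocks, 100, 160))  # interval crossing the J-E border
print(spanned_borders(read_blocks, 130, 200))  # interval entirely within block E
```

Raising `min_overhang` would require a longer stretch of sequence on both sides of the junction before an sRNA is counted as border-spanning, which is a design choice that depends on read length and mapping stringency.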

Another possible explanation for the presence of small RNAs at block borders is that T. gondii simply utilizes all available genomic space for RNA production, including block borders. T. gondii appears to be as economical with its mitochondrial sequence space as other apicomplexans, and it has hardly any unused noncoding sequences in its genome. In fact, the sequences at recombination sites could be regarded as an expansion of the mitochondrial genome sequence space, which is not available to other apicomplexans like the genus Plasmodium. There is evidence from recent polyA-sequencing efforts that long transcripts spanning several short RNAs are also produced in T. gondii (Lee et al., 2021). This is reminiscent of the situation in P. falciparum, where both strands of the entire mitochondrial genome are transcribed into polycistronic precursor RNAs. Therefore, block borders in T. gondii are likely transcribed into RNA as well. The functionality of these block border RNAs is unclear, but the unique sequence combinations generated at the block borders give rise to sRNAs that are not found in other apicomplexans with a simpler genome organization.

The discovery of sRNAs sharing sequences with mRNAs or being antisense to mRNAs in T. gondii mitochondria is an unexpected finding of our study that was facilitated by our improved understanding of mitochondrial genome organization. Antisense RNA is typically removed in mitochondria from flies and humans (Pajak et al., 2019). Similar antisense degradation mechanisms must be assumed in apicomplexans, as both strands of their genomes are transcribed in full (Rehkopf et al., 2000). The fact that three small antisense RNAs survive antisense RNA surveillance in T. gondii mitochondria is intriguing, and it remains to be established whether RNA:RNA interactions of these sRNAs with mRNAs occur in vivo and could be of regulatory impact.

Another potential regulatory link between mRNAs and sRNAs is represented by small RNAs that utilize coding regions. RNA19 is of particular interest because it contains a piece of coxIII at the 5’-end and terminates close to the cob start codon, while also showing homology to rRNA. A few cases of mRNA sequences overlapping in antisense orientation with rRNA have been described in mammals and yeast (Coelho et al., 2002; Kermekchiev and Ivanova, 2001). Sequences homologous to rRNAs have been found in many coding regions in sense and antisense orientation (Mauro and Edelman, 1997). It has been suggested that this could link ribosome production to other cellular processes by reciprocal inhibition of mRNA and rRNA expression (Coelho et al., 2002). It is possible that RNA19 and ribosome production could be balanced with cob protein production by tuning the processing or RNA degradation of cob mRNA. A quantitative analysis of RNA19 and cob mRNA accumulation under different conditions could help to clarify whether there is such an inverse correlation. Overall, the discovery of block-border sRNAs highlights the complex biogenesis of sRNAs in T. gondii mitochondria and will be a starting point to understand the processing of sRNAs and their function in general. Regarding their function, many, if not all, of the sRNAs at block borders could be used in ribosomes as rRNA fragments, which is discussed in the next section.

sRNAs are incorporated into polysome-size complexes

A major unsolved problem in apicomplexan mitochondrial gene expression is the nature of the ribosome. Recently, compelling evidence has surfaced suggesting the existence of ribosome-sized particles, including ribosomal proteins. Using blue-native gel electrophoresis, it was demonstrated that tagged ribosomal proteins TgmS35, TgbL12m, TguL3m, and TguL24m migrate as constituents of a macromolecular complex within the size range of a ribosome (Lacombe et al., 2019; Shikha et al., 2022). Additionally, TgmS35 and TguL24m coprecipitated with mitochondria-encoded rRNA. Knockdown of TgmS35, TgbL35m, TgbL36m, and TgbL28m resulted in a specific loss of activity of one respiratory complex, which is partially encoded in the mitochondrial genome (Lacombe et al., 2019; Shikha et al., 2022). Nevertheless, it remains to be determined whether these large complexes are actively involved in translation.

Here, we provide evidence that sRNAs with homology to rRNA are part of polysomes. This conclusion is based on our finding that sRNAs, mRNAs, and the ribosomal protein TguL11m are all found in high-molecular weight complexes that are sensitive to ribosome-dissociative conditions. We furthermore observed a difference in the gradient distribution of sRNAs assigned to SSU and LSU, respectively. Based on a comparison with these two gradient distribution patterns, we were able to assign RNA29 to LSU. In sum, our findings strongly support the notion of active translation by mitochondrial ribosomes in tachyzoites of T. gondii, and demonstrate that most, if not all, mitochondrial small RNA fragments are part of the apicomplexan mitochondrial ribosome. This includes sRNAs expressed from block borders like the mRNA sequence-containing RNA3 and RNA19. Also, the block border RNAs RNA1 and RNA2, which share their 3’-end sequence but differ in their 5’-part, are both incorporated into ribosomes. RNA1 is of particular interest since it contains sequences homologous to two different parts of the LSU rRNA from E. coli and hence could occupy two positions within a single ribosome. RNA1 is almost exclusively found in ribosomes and polysomes, whereas RNA2 is also found free of ribosomes, similar to RNA29. This suggests that RNA2 and RNA29 might not integrate into ribosomes as effectively as other rRNA fragments, raising questions about the coordination of rRNA fragment assembly. The assembly of a ribosome with numerous short rRNA fragments and a comparable number of ribosomal proteins is undeniably complex. Notably, longer variants of RNA19 are not incorporated into ribosomes, unlike smaller isoforms, highlighting a connection between small RNA processing and ribosome assembly. In the case of RNA19, processing precedes integration, contrasting with plant chloroplast ribosomes where the 23S rRNA is split endonucleolytically at two sites post-integration (Liu et al., 2015; Nishimura et al., 2010).

The analyses presented here mark a first step towards describing the assembly steps of the fascinating T. gondii ribosome. Unraveling the mechanisms of ribosome assembly will require the identification of rRNA processing factors, which is within reach, given the excellent genetic toolbox available for T. gondii. It will also eventually require understanding the ribosome 3D structure. Within the spectrum of mitochondrial ribosome structures across species (Ramrath et al., 2018; Saurer et al., 2023; Tobiasson et al., 2022; Waltz et al., 2021b), mitoribosomes of T. gondii present an extreme example of ribosome diversification. The comprehensive collection of small RNA fragments identified in our study, including those at block borders, provides a valuable resource for future studies on the structure and function of the T. gondii mitochondrial ribosome.

Conclusions

Apicomplexan mitochondrial genomes are vital for organism survival, expressing key respiratory chain and gene expression components, particularly ribosomes. The latter were proposed to be assembled from highly fragmented ribosomal RNAs, but whether rRNA fragments are expressed and used in ribosomes was unclear. We adapted a protocol to enrich T. gondii mitochondria (Esseiva et al., 2004) and used Nanopore sequencing to comprehensively map the genome with its repeated sequence blocks. Small RNA sequencing identified fragmented ribosomal RNAs, including some RNAs spanning block boundaries, thus fusing protein-coding and rRNA sequences. Sucrose density gradient analysis showed that such rRNA fragments are in polysome-size complexes. This distribution mirrored the localization of the mitoribosomal protein L11 as well as mRNAs in sucrose gradients. sRNAs, L11, and mRNAs shift to lower molecular weight complexes upon treatment with ribosome-dissociative buffer. We conclude that most if not all mitochondrial sRNAs are components of active ribosomes that assemble into polysomes. T. gondii’s dynamic block-based genome organization leads to the usage of mitochondrial sequences in mRNA as well as rRNA contexts, potentially linking rRNA and mRNA expression regulation.

Methods

Cultivation of host cells and T. gondii parasites

T. gondii parasites were cultured according to standard procedures (Jacot and Soldati-Favre, 2020) unless otherwise indicated. Briefly, human foreskin fibroblasts (HFF-1, ATCC SCRC-1041) were grown as host cells in Dulbecco’s modified Eagle’s medium (DMEM) (Capricorn Scientific) supplemented with 10% fetal bovine serum (FBS) (Capricorn Scientific) and 100 µg/ml penicillin‒streptomycin (Capricorn Scientific). T. gondii tachyzoites were maintained in flasks containing confluent HFF-1 cells in DMEM supplemented with 1% FBS and 100 µg/ml penicillin‒streptomycin. T. gondii strains were kindly provided by Frank Seeber. Experiments were carried out with parasites of either the strain RH-ΔKUΔHX (Huynh and Carruthers, 2009) or the strain RHpSAG1-βGal/pmt-GFP. The latter consists of RHβ1 (PMID: 8635747), which in addition expresses eGFP N-terminally fused to the T. gondii mitochondrial targeting sequence S9(33–159) (DeRocher et al., 2000).

CRISPR/Cas9 mediated endogenous tagging of Tgurpl11m

C-terminal tagging of TGRH88_003980_t1 was performed as described previously (Parker et al., 2019). The sequence of a single guide RNA (sgRNA) specifically targeting the 3’ end of TGRH88_003980_t1 was integrated into pSAG1::CAS9-GFPU6::sgUPRT (Addgene plasmid #54467) using the Q5 site-directed mutagenesis kit (NEB). The 3xHA epitope tag was amplified from the pPR2-HA3 plasmid (Katris et al., 2014) and integrated into the pU6-DHFR vector (Addgene plasmid #80329). The 3xHA-DHFR cassette was amplified from the resulting vector by Q5 polymerase (NEB) using primers containing a 50 nt overlap homologous to either the upstream or the downstream region of the TGRH88_003980_t1 stop codon (primers ‘flank fwd Tgurpl11m tagging’ and ‘flank rev Tgurpl11m tagging’ in Supplementary file 9, respectively). The amplified donor construct was co-transfected into RH-ΔKUΔHX together with the pSAG1::CAS9-GFPU6::sgUPRT-derived plasmid encoding Cas9 and the sgRNA, as described previously (Jacot and Soldati-Favre, 2020). Parasites were grown under pyrimethamine selection and subsequently cloned by limiting dilution in 96-well plates (Roos et al., 1994). Genotyping was performed directly from 96-well plates as described by Piro et al., 2020. Proper integration of the 3xHA tag was verified in single clones using Sanger sequencing. Sequences of all the primers used for cloning and genotyping are listed in Supplementary file 9.
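
The construction of donor primers with 50 nt homology overlaps can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the design used here (the actual primers are listed in Supplementary file 9): the locus sequence, the cassette-annealing sequences, and the function names are all hypothetical, and the sketch assumes that the upstream arm ends immediately before the stop codon and the downstream arm begins immediately after it.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tagging_primers(locus, stop_start, fwd_anneal, rev_anneal, arm=50):
    """Build primers carrying homology arms that flank a stop codon.

    locus: genomic sequence, 5'->3' on the coding strand.
    stop_start: 0-based index of the first base of the stop codon.
    fwd_anneal, rev_anneal: 3' primer parts that anneal to the cassette
                            (placeholders in this sketch).
    arm: length of each homology overlap in nucleotides.
    """
    stop_end = stop_start + 3
    if stop_start < arm or len(locus) - stop_end < arm:
        raise ValueError("locus too short for the requested arm length")
    upstream = locus[stop_start - arm:stop_start]  # last `arm` nt before the stop codon
    downstream = locus[stop_end:stop_end + arm]    # first `arm` nt after the stop codon
    fwd = upstream + fwd_anneal
    rev = reverse_complement(downstream) + rev_anneal
    return fwd, rev


# Toy locus and placeholder annealing sequences; NOT the real gene or primers.
toy_locus = "A" * 60 + "TAA" + "C" * 60
fwd, rev = tagging_primers(toy_locus, stop_start=60,
                           fwd_anneal="GGATCC", rev_anneal="GAATTC")
print(len(fwd), len(rev))
```

The reverse primer is written as the reverse complement of the downstream flank so that both primers read 5’ to 3’, as is conventional for ordering oligonucleotides.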

Immunofluorescence assays

Immunofluorescence assays were done as described previously (van Dooren et al., 2008). Coverslips with confluent human foreskin fibroblasts (HFF) were infected with freshly egressed Toxoplasma gondii parasites. The next day, cells were fixed in 3% (w/v) paraformaldehyde in PBS for 15 min at room temperature and permeabilized in 0.25% (v/v) Triton X-100 in PBS for 10 min. Blocking was carried out in 2% (w/v) bovine serum albumin in PBS overnight. All antibody incubation steps were done at room temperature for 1 hr. Rabbit anti-TgTom40 (van Dooren et al., 2016, 1:2000 dilution) and rat anti-HA high affinity (Sigma, 11867431001, 1:200 dilution) primary antibodies were used. The coverslips were then washed three times in PBS and incubated with donkey anti-rabbit CF 647 (Sigma, SAB4600177, 1:2000 dilution) and goat anti-rat AlexaFluor 488 (Thermo Fisher Scientific, A-11006, 1:500 dilution) secondary antibodies. Microscopy and image acquisition were performed on a DeltaVision Elite deconvolution setup (GE Healthcare) using an inverted Olympus IX71 microscope fitted with a UPlanSApo 100× objective lens and a Photometrics CoolSNAP HQ2 camera. Images were deconvolved using SoftWoRx Suite 2.0 software; brightness and contrast were linearly adjusted in FIJI/ImageJ (release 1.53c).

T. gondii organelle enrichment

A previously established protocol to enrich T. gondii organelles (Esseiva et al., 2004) was slightly modified. Freshly lysed T. gondii cultures from four T175 flasks (approximately 8 × 10⁸ parasites) were filtered through a 3 μm pore size polycarbonate filter to remove host cell debris and harvested by centrifugation at 1500 × g for 10 min. The parasite pellet was briefly washed in ice-cold Dulbecco’s phosphate-buffered saline (PBS) (Capricorn Scientific) and again centrifuged for 10 min at 1500 × g. Afterwards, the parasite pellet was resuspended in 20 mM Tris-HCl, pH 7.4, 600 mM mannitol, 5 mM MgCl₂, and 2 mM EDTA containing 0.05% digitonin (Invitrogen) and incubated for 5 min on ice. The extract was clarified by centrifugation at 8000 × g for 5 min at 4 °C. The supernatant containing soluble cytosolic components was discarded, and the corresponding pellet was resuspended in 20 mM Tris-HCl, pH 7.4, 600 mM mannitol, 5 mM MgCl₂, and 1 mM EDTA containing benzonase (345 U/ml) (Novagen), incubated for 20 min at room temperature and subsequently centrifuged at 8000 × g for 5 min at 4 °C. The pellet was retained and washed in 20 mM Tris-HCl, pH 7.4, 600 mM mannitol, and 17 mM EDTA to inactivate benzonase remnants. Following a spin at 8000 × g for 5 min at 4 °C, the pellet consisted of a crude enrichment of organelles and membranes. To study the T. gondii mitochondrial DNA content, the organelle pellet was directly used for DNA isolation. For organellar RNA analysis, the organelle pellet was frozen at –80 °C for at least 24 hr. Afterwards, it was thawed on ice and resuspended in a high detergent lysis buffer (20 mM HEPES-KOH pH 7.5, 100 mM KCl, 30 mM MgCl₂, 1% Igepal-CA630, 1.5% Triton X-100, 0.5% sodium deoxycholate, 1 mM DTT, EDTA-free protease inhibitor). The lysate was incubated with continuous end-over-end rotation at 4 °C for 40 min and afterwards clarified by centrifugation at 16,000 × g for 10 min at 4 °C. The pellet, composed of cell debris and membranes, was discarded, and the supernatant, which contained the organelle lysate, was mixed with TRIzol Reagent (Invitrogen) for RNA isolation.

SDS‒PAGE and immunoblotting

To detect mitochondrial GFP in RH pSAG1-βGal/pmt-GFP parasites, samples taken during organelle enrichment were directly diluted in Laemmli buffer (0.2 M Tris–HCl, 8% (w/v) SDS, 40% glycerol, 20% (v/v) β-mercaptoethanol, 0.005% bromophenol blue), incubated at 95 °C for 5 min and subjected to sodium dodecyl sulfate‒polyacrylamide gel electrophoresis (SDS‒PAGE). SDS‒PAGE and immunoblotting were carried out as described previously (Kupsch et al., 2012). For immunodetection, monoclonal primary mouse anti-GFP antibody (G1546, Sigma‒Aldrich) and polyclonal rabbit anti-heat shock protein 70 antibody (AS05 083 A, Agrisera), followed by secondary horseradish peroxidase (HRP)-labeled goat anti-mouse antibody (ab205719, Abcam) and goat anti-rabbit antibody (ab205718, Abcam), were used.

Proteins from sucrose gradient fractions were isolated using methanol-chloroform-water extraction (Wessel and Flügge, 1984). Protein pellets were resuspended in Laemmli buffer, heated for 5 min at 95 °C, and subjected to SDS-PAGE and immunoblotting. Primary rat anti-HA antibody (11867423001, Roche) and secondary rabbit anti-rat (HRP) antibody (ab6734, Abcam) were used for TguL11m:HA detection.

DNA isolation and library construction

DNA was isolated from organelle-enriched fractions using the GeneMATRIX Tissue DNA Purification Kit (Roboklon) according to the manufacturer’s protocol for cultured cells. The Oxford Nanopore Technologies (ONT) sequencing library was prepared with 1 µg of organelle-enriched DNA using the SQK-LSK108 ligation sequencing kit (ONT) according to the manufacturer’s instructions. Sequencing was performed on an R9.4.1 SpotON Flow Cell; live basecalling was done using the ONT Guppy software package.

RNA isolation and RNA-seq library construction

RNA was isolated from total organelle-enriched samples and from sucrose gradient fractions using TRIzol Reagent (Invitrogen), followed by the Monarch Total RNA Miniprep Kit (New England BioLabs) applying the manufacturer’s protocol for TRIzol-extracted samples. Two hundred nanograms of RNA were used to generate sequencing libraries using the NEBNext Multiplex Small RNA Library Prep Set for Illumina Kit (New England BioLabs) according to the manufacturer’s instructions. To identify PCR duplicates arising during library preparation, the 5’-linker was modified with eight N-nucleotides (as unique molecular identifiers, UMIs). Small RNA library clean-up was carried out with AMPure XP Beads (Beckman Coulter). Sequencing was performed by Genewiz on an Illumina NovaSeq platform with paired-end reads of 150 bp each.
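
The role of the eight random nucleotides can be illustrated with a minimal sketch. It assumes, purely for illustration, that the UMI occupies the first eight bases of an adapter-trimmed read and that two reads count as PCR duplicates when both UMI and insert sequence are identical. The function names are hypothetical, and this is not the procedure implemented in UMI-tools, which was used here only to extract the UMIs.

```python
from collections import Counter

UMI_LENGTH = 8  # eight random nucleotides in the modified 5' linker


def extract_umi(read, umi_length=UMI_LENGTH):
    """Split a read into (UMI, insert); assumes the UMI is the read's first bases."""
    if len(read) <= umi_length:
        raise ValueError("read is too short to contain a UMI and an insert")
    return read[:umi_length], read[umi_length:]


def collapse_duplicates(reads):
    """Count how often each (UMI, insert) pair occurs.

    Pairs seen more than once are treated as PCR duplicates of one molecule
    (simplifying assumption: no sequencing errors in the UMI or the insert).
    """
    return Counter(extract_umi(read) for read in reads)


if __name__ == "__main__":
    reads = ["AAAACCCCGGGTTT", "AAAACCCCGGGTTT", "TTTTGGGGGGGTTT"]
    molecules = collapse_duplicates(reads)
    print(len(reads), "reads,", len(molecules), "unique molecules")
```

With eight random positions there are 4⁸ = 65,536 possible UMIs, so identical inserts that carry different UMIs can be counted as independent molecules.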

Agarose-formaldehyde gel electrophoresis

RNA samples were diluted in denaturing loading buffer (deionized formamide 62.5% (v/v), formaldehyde 1.14 M, bromophenol blue 200 μg/mL, xylene cyanol 200 μg/mL, MOPS-EDTA-sodium acetate) and separated on a 1% agarose gel containing 1.2% formaldehyde. Cytosolic rRNAs were detected with ethidium bromide staining.

Denaturing urea-PAGE and sRNA gel blot

RNA was separated by denaturing Urea-PAGE (10% or 12% polyacrylamide gel for total RNA/organelle enriched RNA and RNA extracted from sucrose density gradient fractions, respectively). Small RNA gel blotting and detection of transcripts with radiolabeled DNA oligo probes was carried out as described previously (Loizeau et al., 2014). Sequences of the oligo probes used in this study are listed in Supplementary file 9. Blot hybridization of sucrose density gradient fractions was carried out consecutively in the order RNA1_2, RNA19, SSUA, LSUD_E, RNA3, LSUF_G, RNA29, with removal of probes by heating the blot to 60 °C for 1 hr in 0.5% SDS.

5’ RACE

T. gondii total RNA was ligated to a small RNA oligo (Rumsh, see Supplementary file 9) using T4 RNA Ligase 1 (NEB) according to the manufacturer’s instructions. Afterwards, RNA was reverse transcribed using ProtoScript II Reverse Transcriptase (NEB) and random primers. PCR amplification of cDNA was performed with a gene-specific cob primer and a primer annealing to the ligated oligo (Rumsh1, see Supplementary file 9) using Taq DNA Polymerase (Roboklon). The purified PCR product was sequenced by Sanger Sequencing (LGC Genomics GmbH).

Sucrose density gradient centrifugation

Ribosome fractionation by sucrose density gradient centrifugation was performed as previously described (Waltz et al., 2021a). Briefly, T. gondii parasites were harvested and organelle enrichment was performed as described above with small modifications. Salt concentration in the digitonin and lysis buffer was adjusted either for optimal mitoribosome stability (30 mM MgCl₂, 100 mM KCl) or for dissociative conditions (10 mM EDTA, 300 mM KCl). The organelle-enriched lysate was layered onto a continuous 10–50% sucrose gradient. Sucrose solutions (10% and 50%) were freshly prepared beforehand in 20 mM HEPES-KOH pH 7.5, 1 mM DTT, EDTA-free protease inhibitor, and either 30 mM MgCl₂, 100 mM KCl or 10 mM EDTA, 300 mM KCl (dissociative conditions). A continuous sucrose gradient was formed using a BioComp Gradient Master 108. The gradient was centrifuged at 200,000 × g for 2 hr at 4 °C in an S52-ST rotor (Thermo Scientific) and subsequently fractionated into 14 fractions using the BioComp Piston Gradient Fractionator device. Fractions were mixed directly with TRIzol reagent (Invitrogen). For RT-qPCR samples, 1.2 ng of human estrogen receptor 1 (ESR1) RNA was added to each fraction as a spike-in control. RNA was extracted from fractions using TRIzol-chloroform phase separation followed by overnight precipitation with 1 volume of isopropanol.

RT-qPCR

RNA isolated from fractions was treated with TurboDNase (NEB) and afterwards subjected to reverse transcription using ProtoScript II Reverse Transcriptase (NEB). Synthesized cDNA from each fraction was used as a template in qPCR. Reactions were performed as technical triplicates using Luna Universal qPCR Master Mix (NEB). Primers listed in Supplementary file 9 were used for the amplification of mitochondrial cob, coxI, coxIII, LSUF/G rRNA, and ESR1 spike-in for normalization. The relative quantity of each amplicon was determined using the Pfaffl method (Pfaffl, 2001). Relative quantities for amplicons in each fraction are presented as a percentage of the cumulative relative quantities of the respective amplicon across all fractions. Data was visualized using GraphPad Prism Version 10.1.0.
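
The quantification described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the amplification efficiencies (default 2.0), the choice of calibrator fraction, and the function names are assumptions. Because each fraction is expressed as a percentage of the sum over all fractions, the choice of calibrator cancels out.

```python
def pfaffl_ratio(e_target, e_ref, ct_target_cal, ct_target, ct_ref_cal, ct_ref):
    """Pfaffl (2001) ratio: target amplicon relative to a reference amplicon.

    e_target and e_ref are amplification efficiencies (2.0 means perfect doubling).
    The *_cal values are the Ct values of the calibrator sample.
    """
    return (e_target ** (ct_target_cal - ct_target)) / (e_ref ** (ct_ref_cal - ct_ref))


def percent_per_fraction(ct_target, ct_ref, e_target=2.0, e_ref=2.0, calibrator=0):
    """Relative quantity of one amplicon per gradient fraction, as % of the total.

    ct_target: Ct values of the amplicon of interest, one per fraction.
    ct_ref:    Ct values of the spike-in control in the same fractions.
    """
    quantities = [
        pfaffl_ratio(e_target, e_ref,
                     ct_target[calibrator], ct,
                     ct_ref[calibrator], cr)
        for ct, cr in zip(ct_target, ct_ref)
    ]
    total = sum(quantities)
    return [100.0 * q / total for q in quantities]


if __name__ == "__main__":
    # Made-up Ct values for three fractions, for illustration only.
    print(percent_per_fraction([25.0, 24.0, 23.0], [20.0, 20.0, 20.0]))
```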

Bioinformatic analyses

All bioinformatic analyses except for the quantification of sequence block combinations were performed on the Galaxy web platform (v.22.01; https://usegalaxy.eu, Afgan et al., 2018). All reference sequences used were retrieved from GenBank (Sayers et al., 2019). Sequence alignments were prepared using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) with default settings. E. coli sequences were added manually according to sequence positions established previously (Feagin et al., 2012).

Masking NUMTs in the T. gondii nuclear genome

To identify NUMTs in the T. gondii nuclear genome, mitochondrial sequence blocks (MN077088.1-MN077111.1) were aligned against the RH-88 nuclear reference genome (GCA_019455545.1, August 2021) using NCBI-BLASTN (v. 2.13.0) (Altschul et al., 1990) (default parameters, except max_target_seqs = 5000). All hits obtained in the BLASTN tabular output were manually integrated into a GFF3 file. NUMTs in the nuclear genome were masked with bedtools MaskFastaBed (Quinlan and Hall, 2010) based on the intervals defined in the GFF3 file. In total, we masked 8118 sites with an average length of 92 nt, representing ~1% of the nuclear genome.
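
The interval handling behind this masking step can be sketched in a few lines. The sketch assumes the standard 12-column BLASTN tabular layout, in which the subject start and end are columns 9 and 10, are 1-based, and are reversed for minus-strand hits. It is a plain-Python stand-in for the GFF3 conversion and bedtools masking used here, and the function names are hypothetical.

```python
from collections import defaultdict


def blast_hits_to_intervals(lines):
    """Turn BLASTN tabular hits into merged 0-based, half-open subject intervals."""
    raw = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        subject, sstart, send = fields[1], int(fields[8]), int(fields[9])
        low, high = min(sstart, send), max(sstart, send)  # minus-strand hits are reversed
        raw[subject].append((low - 1, high))

    merged = {}
    for subject, intervals in raw.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:  # overlapping or directly adjacent: extend
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[subject] = [tuple(iv) for iv in out]
    return merged


def mask_sequence(seq, intervals, mask_char="N"):
    """Replace every base inside the given intervals with mask_char (hard masking)."""
    chars = list(seq)
    for start, end in intervals:
        for i in range(start, min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


if __name__ == "__main__":
    hits = [
        "blockA\tchr1\t100\t5\t0\t0\t1\t5\t3\t7\t1e-5\t10",
        "blockB\tchr1\t100\t4\t0\t0\t1\t4\t9\t6\t1e-5\t10",  # minus-strand hit
    ]
    intervals = blast_hits_to_intervals(hits)
    print(intervals)
    print(mask_sequence("ACGTACGTACGT", intervals["chr1"]))
```

Merging overlapping hits before masking also avoids counting the same nuclear position twice when summarizing the masked fraction of the genome.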

DNA sequencing data analysis

DNA Oxford Nanopore sequencing data were trimmed using Porechop (Version 0.2.4, https://github.com/rrwick/Porechop; Wick, 2018). As a first step, trimmed reads were mapped against the human reference genome GRCh38.p14 (GCA_000001405.29, February 2022) using Minimap2 (Version 2.24) (Li, 2018) (PacBio/Oxford Nanopore read to reference mapping, default settings). Mapping results were filtered using SAMtools view (Version 1.15.1) (Li et al., 2009) (-f read is unmapped). Unmapped reads were then mapped simultaneously against the T. gondii RH-88 nuclear reference genome masked for NUMTs, the apicoplast genome (CM033583.1), and mitochondrial sequence blocks (MN077088.1 - MN077111.1) using Minimap2. A fraction of the reads that best match the mitochondrial genome still contain shorter stretches of nuclear sequence and therefore represent NUMTs. We thus took all reads that mapped to the mitochondrial genome in our first mapping and mapped them against the nuclear reference genome masked for NUMTs using Minimap2 (Version 2.24). The remaining unmapped reads were considered mitochondrial and visualized with Geneious Prime v. 2019.2.3 (Biomatters Ltd.). Reads were annotated using the mitochondrial sequence blocks (MN077088.1 - MN077111.1) and ORFs (MN077082.1 - MN077084.1) as annotation database in the Geneious Prime ‘Annotate from Database’ tool, with similarity set to ≥75%.
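
The "read is unmapped" filter used at each depletion step corresponds to bit 0x4 of the SAM FLAG field. A minimal sketch of that selection on SAM-formatted text lines is shown below; the function name is hypothetical, and real data would normally be filtered with SAMtools as described above.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the 'unmapped' bit set."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # header lines carry no alignments
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & UNMAPPED_FLAG:
            names.append(fields[0])
    return names


if __name__ == "__main__":
    sam = [
        "@HD\tVN:1.6",
        "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
        "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
        "read3\t16\tchr2\t50\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    ]
    # Only read2 failed to map and would be passed on to the next mapping step.
    print(unmapped_read_names(sam))
```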

Sequence comparisons of the ONT reads obtained here with published ONT reads for the T. gondii mitochondrial genome were performed using the NUCmer tool of the MUMmer package in Galaxy with the following settings: minimum match length 20 and minimum alignment length 290 (the shortest read in Namasivayam et al., 2021). The Show-Coords tool was used to output alignment information. We found that 1555 reads of our dataset were entirely contained (defined as ≥99% read coverage) within reads from Namasivayam et al., 2021, and that 212 reads of their dataset were fully contained within reads from our dataset.
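
The ≥99% containment criterion can be sketched as a coverage calculation over alignment coordinates. The sketch assumes that the alignments of one read to a single read of the other dataset are given as 1-based, inclusive (start, end) pairs on the aligned read, as in coordinate tables of the show-coords type. The function names and the exact handling of overlapping alignments are illustrative assumptions.

```python
def covered_fraction(read_length, alignments):
    """Fraction of a read covered by the union of its alignment intervals.

    alignments: iterable of (start, end) pairs, 1-based and inclusive; the
    order within a pair may be reversed for minus-strand alignments.
    """
    intervals = sorted((min(s, e), max(s, e)) for s, e in alignments)
    covered = 0
    current_end = 0
    for start, end in intervals:
        start = max(start, current_end + 1)  # skip bases that were already counted
        if end >= start:
            covered += end - start + 1
            current_end = end
    return covered / read_length


def is_contained(read_length, alignments, threshold=0.99):
    """True if at least `threshold` of the read is covered by the alignments."""
    return covered_fraction(read_length, alignments) >= threshold


if __name__ == "__main__":
    # Two overlapping alignments that together cover 995 of 1000 bases.
    print(covered_fraction(1000, [(1, 600), (500, 995)]))
    print(is_contained(1000, [(1, 600), (500, 995)]))
```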

Analysis of mitochondrial DNA sequence block arrangements

For a comprehensive overview of the analysis, including the complete .Rmd report, please refer to the GitHub repository (https://github.com/Kovox91/ToxoBlocks, copy archived at Maschmann, 2024). In brief, annotated block IDs, indicating the directionality of each block, were exported as plain text. A custom R script using the tidyverse library (Wickham et al., 2019) was employed for the subsequent analysis. The script removed block combinations in which two blocks were separated by more than 10 nucleotides. For two-block combinations, the script counted all identified combinations and filtered out those occurring fewer than 50 times. The results, incorporating block orientations, were visualized using ComplexHeatmap (Gu and Hübschmann, 2022). To quantify the relative abundance of ORFs, counts for the respective ORF combinations were tallied across all reads. The ORF frequency was then computed as the ratio of reads containing at least one instance of the specific ORF to the total number of reads that were at least as long as the given ORF.

RNA sequencing data analysis

Adapter and quality trimming of RNA sequencing data was performed using Cutadapt (Version 4.1; Martin, 2011). UMI-tools was used to extract UMIs (Version 1.1.2; Smith et al., 2017). Reads were aligned to the T. gondii RH-88 nuclear reference genome masked for NUMTs and the apicoplast genome (CM033583.1) using Bowtie2 (Version 2.4.5; Langmead et al., 2009) with default settings. Mapping results were filtered using SAMtools view (Version 1.15.1; Li et al., 2009) (-f read is unmapped). Unmapped reads were kept and mapped with Bowtie2 using default settings against the T. gondii mitochondrial genome, represented by all combinations of sequence blocks (GenBank accession MN077088.1 - MN077111.1, OR086910 - OR086916) identified in our ONT DNA sequencing data. Transcript ends were determined as follows. Most transcripts had short poly(A) tails of up to 12 adenines at their 3’ ends, which is similar to the situation in P. falciparum (Rehkopf et al., 2000). To identify transcript 3’ ends, all poly(A) tails longer than three nucleotides were therefore removed from reads using fastp (Chen et al., 2018). The number of mapped 3’ and 5’ read ends at each position in the reference was calculated using Bedtools Genome Coverage 2.30.0 (Quinlan and Hall, 2010). The 5’ ends at the beginning of coverage plateaus above the background were considered starting points of transcripts. For each 5’ end, the most distant mapped 3’ end that was still represented by at least 10% of the reads of the 5’ terminus and connected to the starting point by uninterrupted coverage above the background was considered the corresponding 3’ terminus. Alignments and coverage graphs were visualized in the Integrative Genomics Viewer (Robinson et al., 2011). All RNA sequencing data are available at SRA under accession number PRJNA978626.
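
The rule for pairing a 5’ end with its 3’ terminus can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used here: per-position counts are given as plain lists indexed by reference position, the numeric `background` threshold is a hypothetical parameter (no value is given above), and the poly(A) helper is a regular-expression stand-in for the fastp trimming step.

```python
import re


def trim_poly_a(seq):
    """Remove a 3'-terminal run of more than three adenines."""
    return re.sub(r"A{4,}$", "", seq)


def assign_three_prime_end(start, five_prime_counts, three_prime_counts,
                           coverage, background=0, min_fraction=0.10):
    """Find the 3' terminus that belongs to the 5' end at position `start`.

    Walks downstream while coverage stays above `background` and returns the
    most distant position whose 3'-end count reaches `min_fraction` of the
    reads starting at `start`. Returns None if no position qualifies.
    """
    threshold = min_fraction * five_prime_counts[start]
    best = None
    pos = start
    while pos < len(coverage) and coverage[pos] > background:
        if three_prime_counts[pos] >= threshold:
            best = pos
        pos += 1
    return best


if __name__ == "__main__":
    # Toy counts: one transcript starting at position 1, a coverage gap at 8.
    coverage = [0, 50, 50, 50, 40, 40, 5, 5, 0, 30]
    five = [0, 50, 0, 0, 0, 0, 0, 0, 0, 30]
    three = [0, 0, 0, 10, 0, 30, 0, 4, 0, 30]
    print(assign_three_prime_end(1, five, three, coverage))
    print(trim_poly_a("ACGTAAAA"))
```

In the toy data, the 3’ ends at position 7 fall below the 10% threshold and those at position 9 lie beyond a coverage gap, so position 5 is reported as the 3’ terminus.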

Sequence similarity search

All newly identified transcripts were subjected to an NCBI-BLASTN homology search (v. 2.13.0; Altschul et al., 1990; default parameters, except word size = 7) against the Plasmodium falciparum mitochondrial genome (M76611.1) and the Eimeria leuckarti mitochondrial genome (MW354691.1). Transcripts not matching the P. falciparum or E. leuckarti mitochondrial genome were further inspected for homology to the E. coli rRNA operon (J01695.2) using BLASTN.

Data access

The RNA and DNA sequencing data generated in this study have been submitted to the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA978626. New block annotations can be found in GenBank, OR086910 - OR086916.

Acknowledgements

We are grateful to Giel van Dooren, Frank Seeber, Kai Matuschewski, and Hannes Ruwe for the critical discussion of our data and manuscript. We sincerely thank Frank Seeber for providing the T. gondii strains. We are grateful to Giel van Dooren for providing the plasmids for CRISPR/Cas9 tagging and for support from the microscopy facility at ANU Canberra. We thank Svea Beier for preparing the constructs for RPL11 tagging and Florian Rösch for preparing the vector for the ESR1 spike-in amplification. This work was supported by the German Research Foundation (DFG) through grant IRTG2290, project B01, to CS. The article processing charge was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 491192747 and the Open Access Publication Fund of Humboldt-Universität zu Berlin.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Christian Schmitz-Linneweber, Email: smitzlic@rz.hu-berlin.de.

Marisa Nicolás, Laboratório Nacional de Computação Científica, Brazil.

Lori Sussel, University of Colorado Anschutz Medical Campus, United States.

Funding Information

This paper was supported by the following grant:

Deutsche Forschungsgemeinschaft IRTG2290-B01 to Christian Schmitz-Linneweber.

Additional information

Competing interests

No competing interests declared.

Author contributions

Formal analysis, Investigation, Methodology, Writing - review and editing.

Validation, Investigation, Writing - review and editing.

Investigation.

Resources, Data curation, Formal analysis, Writing - review and editing.

Formal analysis, Investigation.

Data curation, Formal analysis, Supervision, Writing - review and editing.

Conceptualization, Supervision, Writing - original draft, Writing - review and editing.

Additional files

Supplementary file 1. Oxford Nanopore sequencing technology (ONT) sequencing mapping statistics on nuclear and organellar genomes.

elife-95407-supp1.docx (14.5 KB, docx)

Supplementary file 2. Read length distribution of T. gondii mitochondrial Oxford Nanopore sequencing technology (ONT) reads.

elife-95407-supp2.docx (12.8 KB, docx)

Supplementary file 3. Mitochondrial sequence block frequencies in Oxford Nanopore sequencing technology (ONT) DNA sequencing data.

elife-95407-supp3.docx (12.9 KB, docx)

Supplementary file 4. Sequence block combinations identified in T. gondii mitochondrial Oxford Nanopore sequencing technology (ONT) reads using a custom R script.

Combinations that were found fewer than 50 times are considered false positives and are shown in gray.

elife-95407-supp4.docx (18 KB, docx)

Supplementary file 5. Fraction of reads containing full-length open reading frames.

elife-95407-supp5.docx (12.6 KB, docx)

Supplementary file 6. Mapping statistics of the RNA-seq data on the different genomes/sequence blocks.

After filtering the raw reads against the nuclear rRNA genes, the remaining reads were mapped against the three subgenomes of T. gondii RH-88.

elife-95407-supp6.docx (13.4 KB, docx)

Supplementary file 7. List of mitochondrial non-coding RNAs identified by small RNA (sRNA) sequencing.

elife-95407-supp7.docx (19.6 KB, docx)

Supplementary file 8. Overview of mitochondrial non-coding RNAs identified in P. falciparum.

elife-95407-supp8.docx (14.7 KB, docx)

Supplementary file 9. List of oligonucleotides used in this study.

elife-95407-supp9.docx (15.9 KB, docx)

MDAR checklist

elife-95407-mdarchecklist1.pdf (190.9 KB, pdf)

Data availability

The following datasets were generated:

Tetzlaff S, Schmitz-Linneweber C. 2023. Characterization of short RNAs, in particular rRNAs from mitochondria of Toxoplasma gondii. NCBI BioProject. PRJNA978626

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block F genomic sequence; mitochondrial. NCBI Nucleotide. OR086910

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block K genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086911

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block M genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086912

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block H genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086913

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block C genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086914

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block Q genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086915

Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block B genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086916

References

Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Cech M, Chilton J, Clements D, Coraor N, Grüning BA, Guerler A, Hillman-Jackson J, Hiltemann S, Jalili V, Rasche H, Soranzo N, Goecks J, Taylor J, Nekrutenko A, Blankenberg D. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Research. 2018;46:W537–W544. doi: 10.1093/nar/gky379.
Alday PH, Bruzual I, Nilsen A, Pou S, Winter R, Ben Mamoun C, Riscoe MK, Doggett JS. Genetic evidence for cytochrome b Qi site inhibition by 4(1H)-quinolone-3-diarylethers and antimycin in Toxoplasma gondii. Antimicrobial Agents and Chemotherapy. 2017;61:e01866-16. doi: 10.1128/AAC.01866-16.
Allkanjari K, Baldock RA. Beyond base excision repair: an evolving picture of mitochondrial DNA repair. Bioscience Reports. 2021;41:BSR20211320. doi: 10.1042/BSR20211320.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
Berná L, Marquez P, Cabrera A, Greif G, Francia ME, Robello C. Reevaluation of the Toxoplasma gondii and Neospora caninum genomes reveals misassembly, karyotype differences, and chromosomal rearrangements. Genome Research. 2021a;31:823–833. doi: 10.1101/gr.262832.120.
Berná L, Rego N, Francia ME. The elusive mitochondrial genomes of Apicomplexa: where are we now? Frontiers in Microbiology. 2021b;12:751775. doi: 10.3389/fmicb.2021.751775.
Burger G, Valach M. Perfection of eccentricity: mitochondrial genomes of diplonemids. IUBMB Life. 2018;70:1197–1206. doi: 10.1002/iub.1927.
Bushell E, Gomes AR, Sanderson T, Anar B, Girling G, Herd C, Metcalf T, Modrzynska K, Schwach F, Martin RE, Mather MW, McFadden GI, Parts L, Rutledge GG, Vaidya AB, Wengelnik K, Rayner JC, Billker O. Functional profiling of a Plasmodium genome reveals an abundance of essential genes. Cell. 2017;170:260–272. doi: 10.1016/j.cell.2017.06.030.
Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–i890. doi: 10.1093/bioinformatics/bty560.
Coelho PSR, Bryan AC, Kumar A, Shadel GS, Snyder M. A novel mitochondrial protein, Tar1p, is encoded on the antisense strand of the nuclear 25S rDNA. Genes & Development. 2002;16:2755–2760. doi: 10.1101/gad.1035002.
DeRocher A, Hagen CB, Froehlich JE, Feagin JE, Parsons M. Analysis of targeting sequences demonstrates that trafficking to the Toxoplasma gondii plastid branches off the secretory system. Journal of Cell Science. 2000;113(Pt 22):3969–3977. doi: 10.1242/jcs.113.22.3969.
Esseiva AC, Naguleswaran A, Hemphill A, Schneider A. Mitochondrial tRNA import in Toxoplasma gondii. The Journal of Biological Chemistry. 2004;279:42363–42368. doi: 10.1074/jbc.M404519200.
Feagin JE, Harrell MI, Lee JC, Coe KJ, Sands BH, Cannone JJ, Tami G, Schnare MN, Gutell RR. The fragmented mitochondrial ribosomal RNAs of Plasmodium falciparum. PLOS ONE. 2012;7:e38320. doi: 10.1371/journal.pone.0038320.
Gjerde B. Characterisation of full-length mitochondrial copies and partial nuclear copies (numts) of the cytochrome b and cytochrome c oxidase subunit I genes of Toxoplasma gondii, Neospora caninum, Hammondia heydorni and Hammondia triffittae (Apicomplexa: Sarcocystidae). Parasitology Research. 2013;112:1493–1511. doi: 10.1007/s00436-013-3296-4.
Gu Z, Hübschmann D. Make interactive complex heatmaps in R. Bioinformatics. 2022;38:1460–1462. doi: 10.1093/bioinformatics/btab806.
Gupta A, Shah P, Haider A, Gupta K, Siddiqi MI, Ralph SA, Habib S. Reduced ribosomes of the apicoplast and mitochondrion of Plasmodium spp. and predicted interactions with antibiotics. Open Biology. 2014;4:140045. doi: 10.1098/rsob.140045.
Hikosaka K, Kita K, Tanabe K. Diversity of mitochondrial genome structure in the phylum Apicomplexa. Molecular and Biochemical Parasitology. 2013;188:26–33. doi: 10.1016/j.molbiopara.2013.02.006.
Hillebrand A, Matz JM, Almendinger M, Müller K, Matuschewski K, Schmitz-Linneweber C. Identification of clustered organellar short (cos) RNAs and of a conserved family of organellar RNA-binding proteins, the heptatricopeptide repeat proteins, in the malaria parasite. Nucleic Acids Research. 2018;46:10417–10431. doi: 10.1093/nar/gky710.
Hollin T, Abel S, Falla A, Pasaje CFA, Bhatia A, Hur M, Kirkwood JS, Saraf A, Prudhomme J, De Souza A, Florens L, Niles JC, Le Roch KG. Functional genomics of RAP proteins and their role in mitoribosome regulation in Plasmodium falciparum. Nature Communications. 2022;13:1275. doi: 10.1038/s41467-022-28981-7.
Huynh M-H, Carruthers VB. Tagging of endogenous genes in a Toxoplasma gondii strain lacking Ku80. Eukaryotic Cell. 2009;8:530–539. doi: 10.1128/EC.00358-08.
Jacot D, Soldati-Favre D. CRISPR/Cas9-mediated generation of tetracycline repressor-based inducible knockdown in Toxoplasma gondii. Methods in Molecular Biology. 2020;2071:125–141. doi: 10.1007/978-1-4939-9857-9_7.
Ji YE, Mericle BL, Rehkopf DH, Anderson JD, Feagin JE. The Plasmodium falciparum 6 kb element is polycistronically transcribed. Molecular and Biochemical Parasitology. 1996;81:211–223. doi: 10.1016/0166-6851(96)02712-0.
Katris NJ, van Dooren GG, McMillan PJ, Hanssen E, Tilley L, Waller RF. The apical complex provides a regulated gateway for secretion of invasion factors in Toxoplasma. PLOS Pathogens. 2014;10:e1004074. doi: 10.1371/journal.ppat.1004074.
Ke H, Morrisey JM, Ganesan SM, Mather MW, Vaidya AB. Mitochondrial RNA polymerase is an essential enzyme in erythrocytic stages of Plasmodium falciparum. Molecular and Biochemical Parasitology. 2012;185:48–51. doi: 10.1016/j.molbiopara.2012.05.001.
Ke H, Dass S, Morrisey JM, Mather MW, Vaidya AB. The mitochondrial ribosomal protein L13 is critical for the structural and functional integrity of the mitochondrion in Plasmodium falciparum. The Journal of Biological Chemistry. 2018;293:8128–8137. doi: 10.1074/jbc.RA118.002552.
Kermekchiev M, Ivanova L. Ribin, a protein encoded by a message complementary to rRNA, modulates ribosomal transcription and cell proliferation. Molecular and Cellular Biology. 2001;21:8255–8263. doi: 10.1128/MCB.21.24.8255-8263.2001.
Kozik A, Rowan BA, Lavelle D, Berke L, Schranz ME, Michelmore RW, Christensen AC. The alternative reality of plant mitochondrial DNA: one ring does not rule them all. PLOS Genetics. 2019;15:e1008373. doi: 10.1371/journal.pgen.1008373.
Kupsch C, Ruwe H, Gusewski S, Tillich M, Small I, Schmitz-Linneweber C. Arabidopsis chloroplast RNA binding proteins CP31A and CP29A associate with large transcript pools and confer cold stress tolerance by influencing multiple chloroplast RNA processing steps. The Plant Cell. 2012;24:4266–4280. doi: 10.1105/tpc.112.103002.
Lacombe A, Maclean AE, Ovciarikova J, Tottey J, Mühleip A, Fernandes P, Sheiner L. Identification of the Toxoplasma gondii mitochondrial ribosome, and characterisation of a protein essential for mitochondrial translation. Molecular Microbiology. 2019;112:1235–1252. doi: 10.1111/mmi.14357.
Lane KD, Mu J, Lu J, Windle ST, Liu A, Sun PD, Wellems TE. Selection of Plasmodium falciparum cytochrome B mutants by putative PfNDH2 inhibitors. PNAS. 2018;115:6285–6290. doi: 10.1073/pnas.1804492115.
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
Lee VV, Judd LM, Jex AR, Holt KE, Tonkin CJ, Ralph SA. Direct nanopore sequencing of mRNA reveals landscape of transcript isoforms in apicomplexan parasites. mSystems. 2021;6:e01081-20. doi: 10.1128/mSystems.01081-20.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–3100. doi: 10.1093/bioinformatics/bty191.
Liu J, Zhou W, Liu G, Yang C, Sun Y, Wu W, Cao S, Wang C, Hai G, Wang Z, Bock R, Huang J, Cheng Y. The conserved endoribonuclease YbeY is required for chloroplast ribosomal RNA processing in Arabidopsis. Plant Physiology. 2015;168:205–221. doi: 10.1104/pp.114.255000.
Loizeau K, Qu Y, Depp S, Fiechter V, Ruwe H, Lefebvre-Legendre L, Schmitz-Linneweber C, Goldschmidt-Clermont M. Small RNAs reveal two target sites of the RNA-maturation factor Mbb1 in the chloroplast of Chlamydomonas. Nucleic Acids Research. 2014;42:3286–3297. doi: 10.1093/nar/gkt1272.
MacRae JI, Sheiner L, Nahid A, Tonkin C, Striepen B, McConville MJ. Mitochondrial metabolism of glucose and glutamine is required for intracellular growth of Toxoplasma gondii. Cell Host & Microbe. 2012;12:682–692. doi: 10.1016/j.chom.2012.09.013.
Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal. 2011;17:10. doi: 10.14806/ej.17.1.200.
Maschmann S. Toxoblocks. Software Heritage. 2024. swh:1:rev:784e01420fb2e66250cc8d7c03a8ce8cd52477a4. https://archive.softwareheritage.org/swh:1:dir:96431634e26f39477122f5903761dfb67b73eb7c;origin=https://github.com/Kovox91/ToxoBlocks;visit=swh:1:snp:2d12bf05e81399b2d3781175600213a4b07a98ce;anchor=swh:1:rev:784e01420fb2e66250cc8d7c03a8ce8cd52477a4
Mauro VP, Edelman GM. rRNA-like sequences occur in diverse primary transcripts: implications for the control of gene expression. PNAS. 1997;94:422–427. doi: 10.1073/pnas.94.2.422.
McFadden DC, Tomavo S, Berry EA, Boothroyd JC. Characterization of cytochrome b from Toxoplasma gondii and Q(o) domain mutations as a mechanism of atovaquone-resistance. Molecular and Biochemical Parasitology. 2000;108:1–12. doi: 10.1016/s0166-6851(00)00184-5. [DOI] [PubMed] [Google Scholar]
Melo EJ, Attias M, De Souza W. The single mitochondrion of tachyzoites of Toxoplasma gondii. Journal of Structural Biology. 2000;130:27–33. doi: 10.1006/jsbi.2000.4228. [DOI] [PubMed] [Google Scholar]
Molan A, Nosaka K, Hunter M, Wang W. Global status of Toxoplasma gondii infection: systematic review and prevalence snapshots. Tropical Biomedicine. 2019;36:898–925. [PubMed] [Google Scholar]
Namasivayam S, Baptista RP, Xiao W, Hall EM, Doggett JS, Troell K, Kissinger JC. A novel fragmented mitochondrial genome in the protist pathogen Toxoplasma gondii and related tissue coccidia. Genome Research. 2021;31:852–865. doi: 10.1101/gr.266403.120. [DOI] [PMC free article] [PubMed] [Google Scholar]
Namasivayam S, Sun C, Bah AB, Oberstaller J, Pierre-Louis E, Etheridge RD, Feschotte C, Pritham EJ, Kissinger JC. Massive invasion of organellar DNA drives nuclear genome evolution in Toxoplasma. PNAS. 2023;120:e2308569120. doi: 10.1073/pnas.2308569120. [DOI] [PMC free article] [PubMed] [Google Scholar]
Nishimura K, Ashida H, Ogawa T, Yokota A. A DEAD box protein is required for formation of a hidden break in Arabidopsis chloroplast 23S rRNA. The Plant Journal. 2010;63:766–777. doi: 10.1111/j.1365-313X.2010.04276.x. [DOI] [PubMed] [Google Scholar]
Oborník M, Lukeš J. The organellar genomes of Chromera and Vitrella, the phototrophic relatives of apicomplexan parasites. Annual Review of Microbiology. 2015;69:129–144. doi: 10.1146/annurev-micro-091014-104449. [DOI] [PubMed] [Google Scholar]
Ohle C, Tesorero R, Schermann G, Dobrev N, Sinning I, Fischer T. Transient RNA-DNA hybrids are required for efficient double-strand break repair. Cell. 2016;167:1001–1013. doi: 10.1016/j.cell.2016.10.001. [DOI] [PubMed] [Google Scholar]
Ossorio PN, Sibley LD, Boothroyd JC. Mitochondrial-like DNA sequences flanked by direct and inverted repeats in the nuclear genome of Toxoplasma gondii. Journal of Molecular Biology. 1991;222:525–536. doi: 10.1016/0022-2836(91)90494-q. [DOI] [PubMed] [Google Scholar]
Pajak A, Laine I, Clemente P, El-Fissi N, Schober FA, Maffezzini C, Calvo-Garrido J, Wibom R, Filograna R, Dhir A, Wedell A, Freyer C, Wredenberg A. Defects of mitochondrial RNA turnover lead to the accumulation of double-stranded RNA in vivo. PLOS Genetics. 2019;15:e1008240. doi: 10.1371/journal.pgen.1008240. [DOI] [PMC free article] [PubMed] [Google Scholar]
Parker KER, Fairweather SJ, Rajendran E, Blume M, McConville MJ, Bröer S, Kirk K, van Dooren GG. The tyrosine transporter of Toxoplasma gondii is a member of the newly defined apicomplexan amino acid transporter (ApiAT) family. PLOS Pathogens. 2019;15:e1007577. doi: 10.1371/journal.ppat.1007577. [DOI] [PMC free article] [PubMed] [Google Scholar]
Pfaffl MW. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Research. 2001;29:e45. doi: 10.1093/nar/29.9.e45. [DOI] [PMC free article] [PubMed] [Google Scholar]
Piro F, Carruthers VB, Di Cristina M. PCR Screening of Toxoplasma gondii single clones directly from 96-well plates Without DNA Purification. Methods in Molecular Biology. 2020;2071:117–123. doi: 10.1007/978-1-4939-9857-9_6. [DOI] [PubMed] [Google Scholar]
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
Ramrath DJF, Niemann M, Leibundgut M, Bieri P, Prange C, Horn EK, Leitner A, Boehringer D, Schneider A, Ban N. Evolutionary shift toward protein-based architecture in trypanosomal mitochondrial ribosomes. Science. 2018;362:eaau7735. doi: 10.1126/science.aau7735. [DOI] [PubMed] [Google Scholar]
Rehkopf DH, Gillespie DE, Harrell MI, Feagin JE. Transcriptional mapping and RNA processing of the Plasmodium falciparum mitochondrial mRNAs. Molecular and Biochemical Parasitology. 2000;105:91–103. doi: 10.1016/s0166-6851(99)00170-x. [DOI] [PubMed] [Google Scholar]
Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nature Biotechnology. 2011;29:24–26. doi: 10.1038/nbt.1754. [DOI] [PMC free article] [PubMed] [Google Scholar]
Roos DS, Donald RG, Morrissette NS, Moulton AL. Molecular tools for genetic dissection of the protozoan parasite Toxoplasma gondii. Methods in Cell Biology. 1994;45:27–63. doi: 10.1016/s0091-679x(08)61845-2. [DOI] [PubMed] [Google Scholar]
Saurer M, Leibundgut M, Nadimpalli HP, Scaiola A, Schönhut T, Lee RG, Siira SJ, Rackham O, Dreos R, Lenarčič T, Kummer E, Gatfield D, Filipovska A, Ban N. Molecular basis of translation termination at noncanonical stop codons in human mitochondria. Science. 2023;380:531–536. doi: 10.1126/science.adf9890. [DOI] [PubMed] [Google Scholar]
Sayers EW, Cavanaugh M, Clark K, Ostell J, Pruitt KD, Karsch-Mizrachi I. GenBank. Nucleic Acids Research. 2019;47:D94–D99. doi: 10.1093/nar/gky989. [DOI] [PMC free article] [PubMed] [Google Scholar]
Seeber F, Ferguson DJ, Gross U. Toxoplasma gondii: a paraformaldehyde-insensitive diaphorase activity acts as a specific histochemical marker for the single mitochondrion. Experimental Parasitology. 1998;89:137–139. doi: 10.1006/expr.1998.4266. [DOI] [PubMed] [Google Scholar]
Seeber F, Feagin JE, Parsons M, van Dooren GG. In: Toxoplasma gondii. Weiss LM, Kim K, editors. Academic Press; 2020. Chapter 11 - the Apicoplast and Mitochondrion of Toxoplasma gondii; pp. 499–545. [Google Scholar]
Shikha S, Silva MF, Sheiner L. Identification and validation of Toxoplasma gondii mitoribosomal large subunit components. Microorganisms. 2022;10:863. doi: 10.3390/microorganisms10050863. [DOI] [PMC free article] [PubMed] [Google Scholar]
Sidik SM, Huet D, Ganesan SM, Huynh MH, Wang T, Nasamu AS, Thiru P, Saeij JPJ, Carruthers VB, Niles JC, Lourido S. A genome-wide CRISPR screen in Toxoplasma identifies essential Apicomplexan Genes. Cell. 2016;166:1423–1435. doi: 10.1016/j.cell.2016.08.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
Smith T, Heger A, Sudbery I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Research. 2017;27:491–499. doi: 10.1101/gr.209601.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
Srivastava IK, Morrisey JM, Darrouzet E, Daldal F, Vaidya AB. Resistance mutations reveal the atovaquone-binding domain of cytochrome b in malaria parasites. Molecular Microbiology. 1999;33:704–711. doi: 10.1046/j.1365-2958.1999.01515.x. [DOI] [PubMed] [Google Scholar]
Subczynski WK, Pasenkiewicz-Gierula M, Widomska J, Mainali L, Raguz M. High cholesterol/low cholesterol: effects in biological membranes: a review. Cell Biochemistry and Biophysics. 2017;75:369–385. doi: 10.1007/s12013-017-0792-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
Suplick K, Morrisey J, Vaidya AB. Complex transcription from the extrachromosomal DNA encoding mitochondrial functions of Plasmodium yoelii. Molecular and Cellular Biology. 1990;10:6381–6388. doi: 10.1128/mcb.10.12.6381-6388.1990. [DOI] [PMC free article] [PubMed] [Google Scholar]
Syafruddin D, Siregar JE, Marzuki S. Mutations in the cytochrome b gene of Plasmodium berghei conferring resistance to atovaquone. Molecular and Biochemical Parasitology. 1999;104:185–194. doi: 10.1016/s0166-6851(99)00148-6. [DOI] [PubMed] [Google Scholar]
Szabo EK, Finney CAM. Toxoplasma gondii: one organism, multiple models. Trends in Parasitology. 2017;33:113–127. doi: 10.1016/j.pt.2016.11.007. [DOI] [PubMed] [Google Scholar]
Tobiasson V, Berzina I, Amunts A. Structure of a mitochondrial ribosome with fragmented rRNA in complex with membrane-targeting elements. Nature Communications. 2022;13:6132. doi: 10.1038/s41467-022-33582-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
Vaidya AB, Lashgari MS, Pologe LG, Morrisey J. Structural features of Plasmodium cytochrome b that may underlie susceptibility to 8-aminoquinolines and hydroxynaphthoquinones. Molecular and Biochemical Parasitology. 1993;58:33–42. doi: 10.1016/0166-6851(93)90088-f. [DOI] [PubMed] [Google Scholar]
Valach M, Benz C, Aguilar LC, Gahura O, Faktorová D, Zíková A, Oeffinger M, Burger G, Gray MW, Lukeš J. Miniature RNAs are embedded in an exceptionally protein-rich mitoribosome via an elaborate assembly pathway. Nucleic Acids Research. 2023;51:6443–6460. doi: 10.1093/nar/gkad422. [DOI] [PMC free article] [PubMed] [Google Scholar]
van Dooren GG, Tomova C, Agrawal S, Humbel BM, Striepen B. Toxoplasma gondii Tic20 is essential for apicoplast protein import. PNAS. 2008;105:13574–13579. doi: 10.1073/pnas.0803862105. [DOI] [PMC free article] [PubMed] [Google Scholar]
van Dooren GG, Yeoh LM, Striepen B, McFadden GI. The import of proteins into the mitochondrion of Toxoplasma gondii. The Journal of Biological Chemistry. 2016;291:19335–19350. doi: 10.1074/jbc.M116.725069. [DOI] [PMC free article] [PubMed] [Google Scholar]
Vercesi AE, Rodrigues CO, Uyemura SA, Zhong L, Moreno SN. Respiration and oxidative phosphorylation in the apicomplexan parasite Toxoplasma gondii. The Journal of Biological Chemistry. 1998;273:31040–31047. doi: 10.1074/jbc.273.47.31040. [DOI] [PubMed] [Google Scholar]
Waltz F, Giegé P, Hashem Y. Purification and cryo-electron microscopy analysis of plant mitochondrial ribosomes. Bio-Protocol. 2021a;11:e4111. doi: 10.21769/BioProtoc.4111. [DOI] [PMC free article] [PubMed] [Google Scholar]
Waltz F, Salinas-Giegé T, Englmeier R, Meichel H, Soufari H, Kuhn L, Pfeffer S, Förster F, Engel BD, Giegé P, Drouard L, Hashem Y. How to build a ribosome from RNA fragments in Chlamydomonas mitochondria. Nature Communications. 2021b;12:7176. doi: 10.1038/s41467-021-27200-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
Wei W, Ba Z, Gao M, Wu Y, Ma Y, Amiard S, White CI, Rendtlew Danielsen JM, Yang YG, Qi Y. A role for small RNAs in DNA double-strand break repair. Cell. 2012;149:101–112. doi: 10.1016/j.cell.2012.03.002. [DOI] [PubMed] [Google Scholar]
Wessel D, Flügge UI. A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Analytical Biochemistry. 1984;138:141–143. doi: 10.1016/0003-2697(84)90782-6. [DOI] [PubMed] [Google Scholar]
Wick R. Porechop. 109e437. GitHub. 2018. https://github.com/rrwick/Porechop
Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen T, Miller E, Bache S, Müller K, Ooms J, Robinson D, Seidel D, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H. Welcome to the Tidyverse. Journal of Open Source Software. 2019;4:1686. doi: 10.21105/joss.01686. [DOI] [Google Scholar]

eLife. doi: 10.7554/eLife.95407.sa0

Editor's evaluation

Marisa Nicolás ¹

This study brings a solid methodological advancement on previous attempts to understand the nature of the mitochondrial genome in Toxoplasma, applicable to Apicomplexa in general. The authors achieve extensive mitochondrial sequencing through a compelling and validated methodology, providing valuable RNA data. The improved catalog of mitochondrial rRNA and identification of overlapping protein-coding and rRNA genes will interest researchers in mitochondrial evolutionary biology and the mitoribosome community.

eLife. doi: 10.7554/eLife.95407.sa1

Decision letter

Editor: Marisa Nicolás¹

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

[Editors' note: this paper was reviewed by Review Commons.]

eLife. 2024 Feb 16;13:e95407. doi: 10.7554/eLife.95407.sa2

Author response

General Statements

Thank you for providing us the opportunity to submit a revised version of our manuscript. To address the reviewers' concerns, we have significantly enhanced the manuscript by adding extensive data. This new data is crucial for demonstrating that our polysome analysis can effectively differentiate between polysomes and monosomes. It strengthens our key conclusion that the mitochondrial sRNAs we detected by a combination of nanopore-based genome sequencing and small RNA sequencing are used in actively translating ribosomes. This is to date the most direct evidence for translation in apicomplexan mitochondria. Moreover, the small RNAs generated from genomic recombination sites that are fusions of mRNA and rRNA sequences are found in polysomes as well. As far as we know, the linking of rRNA production to genomic recombination sites has not been described before.

The data added are:

We introduced an additional transgenic line of T. gondii that produces a tagged variant of the ribosomal protein L11. This step was crucial to illustrate the distribution of ribosomes in our sucrose gradient analysis.
We examined the distribution of all three mitochondrial mRNAs in our sucrose gradient centrifugation experiments, showing that they are present in large molecular weight complexes susceptible to ribosome dissociation conditions.
Our results indicate that small RNAs mirror the behavior of mRNAs and L11 in our gradients, suggesting their role in the formation of ribosomes and, subsequently, polysomes.

The generation of these data involved Zala Gluhic, who conducted the IFA of the tagged L11 line, Nikiforos Drakoulis who carried out the mRNA analysis in sucrose gradients, and Sascha Maschmann, who re-evaluated some of our mitochondrial genome Nanopore sequencing data bioinformatically, as per the reviewers' request. Consequently, all three contributors are now coauthors of the manuscript.

Point-by-point description of the revisions

Reviewer #1

Your appreciation of our work is greatly encouraging; thank you for analyzing our manuscript!

Evidence, reproducibility and clarity

Summary:

Mitochondrial genomes of Apicomplexa parasites have undergone dramatic reductions during their evolution with genes for only three proteins remaining. In addition, ribosomal RNA genes are present in different, often species-specific gene arrangements. Toxoplasma exhibits massive variations in gene arrangement that are distributed over multiple copies. In this study the Schmitz-Linneweber lab not only re-analysed the mitochondrial genome of Toxoplasma gondii using a novel protocol for enriching the organellar nucleic acid, allowing them to sequence the mitochondrial genome at unprecedented depth, they also addressed an enigma regarding the expression status of mitochondrial ribosomes.

While indirect evidence of mitochondrial translation exists, no direct evidence for active mitoribosomes exists and their composition is still poorly understood. Here, using HTS of small RNAs, the authors demonstrate that they are incorporated into polysomes. Furthermore, the authors developed the hypothesis that the block-based genome organization enables the dual utilization of mitochondrial sequences as both messenger RNAs and ribosomal RNAs.

Own opinion/Major comments

The mitochondria of the Apicomplexa are characterized by massive gene transfer into the cell nucleus, and sequence rearrangements, which has led to a single, questioned genome reorganization. The underlying mechanisms of gene transcription and translation are also poorly understood. In a previous study, the Kissinger lab demonstrated the unique organization of the mitochondrial genome, which consists minimally of 21 sequence blocks (SBs) totaling 5.9 kb that exist as nonrandom concatemers (Namasivayam et al. 2021).

In this study the authors optimized a new isolation technique of organellar content to sequence the mitochondrial genome. This new purification protocol appears to be very robust and allowed the sequencing of mitochondrial genome at unprecedented depth. The obtained data not only validate previous studies, but they also suggest several new features, such as (potentially) continuous reshuffling of DNA blocks, leading to independent block combinations.

The most important aspect of this study is the demonstration of polysomes and the presence of rRNAs within these complexes, taking previous studies (i.e. Lacombe et al., 2019) a step further.

Taking all these efforts and data into account it is a very nice and interesting study that will certainly be of interest for a broader readership. All the presented data and analysis appear to be solid and well controlled. However, it must be mentioned that this reviewer is not an expert when it comes to the analysis and comparison of huge genomic datasets and the opinion of a bioinformatician would be helpful in assessing this study in more detail.

All other data (organellar purification and analysis of polysomes) appear state of the art and no corrections are required.

Significance

General assessment:

Taking all these efforts and data into account it is a very nice and interesting study that will certainly be of interest for a broader readership. All the presented data and analysis appear to be solid and well controlled. However, it must be mentioned that this reviewer is not an expert when it comes to the analysis and comparison of huge genomic datasets and the opinion of a bioinformatician would be helpful in assessing this study in more detail.

All other data (organellar purification and analysis of polysomes) appear state of the art and no corrections are required.

Advance:

The study fills an important gap in our knowledge regarding the organization and translational activity of the apicomplexan (Toxoplasma) mitoribosome. See also comments above.

Audience: Cell Biology, Parasitology, Mitochondria

Reviewer #2

We sincerely thank you for your constructive feedback and the thorough evaluation of our manuscript.

Evidence, reproducibility and clarity

In this article, the authors delve into an intriguing topic, aiming to enhance our understanding of the organization of the mitochondrial genome of T. gondii, a parasite of significant importance in both human and animal health contexts.

In essence, their approach involves enriching mitochondrial material, followed by genome sequencing and the analysis of mitochondrial short RNAs. They achieve a remarkable depth of mitochondrial sequencing and generate valuable RNA data. Furthermore, their efforts lead to the discovery and annotation of new short RNAs.

Overall, the article is well-crafted and presents compelling results. However, it's worth noting that, at times, the authors appear somewhat self-congratulatory, and certain results might be perceived as overly ambitious. Nevertheless, the discussion is aptly constructed.

Major comment: (we have numbered the comments of the reviewer)

1. They assert certain discoveries that had already been reported. Notably, they adapt an existing protocol for mitochondrial enrichment and describe it as 'We developed a protocol to enrich T. gondii mitochondria.'

We did reference the protocol that our method was built on in the methods section: “A previously established protocol to enrich T. gondii organelles was modified here slightly (Esseiva et al. 2004).” We agree that the phrasing “developed a protocol” in the Conclusions chapter is an overstatement and changed this to “adapted a protocol”, citing Esseiva et al. (line 675).

2. It's worth noting that they neither reference a more recently described protocol (PMC6851545) nor compare the performance of their modified protocol with the original.

We added this reference, but since we have not been using the Lacombe protocol mentioned by the reviewer and have no need to compare efficiencies to support our conclusions (this not being a methods paper), we would rather abstain from a protocol comparison / citation. We do add info on why our protocol was serving our DNA-sequencing efforts very well (see next answer please).

3. The protocol they employ does not seem to yield exceptionally high success rates, as mitochondrial DNA constitutes less than 10% of the total sequenced DNA.

We value the reviewer's critical evaluation. We understand that the proportion of mitochondrial sequences in our data may imply a low success rate. Nevertheless, we would suggest a comparison of the proportion of mitochondrial DNA sequenced from a total DNA sample (Namasivayam et al.) versus DNA obtained from the organelle enrichment protocol (Supplementary file 1). With the organelle enriched sample we could obtain a ~42-fold increase in mitochondrial sequencing depth. When compared to the RH Δuprt ONT sequencing data from Namasivayam et al. the increase is even ~184-fold.

We have added the following information to the text:

Line 177-179:

“This is a 42-fold increase in the sequencing depth of the T. gondii mitochondrial genome compared to previous attempts (Supplementary file 1), which can be attributed to the effectiveness of the purification process.”
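To make the arithmetic behind such a fold increase explicit, here is a minimal Python sketch. It assumes that mean depth is calculated as mapped mitochondrial bases divided by the total length of the reference blocks (5.9 kb, the figure cited by Reviewer #1). The function names and the base counts in the example are hypothetical and are not taken from Supplementary file 1.

```python
# Illustrative only: names and numbers are assumptions, not the authors' data or code.

REFERENCE_LENGTH_NT = 5_900  # total length of the 21 sequence blocks (5.9 kb)


def mean_depth(mapped_bases: int, reference_length: int = REFERENCE_LENGTH_NT) -> float:
    """Mean sequencing depth = mapped bases / reference length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return mapped_bases / reference_length


def fold_increase(enriched_bases: int, baseline_bases: int) -> float:
    """Ratio of mean depth in the enriched sample to mean depth in the baseline."""
    baseline_depth = mean_depth(baseline_bases)
    if baseline_depth == 0:
        raise ValueError("baseline depth is zero; fold increase is undefined")
    return mean_depth(enriched_bases) / baseline_depth


# Hypothetical base counts, chosen only to show the calculation.
example = fold_increase(enriched_bases=1_180_000, baseline_bases=59_000)
```

Because both depths are divided by the same reference length, the ratio reduces to the ratio of mapped mitochondrial bases. The comparison is therefore only meaningful when both datasets are mapped against the same reference.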

4. Additionally, they frequently mention the identification of specific combinations of sequence blocks previously identified by Namasivayam et al. (PMC8092004), which was also discussed in Namasivayam et al. 2021.

After publishing this manuscript on bioRxiv, we were approached by an author from the Namasivayam paper, who also raised this point and helped us to improve the manuscript substantially and to properly represent previous work by the Kissinger lab. We apologize for this failure to properly represent important previous work. According to your and this author’s advice, we amended the text at several positions:

Lines 35-38 (abstract):

“It has been established previously that the T. gondii genome comprises 21 sequence blocks that undergo recombination among themselves, but that their order is not entirely random. The enhanced coverage of the mito genome allowed us to characterize block combinations at increased resolution.”

Line 115: we added the citation:

“The T. gondii nuclear genome contains many insertions of mitochondrial DNA sequences (Ossorio et al. 1991; Gjerde 2013; Namasivayam et al. 2023)”

Line 169-169: we deleted the following sentence:

“However, a quantitative analysis of block combinations remained beyond reach due to the low amount of sequence reads.”

Line 191: we added the following sentence:

“, thus corroborating at a deeper sequencing coverage conclusions so far gained from a more limited read set (Namasivayam et al. 2021; Berná et al. 2021a).”

Line 200: we added a sentence to specify the work by Namasivayam versus our work:

“, confirming block combinations observed previously (Namasivayam et al. 2021) and adding combination frequencies based on higher read numbers.”

Line 222:

“Full-length coding regions had been found previously on nanopore reads in T. gondii (Namasivayam et al. 2021). In our improved representation of the mitogenome, we identified a large number of full representations of cob, coxI and coxIII in our dataset […].”

Line 221: Namasivayam et al. did not claim post-transcriptional shuffling of blocks as we erroneously stated; thus we deleted their reference here and added a sentence to make clear that we use approaches established by Namasivayam et al. for describing block combinations.

Now line 227:

“This may be adequate for the expression of the encoded proteins; however, we cannot currently exclude the possibility of genomic or post-transcriptional block shuffling that could lead to more complete open reading frames (as discussed in Berná et al., 2021b). We next applied approaches established previously to represent biased recombination events based on alternative block combinations (see Figure S12 in Namasivayam et al. 2021) to our improved ONT read set.”

Line 278 and 284.

Namasivayam et al. did not annotate but rather predicted rRNAs based on homology searches. We changed the verb accordingly.

Now line 267-268:

“Among these, 11 correspond to previously predicted rRNA fragments (Namasivayam et al., 2021).”

Now line 271-273:

“This included a reassignment of SSUF to the opposite strand and also affected the four rRNA fragments LSUF, LSUG, LSUD, and LSUE, which had been predicted as separate transcripts of the large subunit (LSU) of the ribosome (Namasivayam et al., 2021).”

Now line 305-308:

“We next analyzed the accumulation of selected RNAs previously undetected in T. gondii for the sequence blocks Kp-K: Transcripts RNA5, 17, and 29, as well as the already predicted transcripts RNA10 and SSUD are found on the minus strand (strandedness according to Namasivayam et al., 2021; MN077088.1 – MN077111.1).”

Line 567:

“The individual ONT reads are not repeated in the two previously published datasets (discussed in Berná et al., 2021b) and we also did not find overlap between our data presented here and published previously (Namasivayam et al., 2021). Thus, the large number of block combinations identified here reinforces and elevates previous conclusions that continuous recombination shuffles the blocks (Namasivayam et al. 2021).”

5. Missing in the supplementary material are basic details on the sequences performed. Distribution of mitochondrial reads length, depth, etc.

We included a supplemental table (Supplementary file 2) showing read length distribution of mitochondrial ONT reads. Other basic information on ONT sequencing and Illumina sRNA sequencing data we provide in Tab S1 and S6.

6. Further clarification is needed for Figure 2. Specifically, the frequency units or combinations of frequency (A, B, and C) are not clearly explained. While the matrix's asymmetry suggests a 5'- 3' orientation difference, this orientation difference is not explicitly specified (B). Additionally, the fragment Mp does not appear in the block combination figure (C).

We thank the Reviewer for pointing out the lack of clarity in Figure 2. In Figure 2 we show the number of occurrences of individual blocks and block combinations across the entire set of mitochondrial ONT reads. To enhance clarity, we revised the figure caption and included additional details. We have moved Figure 2B to the supplementary section (now Figure 2—figure supplement 1A), because the information shown there is mostly redundant with Figure 2C. We refined the R code for analyzing the block combinations by also considering block directionality. Furthermore, we removed combinations of blocks with gaps or overlaps between the respective annotated blocks larger than 10 nucleotides. This removed the majority of low-abundance block combinations. The heatmap presenting the results is now labeled differently and includes a legend for distinguishing between block orientations. Mp was merged with fragment B since no independent B fragment was observed in the dataset. We added a sentence to the caption of Figure 2—figure supplement 2 explaining the reasoning behind merging blocks B and Mp. The new sequence version was published in GenBank under accession numbers OR086910-OR086916.
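The filtering logic described in this reply can be sketched as follows. This is a minimal Python illustration, not the R code used for the manuscript. It assumes each read has already been annotated with block hits carrying a block name, a strand and read coordinates. The `BlockHit` type, the function name and the example coordinates are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple

MAX_GAP_OR_OVERLAP_NT = 10  # threshold stated in the reply above


class BlockHit(NamedTuple):
    """One annotated sequence block on a read (hypothetical structure)."""
    block: str   # block name, e.g. "A"
    strand: str  # "+" or "-": orientation of the block on the read
    start: int   # 0-based start coordinate on the read
    end: int     # exclusive end coordinate on the read


def count_block_pairs(
    reads: Iterable[list[BlockHit]],
    max_distance: int = MAX_GAP_OR_OVERLAP_NT,
) -> Counter:
    """Count neighbouring block pairs, keeping orientation in the key.

    A pair is skipped when the gap (positive distance) or the overlap
    (negative distance) between the two blocks exceeds max_distance.
    """
    counts: Counter = Counter()
    for hits in reads:
        ordered = sorted(hits, key=lambda h: h.start)
        for left, right in zip(ordered, ordered[1:]):
            distance = right.start - left.end
            if abs(distance) > max_distance:
                continue
            key = (left.block + left.strand, right.block + right.strand)
            counts[key] += 1
    return counts


# Hypothetical annotated reads, for illustration only.
example_reads = [
    [BlockHit("A", "+", 0, 100), BlockHit("B", "-", 103, 200), BlockHit("C", "+", 250, 300)],
    [BlockHit("A", "+", 0, 100), BlockHit("B", "-", 95, 180)],
]
pair_counts = count_block_pairs(example_reads)
```

In this example the pair A+/B- is counted twice (a 3-nt gap and a 5-nt overlap), whereas B-/C+ is dropped because of its 50-nt gap. The sketch does not merge a pair with its reverse-complement equivalent, which a full analysis may need to do.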

Some points to improve the introduction:

7. Provide an evolutionary context for the following phrase: 'An idiosyncratic feature of Apicomplexa is a highly derived mitochondrial genome.' Specify what you intend to emphasize.

Line 55. We deleted this sentence since this is explained further down in greater detail (now line 61):

“Dozens of apicomplexan mitochondrial genomes have been sequenced (Berná et al. 2021b). These sequences showcase the extreme reductive evolution in apicomplexan mitochondria, setting records for the smallest mitochondrial genomes known to date, ranging in length from 6 to 11 kb (Hikosaka et al. 2013; Oborník and Lukeš 2015).”

Line 56: The sentence must begin with a capital letter

OK, thanks.

8. In line 58 "Nuclear genes encoding proteins with functions in mitochondria contribute strongly to P. falciparum and T. gondii cell fitness"

Although it is mentioned later, it would be more effective to introduce the fact that all but three genes are encoded in the nucleus.

We added the following:

Line 59-61:

“With only three exceptions, all mitochondrial proteins are nuclear encoded and these nuclear genes contribute strongly to P. berghei and T. gondii cell fitness (Sidik et al. 2016; Bushell et al. 2017).”

9. Line 68: "Apicomplexan mitogenomes usually code only for three proteins"

It seems to me that 'usually' should not be included.

Agreed.

10. Line 65-67: The sentence should include that the mitochondrial genome is composed of a total of 20 blocks of repeating sequences organized in multiple DNA molecules of varying length and non-random combinations

The corrected sentence reads (line 67-69):

“The T. gondii mitochondrial genome is composed of 21 repetitive sequence blocks that are organized on multiple DNA molecules of varying lengths and non-random combinations (Namasivayam et al. 2021; Berná et al. 2021a).”

11. At the end of the introduction, the authors state that they have developed a protocol for mitochondrial enrichment. The text should be modified taking into account that:

1- The new protocol is an adaptation of another existing protocol. In fact, in the Methods the authors say the protocol was "slightly" modified.

We rephrased the sentence (line 105-107):

“We used a slightly modified protocol of mitochondria enrichment (Esseiva et al. 2004) to investigate the structure of the mitochondrial genome through long-read sequencing at an unprecedented depth.”

2- There is already existing mitochondrial enrichment protocol available [Reference: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6851545/#mmi14357-bib-0074]. In any case, they should consider performing a comparative analysis between the proposed protocol and existing ones to determine its relative effectiveness. It should be noted that the proposed protocol enriches in organelles (including the nucleus and apicoplast), but when sequencing DNA, mitochondrial DNA accounts for only 5% of the total reads, which may raise doubts about its overall efficacy.

Thank you for pointing out this protocol. Since we based our protocol on Esseiva et al., we will not cite Lacombe et al. here. The modified Esseiva protocol allowed a 42-fold better coverage of mitochondrial genome sequences over previous efforts, which is all we asked for. We do not intend to focus here on methods development and comparisons and clearly should not have stressed novelty as we did. The method is now appropriately described and referenced in abstract, introduction, and results. Please see also our answer to your point #2.

Benzonase is 60 kDa and will move into the nucleus. And yes, the apicoplast will be co-enriched with mitochondria, which is however not an issue given the good coverage we get for mitochondrial DNA (and we wrote “leaving mitochondria and other organelles intact”).

Some points related to Results section:

12. Lines 113-115: 'To distinguish between NUMTs (nuclear DNA sequences that originated from mitochondria) and true mitochondrial sequences, it is necessary to enrich mitochondrial DNA.' I disagree with this sentence. NUMTs, in general, consist of very short sequences. With long reads, it is relatively straightforward to differentiate mitochondrial sequences from those nuclear sequences that have small mitochondrial fractions. In my opinion, even many Illumina reads can be confidently identified as belonging solely to the mitochondria. I found this article that supports this argument, indicating that the majority of NUMTs are less than 100 nucleotides in length [Reference: https://pubmed.ncbi.nlm.nih.gov/37293002/].

According to the paper you cite, there are hundreds of NUMTs that are longer than 100 nt and thus pose potential problems for mis-mapping of Illumina reads. We believe it is therefore helpful to increase sequencing depth using enrichment of mitochondria and long reads to say with high confidence that the sequences are of mitochondrial origin. We agree, however, that toning down our statement is appropriate, given that Illumina reads alone already allow analyses of mitochondrial DNA. The phrase now reads like this (line 116-118):

“To better distinguish between NUMTs (nuclear DNA sequences that originated from mitochondria) and true mitochondrial sequences, it is helpful to enrich mitochondrial DNA.”
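The length argument in this exchange can be made concrete with a small Python sketch. It rests on a simplifying assumption: a read can only be derived entirely from a NUMT if that NUMT is at least as long as the read. The function name and the NUMT lengths below are hypothetical and are not taken from the cited study.

```python
def numts_that_can_contain_read(numt_lengths: list[int], read_length: int) -> int:
    """Count NUMTs long enough to span a read of the given length completely."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return sum(1 for length in numt_lengths if length >= read_length)


# Hypothetical NUMT lengths in nucleotides, for illustration only.
example_numt_lengths = [40, 80, 120, 300, 950]

short_read_risk = numts_that_can_contain_read(example_numt_lengths, read_length=100)
long_read_risk = numts_that_can_contain_read(example_numt_lengths, read_length=1000)
```

With these made-up lengths, three NUMTs could fully contain a 100-nt read but none could contain a 1,000-nt read. This is the sense in which longer reads reduce ambiguity between nuclear and mitochondrial origin.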

13. Lines 166-168: 'A previous sequencing study used Oxford Nanopore sequencing technology (ONT) to identify combinations of sequence blocks in T. gondii mitochondria (Namasivayam et al. 2021).' However, it's important to note that Namasivayam's group did not merely use ONT to identify combinations of blocks; rather, they discovered, identified, and defined these combinations based on sequencing with long reads.

As explained in our reply to your comment #4 (see above), we have discussed this directly with an author from Namasivayam et al. (2021) and have adjusted all our statements to adequately reference their work.

14. Line 177: "The length of mitochondrial reads ranged from 87 nt to 17,424 nt"

It would be beneficial to include a histogram depicting the length distribution of the obtained reads. It's worth noting that nanopore reads tend to be shorter than Illumina reads

See point 5; we included a supplemental table (Supplementary file 2) showing the read length distribution of mitochondrial ONT reads.

15. Line 194-195 "we found that only a small fraction of all possible block combinations are prevalent within the genome". This has been previously described (PMC8092004).

We added this to the sentence in question (line 200-202):

“[…] confirming block combinations observed previously (Namasivayam et al. 2021) and adding combination frequencies based on higher read numbers.”

Please see also our answer to your point #4.

Line 201. "This indicates that the genome's flexibility is limited and that not all block combinations are realized". This is consistent with the findings published by Namasivayam et al. in 2021, which have already established that the combination of the 21 blocks is non-random.

We added the citation to this statement: “[…] (see also Namasivayam et al. 2021).”

16. Line 205: "All combinations are well covered in our ONT results and helped to refine block borders relative to previous annotations (Figure S2)". In the supplementary materials the authors say: "However, the blocks Fp, Kp, and Mp frequently occur separately in the mitochondrial genome. We therefore treated Fp, Kp and Mp as separate blocks and have shortened the blocks F, K and M accordingly".

As far as I understand, for this very reason, Namasivayam and collaborators annotate them as partial fragments, which may appear in other regions but are, in turn, parts of larger F, K, and M fragments. To redefine the segments F, K, and M without the sequences corresponding to Fp, Kp, and Mp, as shown in Figure S2, these fragments should be distinct from the 'partials.' In other words, segments of the type (F minus Fp), (K minus Kp), and (M minus Mp) should appear in the reads, and should be distinguishable from Fp, Kp, and Mp. If this distinction is made, I am satisfied with the new definition. However, if such a separation is not evident, it seems important to clarify it in the text or to reconsider this new definition.

Implied in the concept of sequence blocks are active recombination sites. Thus, the occurrence of Fp, Kp and Mp in combinations other than with (F minus Fp), (K minus Kp) and (M minus Mp) requires their definition as single, distinct blocks. Consequently, the remaining sequences (F minus Fp), (K minus Kp) and (M minus Mp) are also blocks, on the other side of the recombination point. This is similar to many other blocks in the genome. For example, in the case of blocks P and I, block I does not occur in the genome in combinations other than with block P, whereas block P does (in combination with block H). This justifies having blocks P and I as separate blocks, rather than having a combined block "PI" or defining a block Ip and I. In sum, our new annotations harmonize block definitions.

17. Lines 221-223: "This suggests that there is no need to postulate mechanisms of genomic or posttranscriptional block shuffling to arrive at full-length open reading frames."

The authors argue that invoking mechanisms of genomic or post-transcriptional block shuffling is unnecessary to explain the presence of full-length open reading frames, given that genes represent 2-3% of mitochondrial sequences. However, there is a missing estimate regarding the probability of encountering all three genes within a single molecule or mitochondrial genome, as well as the total number of sequenced mitochondria. Consequently, the statement appears overly assertive. In the absence of alternative mechanisms for generating complete genes, this would mean that at most only 1646 mitochondrial genomes would have been sequenced.

To comprehensively address this issue, the authors should consider discussing this scenario further. They should also provide information about how many reads they found containing all three genes and how many contained two of the genes.

We do not think that the three ORFs need to be on one molecule, since many mitochondrial genomes are fragmented. Stochastic inheritance of a multi-copy genome can work just like strict inheritance of defined chromosomes. Addressing this is clearly beyond the scope of our work and would need controls for mitochondrial integrity and number during preparation that we simply do not have. But the numbers shown demonstrate that there are full-length ORFs and thus that it is most parsimonious to assume that this suffices to support mitochondrial gene expression.

Regarding the exact numbers that the reviewer would like to see:

We found 18 reads containing all three ORFs and 369 containing two different ORFs at least once. If we calculate the numbers of reads containing a full-length ORF relative to all reads that are long enough to fit an ORF, we arrive at the following numbers: coxIII: 5.1%, cob: 8.6%, coxI: 11.1%.

We added these numbers to the text (line 225-227).

“Thus, of all reads long enough to carry one of the following ORFs, 5.1% contain full-length coxIII, 8.6% contain full-length cob, and 11.1% full-length cox1, respectively (Supplementary file 5).”
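The percentages above are fractions of reads that are long enough to fit the respective ORF, not fractions of all reads. A minimal sketch of that calculation is given below; the function name, the input structures and the toy values are hypothetical and only illustrate the arithmetic, not the pipeline actually used.

```python
def full_length_orf_percentage(read_lengths, reads_with_orf, orf_length):
    """Percentage of sufficiently long reads that carry a full-length ORF.

    read_lengths   -- dict mapping read ID to read length in nt
    reads_with_orf -- set of read IDs in which the full-length ORF was found
    orf_length     -- length of the ORF in nt (supplied by the caller)
    """
    # Only reads at least as long as the ORF can contain it in full.
    eligible = {rid for rid, length in read_lengths.items() if length >= orf_length}
    if not eligible:
        return 0.0
    hits = eligible & set(reads_with_orf)
    return 100.0 * len(hits) / len(eligible)


# Toy data (not from the study): three of four reads are long enough,
# one of those three carries the ORF.
toy_lengths = {"r1": 500, "r2": 1500, "r3": 2000, "r4": 3000}
toy_hits = {"r3"}
toy_percentage = full_length_orf_percentage(toy_lengths, toy_hits, orf_length=1000)
```

Restricting the denominator to eligible reads avoids penalizing an ORF simply because many reads are shorter than the ORF itself.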

Furthermore, to address the reviewer’s concern, we toned down our statement on post-transcriptional shuffling (line 227-229):

“This may be adequate for the expression of the encoded proteins; however, we cannot currently exclude the possibility of genomic or post-transcriptional block-shuffling that could lead to more complete open reading frames”

18. Lines 249-250 "using the block combinations identified here by ONT sequencing": what is the difference between the blocks identified here and those of Namasivayam et al.? The division of the M, K and F fragments?

Yes, the differences concern the mentioned blocks, but also some other block adjustments as shown in Figure 2—figure supplement 2 (adjusted sequences uploaded in GenBank accessions OR086910-OR086916). We used the adjusted blocks for mapping due to reasons explained in our answer to point #16.

19. Line 287: "The six remaining small RNA fragments are specific to T. gondii"

I would suggest being more cautious in this sentence by stating that they were not found in other organisms. Given the similarity of the mitochondrial genome between T. gondii, N. caninum, and other coccidians, it would be expected to find them in these organisms as well.

We thank the reviewer for bringing up this issue. It’s true that the sequences of these small RNAs are conserved among other cyst-forming Eucoccidians. We corrected the sentence in the manuscript accordingly and changed the title of the last column in Supplementary file 7 to “only found in cyst-forming Eucoccidians”.

Corrected sentence (line 297-299):

“Six further sRNAs are exclusively conserved within cyst-forming Eucoccidians, the closest apicomplexan relatives of T. gondii (last column in Supplementary file 7).”

20. Line 300 "Among the novel small RNAs identified, there is also a class that was only detectable due to our insights into genome block combinations."

A valid strategy is to map the small RNAs to the generated nanopore reads or to an assembly made with these reads, rather than solely relying on the single blocks or combinations of blocks, as this approach would yield the same result.

This would have been an alternative strategy. An assembly of a holo-genome is, however, not possible. Based on all sequencing efforts to date, we hypothesize that the genome is represented by multiple DNA fragments. Still, you are right that we could have mapped the RNA reads to the entire set of reads or read clusters. We would then have had to disable the selection against multi-mapping reads, and we would have faced the problem that nanopore reads contain many sequencing errors. We therefore decided that the cleaner and faster method is to map against a set of all block combinations found. The results should not differ.

21. Line 444: "Upon closer scrutiny, however, the reshuffling appears limited to specific block borders and is not random" This was already established by Namasivayam et al. 2021.

I would like to highlight the potential for a more comprehensive examination of the mitochondrial genome in the discussion. While the proposed explanations for the presence of sRNAs at the 'block borders' appear plausible, it's worth noting that the definition of these blocks is artificial rather than biological. I think it is interesting to discuss without the concept of block sequences, but of sequences existing in the mitochondrial genome. Therefore, it's important to discuss whether these sequences (the block borders) are consistently present in all mitochondrial genomes. The total cumulative length of the blocks is 5.9 Kb, which is relatively small and comparable to one of the smallest mitochondrial genomes on record. It is conceivable that recombination and the generation of new sequences play a role in expanding genomic space for encoding, such as RNAs.

The blocks are defined by recombination events that are implied by sequence combinations found. Therefore, the block definition by Namasivayam et al. 2021 is based on biological processes and not arbitrary / artificial. We agree however with the reviewer that recombination increases the genomic space, in particular when comparing with mitogenomes from relatives like Plasmodium species. To account for this interesting idea, we added the following sentence to the discussion:

line 592-595:

“In fact, the sequences at recombination sites could be regarded as an expansion of the mitochondrial genome sequence space, which is not available to other Apicomplexans like the genus Plasmodium.”

22. Line 535-536 "We developed a protocol to enrich T. gondii mitochondria and used Nanopore sequencing to comprehensively map the genome with its repeated sequence blocks."

I find this sentence to be somewhat assertive, especially considering that they modified an existing protocol and obtained results that may not be optimal. Additionally, they have not compared their protocol with other available methods for mitochondrial enrichment.

We agree that this was not appropriately phrased (and cited) – please see our answer to your point #1.

Some points related to Method section:

23. In none of the experiments is it specified how many parasites were initially used as a starting point

We added the missing information about the number of parasites we used for organelle enrichment. For all the other experiments performed we used a specified amount of RNA or DNA rather than a certain number of parasites.

The adjusted sentence reads (line 735-739):

“Freshly lysed T. gondii cultures from four T175 flasks (according to 8*10⁸ parasites) were filtered through a 3 μm pore size polycarbonate filter to remove host cell debris and harvested by centrifugation at 1500 x g for 10 minutes.”

24. "Masking NUMTs in the T. gondii nuclear genome" it's unclear whether the authors utilize all hits or filter the results of BLASTN. It would be helpful if they specify the criteria for filtering, such as identity percentage or query coverage. Additionally, it's not clear how they generate the GFF3 file from the BLAST results, or whether they instead create a BED file. Providing clarification on this process would enhance the reproducibility of their methods.

Moreover, it would be beneficial if the authors include information regarding the number of sequences they intend to mask, the average length of the NUMTs, and the total percentage of the genome these masked sequences represent.

We did not filter the BLASTN results given that NUMTs can be of small length and have diverged relative to their mitochondrial counterpart (Namasivayam et al. 2023).

The interval information given in the BLASTN tabular output was manually integrated into a GFF3 file format using Excel tools.

We reformulated the sentences (line 850-854):

“All hits obtained in the BLASTN tabular output were manually integrated into a GFF3 file format. NUMTs in the nuclear genome were masked with bedtools MaskFastaBed (Quinlan and Hall 2010) based on the intervals defined in the GFF3 file. In total, we masked 8118 sites with an average length of 92 nt representing in total ~1% of the nuclear genome.”
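For readers who prefer a scripted route over manual conversion, the step from BLASTN tabular output to GFF3 intervals could be sketched as follows. This is an illustration only: it assumes the default 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), with mitochondrial sequences as query and the nuclear genome as subject. The function name, the feature type "NUMT" and the ID scheme are hypothetical.

```python
def blast_tabular_to_gff3(blast_lines, source="blastn", feature_type="NUMT"):
    """Convert BLASTN tabular rows (default 12 columns) into GFF3 lines.

    Subject coordinates are used because, in this assumed setup, the
    nuclear genome is the BLAST subject. No hits are filtered out.
    """
    gff_lines = ["##gff-version 3"]
    hit_number = 0
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = fields[11]
        # Minus-strand hits are reported with sstart > send; GFF3 needs start <= end.
        start, end = min(sstart, send), max(sstart, send)
        strand = "+" if sstart <= send else "-"
        hit_number += 1
        attributes = f"ID=numt{hit_number};Name={qseqid}"
        gff_lines.append("\t".join([
            sseqid, source, feature_type,
            str(start), str(end), bitscore, strand, ".", attributes,
        ]))
    return gff_lines


# Toy hit (not from the study): a 92 nt minus-strand match on a hypothetical contig.
toy_rows = ["blockA\tchrI\t95.0\t92\t4\t0\t1\t92\t1092\t1001\t1e-30\t150"]
toy_gff = blast_tabular_to_gff3(toy_rows)
```

Both BLAST tabular output and GFF3 use 1-based inclusive coordinates, so only the ordering of start and end needs attention for minus-strand hits.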

25. Line 657 "Mapping results were filtered using SAMtools"

The text does not specify the filtering criteria or the parameters used for this process.

We specified the criteria and parameters as follows (line 860-861):

“Mapping results were filtered using SAMtools view (Version 1.15.1) (Li et al. 2009) (-f read is unmapped).”
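In the SAM format, the "read is unmapped" property is bit 0x4 of the FLAG field, and the `-f` option of SAMtools view keeps only records in which the requested bits are set. The sketch below reproduces that selection on plain SAM text lines. It assumes that the bit in question is 0x4 (the numeric value is not given in the quoted sentence), and the function name and toy records are hypothetical.

```python
UNMAPPED_BIT = 0x4  # SAM FLAG bit meaning "segment unmapped"


def keep_unmapped_records(sam_lines):
    """Return SAM alignment records whose FLAG has the unmapped bit set.

    Header lines (starting with '@') are skipped in this sketch.
    """
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & UNMAPPED_BIT:
            kept.append(line)
    return kept


# Toy records (not from the study), truncated to the first mandatory columns.
toy_sam = [
    "@HD\tVN:1.6",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",       # unmapped
    "read2\t0\tchrI\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",  # mapped, forward
    "read3\t16\tchrI\t200\t42\t4M\t*\t0\t0\tACGT\tIIII", # mapped, reverse
    "read4\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",      # paired, read and mate unmapped
]
toy_unmapped = keep_unmapped_records(toy_sam)
```

Testing the bit with `&` rather than comparing the whole FLAG value matters because unmapped reads from paired data carry additional bits, as in the last toy record.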

26. Line 673 states "No matching reads were found" in the "Sequence comparisons of ONT reads found here with published ONT reads for the T. gondii mitochondrial genome" section, but in the results the authors say: "While smaller reads of our dataset are found in full within longer reads in the published datasets, we do not find any examples for reads that would be full matches between the dataset."

Could you provide a more detailed explanation? Specifically, I would like to know how many reads from the dataset (including their length) are also present in other datasets, and at what minimum length do they cease to coincide?

We apologize for the contradiction. We deleted the sentence "No matching reads were found" from the method section, which was meant to describe that we do not find identical reads.

We found 1555 reads of our dataset being entirely part (defined as ≥99% read identity) of reads from the Namasivayam et al. dataset and 212 reads of their dataset being fully found as part of reads from our dataset. The longest reads that coincide are around 2500 nt long. The sequence of a 2521 nt long read from our dataset was found completely as part of a 5574 nt long read from the Namasivayam et al. dataset. Conversely, a 2479 nt long read from the Namasivayam et al. dataset was entirely found within a 4149 nt long read of our dataset.

We added the following sentence to the methods section (line 877-879):

“We found 1555 reads of our dataset being entirely part (defined as ≥99% read identity) of reads from the Namasivayam et al. (2021) and 212 reads of their dataset being fully found within reads from our dataset.”

27. Line 689: The text does not specify the filtering criteria or the parameters used for the SAMtools filtering process.

We added the missing specifications as follows (line 897-898):

“Mapping results were filtered using SAMtools view (Version 1.15.1) (Li et al. 2009) (-f read is unmapped).”

28. Lines 689-693 Please describe better the methodology used.

We had used default settings in Bowtie2 and added this info (line 898-901):

“Unmapped reads were kept and mapped with Bowtie2 using default settings against the T. gondii mitochondrial genome using all combinations of the sequence blocks found in our genome sequence analysis (GenBank accession MN077088.1 – MN077111.1, OR086910 – OR086916) as determined in our ONT DNA sequencing data.”

29. Line 696: the program is fastp not fastq (Chen et al. 2018)

Thanks, we corrected it.

30. Line 697: What do you mean by only the ends of the reads being mapped? How many bases? Or do you mean that the forward and reverse reads were mapped?

The sentence was misleading. We used a tool that calculates, for each base in the reference, the number of mapped reads that have their 3’ or 5’ end at this specific position. We corrected the sentence as follows (line 905-907):

“The number of mapped 3’ and 5’ read ends at each position in the reference was calculated using Bedtools Genome Coverage 2.30.0 (Quinlan and Hall 2010).”
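The idea of counting read ends per reference position can be illustrated with a short sketch. It is not the Bedtools implementation; the function name and the input format (1-based inclusive start and end plus strand for each mapped read) are assumptions made for this example. For reads on the minus strand, the 5’ end lies at the rightmost reference coordinate.

```python
from collections import Counter


def count_read_ends(alignments):
    """Count 5' and 3' read ends per reference position.

    alignments -- iterable of (start, end, strand) tuples with 1-based
                  inclusive reference coordinates and strand '+' or '-'
    Returns two Counters: (five_prime_counts, three_prime_counts).
    """
    five_prime, three_prime = Counter(), Counter()
    for start, end, strand in alignments:
        if strand == "+":
            five_prime[start] += 1
            three_prime[end] += 1
        else:
            # On the minus strand the read's 5' end maps to the higher coordinate.
            five_prime[end] += 1
            three_prime[start] += 1
    return five_prime, three_prime


# Toy alignments (not from the study).
toy_alignments = [(10, 30, "+"), (10, 25, "+"), (5, 30, "-")]
toy_five, toy_three = count_read_ends(toy_alignments)
```

Sharp peaks in such end counts mark positions where many sRNA molecules start or terminate, which is what makes this representation useful for defining sRNA boundaries.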

Significance

In this article, the authors delve into an intriguing topic, aiming to enhance our understanding of the organization of the mitochondrial genome of T. gondii, a parasite of significant importance in both human and animal health contexts.

In essence, their approach involves enriching mitochondrial material, followed by genome sequencing and the analysis of mitochondrial short RNAs. They achieve a remarkable depth of mitochondrial sequencing and generate valuable RNA data. Furthermore, their efforts lead to the discovery and annotation of new short RNAs.

Overall, the article is well-crafted and presents compelling results. However, it's worth noting that, at times, the authors appear somewhat self-congratulatory, and certain results might be perceived as overly ambitious. Nevertheless, the discussion is aptly constructed.

Reviewer #3

We are immensely grateful for the detailed and thoughtful review you provided, particularly your insightful comments on our analysis of ribosomes, which we found to be highly valuable for enhancing the quality of our work. As you will see, we have added results from several experiments to this manuscript in response to your suggestions.

Evidence, reproducibility and clarity

Summary

In their manuscript, Tetzlaff et al. report a substantially improved protocol for the isolation of mitochondria from the parasitic apicomplexan Toxoplasma gondii, which allowed improved sequencing and in-depth analyses of the organism's peculiarly complex mitochondrial genome. Follow-up small RNA-sequencing made it then possible to confirm the expression of fragmented mitochondrial ribosomal RNAs (mt-rRNAs) and to identify a dozen new RNA species of unknown function. The authors document not only multiple Toxoplasma mitochondrial genes that overlap one another-including rRNA and protein-coding genes, otherwise a rare occurrence-but also show that some fragmented rRNA genes recombine, effectively leading to multifunctional sequence segments, another rare feature and consequence of the peculiar architecture of the organism's mitochondrial genome. Lastly, the authors confirm that products of three genes presumed to encode pieces of the highly fragmented mitochondrial large subunit (mtLSU) rRNA do indeed assemble-presumably with additional components-into large molecular-weight complex(es).

Major comments

Key conclusions of the manuscript are that Toxoplasma's mitogenome encodes overlapping rRNA and protein-coding genes, divergent and chimeric rRNA pieces, and several small RNAs (sRNAs) of unknown function. Provided evidence is very solid for certain aspects of the study, but objectionable for the others as detailed below.

1. The extent of the presented analysis of rRNAs and unassigned sRNAs seems lacking. In several places of the manuscript, the authors wonder about potential implications of divergent rRNA sequences, but their analyses appear to have been limited to sequence similarity searches. Had modelling of secondary structure interactions been attempted, this conundrum could potentially be solved. Importantly, similarity searches (to conventional rRNAs) were performed using BLASTN, which is a rather crude tool for the purpose, instead of covariance models/HMMs. It is therefore not entirely surprising that some sRNAs remained unassigned. Admittedly, recognizing rRNA motifs in divergent RNAs is a challenging issue. However, it is important to not conflate similarity to conventional rRNA and the molecule's functionality as an rRNA, i.e., sequence divergence does not necessarily disqualify the unassigned sRNAs as potential rRNAs. Mitochondrial rRNA sequences are among the most divergent, often constrained only by base-pairing, if at all, as shown by research on kinetoplastid and diplonemid mt-rRNAs, which contain very few conserved elements and very few base pairs (e.g., Ramrath,2018,Science & Valach,2023,NAR). Even in generally less divergent cases such as green algae, the fragment encoding a highly divergent and derived 5S-like rRNA was recognized as such only after the mitoribosome structures were determined (Waltz,2021,Nature Comm & Tobiasson,2022,Nature Comm). It would not be surprising if the same was the case for Toxoplasma's fairly quickly evolving mitochondrial genome.

We certainly did not want to imply that lack of sequence similarity rules out inclusion into a ribosome. Regarding structural analyses and more refined tools to identify similarities to rRNAs: we agree with the reviewer that such bioinformatic tools can only give tentative hints, no answers. We therefore opted for providing experimental evidence that several of the unassigned small RNAs are found in ribosomes (see extended Figure 6, new figures 7 and 8). We furthermore modified a statement on sRNA:rRNA sequence similarity and incorporated the citations suggested by the reviewer:

Line 339-343:

“None of the three RNAs had detectable homologies to rRNA based on simple sequence searches, but structural conservation cannot be ruled out. Mitochondrial rRNA sequences from kinetoplastids and diplonemids show very little sequence conservation but are still part of the mitoribosome (Valach et al. 2023; Ramrath et al. 2018), suggesting that future analyses might uncover hidden rRNA similarities.”

2. The discovery of overlapping protein-coding and rRNA genes is intriguing, but the authors do not explain why it should be considered as fundamentally groundbreaking as the 'Abstract' and 'Discussion' make it sound. Gene overlaps are found in mitochondria of many organisms (e.g., fungi, animals, various protists), especially of tRNA and protein-coding genes. Even in Plasmodium, a rather close relative of Toxoplasma studied in the presented work, LSUB (rRNA) gene overlaps cob (protein) gene in the antisense orientation. Admittedly, the extent of the overlaps in Toxoplasma does seem fairly high at a first glance, but it is necessary to provide more data and, importantly, broader context to make the case that Toxoplasma overlaps are in any way special. For instance, what is the average size of the overlaps? What is their cumulative size? How does their extent (i.e., the size of overlapping coding sequences compared to the total length of coding sequences) compare to gene overlaps in other (mitochondrial) genomes?

Certain additional aspects of the analysis and interpretation of protein- and/or rRNA-coding sequence overlaps are somewhat underdeveloped. For example, are the RNA-coding regions that overlap protein-coding sequences more divergent in those three conserved proteins compared to other organisms, i.e., does their function as rRNA take precedence, or is the converse the truth, i.e., are the rRNA sections more divergent? RNA19 (overlapping coxIII and cob) is the only example discussed in depth, but at least a short sentence summarizing the overall picture would be useful. As for the authors' interpretations, proposed formation of sRNA:mRNA hybrids, through which sRNAs could be implicated in facilitating mRNA recognition by the mitoribosome, is an interesting hypothesis, but a simpler scenario, which is given very little space, is that the genes happen to overlap by chance and that the overlaps are merely a consequence of genome compaction (a phenomenon that the authors rightly highlight). Without a comprehensive analysis, it is impossible to conclude which possibility is more likely. For instance, if both protein-coding and non-protein-coding sequences are divergent, this would indicate that there are few evolutionary constraints and so the fact that these sequences overlap means very little and might be just due to neutral drift, an effect of genome compaction without much consequence for the organism.

Lastly, considerable significance is attributed in the study to the presence of antisense overlaps, especially between rRNA- (or sRNA-) and protein-coding genes. Yet, the overall extent of sense and antisense overlaps in the Toxoplasma mitogenome is quite similar, which-again-seems to point to a neutral evolutionary process. Can the authors elaborate if this aspect of the genome architecture was taken into account and if they regard it as of lesser relevance (and why, if so)?

We are at a loss why the reviewer thinks that we mark the mRNA-sRNA overlap as “groundbreaking”. This is not our wording and this is not our main finding. In the abstract, we write “we find that many small RNAs originated from the junction sites between protein-coding blocks and rRNA sequence blocks.” In the discussion we write: “A few cases of mRNA sequences overlapping in antisense orientation with rRNA have been described in mammals and yeast (Kermekchiev and Ivanova 2001; Coelho et al. 2002). Sequences homologous to rRNAs have been found in many coding regions in sense and antisense orientation (Mauro and Edelman 1997). It has been suggested that this could link ribosome production to other cellular processes by reciprocal inhibition of mRNA and rRNA expression (Coelho et al. 2002). It is possible that RNA19 and ribosome production could be balanced with cob protein production by tuning the processing or RNA degradation of cob mRNA.” We honestly do not feel that we overstated anything here and in fact we cite prior ideas on this phenomenon.

What we think is outstanding and novel is that there are small RNAs made at recombination sites and that these are functional – incorporated into ribosomes. We show this in our revision for several more sRNA examples in response to the helpful remarks by this reviewer – see below. The further news is that the block border sRNAs fuse mRNA and non-mRNA sequence, which is discussed in the snippet above for RNA19. Maybe we are ignorant and missed something, but where has it been demonstrated that a short RNA from a genomic recombination site that fuses coding and noncoding sequences is incorporated into ribosomes?

In sum, we do not think that extending the manuscript by analyzing the overlaps in greater detail adds anything to our prime conclusion: sRNAs at block borders are incorporated into ribosomes. But for the reviewer’s sake: there are 9 sites in sRNAs with overlap to mRNAs and other sRNAs; the average size of the overlaps is 20 nt, the cumulative size is 176 nt, the longest overlap is 49 nt (RNA1/2) and the shortest is 8 nt (RNA17/13). We doubt, however, that these numbers are interesting for the readers and would rather not include them in the manuscript.

3. Another controversial issue concerns prevalent sequence block combinations and their impact on mitochondrial gene expression regulation. The authors postulate that 5′-terminal blocks of protein-coding genes always occurring near other protein-coding blocks has some functional significance. However, concluding this from just two cases (even if out of two) is quite speculative and seems like reading too much into a pattern that could very well be due to chance alone. The authors argue that the fact that 5′ ends of coxI & coxIII genes overlap is another indication of potential gene expression coordination. While it is possible to envisage such a regulation because of the 5′ termini proximity, the overlap between these genes means that their connection is hardwired into the genome, making it difficult to compare this particular case to the other sequence blocks. Arguably, it is tempting to speculate that an evolutionary pressure exists to coordinate protein expression and such a coordination does not indeed seem implausible, but the presented data and arguments are not convincing. The authors should at least expand on their ideas in the 'Discussion' and indicate potential experiments and/or which additional data could support (or refute) their speculation.

We agree that this is speculation and that we would need to suggest an experiment to address it. Possibly, this could be tested by mutagenizing the linker regions between coding regions, e.g. with TALEN-based genome editors, and monitoring whether there are joint effects on the expression of adjacent genes. However, this is not established for apicomplexan mitochondria. So, on second thought, we deleted this paragraph from the discussion since we really do not want to distract from our main findings on sRNAs.

4. My last major point concerns the experimental examination of large-molecular weight complexes and the interpretation of its results. To prove incorporation of the sRNAs into the mitoribosome, i.e., confirm that they do indeed represent rRNAs, the authors opted to investigate their distribution across a sucrose velocity gradient. This is a relatively simple and powerful approach and although it does not provide an irrevocable proof, it can be used to gain very useful insights. However, the presented design has critical flaws: 1) all sRNAs selected for Northern blot were mtLSU components, so only the mtLSU would be detected;

We have now included two SSU rRNAs.

2) a single cytosolic LSU component was used as the control, so the distribution of cyto-SSU subunit, cyto-ribosome, and cyto-polysomes is actually unclear;

We now purified organelles including a benzonase step, which removes the cytosolic rRNAs. Besides, we focused on the mitochondrial ribosomes and can draw conclusions on their positioning based on added controls (see below) in the absence of information on cytosolic ribosomes.

3) the authors' interpretation relies on the assumption that both mitochondrial and cytosolic ribosomes preserve their association as polysomes, but no relevant control is provided for this. For example, in Figure 6, fractions 6-14 clearly contain cyto-LSU, but polysomes (e.g., disomes) might just as well start in fractions 12-14; without additional controls, or at least continuous monitoring of UV absorbance across the gradient (to show a typical polysomal pattern), it is not guaranteed that what was detected actually included cyto-polysomes.

We now analyzed a total of seven RNAs under ribosome-dissociative conditions and see a strong shift in the sRNA signals towards lower molecular weight fractions.

The main concern, however, is the migration of mitoribosomes. First, the authors presume that the fractions 7-8 contain the mitochondrial monosomes because they are the fractions closest to the gradient top. This is not guaranteed. In fact, based on the experience of our and our colleagues' labs and taking into consideration the conditions used for the described experiment (more precisely, the use of Triton and deoxycholate, which in many organisms lead to mitoribosome subunit dissociation), it seems quite likely that fractions 7-9 actually contain separated mtLSU, not monosomes. Fractions in higher sucrose concentration would then represent monosomes and possibly assembly intermediates, though perhaps also a minor polysomal fraction (if the interactions are preserved in the conditions used). In particular, if the assembly process in Apicomplexa is as complex as in Euglenozoa (e.g., see papers on kinetoplastid mitoribosomes Saurer,2019,Science & Tobiasson,2021,EMBO Journal), which does not seem unlikely in Toxoplasma given the necessity to incorporate ~15 distinct rRNA pieces per mitoribosomal subunit, then the assembly intermediates might form ribonucleoprotein complexes that migrate quite far into a sucrose gradient (e.g., as in kinetoplastid mtSSU, Maslov,2007,Mol Biol Parasit). Thus, while it can be reasonably well argued that the detected RNAs co-migrate with the mtLSU (and possibly mito-monosome), the claim that they associate with mito-polysomes is open to question. More critically, investigating only sRNAs that are clearly identifiable as rRNA pieces-and all from the mtLSU at that-does not automatically prove that all sRNAs associate with the mitoribosome.

To argue that the unassigned sRNAs are associated with mitoribosomes, northern blots of as many as possible (but at the very least one) unassigned sRNAs are absolutely necessary. However, I encourage the authors to consider performing additional experiments to address the issues raised in the preceding paragraph: for example, a western blot of mitochondrial ribosomal protein(s), a northern blot with at least one mtSSU rRNA fragment (since all three shown are from mtLSU), as well as a test that would examine the influence of detergents on mitoribosome stability (e.g., use milder detergents such as digitonin or dodecylmaltoside). Furthermore, if experimental conditions are identified allowing subunit dissociation, it would be possible to discern to which subunit which sRNA belongs and, importantly, whether the unassigned sRNAs are just "disguised" rRNAs (simplest explanation) or something completely different (speculative explanation seemingly favoured by the authors). All this would substantially boost the significance of the presented work.

We thank the reviewer for these thoughtful comments and the many helpful suggestions. Following the reviewer's advice, we have opted to analyze a total of seven sRNAs, including two RNAs assigned to the small subunit and one RNA not described to share homology with E. coli rRNA. Importantly, we included an analysis using standard dissociative conditions (10 mM EDTA, no Mg, 300 mM K) for all sRNAs. Finally, we also include an analysis of the three mRNAs and one mitoribosomal protein. We have furthermore used mitochondrial preparations as starting material, which has the advantage of fewer cross-hybridizations and better detectability of sRNAs in general.

This demonstrated:

High molecular weight complexes deep in the gradient are sensitive to EDTA treatment. Most parsimoniously, these are polysomes, since they comigrate with mitochondrial mRNA and a ribosomal protein we tested (L11:HA, which required generating a transgenic line), both of which are equally EDTA-sensitive.
SSU and LSU sRNAs show a distinct distribution in the gradient, based on which we could assign hitherto unassigned RNA29 to LSU.

Here are some answers to further reviewer comments:

There are no antibodies available that would recognize mitochondrial ribosomal proteins, but we have now included analyses of SSU sRNAs and tagged a ribosomal protein (L11) to allow gradient analysis.
We showed that the high-molecular weight complexes containing sRNAs / rRNAs / L11 are sensitive to EDTA treatment, suggesting they are indeed ribosomes / polysomes.
We did not intend to favor other hypotheses for sRNA functions over them being rRNA fragments. We hope this is now more obvious given the longer discussion of our gradient analysis. Still, we think it is important to consider other explanations for their accumulation.

The massive addition of data (extension of figure 6 and new figures 7 and 8) led to various textual changes in the last chapter of the results as well as in the last chapter of the discussion. There are so many changes that we refrain from listing them all here and have to refer the reviewer to the manuscript itself.

General comments

The word "novel" is rather overused in the manuscript. At several places, it is inappropriate, as the presented results are not as unprecedented as the manuscript makes them sound; at other places, it might be acceptable, but as the word's meaning is vague, the text would benefit from using more informative term(s) instead. The former case is exemplified by the sentence at line 102 "Here, we present a novel method for enriching organellar nucleic acids" – "novel" does not simply mean "new", but alludes to "unprecedented"; yet, the devised method, albeit clever, is a modification of existing approaches. The sentence at line 182 illustrates the latter case where "novel blocks" are mentioned, but "previously not detected blocks" would be more appropriate and to the point.

The labelling of 5′ and 3′ is inconsistent throughout the manuscript – sometimes the prime is used, sometimes the apostrophe, sometimes it is the single quotation mark.

We removed the “novel” in “novel method” in response to a comment by reviewer #2.

We went with the reviewer’s suggestion to rename “novel blocks” to “previously not detected blocks” or “additional blocks” and also applied this to “novel transcripts” and “novel rRNAs”. The word “novel” is no longer found in our text; only in the references.

Abstract

In light of the raised concerns, the authors should consider carefully rewording this section, as some of the formulations are mis-representing the data and lead to unjustified generalizations.

We rephrased the section on block arrangements. Since we strengthened our conclusion on functional sRNAs (as rRNAs) with further experiments, we uphold the part that refers to RNAs.

Introduction

lines 72-73: "How rRNA fragments are assembled into functional ribosomes remains an enigma." – Without proper context, this statement feels like an exaggeration. Fragmented rRNAs are known from other species and their mitoribosome structures were determined in the past few years (i.e., Tetrahymena, Polytomella, Chlamydomonas). Arguably, these mt-rRNAs are not as fragmented as in Toxoplasma, but at the very least, it is clear that base-pairing of rRNA pieces and RNA-binding proteins play significant roles in the process. If the authors think that this is not the case in apicomplexans, this should be at least alluded to, if not explained.

OK, the absoluteness of that statement is indeed misleading. We changed it to:

Line 74-75:

“How the multitude of rRNA fragments are assembled into functional mitoribosomes in T. gondii remains unknown.”

l. 80-83: The paragraph mixes information on Plasmodium and Toxoplasma. To a non-initiated reader, this can be quite confusing. It would be useful to specify which species the authors refer to.

We added species information:

Line 82-86:

“In addition, the depletion of nuclear encoded mito-ribosomal proteins of T. gondii (Lacombe et al. 2019; Shikha et al. 2022) and P. falciparum (Ke et al. 2018) led to defects in the assembly of ETC complexes and in parasite proliferation, suggesting that mito translation is important for parasite survival.”

l. 83-86: The information on the atovaquone impact lacks reference(s).

Thanks, we added the appropriate references as follows.

Line 86-89:

“Resistance to the antimalarial drug atovaquone in P. falciparum and T. gondii has been linked to mutations in the cob (cytB) gene of the mitochondria (McFadden et al. 2000; Syafruddin et al. 1999; Srivastava et al. 1999), further supporting the idea of active, essential translation in apicomplexan mitochondria.”

l. 105: "demonstrated that they are incorporated into polysomes" – In light of the issues raised above and if the authors opt not to expand the work as suggested above, this claim (and similar throughout the text) should be emended.

We provide additional information that this is polysome based – by adding an analysis of sRNAs, mRNAs and L11 distribution under ribosome-destabilizing conditions. Please see the last chapter of the Results section.

l. 106-108: "allowed us to identify novel transcripts, many of which originate from block boundaries and contain mixed origins from coding and noncoding regions." – This sentence would benefit from rephrasing because it is difficult to comprehend (the sequences overlap protein-coding and non-protein-coding regions, but do not contain any origins).

We rephrased (line 108-111):

“The combination of DNA sequencing results and transcriptome analysis also allowed us to identify previously undetected transcripts, many of which originate from block boundaries and represent fusions of coding and noncoding regions.”

Results

l. 115-117: "cell fractionation method that takes advantage of the differential cholesterol content in plasma membranes" – Does Toxoplasma contain cholesterol? Perhaps it might be more practical to refer to sterols (since the effect of digitonin is not limited to cholesterol).

To be more precise we have replaced “cholesterol” with "sterol".

Line 118-122:

“We modified a cell fractionation method that takes advantage of the differential sterol content in plasma membranes and organellar membranes (Esseiva et al., 2004; Subczynski et al., 2017). We incubated cells with the detergent digitonin, which selectively permeabilizes sterol-rich membranes, leaving mitochondria and other organelles intact.”

According to literature, T. gondii cannot synthesize sterols, but instead scavenges cholesterol from the host cell (Coppens et al. 2000). There is a lot of evidence supporting cholesterol incorporation into T. gondii membranes (Foussard et al. 1991, Coppens et al. 2000).

l. 147: "significant increase" – It might be useful to specify that the increase was ~42-fold, so that readers can see the extent of improvement; it has the advantage of really highlighting the achievement.

We value the reviewer's suggestion on how to better emphasize the achievement. We have integrated the fold-change into the manuscript.

Line 177-179:

l. 180: "have been lettered from A-V" – Rewording to "designated by letters from A to V" works better.

We followed the Reviewer’s recommendation and rephrased the sentence.

l. 213-218: This section is essentially a discussion so should be moved to the corresponding section of the manuscript.

Since the reviewer pointed out that our suggestion is a speculation and we in response deleted the corresponding discussion, we shortened this part as well.

Line 220-221:

“[…] it is conspicuous that their three 5'-ends are always flanked by blocks also encoding a protein (Figure 2C), although it is unclear whether this is of functional relevance.”

l. 262-265: cotranscripts/transcript isoforms – It is a matter of nomenclature, but it seems more appropriate to refer to "a transcript containing LSUF and LSUG regions" instead of a co-transcript, because in the latter case, one then expects that these two will be separated in a following processing step, which, as the authors demonstrate, is clearly not the case for the vast majority of the population of these rRNA pieces. Given the prevalence of the larger pieces, it seems more appropriate to refer to the "smaller transcript isoforms" as possible degradation products and not isoforms, which implies some kind of functional relevance.

Yes, we agree that this makes more sense. Here is the revised sentence:

Line 274-278:

“Our sequencing results suggest that there is accumulation of transcripts containing LSUF and LSUG regions and LSUD and LSUE regions, respectively. Both of these transcripts were verified via northern blotting (Figure 3—figure supplement 1). The longer transcripts were found to be much more abundant than smaller transcripts that were also detected, suggesting that the longer transcripts represent the functional rRNA fragments in T. gondii mitochondria.”

l. 281: In the section "Discovery of novel rRNA fragments", it might be useful to provide a graphical representation or at least a sentence summarizing all different categories of sRNAs. For instance, what is missing from the text is that there are 11 species for which homologous sequences in "conventional" rRNAs were not identified and out of these only 4 seem to have sequence homologs in other Apicomplexa. In addition, in Supplementary file 5, the authors could indicate where these homologs are located in Plasmodium, since these appear to be newly identified candidates for Plasmodium sRNA species/rRNA pieces.

We tried to describe short RNA categories and made a table summarizing the categories (Supplementary file 8):

Line 294-304:

“Out of the 34 small RNA fragments identified, 23 have not been described previously for T. gondii (see Supplementary file 7 and Figure 3—figure supplement 2). 17 of the 23 are homologs of sRNA fragments in the apicomplexans P. falciparum and E. leuckarti (marked in bold in Supplementary file 7). These were named according to their Plasmodium homologs (Feagin et al. 2012). Six further sRNAs are exclusively conserved within cyst-forming Eucoccidians, the closest apicomplexan relatives of T. gondii (last column in Supplementary file 7). We assigned numbers to these sRNAs extending the Plasmodium nomenclature (Feagin et al. 2012). We next asked, how many of all 34 sRNAs have homologies to rRNA from E. coli. We could find twelve LSU homologs and eleven SSU homologs in accordance with previous analyses in P. falciparum (Feagin et al. 2012; Hillebrand et al. 2018 ; see Supplementary file 8). Only for the P. falciparum LSU fragment LSUC and the SSU fragment RNA12, we were unable to identify corresponding homologs in T. gondii.”

l. 313-314: "In general, block combinations lead to the expression of novel RNAs in T. gondii that are not found in apicomplexan species with a simpler genome organization. " – It is not clear where this generalization comes from: Figure S5A shows that RNA5, RNA7, RNA23t extend across block borders (but based on Table S5 are not unique to Toxoplasma), while only RNA31 and RNA34 are both absent from other Apicomplexa and extend across block borders – yet, this is still less than half of all newly identified sRNAs. In addition, the novelty claim is not clear either: based on the presented data, several sRNAs that overlap are clearly present in other apicomplexans (e.g., RNA1 and RNA2) and thus are not completely new, but merely more divergent in Toxoplasma, because parts of their sequence have been replaced by the shared sequence segment.

We think that this is poor wording on our side. What we wanted to say is that there are sRNA in T. gondii that are not found in Plasmodium and that these are at block borders. Such RNAs are highlighted in Supplementary file 5 (now Supplementary file 7) in the last column. We rephrased this statement accordingly:

Line 331-333:

“In conclusion, block combinations can lead to the expression of novel RNAs in T. gondii that are not found in apicomplexan species with a simpler genome organization (Supplementary file 7)”

l. 319-320: "None of the three RNAs had detectable homologies to rRNA." – Specify which rRNAs the sequences were compared against to make the inference.

RNA16 and RNA23t are homologs of Plasmodium sRNAs that have not been assigned to ribosomal RNA regions previously (Feagin et al. 2012). In their study, Feagin et al. employed tools which also consider secondary structure and base pairing probability of RNAs to assign these sRNAs to the ribosome. The sequence of RNA34 is conserved in Eimeria, however, the corresponding region is not annotated in publicly available Eimeria mitochondrial genomes. Using BLASTN we could not identify homology to E. coli rRNA regions for RNA34.

We added (line 339-340):

“None of the three RNAs had detectable homologies to E. coli rRNA based on simple sequence searches, but structural conservation cannot be ruled out.”

l. 320-321: "For all five coding-noncoding RNAs, homologs are present in the mitochondrial genome of P. falciparum." – Does this mean that they remain unassigned in Plasmodium as well or that they have not been previously recognized in Plasmodium?

We rephrased this to make clear that they have been recognized before.

Line 343-346:

“With the exception of RNA34, the RNAs representing a fusion of non-protein-coding regions and protein-coding regions, have homologous sequences in the mitochondrial genome of P. falciparum (Namasivayam et al. 2021).”

Confusingly, RNA34 is labeled as not having homologs in Apicomplexa in Table S5.

RNA34 does have a homologue in Eucoccidians and the sequence is also found in Eimeria, but not in Plasmodium (see above); we corrected this in Supplementary file 5 (now Supplementary file 7).

In addition, mentioning "coding-noncoding RNAs" is somewhat misleading because some of the sRNAs clearly code for mt-rRNA pieces.

It is debatable whether “coding” should be used for DNA sequences expressing RNAs that do not code for proteins. “Coding” to us implies the genetic code, i.e. protein coding. Nevertheless, to make this clearer, we rephrased this:

Line 336:

“[…] we also found five RNAs that combine sequences from coding and non-protein-coding blocks […]”

Line 343-346:

l. 335-338: This section contains contradictory statements that should be reformulated. A couple of sentences prior, the authors experimentally determined that RNA19 actually overlaps only a single protein-coding sequence (coxI), but then refer to the original and demonstrably incorrect annotation of RNA19 overlapping also the cob gene.

We apologize for the contradiction. The RACE experiment was performed during the writing process and we unfortunately overlooked this sentence when revising our statements about the RNA19 overlap. We have now corrected the sentence.

line 358-360:

“This sequence similarity to rRNA is maintained in T. gondii (Figure 5E), which suggests that despite overlapping with coxIII coding sequence at the J-E block border, RNA19 is functional.”

l. 341: The authors mention similarity to rRNA, but do not specify which rRNA. Referring to similarity to known or conserved rRNA sequences or segments would work better. Still, the region of the block S (i.e., 5′ proximal segment of RNA19) falls into the region between helices H51 and H60 of the domain III in the LSU secondary structure, which is sequence-wise relatively poorly conserved (especially in mitochondrial rRNAs), so sequence divergence is not unexpected.

Thank you. We added this idea to the text.

Line 366-367:

“The similarity to E. coli 23S rRNA is, however, restricted to the noncoding block R (Figure 5—figure supplement 1).”

Line 369-371:

“It is noteworthy that the 5’ end of RNA3 is located within a region of the 23S rRNA secondary structure that exhibits low conservation (Feagin et al. 2012), which may suggest that its overlap with coding regions is potentially tolerable.”

l. 366: "Note that RNA1 and RNA2 are registered according to their shared sequence" – Unclear what "registered" means here.

We replaced “registered” with “aligned”.

l. 416-421: Specifying when reference is made to cytosolic vs. mitochondrial monosomes and polysomes would make this section and the related parts of the 'Discussion' clearer. Also, the authors clearly state here that there might be technical reasons for what they observed, but ignore this possibility in the 'Discussion' and assume that they did indeed separate polysomes.

We added data to this section and rewrote it completely. We speak almost exclusively about mitochondrial monosomes and polysomes and, in the revised version, were careful to state when cytosolic ribosomes were meant.

Discussion

l. 444: "the reshuffling appears limited to specific block borders and is not random" – How many biological replicates of nanopore sequencing were performed? Did the authors test other T. gondii strains? What about other apicomplexan species? Unless this has been done, there is no demonstration that the block order and block-joining frequencies documented here are (more or less) constant and that block order is under some kind of purifying selection. Hence, the conclusion that the block borders are not random is debatable. Arguably, it is not random in this particular experiment, but neither is it limited to specific blocks because most combinations have been detected (even if at low frequency; Figure S1).

After revisiting the analysis and excluding falsely identified block combinations separated by more than 10 un-annotated nucleotides, we arrive at a much clearer distribution of block combinations. This distribution is evidently non-random, confined to specific blocks. As depicted in Figure 2—figure supplement 1A, only a minority of block combinations appears more than 50 times in our dataset. Supplementary file 4, encompassing both infrequently occurring and potentially falsely identified combinations, further substantiates the notion that block combinations are not randomly distributed.
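The filtering step described here can be illustrated with a minimal sketch. It is not the authors' pipeline: the `BlockHit` record, the function name, and the toy reads are hypothetical, and block orientation is ignored for brevity. Only the threshold of 10 un-annotated nucleotides comes from the response above.

```python
from collections import Counter
from typing import NamedTuple


class BlockHit(NamedTuple):
    """Hypothetical record for one annotated block on a read."""
    name: str   # block letter, e.g. "J"
    start: int  # 0-based start on the read
    end: int    # end on the read (exclusive)


MAX_GAP = 10  # un-annotated nucleotides tolerated between neighbouring blocks


def count_block_pairs(reads, max_gap=MAX_GAP):
    """Count adjacent block pairs, skipping pairs separated by more than max_gap nt."""
    counts = Counter()
    for hits in reads:
        ordered = sorted(hits, key=lambda h: h.start)
        for left, right in zip(ordered, ordered[1:]):
            gap = right.start - left.end  # negative if the annotations overlap
            if gap <= max_gap:
                counts[f"{left.name}-{right.name}"] += 1
    return counts


# Toy input (assumption): one read with a 3-nt gap, one with a 60-nt gap.
reads = [
    [BlockHit("J", 0, 100), BlockHit("B", 103, 250)],
    [BlockHit("J", 0, 100), BlockHit("E", 160, 300)],
]
pair_counts = count_block_pairs(reads)
print(pair_counts)
```

In this toy example only the J-B pair is counted; the J-E pair is dropped because 60 un-annotated nucleotides separate the two blocks.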

We also would like to stress that the two other attempts at sequencing the mitogenome of T. gondii found the same block combinations, although the composition of individual reads differed (Namasivayam et al. 2021; Berna et al. 2021). Thus, block combinations are not random.

A straightforward thought experiment underscores this point: in the case of random combinations, all 24 blocks would theoretically pair head-to-head, head-to-tail, and tail-to-tail with every other block, resulting in (24*2)² = 2304 unique two-block combinations. However, even without filtering, we identified only 84 different combinations (see Supplementary file 4), with 32 of these appearing only once and 52 appearing less than 50 times in a total of 284,679 pairs.
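The arithmetic of this thought experiment can be checked with a short sketch. The block count (24), the two orientations, and the 84 observed combinations are taken from the paragraph above; the function name is hypothetical, and the calculation simply follows the counting scheme used in the response rather than a formal statistical test.

```python
N_BLOCKS = 24       # sequence blocks, as stated in the response
ORIENTATIONS = 2    # each block can occur in either orientation
OBSERVED = 84       # distinct combinations found without filtering


def possible_pairs(n_blocks: int = N_BLOCKS, orientations: int = ORIENTATIONS) -> int:
    """Two-block combinations if every oriented block could pair with every other."""
    oriented_blocks = n_blocks * orientations
    return oriented_blocks ** 2


total = possible_pairs()
fraction_observed = OBSERVED / total

print(total)                       # 2304
print(f"{fraction_observed:.1%}")  # 3.6%
```

Under this counting scheme, the observed combinations cover under 4% of the theoretical space, which is the point the thought experiment makes.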

We adjusted statements in the manuscript based on our revised analysis:

Line 202-204:

“For example, the most frequent block combination is J-B, which occurs 19,622 times in our reads. In total we identified 84 combinations of which 52 occur less than fifty times and make up less than 0.06% of the total number of combinations found (Supplementary file 4).”

Furthermore, we found and corrected a mistake in the reported number of high-frequency block combinations.

Line 213-214:

“Using the 32 high-frequency block combinations we found, we generated a map centered on the protein-coding genes (Figure 2B).”

l. 450: "One intriguing finding is the obligate linkage of coding sequences" – Presuming this sentence is about protein-coding sequences, this should be reformulated because it mis-represents the actual data. Figure 2 clearly shows that protein-coding blocks are often linked to rRNA-coding blocks.

We have deleted this entire section given the comments on the low number of coding regions and thus the difficulty to draw conclusions from the few head-to-head and head-to-tail linkages of coding regions in the genome.

l. 454: "balancing the expression of coxI and coxIII" – Not clear where this information comes from, as it is not from the cited papers.

We deleted this section in response to other comments.

l. 460-461: "Our small RNA sequencing results revealed another potential advantage of the block organization of the T. gondii mitochondrial genome" – This should be reformulated. Clearly, the discovery of the 15 sRNAs was facilitated by the recognition of block order, but the presented argument is a bit confusing: how does the organization into blocks provide an "advantage" and what kind of advantage do the authors mean? (An evolutionary advantage or an advantage related to gene expression regulation or an advantage for their sRNA-Seq data mapping?)

The idea of an advantage was raised since the block organization can lead to the expression of two alternative RNAs and thus could increase the number of RNA species and novel evolutionary options. However, since this is difficult to understand in this introductory sentence and will be explained below, we deleted the phrase:

Line 582:

Our small RNA sequencing results revealed that 15 small RNAs span block borders.

l. 462-478: Multiple explanations are provided for the existence of sRNAs at block borders and what these sRNAs represent. While I agree that it is important to consider all options, even the more debatable ones, the authors seem to forget the simplest possibility: the identified unassigned sRNAs could well be rRNA pieces and them being encoded across block borders is not any more, nor any less surprising than the fact that protein-coding genes are encoded across (several) gene blocks.

There is a conceptual difference between a long mRNA spread across several blocks (and thus also block borders) and a biased distribution of short RNAs to block borders. The short RNAs could also have been positioned elsewhere – being short, they could all be located inside blocks, sense or antisense to coding regions or other non-coding RNAs. But almost half of all RNAs are positioned at block borders. This asks for an explanation, which we discuss here.

Regarding the idea that the sRNAs are rRNA fragments: yes, we agree and had not excluded that. We provide now experimental evidence that many sRNAs, including block-border sRNAs are part of ribosomes. We added the following sentence and discussed this in greater detail in the last chapter of the discussion.

Line 625-626:

“Regarding their function, many, if not all, of the sRNAs at block borders could be used in ribosomes as rRNA fragments, which is discussed in the next chapter.”

l. 485: "antisense RNA surveillance" – In contrast to the nuclei, the existence of a genuine antisense RNA "surveillance" mechanism in mitochondria is uncertain. Given what is known from mitochondria of other organisms (especially plants and kinetoplastids), it seems more likely that certain regions of sense and antisense transcripts are protected from exonucleases by RNA-binding proteins (RBPs such as PPR and related helix-turn-helix repeat proteins, e.g., Toxoplasma's homologs HPRs discovered in Plasmodium [Hillebrand,2018,NAR]), leading to RNAs that partially overlap, but are actually protected from base-pairing by these RBPs. This is not taken into account in any presented explanation of the phenomenon of antisense gene overlaps.

There is no accumulation of RNA antisense to rRNAs and mRNAs. On the other hand, it has been shown that long precursor RNAs exist for both strands of Apicomplexa mito genomes including T. gondii (work by Stuart Ralph lab). Hence, there must be degradation of antisense RNA.

We in general like the idea of RNAs as footprints, however, in Apicomplexa, most short RNAs are too long to be footprints for single HPRs or PPRs; also, there are only two PPRs described for T. gondii. As for HPRs, experimental links between HPR and sRNAs are missing so far. We would therefore rather not include this additional speculation in the discussion.

l. 490: "start codon. while also " – Typo: should be a comma, not a dot.

This was corrected.

l. 500: "discovery of block-border sRNAs highlights the complex regulatory mechanisms at play" – This should be reformulated: the claim is very speculative, since no hard data are provided on such regulatory mechanisms in the presented work.

We rephrased this:

Line 622-625:

“Overall, the discovery of block-border sRNAs highlights the complex biogenesis of sRNAs in T. gondii mitochondria and will be a starting point to understand the processing of sRNAs and their function in general.”

l. 504: "sRNAs are incorporated into polysome-size structures" – In light of the concerns raised in the preceding section, this should be reformulated.

We revised this part of the discussion completely since we provide additional data showing that sRNAs are in polysomes.

l. 539-540: The closing sentence should be reformulated. The mitogenome organization in blocks per se does not "allow" the sequences to function as both mRNA and rRNA. Rather, it seems to be a combination of 1) the compactness of the genome that seems to lead to the re-use of certain segments in both mRNA and rRNA or in two distinct rRNAs, and 2) the apparently dynamic nature of the genome (due to recombination among gene blocks) that brings together certain combinations of gene blocks.

Although we do not see why our phrasing would not be correctly representing our findings, we are happy to rephrase according to the reviewer’s suggestion:

Line 684-686:

“T. gondii's dynamic block-based genome organization leads to usage of mitochondrial sequences in mRNA as well as rRNA contexts, potentially linking rRNA and mRNA expression regulation.”

Methods

l. 607: Only agarose gel separation is mentioned, but most experiments shown are of denaturing PAGE separations (which is actually mentioned in several figure legends).

Initially, the description of Denaturing PAGE was included within the chapter on agarose gel electrophoresis. We now created a separate chapter titled “Denaturing Urea-PAGE and sRNA gel blot” and added some details on the method.

Corrected version of the agarose gel electrophoresis chapter:

Line 792-794:

“RNA samples were diluted in denaturing loading buffer (Deionized formamide 62.5% (v/v), formaldehyde 1.14 M, bromophenol blue 200 μg/mL, xylene cyanole 200 μg/mL, MOPS-EDTA-sodium acetate) and separated on a 1% agarose gel containing 1.2% formaldehyde.”

added under “Denaturing Urea-PAGE and sRNA gel blot”:

Line 797-799:

“RNA was separated by denaturing Urea-PAGE (10% or 12% polyacrylamide gel for total RNA/organelle enriched RNA and RNA extracted from sucrose density gradient fractions, respectively).”

l. 636: "Paste your Materials and methods section here." – To be removed.

We have removed the sentence.

l. 662: "NUMTS" – This should be "NUMTs"; the same typo occurs at multiple places in the 'Methods' section.

We have corrected the typo in all places.

l. 704: "Homology search for novel transcript annotation" – Somewhat confusing title; it is possible to guess what the authors likely mean, but it is unclear.

We changed the chapter name to “Sequence similarity search”.

l. 715: "New block annotations can be found in GenBank." – 1) The whole community would very likely appreciate if the GenBank entries were properly annotated (i.e., genes added), not just showed sequences as is currently the case for all Namasivayam,2021, Genome Res entries (not sure about the authors' own entries because they were inaccessible). If impossible to update the entries of the Namasivayam,2021, Genome Res study, then just submitting anew properly annotated GenBank entries would be appropriate.

We are not allowed by GenBank to change the published entries, nor do we have information to add beyond what was published by the Kissinger lab. The exceptions are blocks that we redefined due to our sequencing efforts (GenBank, OR086910 – OR086916). Here, we added annotations on the sRNAs.

2) It was not possible to properly assess some of the claims in the manuscript because access to the files was not provided to reviewers, nor have the newly submitted GenBank entries been made public by the authors.

They are fully publicly available. Same for the GenBank entries.

Figures

Figure 1B – The load of total proteins into each well is unclear. Ponceau stain does not show identical loads, so it is unclear what the reader should take as the reference.

As written in the legend, 5% volume of each fraction was analyzed. The Ponceau is thus not a mass control, but shows that with increasing purification, less total protein remains, while the Western shows that mitochondrial GFP remains / is enriched.

Figure 1D – The phrasing "fragments found in the pellet fractions of the protocol" is a bit awkward. The fragments are in the pellet fractions after plasma membrane permeabilization and benzonase incubation, not in the "fractions of the protocol".

We changed the phrasing to “Both fragments are found in pellet fractions after digitonin treatment, where they are protected from benzonase digestion”.

Figure 2 – The chosen hues of red and green (for coxI and coxIII) are of such similar intensity that they are virtually indistinguishable to ~2% of the readers. A colourblind-friendly palette would be very much appreciated. For guidelines, see for example: https://www.nature.com/articles/nmeth.1618.

We thank the reviewer for the advice. We have adjusted the color palette in the schematics to make it more colorblind-friendly.

Figure 3 – The use of lowercase letters to indicate the probes (instead of the full probe names) is a nice idea and simplifies the reading experience, but the use of the same letter 'a' in different figures for different probes is confusing. Labeling each probe with a unique ID/letter and indicating this ID in the Table S6 (e.g., by adding an additional column) would work much better.

We changed the designation to unique probe IDs which have been integrated as a column in Supplementary file 6 (now Supplementary file 9).

Figure 4A – The wiggle lines for rRNAs are coloured in purple shades, which contrast with the grey colour that is assigned to them in the Figure 2. Keeping a consistent colour palette across figures would be preferable.

We now keep the color code consistent and show RNAs in Figure 2 with colored wiggle lines.

Figure 4C – If the E. coli sequence was on the outer lines, the Toxoplasma sequences could be closer to one another, which would make it easier for the reader to understand the alignment.

We changed this.

Figure 5 – Purple shades for rRNA are somewhat difficult to discern from the blue cob. Also, the 'reference' wiggles would work better if demarcated as a key because this would make it visually clearer that they are shared by the A and B panels.

We’ve revised the color scheme and also designated the legend as such.

Supplementary Information

Figure S1 – An explanation what the A and B panels show is missing.

Thanks, we have added the missing information. We now show the revised heatmap (including directionality) in Figure 2—figure supplement 1 instead of 2B and added a supplemental table (Supplementary file 4) representing the absolute numbers of occurring block combinations.

Figure S5 – It is difficult to appreciate the extent of overlaps with protein-coding sequences if these are missing from the figure (unlike in Figure 5).

The information regarding overlaps is available in various other figures; we aim to avoid further complicating this overview figure. The intent is not to display the overlaps of sRNAs.

Table S4 – Nuclear genome accession number is missing. Add "mitochondrial" to the label of the column "sequence blocks".

We added the missing information.

Table S5 – 1) It is unclear what the 'rRNA homology' refers to. (It does not seem to be the nomenclature used by Feagin et al.,2012, PLoS One.) 2) An extension of the table (or perhaps a separate table) with the cumulative size of mtLSU and mtSSU rRNA pieces, as well as unassigned sRNAs, would be useful. 3) It should also be stated somewhere if homologs of any of the rRNA pieces known from Plasmodium are missing in Toxoplasma. (If so, they could be among the newly identified short RNAs.)

1) We changed the column title to “assigned rRNA region” and included a footnote to clarify what we refer to.
2) We added a supplementary table (Supplementary file 8) that compares fragment numbers and cumulative sizes of SSU rRNA, LSU rRNA and unassigned sRNAs between P. falciparum and T. gondii.
3) We included a sentence in the manuscript referring to P. falciparum rRNA pieces that were not found in T. gondii.

Lines 303-304:

“Only for the P. falciparum LSU fragment LSUC and the SSU fragment RNA12, we were unable to identify corresponding homologs in T. gondii.”

Significance

Speaking from personal experience, devising a protocol for such a substantial mitochondrial enrichment, as the study presents, is a great technical achievement, which cannot be overstated, especially for a protist or any somewhat unconventional model organism. The mitoribosomal community will certainly take notice of the improved catalogue of mitochondrial rRNA pieces, while the discovery of overlapping protein-coding and rRNA genes will be of interest to those working in the field of mitochondrial evolutionary biology. The study already provides a significant upgrade from the previous attempts to understand the nature of the mitochondrial genome in Toxoplasma (and in Apicomplexa in general), and is well positioned to become a source of inspiration for future studies in the field. However, being at the crossroads of genomics, evolution, and molecular biology, it has certain limitations in its current form, mainly because the evolutionary and molecular biology aspects would benefit from further development (see 'Major concerns'). The text is generally well written and the accompanying figures are well designed, but clarifications, broader context, and less speculative interpretation would be welcome (as detailed mostly in 'Minor concerns'). To justify publication in a journal with a broad readership, the authors should provide additional experimental evidence to strengthen their case and generalize their findings.

Associated Data

Data Citations

Tetzlaff S, Schmitz-Linneweber C. 2023. Characterization of short RNAs, in particular rRNAs from mitochondria of Toxoplasma gondii. NCBI BioProject. PRJNA978626
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block F genomic sequence; mitochondrial. NCBI Nucleotide. OR086910
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block K genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086911
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block M genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086912
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block H genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086913
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block C genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086914
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block Q genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086915
Schmitz-Linneweber C, Tetzlaff S. 2024. Toxoplasma gondii element block B genomic sequence; mitochondrial, revised. NCBI Nucleotide. OR086916

Supplementary Materials

Figure 1—source data 1. Raw gel and blot images.

Uncropped blots and gels accompanied by images indicating the areas shown in Figure 1B–D with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name. Additionally, for immunoblots light image overlays depicting the membrane outline are provided.

elife-95407-fig1-data1.zip (38.2MB, zip)

Figure 3—source data 1. Raw blot images.

Uncropped blots accompanied by images indicating the areas shown in Figure 3B with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig3-data1.zip (10.7MB, zip)

Figure 3—figure supplement 1—source data 1. Raw blot images.

Uncropped blots accompanied by images indicating the areas shown in Figure 3—figure supplement 1A–B with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig3-figsupp1-data1.zip (39.2MB, zip)

Figure 3—figure supplement 2—source data 1. Raw blot images.

Uncropped blots accompanied by images indicating the areas shown in Figure 3—figure supplement 2B with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig3-figsupp2-data1.zip (33.7MB, zip)

Figure 4—source data 1. Raw blot images.

Uncropped blot accompanied by an image indicating the areas shown in Figure 4B with a red rectangle. In addition, the raw scan image is provided with the position of the blot of interest being indicated in the file name.

elife-95407-fig4-data1.zip (7.2MB, zip)

Figure 5—source data 1. Raw gel and blot images.

Uncropped blots and gels accompanied by images indicating the areas shown in Figure 5C and D and -G with a red rectangle. In addition, raw scan images are provided.

elife-95407-fig5-data1.zip (4.9MB, zip)

Figure 6—source data 1. Raw blot images.

Uncropped blots accompanied by images indicating the areas shown in Figure 6A–D with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig6-data1.zip (65.9MB, zip)

Figure 6—source data 2. Raw blot images.

Uncropped blots accompanied by images indicating the areas shown in Figure 6E–H with a red rectangle. In addition, raw scan images are provided. If the scan contains multiple blots, the position of the blot of interest is indicated in the file name.

elife-95407-fig6-data2.zip (52MB, zip)

Figure 8—source data 1. Raw blot images.

Uncropped immunoblots accompanied by images indicating the areas shown in Figure 8B with a red rectangle. In addition, raw scan images and light image overlays depicting the membrane outline are provided.

elife-95407-fig8-data1.zip (5.2MB, zip)

Figure 8—figure supplement 1—source data 1. Raw gel and blot images.

Uncropped blots and gels accompanied by images indicating the areas shown in Figure 8—figure supplement 1B–C with a red rectangle. In addition, raw scan images are provided. Additionally, for immunoblots light image overlays depicting the membrane outline are provided.

elife-95407-fig8-figsupp1-data1.zip (16.2MB, zip)