eLife. 2018 Sep 18;7:e37927. doi: 10.7554/eLife.37927

Six domesticated PiggyBac transposases together carry out programmed DNA elimination in Paramecium

Julien Bischerour 1, Simran Bhullar 2,3, Cyril Denby Wilkes 1, Vinciane Régnier 1,4, Nathalie Mathy 1, Emeline Dubois 1, Aditi Singh 2, Estienne Swart 2, Olivier Arnaiz 1, Linda Sperling 1, Mariusz Nowacki 2, Mireille Bétermier 1
PMCID: PMC6143343  PMID: 30223944

Abstract

The domestication of transposable elements has repeatedly occurred during evolution and domesticated transposases have often been implicated in programmed genome rearrangements, as remarkably illustrated in ciliates. In Paramecium, PiggyMac (Pgm), a domesticated PiggyBac transposase, carries out developmentally programmed DNA elimination, including the precise excision of tens of thousands of gene-interrupting germline Internal Eliminated Sequences (IESs). Here, we report the discovery of five groups of distant Pgm-like proteins (PgmLs), all able to interact with Pgm and essential for its nuclear localization and IES excision genome-wide. Unlike Pgm, PgmLs lack a conserved catalytic site, suggesting that they rather have an architectural function within a multi-component excision complex embedding Pgm. PgmL depletion can increase erroneous targeting of residual Pgm-mediated DNA cleavage, indicating that PgmLs contribute to accurately position the complex on IES ends. DNA rearrangements in Paramecium constitute a rare example of a biological process jointly managed by six distinct domesticated transposases.

Research organism: Other

Introduction

The mobility of DNA transposons is ensured by their self-encoded transposase (reviewed in Hickman and Dyda, 2015). The most commonly studied transposases harbor an RNase H-related catalytic domain including three conserved acidic residues DD(D/E) and have been grouped into distinct superfamilies (Curcio and Derbyshire, 2003; Wicker et al., 2007; Hickman et al., 2010). During evolution, exaptation of transposon-borne genes has sometimes given rise to novel cellular functions through a process called domestication (Volff, 2006; Jangam et al., 2017). Several instances of domesticated DD(D/E) transposases have been reported, some of which still exhibit at least partial catalytic activity. The Transib-originating Rag1 protein catalyzes V(D)J recombination of vertebrate immunoglobulin genes (Kapitonov and Jurka, 2005; Huang et al., 2016); SETMAR, a partially active domesticated mariner transposase, is involved in DNA double-strand break repair in primates (Liu et al., 2007; Kim et al., 2014); α3, domesticated from a hAT transposon, and Kat1, domesticated from a Mutator-like element, carry out mating type switching in the yeast Kluyveromyces lactis (Barsoum et al., 2010; Rajaei et al., 2014). CENP-B, related to mariner elements, serves as a centromere-binding factor, but its ancestral catalytic domain is no longer required for its function (Mateo and González, 2014).

Transposases from the piggyBac family have repeatedly been domesticated in eukaryotes (Bouallègue et al., 2017). In mammals, five PGBD (piggyBac-derived) genes have been identified, but their cellular function has so far remained elusive (Sarkar et al., 2003). The most ancient, PGBD5 (Pavelitz et al., 2013), encodes a protein with a highly divergent catalytic domain that is active for DNA cleavage and transposition (Henssen et al., 2015) and promotes DNA rearrangements in human cancers (Henssen et al., 2017). PGBD1 and 2 are conserved in mammals, but their encoded proteins have lost the DDD catalytic triad characteristic of active PiggyBac (PB) transposases and their cellular function is unknown. PGBD3 and 4 are restricted to primates. Pgbd3, expressed as a fusion with the Cockayne Syndrome CSB transcription factor, does not carry an intact catalytic site, but has retained specific DNA binding activity to piggyBac-related genomic sequences, which may expand the gene network that is transcriptionally regulated by CSB-Pgbd3 (Gray et al., 2012; Weiner and Gray, 2013). In contrast, Pgbd4 harbors a conserved DDD triad, but its cellular function is unknown. Remarkably, catalytically active domesticated PB transposases play an essential role during developmentally programmed genome rearrangements in the ciliates Paramecium and Tetrahymena (Baudry et al., 2009; Cheng et al., 2010; Vogt and Mochizuki, 2013; Cheng et al., 2016; Dubois et al., 2017).

Ciliates are unicellular eukaryotes characterized by their nuclear dimorphism, with two types of nuclei coexisting in the same cytoplasm (Prescott, 1994). The diploid germline micronucleus (MIC), transcriptionally inactive during vegetative growth, undergoes meiosis and transmits the parental genetic information to the zygotic nucleus during sexual reproduction. The highly polyploid somatic macronucleus (MAC), streamlined for gene expression and essential for cell growth, is fragmented and destroyed at each sexual cycle and a new MAC develops from a mitotic copy of the zygotic nucleus. During MAC development, massive genome amplification takes place and, following a few endoduplication rounds, ~30% of germline sequences are removed from the somatic genome in P. tetraurelia (Arnaiz et al., 2012) and T. thermophila (Hamilton et al., 2016). In both species, DNA elimination requires the introduction of programmed DNA double-strand breaks (DSB) at the boundaries of eliminated sequences (Saveliev and Cox, 1996; Gratias and Bétermier, 2003). Two modes of sexual reproduction have been described in Paramecium: conjugation and autogamy, a self-fertilization process (reviewed in Betermier and Duharcourt, 2014). In both processes, programmed DNA elimination targets two types of germline sequences. Repeated sequences, for example transposable elements (TEs) or minisatellites, are removed in association with chromosome fragmentation. In addition, the precise excision of 45,000 single-copy non-coding Internal Eliminated Sequences (IESs), which interrupt 47% of all genes in the germline genome, allows proper assembly of functional open reading frames in the somatic genome, essential for the survival of sexual progeny. Paramecium IESs are short (93% shorter than 150 bp), non-coding sequences, whose size follows a sinusoid-shaped distribution with a periodicity equal to the helical pitch of double-stranded B DNA (Arnaiz et al., 2012). IESs are flanked with a conserved TA dinucleotide at each end; a single TA remains at the excision site. IES ends define a loosely conserved 8 bp consensus sequence (5’-TAYAGYNR-3’), of unclear mechanistic significance. Indeed, how the excision machinery accurately targets IES ends remains an open question.
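To make the 8 bp consensus concrete, the following minimal Python sketch expands the IUPAC ambiguity codes of 5’-TAYAGYNR-3’ (Y = C or T, R = A or G, N = any base) into a regular expression and tests whether the first eight bases of a candidate IES end fit it. The function names and the test sequences are hypothetical illustrations, not part of the study. Because the consensus is only loosely conserved, many genuine IES ends would be expected to fail such a strict match.

```python
import re

# Standard IUPAC nucleotide codes needed for the consensus
# (Y = pyrimidine, R = purine, N = any base).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]", "R": "[AG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


# Consensus quoted in the text, read 5' to 3' starting at the flanking TA.
IES_END = consensus_to_regex("TAYAGYNR")


def matches_consensus(ies_end):
    """Return True if the first 8 bases of an IES end fit TAYAGYNR exactly."""
    return IES_END.fullmatch(ies_end.upper()[:8]) is not None
```

For example, the hypothetical end `TATAGCAA` fits the consensus, whereas `TAAAGCAA` does not, because its third base is a purine.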

IES excision is a precise ‘cut-and-close’ mechanism that starts with the introduction of DNA DSBs centered on the flanking TAs (Gratias and Bétermier, 2003). PiggyMac (Pgm), a domesticated PB transposase with an intact DDD catalytic motif, is responsible for DNA cleavage (Baudry et al., 2009; Dubois et al., 2017) and the resulting DSBs are repaired through the classical non-homologous end joining pathway (C-NHEJ) (Kapusta et al., 2011; Allen et al., 2017). Tight coupling of DSB introduction and repair is thought to be ensured by the assembly of a Pgm/Ku complex required for DNA cleavage (Marmignon et al., 2014). Here, we report the discovery of five groups of paralogous Paramecium domesticated PB transposases, designated as Pgm-like(s) (PgmL), that appear to be novel essential components of the Pgm-associated complex. Using a combination of RNAi-mediated knockdowns (KDs), immunofluorescence microscopy and whole genome sequencing, we show that each PgmL group is essential for Pgm nuclear localization during the sexual cycle and efficient genome-wide IES excision. In some KDs, residual Pgm complexes lacking one PgmL partner are still detected in the developing MAC and retain partial activity. However, they tend to incorrectly target IES excision boundaries, resulting in excision errors. Our data, as a whole, indicate that six groups of domesticated PB transposases, including one catalytically active subunit (Pgm) and five additional partners (PgmL), act together to carry out IES excision. We discuss a model, in which PgmLs associate with Pgm, favor and stabilize its nuclear localization and ensure the precise positioning of DNA cleavage.

Results

Novel domesticated PB transposase genes in the P. tetraurelia genome

Two structural domains can be predicted in Pgm using the Pfam protein family database (Finn et al., 2016) (http://pfam.xfam.org/, Figure 1A). The first domain (PF13843 or DDE_Tnp_1_7) encompasses the RNase H fold-related catalytic domain found in DD(D/E) transposases. The second domain (DDE_Tnp_1-like zinc-ribbon) corresponds to a cysteine-rich domain (CRD), essential for Pgm activity (Dubois et al., 2017). Using a Hidden Markov Model (HMM) search, we discovered that nine putative Pgm-related proteins, hereafter designated as PiggyMac-like (PgmL) proteins, are encoded by the P. tetraurelia somatic genome (Supplementary file 2). A Pfam domain search predicted that the DDE_Tnp_1_7 transposase domain is conserved in all PgmLs (Figure 1A). The DDE_Tnp_1-like zinc-ribbon domain was not systematically found using this approach, but alignment of protein sequences confirmed that all PgmLs carry a CRD (Figure 1A, Figure 1—figure supplement 1 and Supplementary file 3).

Figure 1. Novel domesticated PiggyBac transposases in Paramecium.

(A) Domain organization of the PiggyBac transposase (PB) from T. ni and of Paramecium PiggyBac-related proteins (Pgm and PgmLs). The Pfam domain DDE_Tnp_1_7 is shown as a bipartite orange domain, with the RNase H fold corresponding to its right part (conserved catalytic D residues are indicated by vertical bars). The DDE_Tnp_1-like zinc ribbon is in grey. Id: % of amino acid identity; sim: % of similarity. (B) Protein sequence alignment of the residues surrounding the three catalytic aspartic acids (DDD). Following secondary structure prediction, sequence alignments were adjusted manually, using the expected position of the three catalytic D residues in the first and fourth β strands and immediately downstream of the fourth α helix of the RNase H fold domain, respectively (Hickman et al., 2010). ‘?’ indicates that the expected α4 helix could not be predicted using the PSIPRED secondary structure prediction software.

Figure 1—figure supplement 1. MUSCLE alignment of the cysteine-rich domains of ciliate domesticated PB transposases and other PB transposases.

The analysis involved 62 amino acid sequences of PB transposases and domesticated transposases from ciliates and other species. Amino acid sequences encompassing the cysteine-rich domain of each protein were aligned using MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/). All sequences used for the alignment are displayed in Supplementary file 3. Complete accession numbers can be found in Figure 1—figure supplement 2.
Figure 1—figure supplement 2. Maximum Likelihood tree of ciliate domesticated PB transposases and other PB transposases.

The tree includes 69 amino acid sequences of PB transposases and domesticated PB transposases from ciliates and other species. To construct the tree, the alignment of all transposase core domains (Supplementary file 1) was edited to remove specific insertions restricted to one particular PgmL family. All accession numbers are indicated, except for the PiggyBat transposase from Myotis lucifugus (Mitra et al., 2013). PB and domesticated PB proteins from Tetrahymena thermophila are in blue. The evolutionary history was inferred by using the Maximum Likelihood method based on the JTT matrix-based model (Jones et al., 1992). The tree with the highest log likelihood (−35952.17) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches (bootstrap = 100). Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (five categories (+G, parameter = 2.8098)). There were a total of 541 positions in the final dataset. The tree is drawn to scale, with branch lengths measured as the number of substitutions per site. Evolutionary analyses were conducted in MEGA7 (Kumar et al., 2016).

PgmL-encoding genes form five groups of paralogs. PGML1 and PGML2 each are single genes, whereas PGML3 is composed of three genes: PGML3a and b are duplicates from the most recent whole genome duplication (WGD) that took place during evolution of the P. aurelia group of species (Aury et al., 2006), PGML3c arose from an earlier ‘intermediate’ WGD. Similarly, PGML4a and b on the one hand and PGML5a and b on the other are paralogs from the most recent WGD. Genes from distinct PGML groups do not share nucleotide sequence homology with each other and their encoded proteins are very divergent in sequence and domain organization (Figure 1A and Figure 1—figure supplement 2). Within each group, however, WGD paralogs encode highly similar proteins. Analysis of published genome assemblies confirmed the conservation of at least one representative of each PGML group in other P. aurelia species (McGrath et al., 2014b) (Supplementary file 2, Figure 1—figure supplement 2). Evidence was also obtained that all PGML groups are present in a more distant species, P. caudatum (McGrath et al., 2014a) (Supplementary file 2).

The predicted PgmL proteins have different lengths, ranging from 578 to 1085 residues (Figure 1A). The domain organization of PgmL1 and PgmL2 is close to that of canonical PB transposases, whereas other PgmLs carry additional domains: PgmL3, PgmL4 and PgmL5 have a carboxy-terminal extension predicted to be rich in coiled-coil; an amino-terminal extension with no homology to any known structure is found in PgmL4 and PgmL5. While the presence and position of a ‘DDD’ motif of three aspartic acids in the conserved RNase H domain is essential for the catalytic activity of PB transposases (Mitra et al., 2008), we did not find a complete DDD triad in any PgmL using a combination of sequence alignment and secondary structure prediction (Figure 1B, Supplementary file 1 and 4). PgmL1, PgmL2, PgmL3 and PgmL5 do not carry any conserved D residue, while only two out of three are found in PgmL4 at the expected first and third positions of the triad (D619/617 and D785/783). Given that a single mutation in the catalytic triad is sufficient to completely abolish in vitro activity of the PB transposase from Trichoplusia ni (Mitra et al., 2008), it is unlikely that PgmLs are still catalytically active.

PGMLs are expressed during autogamy and localize in the new developing MAC

We analyzed previously published high-throughput sequencing data obtained from polyadenylated RNAs extracted during three standard autogamy time-courses of P. tetraurelia (Arnaiz et al., 2017), and found that PGMLs are all specifically induced and co-expressed with PGM during new MAC development, when programmed genome rearrangements take place (Figure 2A). We observe maximal expression levels of all genes around 5 to 11 hr (T5 to T11) following the T0 time-point that corresponds to the stage when 50% of cells in the population have fragmented their old MAC.
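The per-time-point averaging behind these expression profiles can be sketched in a few lines of Python. The grouping of replicate samples follows the sample names listed in the legend of Figure 2 (for example T5.1 and T5.2 for T5). The function name and the expression values in the usage note are hypothetical, and the sketch assumes the input values are already normalized.

```python
from statistics import mean

# Replicate samples contributing to each time-point, as listed in the
# legend of Figure 2 (V: vegetative, S: starved, T0 to T20: hours).
TIMEPOINTS = {
    "V": ["V1.2"],
    "S": ["S1.1", "S1.2"],
    "T0": ["T0.1", "T0.2"],
    "T5": ["T5.1", "T5.2"],
    "T11": ["T11.1", "T11.2"],
    "T20": ["T20.1", "T20.2"],
}


def mean_by_timepoint(expression):
    """Average normalized expression of one gene over replicate samples.

    `expression` maps a sample name (e.g. "T5.1") to a normalized value.
    Returns a dict mapping each time-point to the mean of its replicates.
    """
    return {
        timepoint: mean(expression[sample] for sample in samples)
        for timepoint, samples in TIMEPOINTS.items()
    }
```

Calling `mean_by_timepoint` on a dictionary of made-up values for one gene returns one mean per stage, which is the quantity plotted for each gene in an expression profile of this kind.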

Figure 2. Expression and nuclear localization of PgmLs during autogamy.

(A) Normalized RNA-seq data were extracted from (Arnaiz et al., 2017) and used to calculate mean expression levels for each time-point. V: vegetative cells (V1.2); S: starved cells with meiotic micronuclei (S1.1 and S1.2); T0: T0.1 and T0.2; T5: T5.1 and T5.2; T11: T11.1 and T11.2; T20: T20.1 and T20.2. All time-points are in hours. (B) Immunofluorescence staining of PgmL proteins in autogamous cells. White arrowheads point to developing new MACs. Scale bar: 10 µm.

Figure 2—figure supplement 1. Validation of the specificity of antibodies directed against Pgm, PgmL1, PgmL5a, and the Flag peptide by immunofluorescence labelling of fixed cells.

(A) Immunostaining of Pgm in early autogamous cells subjected to control (ND7) or PGM RNAi. (B) Immunostaining of PgmL1 in early autogamous cells subjected to control (L4440) or PGML1 RNAi. (C) Immunostaining of PgmL5a and b in early autogamous cells subjected to control (L4440) or PGML5a and b RNAi. (D) Immunodetection of PgmL3a-FLAG in early autogamous cells injected with a PGML3a-FLAG fusion transgene (top panel). No FLAG signal was detected using the α-FLAG in non-injected cells (bottom panel). Developing MACs are indicated by white arrowheads. Scale bar is 10 μm.
Figure 2—figure supplement 2. Localization of GFP and RFP fusions in developing new MACs.

Plasmids expressing GFP-PgmL1, GFP-PgmL2 and GFP-PgmL5b are derivatives of pUC19 and carry the EGFP coding sequence (Singh et al., 2014) fused to the 5’ end of each PGML coding sequence. They were linearized with ScaI (PGML1 and 2) or NdeI (PGML5b) prior to microinjection. The plasmid expressing PgmL4a-RFP is a derivative of pBL49g (Baudry et al., 2009), in which the PGM gene and its 5’ and 3’ UTRs were first replaced by the PGML4a gene and its upstream and downstream regulatory sequences: in this construct, a codon-optimized sequence encoding a TSSGGGSG linker followed by the mRFPmars protein was inserted between the last codon of PGML4a and the TGA stop codon. It was linearized with NgoMIV prior to microinjection. All plasmids carry the 5’ and 3’ transcription signals of the appropriate PGML gene (sequences available upon request). Following microinjection into the MAC of vegetative cells, transgenes are capped by addition of telomeric repeats at their ends (concatemers may form prior to telomere addition): they may then integrate into the somatic genome (through a mechanism that remains to be studied) or be maintained throughout vegetative growth as autonomously replicating mini-chromosomes (Gilley et al., 1988; Bourgain and Katinka, 1991; Katinka and Bourgain, 1992). Replication origins have not been characterized in Paramecium, but any injected DNA (including bacterial plasmids) may be replicated. Transgenes persist in old MAC fragments throughout autogamy and continue to be expressed if controlled by proper transcription signals, but are lost in the next sexual generation, after complete destruction of the old MAC. Transformed cells were grown for ~20 vegetative divisions and starved to induce autogamy. (A) Autogamous cells expressing GFP-PgmL1 or GFP-PgmL2. Cells were fixed for 10 min in 2% paraformaldehyde 1X PHEM, and stained with DAPI for 10 min before mounting them in Citifluor AF2 antifading solution (Biovalley). Epifluorescence microscopy imaging was performed using a Nikon Eclipse microscope, with a 63x oil objective. VIS: whole cells visualized under brightfield illumination. (B) Autogamous cell expressing PgmL4a-RFP. Cells were fixed using a previously published procedure (Marmignon et al., 2014) and epifluorescence microscopy imaging was performed as described (Dubois et al., 2017). DIC: whole cells visualized using differential interference contrast microscopy (Nomarski). (C) Autogamous cell expressing GFP-PgmL5b. Cells were fixed and stained as mentioned in (Ignarski et al., 2014) for the immunostaining of histone modifications. Confocal imaging was carried out using an Olympus Fluoview Fv1000 confocal laser scanning microscope, with a 63X zoom four objective. Image analysis was performed using ImageJ 1.5 r. In all panels, the white bar represents 10 µm and arrowheads indicate the position of developing new MACs.

The development-specific expression of PGMLs suggests that their encoded proteins may be implicated in DNA rearrangements during MAC development. To confirm protein production and follow the cellular localization of PgmLs during autogamy, we raised specific antibodies against PgmL1 and PgmL5a carboxy-terminal peptides (Figure 2—figure supplement 1). For PgmL2, PgmL3a and PgmL4a, transgenes expressing carboxy-terminal 3X Flag-tagged fusions under the control of their respective endogenous transcription signals were microinjected into the MAC of vegetative cells. Non-injected cells and transformants were grown then starved to induce autogamy. Immunofluorescence microscopy allowed the detection of a specific signal in the developing MAC for all PgmLs 5 to 10 hr after the start of autogamy (Figure 2B). This stage corresponds to the time when Pgm appears in the new MACs (Dubois et al., 2017) and DNA cleavage takes place at IES ends (Gratias and Bétermier, 2003; Gratias et al., 2008; Baudry et al., 2009). Specific localization in the developing new MAC was confirmed using N-terminal GFP fusions for PgmL1, PgmL2 and PgmL5b, and a C-terminal RFP fusion for PgmL4a (Figure 2—figure supplement 2).

Each PGML group is required for successful completion of autogamy

Functional analysis of PGML genes was performed by knocking down their expression using feeding-induced RNA interference (Galvani and Sperling, 2002). PGML1- or PGML2-knocked down cells were unable to produce viable post-autogamous progeny with a functional new MAC (Figure 3A, Supplementary file 6). For PGML3 genes, specific silencing of PGML3a yielded only 30% viable sexual progeny, whereas no significant phenotype was observed following individual PGML3b or PGML3c silencing. In contrast, no sexual progeny were recovered in a double PGML3a and b KD, suggesting that the two paralogs have a redundant function. The contribution of PGML3c - the least expressed gene in the group - is unclear since knocking down this gene alone or together with PGML3a or PGML3b did not give a post-autogamous phenotype. Thus, even though PGML3c has been conserved in all P. aurelia species (Supplementary file 2), we cannot confirm that it carries out any important function. Similar results were obtained for PGML4 and PGML5 groups: knocking down both paralogs gave a stronger phenotype than individual silencing of each gene. We conclude that each PGML group as a whole is essential for the completion of autogamy and paralogs from the most recent WGD play redundant roles in the process.

Figure 3. PgmLs are essential during autogamy and interact with Pgm in cell extracts.

(A) Effect of PGML KDs on the recovery of post-autogamous progeny with functional new MACs. For PGML1, PGML2 and PGML3c, only the results obtained using IF1 RNAi constructs (Figure 3—figure supplement 1) are shown. For groups of duplicated paralogs, individual gene KDs were performed using gene-specific IF2 constructs (Figure 3—figure supplement 1), while double KDs were performed using either IF2 or cross-hybridizing IF1 (*) constructs. Error bars represent standard deviations (n = 2 to 14, see Supplementary file 6) (B) Pull down of HA-PgmL fusions with MBP-Pgm using recombinant proteins expressed in insect cells. In each panel, the HA-tagged protein that was co-expressed with MBP or MBP-Pgm is indicated on the left and the band revealed on western blots (WB) using anti-HA antibodies is indicated on the right. The full-size blot with molecular weight marker is shown in Figure 3—figure supplement 2.

Figure 3—figure supplement 1. Map and coordinates of PGML feeding inserts.

Two feeding inserts (IF1 and IF2) were designed for each gene. Within multigenic PGML groups, gene-specific inserts are in red and inserts that are able to cross-silence other genes of the same family (* constructs in the main text) are in blue (the cross indicates that blue inserts can target the two paralogs simultaneously). The coordinates of each fragment refer to their 5’ and 3’ nucleotide positions, with +1 corresponding to the first base of the ATG start codon.
Figure 3—figure supplement 2. Co-precipitation of MBP-Pgm with HA-PgmL fusions.

(A) Control DRaCALA DNA binding assay (Differential Radial Capillary Action of Ligand Assay, see [Donaldson et al., 2012]). MBP-Pgm was purified from insect cells. Purified MBP-Pgm (400 nM final concentration) was mixed with a 32P-radiolabeled 80 bp double-strand DNA fragment carrying IES 51A1835 from the surface antigen A gene (25 nM final concentration) in 25 mM HEPES pH 7.5, 0.1 mg/ml BSA, 0.5 mM DTT and 50 mM NaCl-containing buffer. NaCl was then added to reach 50, 100, 250 or 500 mM final concentration and complexes were loaded in duplicate onto a Nitrocellulose Hybond ECL membrane. DNA binding is detected at 50 and 100 mM NaCl, while complexes are destabilized at higher salt concentrations (250 mM NaCl and above). (B) Pull down of each HA-PgmL fusion with MBP-Pgm. The full-size western blot is identical to the one used for Figure 3. Recombinant MBP-Pgm was co-expressed in insect cells with each indicated HA-tagged protein, cell extracts were prepared as indicated in Materials and methods and MBP-Pgm was pulled down using amylose beads. HA-tagged proteins were revealed using monoclonal α-HA antibodies (HA-7 from Sigma Aldrich). Expected sizes: 128 kDa (Pgm-HA), 69 kDa (HA-PgmL1), 76 kDa (HA-PgmL2), 88 kDa (HA-PgmL3a), 127 kDa (HA-PgmL4a) and 116 kDa (HA-PgmL5a). ‘Precision Plus Protein Standards’ from Bio-Rad were used as molecular weight markers. (C) Co-immunoprecipitation of MBP-Pgm with each HA-PgmL fusion. For each lane, 1.5 μg of monoclonal α-HA antibodies were incubated overnight at 4°C on a rotating wheel with 10 μL of protein A sepharose beads (GE Healthcare). The coated beads were incubated with the same cell extracts as in (B) for 2 hr at 4°C, then washed 3 times with 1 mL of lysis buffer A and re-suspended in Laemmli buffer (Laemmli, 1970) before electrophoresis in SDS-polyacrylamide gels. MBP-Pgm was detected using HRP-coupled α-MBP antibodies according to the manufacturer’s instructions (New England Biolabs).

PgmLs can form complexes with Pgm

The T. ni PB transposase forms a dimer in solution and probably works as a higher-order oligomer during assembly of the transposition complex (Jin et al., 2017). Previous work in Paramecium established that Pgm multimerizes in cell extracts and several Pgm subunits are required to complete IES excision in vivo (Dubois et al., 2017). Like Pgm, PgmLs are essential for MAC development, even though they lack a complete DDD catalytic triad. We therefore considered the possibility that PgmLs interact with Pgm.

N-terminal HA-tagged versions of PgmL1, PgmL2, PgmL3a, PgmL4a and PgmL5a were expressed in insect cells using synthetic genes cloned into baculovirus vectors (Supplementary file 5). Soluble protein extracts were prepared from cells co-expressing each individual HA-fused PgmL with MBP-Pgm or MBP alone, and the ability of each PgmL to interact with Pgm was tested in MBP pull-down assays. Because recombinant MBP-Pgm binds DNA at low salt concentration (Figure 3—figure supplement 2), all assays were performed under high-salt conditions (500 mM NaCl) to avoid potential DNA-mediated interactions between proteins. We found that each HA-tagged PgmL co-precipitates with MBP-Pgm, whereas little or no co-precipitation is observed with MBP alone (Figure 3B). We confirmed the interaction between Pgm and each PgmL by showing that Pgm co-immunoprecipitates with HA-PgmLs using α-HA antibodies (Figure 3—figure supplement 2). These experiments demonstrate that PgmLs can form complexes with Pgm.

PGML KDs compromise the correct nuclear localization of Pgm

The ability of each PgmL to interact with Pgm prompted us to check the fate of Pgm in PgmL-depleted cells. We knocked down each PGML group as a whole and the efficiency of each RNAi was attested by the absence of progeny with a functional new MAC (Supplementary file 7). For each KD, autogamous cells were collected and fixed between T5 and T10, which corresponds to the time-window when the total cellular amount of Pgm is maximal in control cells (Dubois et al., 2017). Endogenous Pgm was monitored using immunofluorescence (Figure 4A) and immunoblotting (Figure 4B). We confirmed on western blots that Pgm is undetectable in Pgm-depleted cells (Figure 4B). No change in total cellular Pgm amounts was observed in PgmL-depleted cells relative to controls, indicating that neither Pgm expression nor stability is affected in PGML KDs.

Figure 4. Expression and localization of Pgm in PGML KDs.

(A) Immunostaining of Pgm in early autogamous cells subjected to control (L4440) or PGML RNAi. Developing MACs are indicated by white arrowheads. Scale bar is 10 μm. (B) Western blot analysis of Pgm expression in early autogamous cells subjected to control (L4440: two independent controls A and B are shown), PGML or PGM RNAi. (C) Boxplot representation of the distribution of Pgm fluorescence intensities quantified in 30–55 μm² developing MACs subjected to the different RNAi shown in (A). This size window corresponds to the maximal Pgm signal in the control (Figure 4—figure supplement 1) and was chosen to quantify nuclear Pgm immunofluorescence for all KDs, since no significant size difference was noticed for developing MACs relative to the control. For each condition, 19 to 35 developing MACs were analyzed. (D) Independent set of experiments showing the quantification of Pgm fluorescence intensity in 30–55 μm² developing MACs following control (ND7) or PGM RNAi. 11 and 12 MACs were analyzed, respectively. In (C) and (D): *** for p<0.001 in a Mann-Whitney-Wilcoxon statistical test (see Materials and methods for details).

Figure 4—figure supplement 1. Plot of Pgm mean immunofluorescence intensity vs developing MAC size in cells subjected to control or PGML RNAi.

For each RNAi condition, 44 to 48 developing MACs were analyzed on slides carrying whole cells immunostained with α-Pgm antibodies (Figure 4). The Pgm mean fluorescence intensity was plotted against the size of the developing MACs. As previously reported, the intensity of the Pgm signal varies during MAC development (Dubois et al., 2017). In control experiments (here using L4440), maximal level of Pgm expression was observed for MAC sizes ranging between 30 and 55 μm²: this size window is highlighted in grey.
Figure 4—figure supplement 2. Immunolocalization of Pgm without Triton extraction in PGML knockdowns.

(A) Immunostaining of Pgm in cells subjected to control (L4440), PGML or PGM RNAi. Cells were fixed for 10 min in PHEM + 2% formaldehyde, permeabilized for 15 min in PHEM + 1% Triton before TBST + 3% BSA washes. The following steps of the immunostaining were performed as described in the supplementary Materials and methods. Developing MACs are indicated by white arrowheads. The last panel (Vegetative control) shows the background immunostaining of a vegetative cell in the control RNAi culture. Scale bar is 10 μm. (B) Plot of Pgm mean fluorescence intensity vs size in developing MACs. Since background immunostaining is relatively high under these experimental conditions, the mean Pgm fluorescence intensity was calculated by measuring the mean fluorescence intensity in developing MACs minus the mean fluorescence intensity for vegetative cells on the slide. Under these experimental conditions, the size of the developing MACs is larger than in standard immunostaining conditions including the pre-extraction step (Figure 4—figure supplement 1), and the maximum level of Pgm in control cells is observed for MAC sizes falling within the 45–90 μm² range (highlighted in grey). 12 to 33 developing MACs were quantified for each condition. (C) Boxplot representation of Pgm mean fluorescence intensity (in arbitrary units: A.U.) for developing MACs ranging between 45–90 μm² in size. 10 to 25 MACs were quantified for each RNAi. (D) Mean Pgm fluorescence intensity in PGML KDs after immunostaining with or without the pre-extraction step. Data plotted in (C) were compared to those plotted in Figure 4C, after normalization by the mean control value obtained in each experimental condition (with or without pre-extraction). Error bars represent the standard deviation for each dataset. *** for p<0.001 (Mann-Whitney-Wilcoxon test).

Immunofluorescence staining, however, revealed that the endogenous nuclear Pgm signal is systematically lower in PGML KDs relative to control (Figure 4A). Quantification of the Pgm signal in new MACs revealed a 35% decrease in a PGML1 KD and a ~75% decrease in every other PGML KD (Figure 4C), almost reaching the 85% decrease observed in a PGM KD (Figure 4D). Of note, the immunofluorescence protocol used here was set up for optimal detection of nuclear Pgm (Dubois et al., 2017) and includes a Triton-mediated permeabilization step prior to cell fixation (see Materials and methods). Because this pre-extraction procedure may affect the apparent localization of proteins that are not tightly held in the nucleus (Martini et al., 1998), we also performed Pgm immunostaining in PGML KDs omitting this step. Under these conditions, the quality of the control Pgm immunostaining was reduced and a higher background was observed, but we could still quantify the Pgm nuclear signal in the different KDs (Figure 4—figure supplement 2). We still observed a ≈35% decrease in PGML1 KD relative to the control, and a 50% to 60% decrease in every other PGML KD. These data therefore indicate that bona fide Pgm nuclear localization is significantly affected by the depletion of PgmL2, PgmL3, PgmL4 or PgmL5 and, to a lesser extent, by the depletion of PgmL1. The significant exacerbation of the localization defect observed in PgmL2-, PgmL3-, PgmL4- or PgmL5-depleted cells subjected to a pre-extraction procedure further indicates that depletion of these particular PgmLs reduces the strength of Pgm association with the nucleus (Figure 4—figure supplement 2).
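The background subtraction and control normalization described in the legend of Figure 4—figure supplement 2 can be sketched as follows. This is a minimal illustration only, assuming that per-nucleus mean intensities have already been measured; the function names and the intensity values are hypothetical and do not reproduce the authors' image analysis.

```python
from statistics import mean

def background_corrected(mac_intensities, vegetative_intensities):
    """Subtract the mean vegetative-cell background from each developing-MAC value."""
    background = mean(vegetative_intensities)
    return [value - background for value in mac_intensities]

def percent_decrease(kd_values, control_values):
    """Percent decrease of the mean KD signal relative to the mean control signal."""
    control_mean = mean(control_values)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (1.0 - mean(kd_values) / control_mean)

# Hypothetical intensities in arbitrary units (not data from the study).
vegetative = [10.0, 12.0, 8.0]                                    # background cells
control = background_corrected([120.0, 100.0, 110.0], vegetative)
knockdown = background_corrected([50.0, 60.0, 55.0], vegetative)

print(f"Decrease relative to control: {percent_decrease(knockdown, control):.1f}%")
```

Normalizing each knockdown by the control measured under the same staining protocol is what allows datasets obtained with and without pre-extraction to be compared on one scale.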

PGML KDs have a genome-wide impact on IES elimination

To gain genome-wide insight into the effect of PGML KDs on IES excision, large-scale cultures were subjected to RNAi against PGML1, PGML2, PGML3a and b, PGML4a and b or PGML5a and b, and genomic DNA was extracted from isolated nuclei at late autogamy stages for high-throughput sequencing (Supplementary file 8). IES retention scores (IRS) were computed for each sample, using IES+ sequencing reads matching IES boundaries and IES- reads matching precise IES excision junctions (Denby Wilkes et al., 2016, see Materials and methods). The efficiency of PGML KDs was checked using northern blot hybridization of total RNA from autogamous cells (Figure 5—figure supplement 1) and confirmed by the absence of viable post-autogamous progeny (Supplementary file 7).
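As an illustration of how a retention score can be derived from the two read classes, the sketch below assumes the usual definition IRS = IES+ / (IES+ + IES-). This formula is not spelled out in this section and is stated here as an assumption about the cited method; the function names and read counts are hypothetical. The 0.025 bin width is the one quoted in the legend of Figure 5A.

```python
def ies_retention_score(ies_plus_reads, ies_minus_reads):
    """Assumed definition: IRS = IES+ / (IES+ + IES-).

    Returns None when no read covers the IES, since the score is undefined.
    """
    if ies_plus_reads < 0 or ies_minus_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = ies_plus_reads + ies_minus_reads
    if total == 0:
        return None
    return ies_plus_reads / total

def bin_scores(scores, bin_width=0.025):
    """Count scores per bin over [0, 1]; a score of exactly 1 goes in the last bin."""
    n_bins = round(1 / bin_width)
    counts = [0] * n_bins
    for score in scores:
        if score is None:          # skip IESs without coverage
            continue
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical (IES+, IES-) read counts for four IESs.
read_counts = [(30, 10), (0, 40), (25, 0), (0, 0)]
scores = [ies_retention_score(plus, minus) for plus, minus in read_counts]
print(scores)
```

A score near 1 means that most reads still carry the IES (retention), and a score near 0 means that most reads span the excision junction.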

The distributions of IRS show that every PGML KD strongly inhibits IES excision genome-wide (Figure 5A). Differences, however, are observed between the five PGML groups. PGML2, PGML4a and b or PGML5a and b KDs result in significant retention of all IESs in the new MAC. PGML1 and PGML3a and b KDs still allow efficient excision of a fraction of IESs, referred to as non-significantly retained (i.e. excised) following statistical analysis of IRS (Denby Wilkes et al., 2016): this represents 7479 and 3511 IESs, respectively. Strikingly, 89% of excised IESs in a PGML3 KD are also excised in a PGML1 KD (Figure 5B). Of note, excised IESs in PGML1 or PGML3a and b KDs tend to be significantly longer (median size = 62 or 67 bp, respectively) than an equivalent number of strongly retained IESs in the same KDs (median size = 48 or 55 bp, respectively) (Figure 5C). Analysis of the size distributions reveals that the size bias can mainly be attributed to over-representation of IESs from the 75 to 77 bp peak (and larger sizes) (Figure 5D). Under-representation of 46 to 47 bp IESs can also be noted.

Figure 5. Analysis of IES retention in PGML KDs.

(A) Distribution of IES retention scores (IRS) in PGML KDs. Grey bars represent the distribution of all IESs over IRS ranging from 0 to 1 (by bins of 0.025). The distribution obtained in a previously published PGM KD (Arnaiz et al., 2012) is shown as a control. Absolute values of IRS should not be compared from one KD to the other, due to variable contamination by old MAC fragments. For PGML1 and PGML3a and b KDs, the distribution of statistically non-significantly retained IESs (i.e. excised IESs) is superimposed in magenta. (B) Venn diagram representing the overlap between the sets of excised IESs in PGML1 or PGML3a and b KDs. (C) Violin plots of IES length distributions for the population of non-significantly retained IESs (magenta; n = 7479 in PGML1 KD and n = 3511 in PGML3a and b KD), the same number of IESs with the highest retention scores (blue) and all IESs (grey). The black dash shows the median of each distribution. Plots were drawn using the ggplot2 R-package (Wickham, 2009). Size distributions were compared using a Mann-Whitney-Wilcoxon statistical test and p values are indicated for each comparison (***: p < 2.2 × 10⁻¹⁶; **: 2.2 × 10⁻¹⁶ < p < 10⁻¹⁰; *: 10⁻¹⁰ < p < 5 × 10⁻²; NS: p > 5.5 × 10⁻²). (D) Comparative analysis of the relative fraction of IESs in each size peak among the populations of excised (magenta) and strongly retained (blue) IESs in PGML1 and PGML3a and b KDs, and the whole IES population (grey). Only IESs between 40 and 130 bp are represented.

Figure 5—figure supplement 1. Northern blot analysis of PGML mRNA during autogamy in PGML knockdowns.

Top panels: Detection of PGML mRNAs by northern blot hybridization. RNAi against the ND7 gene was used as a control. PGML1 and PGML2 genes were knocked down individually using their respective IF1 RNAi-inducing plasmids (Figure 3—figure supplement 1). Each of PGML4 or PGML5 groups of paralogs was knocked down as a whole by feeding Paramecium cells with a 1:1 mixture of induced bacteria harboring IF1 plasmids targeting RNAi against both genes a and b. For PGML3 genes, the most highly expressed a and b genes were knocked down through the same double feeding procedure using cross-reacting IF1 plasmids. PGML hybridization probes (indicated on top) were the gene-specific IF2 inserts (Figure 3—figure supplement 1). Time-points are in hours after T0, the time at which 50% of cells had a fragmented MAC. Bottom panel: Quantification of mRNA levels in PGML KDs, as the percentage of steady state mRNA amounts observed in the control RNAi. Acquisition of PGML hybridization signals was performed using a Typhoon phosphorimager and quantified using the ImageQuant TL software (GE Healthcare Life Sciences). For each lane, 17S rRNA was used as a loading control for normalization.
Figure 5—figure supplement 2. Analysis of IES retention scores in partial PGML2 KDs.

(A) Distribution of IES retention scores in complete and partial PGML2 KDs. Dilutions of PGML2 RNAi-inducing bacteria are indicated in each panel. For each condition, the distribution of statistically non-significantly retained IESs (i.e. efficiently excised IESs) is plotted in magenta. (B) Spearman’s rank correlation coefficients between IES retention scores in all PGML KDs, including PGML2 partial KDs. The graph was drawn using the ‘levelplot’ function from the ‘lattice’ package (Sarkar, 2008) (see Figure 5—figure supplement 3 for details). (C) Violin plots of the distributions of IES length for the population of 10,297 non-significantly retained IESs in the partial PGML2 KD (1:4 dilution) (magenta), the same number of IESs with the highest retention scores (blue) and all IESs (grey). The black dash shows the median of each distribution. Violin plots were drawn using the ggplot2 R-package (Wickham, 2009). Size distributions were compared using a Mann-Whitney-Wilcoxon statistical test and p values are indicated for each comparison (***: p < 2.2 × 10⁻¹⁶; **: 2.2 × 10⁻¹⁶ < p < 10⁻¹⁰; *: 10⁻¹⁰ < p < 5 × 10⁻²; NS: p > 5.5 × 10⁻²). (D) Comparative analysis of the relative fraction of IESs in each size peak among the populations of efficiently excised (magenta) and strongly retained (blue) IESs in the partial PGML2 KD (1:4), and the whole IES population (grey). Only IESs between 40 and 130 bp are represented.
Figure 5—figure supplement 3. Correlation between IES retention scores in PGM, PGML and partial PGML2 knockdowns.

The distributions of IES boundary scores are displayed in the diagonal for each knockdown. For each pair of KDs, level plots of IES retention scores were drawn using the ‘ggplot2’ package (Wickham, 2009) and are shown below the diagonal. Spearman’s rank correlation coefficients are displayed at the symmetrical position above the diagonal.

To investigate whether the size biases described above are a specificity of PGML1 and PGML3a and b KDs or simply attributable to incomplete RNAi, we partially released PGML2 KD by diluting PGML2 RNAi-inducing medium in RNAi medium targeting a non-essential gene (see Dubois et al., 2017). Consistent with higher survival in the progeny (Supplementary file 7), partial PGML2 KD (1:4 dilution) shifts the distribution of IRS toward zero and allows efficient excision of a fraction of 10,297 IESs (Figure 5—figure supplement 2). The distributions of IRS in partial PGML2 KD correlate with those obtained in PGML1 or PGML3 KDs (Spearman’s rank correlation coefficients = 0.7 and 0.77, respectively; Figure 5—figure supplements 2 and 3), indicating that a similar gradient is overall established from excised to strongly retained IESs in these KDs. However, comparison of the size distributions of excised versus strongly retained IESs in the three conditions confirms that over-representation of large IESs (75 to 77 bp in length and larger) is specific to excised IESs in PGML1 and PGML3 KDs (Figure 5D, Figure 5—figure supplement 2).
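The rank correlation used to compare retention scores between knockdowns can be illustrated with a short, dependency-free sketch. It computes Spearman's coefficient as a Pearson correlation on average ranks; it is not the authors' R code, and the retention scores below are hypothetical values chosen only to show the calculation.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1   # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5

# Hypothetical retention scores for the same five IESs in two knockdowns.
irs_kd_a = [0.05, 0.40, 0.80, 0.95, 0.60]
irs_kd_b = [0.10, 0.35, 0.90, 0.85, 0.70]
print(round(spearman(irs_kd_a, irs_kd_b), 2))
```

Because only ranks enter the calculation, the coefficient is insensitive to differences in the absolute level of retention scores between samples, which is relevant here since old MAC contamination varies from one knockdown to the other.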

PgmL1- or PgmL3a and b-depleted cells are prone to IES excision errors

Previous whole-genome DNA sequencing of wild-type reference strains revealed that IES excision generates sequence heterogeneity in the somatic genome (Duret et al., 2008; Swart et al., 2014), most likely due to erroneous excision events taking place between two TA dinucleotides, one of which at least is localized at an alternative position relative to reference IES boundaries. To evaluate the background of IES excision errors in the absence of any KD, we first sequenced total DNA extracted from parental vegetative cells before they were subjected to PGML RNAi. We analyzed sequencing reads mapping to the MAC + IES reference and counted the number of erroneous excision reads present in each sample. Consistent with published data, we found a low number of erroneous junctions (2.4% to 3% of all excision reads) in vegetative MACs (V) formed under no-KD conditions (Supplementary file 9).

We then sequenced genomic DNA of nuclei-enriched preparations from autogamous cells originating from the parental cultures described above and subjected to PGML RNAi. As in wild-type vegetative MACs, a fraction of excision reads (2.9% to 3.7%), representing the contribution of both the new MAC and old MAC fragments, corresponds to erroneous junctions. For each PGML KD, we calculated a normalized number of de novo excision errors that are specific to the new MAC (Materials and methods). Fewer de novo errors (26 to 38 per million mapped reads) are observed in PGML2, PGML4 or PGML5 KDs than in no-KD controls (Figure 6A), consistent with our observation that no significant excision activity is detected in these PGML KDs. In contrast, PGML1 and PGML3 KDs yield de novo excision error counts (122 and 98, respectively) similar to the control (Figure 6A), in spite of strongly reduced IES excision activity (Figure 5A), suggesting that residual excision in these KDs tends to be error-prone. The observation that the fraction of IESs for which at least one error is detected in PGML1 or PGML3a and b KDs is higher among excised IESs supports this idea: specifically for these two KDs, the fraction of fully excised IESs (IRS < 0.025) with errors reaches ~45%, versus ~10% under other conditions (Figure 6—figure supplement 1).

Figure 6. IES excision errors in PGML KDs.

(A) Number of IES excision errors in the MAC of vegetative cells before autogamy (No KD) and in the new MAC of autogamous cells upon each PGML KD (de novo errors). For the No KD sample, the error bar represents the standard deviation for five replicates (V samples in Supplementary file 9). (B) Major classes of IES excision errors found in the different samples. In external or internal errors, the two alternative TAs are misplaced (one on each flanking side or both inside of a reference IES, respectively). Overlapping errors use one TA inside and the other outside of reference IESs. (C) Distribution of the different classes of de novo excision errors in PGML KDs. As a control, the distribution of pre-existing errors found in the old MAC is shown for a vegetative culture (see Supplementary file 9). (D) Position of alternative excision boundaries used in partial internal excision errors, relative to the canonical boundary of the reference IES. WT: vegetative MAC; for PGML KDs, only de novo errors were considered. (E) Size distribution of IESs exhibiting partial internal errors in a PGML1 KD. Upper panel: size distribution of all IESs with partial internal errors. Lower panel: the black curve shows the fraction of IESs of each size relative to the total number of IESs in the genome; the red curve shows the fraction of IESs of each size among the population of IESs showing at least one partial internal error in a PGML1 KD. In both panels, only IESs with an alternative boundary at >2 bp from the canonical one were counted. In the bottom panel, IESs shorter than 35 bp were not considered (see Figure 6—figure supplement 4).

Figure 6—figure supplement 1. The number of excision errors increases for IESs with the lowest retention scores in PGML1 or PGML3a and b knockdowns.

(A) Fraction of IESs with errors as a function of boundary scores. For each dataset (see Supplementary file 9), the distribution of IES boundary scores is shown in grey (interval width: 0.025). For each interval, the number of IESs showing at least one excision error is superimposed in yellow and blue dots represent the ratio of IESs with at least one error (for intervals containing at least 50 IESs). All classes of errors were considered in this analysis. To allow comparison between PGML1, PGML3 and partial PGML2 KDs, raw error counts were used for all plots. (B) Plot of the fraction of IESs showing at least one excision error among fully excised IESs (boundary score <0.025). Vegetative controls are five independent cultures before autogamy in each RNAi-inducing medium (numbers refer to PGML groups).
Figure 6—figure supplement 2. Raw counts of IES excision errors in PGML1, PGML3a and b and partial PGML2 knockdowns.

(A) Raw counts of IES excision errors in PGML1, PGML3a and b and partial PGML2 KDs. Here, and in contrast to Figure 6C, the distribution of different classes of IES excision errors for each condition was plotted without subtracting the contribution of the old MAC from raw error counts. Over-representation of partial internal excision errors was still detected in PGML1 and PGML3a and b KDs. See Supplementary file 9 for details. (B) Position of alternative excision boundaries used in partial internal excision errors in PGM and PGML KDs, relative to the canonical boundary of the reference IES. As in panel A, all partial internal errors were considered for each sample without subtracting the contribution of the old MAC, leading to higher error counts for each position than shown in Figure 6D.
Figure 6—figure supplement 3. Alternative excision boundaries used in partial internal IES excision errors.

(A) Fraction of IES ends with a TA dinucleotide localized at each indicated distance from the reference TA boundary. Position 0 corresponds to the T of the reference TA at each IES end. (B) SeqLogos of canonical (left) and alternative internal (11–12 bp distant) TAs erroneously used in PGML KDs were determined using the ‘weblogo’ software, version 3.3 (Schneider and Stephens, 1990; Crooks et al., 2004). (C) Same analysis as in panel A, restricted to IESs ranging from 44 to 47 bp in length.
Figure 6—figure supplement 4. Size distribution of IESs with partial internal excision errors in PGML1 or PGML3a and b KDs.

(A) Size distribution of IESs exhibiting partial internal errors in a PGML1 KD. (B) Same as panel A for a PGML3a and b KD. (C) For each IES size, the black curve shows the fraction of IESs relative to total number of IESs in the genome, the red (PGML1 KD) and blue (PGML3a and b KD) curves show the fraction of IESs of each size among the population of IESs showing at least one partial internal error. In all panels, only IESs with an alternative boundary located >2 bp from the canonical one were counted.

Erroneous junctions can be grouped in different classes according to the location of the TAs used as alternative excision boundaries (Figure 6B). We found that PGML1 or PGML3 KDs modify the relative proportions of de novo error classes compared to a no-KD control or to PGML2, PGML4 or PGML5 KDs (Figure 6C and Supplementary file 9). The most conspicuous change is a specific increase in the proportion of partial internal excision errors, reaching more than 50% of all de novo errors in PgmL1- or PgmL3-depleted cells. Over-representation of this particular class of errors is not observed in partial PGML2 KDs (Figure 6—figure supplement 2), indicating that erroneous targeting of alternative internal boundaries is not a general consequence of limiting availability of the excision machinery, but is a specificity of PGML1 and PGML3 KDs. In the latter two KDs, most erroneous internal boundaries are shifted by 10 to 11 bp from the canonical TA, resulting in excision of DNA fragments shortened by one helical turn (Figure 6D). The position of alternative boundaries cannot be explained simply by a biased distribution of TA dinucleotides in IESs, since internal TAs can be found 5 to 6 bp away from reference IES boundaries (Figure 6—figure supplement 3), but they are not used in partial internal excision errors. Except for the TA, erroneous alternative boundaries do not match the general consensus sequence for IES ends (5’-TAYAGYNR-3’), nor do they define a novel sequence motif (Figure 6—figure supplement 3). Moreover, ‘unused’ canonical ends exhibit no particular sequence motif - different from the general consensus - that would suggest a preference of PgmL1 or PgmL3 for a specific sequence. 
Finally, we noticed that error-prone IESs, for which an internal TA (mostly shifted by 10 to 11 bp) is used in erroneous excision events, follow a different size distribution from the global IES population (Figure 6E and Figure 6—figure supplement 4): IESs of 66 bp and above are over-represented, whereas 47 bp and shorter IESs are largely under-represented. This size bias is similar to the distribution of excised IESs in PGML1 or PGML3 KDs, indicating again that errors are linked to excision activity.
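The error classes discussed above can be made concrete with a small classifier. This is a sketch under stated assumptions: boundaries are represented by the coordinate of each TA, an observed junction is assumed to overlap the reference IES, and the two "partial" classes are taken to mean that one boundary is canonical while the other is shifted (an interpretation, since only the external, internal and overlapping classes are defined explicitly in the legend of Figure 6). The function name and coordinates are hypothetical.

```python
def classify_excision(ref_left, ref_right, obs_left, obs_right):
    """Classify an observed excision junction against a reference IES.

    Arguments are coordinates of the left and right TA boundaries.
    """
    if not (ref_left < ref_right and obs_left < obs_right):
        raise ValueError("left boundary must lie before right boundary")
    if not (obs_left < ref_right and obs_right > ref_left):
        raise ValueError("observed junction must overlap the reference IES")

    left_shift = obs_left - ref_left      # > 0: inside the IES, < 0: left flank
    right_shift = ref_right - obs_right   # > 0: inside the IES, < 0: right flank

    if left_shift == 0 and right_shift == 0:
        return "correct"
    if left_shift == 0 or right_shift == 0:
        shift = left_shift or right_shift  # the single non-zero shift
        return "partial internal" if shift > 0 else "partial external"
    if left_shift > 0 and right_shift > 0:
        return "internal"                  # both alternative TAs inside the IES
    if left_shift < 0 and right_shift < 0:
        return "external"                  # one alternative TA in each flank
    return "overlapping"                   # one TA inside, the other outside

# Hypothetical 75 bp reference IES with TA boundaries at 100 and 175.
print(classify_excision(100, 175, 110, 175))
```

With this representation, the 10 to 11 bp shift reported for PGML1 and PGML3 KDs corresponds to a partial internal event whose single non-zero shift is 10 or 11.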

Discussion

IES excision in Paramecium is carried out by multiple partners including Pgm and five novel domesticated PB transposases

The present discovery of novel essential Pgm partners, encoded by five groups of paralogous genes and collectively referred to as PgmL1 to PgmL5, brings new insight into the mechanism of IES excision and deeper understanding of the molecular machinery involved. PgmLs are domesticated PB transposases distantly related to Pgm. Consistent with their essential function, Pgm and PgmLs are conserved within the P. aurelia group as well as in the more distant P. caudatum, indicative of an ancient origin of the IES excision machinery in Paramecium. More work is clearly needed to unravel the nature and organization of the IES excision machinery, but our data favor a single protein complex embedding at least two Pgm subunits and one PgmL subunit from each group (Figure 7). Indeed, depletion of each single PgmL group is sufficient to strongly compromise the nuclear localization of Pgm. As a consequence, each PGML KD inhibits excision of a large majority of IESs (83% to 100%) genome-wide. The remaining IESs (up to ~17%) that are still excised in PGML1 or PGML3 KDs are prone to excision errors, suggesting that PgmLs are strictly required for efficient and accurate excision.

Figure 7. Model for IES excision mediated by a multicomponent Pgm/PgmLs complex.

This figure summarizes the observed effects of PGML KDs on Pgm-mediated IES excision. In line with previously published data (Dubois et al., 2017) and known properties of the T. ni PiggyBac transposase (Jin et al., 2017), the catalytically active form of Pgm is assumed to be a dimer. In the absence of information about the stoichiometry of the complex, one Pgm homodimer (active catalytic site drawn as a star) is represented at each IES boundary, with a large bridging structure formed by all PgmLs. In a fully assembled complex, PgmL subunits are proposed to drive the correct positioning of the Pgm catalytic site onto both TA cleavage sites (indicated by arrows). Following PGML KDs, we propose two distinct situations. In PGML1 or PGML3 KDs, the depleted complexes can exist but are misassembled. As a consequence, Pgm nuclear stability is reduced (the phenotype is more pronounced in a PGML3 KD than in a PGML1 KD) and Pgm activity is altered, because incorrect positioning of catalytic subunits generates specific partial internal excision errors at low frequency (a dotted arrow points to the erroneously targeted alternative TA). In the other three KDs, IES excision complexes depleted for PgmL2, PgmL4 or PgmL5 are totally inactive. This might result either from strong misassembly of the complex or its dissociation (or non-assembly).

We show here that PgmLs can each interact directly with Pgm and we previously reported that Pgm also has homo-oligomerization properties (Dubois et al., 2017). Future studies will address whether these interactions are mutually exclusive, whether PgmLs interact with each other and, more importantly, which among these interactions participate in the assembly of the full excision complex. The Paramecium system shares interesting features with the higher-order complexes (transpososomes) that interact with transposon ends during bona fide transposition. Biochemical and structural studies of DD(D/E) transposases and retroviral integrases bound to their cognate DNA substrates have revealed that assembly of a productive transpososome often involves more transposase subunits than those actually engaged in catalysis, the supernumerary subunits playing an architectural role within the transposition complex (Montaño et al., 2012; Hickman et al., 2014; reviewed in Hickman and Dyda, 2015). In particular, recent studies have shown that the T. ni PB transposase dimerizes in solution (Jin et al., 2017) and higher-order complexes were proposed to assemble during piggyBac transposition (Morellet et al., 2018). Within a transpososome, all transposase subunits are generally identical because they are encoded by a single gene carried by the mobile element itself. In Paramecium, multiple domesticated transposase genes encode the different subunits that together carry out IES elimination. Selection pressure to maintain a fully conserved catalytic triad has been exerted on only one of these genes (PGM) and previous work established that at least two Pgm subunits are present in the complex (Dubois et al., 2017), consistent with a model in which their catalytic sites are positioned on each TA boundary (Figure 7). The catalytic domain of PgmL proteins has evolved more rapidly, suggesting that PgmLs rather have an architectural function within the complex.
Moreover, PgmL proteins differ in their domain organization and PgmL depletions have different effects on IES excision efficiency and accuracy, which suggests that all PgmLs may not play exactly the same role. Of note, the conservation of two Ds in PgmL4, which is the most closely related to Pgm, indicates that its RNase H domain, although probably inactive, has evolved under selection pressure, suggesting that it may play a particular role in the catalysis of DNA cleavage. RNAi-mediated depletion of PgmL2, PgmL4 or PgmL5 completely abolishes Pgm cleavage activity and its stable nuclear localization, perhaps by impairing correct assembly of the excision complex (Figure 7) and/or loosening its interaction with chromatin, preventing us at this stage from proposing a specific function for these three proteins. In contrast, PgmL1- or PgmL3-depleted complexes exhibit residual activity associated with an increased bias for partial internal excision errors involving the use of alternative 10 bp shifted TA boundaries. Because such a bias is not observed in partial PGML2 KDs, our data cannot simply be attributed to partial depletion and rather point to a specific function of PgmL1 and PgmL3. We propose that incomplete machineries devoid of PgmL1 or PgmL3 can still interact with IESs (Figure 7), even though much less efficiently and with lower accuracy than fully assembled complexes. The size biases observed in the population of excised IESs in PGML1 or PGML3 KDs might reveal that one function of these two PgmL subunits is to provide the architectural versatility required to adjust to the variable features of eliminated sequences.

Size biases of partial internal excision errors recapitulate evolution-driven IES size distribution

Paramecium IESs exhibit a characteristic size distribution, with a minimum size of 26 bp and a ~10 bp periodicity proposed to result from mechanistic constraints on IES excision (Arnaiz et al., 2012). The present study of partially active PgmL1- or PgmL3-depleted complexes is consistent with this hypothesis.

The overlapping subsets of IESs that still excise efficiently in PGML1 and PGML3a and b KDs tend to be larger in size than an equivalent pool of strongly retained IESs, with an under-representation of 46 to 47 bp IESs and over-representation of 75 to 77 bp (and larger) IESs. No specific sequence other than the usual consensus was found at the ends of excised or strongly retained IESs in these KDs, indicating that no particular motif defines the subsets of PgmL1- (or PgmL3)-dependent or independent IESs. We propose, instead, that PgmL1 and PgmL3 contribute to the positioning of Pgm-dependent DNA cleavages at correct TA boundaries, thus fine-tuning the size of the excised sequences (Figure 7). Indeed, residual activity of PgmL1- or PgmL3-depleted machineries reveals an over-representation of partial internal excision errors, leading to excision of IES-derived fragments one turn of a DNA helix shorter than reference IESs, a distance that corresponds to the periodicity of the IES size distribution. Remarkably, erroneous excision of 46 to 47 bp IESs that would have led to excision of 36 to 37 bp fragments is not observed (Figure 6E). Furthermore, 36 to 37 bp IESs are prone to partial internal errors resulting in excision of 26 to 28 bp fragments. These observations indicate that sequences of 36–37 bp may be mechanistically difficult to excise, as proposed previously based on the existence of a ‘forbidden’ peak corresponding to this size range in the distribution of genomic reference IESs (Arnaiz et al., 2012). Likewise, IESs from the 26 to 28 bp peak do not yield erroneously excised 10 bp shorter TA-indels, supporting the notion that 26 bp is the minimum size for excision by the Pgm-associated machinery. Taken together, our data provide strong experimental support to the hypothesis that mechanistic limitations have imposed strong constraints on Paramecium IES size during evolution.
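The size arithmetic behind this argument can be laid out explicitly. The sketch below takes the 26 bp minimum and the 36 to 37 bp 'forbidden' range from the text and applies a fixed 10 bp shift as a simplifying assumption (observed shifts are mostly 10 to 11 bp); the constant and function names are hypothetical.

```python
MIN_IES_SIZE = 26                 # minimum IES size stated in the text (bp)
FORBIDDEN_SIZES = range(36, 38)   # the 36 to 37 bp 'forbidden' peak
HELICAL_SHIFT = 10                # assumed shift: about one turn of the DNA helix

def shortened_fragment(ies_size, shift=HELICAL_SHIFT):
    """Size of the fragment excised when one boundary moves inward by `shift` bp."""
    return ies_size - shift

def is_disfavored(fragment_size):
    """True if the fragment is below the minimum size or in the forbidden range."""
    return fragment_size < MIN_IES_SIZE or fragment_size in FORBIDDEN_SIZES

for size in (26, 28, 36, 37, 46, 47, 66, 76):
    fragment = shortened_fragment(size)
    status = "disfavored" if is_disfavored(fragment) else "allowed"
    print(f"{size} bp IES -> {fragment} bp fragment: {status}")
```

Under these assumptions, partial internal errors on 46 to 47 bp IESs would produce fragments in the forbidden range, and errors on 26 to 28 bp IESs would fall below the minimum size, which matches the absence of such events reported above.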

Domesticated PB transposases in ciliates and other species

The distantly related ciliate Tetrahymena thermophila, which separated from Paramecium at least ~500 M years ago (Parfrey et al., 2011), harbors a clear Pgm ortholog, Tpb2p (Figure 1—figure supplement 2) that carries out imprecise excision of intergenic IESs, which constitute the vast majority of IESs in this ciliate (Cheng et al., 2010; Vogt and Mochizuki, 2013; Hamilton et al., 2016). Additional Tetrahymena domesticated PB transposases (Tpb7p and Lia5p), also lacking a conserved DDD triad, may somehow be related to Paramecium PgmLs, although the evolutionary relationships between Tetrahymena and Paramecium proteins are difficult to assess (Figure 1—figure supplement 2). While the role of Tpb7p is unknown (Cheng et al., 2016), Lia5p is essential for Tpb2p-dependent DNA elimination, localizes on IESs before excision and is involved in the delimitation of their excision boundaries (Shieh and Chalker, 2013; Suhren et al., 2017). Lia5p and Tpb7p may represent functional homologs of PgmLs for IES excision, but whether they interact with Tpb2p and how they contribute to DNA elimination at the molecular level remain to be established. In addition to the Tpb2p/Lia5p system, Tetrahymena encodes two other domesticated PB transposases, Tpb1p and Tpb6p, that precisely excise a distinct subset of 12 intragenic piggyBac-related IESs, in a Tpb2p-independent manner (Cheng et al., 2016; Feng et al., 2017). Thus, in contrast to Paramecium, Tetrahymena possesses two distinct IES excision machineries, each responsible for elimination of a particular subset of IESs. In spite of their differences, the two ciliate species provide remarkable examples of the participation of multiple-component protein complexes composed of catalytic and non-catalytic subunits, all domesticated from the same transposase family, in a transposition-related reaction. In humans, the Pgbd1 to Pgbd5 PB domesticated transposases do not all carry a fully conserved catalytic triad. 
Based on our Paramecium work, future investigations should take into consideration the possibility that Pgbd proteins may be involved together in the same cellular function(s).

Materials and methods

In silico protein sequence analysis

Paramecium genes and protein sequences were downloaded from the ParameciumDB database (Arnaiz and Sperling, 2011) (https://paramecium.i2bc.paris-saclay.fr/) and accession numbers are displayed in Supplementary file 2. HMMer version 3.1b (http://eddylab.org/software/hmmer3/3.1b2/Userguide.pdf, default parameters) was used to search for Pgm-related proteins using the DDE_Tnp1_7 domain (PF13843) as the query and the predicted Paramecium proteins from ParameciumDB (v1.77) as the database. Multiple protein sequence alignments were performed using MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/) (Edgar, 2004). Secondary structures were predicted using PSIPRED (V3.3) at the UCL website (http://bioinf.cs.ucl.ac.uk/psipred/) (Jones, 1999).

Paramecium strains and standard culture conditions

P. tetraurelia wild-type 51 new (Gratias and Bétermier, 2003) or its mutant derivative 51 nd7-1 (Dubois et al., 2017) were grown in a standard medium made of a wheat grass infusion inoculated with Klebsiella pneumoniae and supplemented with β-sitosterol (0.8 µg/mL) (Beisson et al., 2010). Autogamy was carried out as described (Dubois et al., 2017).

Gene knockdown experiments

RNAi during autogamy was achieved using the feeding procedure, as described (Dubois et al., 2017). Briefly, Paramecium cells grown for 10 to 15 vegetative fissions in plasmid-free Escherichia coli HT115 bacteria (Timmons et al., 2001) were transferred to medium containing non-induced HT115 harboring each RNAi plasmid and grown for ~4 divisions. Cells were then diluted into plasmid-containing HT115 induced for dsRNA production and allowed to grow for ~8 additional vegetative divisions before the start of autogamy. Final volumes were 3 to 4 mL for small-scale experiments, 50 to 75 mL for middle-scale experiments, 4 L for large-scale experiments. The presence of a functional new MAC in the progeny was tested after four days of starvation, as described (Dubois et al., 2017).

Control experiments were performed using the L4440 vector (Kamath et al., 2001) or plasmid p0ND7c, which targets RNAi against the non-essential ND7 gene (Garnier et al., 2004). For PGML KDs, PCR fragments from each PGML gene (Figure 3—figure supplement 1) were inserted into the multiple cloning site of L4440. Two different RNAi-inducing constructs targeting distinct non-overlapping regions (IF1 and IF2) were used for each gene: within multi-gene PGML families, IF1 shared strong nucleotide sequence homology between paralogs, while IF2 regions were gene-specific.

RNA and DNA extraction

During autogamy, total RNA was Trizol-extracted from ~2 to 5 × 10⁵ cells for each time-point and quantified using a NanoDrop spectrophotometer. Gel electrophoresis and northern blot hybridization with ³²P-radiolabelled DNA probes were performed as described (Baudry et al., 2009). For PCR analysis, total genomic DNA was extracted from ~1 to 3 × 10³ cells for each time-point using the NucleoSpin Tissue extraction kit (Macherey Nagel). For high throughput DNA sequencing, genomic DNA was extracted from whole cells or isolated nuclei (Gratias and Bétermier, 2003; Arnaiz et al., 2012).

Microinjection of transgenes expressing tagged PgmL proteins

Plasmids expressing PgmL2-3X Flag, PgmL3a-3X Flag and PgmL4a-3X Flag are pUC18 derivatives, in which the P. tetraurelia PGML2, PGML3a and PGML4a genes, respectively, were fused at their 3’ end to a synthetic DNA sequence (Integrated DNA Technologies) encoding the 3X Flag tag (YKDHDGDYKDHDIDYKDDDDKT). All sequences are available upon request. Each transgene-bearing plasmid was linearized with BsaI and microinjected with an ND7-complementing plasmid into the MAC of vegetative 51 nd7-1 cells, as described (Dubois et al., 2017).

Immunofluorescence and western blot analysis

Polyclonal α-Pgm 2659-GP guinea pig antibodies were described previously (Dubois et al., 2017). Peptides DKGKSVQYAKQVEIE and FSQVRKQAYKKQTQP from the C-terminus of PgmL1 and PgmL5a, respectively, were used for rabbit immunization to yield α-PgmL1 and α-PgmL5a antibodies (Eurogentec). Polyclonal antibodies were purified by antigen affinity purification. A commercial α-Flag monoclonal antibody (monoclonal anti-Flag M2 antibody, Sigma Aldrich) was used for the detection of 3X Flag fusion proteins. The specificity of α-PgmL1 and α-PgmL5a antibodies was validated by immunofluorescence using PGML1-silenced and PGML5a/b-silenced cells, respectively, whereas the specificity of the α-Flag antibody was validated using non-injected control cells (Figure 3—figure supplement 1).

Immunofluorescence and western blot analysis were performed as described (Dubois et al., 2017), with the following modifications. Autogamous cells from middle-scale cultures were washed with Dryl's buffer (2 mM sodium citrate, 1 mM NaH₂PO₄, 1 mM Na₂HPO₄, 1 mM CaCl₂), extracted with ice-cold PHEM (60 mM PIPES, 25 mM Hepes, 10 mM EGTA, 2 mM MgCl₂, pH 6.9) + 1% Triton for 4 min, then fixed for 15 min in PHEM + 2% formaldehyde. Cells were further washed three times in TBST (10 mM Tris pH 7.4, 0.15 M NaCl, 0.1% Tween20) + 3% BSA. The Triton-mediated pre-extraction step was found to increase the detection signal and lower the background level for all the antibodies we used. Antibody incubation was done in TBST + 3% BSA for 2 hr at room temperature using either α-Pgm 2659-GP (1:500), α-PgmL1 (1:800), α-PgmL5a (1:800) or α-Flag (1:500) antibodies. Primary antibodies were detected with Alexa Fluor 488-conjugated goat anti-guinea pig, anti-rabbit or anti-mouse IgG (1:500, ThermoFisher Scientific) and DNA was counterstained with 0.5 µg/ml DAPI (Sigma). Epifluorescence microscopy was performed as described (Dubois et al., 2017). The size of developing MACs (in µm²) was measured at the maximal area section and quantification of the Pgm signal was performed using the ImageJ software (https://imagej.nih.gov/). The mean Pgm fluorescence intensity corresponds to the mean fluorescence intensity (per surface unit) in a developing MAC minus the mean extracellular background fluorescence intensity on the slide for each RNAi condition. For each experiment, normalization was performed using the mean value obtained for the corresponding control dataset. Boxplots were drawn using BoxPlotR (http://boxplot.bio.ed.ac.uk/). Mann-Whitney-Wilcoxon statistical tests (Marx et al., 2016) (https://ccb-compute2.cs.uni-saarland.de/wtest/) were performed to compare the datasets obtained under different conditions.
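The background subtraction and normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' ImageJ workflow: the function names are hypothetical, and the inputs are assumed to be mean intensities per surface unit that have already been measured for each developing MAC and for the extracellular background.

```python
from statistics import mean
from typing import List, Sequence


def corrected_intensity(mac_mean: float, background_mean: float) -> float:
    """Mean MAC fluorescence minus the mean extracellular background."""
    return mac_mean - background_mean


def normalize_to_control(values: Sequence[float],
                         control_values: Sequence[float]) -> List[float]:
    """Divide each corrected intensity by the mean of the control dataset."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    return [v / control_mean for v in values]


if __name__ == "__main__":
    # Made-up intensities in arbitrary units, for illustration only.
    control = [corrected_intensity(110.0, 10.0), corrected_intensity(90.0, 10.0)]
    knockdown = [corrected_intensity(55.0, 10.0)]
    print(normalize_to_control(knockdown, control))
```

Normalizing to the control mean of the same experiment puts the control dataset at an average of 1, so that values from independent experiments can be compared on a common scale.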

For the protein extraction from Paramecium cells and western blot analysis, 3 to 6 × 10⁵ autogamous cells were collected by centrifugation at T5-T10 and washed with Dryl's buffer before transfer to liquid nitrogen. Frozen concentrated cells were directly lysed following addition of an equal volume of boiling 10% SDS containing 1x Protease Inhibitor Cocktail Set 1 (Merck Chemicals) and incubated at 100°C for 3 min. SDS-PAGE and western blotting with α-Pgm 2659-GP (1:500) and α-alpha Tubulin TEU435 (1:1000) were performed as described (Dubois et al., 2017). The signal was visualized with the ChemiDoc Touch Imaging System (Bio-rad) and densitometric analyses were performed using Image Lab software (Bio-rad).

Protein expression in insect cells and co-precipitation assays

For MBP-Pgm or MBP expression, plasmids pVL1392-MBP-PGM and pVL1392-MBP (Marmignon et al., 2014) were transfected individually into High Five cells together with the BD BaculoGold Linearized Baculovirus DNA (BD Biosciences) to produce recombinant baculoviruses (Dubois et al., 2017).

Synthetic PGML genes adapted to the universal genetic code (Eurofins Genomics, Supplementary file 5) were cloned into the pFastBAC vector (ThermoFisher Scientific) and fused at their 5’ end to a nucleotide sequence encoding the HA tag. Production of recombinant baculoviruses and expression of HA-PgmL fusions were performed using the BAC-to-BAC baculovirus expression system (ThermoFisher Scientific).

To co-express each HA-PgmL fusion with MBP-Pgm (or the MBP control), High Five cells were co-infected with the appropriate recombinant baculoviruses. Cell lysis, preparation of soluble protein extracts, co-precipitation on amylose beads and detection of HA-tagged PgmLs on western blots using HA-7 monoclonal α-HA antibodies (Sigma Aldrich) were performed as described (Dubois et al., 2017). Independent experiments showed that the HA epitope does not interact non-specifically with MBP-PGM.

High throughput sequencing and analysis of IES retention

For each PGML KD, total genomic DNA was extracted from vegetative parental cells or nuclear preparations enriched in developing MACs from the same cultures at late autogamy stages (following 4 days of starvation). DNA was sequenced at 76 to 160X coverage by a paired-end strategy using Illumina HiSeq (paired-end read length: ~2 × 100 nt) or NextSeq (paired-end read length: ~2 × 150 nt) sequencers. Sequencing reads were mapped against the MAC or MAC + IES reference genomes of P. tetraurelia 51 (Arnaiz et al., 2012). IES retention scores (IRS) correspond to the mean of the two boundary scores of a given IES, calculated using the MIRET module of the ParTIES package (Denby Wilkes et al., 2016). Because variable amounts of DNA from old MAC fragments are present in the samples, the retention scores calculated in each experiment cannot be considered as absolute measurements of IES retention in the new MAC. For each IES, boundary scores were individually compared to those obtained in a control autogamy experiment performed in standard K. pneumoniae medium, as described in (Gruchota et al., 2017), and a statistical test for the significance of each boundary score was performed using the ParTIES package. This allowed us to define two groups of IESs: a set of significantly retained IESs and a set of ‘non-retained’ IESs (i.e. excised) that did not pass the statistical test.
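The relationship between boundary scores and the retention score of an IES can be sketched as below. The averaging of the two boundary scores follows the description above; the formula used here for an individual boundary score (reads retaining the IES divided by all reads spanning that boundary) is an assumption made for illustration, and the function names are hypothetical rather than part of the ParTIES interface.

```python
from typing import Optional


def boundary_score(ies_plus: int, ies_minus: int) -> Optional[float]:
    """Fraction of reads at one boundary that still contain the IES.

    ies_plus: reads spanning the boundary with the IES retained.
    ies_minus: reads spanning the excision junction (IES removed).
    Returns None when no read covers the boundary.
    Assumed definition, not taken from the text above.
    """
    total = ies_plus + ies_minus
    if total == 0:
        return None
    return ies_plus / total


def retention_score(left_score: float, right_score: float) -> float:
    """Retention score of an IES: mean of its two boundary scores."""
    return (left_score + right_score) / 2


if __name__ == "__main__":
    # Made-up read counts for a single IES, for illustration only.
    left = boundary_score(ies_plus=30, ies_minus=10)
    right = boundary_score(ies_plus=10, ies_minus=30)
    print(retention_score(left, right))
```

Under this assumed definition a score of 0 means complete excision and a score of 1 complete retention, which is why contamination by old MAC DNA (fully excised) lowers the observed values.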

Excision errors were analyzed using the MILORD module of ParTIES, with the mapping performed on the MAC + IES reference. Each error was counted as 1, independently of the number of corresponding reads. An estimated number of de novo excision errors in the new MAC was calculated by removing the errors already found in the MAC of parental vegetative cells from those that were detected in total genomic DNA from autogamous cells (Supplementary file 9). De novo error counts were normalized relative to the total number of sequence reads for each sample.
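The counting scheme described above amounts to a set difference followed by a normalization, as in the following sketch. The representation of an error as a (scaffold, start, end) tuple, the function names and the per-million-reads scale are all assumptions made for illustration; they are not the MILORD output format.

```python
from typing import Hashable, Iterable, Set


def de_novo_errors(autogamous: Iterable[Hashable],
                   parental: Iterable[Hashable]) -> Set[Hashable]:
    """Errors seen in autogamous cells but absent from the parental MAC.

    Converting to sets counts each distinct error once, however many
    reads support it.
    """
    return set(autogamous) - set(parental)


def normalized_count(errors: Set[Hashable], total_reads: int,
                     per: int = 1_000_000) -> float:
    """Number of distinct errors per `per` sequenced reads (assumed scale)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return len(errors) * per / total_reads


if __name__ == "__main__":
    # Hypothetical errors keyed by (scaffold, start, end); not real data.
    autogamous = [("scaffold_1", 100, 150), ("scaffold_1", 100, 150),
                  ("scaffold_2", 5, 40)]
    parental = [("scaffold_2", 5, 40)]
    new_errors = de_novo_errors(autogamous, parental)
    print(normalized_count(new_errors, total_reads=2_000_000))
```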

Data availability

All DNA-seq datasets (Supplementary file 8) generated in this study were deposited in the European Nucleotide Archive under the Project Accession PRJEB24171. Reference genomes and IESs are available through ParameciumDB (https://paramecium.i2bc.paris-saclay.fr).

Acknowledgments

This work has benefited from the facilities and expertise of the high throughput sequencing core facility of I2BC (Centre de Recherche de Gif - http://www.i2bc-saclay.fr/). We thank Sylvain Bouvard, Franck Mayeux and Victor Plet for performing preliminary experiments during their master internship, Cindy Mathon and Pascaline Tirand for excellent technical assistance and Arthur Abello, Marc Guérineau, Sandra Duharcourt, Laurent Duret, Eric Meyer, Nelly Morellet, Anne-Marie Tassin and all members of the Duharcourt, Meyer and Tassin laboratories for stimulating discussions. The project was carried out in the framework of the CNRS GDRI ‘Paramecium Genome Dynamics and Evolution’.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Julien Bischerour, Email: julien.bischerour@i2bc.paris-saclay.fr.

Mireille Bétermier, Email: mireille.betermier@i2bc.paris-saclay.fr.

Funding Information

This paper was supported by the following grants:

  • Agence Nationale de la Recherche ANR-10-BLAN-1603 to Mireille Bétermier.

  • Fondation ARC pour la Recherche sur le Cancer #PJA20151203521 to Julien Bischerour.

  • European Research Council 260358 to Mariusz Nowacki.

  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung 31003A_146257 to Mariusz Nowacki.

  • Centre National de la Recherche Scientifique Intramural CNRS funding to Mireille Bétermier.

  • Agence Nationale de la Recherche ANR-12-BSV6-0017 to Linda Sperling.

  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung 31003A_166407 to Mariusz Nowacki.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Funding acquisition, Investigation, Visualization, Writing—original draft, Writing—review and editing, Designed, Performed and analyzed wet-lab experiments.

Investigation, Writing—review and editing, Designed, Performed and analyzed wet-lab experiments.

Software, Formal analysis, Visualization, Analyzed all DNA sequencing data.

Investigation, Visualization, Methodology, Writing—review and editing, Designed, Performed and analyzed wet-lab experiments.

Investigation.

Validation, Writing—review and editing.

Validation.

Investigation, Writing—review and editing, Identified PGML genes in Paramecium species.

Data curation, Identified PGML genes in Paramecium species.

Data curation, Funding acquisition, Writing—review and editing, Identified PGML genes in Paramecium species.

Supervision, Funding acquisition, Directed the experiments performed in Bern.

Conceptualization, Funding acquisition, Investigation, Visualization, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Supplementary file 1. MUSCLE alignment of the transposase core domains of ciliate domesticated PB transposases and other PB transposases.
elife-37927-supp1.pdf (149.1KB, pdf)
DOI: 10.7554/eLife.37927.024
Supplementary file 2. Table of Pgm and PgmL proteins encoded by published Paramecium genomes and their ParameciumDB accession numbers.
elife-37927-supp2.xlsx (14.5KB, xlsx)
DOI: 10.7554/eLife.37927.025
Supplementary file 3. Sequences of the cysteine-rich domains used for the alignment shown in Figure 1—figure supplement 1.
elife-37927-supp3.rtf (56.4KB, rtf)
DOI: 10.7554/eLife.37927.026
Supplementary file 4. Sequences of the transposase core domains used for the alignment shown in Supplementary file 1.
elife-37927-supp4.rtf (78KB, rtf)
DOI: 10.7554/eLife.37927.027
Supplementary file 5. Sequence of the synthetic PGML genes used for protein production in insect cells.
elife-37927-supp5.rtf (60.8KB, rtf)
DOI: 10.7554/eLife.37927.028
Supplementary file 6. Analysis of post-autogamous progeny in small-scale PGML knockdowns.
elife-37927-supp6.xlsx (25.6KB, xlsx)
DOI: 10.7554/eLife.37927.029
Supplementary file 7. Analysis of post-autogamous progeny in middle- and large-scale PGML knockdowns.
elife-37927-supp7.xlsx (13.8KB, xlsx)
DOI: 10.7554/eLife.37927.030
Supplementary file 8. DNA-seq datasets from ENA project PRJEB24171 (this study).
elife-37927-supp8.xlsx (13.5KB, xlsx)
DOI: 10.7554/eLife.37927.031
Supplementary file 9. Analysis of IES excision reads in PGM and PGML knockdowns.
elife-37927-supp9.xlsx (17.3KB, xlsx)
DOI: 10.7554/eLife.37927.032
Transparent reporting form
DOI: 10.7554/eLife.37927.033

Data availability

The following dataset was generated:

Bischerour J, Bhullar S, Denby Wilkes C, Régnier V, Mathy N, Dubois E, Singh A, Swart E, Arnaiz O, Sperling L, Nowacki M, Bétermier M. DNA-seq of PGMLs knocked down cells. 2018. http://www.ebi.ac.uk/ena/data/view/PRJEB24171 Publicly available at the European Nucleotide Archive (accession no: PRJEB24171)

The following previously published datasets were used:

Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury JM, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M, Sperling L. DNA-seq of PGM knocked down cells. 2012. http://www.ebi.ac.uk/ena/data/view/ERA137444 Publicly available at the European Nucleotide Archive (accession no: ERA137444)

Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury JM, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M, Sperling L. DNA-seq strain 51MAC. 2012. http://www.ebi.ac.uk/ena/data/view/ERA137420 Publicly available at the European Nucleotide Archive (accession no: ERA137420)

References

  1. Allen SE, Hug I, Pabian S, Rzeszutek I, Hoehener C, Nowacki M. Circular concatemers of Ultra-Short DNA segments produce regulatory RNAs. Cell. 2017;168:990–999. doi: 10.1016/j.cell.2017.02.020.
  2. Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury JM, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M, Sperling L. The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences. PLoS Genetics. 2012;8:e1002984. doi: 10.1371/journal.pgen.1002984.
  3. Arnaiz O, Sperling L. ParameciumDB in 2011: new tools and new data for functional and comparative genomics of the model ciliate Paramecium tetraurelia. Nucleic Acids Research. 2011;39:D632–D636. doi: 10.1093/nar/gkq918.
  4. Arnaiz O, Van Dijk E, Bétermier M, Lhuillier-Akakpo M, de Vanssay A, Duharcourt S, Sallet E, Gouzy J, Sperling L. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression. BMC Genomics. 2017;18:483. doi: 10.1186/s12864-017-3887-z.
  5. Aury JM, Jaillon O, Duret L, Noel B, Jubin C, Porcel BM, Ségurens B, Daubin V, Anthouard V, Aiach N, Arnaiz O, Billaut A, Beisson J, Blanc I, Bouhouche K, Câmara F, Duharcourt S, Guigo R, Gogendeau D, Katinka M, Keller AM, Kissmehl R, Klotz C, Koll F, Le Mouël A, Lepère G, Malinsky S, Nowacki M, Nowak JK, Plattner H, Poulain J, Ruiz F, Serrano V, Zagulski M, Dessen P, Bétermier M, Weissenbach J, Scarpelli C, Schächter V, Sperling L, Meyer E, Cohen J, Wincker P. Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature. 2006;444:171–178. doi: 10.1038/nature05230.
  6. Barsoum E, Martinez P, Aström SU. Alpha3, a transposable element that promotes host sexual reproduction. Genes & Development. 2010;24:33–44. doi: 10.1101/gad.557310.
  7. Baudry C, Malinsky S, Restituito M, Kapusta A, Rosa S, Meyer E, Betermier M. PiggyMac, a domesticated piggyBac transposase involved in programmed genome rearrangements in the ciliate Paramecium tetraurelia. Genes & Development. 2009;23:2478–2483. doi: 10.1101/gad.547309.
  8. Beisson J, Bétermier M, Bré MH, Cohen J, Duharcourt S, Duret L, Kung C, Malinsky S, Meyer E, Preer JR, Sperling L. Paramecium tetraurelia: the renaissance of an early unicellular model. Cold Spring Harbor Protocols. 2010;2010:pdb.emo140. doi: 10.1101/pdb.emo140.
  9. Betermier M, Duharcourt S. Programmed Rearrangement in Ciliates: Paramecium. Microbiology spectrum. 2014;2 doi: 10.1128/microbiolspec.MDNA3-0035-2014.
  10. Bouallègue M, Rouault JD, Hua-Van A, Makni M, Capy P. Molecular evolution of piggyBac superfamily: from selfishness to domestication. Genome Biology and Evolution. 2017;9:292–339. doi: 10.1093/gbe/evw292.
  11. Bourgain FM, Katinka MD. Telomeres inhibit end to end fusion and enhance maintenance of linear DNA molecules injected into the Paramecium primaurelia macronucleus. Nucleic Acids Research. 1991;19:1541–1547. doi: 10.1093/nar/19.7.1541.
  12. Cheng CY, Vogt A, Mochizuki K, Yao MC. A domesticated piggyBac transposase plays key roles in heterochromatin dynamics and DNA cleavage during programmed DNA deletion in Tetrahymena thermophila. Molecular Biology of the Cell. 2010;21:1753–1762. doi: 10.1091/mbc.e09-12-1079.
  13. Cheng CY, Young JM, Lin CG, Chao JL, Malik HS, Yao MC. The piggyBac transposon-derived genes TPB1 and TPB6 mediate essential transposon-like excision during the developmental rearrangement of key genes in Tetrahymena thermophila. Genes & Development. 2016;30:2724–2736. doi: 10.1101/gad.290460.116.
  14. Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Research. 2004;14:1188–1190. doi: 10.1101/gr.849004.
  15. Curcio MJ, Derbyshire KM. The outs and ins of transposition: from mu to kangaroo. Nature Reviews Molecular Cell Biology. 2003;4:865–877. doi: 10.1038/nrm1241.
  16. Denby Wilkes C, Arnaiz O, Sperling L. ParTIES: a toolbox for Paramecium interspersed DNA elimination studies. Bioinformatics. 2016;32:599–601. doi: 10.1093/bioinformatics/btv691.
  17. Donaldson GP, Roelofs KG, Luo Y, Sintim HO, Lee VT. A rapid assay for affinity and kinetics of molecular interactions with nucleic acids. Nucleic Acids Research. 2012;40:e48. doi: 10.1093/nar/gkr1299.
  18. Dubois E, Mathy N, Régnier V, Bischerour J, Baudry C, Trouslard R, Bétermier M. Multimerization properties of PiggyMac, a domesticated piggyBac transposase involved in programmed genome rearrangements. Nucleic Acids Research. 2017;45:3204–3216. doi: 10.1093/nar/gkw1359.
  19. Duret L, Cohen J, Jubin C, Dessen P, Goût JF, Mousset S, Aury JM, Jaillon O, Noël B, Arnaiz O, Bétermier M, Wincker P, Meyer E, Sperling L. Analysis of sequence variability in the macronuclear DNA of Paramecium tetraurelia: a somatic view of the germline. Genome Research. 2008;18:585–596. doi: 10.1101/gr.074534.107.
  20. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
  21. Feng L, Wang G, Hamilton EP, Xiong J, Yan G, Chen K, Chen X, Dui W, Plemens A, Khadr L, Dhanekula A, Juma M, Dang HQ, Kapler GM, Orias E, Miao W, Liu Y. A germline-limited piggyBac transposase gene is required for precise excision in Tetrahymena genome rearrangement. Nucleic Acids Research. 2017;45:9481–9502. doi: 10.1093/nar/gkx652.
  22. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Research. 2016;44:D279–D285. doi: 10.1093/nar/gkv1344.
  23. Galvani A, Sperling L. RNA interference by feeding in Paramecium. Trends in Genetics. 2002;18:11–12. doi: 10.1016/S0168-9525(01)02548-3.
  24. Garnier O, Serrano V, Duharcourt S, Meyer E. RNA-mediated programming of developmental genome rearrangements in Paramecium tetraurelia. Molecular and Cellular Biology. 2004;24:7370–7379. doi: 10.1128/MCB.24.17.7370-7379.2004.
  25. Gilley D, Preer JR, Aufderheide KJ, Polisky B. Autonomous replication and addition of telomerelike sequences to DNA microinjected into Paramecium tetraurelia macronuclei. Molecular and Cellular Biology. 1988;8:4765–4772. doi: 10.1128/MCB.8.11.4765.
  26. Gratias A, Bétermier M. Processing of double-strand breaks is involved in the precise excision of paramecium internal eliminated sequences. Molecular and Cellular Biology. 2003;23:7152–7162. doi: 10.1128/MCB.23.20.7152-7162.2003.
  27. Gratias A, Lepère G, Garnier O, Rosa S, Duharcourt S, Malinsky S, Meyer E, Bétermier M. Developmentally programmed DNA splicing in Paramecium reveals short-distance crosstalk between DNA cleavage sites. Nucleic Acids Research. 2008;36:3244–3251. doi: 10.1093/nar/gkn154.
  28. Gray LT, Fong KK, Pavelitz T, Weiner AM. Tethering of the conserved piggyBac transposase fusion protein CSB-PGBD3 to chromosomal AP-1 proteins regulates expression of nearby genes in humans. PLoS Genetics. 2012;8:e1002972. doi: 10.1371/journal.pgen.1002972.
  29. Gruchota J, Denby Wilkes C, Arnaiz O, Sperling L, Nowak JK. A meiosis-specific Spt5 homolog involved in non-coding transcription. Nucleic Acids Research. 2017;45:4722–4732. doi: 10.1093/nar/gkw1318.
  30. Hamilton EP, Kapusta A, Huvos PE, Bidwell SL, Zafar N, Tang H, Hadjithomas M, Krishnakumar V, Badger JH, Caler EV, Russ C, Zeng Q, Fan L, Levin JZ, Shea T, Young SK, Hegarty R, Daza R, Gujja S, Wortman JR, Birren BW, Nusbaum C, Thomas J, Carey CM, Pritham EJ, Feschotte C, Noto T, Mochizuki K, Papazyan R, Taverna SD, Dear PH, Cassidy-Hanley DM, Xiong J, Miao W, Orias E, Coyne RS. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome. eLife. 2016;5:e19090. doi: 10.7554/eLife.19090.
  31. Henssen AG, Henaff E, Jiang E, Eisenberg AR, Carson JR, Villasante CM, Ray M, Still E, Burns M, Gandara J, Feschotte C, Mason CE, Kentsis A. Genomic DNA transposition induced by human PGBD5. eLife. 2015;4:e10565. doi: 10.7554/eLife.10565.
  32. Henssen AG, Koche R, Zhuang J, Jiang E, Reed C, Eisenberg A, Still E, MacArthur IC, Rodríguez-Fos E, Gonzalez S, Puiggròs M, Blackford AN, Mason CE, de Stanchina E, Gönen M, Emde AK, Shah M, Arora K, Reeves C, Socci ND, Perlman E, Antonescu CR, Roberts CWM, Steen H, Mullen E, Jackson SP, Torrents D, Weng Z, Armstrong SA, Kentsis A. PGBD5 promotes site-specific oncogenic mutations in human tumors. Nature Genetics. 2017;49:1005–1014. doi: 10.1038/ng.3866.
  33. Hickman AB, Chandler M, Dyda F. Integrating prokaryotes and eukaryotes: DNA transposases in light of structure. Critical Reviews in Biochemistry and Molecular Biology. 2010;45:50–69. doi: 10.3109/10409230903505596.
  34. Hickman AB, Ewis HE, Li X, Knapp JA, Laver T, Doss AL, Tolun G, Steven AC, Grishaev A, Bax A, Atkinson PW, Craig NL, Dyda F. Structural basis of hAT transposon end recognition by Hermes, an octameric DNA transposase from Musca domestica. Cell. 2014;158:353–367. doi: 10.1016/j.cell.2014.05.037.
  35. Hickman AB, Dyda F. Mechanisms of DNA Transposition. Microbiology Spectrum. 2015;3:MDNA3-0034-2014. doi: 10.1128/microbiolspec.MDNA3-0034-2014.
  36. Huang S, Tao X, Yuan S, Zhang Y, Li P, Beilinson HA, Zhang Y, Yu W, Pontarotti P, Escriva H, Le Petillon Y, Liu X, Chen S, Schatz DG, Xu A. Discovery of an Active RAG Transposon Illuminates the Origins of V(D)J Recombination. Cell. 2016;166:102–114. doi: 10.1016/j.cell.2016.05.032.
  37. Ignarski M, Singh A, Swart EC, Arambasic M, Sandoval PY, Nowacki M. Paramecium tetraurelia chromatin assembly factor-1-like protein PtCAF-1 is involved in RNA-mediated control of DNA elimination. Nucleic Acids Research. 2014;42:11952–11964. doi: 10.1093/nar/gku874.
  38. Jangam D, Feschotte C, Betrán E. Transposable Element Domestication As an Adaptation to Evolutionary Conflicts. Trends in Genetics. 2017;33:817–831. doi: 10.1016/j.tig.2017.07.011.
  39. Jin Y, Chen Y, Zhao S, Guan KL, Zhuang Y, Zhou W, Wu X, Xu T. DNA-PK facilitates piggyBac transposition by promoting paired-end complex formation. PNAS. 2017;114:7408–7413. doi: 10.1073/pnas.1612980114.
  40. Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Bioinformatics. 1992;8:275–282. doi: 10.1093/bioinformatics/8.3.275.
  41. Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. Journal of Molecular Biology. 1999;292:195–202. doi: 10.1006/jmbi.1999.3091.
  42. Kamath RS, Martinez-Campos M, Zipperlen P, Fraser AG, Ahringer J. Effectiveness of specific RNA-mediated interference through ingested double-stranded RNA in Caenorhabditis elegans. Genome Biology. 2001;2:research0002.1. doi: 10.1186/gb-2000-2-1-research0002.
  43. Kapitonov VV, Jurka J. RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons. PLoS Biology. 2005;3:e181. doi: 10.1371/journal.pbio.0030181.
  44. Kapusta A, Matsuda A, Marmignon A, Ku M, Silve A, Meyer E, Forney JD, Malinsky S, Bétermier M. Highly precise and developmentally programmed genome assembly in Paramecium requires ligase IV-dependent end joining. PLoS Genetics. 2011;7:e1002049. doi: 10.1371/journal.pgen.1002049.
  45. Katinka MD, Bourgain FM. Interstitial telomeres are hotspots for illegitimate recombination with DNA molecules injected into the macronucleus of Paramecium primaurelia. The EMBO Journal. 1992;11:725–732. doi: 10.1002/j.1460-2075.1992.tb05105.x.
  46. Kim HS, Chen Q, Kim SK, Nickoloff JA, Hromas R, Georgiadis MM, Lee SH. The DDN catalytic motif is required for Metnase functions in non-homologous end joining (NHEJ) repair and replication restart. Journal of Biological Chemistry. 2014;289:10930–10938. doi: 10.1074/jbc.M113.533216.
  47. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Molecular Biology and Evolution. 2016;33:1870–1874. doi: 10.1093/molbev/msw054.
  48. Laemmli UK. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature. 1970;227:680–685. doi: 10.1038/227680a0.
  49. Liu D, Bischerour J, Siddique A, Buisine N, Bigot Y, Chalmers R. The human SETMAR protein preserves most of the activities of the ancestral Hsmar1 transposase. Molecular and Cellular Biology. 2007;27:1125–1132. doi: 10.1128/MCB.01899-06.
  50. Marmignon A, Bischerour J, Silve A, Fojcik C, Dubois E, Arnaiz O, Kapusta A, Malinsky S, Bétermier M. Ku-mediated coupling of DNA cleavage and repair during programmed genome rearrangements in the ciliate Paramecium tetraurelia. PLoS Genetics. 2014;10:e1004552. doi: 10.1371/journal.pgen.1004552.
  51. Martini E, Roche DM, Marheineke K, Verreault A, Almouzni G. Recruitment of phosphorylated chromatin assembly factor 1 to chromatin after UV irradiation of human cells. The Journal of Cell Biology. 1998;143:563–575. doi: 10.1083/jcb.143.3.563.
  52. Marx A, Backes C, Meese E, Lenhof HP, Keller A. EDISON-WMW: Exact Dynamic Programing Solution of the Wilcoxon-Mann-Whitney Test. Genomics, Proteomics & Bioinformatics. 2016;14:55–61. doi: 10.1016/j.gpb.2015.11.004.
  53. Mateo L, González J. Pogo-like transposases have been repeatedly domesticated into CENP-B-related proteins. Genome Biology and Evolution. 2014;6:2008–2016. doi: 10.1093/gbe/evu153.
  54. McGrath CL, Gout JF, Doak TG, Yanagi A, Lynch M. Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence. Genetics. 2014a;197:1417–1428. doi: 10.1534/genetics.114.163287.
  55. McGrath CL, Gout JF, Johri P, Doak TG, Lynch M. Differential retention and divergent resolution of duplicate genes following whole-genome duplication. Genome Research. 2014b;24:1665–1675. doi: 10.1101/gr.173740.114.
  56. Mitra R, Fain-Thornton J, Craig NL. piggyBac can bypass DNA synthesis during cut and paste transposition. The EMBO Journal. 2008;27:1097–1109. doi: 10.1038/emboj.2008.41.
  57. Mitra R, Li X, Kapusta A, Mayhew D, Mitra RD, Feschotte C, Craig NL. Functional characterization of piggyBat from the bat Myotis lucifugus unveils an active mammalian DNA transposon. PNAS. 2013;110:234–239. doi: 10.1073/pnas.1217548110.
  58. Montaño SP, Pigli YZ, Rice PA. The μ transpososome structure sheds light on DDE recombinase evolution. Nature. 2012;491:413–417. doi: 10.1038/nature11602.
  59. Morellet N, Li X, Wieninger SA, Taylor JL, Bischerour J, Moriau S, Lescop E, Bardiaux B, Mathy N, Assrir N, Bétermier M, Nilges M, Hickman AB, Dyda F, Craig NL, Guittet E. Sequence-specific DNA binding activity of the cross-brace zinc finger motif of the piggyBac transposase. Nucleic Acids Research. 2018;46:2660–2677. doi: 10.1093/nar/gky044.
  60. Parfrey LW, Lahr DJ, Knoll AH, Katz LA. Estimating the timing of early eukaryotic diversification with multigene molecular clocks. PNAS. 2011;108:13624–13629. doi: 10.1073/pnas.1110633108.
  61. Pavelitz T, Gray LT, Padilla SL, Bailey AD, Weiner AM. PGBD5: a neural-specific intron-containing piggyBac transposase domesticated over 500 million years ago and conserved from cephalochordates to humans. Mobile DNA. 2013;4:23. doi: 10.1186/1759-8753-4-23.
  62. Prescott DM. The DNA of ciliated protozoa. Microbiological Reviews. 1994;58:233–267. doi: 10.1128/mr.58.2.233-267.1994.
  63. Rajaei N, Chiruvella KK, Lin F, Aström SU. Domesticated transposase Kat1 and its fossil imprints induce sexual differentiation in yeast. PNAS. 2014;111:15491–15496. doi: 10.1073/pnas.1406027111.
  64. Sarkar A, Sim C, Hong YS, Hogan JR, Fraser MJ, Robertson HM, Collins FH. Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences. Molecular Genetics and Genomics. 2003;270:173–180. doi: 10.1007/s00438-003-0909-0.
  65. Sarkar D. Lattice: Multivariate Data Visualization with R. New York: Springer; 2008.
  66. Saveliev SV, Cox MM. Developmentally programmed DNA deletion in Tetrahymena thermophila by a transposition-like reaction pathway. The EMBO Journal. 1996;15:2858–2869. doi: 10.1002/j.1460-2075.1996.tb00647.x.
  67. Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Research. 1990;18:6097–6100. doi: 10.1093/nar/18.20.6097.
  68. Shieh AW, Chalker DL. LIA5 is required for nuclear reorganization and programmed DNA rearrangements occurring during tetrahymena macronuclear differentiation. PLoS One. 2013;8:e75337. doi: 10.1371/journal.pone.0075337.
  69. Singh DP, Saudemont B, Guglielmi G, Arnaiz O, Goût JF, Prajer M, Potekhin A, Przybòs E, Aubusson-Fleury A, Bhullar S, Bouhouche K, Lhuillier-Akakpo M, Tanty V, Blugeon C, Alberti A, Labadie K, Aury JM, Sperling L, Duharcourt S, Meyer E. Genome-defence small RNAs exapted for epigenetic mating-type inheritance. Nature. 2014;509:447–452. doi: 10.1038/nature13318. [DOI] [PubMed] [Google Scholar]
  70. Suhren JH, Noto T, Kataoka K, Gao S, Liu Y, Mochizuki K. Negative Regulators of an RNAi-Heterochromatin Positive Feedback Loop Safeguard Somatic Genome Integrity in Tetrahymena. Cell Reports. 2017;18:2494–2507. doi: 10.1016/j.celrep.2017.02.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Swart EC, Wilkes CD, Sandoval PY, Arambasic M, Sperling L, Nowacki M. Genome-wide analysis of genetic and epigenetic control of programmed DNA deletion. Nucleic Acids Research. 2014;42:8970–8983. doi: 10.1093/nar/gku619. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Timmons L, Court DL, Fire A. Ingestion of bacterially expressed dsRNAs can produce specific and potent genetic interference in Caenorhabditis elegans. Gene. 2001;263:103–112. doi: 10.1016/S0378-1119(00)00579-5. [DOI] [PubMed] [Google Scholar]
  73. Vogt A, Mochizuki K. A domesticated PiggyBac transposase interacts with heterochromatin and catalyzes reproducible DNA elimination in Tetrahymena. PLoS Genetics. 2013;9:e1004032. doi: 10.1371/journal.pgen.1004032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Volff JN. Turning junk into gold: domestication of transposable elements and the creation of new genes in eukaryotes. BioEssays. 2006;28:913–922. doi: 10.1002/bies.20452. [DOI] [PubMed] [Google Scholar]
  75. Weiner AM, Gray LT. What role (if any) does the highly conserved CSB-PGBD3 fusion protein play in Cockayne syndrome? Mechanisms of Ageing and Development. 2013;134:225–233. doi: 10.1016/j.mad.2013.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, Paux E, SanMiguel P, Schulman AH. A unified classification system for eukaryotic transposable elements. Nature Reviews Genetics. 2007;8:973–982. doi: 10.1038/nrg2165. [DOI] [PubMed] [Google Scholar]
  77. Wickham H. Ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag; 2009. [DOI] [Google Scholar]

Decision letter


In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Six domesticated PiggyBac transposases together carry out programmed DNA elimination in Paramecium" for consideration by eLife. Your article has been reviewed by two peer reviewers, including Bernard de Massy as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Jessica Tyler as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors' group previously reported that the domesticated PiggyBac transposase Pgm is required for the programmed genome rearrangement (IES elimination) in Paramecium. In this manuscript, the authors identified and characterised five additional groups of domesticated PiggyBac transposases, PgmL1, 2, 3, 4 and 5, in Paramecium. All these PgmLs are most likely catalytically inactive because they lack the endonuclease "DDD" residues. The authors showed that all PgmLs can be co-precipitated with Pgm when they were co-expressed in insect cells. All the PgmL proteins localised to the new macronuclei, where programmed genome rearrangement occurs. Individual RNAi knockdown of each PgmL (or PgmL group) inhibited: 1) the production of a functional new macronucleus; 2) partially (in PgmL1 KD) or severely (in PgmL2, 3, 4 or 5 KD) the accumulation of Pgm in the new macronucleus; and 3) severely (but not completely) the elimination of IESs. The residual IES elimination events in the absence of PgmL1 or PgmL3 were accompanied by a high frequency of errors in which excision boundaries tend to shift by ~10-11 bp internal to IESs. The authors concluded that PgmLs act together with Pgm to carry out precise IES excisions and that PgmL1 and 3 are important for proper positioning of the complex.

Essential revisions:

Overall, this is an interesting study providing new insight into the molecular components involved in IES excision in Paramecium. The understanding of the mechanistic contribution of the PgmL proteins is still limited, but these results are definitely suitable for publication provided that several features of the phenotypes are clarified.

1) The conclusion on chromatin association on fixed cells is not convincing, as the impact of Triton can vary greatly between proteins. In addition, the authors conclude an exacerbation with Triton (from 50-60% to 85% decrease): what is the statistical validation for this difference?

Well established protocols exist to purify soluble chromatin and chromatin bound fractions (with inclusion of control proteins) and this should be used to establish the phenotype in PgmL depleted cells.

2) Figure 3B: The interaction analysis requires additional validations:

Indicate that WB was done with anti-HA; Add MW markers.

Other issues with this experiment: The use of 500mM salt does not always avoid interaction with DNA as some proteins can bind DNA at high salt, so this is not a convincing argument; The level of purification should be presented, because any contaminant co-purifying with MBP-Pgm could be responsible for interaction with PgmL. How do the authors exclude this possibility? Also, a control should be included using another HA-tagged protein, as HA could interact with Pgm.

3) Figure 5. A PgmKD should be shown (data is in Figure 5—figure supplement 3) to compare with PgmLKD and to evaluate what is the background signal.

The authors suggest that excision events observed in PgmL1 and 3KD are due to partial depletion, and support this conclusion based on the partial KD of PgmL2 which is convincing. The authors also conclude that PgmL1 and 3 have a specific role based on the difference of excision events observed in these KD and the partial PgmL2KD. This conclusion is based on the comparison of size distribution and is not convincing. Although the distributions do appear in part different, it is not possible to tell what is reproducible and statistically significant:

The authors should confirm the specificity of PgmL1 and 3 by analyzing the type of errors observed in partial KD of PgmL2: it is expected that the use of 10bp shifted TA is not observed. This point is essential to validate the model presented in Figure 7.

4) The model figure (Figure 7) is misleading because i) presence of all the PgmLs and Pgm in the same complex has not been demonstrated; ii) it was previously shown that Pgm can form homo-oligomers; iii) all of the PgmLs can directly bind to Pgm; iv) interactions among PgmLs have not been analysed.

The middle section (PgmL1 and 3KD) is also confusing: the authors aim to summarize the consequences of PgmL1 and 3 KD, but since they have distinct properties, they present only the complex in the absence of PgmL1 but arrows refer to 1 and 3.

5) Because PgmL1 and 3KD have lower IES excision frequency, it is not clear what is the actual error rate (after normalization for excision rate)?

6) Insert the information about the general consensus for IES ends (subsection “PgmL1- or PgmL3a&b-depleted cells are prone to IES excision errors”, last paragraph).

7) Figure 6E: it would be interesting to have a plot of the fraction of IESs with errors among excised IESs.

8) Discussion: Concluding that there is a defect in nuclear import is an overstatement and is far from being demonstrated by the data; Discussion about specific roles of PgmL1 and 3 (subsection “IES excision in Paramecium is carried out by multiple partners including Pgm and five novel domesticated PB transposases”, last paragraph) is not so convincing because the KDs are partial.

Other comments

9) Figure 1A: The drawing of the first line (Pfam domain) is not clear: Where is DDE_Tnp_1_7?; The RnaseH fold is in orange? The DDE_Tnp_1-like zinc ribbon is in grey?

What do black vertical lines represent, presumably the catalytic D? Why only two lines in PB? Why none in Pgm?

10) Figure 1B: What does "?" mean?

11) Please explain in the main text what are T0, T5, T11. Impossible to understand for non-aficionados.

12) Transgene expression of the GFP-fusion PGML upon injection into the MAC of vegetative cells. Could the authors briefly explain what happens to these transgenes, if they are integrated? In which genome?

13) Figure 3A: Why silencing of ac has less effect than silencing a alone?

Statistical test for % of progeny.

14) Figure 5A: Legend should indicate that grey bars represent IES that are not significantly excised (if I understood properly). Can you clarify the statement:

"Because variable amounts of DNA from old MAC fragments are present in the samples, the retention scores calculated in each experiment cannot be considered as absolute measurements of IES retention in the new MAC.": What is the range of variation, has it been quantified, if so could this be used to normalize retention score.

This is confusing and no explanation is provided as to why distribution of scores differs between Pgml2 and 4 for instance.

Also, an explanation is needed to understand why, at the same retention score, some IESs are statistically significantly not retained and some are not (grey vs. magenta). Is this due to the number of reads for different IESs? It seems that the different IESs are plotted by bins of retention score windows? Please clarify.

Add KD next to genes.

15) Figure 5C: significantly longer excised IES: which statistical test?

16) Figure 5D: Absence of 46-47 in PgmL: not very clear (in particular for Pgml3 KD): zoom out on the graph. Why is it important? What is the implication?

17) Figure 6E is a plot of partial internal errors not only of the 10 to 11bp shift as written in the text (subsection “PgmL1- or PgmL3a&b-depleted cells are prone to IES excision errors”, last paragraph).

18) Update Morellet 2018 reference.

19) Figure 3—figure supplement S1: What does the cross mean?

20) Subsection “PGML KDs have a genome-wide impact on IES elimination”, last paragraph: Cannot find progeny survival data in Figure 1—figure supplement 3?

21) Why are the correlation coefficients of PgmL2KD with other KD and with the partial KD so low?

22) Some of the supplementary figures are not mentioned in the main text, please revise or remove.

eLife. 2018 Sep 18;7:e37927. doi: 10.7554/eLife.37927.044

Author response


Essential revisions:

[…] 1) The conclusion on chromatin association on fixed cells is not convincing, as the impact of Triton can vary greatly between proteins. In addition, the authors conclude an exacerbation with Triton (from 50-60% to 85% decrease): what is the statistical validation for this difference?

We first would like to emphasize that all experiments address the nuclear localization of the same protein (Pgm) under different RNAi conditions. We have now included statistical analyses (Mann-Whitney-Wilcoxon test) of the Pgm immunofluorescence nuclear signals displayed in Figure 4C and 4D. We also present in Figure 4—figure supplement 2D a statistical comparison of the mean normalized values obtained with and without Triton pre-extraction. This analysis confirms that a statistically significant decrease of Pgm nuclear signal is observed following Triton pre-extraction in PGML2/3/4/5 RNAi relative to conditions omitting the pre-extraction step, while Triton treatment has no significant effect on Pgm nuclear localization in PGML1 RNAi.

Well established protocols exist to purify soluble chromatin and chromatin bound fractions (with inclusion of control proteins) and this should be used to establish the phenotype in PgmL depleted cells.

We agree with the reviewers that definite proof of a defect in Pgm association with chromatin in PGML RNAi requires additional biochemical evidence.

However, we must stress here that there are no well-established protocols in Paramecium to extract purified soluble chromatin and chromatin-bound fractions. Neither are there, to our knowledge, available antibodies that would recognize a control Paramecium chromatin-bound protein. Ciliate and mammalian proteins are generally poorly conserved (Author response image 1), which renders the use of commercial antibodies only rarely feasible.

Author response image 1. Paramecium proteins are highly divergent relative to their mammalian orthologs.


Percent identity between orthologs was calculated using the InParanoid tool from Cildb (Arnaiz et al., 2009, Database 2000 bap022; Arnaiz et al., 2014, Cilia 3:9; http://cildb.cgm.cnrs-gif.fr/). Dotted lines indicate the median of each distribution. Overall, Paramecium proteins share only 35% identity (median) with mammalian proteins, while mouse and human proteins share 95% identity with each other.

We nonetheless prepared nuclear extracts from control and PgmL-depleted autogamous cells and tried two procedures to extract endogenous Pgm. We performed a differential salt fractionation (Hermann et al., 2017, Bio Protoc 7 pii: e2175), as well as an extraction using a modified Wuarin-Schibler buffer (MWS; Gagnon et al., 2014, Nat Protoc 9: 2045-2060). As shown for control cells in Author response image 2, Pgm remained in the insoluble fraction under all extraction conditions tested, whereas histone H3 started to be solubilized at 300mM NaCl in the salt fractionation protocol (lane 3). This suggests that Pgm is insoluble under these conditions, consistent with our independent observation that proper Pgm detection in total extracts or in purified nuclei requires three-minute boiling in 5% SDS, prior to denaturation in Laemmli buffer. We observed that Pgm behaved similarly in PgmL-depleted cells. For this reason, we cannot – for the moment – address biochemically the chromatin-associated status of Pgm.

Author response image 2. Attempts to extract endogenous Pgm from chromatin in control cells.


Paramecium nuclei were purified by low-speed centrifugation of autogamous cell lysates, as described (Arnaiz et al., 2012). Immunoblotting was performed using α-Pgm 2659-GP antibodies for Pgm detection (Dubois et al., 2017) and commercial antibodies for the detection of histone H3 (Merck Millipore # 07-690).

Accordingly, we have removed our initial statement that Pgm is more loosely bound to chromatin in the absence of PgmL proteins from the Results section and only refer to its nuclear localisation.

2) Figure 3B: The interaction analysis requires additional validations:

Indicate that WB was done with anti-HA; Add MW markers.

The use of anti-HA antibodies (α-HA) has been mentioned in the figure legend and indicated in Figure 3B itself. The full-size blot with all lanes and molecular weight markers has been included in Figure 3—figure supplement 3B.

Other issues with this experiment: The use of 500mM salt does not always avoid interaction with DNA as some proteins can bind DNA at high salt, so this is not a convincing argument;

We can rule out that Pgm binds DNA at 500 mM NaCl. Indeed, we performed a control DRaCALA DNA binding assay (Differential Radial Capillary Action of Ligand Assay, see Donaldson et al., 2012), which we have included in Figure 3—figure supplement 3A. While Pgm clearly binds DNA at low NaCl concentration (50 or 100 mM), no DNA binding was detected at 250 mM NaCl and above. We are, therefore, confident that co-purification of Pgm and PgmLs cannot be explained by simultaneous binding of both proteins to the same DNA molecule. We have included these clarifications in the Results section.

The level of purification should be presented, because any contaminant co-purifying with MBP-Pgm could be responsible for interaction with PgmL. How do the authors exclude this possibility?

We are aware that a 100% purity level has not been reached in the co-precipitates and agree with the reviewers that the presence of a contaminating partner cannot formally be excluded. We would like to point out, however, that all assays were performed in heterologous insect cell extracts containing no other Paramecium protein than Pgm and PgmLs. We consider it unlikely that an insect contaminating protein strongly bridges Pgm and PgmL.

Also, a control should be included using another HA-tagged protein, as HA could interact with Pgm.

The requested negative control has been provided for review, but has not been included here since the experiment was designed for another study.

3) Figure 5. A PgmKD should be shown (data is in Figure 5—figure supplement 3) to compare with PgmLKD and to evaluate what is the background signal.

The distribution obtained in a previously published PGM KD (Arnaiz et al., 2012) has been included in Figure 5A. This indeed corresponds to the data shown in Figure 5—figure supplement 3.

The authors suggest that excision events observed in PgmL1 and 3KD are due to partial depletion, and support this conclusion based on the partial KD of PgmL2 which is convincing.

There might have been some misunderstanding here. Residual excision activity can indeed be observed in a partial PGML2 KD. We believe, however, that partial depletion of PgmL1 or PgmL3 cannot be the sole explanation for the error-prone IES excision events observed in PGML1 or PGML3 KDs. Indeed, over-representation of one particular type of errors (i.e. partial internal) is clearly specific for these two KDs and is not observed in partial PGML2 KDs. This analysis is displayed in Figure 6—figure supplement 2.

The authors also conclude that PgmL1 and 3 have a specific role based on the difference of excision events observed in these KD and the partial PgmL2KD. This conclusion is based on the comparison of size distribution and is not convincing. Although the distributions do appear in part different, it is not possible to tell what is reproducible and statistically significant:

The authors should confirm the specificity of PgmL1 and 3 by analyzing the type of errors observed in partial KD of PgmL2: it is expected that the use of 10bp shifted TA is not observed. This point is essential to validate the model presented in Figure 7.

The conclusion that PgmL1 and 3 have a specific role in IES excision is actually based on two observations:

i) In PGML1 or PGML3 KDs, the population of efficiently excised IESs indeed shows a significantly different size distribution relative to strongly retained IESs, with an enrichment in >77-bp IESs. A statistical comparison of the size distributions of the two IES populations is presented in Figure 5C (Mann-Whitney-Wilcoxon test).

ii) Figure 6 and Figure 6—figure supplement 2 show that partial internal excision errors are specifically over-represented among all excision errors in PGML1 or PGML3 KDs. To address the reviewers’ concern, we have now included an additional panel (panel B) in Figure 6—figure supplement 2 to show that no specific 10-bp shift is observed in partial PGML2 KDs.

4) The model figure (Figure 7) is misleading because i) presence of all the PgmLs and Pgm in the same complex has not been demonstrated; ii) it was previously shown that Pgm can form homo-oligomers; iii) all of the PgmLs can directly bind to Pgm; iv) interactions among PgmLs have not been analysed.

The middle section (PgmL1 and 3KD) is also confusing: the authors aim to summarize the consequences of PgmL1 and 3 KD, but since they have distinct properties, they present only the complex in the absence of PgmL1 but arrows refer to 1 and 3.

The reviewers are right. We are aware that more work is needed to provide deeper insight into the organization and stoichiometry of PgmL subunits in the Pgm-associated complex. We have prepared a simplified version of Figure 7, in which no assumption is made with regard to the exact number of PgmLs present in the complex. In line with previously published data (Dubois et al., 2017) and with known properties of the canonical PiggyBac transposase from Trichoplusia ni (Jin et al., 2017), we have drawn Pgm as a dimer positioned at each TA cleavage site. The revised model has been focused on the control of DNA cleavage at IES ends. As stated above, we agree with other reviewers’ comments (see reply to point 1) that speculating on the role of PgmL in stabilizing Pgm association with chromatin is premature at this stage. The first paragraph of the Discussion has been modified accordingly.

5) Because PgmL1 and 3KD have lower IES excision frequency, it is not clear what is the actual error rate (after normalization for excision rate)?

Due to experimental limitations, it is not possible to calculate IES excision rates precisely. Indeed, the DNA used for high throughput sequencing is prepared from nuclear preparations enriched in developing MACs, but fragments of the old MAC always contaminate these preparations. Thus, in our sequencing datasets, IES- reads do not exclusively represent de novo excision events that took place in the developing MACs. Furthermore, because the contamination is variable from one sample to the other, no simple correction can be applied to our data to accurately remove the contribution of the old MAC.
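
The effect of old MAC contamination on retention scores can be illustrated with a toy mixing model. The sketch below is not taken from the study: it assumes that a retention score is the fraction of IES+ reads among all reads covering an IES, that old MAC DNA contributes only IES- reads, and that read coverage is proportional to the amount of DNA from each source. The function name, parameter names and numbers are hypothetical.

```python
def observed_retention(true_retention, old_mac_fraction):
    """Retention score measured on a mix of new-MAC and old-MAC DNA.

    Assumptions (illustrative only): old-MAC DNA yields only IES- reads,
    and coverage is proportional to the amount of DNA from each source.
    """
    if not 0.0 <= true_retention <= 1.0:
        raise ValueError("true_retention must be in [0, 1]")
    if not 0.0 <= old_mac_fraction < 1.0:
        raise ValueError("old_mac_fraction must be in [0, 1)")
    new_mac_fraction = 1.0 - old_mac_fraction
    # IES+ reads can only come from the developing (new) MAC.
    ies_plus = new_mac_fraction * true_retention
    # IES- reads come from excised new-MAC copies and from all old-MAC copies.
    ies_minus = new_mac_fraction * (1.0 - true_retention) + old_mac_fraction
    return ies_plus / (ies_plus + ies_minus)


# Hypothetical contamination levels applied to a fully retained IES.
for fraction in (0.1, 0.3, 0.5):
    print(fraction, round(observed_retention(1.0, fraction), 3))
```

Under these assumptions the observed score is simply the true score scaled by the new MAC fraction, so the same underlying retention gives different measured values in preparations with different contamination, and the true value cannot be recovered unless the contamination level is known.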

6) Insert the information about the general consensus for IES ends (subsection “PgmL1- or PgmL3a&b-depleted cells are prone to IES excision errors”, last paragraph).

Done.

7) Figure 6E: it would be interesting to have a plot of the fraction of IESs with errors among excised IESs.

This information was available in our initial Figure 6—figure supplement 1. To make it more conspicuous, we have prepared a new version of this figure, which now includes the requested plot as an additional panel (panel B). We explicitly refer to the data in the main text (subsection “PgmL1- or PgmL3a&b-depleted cells are prone to IES excision errors”).

8) Discussion: Concluding that there is a defect in nuclear import is an overstatement and is far from being demonstrated by the data; Discussion about specific roles of PgmL1 and 3 (subsection “IES excision in Paramecium is carried out by multiple partners including Pgm and five novel domesticated PB transposases”, last paragraph) is not so convincing because the KDs are partial.

We agree that not enough evidence can be provided for a defect in Pgm nuclear import. We have removed this suggestion from the Results and Discussion sections. We have also removed all considerations about the protein import/export balance from the model of Figure 7, which now puts more emphasis on the biochemical properties of the Pgm-associated complex.

As stated above (reply to Point 3), our work clearly shows that PGML1 or PGML3 KDs induce specific phenotypes that are not observed in partial PGML2 KDs and, therefore, cannot simply be explained by partial depletion. We have put more emphasis on this point in the Discussion.

Other comments

9) Figure 1A: The drawing of the first line (Pfam domain) is not clear: Where is DDE_Tnp_1_7?; The RnaseH fold is in orange? The DDE_Tnp_1-like zinc ribbon is in grey?

What do black vertical lines represent, presumably the catalytic D? Why only two lines in PB? Why none in Pgm?

The Pfam domain DDE_Tnp_1_7 is shown as a bipartite orange domain, with the RNaseH fold corresponding to its right part. The DDE_Tnp_1-like zinc ribbon is in grey. These explanations are now provided in the figure legend. We completed the figure by adding vertical bars to represent all conserved D residues.

10) Figure 1B: What does "?" mean?

The question mark indicates that the expected α4 helix of the RNaseH fold domain could not be predicted using the PSIPRED secondary structure program. This is now specified in the legend.

11) Please explain in the main text what are T0, T5, T11. Impossible to understand for non-aficionados.

The explanation was mentioned in the Materials and methods section. We moved it to the main text and completed the legend of Figure 2 by specifying that all time-points are in hours. We also explained in the Introduction that the MAC “is fragmented and destroyed at each sexual cycle” before a new MAC develops from a copy of the zygotic nucleus.

12) Transgene expression of the GFP-fusion PGML upon injection into the MAC of vegetative cells. Could the authors briefly explain what happens to these transgenes, if they are integrated? In which genome?

We have added the following explanation to the legend of Figure 2—figure supplement 2:

“Following microinjection into the MAC of vegetative cells, transgenes are capped by addition of telomeric repeats at their ends (concatemers may form prior to telomere addition): they may then integrate into the somatic genome (through a mechanism that remains to be studied) or be maintained throughout vegetative growth as autonomously replicating mini-chromosomes (Gilley et al. 1988; Bourgain and Katinka 1991; Katinka and Bourgain 1992). […] Transgenes persist in old MAC fragments throughout autogamy and continue to be expressed if controlled by proper transcription signals, but are lost in the next sexual generation, after complete destruction of the old MAC”.

13) Figure 3A: Why silencing of ac has less effect than silencing a alone?

Statistical test for % of progeny.

Within the PGML3 group, PGML3a shows by far the highest expression level (Figure 2A), which is the reason why silencing a leads to a strong – although partial – phenotype. PGML3c has a low expression level and its specific contribution to total PgmL3 amounts is probably very small. In addition, the double feeding procedure that we used to silence a and c consists of feeding Paramecium cells with a 1:1 mix of induced bacteria producing dsRNA homologous to each gene. In practice, this leads to a 2-fold dilution of PGML3a-inducing bacteria, which might explain why silencing 3ac appears to have less effect than silencing 3a alone (p=0.053 in a two-sample t-test). The purpose of Figure 3A is to show that silencing of the two most expressed genes within each PGML group is required to completely abolish progeny survival. For the sake of clarity, we have not systematically mentioned the results of the statistical analysis of % progeny with functional new MAC.

14) Figure 5A: Legend should indicate that grey bars represent IES that are not significantly excised (if I understood properly).

We indicated in the legend that grey bars represent the distribution of all IESs over all possible retention scores ranging from 0 to 1 (by bins of 0.025). When relevant, the distribution of non-significantly retained (=efficiently excised) IESs is superimposed in magenta.

Can you clarify the statement:

"Because variable amounts of DNA from old MAC fragments are present in the samples, the retention scores calculated in each experiment cannot be considered as absolute measurements of IES retention in the new MAC.": What is the range of variation, has it been quantified, if so could this be used to normalize retention score.

This is confusing and no explanation is provided as to why distribution of scores differs between Pgml2 and 4 for instance.

As stated above (reply to point 5) the distributions of IES retention scores may shift along the X-axis from one experiment to the other, depending on the variable amounts of contaminating old MAC DNA present in our samples. The variability has different causes that are very difficult to control, among which the physiological state of cells (strong starvation conditions accelerate the loss of old MAC fragments) and the fragment size at the time of cell fractionation (low speed centrifugation will selectively eliminate smaller fragments). Guérin et al. (BMC Genomics 2017, 18:327) used an additional flow cytometry sorting step to enrich preparations of developing MACs from Pgm-depleted cells and obtained higher IES retention scores in their 98% pure MAC preparation than in the “non-sorted” sample shown in Figure 5A. This procedure will certainly improve future analyses, without completely eliminating the problem of quantifying the level of contamination by old MAC DNA in different preparations. This makes normalization still problematic at this stage.

Also, an explanation is needed to understand why, at the same retention score, some IESs are statistically significantly not retained and some are not (grey vs. magenta). Is this due to the number of reads for different IESs?

As explained in the Materials and methods, our statistical analysis was performed by comparing each PGML KD dataset to a control obtained following autogamy in standard culture medium. For each IES, the number of reads corresponding to IES+ and IES- molecules were compared between the two samples and a statistical test for the significance of each boundary score was performed using the published ParTIES package (Denby Wilkes et al., 2015). Given that the number of reads covering an IES differs between samples and from one IES to the other, the significance of IES retention may differ for IESs showing similar retention scores.
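
The dependence of significance on read coverage can be sketched with a small example. The code below does not reproduce the ParTIES test: as a stand-in, it uses a one-sided exact binomial test against a hypothetical background retention rate, and it assumes that a retention score is the fraction of IES+ reads among all reads covering an IES. All read counts and the background rate are invented for illustration.

```python
from math import comb


def retention_score(ies_plus, ies_minus):
    """Fraction of IES+ reads among all reads covering the IES (assumed definition)."""
    return ies_plus / (ies_plus + ies_minus)


def upper_tail_pvalue(ies_plus, ies_minus, background_rate):
    """P(X >= ies_plus) for X ~ Binomial(n, background_rate), n = total reads.

    Stand-in test for illustration; not the test implemented in ParTIES.
    """
    n = ies_plus + ies_minus
    return sum(
        comb(n, k) * background_rate**k * (1.0 - background_rate) ** (n - k)
        for k in range(ies_plus, n + 1)
    )


# Two hypothetical IESs with the same retention score but different coverage.
low_coverage = (1, 9)     # 1 IES+ read out of 10
high_coverage = (10, 90)  # 10 IES+ reads out of 100
background = 0.05         # hypothetical retention rate in the control

for ies_plus, ies_minus in (low_coverage, high_coverage):
    score = retention_score(ies_plus, ies_minus)
    p_value = upper_tail_pvalue(ies_plus, ies_minus, background)
    print(f"reads={ies_plus + ies_minus} score={score:.2f} p={p_value:.3f}")
```

Both IESs have a score of 0.10, yet only the one covered by 100 reads falls below a 0.05 threshold in this toy test, which is the kind of behaviour that produces grey and magenta IESs within the same retention score bin.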

It seems that the different IES are plotted by bins of retention score windows? Please clarify?

This is right. The IRS bins are equal to 0.025. This is now clearly stated in the legend to Figure 5.

Add KD next to genes.

Done.

15) Figure 5C: significantly longer excised IES: which statistical test?

We performed Mann-Whitney-Wilcoxon statistical tests to compare IES size distributions. This is now stated in the figure legend.
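
For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Mann-Whitney-Wilcoxon comparison of two size distributions. It is not the analysis code used in the study: it relies on the normal approximation with a tie correction and no continuity correction, which is only reasonable for fairly large samples, and the IES sizes shown are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney-Wilcoxon test (normal approximation, tie-corrected).

    Returns (U statistic of sample_a, two-sided p-value).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted(
        [(value, 0) for value in sample_a] + [(value, 1) for value in sample_b]
    )
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        block_size = j - i
        tie_term += block_size**3 - block_size
        in_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += midrank * in_a
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0  # all values identical: no evidence of a shift
    z = (u_a - mean_u) / sqrt(var_u)
    return u_a, erfc(abs(z) / sqrt(2.0))


# Hypothetical IES sizes (bp) for two groups of IESs.
retained_sizes = [30, 35, 40, 45, 50]
excised_sizes = [80, 90, 100, 110, 120]
u_stat, p_value = mann_whitney_u(retained_sizes, excised_sizes)
print(u_stat, p_value)
```

In this example U is 0 for the first sample because each of its values is smaller than every value in the second sample. With samples this small, an exact computation of the null distribution would be preferable to the normal approximation.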

16) Figure 5D: Absence of 46-47 in PgmL: not very clear (in particular for Pgml3 KD): zoom out on the graph. Why is it important? What is the implication?

We agree that this is a minor point. Since we have no straightforward interpretation for this observation, we have removed this statement from the Results section.

17) Figure 6E is a plot of partial internal errors not only of the 10 to 11bp shift as written in the text (subsection “PgmL1- or PgmL3a&b-depleted cells are prone to IES excision errors”, last paragraph).

This is true sensu stricto. However, as shown in Figure 6D, partial internal errors consist mostly of excision events that use a TA shifted by 10 to 11 bp. We changed the sentence to “Finally, we noticed that error-prone IESs, for which an internal TA (mostly shifted by 10 to 11 bp) is used in erroneous excision events, follow a different size distribution…”

18) Update Morellet 2018 reference.

Done.

19) Figure 3—figure supplement S1: What does the cross mean?

The cross indicates that blue inserts can target the two paralogs simultaneously. This has been included in the legend.

20) Subsection “PGML KDs have a genome-wide impact on IES elimination”, last paragraph: Cannot find progeny survival data in Figure 1—figure supplement 3?

Thanks for the notice: we were referring to Figure 4—figure supplement 3. Sorry for the confusion.

21) Why are the correlation coefficients of PgmL2KD with other KD and with the partial KD so low?

PGML2 KD induces a homogeneous phenotype at the genome-wide level, with similar retention scores for all IESs. The absence of correlation indicates that variations around the mean retention score are limited and probably attributable to statistical noise.

22) Some of the supplementary figures are not mentioned in the main text, please revise or remove.

Thanks for drawing our attention to this point. Some supplementary figures were only mentioned in the Supplementary Information file. They are now all referred to in the main text.

Associated Data


    Supplementary Materials

    Supplementary file 1. MUSCLE alignment of the transposase core domains of ciliate domesticated PB transposases and other PB transposases.
    elife-37927-supp1.pdf (149.1KB, pdf)
    DOI: 10.7554/eLife.37927.024
    Supplementary file 2. Table of Pgm and PgmL proteins encoded by published Paramecium genomes and their ParameciumDB accession numbers.
    elife-37927-supp2.xlsx (14.5KB, xlsx)
    DOI: 10.7554/eLife.37927.025
    Supplementary file 3. Sequences of the cysteine-rich domains used for the alignment shown in Figure 1—figure supplement 1.
    elife-37927-supp3.rtf (56.4KB, rtf)
    DOI: 10.7554/eLife.37927.026
    Supplementary file 4. Sequences of the transposase core domains used for the alignment shown in Supplementary file 1.
    elife-37927-supp4.rtf (78KB, rtf)
    DOI: 10.7554/eLife.37927.027
    Supplementary file 5. Sequence of the synthetic PGML genes used for protein production in insect cells.
    elife-37927-supp5.rtf (60.8KB, rtf)
    DOI: 10.7554/eLife.37927.028
    Supplementary file 6. Analysis of post-autogamous progeny in small-scale PGML knockdowns.
    elife-37927-supp6.xlsx (25.6KB, xlsx)
    DOI: 10.7554/eLife.37927.029
    Supplementary file 7. Analysis of post-autogamous progeny in middle- and large-scale PGML knockdowns.
    elife-37927-supp7.xlsx (13.8KB, xlsx)
    DOI: 10.7554/eLife.37927.030
    Supplementary file 8. DNA-seq datasets from ENA project PRJEB24171 (this study).
    elife-37927-supp8.xlsx (13.5KB, xlsx)
    DOI: 10.7554/eLife.37927.031
    Supplementary file 9. Analysis of IES excision reads in PGM and PGML knockdowns.
    elife-37927-supp9.xlsx (17.3KB, xlsx)
    DOI: 10.7554/eLife.37927.032
    Transparent reporting form
    DOI: 10.7554/eLife.37927.033

    Data Availability Statement

    All DNA-seq datasets generated in this study were deposited in the European Nucleotide Archive under the Project Accession PRJEB24171. Reference genomes and IESs are available through ParameciumDB (http://paramecium.i2bc.paris-saclay.fr).

    The following dataset was generated:

    Bischerour J, Bhullar S, Denby Wilkes C, Régnier V, Mathy N, Dubois E, Singh A, Swart E, Arnaiz O, Sperling L, Nowacki M, Bétermier M. 2018. DNA-seq of PGMLs knocked down cells. http://www.ebi.ac.uk/ena/data/view/PRJEB24171. Publicly available at the European Nucleotide Archive (accession no: PRJEB24171).

    The following previously published datasets were used:

    Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury JM, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M, Sperling L. 2012. DNA-seq of PGM knocked down cells. http://www.ebi.ac.uk/ena/data/view/ERA137444. Publicly available at the European Nucleotide Archive (accession no: ERA137444).

    Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury JM, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M, Sperling L. 2012. DNA-seq strain 51MAC. http://www.ebi.ac.uk/ena/data/view/ERA137420. Publicly available at the European Nucleotide Archive (accession no: ERA137420).


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
