eLife. 2026 Mar 10;14:RP109950. doi: 10.7554/eLife.109950

Defining the chromatin-associated protein landscapes on Trypanosoma brucei repetitive elements using synthetic TALE proteins

Roberta Carloni 1,2, Tadhg Devlin 1,2,†, Pin Tong 1, Christos Spanos 1, Tanya Auchynnikava 1,§, Juri Rappsilber 1,3, Keith R Matthews 2, Robin C Allshire 1
Editors: Yamini Dalal4, Yamini Dalal5
PMCID: PMC12975129  PMID: 41805585

Abstract

Kinetoplastids, such as Trypanosoma brucei, are eukaryotes that likely separated from the main lineage at an exceptionally early point in evolution. Consequently, many aspects of kinetoplastid biology differ significantly from other eukaryotic model systems, including yeasts, plants, worms, flies, and mammals. As in many eukaryotes, the T. brucei genome contains repetitive elements at various chromosomal locations, including centromere- and telomere-associated repeats and interspersed retrotransposon elements. T. brucei also contains intermediate-sized and mini-chromosomes that harbour abundant 177 bp repeat arrays and 70 bp repeat elements implicated in Variable Surface Glycoprotein (VSG) gene switching. In many eukaryotes, repetitive elements are assembled in specialised chromatin such as heterochromatin; however, apart from centromere- and telomere-associated repeats, little is known about chromatin-associated proteins that decorate these and other repetitive elements in kinetoplastids. Here, we utilise affinity selection of synthetic TALE DNA binding proteins designed to target specific repeat elements to identify enriched proteins by proteomics. Validating the approach, a telomere repeat binding TelR-TALE identifies many proteins previously implicated in telomere function. Furthermore, the 70R-TALE designed to bind 70 bp repeats indicates that proteins involved in DNA repair are enriched on these elements that reside adjacent to VSG genes. Interestingly, the 177 bp repeat binding 177R-TALE enriches for many kinetochore proteins, suggesting that intermediate-sized and mini-chromosomes assemble kinetochores related in composition to those located on the main megabase chromosomes. This provides a first insight into the chromatin landscape of repetitive regions of the trypanosome genome with relevance for their mechanisms of chromosome integrity, immune evasion, and cell replication.

Research organism: Other

Introduction

Repetitive sequences are scattered across the genomes of many eukaryotes, where they define various functional chromosomal elements (Bringaud et al., 2009; Feschotte, 2008; Kazazian, 2004; Slotkin and Martienssen, 2007). For example, telomeres are generally composed of TG-rich repeats, added by the reverse transcriptase activity of telomerase, which uses its associated RNA as a template (Li, 2021; Pfeiffer and Lingner, 2013), whereas centromere regions often contain extensive tandem arrays of non-conserved repetitive sequences (Allshire and Karpen, 2008; Miga and Alexandrov, 2021; Sullivan and Sullivan, 2020; Thakur et al., 2021). In many eukaryotes, such arrays frequently provide a substrate for constitutive heterochromatin formation through di/tri-methylation of lysine 9 on histone H3 on resident nucleosomes. In addition, repetitive centromeric repeat arrays are associated with the assembly of specialised nucleosomes containing the centromere-specific histone H3 variant, generally known as CENP-A or cenH3 (Allshire and Karpen, 2008; Talbert and Henikoff, 2020). CENP-A nucleosomes form the foundation for kinetochore assembly, which mediates accurate chromosome segregation (Allshire and Karpen, 2008). Other repetitive sequences, such as transposable elements or their remnants, can alter – or have been co-opted to regulate – the expression of nearby genes (Bourque et al., 2018; Fueyo et al., 2022). In many eukaryotes, heterochromatin forms clusters in the nucleus that are generally located at the nuclear periphery or adjacent to nucleoli (Bizhanova and Kaufman, 2021; van Steensel and Belmont, 2017).

Kinetoplastids represent a distinct branch of protozoan eukaryotes within the Euglenozoa that diverged from the main eukaryotic lineage early during their evolution (Cavalier-Smith, 2010). As a result, kinetoplastids are distinct from most other eukaryotes in which cellular mechanisms are intensively studied, including yeasts, fungi, plants, nematodes, insects, and mammals. Many kinetoplastids are parasites that cause diseases in humans and economically important livestock. Trypanosoma brucei, for example, is prevalent in sub-Saharan Africa, where it is transmitted by tsetse flies and causes human African trypanosomiasis and Nagana in cattle (Morrison et al., 2023). Other kinetoplastid parasites that cause human diseases in the tropics include Trypanosoma cruzi (Chagas disease) and Leishmania spp. (leishmaniasis) (Stuart et al., 2008). Despite their divergence from most eukaryotes, the genomes of kinetoplastids contain a variety of repetitive sequences. The diploid genome of the commonly used laboratory T. brucei Lister 427 strain has recently been re-characterised with advanced genome assembly methods. The genome contains two homologues for each of the 11 large chromosomes, ranging in size from 900 to 4600 kb, 5–6 intermediate chromosomes, and ~100 mini-chromosomes (Cosentino et al., 2021; Müller et al., 2018; Rabuffo et al., 2024).

All T. brucei chromosomes are linear, and each end terminates with arrays of telomeric (TTAGGG)n repeats that are added by telomerase (Sandhu and Li, 2017). T. brucei exhibits the generally well-defined process of Variable Surface Glycoprotein (VSG) gene switching, which allows a proportion of parasites to evade the host immune system at any given time (Barcons-Simon et al., 2023). Most of the 2634 detected VSG genes are not expressed and reside in arrays in sub-telomeric regions, with others residing on mini-chromosomes. Only one VSG gene is expressed at any time, and only from one of the estimated 15 telomere adjacent bloodstream expression sites (BES) (Cosentino et al., 2021). The non-expressed VSG genes provide a library of potential alternative VSGs, so that the parasite has almost limitless potential to vary its protective coat. Further variation in the expressed VSG protein repertoire can be generated by recombination events between VSG genes and VSG pseudogenes, which comprise approximately 80% of the overall gene repertoire (Mugnier et al., 2015; Cosentino et al., 2021). Non-expressed VSG genes are exchanged with VSG genes residing in expression sites using recombination-directed processes that act on or near 70 bp repeats residing upstream of the resident VSG gene at each BES (Boothroyd et al., 2009; Thivolle et al., 2021). Apart from telomeric (TTAGGG)n repeats at their ends and one or two VSG genes, mini-chromosomes are comprised of tandem arrays of 177 bp repeats, which are also present on the poorly characterised intermediate-sized chromosomes (Ersfeld, 2011; Sloof et al., 1983; Figure 1A). The function of these 177 bp repeats is unknown, but mini-chromosomes have been shown to be maintained with high stability through mitotic cell divisions, suggesting that a mechanism is in place to ensure their segregation with fidelity to daughter cells (Ersfeld and Gull, 1997; Wickstead et al., 2003). 
The main 11 megabase-sized chromosomes have been shown to assemble evolutionarily unconventional kinetochores composed of 25 kinetoplastid kinetochore proteins (KKT1-25) that mediate their accurate mitotic segregation and are distinctly different from those of other eukaryotes (Akiyoshi and Gull, 2014; D’Archivio and Wickstead, 2017; Nerusheva et al., 2019). ChIP-seq has shown that, on the main megabase-sized chromosomes, kinetochores assemble on different DNA sequences; on some chromosomes, kinetochores coincide with tandem arrays of CIR147 repeats or related repeat elements (Akiyoshi and Gull, 2014; Echeverry et al., 2012; Obado et al., 2007). CIR147 repeats produce non-coding transcripts that are processed by Dicer into siRNAs and loaded into Argonaute/TbAGO1 (Tschudi et al., 2012). In addition, SLAC and ingi-related retrotransposons are dispersed across the T. brucei genome (Bringaud et al., 2009) and are also transcribed and processed into TbAGO1-associated siRNAs (Tschudi et al., 2012).

Figure 1. T. brucei repetitive elements, TALE design, and target site number.

(A) Distinct repetitive elements are present at various locations on T. brucei chromosomes. (B) Construct designed to express the indicated TALE proteins that bind 15 bp target sequences fused to 3xTy1 and YFP tags when integrated at the β-tubulin locus. The Aldolase 5’UTR and PAD1 3’UTR regulate expression levels. A Bleomycin resistance marker gene provides Phleomycin selection (not shown). (C) Predicted number of target sequences for each TALE in the Lister 427 genome.

Figure 1—figure supplement 1. TALE-YFP construction and sequencing reveals rearranged TALE domain in TelR-TALE-YFP following integration in the T. brucei genome.

(A) Tetramer and trimer modules used to build plasmids designed to express each of six TALE-YFP when integrated in the T. brucei genome. (B) Each of the complete TALE plasmid constructs is expected to contain four modules comprising a complete TALE domain of the same size in all. (C) Sequencing of five assembled TALE-YFP constructs revealed that they had the expected layout following integration at the β-tubulin locus in T. brucei. (D) PCR of genomic DNA extracted from 427 control cells and 427 cells with the TelR-TALE, 177R-TALE, 70R-TALE, ingiR-TALE, or NonR-TALE constructs integrated at the β-tubulin locus confirms correct size (2 kb) (Figure 1—figure supplement 1—source data 1). PCR product for TelR-TALE is shorter than expected (1.5 kb). Position of left (LP) and right (RP) primer pair, common to all TALE-YFP constructs, is indicated. (E) Following integration at the β-tubulin locus, sequencing revealed that the TALE DNA binding domain of TelR-TALE had rearranged, explaining the shorter TelR-TALE protein made by T. brucei 427 TelR-TALE expressing cells.
Figure 1—figure supplement 1—source data 1. Original DNA-stained agarose gel for Figure 1—figure supplement 1D indicating the relevant PCR bands from T. brucei cells containing the indicated synthetic TALE-YFP protein expression constructs.

Figure 1—figure supplement 2. Synthetic TALE proteins are expressed in Lister 427 T. brucei bloodstream-form cells, but the TelR-TALE protein is shorter than expected.

Protein extracted from 427 control cells and 427 cells with constructs designed to express 177R-TALE, 70R-TALE, 147R-TALE, ingiR-TALE, NonR-TALE, and TelR-TALE fused to Ty and YFP tags integrated at the β-tubulin locus was subject to western analysis using either: (A) monoclonal mouse anti-GFP (anti-YFP, Figure 1—figure supplement 2—source data 1 and 2) or (B) anti-BB2 (anti-Ty, Figure 1—figure supplement 2—source data 3). The TelR-TALE protein is smaller than the 177R-TALE, 70R-TALE, 147R-TALE, ingiR-TALE, and NonR-TALE proteins.
Figure 1—figure supplement 2—source data 1. Original anti-YFP western for Figure 1—figure supplement 2A (left), indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.
Figure 1—figure supplement 2—source data 2. Original anti-YFP western for Figure 1—figure supplement 2A (left), indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.
Figure 1—figure supplement 2—source data 3. Original anti-Ty western for Figure 1—figure supplement 2B, indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.

Figure 1—figure supplement 3. Growth assays of cells expressing TelR-TALE-GFP, 177R-TALE-GFP, or ingi-TALE-GFP, and their cellular localisation.

(A) Indicated cell cultures were seeded, cell number monitored and diluted every 2 days. (B) DAPI and anti-GFP staining of fixed cells expressing indicated synthetic TALE-GFP fusion proteins. Bar = 10 μm.

The key hallmarks of eukaryotic heterochromatin, di/tri-methylation of histone H3 on lysine 9 (H3K9) or lysine 27 (H3K27), cannot be detected in T. brucei or other kinetoplastids because their histones, including H3, are particularly divergent, rendering useless most existing antibody reagents used for histone post-translational modification (PTM) analyses in other eukaryotes (Deák et al., 2023; Figueiredo et al., 2009). Thus, it is not known which, if any, other modified or unmodified residues on T. brucei histones might nucleate repressive chromatin that could be regarded as heterochromatin. However, mass spectrometry has identified a plethora of residues in T. brucei and T. cruzi histones that exhibit various PTMs (de Lima et al., 2020; Kraus et al., 2020; Maree et al., 2022; Picchi et al., 2017). Some of these PTMs may be involved in forming distinct chromatin structures on repetitive elements through the recruitment of specific proteins, analogous to chromodomain protein recruitment via H3K9 or H3K27 methylation in other eukaryotes (Allshire and Madhani, 2018).

To characterise the chromatin context and the possible function of T. brucei repetitive elements, we applied an unbiased proteomics-based approach. We exploited synthetic DNA binding TALE (transcription activator-like effectors) fusion protein expression in T. brucei to bind to particular repetitive sequences and, following affinity selection, identify specific factors enriched on these chromosomal regions. Thus, synthetic TALE proteins were designed that were expected to bind the terminal telomeric (TTAGGG)n repeat arrays (TelR-TALE), the most frequent canonical CIR147 centromeric repeat (147R-TALE), core 177 bp repeats (177R-TALE), 70 bp BES-associated repeats (70R-TALE), ingi-related retrotransposon repeats (ingiR-TALE), and a Non-Recognised control (NonR-TALE) (Figure 1B). These synthetic TALE proteins were expressed as YFP fusion proteins with a nuclear localisation signal in T. brucei Lister 427 bloodstream-form cells with ChIP-seq confirming that they target the repeat elements that they were designed to bind. Validating the approach, affinity purification of TelR-TALE followed by proteomics analyses identified many proteins that were also enriched by affinity purification of the endogenous YFP-tagged T. brucei TRF telomere repeat binding protein. Further, several proteins involved in DNA-repair recombination were enriched with affinity-purified 70R-TALE, suggesting candidates that may be involved in mediating VSG gene switching events via these repeats. Surprisingly, many kinetochore proteins were detected as being enriched on 177 bp repeats. Thus, intermediate-sized and mini-chromosomes may assemble kinetochores and utilise machinery related to that operating on the main 11 chromosomes for their accurate mitotic segregation.

Results

Synthetic TALE-YFP fusion proteins that target T. brucei repetitive sequences

Five synthetic transcription activator-like effector TALE proteins were designed that were predicted to specifically bind 15 bp target sequences residing in different repetitive elements using pre-assembled tetramer and trimer modules (Moore et al., 2014; Figure 1B and C; Figure 1—figure supplement 1). BLAST searches confirmed that each selected 15 bp target sequence was unique to the specific target repetitive element with no exact match elsewhere in the T. brucei 427 reference genome (Cosentino et al., 2021; Rabuffo et al., 2024). The five TALEs assembled were thus predicted to bind: (i) telomeric (TTAGGG)n repeats residing at all chromosome ends (TelR-TALE) (Blackburn and Challoner, 1984; Van der Ploeg et al., 1984), (ii) the 70 bp repeat arrays that reside upstream of bloodstream VSG gene expression sites, and in shorter tracts adjacent to silent subtelomeric VSG genes and contribute to VSG gene switching events (70R-TALE) (Boothroyd et al., 2009; Glover et al., 2013; Hovel-Miner et al., 2016; Kim and Cross, 2010; Thivolle et al., 2021), (iii) the satellite-like centromere-associated 147 bp Chromosome Internal Repeats (147R-TALE) (Akiyoshi and Gull, 2014; Obado et al., 2005; Tschudi et al., 2012), (iv) the 177 bp satellite repeats that are concentrated on mini- and intermediate-sized chromosomes (177R-TALE) (Wickstead et al., 2004), and (v) a sequence common to the ingi clade of non-LTR retrotransposon interspersed repeat elements (ingiR-TALE) (Bringaud et al., 2008). A control NonR-TALE protein was also designed, which was predicted to have no target sequence in the T. brucei genome. Each synthetic TALE DNA binding domain open reading frame (ORF) was fused at its N-terminus to DNA encoding the T. brucei La protein nuclear localisation signal (NLS) and at its C-terminus with DNA encoding a 3xTy-YFP tag (Dean et al., 2015; Marchetti et al., 2000). T. brucei genes are polycistronic with their expression regulated by RNA processing and turnover. 
Consequently, the attenuated D1-354 PAD1 3’UTR from the PAD1 gene was placed downstream of each NLS-TALE-3xTy-YFP ORF. Use of this 3’UTR, which drives high-level expression in the stumpy transmission stage of parasites but only low-level expression in proliferative bloodstream forms (MacGregor and Matthews, 2012), restricted TALE protein expression levels. All constructs carried the Aldolase (ALD) 5’UTR to enable 5’ end RNA processing. Each of the final ALD5’UTR-NLS-TALE-3xTy-YFP*PAD1-3’UTR plasmids was integrated by homologous recombination at the β-tubulin gene locus in monomorphic Lister 427 bloodstream-form T. brucei cells (for brevity hereon the constructs and proteins produced are referred to as ---R-TALEs; Figures 1B–C, Figure 1—figure supplement 1A–C, Figure 1—figure supplement 2, Figure 1—figure supplement 3).
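The uniqueness check on each 15 bp target described above was done with BLAST against the 427 reference genome. As a rough illustration of what that check asks, the sketch below counts exact matches of a target on both strands of a sequence. The function names and the toy sequence are hypothetical and are not part of the published analysis; the two target sequences are the 147R-TALE and NonR-TALE targets quoted elsewhere in the text.

```python
# Minimal sketch (assumed helper names, toy data): count exact matches of a
# 15 bp TALE target on both strands. This is not the BLAST search itself.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_exact_matches(genome: str, target: str) -> int:
    """Count overlapping exact matches of target on both strands of genome."""
    genome = genome.upper()
    target = target.upper()
    # A set avoids counting a palindromic target twice.
    queries = {target, reverse_complement(target)}
    total = 0
    for query in queries:
        start = genome.find(query)
        while start != -1:
            total += 1
            start = genome.find(query, start + 1)
    return total


cir147_target = "TTGACGTGAAAATAC"  # 147R-TALE target quoted in the text
nonr_target = "GGAAGTATACCTGGC"    # NonR-TALE control sequence quoted in the text

# Toy sequence (illustrative only): one forward and one reverse-strand copy.
toy_genome = (
    "ACGT" * 5
    + cir147_target
    + "CCCC"
    + reverse_complement(cir147_target)
    + "ACGT" * 5
)

print(count_exact_matches(toy_genome, cir147_target))
print(count_exact_matches(toy_genome, nonr_target))
```

On a real assembly the same function could be applied per contig, and hits would then be checked against repeat annotations to confirm that they fall only within the intended element.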

Proteins extracted from resultant TelR-TALE, 70R-TALE, 147R-TALE, 177R-TALE, ingiR-TALE, and NonR-TALE T. brucei transformants were analysed by anti-GFP and anti-Ty westerns (Figure 1—figure supplement 2). Cell lines expressing representative TALE-YFP proteins displayed no fitness deficit (Figure 1—figure supplement 3A). Five of the six synthetic ORFs produced proteins of the expected size of ~110 kDa. However, the expression level of NonR-TALE-YFP was lower than other TALE-YFP proteins; this may relate to the lack of DNA binding sites for NonR-TALE-YFP in the nucleus. Moreover, the TelR-TALE protein was smaller than expected; further investigation revealed that the repetitive nature of the telomeric target sequence AGGGTTAGGGTTAGG gave rise to a 612 bp direct repeat within the TALE encoding modules which, following transformation of T. brucei, resulted in a deletion event that reduced the predicted recognised target sequence to 8 rather than 15 bases of telomeric repeat (Figure 1—figure supplement 1D and E). Nevertheless, about 19,000 copies of the (TTAGGG)n sequence reside at T. brucei telomeres and contain the predicted, albeit truncated, TelR-TALE target sequence AGGGTTAG. Indeed, further analysis confirmed that the TelR-TALE-YFP protein binds telomeres in vivo (see below).
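The two points made above about the telomeric target can be illustrated with a small sketch: the designed 15 bp target repeats itself with a 6 bp period (which is why the encoding modules contained a direct repeat), and the truncated 8 bp target still occurs once per repeat unit within a (TTAGGG)n array. The helper name and the 10-unit toy array are assumptions for illustration; real telomeric arrays are far longer.

```python
# Minimal sketch (assumed helper name, toy array): occurrences of the designed
# and truncated TelR-TALE targets within a short (TTAGGG)n array.
def find_overlapping(seq: str, motif: str) -> list[int]:
    """Return start positions of all (possibly overlapping) motif matches."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


designed_target = "AGGGTTAGGGTTAGG"  # 15 bp target quoted in the text
truncated_target = "AGGGTTAG"        # 8 bp target recognised after the deletion
telomere_array = "TTAGGG" * 10       # toy array of 10 repeat units

# The designed target is internally repetitive with a period of 6 bp.
has_period_6 = all(
    designed_target[i] == designed_target[i + 6]
    for i in range(len(designed_target) - 6)
)

full_hits = find_overlapping(telomere_array, designed_target)
short_hits = find_overlapping(telomere_array, truncated_target)

print(has_period_6)
print(len(full_hits), len(short_hits))
```

Both targets match at 6 bp intervals, so the truncation shortens the recognised sequence without changing where along the array the protein is predicted to bind.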

Synthetic repeat targeting TALE proteins localise to nuclei and are enriched on their cognate sequences

To determine the localisation of the six TALE proteins, anti-GFP immunolocalisation was performed on T. brucei cells expressing each individual TALE-YFP fusion protein or, as controls, the YFP-TRF telomere (TTAGGG)n binding protein or YFP-KKT2 centromere-associated kinetochore protein (Figure 2, Figure 2—figure supplement 1, Figure 1—figure supplement 3B). All synthetic TALE-YFP proteins and the endogenously tagged YFP-TRF and YFP-KKT2 proteins localised within nuclei, with YFP-TRF and YFP-KKT2 exhibiting distinct nuclear foci as expected for telomeres and centromeres (Li, 2023; Akiyoshi and Gull, 2014). The TelR-TALE-YFP and 147R-TALE-YFP localisation patterns were also punctate and comparable to those of YFP-TRF and YFP-KKT2, respectively. Furthermore, the localisation pattern for 177R-TALE-YFP was consistent with the known location of mini-chromosome 177 bp repeats around the nuclear periphery (Ersfeld and Gull, 1997). Both the 70R-TALE-YFP and ingiR-TALE-YFP proteins exhibited a diffuse nuclear signal with no specific sub-nuclear pattern. NonR-TALE-YFP displayed a diffuse nuclear and cytoplasmic signal; unexpectedly, the cytoplasmic signal appeared to be in the vicinity of the kDNA of the kinetoplast (mitochondria). We note that artefactual localisation of some proteins fused to an eGFP tag has previously been observed in T. brucei (Pyrih et al., 2023).

Figure 2. Localisation and specific target sequence association of five synthetic TALE-YFP fusion proteins expressed in T. brucei compared to YFP-TRF and YFP-KKT.

(A) Bloodstream-form Lister 427 T. brucei cells expressing the indicated TALE-YFP fusion proteins fixed and TALE-YFP protein localisation detected with anti-GFP primary antibody and Alexa Fluor 568-labelled secondary antibody (red). Nuclear and kinetoplast (mitochondrial) DNA were stained with DAPI (green). Control cells expressing telomeric YFP-TRF, centromeric YFP-KKT2 kinetochore protein, or wild-type Lister 427 cells expressing no YFP are also shown. Scale bar, 10 μm. (B) Anti-GFP ChIP-seq analysis for 147R-TALE, 177R-TALE, 70R-TALE, TelR-TALE, and ingiR-TALE, demonstrating that each protein is enriched on the repeat elements they were designed to recognise: CIR147 repeats, 177 bp repeats, 70 bp repeats, telomeric (TTAGGG)n repeats and ingi retrotransposons. Enrichments obtained for the YFP-KKT2 kinetochore protein, the TRF telomere repeat binding protein, and with a No-Tag control are shown for comparison. Data are from two biological replicates.

Figure 2—figure supplement 1. Fields of T. brucei cells showing the cellular localisation of six expressed synthetic TALE-YFP fusion proteins compared to YFP-TRF and YFP-KKT.

Bloodstream-form Lister 427 T. brucei cells expressing the indicated TALE-YFP fusion proteins were fixed, and TALE-YFP protein localisation was detected with anti-GFP primary antibody and Alexa Fluor 568-labelled secondary antibody (red). Nuclear and kinetoplast (mitochondrial) DNA were stained with DAPI (green). Control cells expressing telomeric YFP-TRF, centromeric YFP-KKT2 kinetochore protein, or wild-type Lister 427 cells expressing no YFP are also shown. Scale bar, 10 μm, except that the NonR-TALE-YFP field was captured at a quarter the size of the others.

To determine if the TALE proteins were enriched on the repetitive elements that they were designed to bind, anti-GFP ChIP-seq was performed. The resulting ChIP-seq reads were aligned to the most recent T. brucei 427 genome assembly (Cosentino et al., 2021; Rabuffo et al., 2024) and the relative specificity compared (Figure 2B). The truncated TelR-TALE protein predicted to bind AGGGTTAG within telomeric (TTAGGG)n arrays (Figure 3A) was found to be enriched at the end of all megabase-sized, intermediate-sized, and mini-chromosomes coincident with telomere repeat binding protein YFP-TRF enrichment (e.g. Figure 3B). The 70R-TALE bound to 70 bp repeats (Figure 3A) that reside upstream of many VSG gene BES loci, regardless of their expression status, 2–8 kb from terminal (TTAGGG)n telomere repeat arrays (Hertz-Fowler et al., 2008) (binding at active BES1 and inactive BES5 is shown in Figure 3B). The ingiR-TALE protein was enriched over the 470 matching ingi element target sites dispersed across the T. brucei genome and, as expected, these included the region of similarity in RIME, SIDER, and DIRE retrotransposons (Bringaud et al., 2008; Figure 3—figure supplement 1).
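The ChIP-seq profiles referred to above are plotted as log2 values. The text does not state exactly how these values were computed, so the following is only a sketch of one common convention: a per-bin log2 ratio of depth-normalised ChIP coverage over input coverage. The function name, the pseudocount, and the toy counts are all assumptions.

```python
import math


def log2_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin log2(ChIP / input) after scaling each library to counts per million.

    Assumed convention for illustration; not taken from the paper's methods.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    if chip_total == 0 or input_total == 0:
        raise ValueError("each library needs at least one read")
    values = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_cpm = chip / chip_total * 1e6
        input_cpm = inp / input_total * 1e6
        # The pseudocount keeps the ratio defined for empty bins.
        values.append(math.log2((chip_cpm + pseudocount) / (input_cpm + pseudocount)))
    return values


# Toy bins: the third bin stands in for a repeat array bound by a TALE.
chip = [50, 50, 200]
inp = [100, 100, 100]
print(log2_enrichment(chip, inp))
```

Under this convention, positive values mark bins over-represented in the immunoprecipitate relative to input, and negative values mark depleted bins.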

Figure 3. TelR-TALE-YFP and 70R-TALE-YFP are enriched at or near telomeric T. brucei bloodstream expression sites.

(A) Telomeric repeat (TTAGGG)n sequence (top) and 70 bp repeat consensus sequence (bottom). Sequences that TelR-TALE and 70R-TALE were designed to bind are indicated. Deletion of TelR-TALE recognition modules following integration in T. brucei results in recognition of AGGGTTAG rather than the full 15 bp target sequence. (B) Anti-GFP ChIP-seq for cells expressing TelR-TALE-YFP, YFP-TRF, or 70R-TALE–YFP proteins, or 427 cells expressing no YFP-tagged protein. Anti-GFP ChIP-seq enrichment profiles are shown for telomeric bloodstream expression sites (BES) BES1 (top) and BES5 (bottom). Diagrams show the position of telomeric (TTAGGG)n repeats (black chevrons), VSG genes (blue), and upstream 70 bp repeats (green bars). Data are from two biological replicates. Y axis: log2 values, X axis: base pairs.

Figure 3—figure supplement 1. IngiR-TALE is enriched at matching binding sites located in retrotransposons.

(A) ‘Ingi’ repeat consensus sequence conserved in Ingi, RIME, SIDER, and DIRE elements. The sequence that ingiR-TALE was designed to bind is indicated. (B) Cross-hatched rectangle indicates the conserved region in Ingi, RIME, SIDER, and DIRE retrotransposons, which are predicted to provide approximately 295, 187, 101, and 21 binding sites for the ingiR-TALE, respectively. (C) Analysis of ChIP-seq data for cells expressing ingiR-TALE shows enrichment of DNA residing at or near predicted ingiR-TALE binding sites.

The centromere region of T. brucei chromosomes 4, 5, and 8 contains extensive arrays of canonical CIR147 repeats. Divergent but related repeats are associated with the centromeres of the other main chromosomes, but no CIR147-related centromere repeats reside on the intermediate-sized or mini-chromosomes. Hence, 147R-TALE, which was designed to bind the TTGACGTGAAAATAC sequence within the consensus CIR147 repeat (Figure 4A), and for which homologous siRNAs are produced (Patrick et al., 2009; Tschudi et al., 2012), showed enrichment on the cognate repeat arrays at centromeres 4 and 5, and to some extent centromere 8, which are also occupied by the YFP-KKT2 kinetochore protein (Figure 4B and C). In contrast, the 147R-TALE did not decorate the CIR147-related repeats residing at centromeres 9, 10, and 11 or the more divergent classes of repeats bound by YFP-KKT2 at centromeres 1, 2, 3, 6, and 7 (Figure 4C). ChIP-seq analysis for the 177R-TALE showed that this synthetic protein was enriched on target intermediate-sized and mini-chromosome 177 bp repeat arrays (Figure 5).

Figure 4. The 147R-TALE-YFP protein is enriched at a subset of centromeres containing canonical CIR147 repeats.

(A) CIR147 repeat consensus sequence. Sequence that 147R-TALE-YFP was designed to bind is indicated. (B) Comparison of sequences enriched in YFP-KKT2 (purple) and 147R-TALE-YFP (blue) anti-GFP ChIP-seq for chromosomes 1, 3, 4, 5, and 8. DNA from all centromeres is enriched in YFP-KKT2 anti-GFP ChIP-seq, whereas only CIR147 repeats at centromeres on chromosomes 4, 5, and 8 are enriched in 147R-TALE-YFP anti-GFP ChIP-seq. (C) Split-Violin plot demonstrating the relative enrichment of YFP-KKT2 (purple) and 147R-TALE-YFP (blue) over the 11 main chromosome centromere regions. Data are from two biological replicates. Y axis: log2 values.

Figure 5. The 177R-TALE-YFP is enriched over 177 bp repeats located on intermediate-sized and mini-chromosomes.

(A) 177 repeat consensus sequence. Sequence that 177R-TALE-YFP was designed to bind is indicated. (B) Distribution of 177R-TALE-YFP, TelR-TALE-YFP, YFP-TRF, and 70R-TALE-YFP, at two intermediate/mini-chromosome telomeres determined by anti-GFP ChIP-seq. Anti-GFP ChIP-seq of 427 cells expressing no tagged protein is included as control. Diagrams below ChIP-seq profiles indicate the positions of 177 bp repeats (red chevrons), 70 bp repeats (green bars), VSG encoding genes (blue), and telomere (TTAGGG)n repeats (black chevrons) within Tb427VSG-671_unitig_Tb427v12:17,836–31,606 (31kb) and Tb427VSG-647_unitig_Tb427v12 (10 kb). Data are from two biological replicates. Y axis: log2 values, X axis: base pairs.

TelR-TALE affinity purification verifies the use of TALEs to identify repetitive element-associated proteins

All five synthetic TALE-YFP proteins were found to target the repetitive elements to which they were designed to bind when expressed in T. brucei. At least five proteins have previously been shown to be specifically enriched with the T. brucei TRF telomere binding protein in affinity purifications: TIF2, TelAP1, TelAP2, TelAP3, and PolQ/PolIE (Leal et al., 2020; Reis et al., 2018; Weisert et al., 2024). Therefore, to test if repeat-targeted TALE-YFP proteins could be used to identify proteins associated with repetitive elements, we affinity-purified solubilised TelR-TALE-bound chromatin and compared the associated proteins with those we detected as being enriched with YFP-TRF by mass spectrometry (AP-LC-MS/MS; Figure 6A and B; Figure 6—figure supplement 1; Supplementary file 1a and b). As expected, known telomere-associated proteins TRF (Tb927.10.12850), TIF2 (Tb927.3.1560), TelAP1 (Tb927.11.9870), TelAP2 (Tb927.6.4330), TelAP3 (Tb927.9.4000), RAP1 (Tb927.11.370), and PolQ/PolIE (Tb927.11.5550) were enriched with affinity-purified YFP-TRF (Figure 6A, Figure 6—figure supplement 1, Supplementary file 1a). In addition, replication/repair proteins RPA2 (Replication Factor A; Tb927.11.9130) and PPL2 (PrimPol-Like protein 2; Tb927.10.2520) were also enriched with YFP-TRF, along with the RNA binding proteins ZC3H39 (Tb927.10.14930) and ZC3H40 (Tb927.10.14950), HDAC3 (Tb927.2.2190), and histones (Figure 6A). Using the same affinity selection procedure, an overlapping set of 108 proteins was found to be enriched with TelR-TALE-YFP-bound chromatin (Figure 6B, Figure 6—figure supplement 1, Supplementary file 1b); these included TRF, TIF2, TelAP1, TelAP2, TelAP3, PolQ/PolIE, PPL2, HDAC3, and histones; however, RAP1 was only weakly enriched. In addition, all three Replication Factor A subunits (RPA1, 2, 3; Tb927.11.9130, Tb927.5.1700, Tb927.9.11940) were enriched with TelR-TALE, but only RPA2 with YFP-TRF. 
Notably, two RNA-associated proteins PABP2 (Tb927.9.10770) and MRB1590 (Tb927.3.1590), which were previously identified as potential telomere-associated proteins (Reis et al., 2018; Weisert et al., 2024), were detected in both our YFP-TRF and TelR-TALE affinity purifications. Moreover, the ZC3H39 (Tb927.10.14930) and ZC3H40 (Tb927.10.14950) RNA binding proteins, which heterodimerise to regulate respiratome transcript levels (Trenaman et al., 2019), were enriched in affinity selections of both proteins (Figure 6—figure supplement 1).
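Enrichment in these affinity purifications is scored relative to the No-YFP tag control across three biological replicates, with a Student's t-test cut-off of p<0.05. The sketch below shows that kind of calculation for a single protein, assuming log2-transformed intensities and an equal-variance two-sample t-test. The function name and the intensity values are invented for illustration, and the authors' actual quantification pipeline is not described in this section.

```python
import math
from statistics import mean, variance

# Two-sided 5% critical value of Student's t with 4 degrees of freedom (3 + 3 - 2).
T_CRIT_DF4 = 2.776


def enrichment(tagged, untagged):
    """Return (log2 enrichment, t statistic, significant at p<0.05) for 3 vs 3 replicates.

    Inputs are positive intensities; the test is a pooled-variance Student's t-test
    on log2 values. Hypothetical helper, not the published pipeline.
    """
    if len(tagged) != 3 or len(untagged) != 3:
        raise ValueError("this sketch assumes three replicates per condition")
    log_tagged = [math.log2(x) for x in tagged]
    log_untagged = [math.log2(x) for x in untagged]
    diff = mean(log_tagged) - mean(log_untagged)
    # Pooled variance reduces to the average of the two variances for equal group sizes.
    pooled_var = (variance(log_tagged) + variance(log_untagged)) / 2
    if pooled_var == 0:
        raise ValueError("zero variance: t statistic is undefined")
    t_stat = diff / math.sqrt(pooled_var * (1 / 3 + 1 / 3))
    return diff, t_stat, abs(t_stat) > T_CRIT_DF4


# Invented intensities: a protein roughly tenfold enriched with the tagged bait.
diff, t_stat, significant = enrichment([800, 1000, 1250], [100, 125, 80])
print(round(diff, 2), significant)
```

Plotting the log2 enrichment against the test statistic (or the corresponding p-value) for every detected protein gives the kind of volcano plot used to present affinity selection data.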

Figure 6. Affinity selection of TelR-TALE-YFP enriches for telomere-associated proteins and 177R-TALE-YFP protein enriches for kinetochore proteins.

Affinity selection was performed on control cells expressing YFP-TRF (A), YFP-RPA2 (C), YFP-KKT2 (E), or No-YFP-tagged protein, and cells expressing synthetic TelR-TALE-YFP (B), 70R-TALE-YFP (D), 177R-TALE-YFP (F). Enriched proteins were identified and quantified by LC-MS/MS analysis relative to the No-YFP tag control. The data for each plot is derived from three biological replicates. Cut-offs used for significance: p<0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary file 1.

Figure 6—figure supplement 1. Overlap of proteins enriched in affinity purifications of the synthetic telomere binding protein TelR-TALE-YFP and of YFP-TRF.

(A) The Telomere Repeat binding Factor (TRF) binds (TTAGGG)n repeats at the ends of T. brucei chromosomes. A Venn diagram is shown, comparing the number of proteins enriched in YFP-TRF versus TelR-TALE-YFP affinity purifications. (B) List of known telomere-associated proteins and other proteins detected in YFP-TRF and/or TelR-TALE-YFP affinity purifications. + detected, – not detected, (–) weakly detected. Lists of all proteins detected in YFP-TRF and TelR-TALE-YFP affinity purifications are available in Supplementary file 1a and b, respectively. *See Reis et al., 2018; Leal et al., 2020; Weisert et al., 2024.
Figure 6—figure supplement 2. A control TALE that binds no specific T. brucei sequence validates proteins enriched in TelR-TALE, 70R-TALE, and 177R-TALE affinity purifications.

A control NonR-TALE was designed to bind the sequence GGAAGTATACCTGGC that is not present in the T. brucei 427 genome. Affinity selection was performed on cells expressing the synthetic NonR-TALE-YFP, TelR-TALE, 70R-TALE-YFP, 177R-TALE-YFP, 147R-TALE-YFP, or ingiR-TALE proteins and control cells expressing no-YFP-tagged protein. Proteins enriched with the five repeat sequence targeted TALE-YFP proteins were identified and quantified by LC-MS/MS analysis relative to the NonR-TALE-YFP control rather than the No-YFP tag control. The data for each plot is derived from three biological replicates. Cut-offs used for significance: log2(tagged/untagged) p<0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary file 1.
Figure 6—figure supplement 3. Affinity selection of TelR-TALE-YFP, 70R-TALE-YFP, and 177R-TALE-YFP relative to ingiR-TALE-YFP validates specificity.

Affinity selection was performed on cells expressing synthetic TelR-TALE-YFP (A), 70R-TALE-YFP (B), and 177R-TALE-YFP (C). Enriched proteins were identified and quantified by LC-MS/MS analysis relative to affinity-selected ingiR-TALE-YFP as a negative control. The data for each plot is derived from three biological replicates. Cut-offs used for significance: p<0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary file 1.
Figure 6—figure supplement 4. No proteins of interest are detected following affinity selection of 147R-TALE or ingiR-TALE.

Affinity selection was performed on cells expressing synthetic (A) 147R-TALE-YFP or (B) ingiR-TALE-YFP proteins and control cells expressing no-YFP-tagged protein. Enriched proteins were identified and quantified by LC-MS/MS analysis relative to the No-YFP tag control. The data for each plot is derived from three biological replicates. Cut-offs used for significance: log2(tagged/untagged) p<0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are available in Supplementary file 1.
Figure 6—figure supplement 5. Overlap of proteins enriched in affinity purifications of both kinetochore protein YFP-KKT2 and synthetic protein 177R-TALE.

Kinetoplastid KineTochore (KKT) proteins are known to be enriched at all centromeres on T. brucei main chromosomes (Akiyoshi and Gull, 2014). 177 bp repeats are confined to intermediate-sized or mini-chromosomes. (A) A Venn diagram comparing proteins enriched in YFP-KKT2 versus 177R-TALE affinity purifications. A high proportion of KKT proteins, in addition to cohesin (SCC1, SCC3, SMC1, and SMC3) and condensin (SMC2) subunits, are enriched on 177 bp repeats. (B) List of kinetochore, cohesin, and condensin proteins detected in YFP-KKT2 and/or 177R-TALE affinity purifications. Lists of all proteins detected in 177R-TALE-YFP and YFP-KKT2 affinity purifications are available in Supplementary file 1l and n, respectively.

Overall, these data indicate that a core set of known telomere/TRF-associated proteins were also enriched with the synthetic TelR-TALE telomere binding protein. Thus, we conclude that our other synthetic TALE-YFP proteins, designed to bind distinct repetitive elements, could allow the identification of proteins specifically residing on those other sequences in vivo. Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications when compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP, or the ingiR-TALE-YFP as controls (Figure 6—figure supplement 2B, Figure 6—figure supplement 3A; Supplementary file 1c, d, and o). Thus, the most enriched proteins are specific to TelR-TALE-YFP-associated chromatin rather than to the TALE-YFP synthetic protein module or other chromatin.

Target sequence copy number may determine effectiveness of TALE-YFP proteins in identifying repeat-associated proteins

We estimated that the most recent T. brucei 427 genome assembly contains 19,164 copies of the telomeric AGGGTTAG target sequence which the truncated TelR-TALE-YFP is predicted to bind within (TTAGGG)n repeat arrays (Cosentino et al., 2021; Rabuffo et al., 2024). In contrast, there are only 440 and 470 targets matching the predicted binding sites TTGACGTGAAAATAC and GCCGGCACCTCAAC for the 147R-TALE and ingiR-TALE synthetic proteins, respectively (Figure 1C). NonR-TALE is predicted to have no matching binding sites in the T. brucei TREU 427 genome. To determine if proteins associated with such low copy number TALE-YFP target sequences could be identified, we applied the same AP-LC-MS/MS proteomics procedure to T. brucei cells expressing 147R-TALE, ingiR-TALE, or NonR-TALE. Comparison of either 147R-TALE or ingiR-TALE affinity purification results with the No-YFP or NonR-TALE-YFP control affinity purifications showed no specific enrichment of any proteins of obvious potential functional interest with either 147R-TALE or ingiR-TALE (Figure 6—figure supplement 2E and F, Figure 6—figure supplement 4, Supplementary file 1e–h). Thus, the nuclear ingiR-TALE-YFP provides an additional chromatin-associated negative control for affinity purifications with the TelR-TALE-YFP, 70R-TALE-YFP, and 177R-TALE-YFP proteins (Figure 6—figure supplement 3, Supplementary file 1o-q). Moreover, although kinetochore proteins are enriched on CIR147 repeats (Figure 4B and C; Akiyoshi and Gull, 2014), no kinetochore proteins were detected in 147R-TALE affinity purifications. Thus, although ChIP-seq showed that both 147R-TALE and ingiR-TALE were enriched on their cognate target sequences, it appears that there are insufficient copies of these repeats for our AP-LC-MS/MS procedure to reveal associated proteins above background.
We therefore focused our attention on the 70R-TALE and 177R-TALE synthetic proteins for which there are 3850 and 1828 predicted binding sites in the genome, respectively (Figure 1C).
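The copy-number comparison above rests on counting how often each predicted TALE binding site occurs in the genome assembly. The text does not say how the counts were obtained, so the following is only a minimal sketch under stated assumptions: exact matching, both strands searched, and overlapping matches counted. The function names and the toy sequences are hypothetical, and the sketch is not expected to reproduce the reported figures (19,164, 440, 470, 3850, or 1828).

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_target_sites(genome: dict, target: str) -> int:
    """Count exact occurrences of `target` on both strands.

    `genome` maps sequence names to sequence strings (hypothetical layout).
    A zero-width lookahead is used so that overlapping matches are counted.
    A palindromic target is counted once per position, because the set
    below then holds a single motif.
    """
    target = target.upper()
    motifs = {target, reverse_complement(target)}
    total = 0
    for sequence in genome.values():
        sequence = sequence.upper()
        for motif in motifs:
            total += len(re.findall("(?=" + re.escape(motif) + ")", sequence))
    return total


# Toy sequences for illustration only (not real T. brucei contigs).
toy_genome = {
    "toy_forward": "TTAGGG" * 5,  # telomere-like array, top strand
    "toy_reverse": "CCCTAA" * 3,  # same repeat read from the other strand
}

telomere_sites = count_target_sites(toy_genome, "AGGGTTAG")
nonr_sites = count_target_sites(toy_genome, "GGAAGTATACCTGGC")
```

In practice the genome dictionary would be filled from a FASTA file of the assembly; tolerance for mismatches, which may matter for TALE binding, is deliberately left out of this sketch.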

The RPA complex is enriched with synthetic 70 bp repeat binding protein

70R-TALE affinity purifications showed enrichment of all three subunits of the Replication Protein A complex (RPA1, RPA2, and RPA3) comparable to the enrichment detected in affinity purification of YFP-RPA2 itself (Figure 6C and D; Supplementary file 1i and j). Proteins identified as being enriched with 70R-TALE-YFP (Figure 6D) were similar in comparisons with either the No-YFP, NonR-TALE-YFP, or ingiR-TALE-YFP as negative controls (Figure 6—figure supplements 1 and 2C, Figure 6—figure supplement 3B; Supplementary file 1k and p). Along with the RPA complex, FACT subunits (SPT16 and POB3), histones, and Tb927.11.6830 were also enriched with both YFP-RPA2 and 70R-TALE-YFP affinity purification. This collection of proteins was also enriched in affinity purifications of TelR-TALE, which binds terminal telomeric (TTAGGG)n repeats. In contrast, the 70R-TALE targets 70 bp repeats residing several kilobase pairs internal from telomeres (ChIP-seq, Figure 3B). Given the known role for the RPA complex in DNA repair and replication, it may have distinct roles in mediating specific DNA transactions via 70 bp repeats and in telomere repeat dynamics (Boothroyd et al., 2009; Li, 2023).

Kinetochore proteins are enriched on 177 bp repeats bound by 177R-TALE

In contrast to 70R-TALE and TelR-TALE, affinity selection of the 177R-TALE resulted in enrichment of a distinct set of proteins which unexpectedly included 14 of the 25 known kinetoplastid core kinetochore proteins: KKT1, KKT2, KKT3, KKT4, KKT6, KKT7, KKT8, KKT9, KKT10, KKT11, KKT12, KKT16, KKT17, KKT24 (Akiyoshi and Gull, 2014; D’Archivio and Wickstead, 2017; Nerusheva et al., 2019; Figure 6F; Supplementary file 1l). The same kinetochore proteins were enriched regardless of whether the 177R-TALE proteomics data was compared with No-YFP, NonR-TALE, or ingiR-TALE controls (Figure 6—figure supplement 2D, Figure 6—figure supplement 3C, Supplementary file 1m and q). For comparison, YFP-KKT2 was affinity-selected from T. brucei cells expressing endogenous N-terminal YFP-tagged KKT2 (Figure 6E; Supplementary file 1n). A clearly overlapping set of proteins was detected in both 177R-TALE-YFP and YFP-KKT2 affinity purifications (Figure 6—figure supplement 5). Moreover, the outer kinetochore-associated proteins KKIP3 and KKIP5, which transiently associate with T. brucei kinetochores through Aurora B kinase regulation, were also enriched with 177R-TALE, along with Aurora B kinase itself (Supplementary file 1l; D’Archivio and Wickstead, 2017; Nerusheva et al., 2019; Zhou et al., 2019). Cohesin complex subunits were also present in both 177R-TALE-YFP and YFP-KKT2 affinity purifications, underscoring their expected role in mediating sister-kinetochore cohesion during mitosis. Thus, many KKT kinetochore proteins were found to be enriched with affinity-purified 177R-TALE, which ChIP-seq showed associates with 177 bp repeats but not with KKT2-bound centromere regions on the 11 megabase-sized main chromosomes (i.e. 177R-TALE is not enriched on CIR147 or other centromeric repeats; Figure 2B). This finding suggests that kinetochores with a related composition assemble on all T. brucei chromosomes regardless of their size classification and that 177 bp repeats attract kinetochore proteins to the intermediate-sized and mini-chromosomes. To explore this possibility further, our YFP-KKT2 and 177R-TALE ChIP-seq data was compared over two regions from intermediate-sized chromosomes spanning 177 bp repeats to the telomere (Figure 7A). The resulting analysis revealed that both YFP-KKT2 and 177R-TALE proteins are enriched on 177 bp repeat arrays but not adjacent non-repetitive sequences. In contrast, YFP-KKT2 and 147R-TALE, but not 177R-TALE, were enriched over main chromosome centromeric 147 bp repeat arrays (Figure 7B). Taking into account the relative number of CIR147 and 177 bp repeats in the current T. brucei genome (Cosentino et al., 2021; Rabuffo et al., 2024), comparative analyses demonstrated that YFP-KKT2 is enriched on both CIR147 and 177 bp repeats (Figure 7C). We conclude that kinetochore proteins assemble on at least a proportion of the individual units within the 177 bp repeat arrays on intermediate-sized and/or mini-chromosomes.

Figure 7. Synthetic 177R-TALE-YFP and YFP-KKT2 kinetochore proteins co-localise over 177 bp repeats located on intermediate-sized and mini-chromosomes but not over centromeric CIR147 repeats where 147R-TALE-YFP binds.

(A) Distribution of 177R-TALE, YFP-KKT2, and 147R-TALE over two intermediate/mini-chromosome telomeres determined by anti-GFP ChIP-seq. Anti-GFP ChIP-seq of T. brucei 427 cells expressing no tagged protein is included as a control. The diagram below ChIP-seq profiles indicates the positions of 177 bp repeats (red chevrons), 70 bp repeats (green bars), VSG encoding genes (blue), and telomere (TTAGGG)n repeats (black chevrons) within Tb427VSG-671_unitig_Tb427v12:17,836–31,606 (31 kb) and Tb427VSG-647_unitig_Tb427v12 (10 kb). (B) Comparison of distribution of 177R-TALE, 147R-TALE, and YFP-KKT2 over the chromosome 4 CIR147 centromere repeat array and adjacent unique sequences. Chr4:880,000–895,000 (15 kb) and Tb427VSG-671_unitig_Tb427v12:12,000–27,000 (31 kb). The diagram below ChIP-seq indicates the position of CIR147 repeats. (C) Comparison of YFP-KKT2 kinetochore protein enrichment on 177 bp and 147 bp repeats. Data are from two biological replicates. Y axis: log2 values, X axis: repeat types.

Discussion

Repetitive elements are a feature of most eukaryotic genomes with major roles in defining centromeres and telomeres, and influencing the expression of nearby genes through the formation of specific types of chromatin (Allshire and Karpen, 2008; Allshire and Madhani, 2018). Kinetoplastids represent a very distinct early-branching eukaryotic lineage, which has highly divergent histones (Alsford and Horn, 2004; Deák et al., 2023; Saha et al., 2021). Consequently, little is known about the repertoire of proteins that associate with chromatin formed on repetitive elements in these organisms despite their importance in chromosome segregation, telomere maintenance, and immune evasion. Here, we have developed an approach which utilises a collection of synthetic DNA binding proteins designed to bind telomeric (TTAGGG)n (TelR-TALE), centromeric CIR147 (147R-TALE), ingi-related (ingiR-TALE) dispersed, VSG gene-associated 70 bp (70R-TALE), and mini/intermediate chromosome-specific (177R-TALE) repeats when expressed in bloodstream-form T. brucei cells. ChIP-seq demonstrated that all five TALE-based synthetic proteins target the repetitive elements that they were designed to bind. Affinity selection of the TelR-TALE, 70R-TALE, and 177R-TALE proteins identified specific sets of enriched proteins. However, the 147R-TALE and ingiR-TALE failed to identify any enrichment of specific proteins following their affinity selection. Encouragingly, the proteins identified as enriched with TelR-TALE showed significant overlap with those identified following affinity selection of the (TTAGGG)n telomere binding protein TRF (Figure 6—figure supplement 1). All subunits of the RPA complex were highly enriched with the 70R-TALE and many kinetochore proteins were identified in 177R-TALE affinity purifications.

It was initially surprising that no specific proteins were detected as being enriched following either 147R-TALE or ingiR-TALE pulldown. Given that both of these synthetic proteins target their designated target sequence in vivo (Figure 2B), it seems likely that this failure is related to the fact that there are fewer target sites for these synthetic proteins to bind to than the TelR-TALE, 70R-TALE, and 177R-TALE proteins, which clearly identified proteins bound in their immediate vicinity in T. brucei cells. Thus, although proteins that bind CIR147 or ingi-related repeats in vivo may be present, their level of enrichment may not be sufficient to allow detection above background by proteomic analyses following 147R-TALE or ingiR-TALE pulldown. It is also possible that the binding of the 147R-TALE and ingiR-TALE proteins dislodges a significant proportion of the proteins that normally bind these repeats, thus reducing their enrichment. Regardless, the 147R-TALE and ingiR-TALE proteins were well expressed in T. brucei cells, but their affinity selection did not significantly enrich for any relevant proteins. Thus, 147R-TALE and ingiR-TALE provide reassurance for the overall specificity for proteins enriched in TelR-TALE, 70R-TALE, and 177R-TALE affinity purifications.

The TelR-TALE binds telomeric (TTAGGG)n repeats in vivo and copurifies with a collection of proteins known to function at trypanosome telomeres, therefore demonstrating that a synthetic protein designed to bind a repetitive element can be used to identify other proteins enriched over those repeats (Figure 6A and B, Figure 6—figure supplement 1). Apart from known telomere binding proteins, the two zinc finger proteins ZC3H39 (Tb927.10.14930) and ZC3H40 (Tb927.10.14950) were enriched in both TelR-TALE-YFP and YFP-TRF affinity selections. The fact that both ZC3H39 and ZC3H40 were enriched by affinity selection with these independent baits − one endogenous (YFP-TRF) and the other a synthetic telomere repeat binding protein (TelR-TALE-YFP) − suggests that at least a proportion of these proteins are present near, and may have some function at, telomeres. Although both ZC3H39 and ZC3H40 have also been shown to be involved in the post-transcriptional regulation of transcripts encoding respiratory chain proteins and located primarily in the cytoplasm, both were identified in a genome-wide RNAi screen for telomeric gene derepression that also selected the VSG expression site regulator VEX1 (Trenaman et al., 2019). Hence, in addition to their role in respiratory complex gene regulation, ZC3H39 and ZC3H40 might play some additional role in the regulation of gene expression near telomeres. It is possible that they act through telomerase recruitment at telomeres, the regulation of telomere repeat-containing RNA (TERRA) transcripts that are produced at VSG active telomeres (Saha et al., 2021), or engagement of some other regulatory complex associated with telomeres.

Analysis of 70 bp repeat-associated proteins via specific 70R-TALE affinity selection identified RPA1, 2, and 3. This heterotrimeric complex is enriched at single-stranded DNA and associated with DNA damage and double-stranded DNA breaks. The accumulation of these proteins at 70 bp repeats is consistent with the function of these sequences in the initiation of recombination events involved in surface antigen switching, with trypanosomes being unusual in not activating a DNA damage cell cycle checkpoint thereby allowing continued proliferation whilst promoting antigenic diversity (Glover et al., 2019). Furthermore, the enrichment of FACT complex components with repeat bound 70R-TALE again highlights 70 bp repeats as an expected focus of recombination events and expression site activity. FACT depletion is known to alleviate repression at these silent VSG expression sites by generating a more open chromatin conformation and reciprocally decreases expression from the active VSG expression site (Denninger and Rudenko, 2014).

Kinetoplastid kinetochores are unusual in that they are composed of at least 26 proteins that bear little resemblance to the 40–100 constitutive and transient kinetochore-associated proteins that assemble at conventional eukaryotic centromeres (Akiyoshi and Gull, 2014; Ballmer et al., 2024; D’Archivio and Wickstead, 2017; Yatskevich et al., 2023). In T. brucei, kinetochores assemble at a single location on both copies of the 11 main diploid chromosomes. The centromeres of chromosomes 4, 5, and 8 contain CIR147 repeat arrays over which kinetochore proteins are enriched while the centromeres of other megabase chromosomes form on less well-characterised repetitive elements (Akiyoshi and Gull, 2014; Echeverry et al., 2012; Obado et al., 2007). In addition, the characterisation of the many T. brucei mini- and intermediate chromosomes remains incomplete due to the presence of long tandem 177 bp repeat arrays. Our ChIP-seq analyses of synthetic 177R-TALE-YFP location showed that it associated with 177 bp repeats in vivo but not the adjacent VSG gene regions on mini-chromosomes or any region of the 11 megabase chromosomes (Figure 7). The detection of a plethora of kinetochore proteins on 177R-TALE-YFP-bound chromatin indicates that kinetochores or a sub-kinetochore complex also assembles on the 177 bp repeats of mini- and intermediate T. brucei chromosomes. Consistent with this finding, some enrichment of YFP-tagged KKT2 and KKT3 was previously detected using a model mini-chromosome assemblage, and depletion of kinetochore proteins was shown to cause aberrant segregation of mini/intermediate chromosomes (Akiyoshi and Gull, 2014). 
If, as previously suggested (Akiyoshi and Gull, 2014), mini- and intermediate 177 bp repeat bearing chromosomes segregate by somehow ‘hitching a ride’ via kinetochores that are actually assembled on the main chromosomes, then it might be expected that 177R-TALE-YFP ChIP-seq would register some signal over centromere regions of the main chromosomes; however, no such signal was observed (Figures 2B, 4 and 7). The 17 kinetochore proteins (KKT1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 16, 17, 18, 24, KKIP3, KKIP4) detected in 177R-TALE-YFP affinity purifications represent most of the components considered to comprise the core kinetochore but represent only a subset of the 26 known T. brucei main structural kinetochore proteins. It is possible that a more rudimentary kinetochore is assembled on mini- and intermediate chromosome 177 bp repeat arrays and that these are sufficient to mediate their accurate segregation.

Interestingly, although targeting TALE proteins to different repetitive sequences selected components specific to each repeat type, some overlap in the proteins detected was observed. For example, enrichment of telomere-associated proteins was detected in some affinity-selected samples using 177R-TALE-YFP, presumably resulting from the juxtaposition of telomeric repeats and 177 bp repeats on mini-chromosomes. Supporting this, KKT3 was reciprocally detected in samples affinity-selected using YFP-TRF. Similarly, enrichment of both FACT subunits with 177R-TALE may simply reflect the proximity of silent telomeric chromatin on mini-chromosomes, or it may also indicate that FACT contributes to a particular chromatin environment at 177 bp repeats.

Although proteins associated with TALE-YFP fusions targeting telomeric, 70 bp and 177 bp repeats were successfully identified, our analyses suggest that target sequences need to be present in many copies (>1000) in a T. brucei genome of ~35 Mb to successfully identify associated proteins. Thus, the 147R-TALE-YFP which targets 440 copies of the canonical centromeric CIR147 repeat resulted in no enrichment of associated proteins, although other parameters may also influence the ability of a sequence-bound TALE-YFP protein to enrich for nearby chromatin-associated proteins. Such parameters may include the affinity that the target chromatin-bound proteins have for the repeat sequence of interest, the relatively low affinity that these chromatin proteins may have for any other chromosomal region and their overall relative abundance (for detailed discussion, see Gauchier et al., 2020). Methods such as CUT&RUN (Skene and Henikoff, 2017), which should selectively release only TALE-bound chromatin, followed by affinity selection (similar to CUT&RUN.ChIP; Brahma and Henikoff, 2019), might improve protein enrichment relative to background and allow identification of proteins associated with less abundant sequences. An alternative to synthetic TALE proteins is to utilise tagged catalytically dead Cas9 (dCas9) targeted to specific sequences via a CRISPR-embedded guide RNA. Fusion of TALE or dCas9 probes to APEX or BirA* enzymes could also be incorporated to perhaps improve the identification of proteins that reside close to the synthetically targeted DNA binding protein (Gao et al., 2018; Myers et al., 2018). Cas9/CRISPR systems allow precise genome editing in T. brucei (Rico et al., 2018; Vasquez et al., 2018) such that the development of dCas9-based CRISPR tools may improve the performance of future sequence-targeted proteomics. However, an advantage of TALE protein use is that only a single entity needs to be expressed that directly targets the sequence of interest.

In conclusion, we have successfully deployed TALE-based affinity selection of proteins associated with repetitive sequences in the trypanosome genome. This has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies employed by these evolutionarily divergent pathogens. As well as providing an orthogonal corroboration of existing knowledge of protein interactions with discrete genomic features, this has provided new entry points to dissect these parasites’ chromatin architecture. We anticipate that extension to other kinetoplastid parasites could assist exploration of Leishmania genome instability as a response to environmental adaptation where, for example, the highly abundant SIDER family (70-fold more numerous than in T. brucei; Bringaud et al., 2007) might overcome the copy number limitations encountered with the retrotransposon sequences analysed in our study. Likewise, the 195 nt satellite DNA in T. cruzi represents 5–10% of the parasite genome and is sufficiently abundant to allow analysis of associated proteins (Elias et al., 2003).

Materials and methods

TALE target sequence design

All synthetic TALE proteins were designed to bind 15 bp target sequences following a T/thymine base as required for the TALEN kit (Ding et al., 2013). The design of the mini-chromosomal 177 bp repeat binding TALE was informed by available sequences (Wickstead et al., 2004). The ingi repeat TALE was designed to bind a target within the conserved 79 bp 5’ region of related transposable elements (Bringaud et al., 2008). For the design of the CIR147 binding TALE, published sequences were used as reference (Obado et al., 2007; Patrick et al., 2009); however, a CIR147 target sequence with only one exact match was picked. TALEs predicted to bind the known 70 bp repeats (Boothroyd et al., 2009) and terminal (TTAGGG)n repeats were also designed. A control NonR-TALE predicted to have no recognised target in the T. brucei genome was designed as follows: BLAST searches were used to identify exact matches in the TREU927 reference genome. Candidate sequences with one or more matches were discarded. Each TALE was assembled using the Musunuru/Cowan TALEN kit protocol (Ding et al., 2013) and subsequently placed in a vector that allowed expression in T. brucei bloodstream cells as described in the main text.
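The two design constraints described above (a 15 bp target immediately preceded by a T, and a control target with no exact genomic match) can be illustrated with a short sketch. This is not the authors' procedure: the matching here is a plain string search on both strands standing in for the BLAST step, and all function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_targets(repeat_unit, length=15):
    """List (start, site) pairs for every `length`-bp site preceded by a T.

    `start` is the 0-based index of the first base of the site; the
    preceding T sits at `start - 1` and is not part of the site itself.
    """
    seq = repeat_unit.upper()
    return [
        (i, seq[i:i + length])
        for i in range(1, len(seq) - length + 1)
        if seq[i - 1] == "T"
    ]


def has_exact_match(genome, target):
    """True if `target` occurs exactly on either strand of any sequence."""
    target = target.upper()
    rc = reverse_complement(target)
    return any(target in s.upper() or rc in s.upper() for s in genome.values())


def passes_control_screen(genome, candidate):
    """A control (NonR-like) candidate is kept only if it has no exact match."""
    return not has_exact_match(genome, candidate)


# Toy example only: a short telomere-like array, not a real assembly.
toy_genome = {"toy_contig": "TTAGGG" * 10}
sites = candidate_targets("TTAGGG" * 4)
control_ok = passes_control_screen(toy_genome, "GGAAGTATACCTGGC")
```

A real screen would also need to decide how near-matches are treated, which exact string search cannot assess.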

Trypanosome cell culture

T. brucei brucei Lister 427 bloodstream-form monomorphic cells were used for all experiments. Cell lines were grown at 37°C and 5% CO2 in HMI-9 medium supplemented with 10% Fetal Calf Serum (Gibco), 1% Penicillin-Streptomycin (Gibco), and selective drug(s) when required (Hirumi and Hirumi, 1989). Cell cultures were maintained below 3×10⁶ cells/ml. Phleomycin 2.5 μg/ml was used to select transformants containing the TALE construct BleoR gene.

Trypanosome transfections

5×10⁷ cells were harvested per transfection by centrifugation at 1000×g, 10 min. Cells were washed once with 5 ml TbBSF transfection buffer (Schumann Burkard et al., 2011) and pelleted again by centrifugation at 1000×g, 10 min before resuspending in 100 μl ice-cold TbBSF transfection buffer, and transferred to an electroporation cuvette (Ingenio). 10–20 μl of DNA for transfection containing 1–5 μg DNA was added to the cuvette. Cells were electroporated in the Amaxa Nucleofector II (Lonza) using the X-001 programme for bloodstream cells. A ‘no DNA’ mock transfection was always performed in parallel as a negative control. Electroporated bloodstream cells were added to 30 ml HMI-9 medium and two 10-fold serial dilutions were performed in order to isolate clonal Phleomycin-resistant populations from the transfection. 1 ml of transfected cells was plated per well on 24-well plates (1 plate per serial dilution) and incubated at 37°C and 5% CO2 for a minimum of 6 hr before adding 1 ml media containing 2× concentration Phleomycin (5 μg/ml) per well. A positive control was also performed by adding media containing no selective drug to 12 wells of the control transfection plate.

Western analyses

Cells were harvested by centrifugation at 1000×g, 10 min, washed with 1× PBS and resuspended in 1× PBS + 4× NuPAGE LDS Sample Buffer (Thermo Fisher Scientific) to give a final concentration of 5×10⁶ cells per 10 μl. Samples were then boiled at 95°C for 5 min to ensure cells were dead before removal from the CAT3 facility. Samples were then subjected to sonication using a Diagenode Bioruptor for 10 cycles, 30 s ON/30 s OFF at 4°C on high setting to shear the DNA and reduce the viscosity to aid loading on gels. Samples were run on NuPAGE Bis-Tris Mini Gels (Thermo Fisher Scientific) in a Mini Gel Tank (Thermo Fisher Scientific) in 1× NuPAGE MES Running Buffer at 200 V. Following PAGE, proteins were transferred onto nitrocellulose membranes in a Mini Blot Module (Thermo Fisher Scientific) at 20 V for 1 hr. Membranes were stained with Ponceau S (Sigma-Aldrich) to assess efficiency of protein transfer. After blocking with 5% milk/PBS-T (PBS + 0.05% Tween), membranes were incubated with mouse anti-GFP (Roche) (1:1000 in 5% milk in PBS-T) or anti-BB2 antibody (hybridoma mouse monoclonal, clone BB2) (1:5 in 5% milk in PBS-T) at 4°C overnight on a lab rocker, then washed with PBS-T and incubated with HRP-conjugated anti-mouse secondary antibody (1:2500 in 5% milk in PBS-T) at room temperature for 1 hr. Membranes were washed with PBS-T and incubated with Amersham ECL Prime Western Blotting Detection Reagent (GE Healthcare) following the manufacturer’s instructions. Proteins were visualised using Amersham Hyperfilm ECL (GE Healthcare).

Fluorescent immunolocalisation

Cells were fixed with 4% paraformaldehyde for 10 min on ice. Fixation was stopped with 0.1 M glycine. Cells were added to polylysine-coated slides and permeabilised with 0.1% Triton X-100. The slides were blocked with 2% BSA. Rabbit anti-GFP primary antibody (Thermo Fisher Scientific A-11122) was used at 1:500 dilution, and secondary Alexa Fluor-568 or -488 goat anti-rabbit antibody (Thermo Fisher Scientific) was used at 1:1000 dilution. Images were taken with a Zeiss Axio Imager microscope.

Chromatin immunoprecipitation and sequencing

As previously described (Staneva et al., 2021), 3.5×10⁸ parasites were fixed with 0.8% formaldehyde for 20 min at room temperature. Cells were lysed and sonicated in the presence of 0.2% SDS for 30 cycles (30 s ON, 30 s OFF) using the high setting on a Bioruptor sonicator (Diagenode). Cell debris was pelleted by centrifugation, and SDS in the lysate supernatants was diluted to 0.07%. Input samples were taken before incubating the rest of the cell lysates overnight with 10 μg rabbit anti-GFP antibody (Thermo Fisher Scientific A-11122) and Protein G Dynabeads. The beads were washed, and the DNA eluted from them was treated with RNase and Proteinase K. DNA was then purified using a QIAquick PCR purification kit (QIAGEN), and libraries were prepared using NEXTflex barcoded adapters (Bio Scientific). The libraries were sequenced on Illumina NextSeq (Western General Hospital, Edinburgh). In all cases, 75 bp paired-end sequencing was performed. Our subsequent analyses were based on two replicates for all TALEs.

ChIP-seq data analysis

Sequencing data were mapped to the Tb427V12 genome build (Rabuffo et al., 2024) using Bowtie 2 (version 2.4.2), with duplicate reads removed using SAMtools (Danecek et al., 2021). The default mode of Bowtie 2 was used, which searches for multiple alignments and reports the best one or, if several alignments are deemed equally good, reports one of those randomly. Peaks were identified using MACS2 (version 2.2.7.1) broad peak calling. The ChIP samples were normalised to their respective inputs (ratio of ChIP to input reads) and the genome overview was generated using deepTools (Ramírez et al., 2016) with a 5 bp sliding window.

Background enrichment calculation

The genome was divided into 50 bp sliding windows, and each window was annotated based on overlapping genomic features, including CIR147, 177 bp repeats, 70 bp repeats, and telomeric (TTAGGG)n repeats. Windows that did not overlap with any of these annotated repeat elements were defined as ‘background’ regions and used to establish the baseline ChIP-seq signal. Enrichment for each window was calculated using bamCompare, as log2(IP/Input). To adjust for background signal amongst all samples, enrichment values for each sample were further normalised against the corresponding No-YFP ChIP-seq dataset.
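The calculation described above can be sketched in a few lines. The sketch assumes per-window read counts are already available, uses non-overlapping 50 bp bins rather than sliding windows, adds a pseudocount of 1 to avoid division by zero, and interprets "normalised against the No-YFP dataset" as subtraction of log2 values. All of these choices, and every function name, are assumptions made for illustration rather than details taken from the text.

```python
import math
from statistics import median


def log2_enrichment(ip_counts, input_counts, pseudocount=1.0):
    """Per-window log2(IP/Input); the pseudocount is an assumed safeguard."""
    return [
        math.log2((ip + pseudocount) / (inp + pseudocount))
        for ip, inp in zip(ip_counts, input_counts)
    ]


def label_windows(n_windows, window_size, repeats):
    """Label each window by the repeat it overlaps, else 'background'.

    `repeats` holds (start, end, label) tuples with half-open coordinates.
    """
    labels = ["background"] * n_windows
    for start, end, label in repeats:
        first = start // window_size
        last = min((end - 1) // window_size, n_windows - 1)
        for w in range(first, last + 1):
            labels[w] = label
    return labels


def normalise_to_control(sample_log2, control_log2):
    """Subtract the No-YFP log2 enrichment window by window (assumed form)."""
    return [s - c for s, c in zip(sample_log2, control_log2)]


def median_by_label(values, labels):
    """Median enrichment for each window class, including 'background'."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: median(vals) for label, vals in groups.items()}


# Toy counts for four 50 bp windows; windows 2 and 3 overlap a repeat array.
labels = label_windows(4, 50, [(100, 200, "177bp")])
tale = log2_enrichment([3, 3, 15, 31], [1, 1, 1, 1])
no_yfp = log2_enrichment([1, 1, 1, 1], [1, 1, 1, 1])
summary = median_by_label(normalise_to_control(tale, no_yfp), labels)
```

The 'background' median then gives the baseline against which repeat-associated windows can be compared.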

Affinity purification and LC-MS/MS proteomic analysis

As previously described (Staneva et al., 2021), 3.5×10⁸ cells were lysed per IP in the presence of 0.2% NP-40 and 150 mM KCl. Lysates were sonicated briefly (three cycles, 12 s ON, 12 s OFF) at a high setting in a Bioruptor (Diagenode) sonicator. The soluble and insoluble fractions were separated by centrifugation, and the soluble fraction was incubated for 1 hr at 4°C with beads cross-linked to mouse anti-GFP antibody (Roche 11814460001). The resulting immunoprecipitates were washed three times with lysis buffer, and protein was eluted with RapiGest SF Surfactant (Waters) for 15 min at 55°C. Next, filter-aided sample preparation (FASP) (Wiśniewski et al., 2009) was used to digest the protein samples for mass spectrometric analysis. Briefly, proteins were reduced with DTT and then denatured with 8 M urea in Vivacon spin (filter) column 30 K cartridges. Samples were alkylated with 0.05 M IAA and digested with 0.5 μg MS-grade Pierce trypsin protease (Thermo Fisher Scientific) overnight, desalted using stage tips (Rappsilber et al., 2007), and resuspended in 0.1% TFA for LC-MS/MS. Peptides were separated using RSLC Ultimate 3000 system (Thermo Fisher Scientific) fitted with an EasySpray column (50 cm; Thermo Fisher Scientific) using 2%–40%–95% nonlinear gradients with solvent A (0.1% formic acid) and solvent B (80% acetonitrile in 0.1% formic acid). The EasySpray column was directly coupled to an Orbitrap Fusion Lumos (Thermo Fisher Scientific) operated in DDA mode. ‘TopSpeed’ mode was used with 3 s cycles with standard settings to maximise identification rates: MS1 scan range 350–1500 m/z, RF lens 30%, AGC target 4.0e5 with intensity threshold 5.0e3, filling time 50 ms and resolution 120,000, monoisotopic precursor selection, and filter for charge states 2–5.

HCD (27%) was selected as fragmentation mode. MS2 scans were performed using an ion trap mass analyser operated in rapid mode with AGC set to 2.0e4 and filling time to 50 ms. The dynamic exclusion was set at 60 s.

The MaxQuant software platform (Cox and Mann, 2008) version 1.6.1.0 was used to process the raw files, and the search was conducted against the T. brucei brucei complete/reference proteome (UniProt, released in April 2019) using the Andromeda search engine (Cox et al., 2011). For the first search, peptide tolerance was set to 20 ppm, while for the main search, it was set to 4.5 ppm. The isotope mass tolerance was set to 2 ppm, with a maximum charge of 7. Digestion mode was set to ‘specific’ with trypsin, allowing a maximum of two missed cleavages. Carbamidomethylation of cysteine was set as a fixed modification. Oxidation of methionine was set as a variable modification. Label-free quantitation analysis was performed by employing the MaxLFQ algorithm as described by Cox et al., 2014. Absolute protein quantification was performed as described in Schwanhäusser et al., 2011. Peptide and protein identifications were filtered to 1% FDR. Statistical analysis and visualisation were performed using Perseus version 1.6.2.1 (Tyanova et al., 2016).
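
To make the search tolerances above concrete, the following Python sketch shows how a mass tolerance expressed in parts per million translates into an acceptance test for a precursor measurement. It is only an illustration of the arithmetic and not a description of how MaxQuant or Andromeda apply these settings internally; the function names and the example m/z values are hypothetical.

```python
FIRST_SEARCH_PPM = 20.0  # first-search peptide tolerance stated in the text
MAIN_SEARCH_PPM = 4.5    # main-search peptide tolerance stated in the text

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def tolerance_window(theoretical_mz, tol_ppm):
    """Absolute half-width (in m/z units) of a ppm tolerance at a given m/z."""
    return theoretical_mz * tol_ppm / 1e6

def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True if the observed value lies within +/- tol_ppm of the theoretical one."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm

if __name__ == "__main__":
    # Hypothetical precursor: theoretical m/z 500.000, observed m/z 500.005.
    theoretical, observed = 500.000, 500.005
    print(ppm_error(observed, theoretical))
    print(within_tolerance(observed, theoretical, FIRST_SEARCH_PPM))
    print(within_tolerance(observed, theoretical, MAIN_SEARCH_PPM))
```

With these hypothetical values, the measurement deviates by about 10 ppm, so it would pass a 20 ppm window but fall outside a 4.5 ppm window.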

Acknowledgements

The authors thank Alison Pidoux for comments on the manuscript and assistance with images, Bungo Akiyoshi for comments and discussion, and Shaun Webb of the Wellcome Centre for Cell Biology and Discovery Research Platform for Hidden Cell Biology Bioinformatics Core for maintaining servers and pipelines for processing sequencing data. The authors also thank Julie Young for laboratory management support for trypanosome culture during this project. This work was funded by a UKRI/BBSRC EastBio PhD studentship supporting Tadhg Devlin (BB/M010996/1), an MRC Research Grant awarded to RCA and KRM and supporting RC (MR/T04702X/1), a Wellcome Investigator Award to KRM (221717), a Wellcome Principal Research Fellowship to RCA supporting RC and TA (200885; 224358), a Wellcome Instrument grant to JR (108504), and core funding for the Wellcome Centre for Cell Biology (203149) and subsequently the Wellcome funded Discovery Research Platform for Hidden Cell Biology DRP-HCB supporting CS (226791). For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Appendix 1

Appendix 1—key resources table.

Reagent type (species) or resource Designation Source or reference Identifiers Additional information
Gene (T. brucei) TRF gene TriTrypDB Tb927.11.1000 Telomere Repeat binding Factor
Gene (T. brucei) KKT2 gene TriTrypDB Tb927.10.12850 Kinetoplastid KineTochore protein 2
Strain background (T. brucei Lister 427) T. brucei Lister 427 Standard RRID:CVCL_K226 Bloodstream form, monomorphic
Cell line (T. brucei) T. brucei Lister 427 bloodstream form Standard laboratory strain RRID:CVCL_K226 Monomorphic strain, used as parental line
Cell line (T. brucei) YFP-TRF Akiyoshi and Gull, 2014; Staneva et al., 2021 N/A T. brucei 427 expressing YFP-tagged TRF (Tb927.10.12850)
Cell line (T. brucei) YFP-KKT2 Akiyoshi and Gull, 2014; Staneva et al., 2021 N/A T. brucei 427 expressing YFP-tagged KKT2 (Tb927.10.12850)
Cell line (T. brucei) TelR-TALE-YFP This paper N/A T. brucei 427 expressing TelR-TALE-YFP
Cell line (T. brucei) 70R-TALE-YFP This paper N/A T. brucei 427 expressing 70R-TALE-YFP
Cell line (T. brucei) 147R-TALE-YFP This paper N/A T. brucei 427 expressing 147R-TALE-YFP
Cell line (T. brucei) 177R-TALE-YFP This paper N/A T. brucei 427 expressing 177R-TALE-YFP
Cell line (T. brucei) ingiR-TALE-YFP This paper N/A T. brucei 427 expressing ingiR-TALE-YFP
Cell line (T. brucei) NonR-TALE-YFP This paper N/A T. brucei 427 expressing NonR-TALE-YFP
Transfected construct (E. coli) pTALE-TelR This paper De novo construct Plasmid to generate TelR-TALE-YFP cell line
Transfected construct (E. coli) pTALE-70R This paper De novo construct Plasmid to generate 70R-TALE-YFP cell line
Transfected construct (E. coli) pTALE-147R This paper De novo construct Plasmid to generate 147R-TALE-YFP cell line
Transfected construct (E. coli) pTALE-177R This paper De novo construct Plasmid to generate 177R-TALE-YFP cell line
Transfected construct (E. coli) pTALE-ingiR This paper De novo construct Plasmid to generate ingiR-TALE-YFP cell line
Transfected construct (E. coli) pTALE-NonR This paper De novo construct Plasmid to generate NonR-TALE-YFP cell line
Recombinant DNA reagent pPOTv4 TRF fusion PCR As in Staneva et al., 2021 N/A For endogenous YFP-tagging of TRF
Recombinant DNA reagent pPOTv4 KKT2 fusion PCR As in Staneva et al., 2021 N/A For endogenous YFP-tagging of KKT2
Recombinant DNA reagent pPOTv4 RPA2 fusion PCR As in Staneva et al., 2021 N/A For endogenous YFP-tagging of RPA2
Antibody Mouse anti-Ty1 (BB2) (Monoclonal) Thermo Fisher Scientific Cat# MA5-23513; RRID:AB_2610643 (1:5) for western blots
Antibody Mouse anti-GFP (Monoclonal) Roche Cat# 11814460001; RRID:AB_390913 Used for Affinity Purification (beads cross-linked)
Antibody Mouse anti-GFP (Monoclonal) Roche Cat# 11814460001; RRID:AB_390913 (1:1000) for western blots
Antibody Goat anti-mouse IgG (H+L) Alexa Fluor 568 Thermo Fisher Scientific Cat# A-11004; RRID:AB_2534072 (1:1000) for IF
Antibody Goat anti-mouse IgG (H+L) HRP-conjugated Thermo Fisher Scientific Cat# 31430; RRID:AB_228307 (1:5000) for western blot
Sequence-based reagent TelR-TALE target sequence This paper N/A Designed to bind AGGGTTAGGGTTAGG. TALE in vivo truncation recognises AGGGTTAG
Sequence-based reagent 70R-TALE target sequence This paper N/A AGGAGAGTGTTGTGA
Sequence-based reagent 147R-TALE target sequence This paper N/A GCAGCGTTGTGCATG
Sequence-based reagent ingiR-TALE target sequence This paper N/A GCCGGCCACCTCAAC
Sequence-based reagent NonR-TALE target sequence This paper N/A GGAAGTATACCTGGC (no genomic match)
Sequence-based reagent TALE-PCR-LP (Primer) This paper N/A Forward primer for integration check (Figure 1—figure supplement 1)
Sequence-based reagent TALE-PCR-RP (Primer) This paper N/A Reverse primer for integration check (see Figure 1—figure supplement 1)
Peptide, recombinant protein Protein G Dynabeads Thermo Fisher Scientific Cat# 10004D Used for ChIP
Peptide, recombinant protein Trypsin Protease, MS Grade Thermo Fisher Scientific (Pierce) Cat# 90057 Used for protein digestion
Commercial assay or kit TALEN module kit Ding et al., 2013. N/A Used for TALE assembly
Commercial assay or kit NEXTflex barcoded adapters Bio Scientific N/A Used for library preparation
Commercial assay or kit NuPAGE Bis-Tris Mini Gels Thermo Fisher Scientific Cat# NP0321BOX For protein separation
Commercial assay or kit NuPAGE LDS Sample Buffer (4×) Thermo Fisher Scientific Cat# NP0007 For western blot sample preparation
Commercial assay or kit Amersham ECL Prime GE Healthcare Cat# RPN2232 Western blot detection
Commercial assay or kit Amersham Hyperfilm ECL GE Healthcare Cat# 28906839 Film for western blot visualisation
Commercial assay or kit Vivacon 500 (30 K MWCO) Sartorius Cat# VN01H22 Spin filters used for FASP protocol
Chemical compound, drug RapiGest SF Surfactant Waters Cat# 186001861 Used for protein elution in AP-MS
Chemical compound, drug HMI-9 medium Standard N/A For T. brucei culture
Chemical compound, drug Fetal Calf Serum (FCS) Gibco (Thermo Fisher Scientific) Cat# 10500064 10% supplement for HMI-9
Chemical compound, drug Ponceau S Sigma-Aldrich Cat# P3504 Membrane staining
Chemical compound, drug Dithiothreitol (DTT) Sigma-Aldrich (or similar) Cat# D0632 Reducing agent
Chemical compound, drug Iodoacetamide (IAA) Sigma-Aldrich (or similar) Cat# I1149 Alkylating agent
Chemical compound, drug Urea Sigma-Aldrich (or similar) Cat# U5378 Denaturing agent (8 M)
Chemical compound, drug Bovine Serum Albumin (BSA) Sigma-Aldrich (or similar) N/A Blocking agent (2%) for IF
Chemical compound, drug Triton X-100 Sigma-Aldrich (or similar) N/A Permeabilisation (0.1%) for IF
Chemical compound, drug Paraformaldehyde Sigma-Aldrich (or similar) N/A Fixation (4%) for IF
Chemical compound, drug Blasticidin S InvivoGen or similar Cat# ant-bl-1 (10 µg/ml) Used for TALE-YFP selection
Software, algorithm MaxQuant Cox and Mann, 2008 RRID:SCR_014485 v.2.0.3.0 used for proteomic analysis
Software, algorithm Perseus Tyanova et al., 2016 RRID:SCR_015753 v.1.6.15.0 used for proteomic statistical analysis
Software, algorithm Bowtie2 Langmead and Salzberg, 2012 RRID:SCR_016368 v.2.4.2 used for ChIP-seq alignment
Software, algorithm MACS2 Zhang et al., 2008 v.2.2.7.1 used for peak calling
Software, algorithm SAMtools Danecek et al., 2021 RRID:SCR_002105 Used for removing duplicate reads
Software, algorithm deepTools Ramírez et al., 2016 RRID:SCR_016366 Used for genome overview
Other Orbitrap Fusion Lumos Thermo Fisher Scientific N/A Mass spectrometer used
Other Zeiss Axio Imager Zeiss N/A Microscope used for imaging
Other Illumina NextSeq 500/550 Illumina N/A Used for ChIP-seq library sequencing
Other (software, algorithm) TriTrypDB https://tritrypdb.org RRID:SCR_007043 Database used for T. brucei gene identifiers
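
The TALE target sequences listed in the table above can be checked against a DNA sequence with a short Python sketch such as the one below. It is an illustrative helper rather than the authors' design procedure: the function names are hypothetical, only exact matches on either strand are considered, and the 177R-TALE is omitted because its target sequence is not given in the table.

```python
# Target sequences copied from the key resources table.
TALE_TARGETS = {
    "TelR-TALE": "AGGGTTAGGGTTAGG",
    "70R-TALE": "AGGAGAGTGTTGTGA",
    "147R-TALE": "GCAGCGTTGTGCATG",
    "ingiR-TALE": "GCCGGCCACCTCAAC",
    "NonR-TALE": "GGAAGTATACCTGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_sites(sequence, target):
    """Return sorted 0-based start positions of exact matches on either strand.

    Overlapping matches are reported; positions refer to the given strand.
    """
    hits = set()
    for motif in {target, reverse_complement(target)}:
        pos = sequence.find(motif)
        while pos != -1:
            hits.add(pos)
            pos = sequence.find(motif, pos + 1)
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical test sequence: five copies of the telomeric TTAGGG repeat.
    telomere = "TTAGGG" * 5
    for name, target in TALE_TARGETS.items():
        print(name, find_sites(telomere, target))
```

On this short synthetic telomeric array, only the TelR-TALE target produces matches, once per repeat unit wherever the full 15-nucleotide site fits.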

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Contributor Information

Keith R Matthews, Email: keith.matthews@ed.ac.uk.

Robin C Allshire, Email: robin.allshire@ed.ac.uk.

Yamini Dalal, National Cancer Institute, Bethesda, United States.

Funding Information

This paper was supported by the following grants:

  • Biotechnology and Biological Sciences Research Council BB/M010996/1 to Tadhg Devlin.

  • Medical Research Council MR/T04702X/1 to Keith R Matthews, Robin C Allshire.

  • Wellcome 10.35802/221717 to Keith R Matthews.

  • Wellcome 10.35802/200885 to Robin C Allshire.

  • Wellcome 10.35802/224358 to Robin C Allshire.

  • Wellcome 10.35802/108504 to Juri Rappsilber.

  • Wellcome 10.35802/203149 to Robin C Allshire.

  • Wellcome 10.35802/226791 to Robin C Allshire.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft.

Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Formal analysis, Investigation, Visualization, Methodology.

Formal analysis, Methodology.

Funding acquisition.

Conceptualization, Supervision, Funding acquisition, Visualization, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Supervision, Funding acquisition, Visualization, Writing – original draft, Project administration, Writing – review and editing.

Additional files

Supplementary file 1. Proteomics analyses comparing protein enrichments in the indicated affinity selections a-to-q.

(a) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-TRF (WT NoYFP vs. YFP-TRF). Proteins enriched in YFP-TRF affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(b) Affinity selection data for wild-type cells expressing No YFP versus cells expressing TelR-TALE (WT NoYFP vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(c) Affinity selection data for wild-type cells expressing No YFP versus cells expressing NonR-TALE (WT NoYFP vs. NonR-TALE). Proteins enriched in NonR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(d) Affinity selection data for cells expressing NonR-TALE versus cells expressing TelR-TALE (NonR-TALE vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
(e) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 147R-TALE (WT NoYFP vs. 147R-TALE). Proteins enriched in 147R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(f) Affinity selection data for cells expressing NonR-TALE versus cells expressing 147R-TALE (NonR-TALE vs. 147R-TALE). Proteins enriched in 147R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
(g) Affinity selection data for wild-type cells expressing No YFP versus cells expressing ingiR-TALE (WT NoYFP vs. ingiR-TALE). Proteins enriched in ingiR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(h) Affinity selection data for cells expressing NonR-TALE versus cells expressing ingiR-TALE (NonR-TALE vs. ingiR-TALE). Proteins enriched in ingiR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
(i) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 70R-TALE (WT NoYFP vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(j) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-RPA2 (WT NoYFP vs. YFP-RPA2). Proteins enriched in YFP-RPA2 affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(k) Affinity selection data for cells expressing NonR-TALE versus cells expressing 70R-TALE (NonR-TALE vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
(l) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 177R-TALE (WT NoYFP vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(m) Affinity selection data for cells expressing NonR-TALE versus cells expressing 177R-TALE (NonR-TALE vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
(n) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-KKT2 (WT NoYFP vs. YFP-KKT2). Proteins enriched in YFP-KKT2 affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
(o) Affinity selection data for cells expressing ingiR-TALE versus cells expressing TelR-TALE (ingiR-TALE vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.
(p) Affinity selection data for cells expressing ingiR-TALE versus cells expressing 70R-TALE (ingiR-TALE vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.
(q) Affinity selection data for cells expressing ingiR-TALE versus cells expressing 177R-TALE (ingiR-TALE vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.

MDAR checklist

Data availability

Sequence Data: All NGS ChIP-seq data generated have been submitted to and will be available under an accession number at the NCBI Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/). The GEO accession number for ChIP-seq data is: GSE295698. Proteomics Data: All LC-MS/MS proteomics data generated are available on the Proteomics Identification Database (PRIDE; https://www.ebi.ac.uk/pride/archive/projects/PXD063130) with accession number PXD063130.

The following datasets were generated:

Carloni R, Devlin T, Tong P, Auchynnikava T, Spanos CR, Rappsilber J, Matthews KR, Allshire RC. 2025. Defining the chromatin-associated protein landscapes on Trypanosoma brucei repetitive elements using synthetic TALE proteins. NCBI Gene Expression Omnibus. GSE295698

Carloni R, Devlin T, Tong P, Auchynnikava T, Spanos CR, Rappsilber J, Matthews KR, Allshire RC. 2025. Defining the chromatin-associated protein landscapes on Trypanosoma brucei repetitive elements using synthetic TALE proteins. PRIDE. PXD063130

References

  1. Akiyoshi B, Gull K. Discovery of unconventional kinetochores in kinetoplastids. Cell. 2014;156:1247–1258. doi: 10.1016/j.cell.2014.01.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Allshire RC, Karpen GH. Epigenetic regulation of centromeric chromatin: old dogs, new tricks? Nature Reviews. Genetics. 2008;9:923–937. doi: 10.1038/nrg2466. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Allshire RC, Madhani HD. Ten principles of heterochromatin formation and function. Nature Reviews. Molecular Cell Biology. 2018;19:229–244. doi: 10.1038/nrm.2017.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Alsford S, Horn D. Trypanosomatid histones. Molecular Microbiology. 2004;53:365–372. doi: 10.1111/j.1365-2958.2004.04151.x. [DOI] [PubMed] [Google Scholar]
  5. Ballmer D, Carter W, van Hooff JJE, Tromer EC, Ishii M, Ludzia P, Akiyoshi B. Kinetoplastid kinetochore proteins KKT14-KKT15 are divergent Bub1/BubR1-Bub3 proteins. Open Biology. 2024;14:240025. doi: 10.1098/rsob.240025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Barcons-Simon A, Carrington M, Siegel TN. Decoding the impact of nuclear organization on antigenic variation in parasites. Nature Microbiology. 2023;8:1408–1418. doi: 10.1038/s41564-023-01424-9. [DOI] [PubMed] [Google Scholar]
  7. Bizhanova A, Kaufman PD. Close to the edge: Heterochromatin at the nucleolar and nuclear peripheries. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms. 2021;1864:194666. doi: 10.1016/j.bbagrm.2020.194666. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Blackburn EH, Challoner PB. Identification of a telomeric DNA sequence in Trypanosoma brucei. Cell. 1984;36:447–457. doi: 10.1016/0092-8674(84)90238-1. [DOI] [PubMed] [Google Scholar]
  9. Boothroyd CE, Dreesen O, Leonova T, Ly KI, Figueiredo LM, Cross GAM, Papavasiliou FN. A yeast-endonuclease-generated DNA break induces antigenic switching in Trypanosoma brucei. Nature. 2009;459:278–281. doi: 10.1038/nature07982. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bourque G, Burns KH, Gehring M, Gorbunova V, Seluanov A, Hammell M, Imbeault M, Izsvák Z, Levin HL, Macfarlan TS, Mager DL, Feschotte C. Ten things you should know about transposable elements. Genome Biology. 2018;19:199. doi: 10.1186/s13059-018-1577-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Brahma S, Henikoff S. RSC-Associated Subnucleosomes define MNase-Sensitive promoters in yeast. Molecular Cell. 2019;73:238–249. doi: 10.1016/j.molcel.2018.10.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Bringaud F, Müller M, Cerqueira GC, Smith M, Rochette A, El-Sayed NMA, Papadopoulou B, Ghedin E. Members of a large retroposon family are determinants of post-transcriptional gene expression in Leishmania. PLOS Pathogens. 2007;3:0136. doi: 10.1371/journal.ppat.0030136. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bringaud F, Ghedin E, El-Sayed NMA, Papadopoulou B. Role of transposable elements in trypanosomatids. Microbes and Infection. 2008;10:575–581. doi: 10.1016/j.micinf.2008.02.009. [DOI] [PubMed] [Google Scholar]
  14. Bringaud F, Berriman M, Hertz-Fowler C. Trypanosomatid genomes contain several subfamilies of ingi-related retroposons. Eukaryotic Cell. 2009;8:1532–1542. doi: 10.1128/EC.00183-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Cavalier-Smith T. Kingdoms Protozoa and Chromista and the eozoan root of the eukaryotic tree. Biology Letters. 2010;6:342–345. doi: 10.1098/rsbl.2009.0948. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Cosentino RO, Brink BG, Siegel TN. Allele-specific assembly of a eukaryotic genome corrects apparent frameshifts and reveals a lack of nonsense-mediated mRNA decay. NAR Genomics and Bioinformatics. 2021;3:lqab082. doi: 10.1093/nargab/lqab082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology. 2008;26:1367–1372. doi: 10.1038/nbt.1511. [DOI] [PubMed] [Google Scholar]
  18. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M. Andromeda: a peptide search engine integrated into the MaxQuant environment. Journal of Proteome Research. 2011;10:1794–1805. doi: 10.1021/pr101065j. [DOI] [PubMed] [Google Scholar]
  19. Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Molecular & Cellular Proteomics. 2014;13:2513–2526. doi: 10.1074/mcp.M113.031591. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10:giab008. doi: 10.1093/gigascience/giab008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. D’Archivio S, Wickstead B. Trypanosome outer kinetochore proteins suggest conservation of chromosome segregation machinery across eukaryotes. The Journal of Cell Biology. 2017;216:379–391. doi: 10.1083/jcb.201608043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Deák G, Wapenaar H, Sandoval G, Chen R, Taylor MRD, Burdett H, Watson JA, Tuijtel MW, Webb S, Wilson MD. Histone divergence in trypanosomes results in unique alterations to nucleosome structure. Nucleic Acids Research. 2023;51:7882–7899. doi: 10.1093/nar/gkad577. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Dean S, Sunter J, Wheeler RJ, Hodkinson I, Gluenz E, Gull K. A toolkit enabling efficient, scalable and reproducible gene tagging in trypanosomatids. Open Biology. 2015;5:140197. doi: 10.1098/rsob.140197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. de Lima LP, Poubel SB, Yuan Z-F, Rosón JN, de L Vitorino FN, Holetz FB, Garcia BA, da Cunha JPC. Improvements on the quantitative analysis of Trypanosoma cruzi histone post translational modifications: Study of changes in epigenetic marks through the parasite’s metacyclogenesis and life cycle. Journal of Proteomics. 2020;225:103847. doi: 10.1016/j.jprot.2020.103847. [DOI] [PubMed] [Google Scholar]
  25. Denninger V, Rudenko G. FACT plays a major role in histone dynamics affecting VSG expression site control in Trypanosoma brucei. Molecular Microbiology. 2014;94:945–962. doi: 10.1111/mmi.12812. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Ding Q, Lee YK, Schaefer EAK, Peters DT, Veres A, Kim K, Kuperwasser N, Motola DL, Meissner TB, Hendriks WT, Trevisan M, Gupta RM, Moisan A, Banks E, Friesen M, Schinzel RT, Xia F, Tang A, Xia Y, Figueroa E, Wann A, Ahfeldt T, Daheron L, Zhang F, Rubin LL, Peng LF, Chung RT, Musunuru K, Cowan CA. A TALEN genome-editing system for generating human stem cell-based disease models. Cell Stem Cell. 2013;12:238–251. doi: 10.1016/j.stem.2012.11.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Echeverry MC, Bot C, Obado SO, Taylor MC, Kelly JM. Centromere-associated repeat arrays on Trypanosoma brucei chromosomes are much more extensive than predicted. BMC Genomics. 2012;13:29. doi: 10.1186/1471-2164-13-29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Elias M, Vargas NS, Zingales B, Schenkman S. Organization of satellite DNA in the genome of Trypanosoma cruzi. Molecular and Biochemical Parasitology. 2003;129:1–9. doi: 10.1016/s0166-6851(03)00054-9. [DOI] [PubMed] [Google Scholar]
  29. Ersfeld K, Gull K. Partitioning of large and minichromosomes in Trypanosoma brucei. Science. 1997;276:611–614. doi: 10.1126/science.276.5312.611. [DOI] [PubMed] [Google Scholar]
  30. Ersfeld K. Nuclear architecture, genome and chromatin organisation in Trypanosoma brucei. Research in Microbiology. 2011;162:626–636. doi: 10.1016/j.resmic.2011.01.014. [DOI] [PubMed] [Google Scholar]
  31. Feschotte C. Transposable elements and the evolution of regulatory networks. Nature Reviews Genetics. 2008;9:397–405. doi: 10.1038/nrg2337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Figueiredo LM, Cross GAM, Janzen CJ. Epigenetic regulation in African trypanosomes: a new kid on the block. Nature Reviews. Microbiology. 2009;7:504–513. doi: 10.1038/nrmicro2149. [DOI] [PubMed] [Google Scholar]
  33. Fueyo R, Judd J, Feschotte C, Wysocka J. Roles of transposable elements in the regulation of mammalian transcription. Nature Reviews. Molecular Cell Biology. 2022;23:481–497. doi: 10.1038/s41580-022-00457-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Gao XD, Tu LC, Mir A, Rodriguez T, Ding Y, Leszyk J, Dekker J, Shaffer SA, Zhu LJ, Wolfe SA, Sontheimer EJ. C-BERST: defining subnuclear proteomic landscapes at genomic elements with dCas9-APEX2. Nature Methods. 2018;15:433–436. doi: 10.1038/s41592-018-0006-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Gauchier M, van Mierlo G, Vermeulen M, Déjardin J. Purification and enrichment of specific chromatin loci. Nature Methods. 2020;17:380–389. doi: 10.1038/s41592-020-0765-4. [DOI] [PubMed] [Google Scholar]
  36. Glover L, Alsford S, Horn D. DNA break site at fragile subtelomeres determines probability and mechanism of antigenic variation in African trypanosomes. PLOS Pathogens. 2013;9:e1003260. doi: 10.1371/journal.ppat.1003260. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Glover L, Marques CA, Suska O, Horn D. Persistent DNA damage foci and DNA replication with a broken chromosome in the African Trypanosome. mBio. 2019;10:e01252-19. doi: 10.1128/mBio.01252-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Hertz-Fowler C, Figueiredo LM, Quail MA, Becker M, Jackson A, Bason N, Brooks K, Churcher C, Fahkro S, Goodhead I, Heath P, Kartvelishvili M, Mungall K, Harris D, Hauser H, Sanders M, Saunders D, Seeger K, Sharp S, Taylor JE, Walker D, White B, Young R, Cross GAM, Rudenko G, Barry JD, Louis EJ, Berriman M. Telomeric expression sites are highly conserved in Trypanosoma brucei. PLOS ONE. 2008;3:e3527. doi: 10.1371/journal.pone.0003527. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Hirumi H, Hirumi K. Continuous cultivation of Trypanosoma brucei blood stream forms in a medium containing a low concentration of serum protein without feeder cell layers. The Journal of Parasitology. 1989;75:985–989. doi: 10.2307/3282883. [DOI] [PubMed] [Google Scholar]
  40. Hovel-Miner G, Mugnier MR, Goldwater B, Cross GAM, Papavasiliou FN. A conserved DNA repeat promotes selection of a diverse repertoire of Trypanosoma brucei surface antigens from the genomic archive. PLOS Genetics. 2016;12:e1005994. doi: 10.1371/journal.pgen.1005994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Kazazian HH. Mobile elements: drivers of genome evolution. Science. 2004;303:1626–1632. doi: 10.1126/science.1089670. [DOI] [PubMed] [Google Scholar]
  42. Kim HS, Cross GAM. TOPO3alpha influences antigenic variation by monitoring expression-site-associated VSG switching in Trypanosoma brucei. PLOS Pathogens. 2010;6:e1000992. doi: 10.1371/journal.ppat.1000992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kraus AJ, Vanselow JT, Lamer S, Brink BG, Schlosser A, Siegel TN. Distinct roles for H4 and H2A.Z acetylation in RNA transcription in African trypanosomes. Nature Communications. 2020;11:1498. doi: 10.1038/s41467-020-15274-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Leal AZ, Schwebs M, Briggs E, Weisert N, Reis H, Lemgruber L, Luko K, Wilkes J, Butter F, McCulloch R, Janzen CJ. Genome maintenance functions of a putative Trypanosoma brucei translesion DNA polymerase include telomere association and a role in antigenic variation. Nucleic Acids Research. 2020;48:9660–9680. doi: 10.1093/nar/gkaa686. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Li B. Keeping balance between genetic stability and plasticity at the telomere and subtelomere of Trypanosoma brucei. Frontiers in Cell and Developmental Biology. 2021;9:699639. doi: 10.3389/fcell.2021.699639. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Li B. Telomere maintenance in African trypanosomes. Frontiers in Molecular Biosciences. 2023;10:1302557. doi: 10.3389/fmolb.2023.1302557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. MacGregor P, Matthews KR. Identification of the regulatory elements controlling the transmission stage-specific gene expression of PAD1 in Trypanosoma brucei. Nucleic Acids Research. 2012;40:7705–7717. doi: 10.1093/nar/gks533.
  49. Marchetti MA, Tschudi C, Kwon H, Wolin SL, Ullu E. Import of proteins into the trypanosome nucleus and their distribution at karyokinesis. Journal of Cell Science. 2000;113:899–906. doi: 10.1242/jcs.113.5.899.
  50. Maree JP, Tvardovskiy A, Ravnsborg T, Jensen ON, Rudenko G, Patterton HG. Trypanosoma brucei histones are heavily modified with combinatorial post-translational modifications and mark Pol II transcription start regions with hyperacetylated H2A. Nucleic Acids Research. 2022;50:9705–9723. doi: 10.1093/nar/gkac759.
  51. Miga KH, Alexandrov IA. Variation and evolution of human centromeres: a field guide and perspective. Annual Review of Genetics. 2021;55:583–602. doi: 10.1146/annurev-genet-071719-020519.
  52. Moore R, Chandrahas A, Bleris L. Transcription activator-like effectors: a toolkit for synthetic biology. ACS Synthetic Biology. 2014;3:708–716. doi: 10.1021/sb400137b.
  53. Morrison LJ, Steketee PC, Tettey MD, Matthews KR. Pathogenicity and virulence of African trypanosomes: From laboratory models to clinically relevant hosts. Virulence. 2023;14:2150445. doi: 10.1080/21505594.2022.2150445.
  54. Mugnier MR, Cross GAM, Papavasiliou FN. The in vivo dynamics of antigenic variation in Trypanosoma brucei. Science. 2015;347:1470–1473. doi: 10.1126/science.aaa4502.
  55. Müller LSM, Cosentino RO, Förstner KU, Guizetti J, Wedel C, Kaplan N, Janzen CJ, Arampatzi P, Vogel J, Steinbiss S, Otto TD, Saliba AE, Sebra RP, Siegel TN. Genome organization and DNA accessibility control antigenic variation in trypanosomes. Nature. 2018;563:121–125. doi: 10.1038/s41586-018-0619-8.
  56. Myers SA, Wright J, Peckner R, Kalish BT, Zhang F, Carr SA. Discovery of proteins associated with a predefined genomic locus via dCas9-APEX-mediated proximity labeling. Nature Methods. 2018;15:437–439. doi: 10.1038/s41592-018-0007-1.
  57. Nerusheva OO, Ludzia P, Akiyoshi B. Identification of four unconventional kinetoplastid kinetochore proteins KKT22-25 in Trypanosoma brucei. Open Biology. 2019;9:190236. doi: 10.1098/rsob.190236.
  58. Obado SO, Taylor MC, Wilkinson SR, Bromley EV, Kelly JM. Functional mapping of a trypanosome centromere by chromosome fragmentation identifies a 16-kb GC-rich transcriptional “strand-switch” domain as a major feature. Genome Research. 2005;15:36–43. doi: 10.1101/gr.2895105.
  59. Obado SO, Bot C, Nilsson D, Andersson B, Kelly JM. Repetitive DNA is associated with centromeric domains in Trypanosoma brucei but not Trypanosoma cruzi. Genome Biology. 2007;8:R37. doi: 10.1186/gb-2007-8-3-r37.
  60. Patrick KL, Shi H, Kolev NG, Ersfeld K, Tschudi C, Ullu E. Distinct and overlapping roles for two Dicer-like proteins in the RNA interference pathways of the ancient eukaryote Trypanosoma brucei. PNAS. 2009;106:17933–17938. doi: 10.1073/pnas.0907766106.
  61. Pfeiffer V, Lingner J. Replication of telomeres and the regulation of telomerase. Cold Spring Harbor Perspectives in Biology. 2013;5:a010405. doi: 10.1101/cshperspect.a010405.
  62. Picchi GFA, Zulkievicz V, Krieger MA, Zanchin NT, Goldenberg S, de Godoy LMF. Post-translational modifications of Trypanosoma cruzi canonical and variant histones. Journal of Proteome Research. 2017;16:1167–1179. doi: 10.1021/acs.jproteome.6b00655.
  63. Pyrih J, Hammond M, Alves A, Dean S, Sunter JD, Wheeler RJ, Gull K, Lukeš J. Comprehensive sub-mitochondrial protein map of the parasitic protist Trypanosoma brucei defines critical features of organellar biology. Cell Reports. 2023;42:113083. doi: 10.1016/j.celrep.2023.113083.
  64. Rabuffo C, Schmidt MR, Yadav P, Tong P, Carloni R, Barcons-Simon A, Cosentino RO, Krebs S, Matthews KR, Allshire RC, Siegel TN. Inter-chromosomal transcription hubs shape the 3D genome architecture of African trypanosomes. Nature Communications. 2024;15:10716. doi: 10.1038/s41467-024-55285-9.
  65. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dündar F, Manke T. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Research. 2016;44:W160–W165. doi: 10.1093/nar/gkw257.
  66. Rappsilber J, Mann M, Ishihama Y. Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nature Protocols. 2007;2:1896–1906. doi: 10.1038/nprot.2007.261.
  67. Reis H, Schwebs M, Dietz S, Janzen CJ, Butter F. TelAP1 links telomere complexes with developmental expression site silencing in African trypanosomes. Nucleic Acids Research. 2018;46:2820–2833. doi: 10.1093/nar/gky028.
  68. Rico E, Jeacock L, Kovářová J, Horn D. Inducible high-efficiency CRISPR-Cas9-targeted gene editing and precision base editing in African trypanosomes. Scientific Reports. 2018;8:7960. doi: 10.1038/s41598-018-26303-w.
  69. Saha A, Gaurav AK, Pandya UM, Afrin M, Sandhu R, Nanavaty V, Schnur B, Li B. TbTRF suppresses the TERRA level and regulates the cell cycle-dependent TERRA foci number with a TERRA binding activity in its C-terminal Myb domain. Nucleic Acids Research. 2021;49:5637–5653. doi: 10.1093/nar/gkab401.
  70. Sandhu R, Li B. Telomerase activity is required for the telomere G-overhang structure in Trypanosoma brucei. Scientific Reports. 2017;7:15983. doi: 10.1038/s41598-017-16182-y.
  71. Schumann Burkard G, Jutzi P, Roditi I. Genome-wide RNAi screens in bloodstream form trypanosomes identify drug transporters. Molecular and Biochemical Parasitology. 2011;175:91–94. doi: 10.1016/j.molbiopara.2010.09.002.
  72. Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M. Global quantification of mammalian gene expression control. Nature. 2011;473:337–342. doi: 10.1038/nature10098.
  73. Skene PJ, Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife. 2017;6:e21856. doi: 10.7554/eLife.21856.
  74. Sloof P, Menke HH, Caspers MPM, Borst P. Size fractionation of Trypanosoma brucei DNA: localization of the 177-bp repeat satellite DNA and a variant surface glycoprotein gene in a mini-chromosomal DNA fraction. Nucleic Acids Research. 1983;11:3889–3901. doi: 10.1093/nar/11.12.3889.
  75. Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nature Reviews Genetics. 2007;8:272–285. doi: 10.1038/nrg2072.
  76. Staneva DP, Carloni R, Auchynnikava T, Tong P, Rappsilber J, Jeyaprakash AA, Matthews KR, Allshire RC. A systematic analysis of Trypanosoma brucei chromatin factors identifies novel protein interaction networks associated with sites of transcription initiation and termination. Genome Research. 2021;31:2138–2154. doi: 10.1101/gr.275368.121.
  77. Stuart K, Brun R, Croft S, Fairlamb A, Gürtler RE, McKerrow J, Reed S, Tarleton R. Kinetoplastids: related protozoan pathogens, different diseases. The Journal of Clinical Investigation. 2008;118:1301–1310. doi: 10.1172/JCI33945.
  78. Sullivan LL, Sullivan BA. Genomic and functional variation of human centromeres. Experimental Cell Research. 2020;389:111896. doi: 10.1016/j.yexcr.2020.111896.
  79. Talbert PB, Henikoff S. What makes a centromere? Experimental Cell Research. 2020;389:111895. doi: 10.1016/j.yexcr.2020.111895.
  80. Thakur J, Packiaraj J, Henikoff S. Sequence, chromatin and evolution of satellite DNA. International Journal of Molecular Sciences. 2021;22:4309. doi: 10.3390/ijms22094309.
  81. Thivolle A, Mehnert AK, Tihon E, McLaughlin E, Dujeancourt-Henry A, Glover L. DNA double strand break position leads to distinct gene expression changes and regulates VSG switching pathway choice. PLOS Pathogens. 2021;17:e1010038. doi: 10.1371/journal.ppat.1010038.
  82. Trenaman A, Glover L, Hutchinson S, Horn D. A post-transcriptional respiratome regulon in trypanosomes. Nucleic Acids Research. 2019;47:7063–7077. doi: 10.1093/nar/gkz455.
  83. Tschudi C, Shi H, Franklin JB, Ullu E. Small interfering RNA-producing loci in the ancient parasitic eukaryote Trypanosoma brucei. BMC Genomics. 2012;13:427. doi: 10.1186/1471-2164-13-427.
  84. Tyanova S, Temu T, Cox J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nature Protocols. 2016;11:2301–2319. doi: 10.1038/nprot.2016.136.
  85. Van der Ploeg LHT, Liu AYC, Borst P. Structure of the growing telomeres of trypanosomes. Cell. 1984;36:459–468. doi: 10.1016/0092-8674(84)90239-3.
  86. van Steensel B, Belmont AS. Lamina-associated domains: links with chromosome architecture, heterochromatin, and gene repression. Cell. 2017;169:780–791. doi: 10.1016/j.cell.2017.04.022.
  87. Vasquez JJ, Wedel C, Cosentino RO, Siegel TN. Exploiting CRISPR-Cas9 technology to investigate individual histone modifications. Nucleic Acids Research. 2018;46:E106. doi: 10.1093/nar/gky517.
  88. Weisert N, Majewski V, Hartleb L, Luko K, Lototska L, Krapoth NC, Ulrich HD, Janzen CJ, Butter F. TelAP2 links TelAP1 to the telomere complex in Trypanosoma brucei. Scientific Reports. 2024;14:30493. doi: 10.1038/s41598-024-81972-0.
  89. Wickstead B, Ersfeld K, Gull K. The mitotic stability of the minichromosomes of Trypanosoma brucei. Molecular and Biochemical Parasitology. 2003;132:97–100. doi: 10.1016/j.molbiopara.2003.08.007.
  90. Wickstead B, Ersfeld K, Gull K. The small chromosomes of Trypanosoma brucei involved in antigenic variation are constructed around repetitive palindromes. Genome Research. 2004;14:1014–1024. doi: 10.1101/gr.2227704.
  91. Wiśniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nature Methods. 2009;6:359–362. doi: 10.1038/nmeth.1322.
  92. Yatskevich S, Barford D, Muir KW. Conserved and divergent mechanisms of inner kinetochore assembly onto centromeric chromatin. Current Opinion in Structural Biology. 2023;81:102638. doi: 10.1016/j.sbi.2023.102638.
  93. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biology. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
  94. Zhou Q, Pham KTM, Hu H, Kurasawa Y, Li Z. A kinetochore-based ATM/ATR-independent DNA damage checkpoint maintains genomic integrity in trypanosomes. Nucleic Acids Research. 2019;47:7973–7988. doi: 10.1093/nar/gkz476.

eLife Assessment

Yamini Dalal 1

This work significantly advances our understanding of chromatin organization within regions of repetitive sequences in the parasitic protozoan Trypanosoma brucei. Using cutting-edge interdisciplinary tools, the authors provide compelling evidence for two discrete types of repetitive DNA element-associated proteins: one set involved in essential centromere function, and the other involved in glycoprotein antigenic variation via homologous recombination. Thus, these fundamental findings have implications for this parasite's biology and for therapeutic targeting in kinetoplastid diseases. This work will be exciting to those in the centromere/mitosis and parasite immunity fields.

[Editors' note: this paper was reviewed by Review Commons.]

Reviewer #1 (Public review):

Anonymous

Summary:

Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: (1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. (2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

Strengths and Weaknesses:

The manuscript was previously reviewed through Review Commons. As noted there, the experiments are well controlled, the claims are well supported, and the methods are clearly described. The conclusions are convincing. All concerns I raised have been addressed except one (minor point #8):

"The way the authors mapped the ChIP-seq data is potentially problematic when analyzing the same repeat type in different genomic regions. Reads with multiple equally good mapping positions were assigned randomly. This is fine when analyzing repeats by type, independent of genomic position, which is what the authors do to reach their main conclusions. However, several figures (Fig. 3B, Fig. 4B, Fig. 5B, Fig. 7) show the same repeat type at specific genomic locations." Due to the random assignment, all of these regions merely show the average signal for the given repeat. I find it misleading that this average is plotted out at "specific" genomic regions.

Initially, I suggested a workaround, but the authors clarified why the workaround was not feasible, and their explanation is reasonable to me. That said, the figures still show a signal at positions where they can't be sure it actually exists. If this cannot be corrected analytically, it should at least be noted in the figure legends, Results, or Discussion.

Importantly, the authors' conclusions do not hinge on this point; they are appropriately cautious, and their interpretations remain valid regardless.
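
The concern about randomly assigned multi-mapping reads can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' mapping pipeline: the function name `assign_multimappers`, the four repeat copies, and the read counts are all hypothetical, and the copies are assumed to be identical in sequence so that an aligner places each read at a uniformly random copy.

```python
import random


def assign_multimappers(true_counts, seed=0):
    """Randomly place reads that map equally well to every repeat copy.

    true_counts: hypothetical number of reads that truly originate from
    each repeat copy. Because the copies are assumed to be identical in
    sequence, each read is assigned to a uniformly random copy.
    """
    rng = random.Random(seed)
    n_copies = len(true_counts)
    observed = [0] * n_copies
    for _ in range(sum(true_counts)):
        observed[rng.randrange(n_copies)] += 1
    return observed


# Hypothetical example: only copies 0 and 3 are truly bound.
true_counts = [9000, 0, 0, 3000]
observed = assign_multimappers(true_counts)
mean_signal = sum(true_counts) / len(true_counts)

print("true    :", true_counts)
print("observed:", observed)
print("mean    :", mean_signal)
```

Under these assumptions every copy ends up with roughly the same observed count, close to the mean across copies, regardless of where the reads truly originated. This is why a per-locus track for a multi-copy repeat reflects the repeat type rather than the individual genomic position.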

Significance:

This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes taking note of and building on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How, and at which stages of the process, 70bp repeats are involved in antigenic variation is an unresolved and very actively studied question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

Comments on revised version:

All my recommendations have been addressed.

Reviewer #2 (Public review):

Anonymous

The Trypanosoma brucei genome, like that of other eukaryotes, contains diverse repetitive elements. Yet, the chromatin-associated proteome of these regions remains largely unexplored. This study represents a very important conceptual and technical advancement by employing synthetic TALE DNA-binding proteins fused to YFP to selectively capture proteins associated with specific repetitive sequences in T. brucei chromatin. The data presented here are convincing, supported by appropriate controls and a well-validated methodology, aligned with current state-of-the-art approaches.

The authors used synthetic TALE DNA-binding proteins, tagged with YFP, which were designed to target five specific repeat elements in the T. brucei genome, including centromere- and telomere-associated repeats and those of a transposon element, in order to identify specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE), which detected many of the proteins previously implicated in telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair, and a protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated with T. brucei megabase chromosomes, as well as with intermediate and mini-chromosomes, which implies that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosomes.

This study represents a significant conceptual and technical advancement. To the best of our knowledge, it is the first report of employing TALE-YFP for affinity-based detection of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of the organization of these important regions of the trypanosomal chromatin and provides the foundation for investigating the functional roles of associated proteins in parasite biology. These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as to scientists investigating the roles of repetitive genomic elements in chromatin structure and their functional role in higher eukaryotes.

Importantly, any essential or unique interacting partners identified using the approach employed here could serve as potential targets for therapeutic intervention in severe tropical diseases caused by kinetoplastids.

eLife. 2026 Mar 10;14:RP109950. doi: 10.7554/eLife.109950.2.sa3

Author response

Roberta Carloni 1, Tadhg Devlin 2, Pin Tong 3, Christos Spanos 4, Tatsiana Auchynnikava 5, Juri Rappsilber 6, Keith R Matthews 7, Robin C Allshire 8

Point-by-point description of the revisions:

Reviewer #1 (Evidence, reproducibility and clarity):

Summary

In this article, the authors used synthetic TALE DNA-binding proteins, tagged with YFP, which were designed to target five specific repeat elements in the Trypanosoma brucei genome, including centromere- and telomere-associated repeats and those of a transposon element, in order to detect and identify, using YFP pulldown, specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE), which detected many of the proteins previously implicated in telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair, and the protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated with T. brucei megabase chromosomes, as well as with intermediate and mini-chromosomes, which implies that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosomes.

Major comments:

Are the key conclusions convincing?

The authors reported that they have successfully used TALE-based affinity selection of proteins associated with repetitive sequences in the T. brucei genome. They claimed that this study has provided new information regarding the relevance of the repetitive regions of the genome to chromosome integrity, telomere biology, chromosomal segregation and immune evasion strategies. These conclusions are based on high-quality research, and the work basically merits publication, provided that the major concerns raised below are addressed before acceptance for publication.

(1) The authors used the TALE-YFP approach to examine the proteome associated with five different repetitive regions of the T. brucei genome and confirmed the binding of TALE-YFP with ChIP-seq analyses. Ultimately, they obtained the list of proteins that bound to the synthetic proteins by affinity purification and LC-MS analysis, and concluded that these proteins bind to different repetitive regions of the genome. Two control proteins, TRF-YFP and KKT2-YFP, were used to confirm the interactions. However, there is no experiment confirming that the analysis gives insight into the role of any putative or new protein in telomere biology, VSG gene regulation or chromosomal segregation. The proteins that have already been reported by other studies are mentioned. Although the authors discovered many proteins in these repetitive regions, their role is as yet unknown. It is recommended that the authors take one or more of the new putative proteins from the repetitive elements and (1) show whether or not they bind directly to the specific repetitive sequence (e.g., by EMSA); and (2) knock down one or a small sample of the newly discovered proteins, which may shed light on their function at the repetitive region, as a proof of concept.

The main request from Referee 1 is for individual evaluation of protein-DNA interaction for a few candidates identified in our TALE-YFP affinity purifications, particularly using EMSA to identify binding to the DNA repeats used for the TALE selection. In our opinion, such an approach would not actually provide the validation anticipated by the reviewer. The power of TALE-YFP affinity selection is that it enriches for protein complexes that associate with the chromatin that coats the target DNA repetitive elements rather than only identifying individual proteins or components of a complex that directly bind to DNA assembled in chromatin.

The referee suggests we express recombinant proteins and perform EMSA for selected candidates, but many of the identified proteins are unlikely to directly bind to DNA – they are more likely to associate with a combination of features present in DNA and/or chromatin (e.g. specific histone variants or histone post-translational modifications). Of course, a positive result would provide some validation but only IF the tested protein can bind DNA in isolation – thus, a negative result would be uninformative.

In fact, our finding that KKT proteins are enriched using the 177R-TALE (minichromosome repeat sequence) identifies components of the trypanosome kinetochore known (KKT2) or predicted (KKT3) to directly bind DNA (Marciano et al., 2021; PMID: 34081090), and likewise the TelR-TALE identifies the TRF component that is known to directly associate with telomeric (TTAGGG)n repeats (Reis et al 2018; PMID: 29385523). This provides reassurance on the specificity of the selection, as does the lack of cross selectivity between different TALEs used (see later point 3 below). The enrichment of the respective DNA repeats quantitated in Figure 2B (originally Figure S1) also provides strong evidence for TALE selectivity.

It is very likely that most of the components enriched on the repetitive elements targeted by our TALE-YFP proteins do not bind repetitive DNA directly. The TRF telomere binding protein is an exception – but it is the only obvious DNA binding protein amongst the many proteins identified as being enriched in our TelR-TALE-YFP and TRF-YFP affinity selections.

The referee also suggests that follow up experiments using knockdown of the identified proteins found to be enriched on repetitive DNA elements would be informative. In our opinion, this manuscript presents the development of a new methodology previously not applied to trypanosomes, and referee 2 highlights the value of this methodological development which will be relevant for a large community of kinetoplastid researchers. In-depth follow-up analyses would be beyond the scope of this current study but of course will be pursued in future. To be meaningful such knockdown analyses would need to be comprehensive in terms of their phenotypic characterisation (e.g. quantitative effects on chromosome biology and cell cycle progression, rates and mechanism of recombination underlying antigenic variation, etc) – simple RNAi knockdowns would provide information on fitness but little more. This information is already publicly available from genome-wide RNAi screens (www.tritrypDB.org), with further information on protein location available from the genome-wide protein localisation resource (Tryptag.org). Hence basic information is available on all targets selected by the TALEs after RNAi knock down but in-depth follow-up functional analysis of several proteins would require specific targeted assays beyond the scope of this study.

(2) NonR-TALE-YFP does not have a binding site in the genome, but YFP protein should still be expressed by T. brucei clones with NLS. The authors have to explain why there is no signal detected in the nucleus, while a prominent signal was detected near kDNA (see Fig.2). Why is the expression of YFP in NonR-TALE almost not shown compared to other TALE clones?

The NonR-TALE-YFP immunolocalisation signal indeed is apparently located close to the kDNA and away from the nucleus. We are not sure why this is so, but the construct is sequence validated and correct. However, we note that artefactual localisation of proteins fused to a globular eGFP tag, compared to a short linear epitope V5 tag, near to the kinetoplast has been previously reported (Pyrih et al, 2023; PMID: 37669165).

The expression of NonR-TALE-YFP is shown in Supplementary Fig. S2 in comparison to other TALE proteins. Although it is evident that NonR-TALE-YFP is expressed at lower levels than other TALEs (the different TALEs have different expression levels), it is likely that in each case the TALE proteins would be in relative excess.

It is possible that the absence of a target sequence for the NonR-TALE-YFP in the nucleus affects its stability and cellular location. Understanding these differences is tangential to the aim of this study.

However, importantly, NonR-TALE-YFP is not the only control used for specificity in our affinity purifications. Instead, the lack of cross-selection of the same proteins by different TALEs (e.g. TelR-TALE-YFP, 177R-TALE-YFP) and the lack of enrichment of any proteins of interest by the well-expressed ingiR-TALE-YFP or 147R-TALE-YFP proteins each provide strong evidence for the specificity of the selection using TALEs, as does the enrichment of similar protein sets following affinity purification of the TelR-TALE-YFP and TRF-YFP proteins, which both bind telomeric (TTAGGG)n repeats. Moreover, control affinity purifications to assess background were performed using cells that completely lack an expressed YFP protein, which further supports specificity (Figure 6).

We have added text to highlight these important points in the revised manuscript:

Page 8:

“However, the expression level of NonR-TALE-YFP was lower than other TALE-YFP proteins; this may relate to the lack of DNA binding sites for NonR-TALE-YFP in the nucleus.”

Page 8:

“NonR-TALE-YFP displayed a diffuse nuclear and cytoplasmic signal; unexpectedly the cytoplasmic signal appeared to be in the vicinity of the kDNA of the kinetoplast (mitochondria). We note that artefactual localisation of some proteins fused to an eGFP tag has previously been observed in T. brucei (Pyrih et al, 2023).”

Page 10:

“Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4). Thus, the most enriched proteins are specific to TelR-TALE-YFP-associated chromatin rather than to the TALE-YFP synthetic protein module or other chromatin.”

(3) As a proof of concept, the authors showed that the TALE method identified the same enriched interacting partners with TelR-TALE as with TRF-YFP. They also show the same interacting partners for other TALE proteins, whether compared with WT cells or with the NonR-TALE parasites. This may be because NonR-TALE parasites have almost no (or very little) YFP expression (see Fig. S3) compared to other TALE clones and the TRF-YFP clone. To address this concern, a control with proper YFP expression should be included.

See response to point 2, but we reiterate that the ingiR-TALE-YFP and 147R-TALE-YFP proteins are well expressed (western, original Fig. S3, now Fig. S2) but few proteins are detected as being enriched or correspond to those enriched in TelR-TALE-YFP or TRF-YFP affinity purifications (see Fig. S9). Therefore, the ingiR-TALE-YFP and 147R-TALE-YFP proteins provide good additional negative controls for specificity as requested. To further reassure the referee we have also included additional volcano plots which compare TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP to the ingiR-TALE-YFP affinity selection (new Figure S8). As with No-YFP or NonR-TALE-YFP controls, the use of ingiR-TALE-YFP as a negative control demonstrates that known telomere-associated proteins are enriched in TelR-TALE-YFP affinity purification, RPA subunits are enriched with 70R-TALE-YFP and kinetochore KKT proteins are enriched with 177R-TALE-YFP. These analyses demonstrate specificity in the proteins enriched following affinity purification of our different TALE-YFPs and provide support to strengthen our original findings.

We now refer to use of No-YFP, NonR-TALE-YFP, and ingiR-TALE-YFP as controls for comparison to TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP in several places:

Page 10:

“Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4).”

Page 11:

“Thus, the nuclear ingiR-TALE-YFP provides an additional chromatin-associated negative control for affinity purifications with the TelR-TALE-YFP, 70R-TALE-YFP and 177R-TALE-YFP proteins (Fig. S8).”

“Proteins identified as being enriched with 70R-TALE-YFP (Figure 6D) were similar in comparisons with either the No-YFP, NonR-TALE-YFP or ingiR-TALE-YFP as negative controls.”

Top Page 12:

“The same kinetochore proteins were enriched regardless of whether the 177R-TALE proteomics data was compared with No-YFP, NonR-TALE or ingiR-TALE-YFP controls.”

Discussion Page 13:

“Regardless, the 147R-TALE and ingiR-TALE proteins were well expressed in T. brucei cells, but their affinity selection did not significantly enrich for any relevant proteins. Thus, 147R-TALE and ingiR-TALE provide reassurance for the overall specificity for proteins enriched in TelR-TALE, 70R-TALE and 177R-TALE affinity purifications.”

(4) After the artificial expression of the five repetitive sequence-binding TALE proteins, the question is whether there is any competition between the TALE proteins and the corresponding endogenous proteins. Is there any effect on parasite survival or health, compared to the control, after expression of these five TALE-YFP proteins? It is recommended to add parasite growth curves for all the TALE protein-expressing cultures.

Growth curves for cells expressing TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP are now included (New Fig S3A). No deficit in growth was evident while passaging 70R-TALE-YFP, 147R-TALE-YFP, NonR-TALE-YFP cell lines (indeed they grew slightly better than controls).

The following text has been added page 8:

“Cell lines expressing representative TALE-YFP proteins displayed no fitness deficit (Fig. S3A).”

(5) Since the experiments were performed using whole-cell extracts without prior nuclear fractionation, the authors should consider the possibility that some identified proteins may have originated from compartments other than the nucleus. Specifically, the detection of certain binding proteins might reflect sequence homology (or partial homology) between mitochondrial DNA (maxicircles and minicircles) and repetitive regions in the nuclear genome. Additionally, the lack of subcellular separation raises the concern that cytoplasmic proteins could have been co-purified due to whole cell lysis, making it challenging to discern whether the observed proteome truly represents the nuclear interactome.

In our experimental design, we confirmed bioinformatically that the repeat sequences targeted were not represented elsewhere in the nuclear or mitochondrial genome (kDNA). The absence of subcellular fractionation could result in some cytoplasmic protein selection, but this is unlikely to account for the enrichments observed: each TALE targets a specific DNA sequence but is otherwise identical, so cross-selection of the same contaminating protein set would be anticipated if there was significant non-specific binding. We have previously successfully affinity selected 15 chromatin modifiers and identified associated proteins without major issues concerning cytoplasmic protein contamination (Staneva et al 2021 and 2022; PMID: 34407985 and 36169304). Of course, the possibility that some proteins are contaminants will need to be borne in mind in any future follow-up analysis of proteins of interest that we identified as being enriched on specific types of repetitive element in T. brucei. Proteins that are also detected in negative controls or negative affinity selections, such as No-YFP, NonR-TALE, ingiR-TALE or 147R-TALE, must be disregarded.

(6) Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

As mentioned earlier, the authors claim that this study has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies. But there are no experiments that demonstrate a role for any unknown or known protein in these processes. Thus, it is suggested to select one or two proteins of choice from the list and validate their direct binding to repetitive region(s), and their role in that region of interaction.

As highlighted in response to point 1 the suggested validation and follow up experiments may well not be informative and are beyond the scope of the methodological development presented in this manuscript. Referee 2 describes the study in its current form as “a significant conceptual and technical advancement” and “This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology.”

The Referee’s phrase ‘validate their direct binding to repetitive region(s)’ here may also mean to test if any of the additional proteins that we identified as being enriched with a specific TALE protein actually display enrichment over the repeat regions when examined by an orthogonal method. A key unexpected finding was that kinetochore proteins including KKT2 are enriched in our affinity purifications of the 177R-TALE-YFP that targets 177bp repeats (Figure 6F). By conducting ChIP-seq for the kinetochore-specific protein KKT2 using YFP-KKT2, we confirmed that KKT2 is indeed enriched on 177bp repeat DNA but not flanking DNA (Figure 7). Moreover, several known telomere-associated proteins are detected in our affinity selections of TelR-TALE-YFP (Figure 6B, Fig. S6; see also Reis et al, 2018 Nuc. Acids Res. PMID: 29385523; Weisert et al, 2024 Sci. Reports PMID: 39681615).

Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

The answer to this question depends on what the authors want to present as the achievements of the present study. If the achievement of the paper is the creation of a new tool for discovering new proteins associated with the repeat regions, I recommend that they add proof of direct interactions between a sample of the newly discovered proteins and the relevant repeats, as the proof of concept discussed above. However, if the authors would like to claim that the study achieved new functional insights for these interactions, they will have to expand the study, as mentioned above, to support the proof of concept.

See our response to point 1 and the point we labelled ‘6’ above.

Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

I think that they are realistic. If the authors decide to check the capacity of a small sample of proteins (not previously known as repetitive-region binding proteins) to interact directly with the repeated sequence, it will add substantially to the study (e.g., by EMSA; estimated time: 1 month). If the authors also decide to check the function of at least one such newly detected protein (e.g., by KD), I estimate this will take 3-6 months.

As highlighted previously, the proposed EMSA experiment may well be uninformative for protein complex components identified in our study or for isolated proteins that directly bind DNA in the context of a complex and chromatin. RNAi knockdown data and cell location data (as well as developmental expression and orthology data) are already available through TriTrypDB.org and tryptag.org.

Are the data and the methods presented in such a way that they can be reproduced? Yes

Are the experiments adequately replicated, and statistical analysis adequate?

The authors did not mention replicates. There is no statistical analysis mentioned.

The figure legends indicate that all volcano plots of TALE affinity selections were derived from three biological replicates. Cutoffs used for significance: P < 0.05 (Student's t-test).
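As an illustration of the stated cutoff, the following is a minimal sketch of how a single protein might be scored from three control and three TALE replicates. It assumes log2-transformed intensities and an unpaired, equal-variance Student's t-test; the function name, the input layout and the use of a tabulated critical value (2.776 for 4 degrees of freedom, two-tailed, alpha = 0.05) are assumptions made for this example and are not taken from the authors' analysis pipeline.

```python
import math
from statistics import mean, variance

# Two-tailed critical value of Student's t at alpha = 0.05 for 4 degrees of
# freedom (3 + 3 replicates). Using a tabulated value is an assumption made
# to keep this sketch free of external dependencies.
T_CRIT_DF4 = 2.776


def score_protein(control_log2, sample_log2):
    """Return (log2 fold change, t statistic, passes P < 0.05) for 3 vs 3 replicates.

    Hypothetical helper: inputs are log2 intensities for one protein.
    """
    n1, n2 = len(control_log2), len(sample_log2)
    if n1 != 3 or n2 != 3:
        raise ValueError("this sketch is hard-wired to three replicates per group")
    # Difference of log2 means is the log2 fold change (sample over control).
    log2_fc = mean(sample_log2) - mean(control_log2)
    # Pooled variance for the equal-variance (Student's) t-test.
    pooled = ((n1 - 1) * variance(control_log2)
              + (n2 - 1) * variance(sample_log2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0:
        # Identical replicates: the statistic is undefined, report no call.
        return log2_fc, float("nan"), False
    t_stat = log2_fc / se
    return log2_fc, t_stat, abs(t_stat) > T_CRIT_DF4
```

A protein would then be plotted on the volcano plot at its log2 fold change, and flagged as enriched only when the third returned value is true and the fold change is positive.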

For ChIP-seq, two biological replicates were analysed for each cell line expressing the specific YFP-tagged protein of interest (TALE or KKT2). This is now stated in the relevant figure legends – apologies for this oversight. The resulting data are available for scrutiny at GEO: GSE295698.

Minor comments:

Specific experimental issues that are easily addressable.

The following suggestions can be incorporated:

(1) Page 18, in the Materials and Methods section the authors mention four drugs: blasticidin, phleomycin, G418 and hygromycin. It is recommended to mention the purpose of using these selective drugs for the parasite. If clonal selection has been done, then it should also be mentioned.

We erroneously added information on several drugs used for selection in our laboratory. In fact, all TALE-YFP constructs carry the bleomycin resistance gene, which we select for using phleomycin. Also, clones were derived by limiting dilution immediately after transfection. We have amended the text accordingly:

Page 17/18:

“Cell cultures were maintained below 3 x 10⁶ cells/ml. Phleomycin 2.5 µg/ml was used to select transformants containing the TALE construct BleoR gene.”

“Electroporated bloodstream cells were added to 30 ml HMI-9 medium and two 10-fold serial dilutions were performed in order to isolate clonal phleomycin-resistant populations from the transfection. 1 ml of transfected cells was plated per well on 24-well plates (1 plate per serial dilution) and incubated at 37°C and 5% CO2 for a minimum of 6 h before adding 1 ml media containing 2X concentration phleomycin (5 µg/ml) per well.”

(2) In the method section the authors mentioned that there is only one site for binding of NonR-TALE in the parasite genome. But in Fig. 1C, the authors showed zero binding site. So, there is one binding site for NonR-TALE-YFP in the genome or zero?

We thank the reviewer for pointing out this discrepancy. We have checked the latest Tb427v12 genome assembly for predicted NonR-TALE binding sites and there are no exact matches. We have corrected the text accordingly.

Page 7:

“A control NonR-TALE protein was also designed which was predicted to have no target sequence in the T. brucei genome.”

Page 17:

“A control NonR-TALE predicted to have no recognised target in the T. brucei genome was designed as follows: BLAST searches were used to identify exact matches in the TREU927 reference genome. Candidate sequences with one or more match were discarded.”
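The filtering step described in this passage can be sketched as follows. This is only an illustration of the "discard any candidate with one or more exact match" logic: it substitutes a plain exact string search on both strands for BLAST, and the function names and toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_occurrences(pattern, text):
    """Count exact, possibly overlapping, occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_exact_matches(candidate, genome):
    """Count exact matches of a candidate target site on either strand.

    Plain string search stands in for BLAST here (an assumption of this sketch).
    """
    forward = count_occurrences(candidate, genome)
    rc = reverse_complement(candidate)
    # A palindromic site reads the same on both strands; count it once.
    reverse = 0 if rc == candidate else count_occurrences(rc, genome)
    return forward + reverse


def filter_candidates(candidates, genome):
    """Keep only candidates with no exact match anywhere in the genome."""
    return [c for c in candidates if count_exact_matches(c, genome) == 0]
```

For example, with the toy genome "ACGTTTGACCA", the candidates "TTGAC" (forward-strand match) and "TGGTC" (reverse-strand match) would be discarded, while "GGGGG" would be retained as a possible non-targeting control.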

(3) The authors used two different anti-GFP antibodies, one from Roche and the other from Thermo Fisher. Why were two different antibodies used for the same protein?

We have found that only some anti-GFP antibodies are effective for affinity selection of associated proteins, whereas others are better suited for immunolocalisation. The respective suppliers’ antibodies were optimised for each application.

(4) Page 6: in the introduction, the authors give the number of total VSG genes as 2,634. Is it known how many of them are pseudogenes?

This value corresponds to the number reported by Cosentino et al. 2021 (PMID: 34541528) for subtelomeric VSGs, which is similar to the value reported by Muller et al 2018 (PMID: 30333624) (2486), both in the same strain of trypanosomes as used by us. Based on the earlier analysis by Cross et al (PMID: 24992042), 80% of the identified VSGs in their study (2584) are pseudogenes. This approximates to the estimation by Cosentino of 346/2634 (13%) being fully functional VSG genes at subtelomeres, or 17% when considering VSGs at all genomic locations (433/2872).

(5) I found several typos throughout the manuscript.

Thank you for raising this. We have read through the manuscript several times and hopefully corrected all outstanding typos.

(6) Fig. 1C: Table: below TOTAL 2nd line: the number should be 1838 (rather than 1828)

Corrected- thank you.

- Are prior studies referenced appropriately? Yes

- Are the text and figures clear and accurate? Yes

- Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Suggested above

Reviewer #1 (Significance):

Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field:

This study represents a significant conceptual and technical advancement by employing a synthetic TALE DNA-binding protein tagged with YFP to selectively identify proteins associated with five distinct repetitive regions of T. brucei chromatin. To the best of my knowledge, it is the first report to utilize TALE-YFP for affinity-based isolation of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology. Importantly, any essential or unique interacting partners identified could serve as potential targets for therapeutic intervention.

- Place the work in the context of the existing literature (provide references, where appropriate). I agree with the information already described in the submitted manuscript regarding the potential contribution of the resulting data and the established technology to the study of VSG expression, kinetochore mechanism and telomere biology.

- State what audience might be interested in and influenced by the reported findings. These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as scientists investigating chromatin structure and the functional roles of repetitive genomic elements in higher eukaryotes.

- (1) Define your field of expertise with a few keywords to help the authors contextualize your point of view. Protein-DNA interactions/ chromatin/ DNA replication/ Trypanosomes

- (2) Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. None

Reviewer #2 (Evidence, reproducibility and clarity):

Summary

Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: (1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. (2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

Major Comments

None. The experiments are well-controlled, claims well-supported, and methods clearly described. Conclusions are convincing.

Thank you for these positive comments.

Minor Comments

(1) Fig. 2 - I couldn't find an uncropped version showing multiple cells. If it exists, it should be linked in the legend or main text; Otherwise, this should be added to the supplement.

The images presented represent reproducible analyses and were independently verified by two of the authors. Although wider field of view images do not provide the resolution to be informative on cell location, as requested we have provided uncropped images in new Fig. S4 for all the cell lines shown in Figure 2A.

In addition, we have included as supplementary images (Fig. S3B) additional images of TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP localisation to provide additional support for their observed locations presented in Figure 1. The sets of cells and images presented in Figure 2A and in Fig. S3B were prepared and obtained by different authors, independently and reproducibly validating the location of the tagged protein.

(2) I think Suppl. Fig. 1 is very valuable, as it is a quantification and summary of the ChIP-seq data. I think the authors could consider making this a panel of a main figure. For the main figure, I think the plot could be trimmed down to only show the background and the relevant repeat for each TALE protein, leaving out the non-target repeats. (This relates to minor comment 6.) Also, I believe, it was not explained how background enrichment was calculated.

We are grateful for the reviewer’s positive view of original Fig. S1 and appreciate the suggestion. We have now moved this analysis to part B of main Figure 2 in the revised manuscript – now Figure 2B. We have also provided additional details in the Methods section on the approaches used to assess background enrichment.

Page 19:

“Background enrichment calculation

The genome was divided into 50 bp sliding windows, and each window was annotated based on overlapping genomic features, including CIR147, 177 bp repeats, 70 bp repeats, and telomeric (TTAGGG)n repeats. Windows that did not overlap with any of these annotated repeat elements were defined as "background" regions and used to establish the baseline ChIP-seq signal. Enrichment for each window was calculated using bamCompare, as log₂(IP/Input). To adjust for background signal amongst all samples, enrichment values for each sample were further normalized against the corresponding No-YFP ChIP-seq dataset.”
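The calculation described in this Methods passage can be sketched roughly as follows. This is a simplified illustration rather than the authors' script or a call to bamCompare: the window step, the pseudocount of 1, the half-open interval convention, and the choice to normalise by subtracting the No-YFP log₂ ratio are all assumptions of this example, and every function name is hypothetical.

```python
import math

WINDOW = 50  # window size in bp, as stated in the Methods text


def make_windows(chrom_length, window=WINDOW, step=WINDOW):
    """Tile a chromosome into half-open (start, end) windows.

    step == window gives non-overlapping tiles; a smaller step would give
    overlapping (sliding) windows. The step actually used is an assumption.
    """
    return [(s, min(s + window, chrom_length))
            for s in range(0, chrom_length, step)]


def annotate(window, repeats):
    """Label a window with the first repeat class it overlaps, else 'background'.

    repeats maps a class name (e.g. 'CIR147') to a list of half-open intervals.
    """
    w_start, w_end = window
    for name, intervals in repeats.items():
        for r_start, r_end in intervals:
            if w_start < r_end and r_start < w_end:
                return name
    return "background"


def log2_ratio(ip, inp, pseudocount=1.0):
    """log2(IP/Input) for one window, with a pseudocount to avoid log(0)."""
    return math.log2((ip + pseudocount) / (inp + pseudocount))


def normalised_enrichment(ip, inp, ctrl_ip, ctrl_inp):
    """Sample log2(IP/Input) minus the No-YFP control log2(IP/Input)."""
    return log2_ratio(ip, inp) - log2_ratio(ctrl_ip, ctrl_inp)
```

Under these assumptions, a window with 7 IP and 1 input reads in the TALE sample, and 3 IP and 1 input reads in the No-YFP control, would score log₂(8/2) − log₂(4/2) = 1.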

Note: While revising the manuscript we also noticed that the script had a normalization error. We have therefore included a corrected version of these analyses as Figure 2B (old Fig. S1).

(3) Generally, I would plot enrichment on a log2 axis. This concerns several figures with ChIP-seq data.

Our ChIP-seq enrichment is calculated by bamCompare. The resulting enrichment values are indeed log2 (IP/Input). We have made this clear in the updated figures/legends.

(4) Fig. 4C - The violin plots are very hard to interpret, as the plots are very narrow compared to the line thickness, making it hard to judge the actual volume. For example, in Centromere 5, YFP-KKT2 is less enriched than 147R-TALE over most of the centromere with some peaks of much higher enrichment (as visible in panel B), however, in panel C, it is very hard to see this same information. I'm sure there is some way to present this better, either using a different type of plot or by improving the spacing of the existing plot.

We thank the reviewer for this suggestion; we have elected to provide a Split-Violin plot instead. This improves the presentation of the data for each centromere. The original violin plot in Figure 4C has been replaced with this Split-Violin plot (still Figure 4C).

(5) Fig. 6 - The panels are missing an x-axis label (although it is obvious from the plot what is displayed).

Maybe the "WT NO-YFP vs" part that is repeated in all the plot titles could be removed from the title and only be part of the x-axis label?

In fact, to save space the X axis was labelled inside each volcano plot, but we neglected to indicate that values are on a log2 scale indicating enrichment. This has been rectified – see Figure 6, and Fig. S7, S8 and S9.

(6) Fig. 7 - I would like to have a quantification for the examples shown here. In fact, such a quantification already exists in Suppl. Figure 1. I think the relevant plots of that quantification (YFP-KKT2 over 177bp-repeats and centromere-repeats) with some control could be included in Fig. 7 as panel C. This opportunity could be used to show enrichment separated out for intermediate-sized, mini-, and megabase-chromosomes. (relates to minor comment 2 & 8)

The CIR147 sequence is found exclusively on megabase-sized chromosomes, while the 177 bp repeats are located on intermediate- and mini-sized chromosomes. Due to limitations in the current genome assembly, it is not possible to reliably classify all chromosomes into intermediate- or mini- sized categories based on their length. Therefore, original Supplementary Fig. S1 presented the YFP-KKT2 enrichment over CIR147 and 177 bp repeats as a representative comparison between megabase chromosomes and the remaining chromosomes (corrected version now presented as main Figure 2B). Additionally, to allow direct comparison of YFP-KKT2 enrichment on CIR147 and 177 bp repeats we have included a new plot in Figure 7C which shows the relative enrichment of YFP-KKT2 on these two repeat types.

We have added the following text, page 12:

“Taking into account the relative number of CIR147 and 177 bp repeats in the current T. brucei genome (Cosentino et al., 2021; Rabuffo et al., 2024), comparative analyses demonstrated that YFP-KKT2 is enriched on both CIR147 and 177 bp repeats (Figure 7C).”

(7) Suppl. Fig. 8 A - I believe there is a mistake here: KKT5 occurs twice in the plot, the one in the overlap region should be KKT1-4 instead, correct?

Thanks for spotting this. It has been corrected.

(8) The way that the authors mapped ChIP-seq data is potentially problematic when analyzing the same repeat type in different regions of the genome. The authors assigned reads that had multiple equally good mapping positions to one of these mapping positions, randomly.

This is perfectly fine when analysing repeats by their type, independent of their position on the genome, which is what the authors did for the main conclusions of the work.

However, several figures show the same type of repeat at different positions in the genome. Here, the authors risk that enrichment in one region of the genome 'spills' over to all other regions with the same sequence. Particularly, where they show YFP-KKT2 enrichment over intermediate- and mini-chromosomes (Fig. 7) due to the spillover, one cannot be sure to have found KKT2 in both regions.

Instead, the authors could analyze only uniquely mapping reads / read-pairs where at least one mate is uniquely mapping. I realize that with this strict filtering, data will be much more sparse. Hence, I would suggest keeping the original plots and adding one more quantification where the enrichment over the whole region (e.g., all 177bp repeats on intermediate-/mini-chromosomes) is plotted using the unique reads (this could even be supplementary). This also applies to Fig. 4 B & C.

We thank the reviewer for their thoughtful comments. Repetitive sequences are indeed challenging to analyze accurately, particularly in the context of short read ChIP-seq data. In our study, we aimed to address YFP-KKT2 enrichment not only over CIR147 repeats but also on 177 bp repeats, using both ChIP-seq and proteomics with synthetic TALE proteins targeted to the different repeat types. We appreciate the referee's suggestion to consider uniquely mapped reads; however, in the updated genome assembly, the 177 bp repeats are frequently immediately followed by long stretches of 70 bp repeats which can span several kilobases. The size and repetitive nature of these regions exceed the resolution limits of ChIP-seq. It is therefore difficult to precisely quantify enrichment across all chromosomes.

Additionally, the repeat sequences are highly similar, and relying solely on uniquely mapped reads would result in the exclusion of most reads originating from these regions, significantly underestimating the relative signals. To address this, we used Bowtie2 with settings that allow multi-mapping, assigning reads randomly among equivalent mapping positions, but ensuring each read is counted only once. This approach is designed to evenly distribute signal across all repetitive regions and preserve a meaningful average.
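The effect of this assignment strategy can be illustrated with a small simulation. The sketch below does not call Bowtie2 and makes no claim about its internals or settings; it only models the behaviour described above, in which each read with several equally good positions is placed at one of them at random and counted once. The function names, read identifiers and locus labels are hypothetical.

```python
import random
from collections import Counter


def assign_reads(reads, seed=0):
    """Assign each read to exactly one of its equally good mapping positions.

    reads maps a read id to a list of candidate positions. A fixed seed makes
    the random choice reproducible for this illustration.
    """
    rng = random.Random(seed)
    return {read_id: rng.choice(positions)
            for read_id, positions in reads.items() if positions}


def coverage(assignments):
    """Count how many reads were placed at each position."""
    return Counter(assignments.values())


if __name__ == "__main__":
    # 3000 reads that fit three identical repeat copies equally well,
    # plus one read with a single unique mapping position.
    reads = {f"read{i}": ["repeat_copy_A", "repeat_copy_B", "repeat_copy_C"]
             for i in range(3000)}
    reads["unique_read"] = ["unique_locus"]
    print(coverage(assign_reads(reads)))
```

Each read contributes one count in total, so the summed signal over a repeat class is preserved and spread roughly evenly over its copies. The simulation also makes the reviewer's caveat concrete: the per-copy counts reflect the random assignment rather than where each read actually originated.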

Single molecule methods such as DiMeLo (Altemose et al. 2022; PMID: 35396487) will need to be developed for T. brucei to allow more accurate and chromosome specific mapping of kinetochore or telomere protein occupancy at repeat-unique sequence boundaries on individual chromosomes.

Reviewer #2 (Significance):

This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with minichromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

Thank you for supporting the novelty and broad interest of our manuscript.

My field of expertise / Point of view:

I'm a computer scientist by training and am now a postdoctoral bioinformatician in a molecular parasitology laboratory. The laboratory is working on antigenic variation in T. brucei. The focus of my work is on analyzing sequencing data (such as ChIP-seq data) and algorithmically improving bioinformatic tools.

Associated Data


    Data Citations

    1. Carloni R, Devlin T, Tong P, Auchynnikava T, Spanos CR, Rappsilber J, Matthews KR, Allshire RC. 2025. Defining the chromatin-associated protein landscapes on Trypanosoma brucei repetitive elements using synthetic TALE proteins. NCBI Gene Expression Omnibus. GSE295698
    2. Carloni R, Devlin T, Tong P, Auchynnikava T, Spanos CR, Rappsilber J, Matthews KR, Allshire RC. 2025. Defining the chromatin-associated protein landscapes on Trypanosoma brucei repetitive elements using synthetic TALE proteins. PRIDE. PXD063130

    Supplementary Materials

    Figure 1—figure supplement 1—source data 1. Original DNA-stained agarose gel for Figure 1—figure supplement 1D indicating the relevant PCR bands from T. brucei cells containing the indicated synthetic TALE-YFP protein expression constructs.
    Figure 1—figure supplement 2—source data 1. Original anti-YFP western for Figure 1—figure supplement 2A (left), indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.
    Figure 1—figure supplement 2—source data 2. Original anti-YFP western for Figure 1—figure supplement 2A (left), indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.
    Figure 1—figure supplement 2—source data 3. Original anti-Ty western for Figure 1—figure supplement 2B, indicating the relevant bands in T. brucei cells expressing the indicated synthetic TALE-YFP proteins.
    Supplementary file 1. Proteomics analyses comparing protein enrichments in the indicated affinity selections a-to-q.

    (a) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-TRF (WT NoYFP vs. YFP-TRF). Proteins enriched in YFP-TRF affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (b) Affinity selection data for wild-type cells expressing No YFP versus cells expressing TelR-TALE (WT NoYFP vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (c) Affinity selection data for wild-type cells expressing No YFP versus cells expressing NonR-TALE (WT NoYFP vs. NonR-TALE). Proteins enriched in NonR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (d) Affinity selection data for cells expressing NonR-TALE versus cells expressing TelR-TALE (NonR-TALE vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
    (e) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 147R-TALE (WT NoYFP vs. 147R-TALE). Proteins enriched in 147R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (f) Affinity selection data for cells expressing NonR-TALE versus cells expressing 147R-TALE (NonR-TALE vs. 147R-TALE). Proteins enriched in 147R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
    (g) Affinity selection data for wild-type cells expressing No YFP versus cells expressing ingiR-TALE (WT NoYFP vs. ingiR-TALE). Proteins enriched in ingiR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (h) Affinity selection data for cells expressing NonR-TALE versus cells expressing ingiR-TALE (NonR-TALE vs. ingiR-TALE). Proteins enriched in ingiR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
    (i) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 70R-TALE (WT NoYFP vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (j) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-RPA2 (WT NoYFP vs. YFP-RPA2). Proteins enriched in YFP-RPA2 affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (k) Affinity selection data for cells expressing NonR-TALE versus cells expressing 70R-TALE (NonR-TALE vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
    (l) Affinity selection data for wild-type cells expressing No YFP versus cells expressing 177R-TALE (WT NoYFP vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (m) Affinity selection data for cells expressing NonR-TALE versus cells expressing 177R-TALE (NonR-TALE vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of NonR-TALE.
    (n) Affinity selection data for wild-type cells expressing No YFP versus cells expressing YFP-KKT2 (WT NoYFP vs. YFP-KKT2). Proteins enriched in YFP-KKT2 affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections from wild-type cells lacking any YFP as a negative control.
    (o) Affinity selection data for cells expressing ingiR-TALE versus cells expressing TelR-TALE (ingiR-TALE vs. TelR-TALE). Proteins enriched in TelR-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.
    (p) Affinity selection data for cells expressing ingiR-TALE versus cells expressing 70R-TALE (ingiR-TALE vs. 70R-TALE). Proteins enriched in 70R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.
    (q) Affinity selection data for cells expressing ingiR-TALE versus cells expressing 177R-TALE (ingiR-TALE vs. 177R-TALE). Proteins enriched in 177R-TALE affinity selections were identified and quantified by LC-MS/MS analysis relative to affinity selections of ingiR-TALE.

    MDAR checklist

    Data Availability Statement

    Sequence Data: All NGS ChIP-seq data generated have been submitted to and will be available under an accession number at the NCBI Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/). The GEO accession number for ChIP-seq data is: GSE295698. Proteomics Data: All LC-MS/MS proteomics data generated are available on the Proteomics Identification Database (PRIDE; https://www.ebi.ac.uk/pride/archive/projects/PXD063130) with accession number PXD063130.


