Abstract
Trypanosomatid parasites are causative agents of important human and animal diseases such as sleeping sickness and leishmaniasis. Most trypanosomatids are transmitted to their mammalian hosts by insects, often belonging to Diptera (or true flies). These are called dixenous trypanosomatids since they infect two different hosts, in contrast to those that infect just insects (monoxenous). However, it is still unclear whether dixenous and monoxenous trypanosomatids interact similarly with their insect host, as fly-monoxenous trypanosomatid interaction systems are rarely reported and under-studied–despite being common in nature. Here we present the genome of monoxenous trypanosomatid Herpetomonas muscarum and discuss its transcriptome during in vitro culture and during infection of its natural insect host Drosophila melanogaster. The H. muscarum genome is broadly syntenic with that of human parasite Leishmania major. We also found strong similarities between the H. muscarum transcriptome during fruit fly infection, and those of Leishmania during sand fly infections. Overall this suggests Drosophila-Herpetomonas is a suitable model for less accessible insect-trypanosomatid host-parasite systems such as sand fly-Leishmania.
Author summary
Trypanosomes and Leishmania are parasites that cause serious Neglected Tropical Diseases (NTDs) in the world’s poorest people. Both of these are dixenous trypanosomatids, transmitted to humans and other mammals by biting flies. They are called dixenous as they can establish infections in two different types of hosts– insect vectors and mammals. In contrast, monoxenous trypanosomatids usually only infect insects. Despite establishment in the insect’s midgut being key to transmission of NTDs, events during early establishment inside the insect are still unclear in both dixenous and monoxenous parasites. Here, we study the interaction between a model insect–the fruit fly Drosophila melanogaster–and its natural monoxenous trypanosomatid parasite Herpetomonas muscarum. We show that both the genome of this parasite, and gene regulation at early stages of infection have strong parallels with Leishmania. This work has begun to identify evolutionarily conserved aspects of the process by which trypanosomatids establish in insects, thus potentially highlighting key checkpoints necessary for transmission of dixenous parasites. In turn, this might inform new strategies to control trypanosomatid NTDs.
Introduction
The family Trypanosomatidae belongs to the order Kinetoplastida, a group characterized by the presence of a mitochondrial organelle rich in DNA (kDNA) called the kinetoplast. This family includes parasitic flagellates that undergo cyclical development in both vertebrate and invertebrate hosts (and are therefore dixenous). These parasites are best known as agents of important diseases in humans, domestic animals and plants. However, several genera of this order, such as Crithidia, Herpetomonas, Blastocrithidia and Leptomonas, are restricted to a single host (monoxenous), usually an insect from the orders Diptera, Hemiptera or Siphonaptera [1]. Although such monoxenous or “lower” trypanosomatids seem to have their lifecycle essentially confined to insect hosts [2], they have also been reported in plants [3] and immunocompromised humans [1].
There is an increasing interest in monoxenous trypanosomatids as a model for understanding the evolution and ecology of trypanosomatids [4], as well as how they may modify their insect host [4]. It is now clear that monoxenous trypanosomatids are ubiquitous parasites of a wide range of insect groups and have numerous effects on the physiology of the insect host. These effects include alterations in fertility and reproduction, modified food intake, delayed development and reduction in lifespan [5]. In projections of total animal biodiversity, insects represent more than 60% of all animals [6]. Therefore, knowledge of insect physiology and what can influence it, is essential for maintaining a species-rich environment especially when longitudinal population data show a sharp decline in flying insect biomass [7]. In this context, studies of trypanosomatid-insect interactions will provide vital insights into the ecology of crucial insect species (e.g. pollinators).
To this end, a number of monoxenous trypanosomatid genomes and transcriptomes are being investigated [8,9], including those of bee parasites such as Lotmaria passim (a honey bee parasite) and of Leptomonas pyrrhocoris, a globally disseminated parasite isolated from firebugs [10,11]. These studies, and earlier work on the molecular biology of trypanosomatids, have revealed that monoxenous parasites share many distinctive genome features with their better-studied dixenous relatives [12].
The genomic DNA is arranged into ‘polycistronic’ (multi-gene) transcriptional units of functionally unrelated genes, the majority of which lack introns. Given this gene arrangement, the cells do not control an individual gene’s expression by varying its transcription level; instead, expression is controlled by RNA-binding proteins [13] and other post-transcriptional processes such as RNA editing [14]. Post-transcriptional processing also includes trans-splicing, in which a 39-nucleotide sequence, called the splice leader, is added to the 5’ end of each mRNA [15]. The splice leaders (also called mini-exons) are encoded in tandem repeats at a different genomic locus from the genes they are spliced onto.
Trypanosomatid kDNA is arranged in interlocking ‘maxi-circles’ [16–18]. The kDNA maxi-circle is homologous to mitochondrial genomes in other systems, but the sequence encoding many of the typical mitochondrial proteins is scrambled, relying on post-transcriptional mRNA editing to reconstitute the correct coding sequence [19]. The kinetoplast also contains thousands of associated ‘mini-circles’ which encode guide RNAs involved in this editing process [17].
In addition to ecological insights, studies of monoxenous trypanosomatids may help us gain new perspective on interactions of more medically important parasites and their insect vectors, which mediate neglected tropical diseases such as leishmaniasis (vectored by phlebotomine sand flies) and sleeping sickness (tsetse flies). To inform, and accelerate, research in these experimentally challenging dipteran-parasite relationships, we have developed the study of the model dipteran Drosophila melanogaster and its natural trypanosomatid Herpetomonas muscarum [20]. We have established that a network of signalling in the intestine of the host is important for clearance as well as for maintaining fecundity. This network involves NF-κB and STAT-mediated transcription, which regulate intestinal stem cell proliferation that the parasite attempts to suppress. Here, we turn our attention to the parasite. We report the genome of H. muscarum isolated from a wild population of Drosophila melanogaster in Oxfordshire, UK. We also report the transcriptomes of this H. muscarum isolate from in vitro culture and during the course of infection in D. melanogaster. The similarities with Leishmania major, both at the genome level and in transcriptome regulation, were striking. This was especially the case in the early phases of host infection, when the parasite needs to overcome the barrier of the insect midgut and establish infection. Given that the resistance mechanisms to parasite establishment (and therefore onward transmission) reside in the dipteran midgut [21], the Drosophila-Herpetomonas model may allow researchers to take advantage of the extensive toolkit of genetic approaches available for Drosophila. This could uncover mechanistic details of evolutionarily conserved aspects of the relationship between trypanosomatids and dipteran vectors, for which the tool-box for functional studies is not yet fully developed.
Results/Discussion
The Herpetomonas muscarum genome
Assembly
PacBio and Illumina sequence reads were generated from an axenic culture of H. muscarum promastigotes as described in Materials and Methods. The reads were assembled into a genome of 41.7 Mbp in 264 scaffolds, the largest of which is 1,793,442 bp in length (N50 = 707,495 bp). We observed a median read coverage of 114x, with populations of scaffolds at approximately 50x and 160x coverage which may represent monosomic and trisomic scaffolds (Fig 1, predicting 37–39 chromosomes). Kmer analysis of the sequencing reads estimated the haploid genome length to be approximately 35.2 Mbp with a read error rate of less than 1% (S1 Fig, Vurture et al., 2017). While the GenomeScope [22] model does not fit the aneuploid nature of trypanosomatid genomes (see below), we believe this suggests our assembly is approximately the correct size.
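The N50 quoted above can be reproduced from a list of scaffold lengths. The following is a minimal sketch of the standard N50 definition; the toy lengths are illustrative values, not the H. muscarum scaffolds, and the function name `n50` is our own.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (illustrative only).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))
```

Applied to the 264 scaffold lengths of an assembly, the same function would return a single value comparable to the N50 reported in the text.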
Annotation
Gene model annotation was generated with Companion [23] using evidence from RNA-seq data (described below) and the proteomes of L. major, L. braziliensis and T. brucei as described in Materials and Methods. The final H. muscarum v1 annotation contains 12,687 genes, of which 12,162 are inferred to be protein-coding (Table 1).
Table 1. Herpetomonas muscarum genome annotation summary.
Feature | H. muscarum v1.0 |
---|---|
Genes | 12687 |
mRNAs | 12162 |
CDSs | 12175 |
Polypeptides | 12934 |
Pseudogenes | 772 |
rRNAs | 168 |
snRNAs | 3 |
snoRNAs | 181 |
tRNAs | 173 |
All unique open reading frames produced by the gene models were kept, even in cases where the gene prediction was not strongly supported by RNA-sequencing evidence, in an attempt to not ‘miss’ genes. It is therefore likely that this annotation contains a higher number of genes than the ‘true annotation’. However, the number of reported genes is close to that reported for other trypanosomatid species e.g. T. brucei TREU927 strain contains 11,567 genes [24]. We also note that the few T. brucei genes reported to contain intronic sequences, e.g. poly(A)-polymerase (Tb927.3.3160) and the mini-exon gene (see below), also appear to contain intronic regions in H. muscarum.
Conserved features of trypanosomatid genomes
Genome structure and large scale synteny
As seen in other trypanosomatid genomes, open reading frames were found on both strands on many scaffolds. Genes are (mostly) arranged in large groups present on the same strand and in the same direction, which is indicative of the polycistronic transcripts typical in trypanosomatid genomes. The regions between polycistrons, commonly referred to as strand switch regions (SSRs), are thought to contain the transcriptional start sites for transcription of each group of genes. We used the SSRs to define and estimate the number of polycistrons. Here we defined SSRs to begin and end at genes where the downstream open reading frame is on the opposing strand of the same scaffold. This highlighted 386 genes from 112 different scaffolds. These putative strand switches were manually inspected and could be grouped into three different situations. There were 128 bona fide strand switches which were either divergent (72 cases) or convergent (56 cases) (S1 Table). There were 166 cases where a single gene (or small group of < 5 genes) had become inverted within a polycistron. Small genes (< 350bp) encoding hypothetical proteins and tRNAs were commonly found in these cases, though other larger genes were also found in these groups, e.g. HMUS00935500.1, a putative trans-sialidase. Finally, there were 92 cases where a strand switch does occur, but the precise locus was unclear. These cases tended to occur where a single gene at the end of a scaffold was on the opposing strand to all other genes on the scaffold; as such it was unclear whether this represented a bona fide strand switch or a single gene inversion. Overall, this indicated there are at least 128 polycistrons in the H. muscarum genome, though this is likely to be an underestimate given the ambiguity of some strand switch regions. Comparisons with other trypanosomatid genomes also suggest this figure is an underestimate, e.g. L. major is predicted to have 184 polycistrons [25] and T. brucei is predicted to have 150 [26], both of which have smaller genomes and fewer predicted chromosomes than H. muscarum.
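The first, automated step of this analysis (flagging adjacent genes on opposing strands of the same scaffold) can be sketched as follows. This is an illustrative outline only: the tuple layout and the function name `strand_switches` are assumptions, and the sketch does not reproduce the manual inspection that separated bona fide switches from single-gene inversions and ambiguous scaffold-end cases.

```python
def strand_switches(genes):
    """Find adjacent gene pairs on the same scaffold with opposing strands.

    genes: iterable of (scaffold, start, strand) with strand '+' or '-'.
    Returns a list of (scaffold, left_start, right_start, kind), where kind
    is 'divergent' (left gene on '-', right gene on '+', so transcription
    runs away from the junction) or 'convergent' (left '+', right '-').
    """
    ordered = sorted(genes, key=lambda g: (g[0], g[1]))
    switches = []
    for left, right in zip(ordered, ordered[1:]):
        if left[0] != right[0] or left[2] == right[2]:
            continue  # different scaffold, or no change of strand
        kind = "convergent" if left[2] == "+" else "divergent"
        switches.append((left[0], left[1], right[1], kind))
    return switches


# Toy annotation (hypothetical coordinates).
toy_genes = [
    ("s1", 300, "+"), ("s1", 100, "-"), ("s1", 200, "-"),
    ("s1", 400, "+"), ("s1", 500, "-"),
    ("s2", 50, "+"), ("s2", 150, "+"),
]
for switch in strand_switches(toy_genes):
    print(switch)
```

A candidate list of this kind would still need to be filtered, for example by the number of genes on each side of the switch, before counting polycistrons.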
Despite diverging before the existence of mammals [27], trypanosomatids show high gene order conservation across the genome. As expected, the H. muscarum scaffolds showed synteny with other trypanosomatid genomes (Fig 2A–2E). Herpetomonas was most highly syntenic with L. major despite being considered phylogenetically closer to Phytomonas and Leptomonas. To quantify this, we took non-overlapping windows of adjacent H. muscarum genes with single copy orthologs in three comparator genomes: L. major, T. brucei and Leptomonas seymouri. For each window size, we counted how many windows have all orthologs on the same scaffold in the comparator (syntenic windows) and, of those, how many have all the genes in the same relative order as their H. muscarum orthologs (colinear windows). Almost 96% of 3-gene windows of single-copy orthologs between H. muscarum and L. major (1845/1926) are syntenic, and 53% of these are colinear (985/1845). This conserved genome structure is shared, to a slightly lesser extent, across the trypanosomatids (91.7% or 1386/1511 syntenic with T. brucei brucei, 55% or 766/1386 colinear; 80.9% or 1643/2030 syntenic with L. seymouri, 46% or 761/1643 colinear). This relationship holds across window sizes (Fig 2F). The values for synteny with Leptomonas seymouri are likely to be biased downwards by the fragmentary assembly available for that species, and this analysis does not capture rearrangements, expansions or contractions of multi-gene families, for which one-to-one orthology is unlikely to be clear.
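The window counting described above can be sketched as follows. The input layout and the function name `window_synteny` are assumptions made for illustration. In particular, this sketch treats a window as colinear if the comparator positions are strictly increasing or strictly decreasing, since scaffold orientation is arbitrary; the text does not state how reversed order was handled.

```python
def window_synteny(orthologs, k=3):
    """Count syntenic and colinear non-overlapping windows of k genes.

    orthologs: list of (comparator_scaffold, comparator_rank) for the
    single-copy orthologs of consecutive genes along one query scaffold,
    given in query gene order.
    Returns (n_windows, n_syntenic, n_colinear).
    """
    n_windows = n_syntenic = n_colinear = 0
    for i in range(0, len(orthologs) - k + 1, k):
        window = orthologs[i:i + k]
        n_windows += 1
        if len({scaffold for scaffold, _ in window}) != 1:
            continue  # orthologs split over several comparator scaffolds
        n_syntenic += 1
        ranks = [rank for _, rank in window]
        increasing = all(a < b for a, b in zip(ranks, ranks[1:]))
        decreasing = all(a > b for a, b in zip(ranks, ranks[1:]))
        if increasing or decreasing:
            n_colinear += 1
    return n_windows, n_syntenic, n_colinear


# Toy data: three windows of three genes (hypothetical positions).
toy = [("c1", 1), ("c1", 2), ("c1", 3),
       ("c1", 9), ("c1", 7), ("c1", 8),
       ("c2", 1), ("c3", 4), ("c2", 2)]
windows, syntenic, colinear = window_synteny(toy, k=3)
print(windows, syntenic, colinear)
print(f"{100 * syntenic / windows:.1f}% syntenic")
```

Summing the three counts over all scaffolds and repeating for several values of `k` would give curves of the kind summarised in Fig 2F.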
Splice leader sequence
In trypanosomatids, each mRNA is capped, via trans-splicing [reviewed in 15], with a conserved 39bp sequence called the splice leader (SL). The SL is encoded by the mini-exon genes which are found throughout the genome in tandem arrays. Each mini-exon has two components; the highly conserved 39bp sequence trans-spliced on to mRNAs (the exon) and a less well conserved intronic sequence. Between each mini-exon gene there is a variable spacer region which is not transcribed. To find the splice leader sequence for our H. muscarum isolate, we searched for the conserved 39bp SL sequence from Phytomonas serpens (L42381.1) in the H. muscarum scaffolds. This gave 259 hits over 24 scaffolds, which we used to identify 19 clusters of mini-exon gene repeats (over 15 scaffolds) containing 3–43 copies of the mini exon gene (see S2 Table). The first 111bp of the gene are common to all copies of the mini-exon gene and contain a 40bp splice leader sequence and what we predict to be the intron.
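A search of this kind can be sketched in a few lines. The sketch below is not the method used here (the search tool is not specified in this section); it assumes exact string matching on both strands and groups hits into clusters using a hypothetical `max_gap` parameter. The toy query is only the first eight bases of the splice leader.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_hits(scaffolds, query):
    """Return sorted (scaffold, start, strand) for exact matches of query.

    scaffolds: dict mapping scaffold name to sequence. Both strands are
    searched, so a palindromic query would be reported on both.
    """
    hits = []
    for name, seq in scaffolds.items():
        for strand, pattern in (("+", query), ("-", revcomp(query))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, start, strand))
                start = seq.find(pattern, start + 1)
    return sorted(hits)


def cluster_hits(hits, max_gap):
    """Group hits on the same scaffold whose starts are <= max_gap apart."""
    clusters = []
    for hit in sorted(hits):
        last = clusters[-1][-1] if clusters else None
        if last and last[0] == hit[0] and hit[1] - last[1] <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


# Toy scaffolds built around a short query (illustrative only).
query = "AACTAACG"
toy_scaffolds = {
    "s1": "TT" + query + "GGGG" + query + "T" * 30 + query,
    "s2": "CC" + revcomp(query) + "CC",
}
for cluster in cluster_hits(find_hits(toy_scaffolds, query), max_gap=20):
    print(cluster)
```

In practice a tolerant aligner would be preferable to exact matching, because mini-exon copies can differ by single bases.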
The splice leader sequence (1-40bp) and the putative intronic region (41-111bp) were then aligned with mini-exon sequences of several other trypanosomatids in the Leishmaniinae clade, including 9 other Herpetomonas isolated from heteropterans in the neotropics [28]. Whilst the splice leader sequence is well-conserved across the clade (Table 2), we observe variability in the A/T-rich region between bases 11-19bp which appears genus specific, with the exception of the Herpetomonas sequences. H. roitmani and Herpetomonas sp. TCC263 have identical sequence across the 11-19bp region. However, the H. muscarum and H. nabiculae sequences differ from each other, and from the other Herpetomonas sequences, over this variable region. Additionally, compared to other trypanosomatids, the Herpetomonas sequences have an ‘additional’ adenosine between bases 10 and 11. The intronic region from H. muscarum shows high similarity to that of previously reported Herpetomonas sequences. The first 15bp of the intronic sequence appear to be conserved in other species from the Leishmaniinae clade; however, the sequence becomes more variable thereafter in terms of both base content and length.
Table 2. Alignment of highly conserved splice leader sequences (bases 1–40 of mini-exon gene) of H. muscarum and other species from the Leishmaniinae clade.
Species | Accession # | Splice leader sequence (bases 1–40) |
---|---|---|
Herpetomonas muscarum | | AACTAACGCTAAAAATTGTTACAGTTTCTGTACATTATTG |
Herpetomonas muscarum | EU095982.1*, EU095980.1*, EU095979.1*, EU095983.1, EU095984.1, EU095981.1* | AACTAACGCTAAAAATTGTTACAGTTTCTGTACTATATTG |
Herpetomonas sp. TCC263 | EU095976.1 | AACTAAAGCATTATATAGATACAGTTTCTGTACTATATTG |
Herpetomonas sp. TCC263 | EU095977.1 | AACTAAAGCATTATATAGATACAGTTTCTGTACTATATTG |
Herpetomonas roitmani | EU095978.1 | AACTAAAGCATTATATAGATACAGTTTCTGTACTTTATTG |
Herpetomonas nabiculae | KF054153.1 | AACTAACGCTAT-TATTGTTACAGTTTCTGTACTTTATTG |
Phytomonas EM1 | X87138.1 | AACTAACGCT-ATTCTAGATACAGTTTCTGTACTTTATTG |
Phytomonas serpens | L42381.1, L42378.1, L42377.1, L42382.1, L42376.1 | AACTAACGCT-ATTCTAGATACAGTTTCTGTACTTTATTG |
Phytomonas sp. Mar8 | AF250993.1 | AACTAACGCT-ATTCTAGATACAGTTTCTGTACTTTATTG |
Phytomonas sp. Alp1 | AF250967.1 | AACTAACGCT-ATTCTAGATACAGTTTCTGTACTTTATTG |
Leishmania braziliensis | MG010484.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania tarentolae | AY100201.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania hoogstraali | AY100197.1, AY100200.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania gymnodactyli | AY100195.1, AY100196.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania adleri | AY100199.1, AY100194.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania major | XR_002460055.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania mexicana | Agami and Shapira 1992 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania donovani | CP022617.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Leishmania infantum | AF097653.1 | AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG |
Blastocrithidia culicis | DQ860204.1 | AACTAACGCT-ATATTTGTTACAGTTTCTGTACTATATTG |
Blastocrithidia culicis | DQ860203.1 | AACTAACGCT-ATATTTGTTACAGTTTCTGTACTTTATTG |
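As a small worked example, the variable positions in an alignment such as Table 2 can be listed programmatically. The sketch below uses four of the aligned sequences copied from Table 2; the function name `variable_columns` is our own, and a gap character is counted as a difference.

```python
def variable_columns(aligned):
    """Return 1-based alignment columns at which the sequences differ.

    aligned: dict mapping a label to an aligned sequence ('-' marks a gap).
    """
    seqs = list(aligned.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [i + 1 for i, column in enumerate(zip(*seqs))
            if len(set(column)) > 1]


# Four rows copied from Table 2.
table2_subset = {
    "H. muscarum (this isolate)": "AACTAACGCTAAAAATTGTTACAGTTTCTGTACATTATTG",
    "H. nabiculae": "AACTAACGCTAT-TATTGTTACAGTTTCTGTACTTTATTG",
    "P. serpens": "AACTAACGCT-ATTCTAGATACAGTTTCTGTACTTTATTG",
    "L. major": "AACTAACGCT-ATATAAGTATCAGTTTCTGTACTTTATTG",
}
print(variable_columns(table2_subset))
```

Running the same function on all rows of the table, or on subsets grouped by genus, would show which differences are shared within a genus.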
Tubulin loci
The architecture of the tubulin arrays has been described in a number of trypanosomatids [29], with two mutually exclusive formats being defined–monotypic and alternating. Monotypic tubulin arrays consist of either alpha-tubulin or beta-tubulin. Alternating arrays contain both alpha-tubulin and beta-tubulin genes which alternate along the array. The H. muscarum orthologues of Trypanosoma brucei alpha and beta tubulin genes were found using Orthofinder and used to locate the tubulin arrays.
We identified three genomic loci containing H. muscarum tubulin genes (Fig 3). Two of these loci consist of alternating beta-alpha arrays and the third consists of four copies of the beta-tubulin gene. The alternating beta-alpha arrays are consistent with previous findings (reported as Herpetomonas megaseliae) [29] and suggest that, like T. brucei, the H. muscarum genome has the alternating tubulin array configuration. However, the presence of a monotypic beta-tubulin array in addition to the alternating arrays contrasts with the established model in which each species has either alternating or monotypic arrays, but not both.
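The two array formats can be distinguished mechanically once the gene order at a locus is known. The following is a minimal sketch; the function name, the `"mixed"` fallback label and the example gene orders are illustrative assumptions rather than the annotated loci themselves.

```python
def classify_tubulin_array(genes):
    """Classify an ordered tubulin array as monotypic, alternating or mixed.

    genes: list of 'alpha' / 'beta' labels in genomic order.
    """
    if not genes:
        raise ValueError("empty array")
    if len(set(genes)) == 1:
        return "monotypic"
    if all(a != b for a, b in zip(genes, genes[1:])):
        return "alternating"
    return "mixed"  # neither format; flagged for manual inspection


# Hypothetical gene orders resembling the two kinds of locus described.
print(classify_tubulin_array(["beta", "alpha", "beta", "alpha"]))
print(classify_tubulin_array(["beta", "beta", "beta", "beta"]))
```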
The genes surrounding the monotypic beta-tubulin locus shared some synteny with regions of chromosome 4 of T. brucei and chromosome 8 of L. major (gene numbers Tb927.5.970–Tb927.5.3090 and Lmj.08.1090–Lmj.08.11140). Interestingly, this region of L. major chromosome 8 is one of two singleton beta-tubulin loci in the species. As such, the tubulin configuration of H. muscarum is intermediate between the tubulin array configurations of T. brucei and L. major.
The predicted Herpetomonas muscarum proteome
Orthofinder [30] was used to identify orthologous proteins from other trypanosomatids in the predicted proteome of H. muscarum. For the analysis, protein coding genes from the following species were used: 9 Trypanosoma species/subspecies (Trypanosoma brucei brucei, Trypanosoma brucei gambiense, Trypanosoma congolense, Trypanosoma cruzi, Trypanosoma evansi, Trypanosoma grayi, Trypanosoma rangeli, Trypanosoma theileri and Trypanosoma vivax); 4 Leishmania species (Leishmania braziliensis, Leishmania donovani, Leishmania infantum and Leishmania major); and 6 additional monoxenous trypanosomatids along with our Herpetomonas muscarum predictions (Angomonas deanei, Leptomonas pyrrhocoris, Leptomonas seymouri, Crithidia bombi, Crithidia expoeki, Crithidia fasciculata). Finally, we included a free-living, non-trypanosomatid kinetoplastid, Bodo saltans, as an outgroup. From these 21 species, 87.5% of genes were assigned to 12,701 orthogroups (for summary see Table 3, full orthogroups table S3 Table). We found that 7,265 of these orthogroups contained H. muscarum genes. There were 45 orthogroups containing only H. muscarum genes; these groups contain 215 genes. Overall, 90.7% of H. muscarum predicted proteins were assigned to an orthogroup.
Table 3. Summary of Orthofinder analysis of the 21 kinetoplastid genomes.
Total number of genes | 212,664 |
Number of genes in orthogroups | 186,070 |
Number of unassigned genes | 26,594 |
Percentage of genes in orthogroups | 87.50% |
Percentage of unassigned genes | 12.50% |
Number of orthogroups | 12,701 |
Number of species-specific orthogroups | 313 |
Number of genes in species-specific orthogroups | 4,212 |
Percentage of genes in species-specific orthogroups | 2.0% |
Mean orthogroup size | 14.7 |
Median orthogroup size | 14 |
Number of orthogroups with all species present | 9 |
Number of single copy orthogroups | 0 |
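The derived values in Table 3 follow directly from the counts. As a quick check, the arithmetic can be reproduced as below; the variable names are our own and the counts are copied from the table.

```python
# Counts copied from Table 3.
total_genes = 212_664
genes_in_orthogroups = 186_070
unassigned_genes = 26_594
n_orthogroups = 12_701
genes_in_species_specific = 4_212

# The assigned and unassigned genes should account for every gene.
assert genes_in_orthogroups + unassigned_genes == total_genes

pct_assigned = 100 * genes_in_orthogroups / total_genes
pct_unassigned = 100 * unassigned_genes / total_genes
pct_species_specific = 100 * genes_in_species_specific / total_genes
mean_orthogroup_size = genes_in_orthogroups / n_orthogroups

print(f"assigned: {pct_assigned:.1f}%")
print(f"unassigned: {pct_unassigned:.1f}%")
print(f"species-specific: {pct_species_specific:.1f}%")
print(f"mean orthogroup size: {mean_orthogroup_size:.1f}")
```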
Orthofinder also produced a phylogenetic tree based on protein sequences from orthogroups which contained a single gene from every species used in the analysis (Fig 4A). This tree is consistent with others published for the trypanosomatids (Maslov et al., 2013). Unsurprisingly, H. muscarum shares more orthogroups with L. major (6,607) than with T. brucei (5,893), which is more distantly related (Fig 4B). However, H. muscarum had slightly more orthogroups in common (6,754) with the two Leptomonas sp. used in the analysis (Fig 4C). Finally, within the Leishmaniinae clade, H. muscarum and two species of ‘old world’ Leishmania, L. major and L. donovani, shared 81.2% of their orthogroups (Fig 4D). A global examination of the patterns of gene family sharing between H. muscarum and other trypanosomatid groups confirmed these patterns (Fig 5A). Most gene families, including most genes, are present in all of the groups, and another significant set of families is shared by all the trypanosomatid groups but missing from the outgroup, the free-living kinetoplastid Bodo saltans. These Trypanosomatidae-specific gene families tend to be quite large, while many smaller gene families are specific to the genera Crithidia and Trypanosoma, perhaps because of the more extensive taxon sampling of these lineages. There are exceptions, including some notably large gene families unique to trypanosomes, Leishmania and a number of other taxonomic groups (Fig 5B). Monoxenous trypanosomatids share many more gene families with Leishmania than with Trypanosoma, and there are very few families specific to the Leishmania lineage or to any of the monoxenous parasites except Crithidia, which explains the close similarity of the predicted proteomes of Leishmania and H. muscarum.
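Counts of shared and species-specific orthogroups like those above reduce to set operations on orthogroup membership. The sketch below assumes a simple dictionary layout and uses toy identifiers; it is not tied to the OrthoFinder output format, and the function names are our own.

```python
def orthogroups_by_species(orthogroups):
    """Map each species to the set of orthogroups containing its genes.

    orthogroups: dict mapping orthogroup id -> {species: [gene ids]}.
    """
    by_species = {}
    for og_id, members in orthogroups.items():
        for species, genes in members.items():
            if genes:  # ignore species listed with no genes
                by_species.setdefault(species, set()).add(og_id)
    return by_species


def shared(by_species, *species):
    """Orthogroups present in every one of the named species."""
    return set.intersection(*(by_species[s] for s in species))


def specific(by_species, species):
    """Orthogroups found in the named species and in no other."""
    others = [ogs for name, ogs in by_species.items() if name != species]
    return by_species[species] - set().union(*others)


# Toy membership table (hypothetical identifiers).
toy = {
    "OG1": {"Hmus": ["h1"], "Lmaj": ["l1"], "Tbru": ["t1"]},
    "OG2": {"Hmus": ["h2", "h3"], "Lmaj": ["l2"], "Tbru": []},
    "OG3": {"Hmus": ["h4"]},
    "OG4": {"Lmaj": ["l3"], "Tbru": ["t2"]},
}
membership = orthogroups_by_species(toy)
print(len(shared(membership, "Hmus", "Lmaj")))
print(len(shared(membership, "Hmus", "Tbru")))
print(sorted(specific(membership, "Hmus")))
```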
We could not look in detail at all of the homology relationships between genes in this extensive comparison. We used a more focused OrthoFinder analysis to investigate specific groups of orthologues between H. muscarum and T. brucei genes of interest e.g. metabolic pathway genes, as T. brucei is the best-studied kinetoplastid at the molecular and cellular level. We summarise our findings in Table 4 (for full data see S4–S16 Tables) and discuss some of the orthologues of interest, including surprisingly ‘missing’ orthologues, below.
Table 4. Summary of H. muscarum proteins orthologous to important T. brucei proteins.
Pathway or protein group | H. muscarum orthologues / T. brucei proteins | T. brucei / L. major proteins without orthologues in H. muscarum |
---|---|---|
METABOLISM | ||
Glycolysis | 44/45 | Tb927.10.4520 |
Gluconeogenesis | 2/2 | n/a |
Pentose phosphate pathways | 12/13 | Tb927.2.5800 |
NADPH metabolism | 4/4 | n/a |
Acetate metabolism | 14/17 | Tb927.11.2230, Tb927.8.2790, Tb927.6.2790 |
TCA cycle | 17/17 | n/a |
Mitochondrial carriers | 24/25 | Tb927.9.12140 |
Respiratory chain | 79/82 | Tb927.7.6350, Tb927.10.7090, Tb927.10.3120 |
Amino acid transporters | 31/31 | n/a |
Lipid metabolism | 9/11 | Tb927.10.11930, Tb927.4.2700 |
Leu-Isoleu-Val degradation | 22/23 | Tb927.4.2700 |
Fatty Acid Biosynthesis | 14/14 | n/a |
Sphingolipid biosynthesis | 7/11 | Tb927.9.9410, Tb927.9.9400, Tb927.9.9390, Tb927.9.9380 |
Glycerophospholipid biosynthesis | 16/16 | n/a |
GPI-N-glycosylation biosynthesis | 47/49 | Tb927.4.4200, Tb927.1.4830 |
DIFFERENTIATION AND DNA | ||
Quorum sensing | 32/35 | Tb927.4.3650, Tb927.11.2250, Tb927.11.11480 |
Bloodstream to procyclic form differentiation | 10/12 | Tb927.10.10260, Tb927.10.11220 |
Epimastigote meiosis | 4/5 | Tb927.9.15510 |
RNA regulators of the life cycle | 18/18 | n/a |
Proteins with RNA-binding annotation | 54/57 | Tb927.10.14950, Tb927.6.2550, Tb927.9.6870 |
RNAi machinery | 5/5 | n/a |
PROTEIN KINASES | 147/169 | Tb11.v5.0564, Tb11.v5.0644, Tb927.1.3130, Tb927.10.12480, Tb927.10.15880, Tb927.10.4940, Tb927.10.9980, Tb927.11.5150, Tb927.11.5860, Tb927.3.1850, Tb927.3.3920, Tb927.3.5650, Tb927.3.840, Tb927.4.4330, Tb927.5.4430, Tb927.7.4090, Tb927.9.12400, Tb927.9.12880, Tb927.9.1500, Tb927.9.1570, Tb927.9.16260, Tb927.9.2350 |
PHOSPHATASES | 86/93 | Tb927.07.v5.1, Tb07.30D13.60, Tb927.10.4930, Tb927.11.11740, Tb927.11.4990, Tb927.11.5740, Tb927.8.8040 |
NUCLEAR PROTEOME | ||
Nuclear Pores | 27/27 | n/a |
Exosome | 12/12 | n/a |
Spliceosome | 56/59 | Tb927.10.7390, Tb927.9.6870, Tb927.3.1090 |
Kinetochore | 30/34 | Tb927.10.6330, Tb927.11.1030, Tb927.5.4520, Tb927.9.13970 |
OTHER PROTEINS OF INTEREST | ||
GP63 | 14/15 | Tb927.11.7610 |
Mucins | 8/11 | Tb927.8.7190, TcMUCII, Tb927.11.18610, Tb927.11.3400 |
LPG biosynthesis | 20/29 | LmjF.14.1400, LmjF.02.0160, LmjF.02.0170, LmjF.02.0190, LmjF.02.0200, LmjF.02.0210, LmjF.02.0230, LmjF.35.0010, LmjF.25.2460, LmjF.31.3190, LmjF.36.0010, LmjF.02.0010, LmjF.21.0010, LmjF.07.1170, LmjF.34.0510, LmjF.02.0180, LmjF.02.0220, LmjF.05.1230, LmjF.19.650, LmjF.32.3900 |
Trypanothione synthesis | 2/2 | LmjF.05.0350, LmjF.27.1870 |
Metabolism
H. muscarum is missing the sphingolipid biosynthesis genes SLS1-4, which include the inositol phosphorylceramide synthase and two choline phosphorylceramide synthases. These genes belong to a single orthogroup in our analysis. Most of the Trypanosoma have 4 genes assigned to this orthogroup (with the exception of T. cruzi (2) and T. vivax (0)), whereas other species used in this analysis had only 1 gene assigned to it. Given that sphingolipids are thought to be essential to eukaryotic membranes [31], this seemed surprising. However, L. major promastigotes do not require de novo sphingolipid synthesis: a mutant devoid of sphingolipids was viable and replicated as log-phase promastigotes [32], although it was unable to differentiate into a metacyclic stage in vitro and showed severe defects in vesicular trafficking. As such, like L. major, H. muscarum and the other species without a complete SLS pathway may rely on scavenging sphingolipids from the environment.
H. muscarum did not have orthologues for the carnitine O-acetyltransferase (CAT) (Tb927.11.2230) and L-threonine 3-dehydrogenase (Tb927.6.2790) genes of the acetate metabolism pathway. We were also unable to find an orthologue to these genes in other species from the Leishmaniinae clade used in the analysis. As such these genes may have been lost sometime after the group diverged from Trypanosoma.
Additionally, three T. brucei respiratory chain genes did not appear to have orthologues in H. muscarum, including mitochondrial NADH-ubiquinone oxidoreductase flavoprotein 2 (Tb927.7.6350), which had orthologues in all species used in the analysis apart from H. muscarum. Similarly, the only genomes in the analysis without an orthologue for the cytochrome c oxidase assembly protein (Tb927.10.3120) were H. muscarum and Phytomonas EM1. Given the importance of these genes, this likely indicates an important gap in the H. muscarum annotation. Finally, no orthologue was identified for the T. brucei alternative oxidase (AOX) (Tb927.10.7090), which is found in Trypanosoma and is upregulated in bloodstream forms. This oxidase is thought to enhance an organism’s ability to cope with stress associated with temperature change, infections and oxidative stress [33].
We also note that several T. brucei genes have multiple H. muscarum orthologues. Two of the most extreme examples are the high-affinity arginine transporter AAT13 [34, 35] and the endo-/lysosome-associated membrane-bound phosphatase 2 (MBAP2), which have 38 and 18 orthologous genes in H. muscarum, respectively. The increased copy number of these genes hints at their importance, though the reason for it is as yet unclear. AAT13 and MBAP2 have been shown to be highly upregulated in Leishmania after their ingestion by sand flies and in conditions of nutrient starvation [36, 37]. Speculatively, the increased copy number of these genes may reflect the nutrient availability in Herpetomonas’ environment/host(s).
Differentiation
RNA-binding proteins (RBPs) have emerged as key modulators of gene expression in trypanosomatids—particularly in the context of trypanosome development and differentiation [38]. Orthologues were found for 72/75 T. brucei RNA-binding proteins. RNA-binding proteins with no orthologues found in H. muscarum were: chromatin-remodelling-associated RRM2 (Tb927.6.2550) [39], the pre‐RNA processing protein RBSR1 (Tb927.9.6870) [40] and a hypothetical RBP (Tb927.10.14950).
We have not observed differentiation in H. muscarum using ‘classical’ temperature/pH manipulations in vitro or during D. melanogaster infections. As such, the ‘completeness’ of the H. muscarum RBP repertoire, relative to T. brucei which has multiple discrete forms, is of interest. Several of these proteins had multiple orthologues in H. muscarum, including RBP10 (4 orthologues, Tb927.8.2780). RBP10 is known to be highly expressed in bloodstream forms of T. brucei, and its overexpression in procyclics led to an increase of many bloodstream-form specific mRNAs, as well as transcripts associated with sugar transport, the flagellum and cytoskeleton [41]. The role of this protein in H. muscarum is unclear, as the parasite does not appear to have a bona fide vertebrate host; however, given this protein’s links to sugar transport, it may play a more general role in metabolism in H. muscarum. Comparisons of H. muscarum RBP expression levels/timings with other trypanosomatids may shed more light on their role in the cell and potentially on why we do not observe differentiated forms for this species.
In addition to the RBPs, we were unable to find any orthologues for the hydrophilic acylated surface proteins (HASPs) or small hydrophilic endoplasmic reticulum-associated proteins (SHERPs) which are associated with metacyclogenesis in Leishmania. We also note that the repressor of differentiation kinase 1 (RDK1, Tb927.11.14070) has 6 orthologues in H. muscarum. In T. brucei, RDK1 acts with the PTP1/PIP39 phosphatase cascade to prevent uncontrolled differentiation from bloodstream to procyclic form [42]. Given that H. muscarum is thought to be confined to insects, the presence of multiple copies of this gene which assists in maintaining a ‘vertebrate’ cell form in T. brucei is intriguing. It may be that this protein has an alternative role in H. muscarum.
Surface proteins
No orthologues were found for the EP procyclins, which are known to be highly expressed by T. brucei procyclic forms whilst in the tsetse vector and are thought to provide protection from the digestive enzymes in the insect midgut [43, 44]. As such, H. muscarum likely relies on other surface proteins for protection in the insect midgut (see the transcriptomic data below).
The lipophosphoglycan (LPG) is an abundant component of the Leishmania cell surface and its importance during multiple stages of the Leishmania life cycle, including interactions with the insect gut epithelium, is well known [45, 46]. As such, the presence of LPG synthesis enzymes in H. muscarum is of great interest (see Table 4 and S9 Table). Single copy orthologues were found for the LPG biosynthesis-associated proteins GPI12/14 and LPG2-5. The β-galactofuranosyl transferases LPG-1, -1R and -1L were grouped together in a single orthogroup (orthogroup 32) which contained 12 H. muscarum orthologues. However, no orthologues could be found for the β-galactofuranosyl transferases LPG1G1-3 in H. muscarum; these genes were only found in Leishmania species and L. pyrrhocoris in our analyses (orthogroup 7861). Orthogroup 32 contained genes from all species used in this analysis with the exception of the two T. brucei subspecies. Speculatively, orthogroup 32 may represent a more ancient group of these enzymes, whilst orthogroup 7861 may be a more recent development within the Leishmania/Leptomonas species.
The three L. major side chain arabinosyltransferases SCA1, 2 and L were grouped into a single orthogroup (orthogroup 886). This orthogroup consisted of only Leishmania, Leptomonas and T. grayi proteins. Similarly, the L. major side chain galactosyltransferases (SCG1-7) and related proteins (SCGR1-6) were grouped into a single orthogroup (orthogroup 60) which contained protein sequences from only Leishmania and Leptomonas suggesting these proteins may be Leishmaniiae specific.
Orthofinder was unable to find an orthologue to the major surface proteins of salivary gland forms of T. brucei, the BARPs (bloodstream alanine-rich proteins). These GPI-anchored proteins are required for tsetse salivary gland colonisation [47, 48]. Additionally, we do not find orthologues for the T. brucei metacyclic invariant surface proteins (MISPs), which are found extending above the VSG coat in salivary gland metacyclic forms [49]. Given that these proteins are crucial for salivary gland colonisation, the lack of copies in the H. muscarum genome may partially explain the inability of H. muscarum to colonise the salivary glands of D. melanogaster; instead, infections are confined to the insect crop and gut [20].
Finally, the 13 T. brucei GP63 genes were grouped with 28 H. muscarum genes. GP63 is a major surface protease in L. major promastigotes. The comparatively high copy number of GP63 in H. muscarum may highlight its importance. Furthermore, GP63 has been implicated in Leishmania virulence [50], and as such these will be of interest in future studies.
Nuclear proteome
Kinetochore interacting protein 3 (KKIP3, Tb927.10.6700) and SR protein (Tb927.9.6870) had no orthologues in H. muscarum or other species from the Leishmaniiae clade used in the analysis and as such they appear to be Trypanosoma specific. RNAi of KKIP3 in T. brucei resulted in defects in DNA segregation and reduced population growth [51].
Additionally, T. brucei’s kinetochore interacting protein 1 (KKIP1), PHF5-like protein (Tb927.10.7390) and U1 small nuclear ribonucleoprotein 24 kDa (Tb927.3.1090) had orthologues in all species used in the analysis apart from H. muscarum. Similar to KKIP3, RNAi knock down of KKIP1 caused defects in DNA replication, though in the case of KKIP1 these defects were more severe, resulting in the loss of entire chromosomes [51]. It is unclear if these genes have been lost in H. muscarum or this indicates a gap in the current annotation. Based on the importance of KKIP1 and the fact these genes have orthologues in all other species analysed, it is likely to be the latter.
Finally, H. muscarum appears to have a ‘full set’ of the T. brucei RNA interference pathway genes including an orthologue for TbARGO1 (Tb927.10.10850). Genes from this well-conserved (in metazoans) pathway have been lost in several trypanosomatids including: L. major, L. donovani and T. cruzi [52, 53, 54]. The loss of this pathway in these organisms has been linked to Leishmania RNA virus perturbation [54, 55], though this has not been explicitly demonstrated. Further investigations to look for evidence of viruses akin to the LRVs in H. muscarum could test the link between RNAi and virus infection in trypanosomatids. The presence of a functional RNAi pathway has also been linked to transposon activity in Leishmania, with RNAi-negative species lacking active transposable elements (TEs), and RNAi competent L. braziliensis harbouring several classes of active TEs [55, 56]. Given this, it is possible that the loss/lack of active TEs in L. major and L. donovani has lifted the requirement of the RNAi pathway to protect against TE-associated genomic perturbations. We did observe transcripts corresponding to the telomere associated transposable elements (TATEs) in all H. muscarum transcriptomes (see below). As such, there may also be an important link between RNAi and transposon activity in trypanosomatids.
The H. muscarum transcriptome during in vitro culture
We first analysed the transcriptome of H. muscarum during in vitro axenic culture, specifically to compare log-phase and stationary phase cultures. Knowledge of the log-phase transcriptome was especially important as this was the ‘pre-infection’ transcriptome in our Drosophila infection model. By comparing the log-phase H. muscarum transcriptome with that of H. muscarum in flies we sought to identify genes important in the establishment of infection (see section below). The principal component analysis (PCA) plot (S2 Fig) shows that the first principal component is mostly capturing variation between distinct clusters of samples from log and stationary phase and explains 68% of the variance in these data. As expected, we found extensive differential expression between log-phase and stationary phase, with 4044 genes significantly differentially regulated (p-adjusted <0.05) (S16 Table). This is approximately a third of the genome but most changes in expression were modest, with only 264 genes upregulated ≥ 2-fold in stationary phase cells and 811 downregulated ≥ 2-fold which we will discuss further below. GO enrichment analysis, using Ontologizer [57], did not identify any significantly enriched GO terms associated with differentially regulated genes. However only 62% of H. muscarum genes have associated GO terms. As such, we looked for enrichment in Pfam domains. There were 26 Pfam domains significantly enriched in the genes upregulated in stationary phase and 73 Pfam domains significantly enriched among downregulated transcripts (S17 Table), which we discuss further below.
Cell cycle associated proteins
The Pfam domain associated with cyclins was significantly enriched in genes upregulated in stationary phase cells. From this, we investigated the expression profiles of the cyclins, and their associated kinases. Eleven were found to be differentially regulated between the two cell populations (Table 5).
Table 5. Significantly differentially regulated cyclins and cyclin-related kinases between stationary and log phase H. muscarum.
Gene Name | H. muscarum orthologue ID | log2FoldChange | adjusted p-value |
---|---|---|---|
CRK4 | HMUS00195900.1 | 1.3 | 8.89E-10 |
cyclin 11 | HMUS01322900.1 | 1.2 | 4.32E-05 |
cyclin 2 | HMUS00751100.1 | 1.2 | 2.72E-19 |
cyclin 4 | HMUS00787500.1 | 1.1 | 1.53E-17 |
cyclin 7 | HMUS00475100.1 | 0.8 | 2.41E-14 |
CRK10 | HMUS01143000.1 | 0.7 | 6.49E-09 |
cyclin 5 | HMUS00580100.1 | 0.7 | 2.02E-12 |
cyclin 10 | HMUS01323000.1 | 0.5 | 0.001 |
CRK12 | HMUS00986000.1 | 0.3 | 0.015 |
DNA-directed RNA polymerase III subunit, putative | HMUS00638800.1 | -0.3 | 0.032 |
mitochondrial DNA polymerase I protein C | HMUS00828800.1 | -0.5 | 0.006 |
mitochondrial DNA polymerase I protein D | HMUS00617400.1 | -0.5 | 0.018 |
mitochondrial DNA polymerase I protein B | HMUS01100200.1 | -0.6 | 0.007 |
DNA polymerase alpha/epsilon subunit B | HMUS00740000.1 | -0.7 | 0.004 |
DNA polymerase delta catalytic subunit | HMUS00566500.1 | -0.7 | 0.006 |
CRK3 | HMUS00914500.1 | -1.0 | 1.06E-40 |
cyclin 8 | HMUS00524500.1 | -1.0 | 1.40E-39 |
There was significant downregulation of the mitosis-associated cyclin 8, CRK3 and several mitochondrial DNA polymerase subunits in stationary phase cells. Knockdown of CRK3 in T. brucei is associated with a reduction in cell growth [58]. Furthermore, there was upregulation of the G1-associated cyclins 7, 4 and 11. These observations reflect the observed reductions in cell replication at higher cell densities. Consistent with this, and with a reduction in cell growth, there were also significant reductions in transcripts for α- and β-tubulins, DNA polymerases and several protein synthesis-related genes including: 40S ribosomal subunits, 28S rRNAs and five putative elongation factor 2 genes. However, there was also upregulation of mitosis-associated cyclin 2 in the stationary phase cells. Cyclin 2 has two roles in T. brucei procyclics: cell cycle progression through G1 and the maintenance of correct cell morphology at the posterior end of the cell [59]. The CRKs 10 and 12, which were also upregulated in stationary phase cells, have been shown to interact with cyclin 2 and their knock-down results in growth defects [60]. CRK12 is also essential to survival of T. brucei in mice and its depletion by RNAi led to defects in endocytosis, an enlarged flagellar pocket and abnormal kinetoplast localisation [61]. Given the relative abundance of many transcripts associated with reduced replication in stationary phase cells, the upregulation of cyclin 2 and its associated CRKs (10 and 12) may be more relevant to the maintenance of correct cell morphology than mitosis.
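As an illustration of how the thresholds used in this section partition a gene list, the following minimal sketch classifies the cyclin and CRK rows of Table 5 by adjusted p-value and fold change. The function name and the cut-off arguments are illustrative assumptions rather than part of the original analysis pipeline, and because the table values are rounded, genes listed at exactly ±1.0 may fall either side of a strict 2-fold cut-off in the unrounded data.

```python
# Cyclins and CRKs from Table 5: (gene, log2 fold change, adjusted p-value).
# Positive log2 fold change = higher in stationary phase than in log phase.
TABLE_5 = [
    ("CRK4", 1.3, 8.89e-10),
    ("cyclin 11", 1.2, 4.32e-05),
    ("cyclin 2", 1.2, 2.72e-19),
    ("cyclin 4", 1.1, 1.53e-17),
    ("cyclin 7", 0.8, 2.41e-14),
    ("CRK10", 0.7, 6.49e-09),
    ("cyclin 5", 0.7, 2.02e-12),
    ("cyclin 10", 0.5, 0.001),
    ("CRK12", 0.3, 0.015),
    ("CRK3", -1.0, 1.06e-40),
    ("cyclin 8", -1.0, 1.40e-39),
]


def classify(rows, padj_cutoff=0.05, min_abs_log2fc=1.0):
    """Split significant genes into up- and down-regulated lists.

    Hypothetical helper: min_abs_log2fc=1.0 corresponds to a 2-fold change.
    """
    up, down = [], []
    for gene, log2fc, padj in rows:
        if padj >= padj_cutoff:
            continue  # not significant at the chosen cut-off
        if log2fc >= min_abs_log2fc:
            up.append(gene)
        elif log2fc <= -min_abs_log2fc:
            down.append(gene)
    return up, down


up, down = classify(TABLE_5)
print("up >= 2-fold in stationary phase:", up)
print("down >= 2-fold in stationary phase:", down)
```

Lowering `min_abs_log2fc` to 0 returns every significant gene split by direction only, which mirrors the distinction drawn in the text between all significantly regulated genes and those changing at least 2-fold.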
Stress and metabolism
Stationary phase (of growth) is associated with build-up of toxic waste products and fewer nutrients available per cell. It was therefore unsurprising that we observed transcriptional changes indicating metabolic change and nutrient starvation. Genes containing the Pfam domain associated with major autophagy marker ATG8 were significantly enriched in stationary phase transcripts (33 in total). Autophagy is a vital process for survival in nutrient poor environments and involves the segregation of the cell components to be recycled into double membrane-bound vesicles called autophagosomes. The requirement for increased amounts of membrane in autophagy may partially explain the upregulation of fatty-acid synthesis related genes in stationary phase, as fatty acids are crucial components of cell membranes. Three lipases, two putative lipase precursor-like proteins, fatty-acyl-CoA synthase 1 and a putative fatty acid elongase (ELO) protein were upregulated upon entry into stationary phase. This is consistent with observations of Trypanosoma cruzi cultures [62].
Whilst the upregulation of autophagy-related genes is an indicator of cell stress, we also observed the downregulation of several genes with domains associated with responding to oxidative stress including: thioredoxin, glutathione S-transferase and alkyl hydroperoxide reductase (AhpC)/thiol specific antioxidant (TSA). As such, cells do not appear to be under significant oxidative stress. Other forms of stress, such as reduced nutrient availability or pH changes, may be driving the predicted increases in autophagy. Additionally, transcripts bearing the heat shock protein 60 (HSP60) domain (PF00118) were also significantly enriched in the downregulated transcripts, which is another indicator of cell stress.
Cell surface proteins
Proteins sharing a domain (cl28643) with the variant-specific surface proteins (VSPs) of Giardia lamblia, a flagellated intestinal pathogen, were highly represented among genes upregulated in stationary phase H. muscarum. In G. lamblia, these VSPs are integral membrane proteins rich in cysteine residues, often in CxxC repeats. They have a highly conserved C-terminal membrane spanning region which has a hydrophilic cytoplasmic tail with a conserved five amino acid CRGKA signature sequence, and an extended polyadenylation signal [63, 64]. One VSP, of hundreds in the Giardia genome, is expressed per Giardia cell and they are thought to protect the cells from proteolysis [65]. A similar strategy of surface protein expression is utilised by blood stage T. brucei cells [66]. This method of antigen switching plays a major role in immune system avoidance and survival in vertebrate hosts. In H. muscarum the VSP domain-containing genes are predicted, by Phobius [67], to encode proteins with 8–9% cysteine residues and a single transmembrane domain at the C-terminus. Notably there were also ten VSP domain-containing proteins downregulated upon entry into stationary phase.
In addition to the VSP domain-containing genes, several other putative surface proteins were differentially regulated upon entry to stationary phase: two putative amastin genes were highly upregulated, and eight transcripts which encode proteins with the cytomegalovirus UL20A protein domain (PF05984) were downregulated in stationary phase H. muscarum cells. The functions of proteins with UL20A domains, including the domain’s namesake, are largely unknown. Deletion of UL20A from the human cytomegalovirus genome resulted in reduced viral production in infected fibroblasts [68]. Further study will be required to elucidate the role of these proteins in trypanosomatids.
Transcription
The bias towards downregulated transcripts in the stationary phase cells as compared to log phase suggests a reduction of transcription and translation during stationary phase. Furthermore, five tRNA synthetase Pfam domains (PF00133.22, PF00749.21, PF00152.20, PF00587.25, PF01411.19) were significantly enriched in downregulated transcripts (chi-squared, p < 0.05) and RNA polymerase III subunits were also downregulated. Overall, transcriptomic changes associated with cell surface remodelling, autophagy and reductions in transcription were observed in cells entering stationary phase. Cyclin expression patterns appear to suggest a bias in cells at G1 phase, as reported for in vitro culture of T. brucei procyclics [69].
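The kind of chi-squared enrichment test referred to above can be sketched for a single Pfam domain as a 2×2 comparison of genes inside and outside the downregulated set. This is a minimal illustration only: the function name and the counts are hypothetical, and the sketch assumes a Pearson chi-squared test without continuity correction or multiple-testing adjustment, neither of which is specified in the text.

```python
import math


def chi_squared_2x2(in_set_with, in_set_without, rest_with, rest_without):
    """Pearson chi-squared statistic and p-value (1 d.f.) for a 2x2 table.

    Rows: genes in the set of interest vs. all remaining genes.
    Columns: genes carrying the Pfam domain vs. genes lacking it.
    """
    a, b, c, d = in_set_with, in_set_without, rest_with, rest_without
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only (not taken from the study):
# 20 of 100 downregulated genes carry the domain vs. 10 of 900 other genes.
stat, p_value = chi_squared_2x2(20, 80, 10, 890)
print(f"chi-squared = {stat:.2f}, p = {p_value:.3g}")
```

In practice one such test would be run per Pfam domain, so some correction for multiple comparisons would normally be applied to the resulting p-values.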
Transcriptome of H. muscarum inside D. melanogaster compared to in vitro culture
To identify potentially important H. muscarum genes during the infection of D. melanogaster we sought to analyse the transcriptome of the trypanosomatid over the course of infection by RNA-sequencing analysis. RNA was purified from infected flies at 6, 12, 18, and 54 hours post-ingestion of H. muscarum. The resulting RNAs were sequenced and mapped to the concatenated genomes of D. melanogaster and H. muscarum. Reads were later resolved to the corresponding species. Here we will discuss the resulting transcriptome of H. muscarum: the transcriptome of D. melanogaster after ingestion of H. muscarum in the same experiment was discussed elsewhere [20].
The number of reads which mapped to the H. muscarum genome ranged from 6949 to approx. 16.2 million reads per sample. At 6 hours post ingestion 40% of the total mapped reads were shown to map to H. muscarum (average of 3 biological replicates). This decreased to 20% in samples from 12 hours and 9% at 18 hours post ingestion. This correlates with the observed decrease in H. muscarum numbers as the parasite was cleared by D. melanogaster 18–54 hours post ingestion [20]. For differential expression analysis, only data up to 18 hours post infection was used as at 54 hours the number of sequencing reads mapping to the H. muscarum genome dropped below 1% of the total number of mapped reads (Fig 6).
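The per-sample proportions described above follow directly from the read counts assigned to each species after mapping to the concatenated genome. The sketch below shows that calculation and a 1% cut-off of the kind used to exclude the 54-hour time point; the sample names, read counts and helper function are hypothetical and chosen only to illustrate the arithmetic.

```python
def parasite_read_fraction(parasite_reads, fly_reads):
    """Fraction of all mapped reads that were assigned to the parasite."""
    total = parasite_reads + fly_reads
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return parasite_reads / total


# Hypothetical (parasite, fly) mapped read counts per sample.
samples = {
    "6h_rep1": (4_000_000, 6_000_000),
    "12h_rep1": (2_000_000, 8_000_000),
    "18h_rep1": (900_000, 9_100_000),
    "54h_rep1": (50_000, 9_950_000),
}

MIN_FRACTION = 0.01  # assumed cut-off: at least 1% of mapped reads from the parasite

fractions = {name: parasite_read_fraction(*counts) for name, counts in samples.items()}
kept = [name for name, frac in fractions.items() if frac >= MIN_FRACTION]

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of mapped reads assigned to the parasite")
print("retained for differential expression:", kept)
```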
Principal component analysis (PCA) shows that the first two principal components of variation in mRNAs between H. muscarum from in vitro culture and H. muscarum after ingestion by D. melanogaster explained 58% and 10% of the variance in these data (Fig 6B). The PCA plot shows a high degree of difference between the in vitro samples and samples isolated from infected flies. The level of change in expression was much higher than between the two in vitro conditions discussed above.
For the infections, log phase H. muscarum cultures were used to feed the flies. In order to identify transcriptomic changes in H. muscarum associated with being ingested by the fly, we compared the transcriptome of H. muscarum cells from log phase in vitro culture to the in-fly transcriptomes. Over a third of the genome, 4,633 genes, was significantly differentially regulated (Wald test, adjusted p-value < 0.05) between log phase axenic culture samples and samples from infected flies (S18 Table). Comparisons of gene expression between sequential time points over the course of infection revealed that there was a large initial transcriptomic change upon ingestion with 4662 genes differentially regulated between log phase culture and six hours post ingestion. This large initial transcriptomic shift was followed by more subtle transcriptomic changes between 6–12 (204 genes) and 12–18 hours (25 genes) (adjusted p-values < 0.05). Here we describe some of the changes in gene expression observed after ingestion and how these compare with other published transcriptome studies of trypanosomatids in their insect vectors including notable work by Inbar et al., 2017 [37] on gene expression of four morphologically distinct L. major stages in a sand fly vector and Savage et al., 2016 [70] on T. brucei in three tsetse fly tissues.
Herpetomonas muscarum genes differentially regulated at six hours post-ingestion by Drosophila melanogaster
Approximately a third of the H. muscarum genome was found to be significantly differentially expressed between log phase axenic culture and six hours post ingestion by D. melanogaster (p < 0.05) (S19 Table). Of this subset, 640 genes had a fold change of ≥ 4 between the time points, highlighting the magnitude of the trypanosomatid’s response to ingestion. GO enrichment analysis, using Ontologizer [57], identified two significantly enriched GO terms in the 346 transcripts comparatively enriched at six hours post ingestion: GO:0000045 (autophagosome assembly, p = 0.0014) and GO:0003333 (amino acid transporters, p = 0.0002). Given the aforementioned lack of annotated GO terms in H. muscarum, we also looked at Pfam enrichment in the H. muscarum genes significantly upregulated upon ingestion by the fly. The top 15 represented Pfam domains in genes upregulated ≥ 4-fold at six hours post-ingestion are all significantly enriched compared to the full gene set (S20 Table). Additionally, there were several Pfam domains enriched in the downregulated transcripts, which we discuss further below.
Leucine-rich repeat proteins
The most represented Pfam domain in genes upregulated at 6 hours post ingestion was the leucine-rich repeat (LRR) domain. LRRs are primarily known to be involved in protein-protein and protein-glycolipid interactions and are the major domain of the Leishmania promastigote surface antigens (PSAs), which are known virulence factors. Ten of the upregulated LRR-containing genes encode orthologues of the Leishmania PSAs (Fig 7A). The predicted protein structures for 8/10 of these transcripts consist of a single transmembrane domain at the N-terminus, with the majority of the protein predicted to be on the external face of the cell (S21 Table). One transcript encodes a protein with no predicted transmembrane domains and could therefore be a secreted protein. The remaining transcript encodes a protein with two predicted transmembrane domains, with the region between these domains on the external face of the cell. Other upregulated LRR-containing transcripts are putative adenylate cyclases. These proteins also feature prominently in the T. brucei genes which are differentially regulated upon ingestion by tsetse [70]. These signalling proteins likely help the trypanosomatids coordinate their responses to the environment within their vector.
Cell surface genes
Seven of the top fifteen genes, 21/346 overall, upregulated in H. muscarum at six hours post ingestion by D. melanogaster contained the Giardia variant-specific surface protein (VSP) domain (PF03302.13). These genes are members of three distinct orthogroups. A heatmap showing the normalised read counts for these genes across all samples is shown in Fig 7B. Transmembrane domain prediction tools [67, 71] predict a single transmembrane domain at the N-terminus in the majority of the predicted protein sequences for these genes. However, there were also eight transcripts without predicted transmembrane domains, which are predicted to be secreted proteins. The majority of these putative surface antigens are 769–781 amino acids in length and have a single predicted transmembrane helix at residues 7–29 (S21 Table). As previously mentioned, many of these proteins are also upregulated by the cells upon entry into stationary phase, though not to the same levels. Additionally, several transcripts for VSP-containing proteins are downregulated in H. muscarum upon entry into the fly. These thirteen proteins are generally smaller than those upregulated at the same time point (95–501 amino acids) and tended to be part of orthogroup 11.
Thirty amastins, from 11 different orthogroups, were differentially regulated in H. muscarum at 6 hours post ingestion (Fig 7C). The majority (21) were upregulated upon entry into the fly, though 14 transcripts were also upregulated during stationary phase in vitro culture. Each orthogroup represented contained both up- and down-regulated genes. The functions of this family of glycoproteins are not well understood. In Leishmania, amastins are more commonly associated with macrophage-dwelling amastigote forms, where they are known to be important to both survival and virulence [72]. However, it has also been shown that β-amastins are upregulated during the insect stages of the life cycle in T. cruzi [73]. The H. muscarum amastins from orthogroup 18 share only 25–30% identity (across the whole sequence) with the two pairs of T. cruzi β-amastin alleles highlighted in that study. This may initially seem to be quite low; however, the β-amastins have been shown to be highly divergent (18–25% identity) between T. cruzi strains [73]. Therefore, based on sequence alone, it is unclear which proteins may have parallel roles in the two trypanosomatid species.
Several other classes of surface protein genes were differentially expressed between log-phase axenic culture and six hours post-ingestion. Transcripts for proteins containing the Cytomegalovirus UL20A protein domain (PF05984) were significantly downregulated upon ingestion. Five of these genes were from orthogroup 11, the same group as many of the downregulated VSP domain-containing genes. Finally, sixteen (of the twenty-eight in the genome) H. muscarum orthologues to known Leishmania virulence factor, GP63, were significantly differentially regulated in the first six hours post ingestion by the fly. All but one of the differentially regulated GP63 orthologues were predicted to be GPI-anchored at the cell surface (GPI-SOM online tool) [74]. The exception, HMUS00892600.1, is predicted (TMHMM v2.0) [71] to have a single transmembrane domain and for the majority of the protein to be cytosolic. Most GP63 transcripts were upregulated in H. muscarum after ingestion (log2 fold changes 0.29–2.73); however, two putative GP63 genes, HMUS01311000 and HMUS01311200, were downregulated with log2 fold changes of -1.94 and -1.58 respectively.
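Because the GP63 changes are reported as log2 fold changes while other thresholds in this section are given as linear fold changes (for example ≥ 4-fold, which corresponds to an absolute log2 fold change of ≥ 2), a short conversion may help when comparing the two. The sketch below applies the standard relationship, fold change = 2 raised to the log2 fold change, to the GP63 values quoted above; the helper name and dictionary labels are illustrative only.

```python
def fold_change(log2fc):
    """Convert a log2 fold change to a linear fold change.

    Values above 1 indicate upregulation; values below 1 indicate downregulation.
    """
    return 2.0 ** log2fc


# log2 fold changes quoted in the text for GP63 orthologues after ingestion.
gp63_log2fc = {
    "smallest reported increase": 0.29,
    "largest reported increase": 2.73,
    "HMUS01311000": -1.94,
    "HMUS01311200": -1.58,
}

for label, value in gp63_log2fc.items():
    print(f"{label}: log2FC {value:+.2f} -> {fold_change(value):.2f}-fold")
```

A downregulated value can be read as a reduction factor by taking the reciprocal, so a linear fold change of 0.25 corresponds to a 4-fold decrease.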
Stress-related genes
The insect gut is a hostile environment. The presence of digestive enzymes, changes in pH and the insect’s gut microbiota make survival a difficult challenge for any invading organism. Consistent with this, a number of stress-associated genes and pathways are upregulated in H. muscarum upon entry into the fly. As previously mentioned, autophagy is an important process for survival in stressful conditions where fewer nutrients are available, such as in the midgut of an insect. Similar to what was observed in stationary phase axenic culture, twenty-six putative ATG8 genes were upregulated in H. muscarum at six hours post ingestion compared to log-phase axenic culture, suggesting extensive protein recycling is occurring in the cells. Additionally, 40 heat shock protein 83 genes were shown to be upregulated at six hours after ingestion. Heat shock proteins act as molecular chaperones which stabilise other proteins, help them to fold correctly and be regulated after damage in stressful conditions. The upregulation of these genes provides further evidence that these cells are in a stressed state.
Metabolism
There was significant enrichment of putative amino acid, pteridine and sugar transporters in the upregulated transcripts. These included the amino acid transporters (AATs) orthologous to the Leishmania amino acid permease 3 (AAP3), AAT11, AAT12 and AAT20. AAP3 has been shown to be arginine specific and is linked to virulence in L. donovani infections in humans [75]. AAT11 is upregulated during stress responses associated with purine starvation [76]. In L. major, AAP3 and AAT20 were strongly upregulated in the motile, gut-dwelling nectomonad forms [37]. These transporters have been shown to transport neutral amino acids across the cell membrane, notably proline and alanine, which can be used as alternative carbon sources by trypanosomatids and are abundant in insect vector haemolymph.
Six putative pteridine transporters were also upregulated in H. muscarum at 6 hours post ingestion. Pteridines are needed by trypanosomatids to produce enzyme cofactors such as biopterin. Leishmania parasites are unable to synthesize their own pteridines [77] and as such must scavenge them from their environment. It is not currently known if H. muscarum is also a pteridine auxotroph, however like the Leishmania species, the cells appear to scavenge from the environment upon entry into the fly.
Several transcripts putatively involved in lipid metabolism were downregulated in H. muscarum following ingestion by D. melanogaster, including triglyceride lipases and members of the biotin/lipoate protein ligase (BLPL) family. This contrasts with what has been observed in L. major in the midgut of sand flies, where genes from these families were upregulated [37]. Therefore, whilst upregulation of pteridine and amino acid transporters appears to be a conserved trypanosomatid response to being ingested by insects, lipid metabolism during insect infection may differ between trypanosomatid genera.
Gene expression-related transcripts
Consistent with the differential expression of many genes upon entry into the fly, and therefore a predicted increase in chromatin remodelling and translation activity, there was upregulation of histones (2A, 3 and 4), RNA polymerase subunits 1 and 2, putative 40S/60S ribosomal proteins and putative 28S beta rRNAs in H. muscarum after ingestion by the fly. This result is consistent with what has been reported in T. brucei where the 40S and 60S ribosomal subunits were amongst the most highly upregulated genes in cells isolated from the midgut and proventriculus of G. morsitans [70].
Cell cycle
Upon ingestion by the fly there was strong upregulation of putative G1-associated cyclins 4, 7 and 11 as well as the G1 associated cyclin-related kinase 1 (CRK1) [58]. Cyclin 6, cyclin 8 and CRK9, which are associated with the G2/mitosis transition [59, 78], were slightly downregulated suggesting a reduction in cell replication at six hours post ingestion (Table 6). Consistent with this there was also downregulation of putative DNA polymerase kappa, the theta DNA polymerase subunit and mitochondrial DNA polymerase subunits. Furthermore meiosis-associated genes NBS1, Rad50 and SPO11 were also downregulated.
Table 6. Cell cycle-associated proteins differentially expressed in H. muscarum upon ingestion by D. melanogaster.
Gene Name | H. muscarum orthologue ID | log2foldchange | adjusted p-value |
---|---|---|---|
cyclin 11 | HMUS01322900.1 | -3.31 | 6.54E-27 |
cyclin 4 | HMUS00787500.1 | -1.15 | 3.62E-13 |
CRK4 | HMUS00195900.1 | -0.95 | 3.46E-03 |
CRK1 | HMUS01116400.1 | -0.84 | 9.95E-08 |
CRK8 | HMUS00385600.1 | -0.49 | 2.32E-02 |
cyclin 7 | HMUS00475100.1 | -0.44 | 2.21E-02 |
cyclin 8 | HMUS00524500.1 | 0.36 | 1.94E-02 |
mitochondrial DNA polymerase I protein D | HMUS00617400.1 | 0.57 | 9.16E-03 |
cyclin 6 | HMUS00719100.1 | 0.74 | 2.14E-02 |
cyclin 5 | HMUS00580100.1 | 0.85 | 9.34E-05 |
CRK9 | HMUS01274200.1 | 0.87 | 1.45E-03 |
DNA polymerase theta catalytic subunit | HMUS00097200.1 | 1.15 | 1.51E-07 |
mitochondrial DNA polymerase I protein C | HMUS00828800.1 | 1.25 | 7.27E-09 |
DNA polymerase kappa | HMUS01207400.1 | 1.36 | 4.93E-03 |
CRK11 | HMUS00452900.1 | 1.46 | 8.20E-04 |
CRK12 | HMUS00986000.1 | 2.07 | 6.68E-18 |
Given the apparent reduction in replication rate in H. muscarum cells at six hours after ingestion, the upregulation of nine tubulin genes (3 alpha- and 6 beta-tubulins) is likely to accommodate changes in cell morphology, rather than to produce new daughter cells. Tubulin upregulation is also observed in T. brucei isolated from the midgut and proventriculus of Glossina morsitans [70], though these cells are replicative; as such, the ‘motivation’ for increased tubulin gene expression may be different.
Differentiation and RNA-binding proteins
It is well documented that (human) disease-causing trypanosomatids have several life-cycle stages within their respective vectors. Coordinated differentiation between these discrete stages requires a suite of RNA-binding proteins (RBPs) which regulate parasite gene expression [38]. Despite the lack of observed differentiated forms in infections of D. melanogaster, several differentiation-associated RBPs are differentially regulated in the trypanosomatid after infection including RBP10 and hnRNP F/H. These proteins have been shown to regulate gene expression in T. brucei bloodstream forms [41, 79]. RNAi knockdown of RBP10 in bloodstream trypanosomes resulted in the downregulation of a large number of bloodstream form mRNAs [41]. The same study showed that overexpression of the protein in procyclics led to an increase of many bloodstream-form specific mRNAs, including genes involved in sugar transport. This is likely owing to the fact that blood is a glucose-rich environment and the cell will attempt to utilize this ready carbon source [80]. Three out of the four orthologues of TbRBP10 were strongly (> 4-fold) upregulated in H. muscarum cells after ingestion by D. melanogaster. During feeding experiments sucrose is added to the H. muscarum culture media to encourage the flies to feed. As such, these genes may be upregulated in response to the increased sugars available in the environment.
However, several other cell-cycle regulating RBPs associated with blood-stream form trypanosomes were also upregulated in H. muscarum after ingestion by the fly, including zinc-finger domain-containing RBPs ZC3H11 and ZC3H18. The former is essential in bloodstream-form trypanosomes and is involved in protection from heat shock, whilst depletion of ZC3H18 delayed blood stream form-to-procyclic differentiation in T. brucei [81, 82]. As such the situation may be more complex than solely metabolism-driven expression changes.
In addition to parallels with blood-stream form trypanosomes, transcripts for ALBA3/4 proteins (named for their ‘acetylation lowers binding affinity’ domain) were significantly downregulated in H. muscarum upon entry into the fly. In T. brucei, these proteins are expressed in all stages, except those found in the tsetse proventriculus. RNAi knockdown of these proteins in T. brucei axenic procyclics resulted in elongation of the cell body and repositioning of the nucleus and the kinetoplast to resemble the epimastigote cell-stage [83]. As such the reduction in ALBA3/4 transcripts suggests there may be parallels between trypanosomes during the latter stages of tsetse infection and H. muscarum during D. melanogaster infection.
Other differentially regulated RNA-binding proteins with as yet unclear roles in differentiation included: the essential gene expression regulation protein RBP42 and ZC3H12, a protein associated with differentiation [38].
Herpetomonas muscarum genes differentially regulated between six- and twelve-hours post-ingestion by Drosophila melanogaster
There were 204 genes which were differentially regulated between six and twelve hours post ingestion (p-adjusted < 0.05); 161 of these had a fold change of ≥ 2, with just 31 genes upregulated at the later timepoint (S22 Table). Hypothetical proteins lacking functional information dominated the highly upregulated genes. The most enriched transcript at 12 hours post ingestion encodes a putative surface protein, the top blastp hit for which was the Giardia variant-specific surface protein VSP136-4. This suggests VSP domain-containing proteins continue to be important throughout infection of the fly. Two DNA replication and repair associated transcripts were also upregulated at 12 hours post ingestion: an orthologue of T. brucei cell division cycle protein 45 (CDC45), and a tyrosyl-DNA phosphodiesterase-like protein. CDC45 is part of the CMG (Cdc45·Mcm2–7·GINS) complex which functions as a helicase during DNA replication [84] and may also play a role in DNA repair [85]. Furthermore, tyrosyl-DNA phosphodiesterases are involved in the repair of topoisomerase-related DNA damage [86]. These observations indicate that H. muscarum cells are under genotoxic stress after ingestion by D. melanogaster.
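The two thresholds used in this comparison (adjusted p-value < 0.05 and a fold change of at least 2) can be applied to a DESeq2-style results table roughly as in the sketch below. It is a minimal illustration, not the analysis script used in this study: the gene identifiers and values are invented, and the tuple layout simply mirrors DESeq2's `log2FoldChange` and `padj` columns.

```python
import math

# Hypothetical DESeq2-style rows: (gene_id, log2FoldChange, padj).
# Positive log2FoldChange means higher at the later timepoint.
# All identifiers and values are made up for illustration.
results = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.6, 0.020),
    ("geneC", 0.4, 0.010),   # significant, but below the 2-fold cut-off
    ("geneD", 3.1, 0.200),   # large change, but not significant
    ("geneE", -1.0, 0.049),  # exactly 2-fold down
]

def classify(rows, padj_max=0.05, min_fold=2.0):
    """Return (significant, up, down) gene ID lists for the given thresholds."""
    min_log2 = math.log2(min_fold)  # a 2-fold change is |log2FC| >= 1
    # Rows with a missing padj (None) are treated as not significant.
    sig_rows = [r for r in rows if r[2] is not None and r[2] < padj_max]
    significant = [gene for gene, _, _ in sig_rows]
    up = [gene for gene, lfc, _ in sig_rows if lfc >= min_log2]
    down = [gene for gene, lfc, _ in sig_rows if lfc <= -min_log2]
    return significant, up, down

significant, up, down = classify(results)
print(len(significant), "significant;", len(up), "up;", len(down), "down")
```

Applying the significance filter before the fold-change filter mirrors the order in which the counts are reported in the text (all significant genes, then those with a fold change of ≥ 2, then the upregulated subset).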
Herpetomonas muscarum genes differentially regulated between twelve- and eighteen-hours post-ingestion by Drosophila melanogaster
Among the 23 genes found to be upregulated at 18 hours post ingestion (compared with 12 hours; see S23 Table), genes involved in binding to damaged DNA (OG00033330) were significantly enriched. Only two of these transcripts could be assigned putative functions: eukaryotic replication factor A and a structure-specific endonuclease. This observation provides further evidence of genotoxic stress in H. muscarum after ingestion, as indicated by other upregulated DNA repair genes at 12 hours post-ingestion.
The most highly upregulated transcript at 18 hours post ingestion was an orthologue of the L. major UDP-galactose transporter LPG5B. This protein allows import of UDP-galactose into the Golgi body, where it is used to synthesize phosphoglycans. Capul et al. (2007) showed that, in L. major, loss of LPG5B resulted in cells with defects in proteophosphoglycans (PPGs) [87]. PPGs are known virulence factors and are found in membrane-bound, filamentous and secreted forms. The viscous secreted PPG is thought to protect L. major in the gut and may also force the fly to regurgitate the infective Leishmania cells into the bite wounds of vertebrates.
Herpetomonas muscarum genes differentially regulated between stationary phase in vitro culture and in-fly samples
Comparisons between stationary phase in vitro culture and in-fly samples revealed 5102 differentially expressed genes (adjusted p-value < 0.05). Approximately 55% of the genes differentially regulated between in vitro and in-fly samples were the same for the log phase vs in-fly and stationary phase vs in-fly comparisons (Fig 8). However, 1639 genes were only significantly differentially regulated in the stationary phase vs in-fly comparison (S24 Table). Genes differentially regulated between log phase in vitro culture and in-fly samples have already been discussed; here we outline the genes that were differentially regulated only when the transcriptomes of stationary phase in vitro samples of H. muscarum were compared with those after ingestion by D. melanogaster. Of the 1639 genes, 750 had a fold change of ≥ 2, approximately a third of which were upregulated in H. muscarum after ingestion by D. melanogaster.
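The overlap between two differential expression comparisons, as summarised above, amounts to simple set operations on the lists of significant genes. The sketch below is illustrative only: the gene identifiers are placeholders, and taking the union of both sets as the denominator for the shared percentage is an assumption, since the text does not state which denominator underlies the reported figure.

```python
# Hypothetical sets of significantly regulated genes from two comparisons.
# Identifiers are placeholders, not real H. muscarum gene IDs.
log_vs_fly = {"g1", "g2", "g3", "g4"}
stat_vs_fly = {"g3", "g4", "g5", "g6", "g7"}

def compare_sets(a, b):
    """Return genes shared by both sets, unique to each, and percent shared."""
    shared = a & b
    only_a = a - b
    only_b = b - a
    union = a | b
    # Assumed denominator: all genes significant in at least one comparison.
    pct_shared = 100.0 * len(shared) / len(union) if union else 0.0
    return shared, only_a, only_b, pct_shared

shared, only_log, only_stat, pct = compare_sets(log_vs_fly, stat_vs_fly)
print(sorted(shared), sorted(only_stat), round(pct, 1))
```

In this layout, `only_stat` corresponds to the class of genes discussed in this section: those significant only in the stationary phase vs in-fly comparison.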
Half of the top ten in-fly enriched transcripts were TATE (telomere associated mobile elements) DNA transposons, and among the most represented Pfam hits in the fly-enriched transcripts were reverse transcriptase (PF00078.27) and phage integrase (PF00589.22) domains (S25 Table). Though TATE DNA transposons comprise 1.32% of the L. major genome, very little is known about these transposable elements, other than that they contain a tyrosine recombinase [88]. It is possible that these transposable elements are more mobile in H. muscarum cells ingested by the fly. However, we predict that the overall level of transcription in cells from stationary phase cultures is reduced (vs. log phase, see above). As such, the comparative increase in TATE transposon transcription between stationary phase cells and H. muscarum from Drosophila may not be specifically a result of ingestion, but a reflection of general transcription levels in the two groups of cells.
As previously discussed, transcripts for several proteins containing a Giardia VSP domain are enriched in stationary phase compared to log phase in axenic culture. However, five were shown to be even more abundant in the H. muscarum cells ingested by D. melanogaster. Two other putative surface antigens were also enriched in ingested H. muscarum; these contained a domain similar to the Cytomegalovirus UL20A glycoprotein and the domain of unknown function DUF4148.
Transcripts encoding putative antioxidant proteins were significantly enriched in H. muscarum after ingestion by the fly. Enriched Pfam domains in the upregulated gene set included thioredoxin, glutathione S-transferase and alkyl hydroperoxide reductase (AhpC)/thiol specific antioxidant (TSA) domains. Our previous work showed that the D. melanogaster response to H. muscarum ingestion included the production of reactive oxygen species [20]. As such, the upregulation of these antioxidant proteins is likely an attempt to cope with this insect immune response.
Conclusion
Here we have described the genome and predicted proteome of the monoxenous trypanosomatid H. muscarum and characterised the transcriptome of the parasite both in culture and inside the gut of its natural host D. melanogaster. H. muscarum shows similarity in both genome structure and content to Leishmania, with significant synteny to L. major and sharing 80% of orthogroups with other members of the subfamily Leishmaniinae. While most Herpetomonas genes have orthologs in other trypanosomatids, a number of genes found elsewhere appear to have been lost in Herpetomonas, in particular genes associated with the specialised life stages of dixenous trypanosomatids. We might expect loss of some mammal-stage specific genes, such as HASPs, HERPs and sphingolipid synthesis genes important in metacyclic Leishmania cells, but more surprising might be the loss of genes expressed in insect stages such as BARPs and procyclins.
The transcriptome of Herpetomonas inside its insect host also showed strong parallels with the responses of Leishmania promastigotes inside the sand fly gut; in particular, both parasites showed significant upregulation of PSAs and GP63 (this study; see ref. [37]). These proteins have been shown to be associated with virulence in Leishmania and are important for establishment of parasite infection in the midgut, and so for transmission. The extensive changes in transcript abundance of genes likely to be expressed on the cell surface during insect infections, including a number of gene families not known to be important in dixenous trypanosomatids (e.g. those related to Giardia variant surface protein), imply that a dynamic cell surface may be a shared feature of trypanosomatid life cycles beyond dixenous groups [89], and that even more diversity of surface proteins may be present in the monoxenous trypanosomatids, supporting findings from free-living kinetoplastids. We also note that the majority of the genes showing changes in expression later in insect infections are hypothetical, including many hypothetical genes conserved with other trypanosomatids. This reflects similar findings in better-studied dixenous parasites [37, 70] and highlights how much we still have to learn about the interactions between trypanosomatids and their insect host.
There are few data on the percentage of wild sand flies with established Leishmania infection in endemic regions. In this context, the parallel to the more accessible Drosophila-Herpetomonas system is important, as the genetic component of the parasite that influences midgut establishment is easier to determine. However, more work is needed to ascertain whether genes upregulated in Leishmania and those in Herpetomonas are truly functionally related. A key limitation is the difference between the lifestyles of these insects. Most strikingly, female sand flies become infected with Leishmania during blood feeding, while Drosophila is never haematophagous. Nevertheless, sand flies are also plant feeders, so there is some overlap in the ecological niche as well as in their basic biology. The presence of trypanosomatids is another shared feature of the midgut landscape of these flies, and our data suggest that at least some aspects of the molecular interaction between flies and trypanosomatids may also be conserved.
Materials and methods
Herpetomonas muscarum culture
H. muscarum were cultured in supplemented BHI (3% brain heart infusion broth, 2.5mg/ml haemin, 1% FCS) and incubated at 28°C. For most experiments, cells were maintained in a log phase of growth by splitting every 3 days.
Infection of D. melanogaster (see reference 20)
For each independent infection of a group of 20–30 flies, 10^7 H. muscarum cells were harvested from a 3-day-old culture (which, in our experience, showed the highest infectivity rate) and resuspended in 500 μl 1% sucrose. The parasite solution was then transferred to a 21 mm Whatman Grade GF/C glass microfibre filter circle (Fisher Scientific). Circles containing the parasite cells were placed into a standard small Drosophila culture vial without any food. The flies used in the infections were 4–5 days old before they were starved overnight. After starvation, the flies were transferred to food vials that contained the Whatman circles with the parasite cells. After 6 h of feeding, flies were moved and reared on standard yeast/molasses medium. At different time points post oral infection, infected flies were collected for downstream experiments and frozen at -80°C for molecular analyses.
DNA extraction for genome sequencing
Genomic DNA was extracted from 100 million log phase H. muscarum cells grown in vitro, using the Norgen Biotek Genomic DNA extraction kit according to the manufacturer's instructions.
RNA extraction for RNA-seq
8 ml of H. muscarum promastigote culture at a density of 9.25 x 10^6 cells per ml (measured by haemocytometer) was diluted 1:40 in supplemented BHI and divided between 4 tissue culture flasks. The immediate post-dilution density was 6.5 x 10^5 cells per ml. The following day the cell density was measured to be 1.18 x 10^6 cells per ml. 45 ml was taken from each flask and the cells pelleted by centrifugation for 10 mins at 1000 x g. The supernatant was discarded and RNA was purified from the cell pellet using the Norgen Biotek RNA Purification kit according to the manufacturer's instructions. This process was repeated for 5.3 ml of the remaining culture three days later, when the cell density was 1.21 x 10^7 cells per ml. The resulting RNA was eluted at concentrations of 97–170 ng per μl with a 260/230 absorbance ratio of 1.86–2.19.
Reference genome
To produce the reference genome, Illumina and Pacific Biosciences sequencing platforms were used. For Illumina sequencing, 1 μg of genomic DNA was sheared into 300–500 base pair (bp) fragments by focused ultrasonication (Covaris Adaptive Focused Acoustics technology, AFA Inc., Woburn, USA). An amplification-free Illumina library was prepared [90] and 150 bp paired-end reads were generated on an Illumina MiSeq following the manufacturer’s standard sequencing protocols [91]. For the Pacific Biosciences SMRT technology, 8 μg of genomic DNA was sheared to 20–25 kb by passing through a 25mm blunt ended needle. A SMRT bell template library was generated using the Pacific Biosciences issued protocol (20 kb Template Preparation Using BluePippin(tm) Size-Selection System). After a greater than 7 kb size-selection using the BluePippin(tm) Size-Selection System (Sage Science, Beverly, MA), the library was sequenced using P6 polymerase and chemistry version 4 (P6C4) on 6 single-molecule real-time (SMRT) cells [92].
The Pacific Biosciences reads were assembled with HGAP3 [93], with the genome size parameter set to 25 Mb, to produce 285 contigs. The resulting assembly was then corrected with ICORN2 [94] for five iterations. Using the Argus Optical Mapping System from OpGen, an optical map was generated from high molecular weight genomic DNA captured in agarose plugs and the restriction enzymes KpnI and BamHI. The data was analysed with associated MapManager and MapSolver software tools (http://www.opgen.com/products-services/argus-system). The optical map consisted of 37–38 chromosomes with approximately half being contiguous. With the information obtained from the optical map and REAPR [95], manual genome improvement was performed on the PacBio assembly to produce a final genome assembly of 181 contigs. Analysis of the frequency distribution of k-mers was performed using GenomeScope version 1.0 [22], with the k-mer frequencies estimated using Jellyfish [96] with the default parameters suggested in the GenomeScope manual.
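To make the k-mer frequency distribution concrete, the toy sketch below counts k-mers in a handful of invented reads and then tabulates how many distinct k-mers occur once, twice, and so on. This is the kind of histogram that a dedicated counter produces at scale for genome profiling. It is a conceptual illustration only, not a substitute for Jellyfish or GenomeScope: the reads and the value of k are arbitrary, and the sketch does not merge a k-mer with its reverse complement, a step that dedicated tools can perform.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer in the reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts

def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())

# Invented example reads; a real analysis would stream reads from FASTQ files.
reads = ["ACGTACGT", "CGTACGTA", "TTTTNACG"]
counts = kmer_counts(reads, k=4)
spectrum = kmer_spectrum(counts)

for multiplicity in sorted(spectrum):
    print(multiplicity, spectrum[multiplicity])
```

In real sequencing data, the position of the main peak in this histogram reflects the sequencing depth, which is what allows genome size to be estimated without a reference.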
Transcriptomic libraries
Poly-A mRNA was purified from total RNA using oligo-dT magnetic beads and strand-specific indexed libraries were prepared using the KAPA Stranded RNA-Seq kit followed by ten cycles of amplification using KAPA HiFi DNA polymerase (KAPA Biosystems). Libraries were quantified and pooled based on a post-PCR Agilent Bioanalyzer and 75 bp paired-end reads were generated on the Illumina HiSeq v4 following the manufacturer's standard sequencing protocols (as above).
Data release
All sequencing data was submitted to the European Nucleotide Archive (ENA) under accession number ERP008869.
Genome annotation
CRAM output files containing RNA sequencing reads from both H. muscarum in vitro culture and infected D. melanogaster were converted to fastq format and then mapped to the genome sequence using the read alignment package HISAT2 version 2.1.0 [97]. The mapped reads from each sample were assembled into transcripts with the Cufflinks package version 2.2.1 [98] and merged to form a single transcript set for all reads. The Companion annotation tool [23] was then used to generate several genome annotation files based on the RNA sequencing transcriptomic evidence and pre-existing gene models from three other trypanosomatids: L. braziliensis, L. major and T. brucei (individual annotation statistics in S26 Table).
Orthofinder proteome analysis
The following proteomes were used as input to OrthoFinder: Trypanosoma brucei brucei 927 v5.1 [24], Trypanosoma brucei gambiense DAL972 v3 [99], Trypanosoma congolense IL3000 [100], Trypanosoma cruzi (CL Brener) [27], Trypanosoma evansi STIB805 [101], Trypanosoma grayi ANR4 v1 [102], Trypanosoma rangeli SC_58 v1 [103], Trypanosoma theileri Edinburgh [104], Trypanosoma vivax Y486 [105], Leishmania braziliensis M2903 [56], Leishmania donovani BPK282 v1 [105], Leishmania infantum JPCM5 [56], Leishmania major Friedlin v6 [106], Leptomonas pyrrhocoris ASM129339v1 [11], Leptomonas seymouri ASM129953v1 [107], Crithidia bombi [9], Crithidia expoeki [9], Crithidia fasciculata v14.0 [108], Angomonas deanei [8], Phytomonas EM1 [109] and Bodo saltans v3 [110]. Where possible, the above sequences were obtained from TriTrypDB v41 [111].
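Once the proteomes have been grouped into orthogroups, sharing between species can be summarised from a per-orthogroup gene-count table. The sketch below assumes a tab-separated table in the layout of OrthoFinder's `Orthogroups.GeneCount.tsv` (one row per orthogroup, one count column per proteome, then a total). The species column names and counts are invented, and the definition of "shared" used here (the focal species and at least one comparison species each contribute a gene) is one plausible choice rather than the definition used in this study.

```python
import csv
import io

# Toy table in the assumed gene-count layout; all names and numbers are invented.
toy_tsv = (
    "Orthogroup\tH_muscarum\tL_major\tT_brucei\tTotal\n"
    "OG0000000\t3\t2\t1\t6\n"
    "OG0000001\t1\t1\t0\t2\n"
    "OG0000002\t0\t4\t2\t6\n"
    "OG0000003\t2\t0\t0\t2\n"
    "OG0000004\t1\t0\t1\t2\n"
)

def shared_fraction(tsv_text, focal, others):
    """Fraction of the focal species' orthogroups also present in any of `others`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    focal_groups = 0
    shared = 0
    for row in reader:
        if int(row[focal]) == 0:
            continue  # orthogroup absent from the focal species
        focal_groups += 1
        if any(int(row[species]) > 0 for species in others):
            shared += 1
    return shared / focal_groups if focal_groups else 0.0

frac_lmajor = shared_fraction(toy_tsv, "H_muscarum", ["L_major"])
frac_any = shared_fraction(toy_tsv, "H_muscarum", ["L_major", "T_brucei"])
print(frac_lmajor, frac_any)
```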
RNAseq analysis in vitro culture
CRAM output files were converted to fastq format and then mapped to the concatenated D. melanogaster and H. muscarum genome sequences using the HISAT2 mapper [97]. Mapped reads were then counted using HTSeq-count (v. 0.10.0) [112] and differential expression was analysed using the DESeq2 package in R [113].
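The adjusted p-values quoted throughout the results come from a multiple-testing correction; DESeq2 reports Benjamini-Hochberg adjusted values by default. The sketch below shows that adjustment on a short list of invented p-values. It is a plain re-implementation for illustration and omits the independent filtering that DESeq2 can also apply before adjustment.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented raw p-values for four hypothetical genes.
pvals = [0.001, 0.04, 0.03, 0.5]
padj = bh_adjust(pvals)
print([round(p, 4) for p in padj])
```

Note that the two middle genes end up with the same adjusted value, because the adjustment enforces a monotonic relationship between raw and adjusted p-values.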
RNAseq analysis samples from whole flies
Total RNA from 8–10 flies at 6 h, 12 h and 18 h post H. muscarum oral infection was extracted with the total RNA purification kit from Norgen Biotek following the manufacturer’s instructions. Each time point was repeated in three independent experiments. cDNA libraries were prepared with the Illumina TruSeq RNA Sample Prep Kit v2. All sequencing was performed on the Illumina HiSeq 2000 platform using TruSeq v3 chemistry (Oxford Gene Technology, OGT). All sequencing was paired-end and performed over 100 cycles. Read files (fastq) were generated and then mapped to the concatenated D. melanogaster and H. muscarum genome sequences using the HISAT2 mapper [97]. Mapped reads were then counted using HTSeq-count (v. 0.10.0) [112] and differential expression was analysed using the DESeq2 package in R [113].
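Because reads from whole flies are mapped to a concatenated host and parasite reference, the resulting count table mixes fly and parasite genes and has to be separated before parasite expression is analysed. The sketch below shows one way this could be done on htseq-count style output (feature ID, tab, count, with summary rows prefixed by `__`). It assumes fly genes carry FlyBase `FBgn` identifiers and treats every other feature as parasite-derived. The `HMUS_` identifiers are hypothetical placeholders, and the approach is an illustration rather than the procedure used in this study.

```python
def split_counts(lines, fly_prefix="FBgn"):
    """Split htseq-count style lines into fly, parasite and summary counters."""
    fly, parasite, special = {}, {}, {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        feature, count = line.split("\t")
        count = int(count)
        if feature.startswith("__"):
            special[feature] = count      # e.g. __no_feature, __ambiguous
        elif feature.startswith(fly_prefix):
            fly[feature] = count          # assumed D. melanogaster gene
        else:
            parasite[feature] = count     # assumed H. muscarum gene
    return fly, parasite, special

# Invented example rows; HMUS_ IDs are placeholders, not real gene names.
toy = [
    "FBgn0000001\t120",
    "FBgn0000002\t0",
    "HMUS_000010\t45",
    "HMUS_000020\t7",
    "__no_feature\t300",
    "__ambiguous\t12",
]

fly, parasite, special = split_counts(toy)
parasite_total = sum(parasite.values())
fly_total = sum(fly.values())
print(parasite_total, fly_total, parasite_total / (parasite_total + fly_total))
```

The ratio printed at the end gives a rough sense of how small the parasite fraction of a whole-fly library can be, which matters when judging how much sequencing depth is available for parasite genes.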
Supporting information
Acknowledgments
We thank the staff of the DNA pipelines at Wellcome Sanger Institute for sequencing and generating sequencing libraries.
Data Availability
All sequencing data are available at the European Nucleotide Archive (ENA) under accession number ERP008869.
Funding Statement
KB, TDO, MJS and JAC were supported by Wellcome via their core support for the Wellcome Sanger Institute (WSI) through grant 206194. Work in Oxford was supported by a Consolidator grant from the European Research Council (310912 Droso-Parasite, to PL), project grant BB/K003569 from the Biotechnology and Biological Sciences Research Council (to PL) and a Wellcome Trust doctoral scholarship (to MAS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.Pacheco, Raquel S, Marzochi, Mauro CA, Pires, Marize Q, Brito, Célia MM, Madeira, de Fátima Maria, & Barbosa-Santos, Elizabeth GO. (1998). Parasite Genotypically Related to a Monoxenous Trypanosomatid of Dog's Flea Causing Opportunistic Infection in an HIV Positive Patient. Memórias do Instituto Oswaldo Cruz, 93(4), 531–537. 10.1590/s0074-02761998000400021
- 2.Zídková L., Cepicka I., Votypka J., Svobodová M. 2010. Herpetomonas trimorpha sp. nov. (Trypanosomatidae, Kinetoplastida), a parasite of the biting midge Culicoides truncorum (Ceratopogonidae, Diptera). International Journal of Systematic and Evolutionary Microbiology. 60(9): pp. 2236–2246.
- 3.Rowton E. D. and Barclay McGhee R. (1978) ‘Population Dynamics of Herpetomonas ampelophilae, with a Note on the Systematics of Herpetomonas from Drosophila spp.’, The Journal of Protozoology. John Wiley & Sons, Ltd (10.1111), 25(2), pp. 232–235. 10.1111/j.1550-7408.1978.tb04402.x
- 4.Lange C. E., and Lord J. (2012). “Protistan entomopathogens,” in Insect Pathology, 2nd Edn, eds Vega B. and Kaya H. (Amsterdam: Elsevier), 367–394. 10.1016/B978-0-12-384984-7.00010-5
- 5.Vega F. E. and Kaya H. K. 2012. Insect Pathology Second Edition Academic Press; Amsterdam (The Netherlands) and Boston (Massachusetts). Elsevier. ISBN: 978-0-12-384984-7.
- 6.Erwin T. L. 1983. Tropical forest canopies: the last biotic frontier. Bulletin of the Entomological Society of America, Volume 29: 14–19.
- 7.Hallmann CA, Sorg M, Jongejans E, Siepel H, Hofland N, et al. (2017) More than 75 percent decline over 27 years in total flying insect biomass in protected areas. PLOS ONE 12(10): e0185809 10.1371/journal.pone.0185809
- 8.Motta M. C. M., Martins A. C., de Souza S. S., Catta-Preta C. M., Silva R., Klein C. C., de Almeida L. G., de Lima Cunha O., Ciapina L. P., Brocchi M. 2013. Predicting the Proteins of Angomonas deanei, Strigomonas culicis and Their Respective Endosymbionts Reveals New Aspects of the Trypanosomatidae Family. PLoS ONE. 8(4): e60209 10.1371/journal.pone.0060209
- 9.Schmid-Hempel P. et al. (2018) ‘The genomes of Crithidia bombi and C. expoeki, common parasites of bumblebees’, PLoS ONE, 13(1). 10.1371/journal.pone.0189738
- 10.Runckel C., DeRisi J., Flenniken M. L. 2014. A draft genome of the honey bee trypanosomatid parasite Crithidia mellificae. PLoS ONE, 9(4).
- 11.Flegontov P., Butenko A., Firsov S., Kraeva N., Eliáš M., Field M. C., Filatov D., Flegontova O., Gerasimov E. S., Hlaváčová J. et al. , 2016. Genome of Leptomonas pyrrhocoris: A high-quality reference for monoxenous trypanosomatids and new insights into evolution of Leishmania. Scientific Reports, 6.
- 12.Teixeira S. M., de Paiva R. M., Kangussu-Marcolino M. M., Darocha W. D. 2012. Trypanosomatid comparative genomics: Contributions to the study of parasite biology and different parasitic diseases. Genetics and Molecular Biology, 35(1): pp. 1–17.
- 13.Zoltner M, Krienitz N, Field MC, Kramer S (2018) Comparative proteomics of the two T. brucei PABPs suggests that PABP2 controls bulk mRNA. PLOS Neglected Tropical Diseases 12(7): e0006679 10.1371/journal.pntd.0006679
- 14.Stuart K., Allen T. E., Heidmann S., Seiwert S. D. 1997. RNA editing in kinetoplastid protozoa. Microbiology and Molecular Biology Reviews, 61(1): pp. 105–120.
- 15.Liang X. Haritan, A., Uliel, S., Michaeli, S. 2003. trans and cis Splicing in Trypanosomatids: Mechanism, Factors, and Regulation. Eukaryotic Cell, 2(5): pp. 830–840. 10.1128/EC.2.5.830-840.2003
- 16.Chen J., Rauch C. A., White J. H., Englund P. T., Cozzarelli N. R. 1995. The topology of the kinetoplast DNA network. Cell, 80(1): pp. 61–69. 10.1016/0092-8674(95)90451-4
- 17.Lukeš J., Guilbride D. L., Votýpka J., Zíková A., Benne R., Englund P. T. 2002. Kinetoplast DNA Network: Evolution of an Improbable Structure. Eukaryotic Cell, 1(4): pp. 495–502. 10.1128/EC.1.4.495-502.2002
- 18.Borghesan T. C., Ferreira R. C., Takata C. S., Campaner M., Borda C. C., Paiva F., Milder R. V., Teixeira M. M., Camargo E. P. 2013. Molecular Phylogenetic Redefinition of Herpetomonas (Kinetoplastea, Trypanosomatidae), a Genus of Insect Parasites Associated with Flies. Protist. Urban & Fischer, 164(1): pp. 129–152.
- 19.Simpson L. and Thiemann O. H. 1995. Sense from nonsense: RNA editing in mitochondria of kinetoplastid protozoa and slime molds. Cell, 81: pp. 837–840. 10.1016/0092-8674(95)90003-9
- 20.Wang L., Sloan M., Ligoxygakis P. 2018. Intestinal NF-κB and STAT signalling is important for uptake and clearance in a Drosophila-Herpetomonas interaction model. PLoS Genet 15(3): e1007931 10.1371/journal.pgen.1007931
- 21.Van Den Abbeele J., & Rotureau B. (2013). New insights in the interactions between African trypanosomes and tsetse flies. Frontiers in cellular and infection microbiology, 3, 63 10.3389/fcimb.2013.00063
- 22.Vurture G. W., Sedlazeck F. J., Nattestad M., Underwood C. J., Fang H., Gurtowski J., Schatz M. C. 2017. GenomeScope: fast reference-free genome profiling from short reads, Bioinformatics, 33(14): pp. 2202–2204. 10.1093/bioinformatics/btx153
- 23.Steinbiss S., Silva-Franco F., Brunk B., Foth B., Hertz-Fowler C., Berriman M., Otto T. D. 2016. Companion: a web server for annotation and analysis of parasite genomes. Nucleic Acids Research, 44: pp. W29–W34. 10.1093/nar/gkw292
- 24.Berriman M., Ghedin E., Hertz-Fowler C., Blandin G., Renauld H., Bartholomeu D. C., Lennard N. J., Caler E., Hamlin N. E., Haas B., et al. 2005. The Genome of the African Trypanosome Trypanosoma brucei. Science 309(5733): 416 LP–422.
- 25.Thomas S., Green A., Sturm N. R., Campbell D. A., & Myler P. J. (2009). Histone acetylations mark origins of polycistronic transcription in Leishmania major. BMC genomics, 10, 152 10.1186/1471-2164-10-152
- 26.Daniels J. P., Gull K., & Wickstead B. 2010. Cell biology of the trypanosome genome. Microbiology and molecular biology reviews: MMBR, 74(4): pp. 552–569. 10.1128/MMBR.00024-10
- 27.El-Sayed N. M., Myler P. J., Blandin G., et al. (2005) ‘Comparative Genomics of Trypanosomatid Parasitic Protozoa’, Science, 309(5733), p. 404 LP-409. 10.1126/science.1112181
- 28.Yurchenko V., Kostygov A., Havlová J., Grybchuk-Ieremenko A., Ševčíková T., Lukeš J., Ševčík J., Votýpka J. 2016. Diversity of Trypanosomatids in Cockroaches and the Description of Herpetomonas tarakana sp. n.’, Journal of Eukaryotic Microbiology. 63(2): pp. 198–209. 10.1111/jeu.12268
- 29.Jackson A. P., Vaughan S. and Gull K. (2006) ‘Evolution of Tubulin Gene Arrays in Trypanosomatid parasites: genomic restructuring in Leishmania’, BMC Genomics. London: BioMed Central, 7, p. 261 10.1186/1471-2164-7-261
- 30.Emms D. and Kelly S. 2015. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology. 16(157).
- 31.Sutterwala S. S., Hsu F., Sevova E. S., Schwartz K. J., Zhang K., Key P., Turk J., Beverley S. M., Bangs J. D. 2008. Developmentally regulated sphingolipid synthesis in African trypanosomes. Molecular Microbiology, 70: pp. 281–296. 10.1111/j.1365-2958.2008.06393.x
- 32.Zhang K., Showalter M., Revollo J., Hsu F. F., Turk J., Beverley S. M. 2003. Sphingolipids are essential for differentiation but not growth in Leishmania. EMBO J., 22: pp. 6016–6026. 10.1093/emboj/cdg584
- 33.Vanlerberghe G. C. and McIntosh L. (1997) ‘Alternative Oxidase: From Gene to Function’, Annual Review of Plant Physiology and Plant Molecular Biology. Annual Reviews, 48(1), pp. 703–734. 10.1146/annurev.arplant.48.1.703
- 34.Jackson A. P. 2007. Origins of amino acid transporter loci in trypanosomatid parasites. BMC evolutionary biology, 7, 26 10.1186/1471-2148-7-26
- 35.Shaked‐Mishan P., Suter‐Grotemeyer M., Yoel‐Almagor T., Holland N., Zilberstein D. and Rentsch D. 2006. A novel high‐affinity arginine transporter from the human parasitic protozoan Leishmania donovani. Molecular Microbiology, 60: pp. 30–38. 10.1111/j.1365-2958.2006.05060.x
- 36.Martin J. L. et al. (2014) ‘Metabolic reprogramming during purine stress in the protozoan pathogen Leishmania donovani’, PLoS pathogens. Public Library of Science, 10(2), p. e1003938 10.1371/journal.ppat.1003938
- 37.Inbar E., Hughitt V. K., Dillon L. A. L., Ghosh K., El-Sayed N. M. and Sacks D. L. 2017. The Transcriptome of Leishmania major Developmental Stages in Their Natural Sand Fly Vector, mBio, 8(2). 10.1128/mBio.00029-17
- 38.Kolev N. G., Ullu E. and Tschudi C. (2014) The emerging role of RNA-binding proteins in the life cycle of Trypanosoma brucei. Cellular microbiology 16(4): 482–489. 10.1111/cmi.12268
- 39.Naguleswaran A., Gunasekera K., Schimanski B., Heller M., Hemphill A., Ochsenreiter T. and Roditi I. (2015) Trypanosoma brucei RRM1 Is a Nuclear RNA-Binding Protein and Modulator of Chromatin Structure. mBio 6(2): e00114–15. 10.1128/mBio.00114-15
- 40.Wippel H. H., Malgarin J. S., Martins S. de T., Vidal N. M., Marcon B. H., Miot H. T., Marchini F. K., Goldenberg S. and Alves. (2019) The Nuclear RNA-binding Protein RBSR1 Interactome in Trypanosoma cruzi. Journal of Eukaryotic Microbiology. John Wiley & Sons, Ltd (10.1111), 66(2): 244–253.
- 41.Wurst M., Seliger B., Jha B. A., Klein C., Queiroz R., Clayton C. Expression of the RNA recognition motif protein RBP10 promotes a bloodstream-form transcript pattern in Trypanosoma brucei. Mol Microbiol. 2012; 83:1048–1063. 10.1111/j.1365-2958.2012.07988.x
- 42.Jones N. G. et al. (2014) ‘Regulators of Trypanosoma brucei cell cycle progression and differentiation identified using a kinome-wide RNAi screen’, PLoS pathogens. Public Library of Science, 10(1), pp. e1003886–e1003886. 10.1371/journal.ppat.1003886
- 43.Acosta-Serrano A. et al. (2001) ‘The surface coat of procyclic Trypanosoma brucei: Programmed expression and proteolytic cleavage of procyclin in the tsetse fly’, Proceedings of the National Academy of Sciences. National Academy of Sciences, 98(4), pp. 1513–1518. 10.1073/pnas.041611698
- 44.Haines L. R. et al. (2010) ‘Tsetse EP protein protects the fly midgut from trypanosome establishment’, PLoS pathogens. Public Library of Science, 6(3), pp. e1000793–e1000793. 10.1371/journal.ppat.1000793
- 45.Pimenta P. F. et al. (1992) ‘Stage-specific adhesion of Leishmania promastigotes to the sandfly midgut’, Science, 256(5065), pp. 1812 LP–1815. 10.1126/science.1615326
- 46.Kamhawi S. et al. (2004) ‘A role for insect galectins in parasite survival.’, Cell. United States, 119(3), pp. 329–341.
- 47.Urwyler S., Studler E., Renggli C. K., Roditi I. 2007. A family of stage-specific alanine-rich proteins on the surface of epimastigote forms of Trypanosoma brucei. Mol Microbiol., 63: pp. 218–228 10.1111/j.1365-2958.2006.05492.x
- 48.Fragoso C. M., Schumann Burkard G., Oberle M., Renggli C. K., Hilzinger K., Roditi I. 2009. PSSA-2, a Membrane-Spanning Phosphoprotein of Trypanosoma brucei, Is Required for Efficient Maturation of Infection. PLoS ONE, 4(9):e7074 10.1371/journal.pone.0007074
- 49.Casas-Sánchez A. and Acosta-Serrano Á. (2016) Skin deep. eLife. eLife Sciences Publications, Ltd 5: e21506 10.7554/eLife.21506
- 50.Pereira F., Santos-Mallet J. R., Branquinha M. H., d'Avila-Levy C. M., Santos A. L. 2010. Influence of leishmanolysin-like molecules of Herpetomonas samuelpessoai on the interaction with macrophages. Microbes and infection, 10.1016/j.micinf.2010.07.010
- 51.D’Archivio S. and Wickstead B. (2017) ‘Trypanosome outer kinetochore proteins suggest conservation of chromosome segregation machinery across eukaryotes’, The Journal of Cell Biology, 216(2), p. 379 LP–391. 10.1083/jcb.201608043
- 52.Robinson K.A. and Beverley S.M., 2003. Improvements in transfection efficiency and tests of RNA interference (RNAi) approaches in the protozoan parasite Leishmania. Molecular and biochemical parasitology, 128(2), pp.217–228. 10.1016/s0166-6851(03)00079-3
- 53.DaRocha W.D., Otsu K. and Teixeira S.M., 2004. Tests of cytoplasmic RNA interference (RNAi) and construction of a tetracycline-inducible T7 promoter system in Trypanosoma cruzi. Mol Biochem Parasitol, 133(2): pp.175–186 10.1016/j.molbiopara.2003.10.005
- 54.Beverley S.M., 2003. Protozomics: trypanosomatid parasite genetics comes of age. Nature Reviews Genetics, 4(1): pp.11 10.1038/nrg980 [DOI] [PubMed] [Google Scholar]
- 55.Lye L. F., Owens K., Shi H., Murta S. M. F., Vieira A. C., Turco S. J., Tschudi C., Ullu E., Beverley. 2010. Retention and Loss of RNA Interference Pathways in Trypanosomatid Protozoans. PLOS Pathogens, 6(10): e1001161 10.1371/journal.ppat.1001161 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Peacock C. S., Seeger K., Harris D., Murphy L., Ruiz J. C., Quail M. A., Peters N., Adlem E., Tivey A., Aslett M., Kerhornou A., et al. 2007. Comparative genomic analysis of three Leishmania species that cause diverse human disease. Nat. Genet. 39(7): pp. 839–47. 10.1038/ng2053 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Vingron M. et al. (2007) ‘Improved detection of overrepresentation of Gene-Ontology annotations with parent–child analysis’, Bioinformatics, 23(22), pp. 3024–3031. 10.1093/bioinformatics/btm440 [DOI] [PubMed] [Google Scholar]
- 58.Tu X. and Wang C. C. (2004) ‘The Involvement of Two cdc2-related Kinases (CRKs) in Trypanosoma brucei Cell Cycle Regulation and the Distinctive Stage-specific Phenotypes Caused by CRK3 Depletion’, Journal of Biological Chemistry, 279(19), pp. 20519–20528. 10.1074/jbc.M312862200 [DOI] [PubMed] [Google Scholar]
- 59.Hammarton T. C., Engstler M. and Mottram J. C. (2004) ‘The Trypanosoma brucei Cyclin, CYC2, Is Required for Cell Cycle Progression through G1 Phase and for Maintenance of Procyclic Form Cell Morphology’, Journal of Biological Chemistry, 279(23), pp. 24757–24764. 10.1074/jbc.M401276200 [DOI] [PubMed] [Google Scholar]
- 60.Liu Y., Hu H. and Li Z. (2013) ‘The cooperative roles of PHO80-like cyclins in regulating the G1/S transition and posterior cytoskeletal morphogenesis in Trypanosoma brucei’, Molecular Microbiology. John Wiley & Sons, Ltd (10.1111), 90(1), pp. 130–146. 10.1111/mmi.12352 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Monnerat S. et al. (2013) ‘Identification and Functional Characterisation of CRK12:CYC9, a Novel Cyclin-Dependent Kinase (CDK)-Cyclin Complex in Trypanosoma brucei’, PloS one. Public Library of Science, 8(6), pp. e67327–e67327. 10.1371/journal.pone.0067327 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Lee S. H., Stephens J. L., Paul K. S., Englund P. T. 2006. Fatty Acid Synthesis by Elongases in Trypanosomes. Cell, 126(4): pp. 691–699. 10.1016/j.cell.2006.06.045 [DOI] [PubMed] [Google Scholar]
- 63.Svärd S. G. et al. (1998) ‘Differentiation-associated surface antigen variation in the ancient eukaryote Giardia lamblia’, Molecular Microbiology. John Wiley & Sons, Ltd (10.1111), 30(5), pp. 979–989. 10.1046/j.1365-2958.1998.01125.x [DOI] [PubMed] [Google Scholar]
- 64.Adam R. D. (2001) ‘Biology of Giardia lamblia’, Clinical Microbiology Reviews, 14(3), p. 447 LP–475. 10.1128/CMR.14.3.447-475.2001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Nash T. E. et al. (1988) ‘Antigenic variation in Giardia lamblia.’, The Journal of Immunology, 141(2), p. 636 LP–641. [PubMed] [Google Scholar]
- 66.Aitcheson N. et al. (2005) ‘VSG switching in Trypanosoma brucei: antigenic variation analysed using RNAi in the absence of immune selection’, Molecular microbiology, 57(6), pp. 1608–1622. 10.1111/j.1365-2958.2005.04795.x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Käll L., Krogh A., Sonnhammer E. L. L. 2004. A Combined Transmembrane Topology and Signal Peptide Prediction Method. Journal of Molecular Biology, 338(5): pp. 1027–1036. 10.1016/j.jmb.2004.03.016 [DOI] [PubMed] [Google Scholar]
- 68.Van Damme E. and Van Loock M. (2014) ‘Functional annotation of human cytomegalovirus gene products: an update’, Frontiers in microbiology. Frontiers Media S.A., 5, p. 218 10.3389/fmicb.2014.00218 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Matthews K. R. (2005) ‘The developmental cell biology of Trypanosoma brucei’, Journal of cell science, 118(Pt 2), pp. 283–290. 10.1242/jcs.01649 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Savage A. F. et al. (2016) ‘Transcriptome Profiling of Trypanosoma brucei Development in the Tsetse Fly Vector Glossina morsitans’, PloS one. Public Library of Science, 11(12), pp. e0168877–e0168877. 10.1371/journal.pone.0168877 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Krogh A., Larsson B., von Heijne G., and Sonnhammer E. L. L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology, 305(3): pp. 567–580. 10.1006/jmbi.2000.4315 [DOI] [PubMed] [Google Scholar]
- 72.de Paiva RMC, Grazielle-Silva V, Cardoso MS, Nakagaki BN, Mendonça-Neto RP, et al. (2015) Amastin Knockdown in Leishmania braziliensis Affects Parasite-Macrophage Interaction and Results in Impaired Viability of Intracellular Amastigotes. PLOS Pathogens 11(12): e1005296 10.1371/journal.ppat.1005296 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Kangussu-Marcolino M. M., de Paiva R. M. C., Araújo P. R., de Mendonça-Neto R. P., Lemos L., Bartholomeu D. C., Mortara R. A., da Rocha W. D., Teixeira S. M. R. 2013. Distinct genomic organization, mRNA expression and cellular localization of members of two amastin sub-families present in Trypanosoma cruzi. BMC Microbiology, 13(1): pp. 10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Fankhauser N. and Mäser P. (2005) Identification of GPI anchor attachment signals by a Kohonen self-organizing map. Bioinformatics. 21(9), pp. 1846–1852. 10.1093/bioinformatics/bti299 [DOI] [PubMed] [Google Scholar]
- 75.Darlyuk I. et al. (2009) ‘Arginine Homeostasis and Transport in the Human Pathogen Leishmania donovani’, Journal of Biological Chemistry, 284(30), pp. 19800–19807. 10.1074/jbc.M901066200 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Marchese L. et al. (2018) ‘The Uptake and Metabolism of Amino Acids, and Their Unique Role in the Biology of Pathogenic Trypanosomatids’, Pathogens. 10.3390/pathogens7020036 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Cunningham M. L. and Beverley S. M. 2001. ‘Pteridine salvage throughout the Leishmania infectious cycle: implications for antifolate chemotherapy. Molecular and Biochemical Parasitology. Elsevier, 113(2): pp. 199–213. 10.1016/s0166-6851(01)00213-4 [DOI] [PubMed] [Google Scholar]
- 78.Gourguechon S. and Wang C. C. (2009) ‘CRK9 contributes to regulation of mitosis and cytokinesis in the procyclic form of Trypanosoma brucei’, BMC cell biology. BioMed Central, 10, p. 68 10.1186/1471-2121-10-68 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Gupta S. K., Kosti I., Plaut G., Pivko A., Tkacz I. D., Cohen-Chalamish S., et al. (2013) The hnRNP F/H homologue of Trypanosoma brucei is differentially expressed in the two life cycle stages of the parasite and regulates splicing and mRNA stability. Nucleic Acids Res. 41:6577–6594 10.1093/nar/gkt369 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Smith T. K. et al. (2017) ‘Metabolic reprogramming during the Trypanosoma brucei lifecycle’, F1000Research. F1000Research, 6, p. F1000 Faculty Rev-683. 10.12688/f1000research.10342.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Benz C., Mulindwa J., Ouna B., Clayton C. (2011) The Trypanosoma brucei zinc finger protein ZC3H18 is involved in differentiation. Mol Biochem Parasitol. 177:148–151. 10.1016/j.molbiopara.2011.02.007 [DOI] [PubMed] [Google Scholar]
- 82.Droll D., Minia I., Fadda A., Singh A., Stewart M., Queiroz R., Clayton C. (2013) Post-transcriptional regulation of the trypanosome heat shock response by a zinc finger protein. PLoS Pathog. 9:e1003286 10.1371/journal.ppat.1003286 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Subota I., Rotureau B., Blisnick T., Ngwabyt S., Durand-Dubief M., Engstler M., Bastin P. (2011) ALBA proteins are stage regulated during trypanosome development in the tsetse fly and participate in differentiation. Mol Biol Cell. 22:4205–4219. 10.1091/mbc.E11-06-0511 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Dang H. Q. and Li Z. (2011) ‘The Cdc45·Mcm2-7·GINS protein complex in trypanosomes regulates DNA replication and interacts with two Orc1-like proteins in the origin recognition complex’, The Journal of biological chemistry. 2011/07/28. American Society for Biochemistry and Molecular Biology, 286(37), pp. 32424–32435. 10.1074/jbc.M111.240143 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.DeBrot A., Lancaster C. and Bjornsti M.A. (2016) ‘Function of Cdc45 in DNA Replication and in Response to Genotoxic Stress’, The FASEB Journal. Federation of American Societies for Experimental Biology, 30(1_supplement), p. 7982 798.2. 10.1096/fj.15-275990 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Kawale A. S. and Povirk L. F. (2018) ‘Tyrosyl-DNA phosphodiesterases: rescuing the genome from the risks of relaxation’, Nucleic acids research. 2017/12/04. Oxford University Press, 46(2), pp. 520–537. 10.1093/nar/gkx1219 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Capul A. A. et al. (2007) ‘Two Functionally Divergent UDP-Gal Nucleotide Sugar Transporters Participate in Phosphoglycan Synthesis in Leishmania major’, Journal of Biological Chemistry, 282(19), pp. 14006–14017. 10.1074/jbc.M610869200 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Pita S. et al. (2019) ‘The Tritryps Comparative Repeatome: Insights on Repetitive Element Evolution in Trypanosomatid Pathogens’, Genome biology and evolution. Oxford University Press, 11(2), pp. 546–551. 10.1093/gbe/evz017 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Jackson A. P. 2016. Gene family phylogeny and the evolution of parasite cell surfaces. Molecular and Biochemical Parasitology, 209(1): pp. 64–75. [DOI] [PubMed] [Google Scholar]
- 90.Kozarewa I., Ning Z., Quail M. A., Sanders M. J., Berriman M., Turner D. J. 2009. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nat. Methods, 6: pp. 291–295. 10.1038/nmeth.1311 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Iraad F., Bronner M. A., Quail D. J., Turner H. S. 2014. Improved Protocols for Illumina Sequencing. Curr. Protoc. Hum. Genet., 80:18(2): pp. 1–42. [DOI] [PubMed] [Google Scholar]
- 92.Eid J. L., Fehr A., Gray J., Luong K., Lyle J., Otto G., Peluso P., Rank D., Baybayan P., Bettman B., Bibillo A., et al. 2009. Real-time DNA sequencing from single polymerase molecules. Science, 323: pp. 133–138 10.1126/science.1162986 [DOI] [PubMed] [Google Scholar]
- 93.Chin C. S., Alexander D. H., Marks P., Klammer A. A., Drake J., Heiner C., Clum A., Copeland A., Huddleston J., Eichler E. E., Turner S. W., Korlach J. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods 10:563–569 10.1038/nmeth.2474 [DOI] [PubMed] [Google Scholar]
- 94.Otto T. D., Sanders M., Berriman M., Newbold C. 2010. Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology. Bioinforma. Oxf. Engl. 26: pp. 1704–1707. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Hunt M., Kikuchi T., Sanders M., Newbold C., Berriman M., Otto T. D. 2013. REAPR: a universal tool for genome assembly evaluation. Genome Biol, 14: R47 10.1186/gb-2013-14-5-r47 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Marçais G. and Kingsford C. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6): pp. 764–770. 10.1093/bioinformatics/btr011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Kim D., Langmead B. and Salzberg S. L. 2015. HISAT: a fast spliced aligner with low memory requirements. Nature Methods, 12, pp. 357 10.1038/nmeth.3317 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D. R., Pimentel H., Salzberg S. L., Rinn J. L., Pachter L. 2012. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 7: pp. 562 10.1038/nprot.2012.016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Jackson A. P., Sanders M., Berry A., McQuillan J., Aslett M. A., Quail M. A., Chukualim B., Capewell P., MacLeod A., Melville S. E., Gibson W., Barry J. D., Berriman M., Hertz-Fowler C. 2010. The genome sequence of Trypanosoma brucei gambiense, causative agent of chronic human African trypanosomiasis. PLoS Negl. Trop Dis, 13(4):e658. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Jackson A. P., Berry A., Aslett M., Allison H. C., Burton P., Vavrova-Anderson J., Brown R., Browne H., Corton N., Hauser H. et al. 2012. Antigenic diversity is generated by distinct evolutionary mechanisms in African trypanosome species. PNAS, 109: pp. 3416–3421. 10.1073/pnas.1117313109 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Carnes J., Anupama A., Balmer O., Jackson A., Lewis M., Brown R., Cestari I., Desquesnes M., Gendrin C., Hertz-Fowler C. 2015. Genome and Phylogenetic Analyses of Trypanosoma evansi Reveal Extensive Similarity to T. brucei and Multiple Independent Origins for Dyskinetoplasty. PLOS Neglected Tropical Diseases. Public Library of Science 9(1): e3404 10.1371/journal.pntd.0003404 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Kelly S., Ivens A., Manna P. T., Gibson W., Field M. C. 2014. A draft genome for the African crocodilian trypanosome Trypanosoma grayi. Sci Data. 5(1):140024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Stoco P. H., Wagner G., Talavera-Lopez C., Gerber A., Zaha A., et al. 2014. Genome of the Avirulent Human-Infective Trypanosome—Trypanosoma rangeli. PLOS Neglected Tropical Diseases, 8(9): e3176 10.1371/journal.pntd.0003176 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Kelly S., Ivens A., Mott G. A., O'Neill E., Emms D., Macleod O., Voorheis P., Tyler K., Clark M., Matthews J., Matthews K., Carrington M. 2017. An Alternative Strategy for Trypanosome Survival in the Mammalian Bloodstream Revealed through Genome and Transcriptome Analysis of the Ubiquitous Bovine Parasite Trypanosoma (Megatrypanum) theileri. Genome Biol Evol., 9(8): pp. 2093–2109. 10.1093/gbe/evx152 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Downing T., Imamura H., Decuypere S., Clark T. G., Coombs G. H., Cotton J. A., Hilley J. D., de Doncker S., Maes I., Mottram J. C., et al. 2011. Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance. Genome Res, 21(12): pp. 2143–56. 10.1101/gr.123430.111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Ivens A. C., Peacock C. S., Worthey E. A., Murphy L., Aggarwal G., Berriman M., Sisk E., Rajandream M. A., Adlem E., Aert R., et al. 2005. The genome of the kinetoplastid parasite, Leishmania major. Science, 309(5733): pp. 436–42. 10.1126/science.1112680 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Kraeva N., Butenko A., Hlaváčová J., Kostygov A., Myškova J., Grybchuk D., Leštinová T., Votýpka J., Volf P., Opperdoes F., Flegontov P., Lukeš J., Yurchenko V. 2015. Leptomonas seymouri: Adaptations to the Dixenous Life Cycle Analyzed by Genome Sequencing, Transcriptome Profiling and Co-infection with Leishmania donovani. PLoS Pathog., 11(8):e1005127 10.1371/journal.ppat.1005127 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Runckel C., DeRisi J., & Flenniken M. L. (2014). A draft genome of the honey bee trypanosomatid parasite Crithidia mellificae. PloS one, 9(4), e95057 10.1371/journal.pone.0095057 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Porcel B. M., Denoeud F., Opperdoes F., Noel B., Madoui M-A., Hammarton T. C., Field M. C., Da Silva, Couloux A., Poulain J., et al. 2014. The streamlined genome of Phytomonas spp. relative to human pathogenic kinetoplastids reveals a parasite tailored for plants. PLoS genetics. e1004007 10.1371/journal.pgen.1004007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Jackson A. P., Quail M. A., Berriman M. 2008. Insights into the genome sequence of a free-living Kinetoplastid: Bodo saltans (Kinetoplastida: Euglenozoa). BMC Genomics. 9(9):594. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Aslett M., Aurrecoechea C., Berriman M., Brestelli J., Brunk B. P., Carrington M., Depledge D. P., Fischer S., Garjria B., Gao X., et al. , 2010. TriTrypDB: a functional genomic resource for the Trypanosomatidae. Nucleic Acids Research. 38(37): D457–D462 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Anders S., Pyl P. T. and Huber W. 2015. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics, 31(2): pp. 166–169. 10.1093/bioinformatics/btu638 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Love M. I., Huber W., Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15: pp. 550 10.1186/s13059-014-0550-8 [DOI] [PMC free article] [PubMed] [Google Scholar]