eLife. 2016 Jan 26;5:e11473. doi: 10.7554/eLife.11473

Population genomics reveals the origin and asexual evolution of human infective trypanosomes

William Weir 1, Paul Capewell 1, Bernardo Foth 2, Caroline Clucas 1, Andrew Pountain 1, Pieter Steketee 1, Nicola Veitch 3, Mathurin Koffi 4, Thierry De Meeûs 5,6, Jacques Kaboré 6,7, Mamadou Camara 8, Anneli Cooper 1, Andy Tait 1, Vincent Jamonneau 5,6, Bruno Bucheton 5,8, Matt Berriman 2, Annette MacLeod 1,*
Editor: Dominique Soldati-Favre 9
PMCID: PMC4739771  PMID: 26809473

Abstract

Evolutionary theory predicts that the lack of recombination and chromosomal re-assortment in strictly asexual organisms results in homologous chromosomes irreversibly accumulating mutations and thus evolving independently of each other, a phenomenon termed the Meselson effect. We apply a population genomics approach to examine this effect in an important human pathogen, Trypanosoma brucei gambiense. We determine that T.b. gambiense is evolving strictly asexually and is derived from a single progenitor, which emerged within the last 10,000 years. We demonstrate the Meselson effect for the first time at the genome-wide level in any organism and show large regions of loss of heterozygosity, which we hypothesise to be a short-term compensatory mechanism for counteracting deleterious mutations. Our study sheds new light on the genomic and evolutionary consequences of strict asexuality, which this pathogen uses as it exploits a new biological niche, the human population.

DOI: http://dx.doi.org/10.7554/eLife.11473.001

Research Organism: Other

eLife digest

An organism’s genetic ‘blueprint’ is encoded in DNA packaged within structures called chromosomes. Most organisms have two copies of each chromosome and, through sexual reproduction, the DNA within a pair of chromosomes can recombine randomly in a process that could be likened to shuffling a deck of cards. This process generates genetic diversity and means that any undesirable combinations of genes or mutations can be eliminated from the population by natural selection. While these activities are thought to promote the long-term survival of a species, some organisms appear not to have sex at all.

Evolutionary theory predicts that ‘asexual’ organisms should face extinction in the long-term and that a lack of sexual recombination should leave a characteristic genetic ‘signature’ in their DNA. The theory also predicts that pairs of chromosomes will evolve independently, a phenomenon that is termed the ‘Meselson effect’. However, while it was first predicted almost twenty years ago, evidence for this signature has been elusive.

Now, Weir et al. have used the asexual parasite (Trypanosoma brucei gambiense), which causes African sleeping sickness in humans, to search for signs of the Meselson effect. Sequencing the whole genome of a large number of parasites revealed that the population of this parasite arose from a single individual within the last 10,000 years. Over this time, mutations have built up and the lack of sexual recombination means that the two chromosomes in each pair have evolved independently of the other. These results provide the first demonstration of the Meselson effect at a genome-wide level in any organism.

Weir et al. also uncovered evidence that this parasite uses a mechanism called “gene conversion” to compensate for its lack of sex. This mechanism essentially repairs the inferior, or mutated, copy of a gene on a chromosome by ‘copying and pasting’ the superior copy from the chromosome’s partner. The findings also suggest that gene conversion can only go some way to compensating for a lack of sex. A future challenge will be to investigate how effective this mechanism can be in the long term and to predict whether the parasite will ultimately become extinct.

DOI: http://dx.doi.org/10.7554/eLife.11473.002

Introduction

Obligate asexual reproduction has been argued to carry considerable negative evolutionary consequences (Maynard Smith, 1986) hence most clonal species undergo some degree of recombination, albeit infrequent, which generates novel genotypes (Heitman, 2006). Here we investigate the reproductive strategy of a putatively asexual yet successful human pathogen, Trypanosoma brucei gambiense Group 1, the main causative agent of human African trypanosomiasis, which contrasts with closely-related, sexually reproducing sub-species. T.b. gambiense has been divided into two groups. T.b. gambiense Group 1, found in West and Central Africa, is the main human-infective sub-species, causing >97% of all human cases of trypanosomiasis (Simarro et al., 2010). T.b. gambiense Group 2 was detected in the 1980/90s in Côte d’Ivoire and Burkina Faso but may now be extinct (Capewell et al., 2013). A third human infective sub-species, T.b. rhodesiense, is found in East Africa and causes <3% of human cases (Simarro et al., 2010). Each of these human infective sub-species appears to have arisen independently from the non-human infective T. brucei and possesses a different mechanism of human infectivity (Capewell et al., 2013; Capewell et al., 2011; Uzureau et al., 2013; Van Xong et al., 1998). All sub-species of T. brucei, with the important exception of T.b. gambiense Group 1, show evidence for mating in natural populations (Capewell et al., 2013; Duffy et al., 2013; Gibson and Stevens, 1999) and have the ability to undergo sexual reproduction with Mendelian allelic segregation and independent assortment of unlinked markers (Cooper et al., 2008; MacLeod et al., 2005). In addition, haploid gametes have been observed in T.b. brucei (Peacock et al., 2014) and although meiosis genes appear to be expressed in T.b. gambiense Group 1 (Peacock et al., 2014), no haploid gametes have ever been observed in these parasites (Peacock et al., 2014). 
This is consistent with clonality in all Group 1 populations analysed (Koffi et al., 2009; Morrison et al., 2008; Tibayrenc and Ayala, 2012), however, these studies were based on limited sets of genetic markers, which lack the necessary discriminatory power to distinguish between predominantly clonal evolution, with occasional bouts of genetic exchange, and strictly asexual propagation. Genomic-level analyses of T. brucei diversity to date have concentrated on T.b. brucei and T.b. rhodesiense and for T.b. gambiense Group 1, include only the genome reference strain (DAL972) (Goodhead et al., 2013) or two (Sistrom et al., 2014) field isolates. We hereby present a population-level genomic analysis as a means to determine whether this species is truly asexual, when the switch to asexuality arose and to provide insights into the genomic consequences of asexual evolution, including possible compensating strategies for eliminating deleterious mutations.

Results

The genomes of 75 isolates of T.b. gambiense Group 1 (Supplementary file 1) were sequenced, including multiple samples from geographically separated disease foci within Guinea (n=37), Côte d’Ivoire (n=36) and Cameroon (n=2) collected over fifty years (1952–2004). For comparative purposes, isolates of T.b. rhodesiense (n=4), T.b. gambiense Group 2 (n=4) and T.b. brucei (n=2) were also sequenced. A total of 230,891 single nucleotide polymorphisms (SNPs) were identified compared to the haploid consensus assembly of the T.b. brucei reference genome (Berriman et al., 2005). These were evenly distributed over the eleven major chromosomes, covering 85% of the genome (Figure 1—figure supplement 1). T.b. gambiense Group 1 showed a 5–10-fold lower number of SNPs (11,398) and SNP density compared to the other groups (Figure 1—source data 1), despite an over-representation in terms of the number of samples. Phylogenetic network analysis revealed that T.b. gambiense Group 1 genotypes showed an extremely low level of intra-group diversity (e.g. the two most distantly related isolates differed only at 435 SNP loci) and formed a monophyletic group (Figure 1A). The network features reticulation among non-Group 1 parasites indicating the presence of recombinant genotypes; this stands in contrast to Group 1 parasites and is consistent with an absence, or rarity, of recombination in this group. Network analysis of T.b. gambiense Group 1 revealed the population is geographically sub-structured (Figure 1B). A significant deviation from Hardy-Weinberg Equilibrium (HWE) was observed at 97.4% of SNP loci (P<10⁻¹⁷ at each locus) and this was found to be associated with every sampled genotype being heterozygous at these loci (Figure 1—source data 2). To control for geographical and temporal population sub-structure, isolates from three sub-populations were analysed and HWE deviation and heterozygote excess were confirmed (Figure 1—source data 2).
FIS was calculated for each SNP locus, giving a uni-modal distribution with a median of -1 (Figure 1—figure supplement 2 and Figure 1—source data 3), as would be predicted for a strictly asexual population. Using a genome-wide panel of SNP loci, strong evidence of linkage disequilibrium (LD) was obtained for each chromosome and the whole genome formed a single genetic linkage group (Figure 1—figure supplement 3).
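To illustrate why fixed heterozygosity drives FIS to -1, the following is a minimal sketch using the simple per-locus estimator FIS = 1 - Ho/He. This is an assumption made for illustration: it is not the Weir and Cockerham estimator computed by Fstat, which includes sample-size corrections. The 0/1/2 genotype coding and the function name are likewise hypothetical.

```python
def fis_simple(genotypes):
    """Per-locus FIS = 1 - Ho/He for a bi-allelic locus.

    genotypes: alt-allele counts per diploid isolate (0, 1 or 2).
    Simplified estimator, not Weir and Cockerham's.
    """
    n = len(genotypes)
    p = sum(genotypes) / (2 * n)                      # alt-allele frequency
    h_obs = sum(1 for g in genotypes if g == 1) / n   # observed heterozygosity
    h_exp = 2 * p * (1 - p)                           # expected under HWE
    if h_exp == 0:
        return float("nan")                           # monomorphic locus
    return 1 - h_obs / h_exp


# Toy locus: every one of 75 isolates is heterozygous.
fixed_het = [1] * 75
print(fis_simple(fixed_het))
```

When every isolate is heterozygous, the allele frequency is 0.5, expected heterozygosity is 0.5 and observed heterozygosity is 1, so the estimator reaches its minimum of -1.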

Figure 1. Phylogenetic network analysis.

SplitsTree phylogenetic networks were constructed using (A) each isolate for the collection of T.b. brucei (Tbb), T.b. rhodesiense (Tbr), T.b. gambiense Group 1 (Tbg1) and T.b. gambiense Group 2 (Tbg2) and (B) for just T.b. gambiense Group 1 isolates. The number of samples in each group is indicated in parentheses.

DOI: http://dx.doi.org/10.7554/eLife.11473.003

Figure 1—source data 1. Number of SNP loci with respect to different sub-species.
DOI: 10.7554/eLife.11473.004
Figure 1—source data 2. Testing Hardy-Weinberg Equilibrium (HWE) across the T.b. gambiense Group 1 genome.
DOI: 10.7554/eLife.11473.005
Figure 1—source data 3. FIS by sub-population.
DOI: 10.7554/eLife.11473.006
Figure 1—source data 4. Number and type of T.b. gambiense Group 1 SNPs.
DOI: 10.7554/eLife.11473.007

Figure 1—figure supplement 1. Genome-wide SNP density map for each sub-species.

The densities of SNP loci were assessed among isolates of (a) T.b. brucei (n=2), (b) T.b. gambiense Group 1 (n=75), (c) T.b. gambiense Group 2 (n=4) and (d) T.b. rhodesiense (n=4) over each of the 11 chromosomes (horizontal scale). The number of SNPs per 10 kb window is plotted in blue, with the scale indicating SNP density between 0 and 150 loci per window (vertical scale). T.b. gambiense Group 1 isolates have a SNP density approximately ten-fold lower than that of the other sub-species; however, the SNPs are evenly distributed throughout the genome.

Figure 1—figure supplement 2. Weir and Cockerham’s FIS.

The number of SNP loci exhibiting different levels of FIS (Weir and Cockerham’s FIS (Goudet, 1995)) was plotted, showing a distribution around -1, indicating strict asexuality.

Figure 1—figure supplement 3. Genome-wide linkage disequilibrium (LD) among T.b. gambiense Group 1 parasites.

Normalised LD was assessed across all eleven chromosomes. The position of each SNP used in this analysis is illustrated. The black triangle illustrates the single linkage group identified in the analysis, which spans the entire genome. Numbers 1–11 represent the 11 chromosomes. Red (D'=1, LOD≥2), Pink (D'<1, LOD≥2), Blue (D'=1, LOD<2) and White (D'<1, LOD<2).

Inspection of the SNP distribution across the genome of Group 1 isolates identified multiple long tracts of homozygosity, termed Loss of Heterozygosity (LOH) (Figure 2—figure supplement 1). Examination of read depth variation confirmed that this is not due to the loss of part of a homologue and is consistent with mitotic gene conversion; similarly there is no evidence of aneuploidy. In T.b. gambiense Group 1, LOH has occurred on every chromosome, with many isolates showing the same LOH patterns (Figure 2—figure supplement 1), strongly suggesting that many of these regions arose early in Group 1 evolution. Chromosome 10 displayed the greatest variation in LOH sites, with a total of eighteen different patterns of variation among Group 1 genomes (Figure 2). This varied from 2% to 82% of the chromosome length, with LOH occurring predominantly towards one telomere. Distinct LOH patterns were associated with particular phylogenetic lineages, indicating LOH has occurred at various points in Group 1 evolution (Figure 2—figure supplement 2).
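As a rough illustration of how tracts of homozygosity can be picked out of per-isolate genotype calls, the sketch below scans sites that are heterozygous in the group as a whole and reports runs of consecutive homozygous calls in one isolate. This is an assumed, simplified procedure, not the authors' LOH-calling method; the function name, the input layout and the `min_sites` threshold are hypothetical.

```python
def homozygous_tracts(sites, min_sites=3):
    """Return (start, end) positions of runs of homozygous calls.

    sites: list of (position, is_heterozygous) tuples for one isolate,
           sorted by position and restricted to loci that are
           heterozygous in most other isolates (an assumption).
    min_sites: hypothetical minimum run length to report a tract.
    """
    tracts = []
    run = []
    for pos, is_het in sites:
        if not is_het:
            run.append(pos)          # extend the current homozygous run
        else:
            if len(run) >= min_sites:
                tracts.append((run[0], run[-1]))
            run = []                 # a heterozygous site ends the run
    if len(run) >= min_sites:        # close a run reaching the last site
        tracts.append((run[0], run[-1]))
    return tracts


# Invented toy data: one three-site homozygous run, one isolated homozygous site.
toy_sites = [(100, True), (200, False), (300, False), (400, False),
             (500, True), (600, False), (700, True)]
print(homozygous_tracts(toy_sites))
```

A real analysis would also need to rule out hemizygosity, which the text above addresses through read-depth variation.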

Figure 2. Loss of heterozygosity on chromosome 10.

Loss of heterozygosity regions (blue) spanning Chromosome 10 show 18 different patterns (A-R). The number of isolates possessing each pattern and the percentage of the chromosome affected are indicated. The table (inset) shows the extent of LOH (min and max) for each chromosome as a percentage of chromosome length.

DOI: http://dx.doi.org/10.7554/eLife.11473.011

Figure 2—figure supplement 1. Loss of heterozygosity across the T.b. gambiense Group 1 genome.

Every T.b. gambiense Group 1 isolate (first column) was analysed for loss of heterozygosity (LOH; see Materials and methods). Blocks of LOH are shown in blue across each of the eleven chromosomes, as indicated.

Figure 2—figure supplement 2. Loss of heterozygosity across chromosome 10.

Every T.b. gambiense Group 1 isolate was analysed for loss of heterozygosity (LOH). Blocks of LOH across chromosome 10 (horizontal scale) are shown in blue. Eighteen different patterns (A-R) are evident and the phylogenetic tree is shown on the left. Patterns of LOH can be seen to cluster in agreement with the tree, clearly showing that a stable LOH profile may be inherited. A selection of inherited blocks of LOH together with the branch on which they emerge are marked 1–4. In addition, very recent LOH events can be observed, for example the large LOH region in ‘DEOLA’, which is not observed in its close relative ‘LISA’, highlighted in yellow.

Taken together, these analyses clearly demonstrate that Group 1 parasites are evolving asexually. A key predicted feature of asexual diploid species is the independent evolution and divergence of alleles on chromosome homologues (Birky, 1996), often referred to as the ‘Meselson effect’ (Birky, 1996; Butlin, 2002; Mark Welch and Meselson, 2000). To test whether this phenomenon occurred in T.b. gambiense Group 1, three regions of the genome were chosen where it was possible to manually phase the genomic data, using LOH events to guide the phasing in non-LOH closely related isolates (Materials and methods). For all three regions, a clear sequence of the accumulating mutations could be inferred (Figure 3), with each haplotype evolving independently, thus illustrating the Meselson effect (Judson and Normark, 1996).

Figure 3. The Meselson effect.

An accumulation of mutations on separate haplotypes, the ‘Meselson Effect’, is illustrated using three regions on chromosome 10. For each region (1, 2 and 3) the two haplotypes are shown for a series of isolates with the accumulating mutations (filled boxes) indicated in red or blue for each haplotype. The sequences of accumulating mutations observed in the isolates ‘Lisa’ and ‘B4_F303P’ are shown. The number of mutations arising between each ancestral haplotype pair is indicated with two distinct lineages apparent at each locus, illustrating the independent evolution of each haplotype.

DOI: http://dx.doi.org/10.7554/eLife.11473.014

Figure 3—source data 1. Estimated time since the most recent common ancestor.
DOI: 10.7554/eLife.11473.015

To test whether haplotypes have evolved independently at the whole-chromosomal level, the sequence dataset for all sub-species was computationally phased and haplotypes inferred for each isolate. Excluding LOH regions, phylogenetic trees for each chromosome were generated, revealing that A and B haplotypes separate into distinctive clusters (example in Figure 4A) and for every chromosome this pattern was maintained (Figure 4—figure supplement 1). Co-phylogenetic analysis revealed congruence between the A and B haplotype trees across different chromosomes (Figure 4B; Figure 4—figure supplements 2 and 3) illustrating that these haplotypes are evolving in parallel.

Figure 4. Haplotype co-evolution of chromosome 8.

(A) Phylogenetic tree of phased haplotype sequences of chromosome 8 shows a complete divergence between A (blue) and B (red) haplotypes of T.b. gambiense Group 1 (identical genotypes removed). The tree is rooted using a Group 2 isolate (green); (B) Co-phylogenetic analysis reveals 100% consensus between the A and B haplotype trees of a subset of T.b. gambiense Group 1 isolates.

DOI: http://dx.doi.org/10.7554/eLife.11473.016

Figure 4—source data 1. Co-phylogenetic analysis.
DOI: 10.7554/eLife.11473.017

Figure 4—figure supplement 1. Phylogenetic trees of phased data showing ‘A’ and ‘B’ haplotypes.

Maximum likelihood phylogenetic trees of phased haplotype sequences demonstrate divergence between A (blue) and B (red) haplotypes of T.b. gambiense Group 1 over each of the 11 chromosomes. This represents all SNP loci identified in each sub-species (n = 230,891). The haplotypes of non-Group 1 isolates are shown in green. Divergence of A and B haplotypes of Group 1 isolates can be observed for every chromosome.

Figure 4—figure supplement 2. Co-phylogenetic analysis.

For the three chromosomes with more than 1,000 SNP loci (Figure 1—source data 4), subsets of isolates (discriminated by highly supported nodes, see Materials and methods) were used to construct haplotype trees. In each case, the A and B haplotype trees show identical topology, illustrating the co-evolution of partner haplotypes.

Figure 4—figure supplement 3. Phylogenetic tree of all T.b. gambiense Group 1 isolates.

A maximum likelihood phylogenetic tree was constructed with all T.b. gambiense Group 1 isolates using the panel of SNPs associated with derived alleles (Figure 1—source data 4). Bootstrap support is shown for each node. The 27 isolates chosen to illustrate co-evolution of partner haplotypes in Figure 4—figure supplement 2 are shown in blue (Guinea) and red (Côte d'Ivoire). The clade used for molecular clock calculations is marked with an asterisk.

The time of emergence of T.b. gambiense Group 1 was determined. We estimated the genome-wide mutation rate using the number of accumulated mutations (both genome-wide and on Chromosome 9) together with the year of isolation for each isolate, using root-to-tip linear regression (Drummond et al., 2003) (Figure 3—source data 1). A rate of 1.82 × 10⁻⁸ substitutions per site per year was estimated, similar to that of the COII-NDI ‘clock’ locus in T. cruzi (Lewis et al., 2011). Given the observed number of accumulated mutations per isolate, the genome size, and our calculated mutation rate, the time since the existence of the most recent common ancestor (MRCA) of the Group 1 isolates analysed in the study is estimated to be in excess of one thousand years before present (Figure 3—source data 1). Similarly, using the mutation rates for two different T. cruzi clock genes (COII-NDI and GPI) (Lewis et al., 2011), the date of emergence was estimated to be between 750 and 9,500 years before present (Figure 3—source data 1).
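The logic of root-to-tip regression can be sketched as follows: regress each isolate's genetic distance from the root on its year of isolation, read the slope as the substitution rate, and read the x-intercept as the date of the MRCA. This is a minimal illustration under stated assumptions, not a reproduction of the authors' analysis: the data are invented toy values constructed to lie on a perfect line, and the function name is hypothetical.

```python
def root_to_tip(years, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling year.

    years: isolation year of each isolate.
    distances: substitutions per site from the root to each isolate.
    Returns (rate per site per year, estimated calendar year of the MRCA).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx                       # substitution rate
    intercept = mean_y - slope * mean_x
    mrca_year = -intercept / slope          # year at which distance is zero
    return slope, mrca_year


# Invented toy data: a rate of 2e-8 and an MRCA in the year 1000 are
# built in, so the regression should recover both.
toy_years = [1952, 1978, 1990, 2004]
toy_distances = [2e-8 * (year - 1000) for year in toy_years]
rate, mrca_year = root_to_tip(toy_years, toy_distances)
print(rate, mrca_year)
```

With real isolates the points scatter around the line, so the short sampling window (here roughly fifty years) relative to the age of the lineage translates into wide uncertainty on the intercept.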

Discussion

The theory of clonality in parasitic protozoan populations has been proposed and debated over the last quarter of a century (Tibayrenc and Ayala, 2002; Tibayrenc et al., 1990). To advance our understanding of clonal evolution, we have undertaken a whole-genome population-level analysis of T.b. gambiense Group 1 focussing on a large number of isolates sampled from two countries, together with a small out-group from a more distant West African country. We provide robust evidence that this important human parasite reproduces exclusively asexually, demonstrating complete genetic linkage across the genome and the absence of allelic segregation. Despite meiosis-specific genes being intact and expressed (Peacock et al., 2014), the population genetic data is incompatible with sexual reproduction, self-fertilisation, aneuploidy (Schurko et al., 2009), a parasexual cycle (Ramírez and Llewellyn, 2014; Forche et al., 2008) or the atypical meiosis observed in rotifers (Signorovitch et al., 2015). The barrier to sexual reproduction in T.b. gambiense Group 1 remains unclear. The lack of decay of ‘meiosis-associated’ genes may be explained by a number of non-mutually exclusive hypotheses including the relatively recent emergence of this asexual lineage, such that insufficient time has elapsed to allow decay. Alternatively, these genes may perform additional roles in processes such as DNA repair or VSG-related recombination.

Our data indicates the parasite population comprises just two independently evolving haplotypes; this remarkable observation confirms the Meselson effect at a whole-genome level for the first time. Despite being predicted for almost twenty years, empirical evidence for this phenomenon has been elusive. The original report of the Meselson effect focused on bdelloid rotifers (Mark Welch and Meselson, 2000); however, this was later shown to instead be due to the entirely different phenomenon of cryptic tetraploidy (Mark Welch et al., 2008). Subsequent attempts to demonstrate the Meselson effect have been thwarted, such as in the case of the obligately apomictic crustacean, Daphnia (Omilian et al., 2006). The relatively high rate of mitotic recombination in comparison to the mutation rate results in novel heterozygous sites being eliminated by gene conversion (LOH) events much faster than they are generated and consequently allelic divergence is not observed (Omilian et al., 2006). More recently, in the genome of Timema stick insects, nuclear alleles have been shown to display a higher level of divergence in asexual rather than sexual species (Schwander et al., 2011). In that system, as in T.b. gambiense Group 1, the mitotic recombination rate is sufficiently low so as not to obscure the pattern of accumulating mutations, and this has underpinned our ability to detect and confirm the Meselson effect.

The similarity of the genomes studied from different geographical locations, together with a lack of recombination in the evolution of T.b. gambiense Group 1, suggests this sub-species emerged from a single progenitor. Each clade represents a separate country and these are partitioned by the earliest branches of the phylogenetic tree, implying early radiation and colonisation. The emergence of T.b. gambiense Group 1 within the last 10,000 years coincided with an important period in human history when civilisation and livestock farming were developing in West Africa (Oliver, 1966), but whether asexual reproduction was a prerequisite for human infection or occurred subsequently is uncertain. This successful asexual human pathogen contrasts markedly with the virtually extinct sexual T.b. gambiense Group 2 parasite that occupied similar biological and geographical niches and this may be an example of the dominance of an asexual lineage over its sexual counterpart (Charlesworth, 1980). Another remarkable feature of the T.b. gambiense Group 1 genome is the extensive loss of heterozygosity across large regions of each chromosome, which likely arose from gene conversion/mitotic recombination. Gene conversion provides a mechanism for removing a proportion of deleterious mutations in asexual eukaryotes, as hypothesised in other systems (Tucker et al., 2013). A fitter allele on one haplotype would be positively selected, resulting in long-range tracts of LOH that encompass the loci under selection together with extensive flanking regions. However, it has been predicted that following an LOH event, individuals will experience a slight decline in long-term fitness due to newfound homozygosity featuring pre-existing sub-optimal alleles, leaving a signature equivalent to inbreeding (Tucker et al., 2013) in the progeny. 
This process has been described as a more powerful evolutionary force than the accumulation of point mutations (Omilian et al., 2006) and thus may drive Muller’s Ratchet more quickly than de novo mutations. This suggests that although LOH may effectively counteract the accumulating mutational load in the short-term, it is unclear whether it can prevent this uniquely well-adapted pathogen from ultimately entering an extinction vortex.

Materials and methods

Sample collection

A panel of eighty-five DNA samples was collected, representing T. brucei isolates from East and West Africa (Supplementary file 1), including T.b. brucei (n=2), T.b. rhodesiense (n=4), T.b. gambiense Group 1 (n=75) and T.b. gambiense Group 2 (n=4). The main focus of the study was human-derived T.b. gambiense Group 1, with the samples collected from sleeping sickness patients in Guinea (n=37), Côte d'Ivoire (n=36) and Cameroon (n=2). This included collections from Bonon (n=14, collected 2000–2004) in the Côte d'Ivoire and Boffa (n=18, collected 2002) and Dubreka (n=19, collected 1998–2002) in Guinea.

Illumina sequencing and SNP calling

The T. brucei genome is approximately 26 Mb in size and comprises eleven major chromosomes between one and six megabases together with an array of intermediate and mini-chromosomes (Ogbadoyi et al., 2000). Illumina paired-end sequencing libraries were prepared from genomic DNA and sequenced by standard procedures on Illumina HiSeq machines, to yield paired sequence reads of 75 bases in length. For each parasite strain, the data yield from the sequencing machines passing the default purity filter was between 7.4 million and 40.4 million read pairs (median of 15.3 million), which corresponds to a nominal genome coverage of between 37.9-fold and 207.1-fold (median of 78.2-fold). For the purpose of calling SNPs, mapping of the paired sequencing reads to the genome reference sequence from GeneDB (Trypanosoma brucei brucei TREU927, referred to here as Tb927) was carried out with SMALT (www.sanger.ac.uk/resources/software/smalt/) version 0.7.4 using the following parameters: word length = 13, skip step = 3, maximum insert size = 800, minimum Smith-Waterman score = 60, and with the exhaustive search option enabled. A median fraction of 79.8% of sequencing reads were mapped and a median fraction of 64.9% of sequencing reads were classified as 'proper pairs', i.e. with the two mates of a sequencing read pair mapped within the expected distance and in the correct orientation. Only sequence reads mapped as 'proper pairs' were used for SNP calling, and the first 5 and last 15 nucleotides were clipped from all reads prior to subsequent analysis. Genotypes for every genomic position were determined using SAMtools version 0.1.19 (Li et al., 2009) by using the 'samtools mpileup' command with minimum baseQ/BAQ ratio of 15 (-Q) followed by SAMtools' 'bcftools view' command with options -c and -g enabled.
SNP calls were filtered according to the following criteria: a minimum of 6 high-quality base calls (DP4); a minimum and maximum coverage depth (DP) of 0.25 times and 4 times the median, respectively; a minimum quality score (QUAL) of 22; a minimum mapping quality (MQ) of 22; a minimum second best genotype likelihood value (PL) of 0.25 times the median; a maximum fraction of conflicting base calls for homozygous genotype calls of 10%; and a minimum percentage of 5% for base calls (as a fraction of all base calls for a given genotype) that mapped either to the forward or the reverse strand of the reference sequence. Only loci that passed the quality control criteria for every sample were used for the population analysis. To ensure the SNP-calling parameters used in our analysis did not skew the distribution of variant sites detected (a) within individual genotypes and (b) across the sample collection, SNP-calling was performed using a range of stringencies. While the total number of SNP loci identified per individual and throughout the population varied depending on the stringency of the filter, the allele frequency spectrum remained constant (data not shown). For the analysis presented in the manuscript, our SNP-calling parameters corresponded with a relatively low stringency filter, which was considered capable of detecting a very high proportion of variant sites while ablating the effects of sporadic read errors. Thus, the filter was designed to eliminate artefactual variants in the first instance, although a relatively low number of variant sites may have remained undetected due to the necessity for every sample to pass QC at a given locus. The entire SNP dataset is deposited at the TritrypDB pathogen database, which is freely accessible at http://www.tritrypdb.org/tritrypdb/.
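A subset of the thresholds above, together with the requirement that a locus pass in every sample, can be sketched as a simple predicate. This is an illustrative assumption rather than the authors' pipeline: the dictionary keys, function names and toy calls are hypothetical, "6 high-quality base calls" is interpreted here as the sum of the four DP4 counts, and the PL, conflicting-base and strand criteria are omitted for brevity.

```python
MIN_HQ_BASES = 6          # minimum high-quality base calls (sum of DP4, assumed)
MIN_QUAL = 22             # minimum QUAL
MIN_MQ = 22               # minimum mapping quality
DEPTH_RANGE = (0.25, 4.0) # allowed depth as a multiple of the sample median


def passes_filters(call, median_depth):
    """call: dict with hypothetical keys 'dp4', 'dp', 'qual', 'mq'."""
    low, high = DEPTH_RANGE
    return (
        sum(call["dp4"]) >= MIN_HQ_BASES
        and low * median_depth <= call["dp"] <= high * median_depth
        and call["qual"] >= MIN_QUAL
        and call["mq"] >= MIN_MQ
    )


def loci_passing_in_all_samples(calls, median_depth):
    """calls: {sample: {locus: call}}; median_depth: {sample: depth}.

    Keeps only loci that are called, and pass the filters, in every sample.
    """
    if not calls:
        return []
    shared = set.intersection(*(set(per_locus) for per_locus in calls.values()))
    return sorted(
        locus for locus in shared
        if all(passes_filters(calls[s][locus], median_depth[s]) for s in calls)
    )


# Invented toy calls: the second locus fails QUAL in one isolate and is dropped.
good = {"dp4": (0, 0, 20, 18), "dp": 40, "qual": 90, "mq": 50}
low_qual = dict(good, qual=10)
calls = {
    "isolate_A": {("chr1", 100): good, ("chr1", 200): good},
    "isolate_B": {("chr1", 100): good, ("chr1", 200): low_qual},
}
median_depth = {"isolate_A": 40, "isolate_B": 40}
kept = loci_passing_in_all_samples(calls, median_depth)
print(kept)
```

The all-samples requirement is what makes the panel conservative: a single failing isolate removes the locus for the whole collection, which is the trade-off the paragraph above acknowledges.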

Panels of SNPs for population analysis

A subset of SNP loci was selected where a high-confidence genotype could be identified for every sample in the dataset (Figure 1—source data 1). This totalled 230,891 bi-allelic markers, which were used as the basis for the population genomic analyses presented in this study. The number of SNP loci was calculated for each sub-species using two methods: (1) in comparison to the Tb927 reference genome; and (2) in comparison to other members of that sub-species (Figure 1—source data 1). A total of 130,180 SNP loci were identified among the seventy-five Group 1 isolates in comparison to the reference T.b. brucei genome although only 11,398 of these showed polymorphism within Group 1 itself. In order to facilitate different types of analysis, a series of panels of a sub-set of SNP loci were defined (Figure 1—source data 4). For some analyses, Loss of Heterozygosity (LOH) regions of the genome were excluded and therefore a sub-set of SNP loci in non-LOH regions of the genome were identified (Figure 1—source data 4; n=5,201). In addition, SNP loci in non-LOH regions where the minor allele frequency was greater than 20% were identified, n=3,549 (Figure 1—source data 4), in order to provide sufficient power for testing linkage disequilibrium among the sequenced samples. In order to investigate whether SNP loci where the minor allele frequency (MAF) is low correspond to localised areas where recombination events have occurred, the distribution of these loci was visually compared with the distribution of other SNPs in the non-LOH regions of the genome. Similar to the other SNPs, these SNP loci were evenly distributed over the non-LOH regions of the genome, excluding this possibility (data not shown). Finally, a set of SNP loci was identified over non-LOH regions of the genome, excluding fixed heterozygous loci, which was polymorphic only among T.b. gambiense Group 1 isolates.
These correspond to ‘derived’ alleles, which arose since the most recent common ancestor of the Group 1 isolates studied (Figure 1—source data 4, ‘Tbg1 derived’). A set of SNP loci polymorphic both within and outside the Group 1 population was also defined (Figure 1—source data 4, ‘Tbg1 ancient’).

Phylogenetic networks and trees

Phylogenetic networks were constructed using the Split Decomposition method of SplitsTree4 (Huson and Bryant, 2006): Figure 1A shows the reconstruction using all the isolates sequenced in this study (n=85) and the full panel of SNPs; Figure 1B shows relationships among the Group 1 isolates (n=75) using the SNP panel corresponding to derived alleles in non-LOH areas (Figure 1—source data 4). The virtually non-reticulated topology of the network presented in Figure 1B supports our finding of strict asexuality in the Group 1 population and it is therefore appropriate to utilise a classical phylogenetic tree approach for the analysis of this sub-species. Maximum Likelihood phylogenetic trees were constructed using RAxML (Stamatakis, 2014) using a generalised time-reversible model of sequence evolution. Confidence in individual branching relationships was assessed using 100 bootstrap pseudo-replicates and trees visualised using FigTree 1.4 (tree.bio.ed.ac.uk).

Linkage disequilibrium and Hardy-Weinberg equilibrium

The Haploview software package (Barrett, 2009) was used to investigate LD across each chromosome using unphased data and to calculate the normalised measure of allelic association, D' (Daly et al., 2001). Linkage blocks were defined using the method of Gabriel et al. (2002) with a block being created if 95% of informative comparisons between SNP loci were found to possess ‘strong LD’. Strong LD was defined as D'=1 and LOD score ≥2. Haploview was also used to calculate the probability that any observed deviation from HWE could be explained by chance using a χ² test. This was performed for SNP loci across the T.b. gambiense Group 1 genome utilising the set of ‘Tbg1 ancient’ SNP loci (Figure 1—source data 4). Statistically significant loci at P<0.001 were additionally tested to determine whether deviation from HWE was associated with a heterozygote excess (Figure 1—source data 2), by comparing the predicted with the observed heterozygote frequency. Overall, a high proportion of SNP loci (97.4%) showed a statistically significant departure from HWE (P<10⁻¹⁷) and strikingly, all of these loci showed an excess of heterozygotes (Figure 1—source data 2). Along with the entire set of T.b. gambiense Group 1 isolates, a separate HWE analysis was performed on three spatio-temporally defined sub-populations (Bonon, Boffa and Dubreka), thus accounting for any geographical and temporal population sub-structure. A series of further genetic tests was performed using Fstat version 2.9 (Goudet, 1995). FIS was calculated across the T.b. gambiense Group 1 genome, again utilising the set of ‘ancient’ SNP loci (Figure 1—figure supplement 2). A median figure of -1 was calculated for the entire set of T.b. gambiense Group 1 isolates and for each of the Bonon, Boffa and Dubreka sub-populations, indicating strict asexuality. This was supported by permutation testing (n iterations = 30,000), which indicated that FIS was lower than expected (Figure 1—source data 3). Similarly, using the Bonon, Boffa and Dubreka sub-populations, Weir & Cockerham’s fis was also calculated using Fstat. This had a uni-modal distribution with a median of -1 (fis = -1 at 91.4% of loci), indicating strict asexuality.
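
To show why a fixed heterozygous locus yields both a large HWE deviation and an inbreeding coefficient of -1, the following sketch works through a single bi-allelic locus. It is an illustration only and does not reproduce Haploview or Fstat: the function name is hypothetical, the χ² test uses one degree of freedom, and FIS is computed with the simple estimator 1 − Ho/He rather than Weir & Cockerham's estimator.

```python
import math

def hwe_and_fis(n_aa, n_ab, n_bb):
    """Chi-square HWE test and a simple F_IS for one bi-allelic locus.

    Returns (chi2, p_value, fis, heterozygote_excess).
    Simplifying assumptions: 1 degree of freedom, F_IS = 1 - Ho/He.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    h_obs = n_ab / n
    h_exp = 2 * p * q
    fis = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return chi2, p_value, fis, h_obs > h_exp

# A locus at which all 75 isolates are heterozygous.
chi2, p_value, fis, excess = hwe_and_fis(0, 75, 0)
print(chi2, p_value, fis, excess)
```

In this toy case every isolate is heterozygous, so the observed heterozygosity (1.0) is twice the expected value (0.5) and FIS reaches its minimum of -1.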

‘Loss of heterozygosity’ analysis

To assess the distribution of heterozygous sites across the genome, the density of these sites was calculated in 10 kb segments for every isolate. These density figures were used to determine whether each 10 kb segment could be considered a candidate area for long-range LOH. LOH blocks were defined using a custom Perl script to perform Interval Analysis using the following criteria: max number of heterozygous sites allowed per block = 0, minimum number of contiguous blocks = 6, maximum gap size in a contiguous block = 2, max number of heterozygous sites allowed within gap = 2. LOH block data was converted for viewing in the Integrative Genomics Viewer (IGV) (Thorvaldsdottir et al., 2013). In order to determine whether genomic structural variation could explain observed LOH, we performed a systematic copy number variation (CNV) analysis across the genome using CNVnator (Mills et al., 2011). This revealed that there was no loss or gain of chromosomal material associated with LOH segments (data not shown).
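
The interval criteria listed above can be sketched in a few lines. This is not the custom Perl script used in the study, which is not reproduced here; it is one possible reading of the criteria, written in Python. The assumptions are that a block must begin and end on a segment with no heterozygous sites, that a "gap" is a run of at most two consecutive segments each carrying at most two heterozygous sites, and that the minimum length of six segments includes any tolerated gap segments.

```python
def call_loh_blocks(het_counts, max_het=0, min_segments=6, max_gap=2, max_het_gap=2):
    """Return (first_segment, last_segment) index pairs for candidate LOH blocks.

    het_counts: number of heterozygous sites in each consecutive 10 kb segment.
    The parameter names are hypothetical; the defaults follow the criteria in the text.
    """
    blocks = []
    start = last_clean = None   # bounds of the block currently being extended
    gap = 0                     # tolerated gap segments since the last clean segment
    sentinel = max(max_het, max_het_gap) + 1  # forces the final block to close
    for i, h in enumerate(list(het_counts) + [sentinel]):
        if h <= max_het:
            if start is None:
                start = i
            last_clean = i
            gap = 0
        elif start is not None and h <= max_het_gap and gap < max_gap:
            gap += 1            # tolerate a short, nearly clean gap
        else:
            if start is not None and last_clean - start + 1 >= min_segments:
                blocks.append((start, last_clean))
            start = last_clean = None
            gap = 0
    return blocks

SEGMENT_BP = 10_000
counts = [5, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 3, 0]
blocks = call_loh_blocks(counts)
coords = [(s * SEGMENT_BP, (e + 1) * SEGMENT_BP) for s, e in blocks]
print(blocks, coords)
```

Trailing gap segments are trimmed because a block is only ever closed at its last clean segment.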

Manual phasing, computational phasing and co-phylogenetic analysis

In order to validate the computational phasing and investigate the relationship between haplotypes, three loci were selected where recent LOH events had occurred independently on chromosome 10 in different isolates. Such independent LOH events may be identified by examining patterns of LOH in comparison to the phylogenetic tree (Figure 2—figure supplement 2). For locus 1, for example, an LOH region recently arose independently in B7_2 and DEOLA and can be observed over approximately 650 kb of chromosome 10 (Figure 2—figure supplement 2). Examination of the 506 SNP loci identified in this region indicated that two ancestral haplotypes existed. The nearest neighbours of these isolates (B7_2 and DEOLA) on the phylogenetic tree, CP1_2_KIVI and LISA, respectively, did not possess LOH in this region and therefore the sequences of B7_2 and DEOLA could be used as guides to allow CP1_2_KIVI and LISA to be confidently phased. Since the divergence of CP_1_2_KIVI from LISA, 15 mutations have arisen at this locus in the former isolate and 9 in the latter; this is illustrated for LISA in Figure 3. Again, because of the existence of only two haplotypes, more distant isolates could also be phased in this manner, which allowed intermediate haplotypes to be inferred and the accumulating sequence of Meselson mutations to be determined.

To permit a genome-wide analysis, computational phasing of the diploid genotypic data was performed using the segmented haplotype estimation and imputation tool SHAPEIT2 (Delaneau et al., 2012; Delaneau et al., 2013). The default parameters were used together with an adjusted window size of 0.5 Mb and a recombination rate of 0.0003 (MacLeod et al., 2005). The accuracy of the computational phasing for each isolate was assessed in comparison to a large LOH region on one isolate (DEOLA at the 650 kb Locus 1 on Chromosome 10). LOH in this region provided a set of ‘gold standard’ phasing information, which was used to check the phasing of all isolates, except the five which shared LOH in this region. A switch error rate (Lin et al., 2004) of between 4% and 12% was observed (mean 9.3%) across the 69 isolates, validating the results of the computational phasing.
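
For readers unfamiliar with the switch error rate, a minimal sketch of the calculation follows. It assumes a simplified encoding that is not taken from the study: at each heterozygous site of the isolate being checked, a 0 or 1 records which allele was assigned to haplotype A, once by the 'gold standard' and once by the computational phasing. A switch is counted whenever the agreement between the two changes from one site to the next.

```python
def switch_error_rate(reference_phase, inferred_phase):
    """Fraction of adjacent heterozygous-site pairs at which the inferred phase switches.

    Both arguments list the allele (0 or 1) placed on haplotype A at each
    heterozygous site, in genomic order (a hypothetical encoding for this sketch).
    """
    if len(reference_phase) != len(inferred_phase):
        raise ValueError("phase vectors must have the same length")
    if len(reference_phase) < 2:
        raise ValueError("at least two heterozygous sites are required")
    # True where the inferred orientation matches the reference at that site.
    agree = [r == i for r, i in zip(reference_phase, inferred_phase)]
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(agree) - 1)

reference = [0, 0, 0, 0, 0, 0]
inferred = [0, 0, 1, 1, 1, 0]   # phase flips after site 2 and flips back after site 5
rate = switch_error_rate(reference, inferred)
print(rate)
```

Because only changes in orientation are counted, a haplotype that is consistently labelled B rather than A scores zero, which is appropriate since the A/B labels are arbitrary.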

Phased sequence data from all isolates in the collection was used to create a separate Maximum Likelihood phylogenetic tree for each chromosome with RAxML (Stamatakis, 2014) (Figure 4—figure supplement 1). For T.b. gambiense Group 1 isolates, co-phylogeny of the phased haplotypes (A vs B) was then assessed for each chromosome in turn using Jane (Conow et al., 2010). Tree topologies were resolved using a Genetic Algorithm for co-phylogeny reconstruction with the default cost model. To assess whether the trees were more similar than would be expected by chance, 1,000 simulations were carried out using each of: (a) a random tip-mapping method; and (b) a random tree method (beta = -1). For every chromosome, A and B haplotype trees were significantly similar to each other (Figure 4—source data 1, P < 0.001). In order to illustrate this similarity between A and B haplotype trees, a set of 27 isolates was selected which could be resolved with 100% bootstrap support from a phylogenetic tree constructed using the whole-genome dataset (Figure 4—figure supplement 3 ). The three chromosomes with the largest number of SNPs among T.b. gambiense Group 1 (8, 10 and 11) were then selected, as these were the most informative. Isolates which could be resolved by the most highly supported nodes in trees representing the A and B haplotypes were subsequently selected and this corresponded to 16, 18 and 13 isolates for chromosomes 8, 10 and 11 respectively. In each case, the A and B haplotype sub-trees showed identical topology (Figure 4—figure supplement 2), illustrating the co-evolution of partner haplotypes.

Dating the emergence of T.b. gambiense Group 1

The date of emergence of T.b. gambiense Group 1 was calculated by combining the mutational rate with the estimated number of mutations arising since the most recent common ancestor of our collection of Group 1 isolates. The mutation rate was calculated using two approaches, the first using all seventy-five isolates, the second using a subset of samples representing a discrete lineage isolated in the Côte d'Ivoire (Figure 3—source data 1). Both methods gave rise to a similar mutation rate of approximately 2 × 10⁻⁸ mutations per base per year.

Two approaches were used to assess the number of accumulated mutations present in the genome of each isolate. In the first, only the non-LOH portion of the genome was utilised to avoid the risk of mutations being ‘erased’ by way of LOH/gene conversion events. The number of SNP loci where derived alleles were present in the genome was counted. 388 such loci (+/- 12) were identified on both homologues over 8.47 Mb of the non-LOH portion of the genome. The second involved focusing on Chromosome 9, where a large region of LOH is found over much of the chromosome in every Group 1 isolate. Any mutations occurring since the existence of the most recent common ancestor of all the isolates analysed can easily be identified as they manifest themselves as heterozygous loci; this avoids the issue of identifying fixed heterozygous loci, which represent uninformative pre-existing mutations rather than accumulated mutations. Similar results were achieved with both methods, providing us with estimates of the time to most recent common ancestor (TMRCA) of approximately 1,000 to 1,500 years before present (Figure 3—source data 1). The mutation rates of two loci in T. cruzi, for which a 95% confidence interval was available, were also utilised to date the emergence of Group 1 parasites. These provided a confidence interval of approximately 750 to 9,500 years before present (Figure 3—source data 1).
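
The arithmetic behind the first approach can be sketched as follows. This is an illustrative reading rather than the exact calculation in Figure 3—source data 1: it assumes that the 388 derived loci were counted across both homologues, so that the mutational target is twice the 8.47 Mb of non-LOH sequence, and it uses the rounded rate of 2 × 10⁻⁸ mutations per base per year. The function name and the `copies` parameter are hypothetical.

```python
def tmrca_years(n_derived, sites, rate_per_site_per_year, copies=2):
    """Years needed to accumulate n_derived mutations over `copies` homologues of `sites` bases."""
    return n_derived / (sites * copies * rate_per_site_per_year)

RATE = 2e-8             # mutations per base per year (rounded value from the text)
NON_LOH_SITES = 8.47e6  # bases in the non-LOH portion of the genome

central = tmrca_years(388, NON_LOH_SITES, RATE)
lower = tmrca_years(388 - 12, NON_LOH_SITES, RATE)
upper = tmrca_years(388 + 12, NON_LOH_SITES, RATE)
print(round(lower), round(central), round(upper))
```

Under these assumptions the estimate falls within the range of approximately 1,000 to 1,500 years quoted above; counting a single homologue (`copies=1`) would double it.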

Acknowledgements

We thank Lucio Marcello for providing technical assistance in the LOH analysis.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • Wellcome Trust 095201/Z/10/Z to William Weir, Paul Capewell, Caroline Clucas, Andrew Pountain, Pieter Steketee, Nicola Veitch, Anneli Cooper, Annette MacLeod.

  • Wellcome Trust 085349 to William Weir, Paul Capewell, Caroline Clucas, Andrew Pountain, Pieter Steketee, Nicola Veitch, Anneli Cooper, Annette MacLeod.

  • Wellcome Trust 098051 to Bernardo Foth, Matt Berriman.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

WW, performed the genomic and genetic analysis and wrote the paper.

PC, performed the computational phasing.

BF, performed the SNP calling.

CC, prepared parasite material and purified DNA.

AP, was involved in LOH analysis.

PS, was involved in LOH analysis.

NV, prepared parasite material and purified DNA.

MK, Co-ordinated and performed field sampling and prepared parasite DNA.

TDM, provided expert support in analysing and interpreting the data.

JK, Co-ordinated and performed field sampling and prepared parasite DNA.

MC, Co-ordinated and performed field sampling and prepared parasite DNA.

AC, was involved in LOH analysis.

AT, wrote the paper.

VJ, Co-ordinated and performed field sampling and prepared parasite DNA.

BB, Co-ordinated and performed field sampling and prepared parasite DNA.

MB, supervised the sequencing.

AMACL, Conceived the study and wrote the paper.

Ethics

Human subjects: Overall ethical approval for the study was granted by the University of Glasgow College of Medical, Veterinary and Life Sciences Ethics Committee (project number 200120043). Guinean parasite strains were collected with local ethical approval within the framework of medical surveys conducted by the national Human African Trypanosomiasis (HAT) control program (NCP) according to the national HAT diagnostic procedures of the Republic of Guinea Ministry of Health. No samples other than those collected for routine screening and diagnostic procedures were collected and all human samples were anonymised. All participants were orally informed of the objective of the study in their own language by an NCP health officer. This study is part of a larger project for which approval was obtained from the WHO Research Ethics Review Committee (RPC222) and Institut de Recherche pour le Développement (Comité Consultatif de Déontologie et d'Ethique) ethical committee. Ivory Coast parasite strains were collected during medical surveys conducted by the Ivory Coast NCP in agreement with the National Ministry of Health and in collaboration with the IRD, according to WHO and Ivory Coast NCP recommendations. Patients who gave their consent after explanation of the objective and rationale of the study were used in this work. In both countries, all confirmed cases were offered treatment.

Additional files

Supplementary file 1. Isolates used in this study.

For each isolate, the year of isolation, host, country and location are given along with the results of the BIIT test (Blood Incubation Infectivity Test), which determines human infectivity. The presence/absence of TgSGP, the T.b. gambiense Group 1 human serum resistance gene and SRA, the T.b. rhodesiense human serum resistance gene are indicated. The majority of samples in this study were T.b. gambiense Group 1, details of which have been previously published (Heitman, 2006; Thorvaldsdottir et al., 2013).

DOI: http://dx.doi.org/10.7554/eLife.11473.021

Major datasets

The following datasets were generated:

Weir W, Foth B, Clucas C, Pountain A, Steketee P, Veitch N, Koffi M, De Meeûs T, Kaboré J, Camara M, Cooper A, Tait A, Jamonneau V, Bucheton B, Berriman M, MacLeod A, 2016, Tritrypdb SNP collection, http://tritrypdb.org/tritrypdb/, NA

References

  1. Barrett JC. Haploview: visualization and analysis of SNP genotype data. Cold Spring Harbor Protocols. 2009;2009:pdb.ip71. doi: 10.1101/pdb.ip71.
  2. Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B, Böhme U, Hannick L, Aslett MA, Shallom J, Marcello L, Hou L, Wickstead B, Alsmark UC, Arrowsmith C, Atkin RJ, Barron AJ, Bringaud F, Brooks K, Carrington M, Cherevach I, Chillingworth TJ, Churcher C, Clark LN, Corton CH, Cronin A, Davies RM, Doggett J, Djikeng A, Feldblyum T, Field MC, Fraser A, Goodhead I, Hance Z, Harper D, Harris BR, Hauser H, Hostetler J, Ivens A, Jagels K, Johnson D, Johnson J, Jones K, Kerhornou AX, Koo H, Larke N, Landfear S, Larkin C, Leech V, Line A, Lord A, Macleod A, Mooney PJ, Moule S, Martin DM, Morgan GW, Mungall K, Norbertczak H, Ormond D, Pai G, Peacock CS, Peterson J, Quail MA, Rabbinowitsch E, Rajandream MA, Reitter C, Salzberg SL, Sanders M, Schobel S, Sharp S, Simmonds M, Simpson AJ, Tallon L, Turner CM, Tait A, Tivey AR, Van Aken S, Walker D, Wanless D, Wang S, White B, White O, Whitehead S, Woodward J, Wortman J, Adams MD, Embley TM, Gull K, Ullu E, Barry JD, Fairlamb AH, Opperdoes F, Barrell BG, Donelson JE, Hall N, Fraser CM, Melville SE, El-Sayed NM. The genome of the african trypanosome trypanosoma brucei. Science. 2005;309:416–422. doi: 10.1126/science.1112642.
  3. Birky CW. Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes. Genetics. 1996;144:427–437. doi: 10.1093/genetics/144.1.427.
  4. Butlin R. Opinion — evolution of sex: the costs and benefits of sex: new insights from old asexual lineages. Nature Reviews Genetics. 2002;3:311–317. doi: 10.1038/nrg749.
  5. Capewell P, Veitch NJ, Turner CMR, Raper J, Berriman M, Hajduk SL, MacLeod A, Büscher P. Differences between trypanosoma brucei gambiense groups 1 and 2 in their resistance to killing by trypanolytic factor 1. PLoS Neglected Tropical Diseases. 2011;5:e1287. doi: 10.1371/journal.pntd.0001287.
  6. Capewell P, Cooper A, Duffy CW, Tait A, Turner CMR, Gibson W, Mehlitz D, MacLeod A, Langsley G. Human and animal trypanosomes in Côte d'Ivoire form a single breeding population. PLoS ONE. 2013;8:e67852. doi: 10.1371/journal.pone.0067852.
  7. Capewell P, Clucas C, DeJesus E, Kieft R, Hajduk S, Veitch N, Steketee PC, Cooper A, Weir W, MacLeod A, Alsford S. The TgsGP gene is essential for resistance to human serum in trypanosoma brucei gambiense. PLoS Pathogens. 2013;9:e1003686. doi: 10.1371/journal.ppat.1003686.
  8. Charlesworth B. The cost of sex in relation to mating system. Journal of Theoretical Biology. 1980;84:655–671. doi: 10.1016/S0022-5193(80)80026-9.
  9. Conow C, Fielder D, Ovadia Y, Libeskind-Hadas R. Jane: a new tool for the cophylogeny reconstruction problem. Algorithms for Molecular Biology. 2010;5:16. doi: 10.1186/1748-7188-5-16.
  10. Cooper A, Tait A, Sweeney L, Tweedie A, Morrison L, Turner CMR, MacLeod A. Genetic analysis of the human infective trypanosome trypanosoma brucei gambiense: chromosomal segregation, crossing over, and the construction of a genetic map. Genome Biology. 2008;9:R103. doi: 10.1186/gb-2008-9-6-r103.
  11. Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES. High-resolution haplotype structure in the human genome. Nature Genetics. 2001;29:229–232. doi: 10.1038/ng1001-229.
  12. Delaneau O, Marchini J, Zagury J-F. A linear complexity phasing method for thousands of genomes. Nature Methods. 2012;9:179–181. doi: 10.1038/nmeth.1785.
  13. Delaneau O, Zagury J-F, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nature Methods. 2013;10:5–6. doi: 10.1038/nmeth.2307.
  14. Drummond A, Pybus OG, Rambaut A. Inference of viral evolutionary rates from molecular sequences. Advances in Parasitology. 2003;54:331–358. doi: 10.1016/s0065-308x(03)54008-8.
  15. Duffy CW, MacLean L, Sweeney L, Cooper A, Turner CMR, Tait A, Sternberg J, Morrison LJ, MacLeod A, Masiga DK. Population genetics of trypanosoma brucei rhodesiense: clonality and diversity within and between foci. PLoS Neglected Tropical Diseases. 2013;7:e2526. doi: 10.1371/journal.pntd.0002526.
  16. Forche A, Alby K, Schaefer D, Johnson AD, Berman J, Bennett RJ, Heitman J. The parasexual cycle in candida albicans provides an alternative pathway to meiosis for the formation of recombinant strains. PLoS Biology. 2008;6:e110. doi: 10.1371/journal.pbio.0060110.
  17. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-Cordero SN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, Daly MJ, Altshuler D. The structure of haplotype blocks in the human genome. Science. 2002;296:2225–2229. doi: 10.1126/science.1069424.
  18. Gibson W, Stevens J. Genetic exchange in the trypanosomatidae. Advances in Parasitology. 1999;43:1–46. doi: 10.1016/S0065-308X(08)60240-7.
  19. Goodhead I, Capewell P, Bailey JW, Beament T, Chance M, Kay S, Forrester S, MacLeod A, Taylor M, Noyes H, Hall N. Whole-genome sequencing of trypanosoma brucei reveals introgression between subspecies that is associated with virulence. mBio. 2013;4:e00197-13. doi: 10.1128/mBio.00197-13.
  20. Goudet J. FSTAT (version 1.2): a computer program to calculate f-statistics. Journal of Heredity. 1995;86:485–486.
  21. Heitman J. Sexual reproduction and the evolution of microbial pathogens. Current Biology. 2006;16:R711–R725. doi: 10.1016/j.cub.2006.07.064.
  22. Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution. 2006;23:254–267. doi: 10.1093/molbev/msj030.
  23. Judson OP, Normark BB. Ancient asexual scandals. Trends in Ecology & Evolution. 1996;11:41–46. doi: 10.1016/0169-5347(96)81040-8.
  24. Koffi M, De Meeus T, Bucheton B, Solano P, Camara M, Kaba D, Cuny G, Ayala FJ, Jamonneau V. Population genetics of trypanosoma brucei gambiense, the agent of sleeping sickness in western africa. Proceedings of the National Academy of Sciences of the United States of America. 2009;106:209–214. doi: 10.1073/pnas.0811080106.
  25. Lewis MD, Llewellyn MS, Yeo M, Acosta N, Gaunt MW, Miles MA, Carlton JM. Recent, independent and anthropogenic origins of trypanosoma cruzi hybrids. PLoS Neglected Tropical Diseases. 2011;5:e1363. doi: 10.1371/journal.pntd.0001363.
  26. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  27. Lin S, Chakravarti A, Cutler DJ. Haplotype and missing data inference in nuclear families. Genome Research. 2004;14:1624–1632. doi: 10.1101/gr.2204604.
  28. MacLeod A, Tweedie A, McLellan S, Taylor S, Hall N, Berriman M, El-Sayed NM, Hope M, Turner CM, Tait A. The genetic map and comparative analysis with the physical map of trypanosoma brucei. Nucleic Acids Research. 2005;33:6688–6693. doi: 10.1093/nar/gki980.
  29. Mark Welch D, Meselson M. Evidence for the evolution of bdelloid rotifers without sexual reproduction or genetic exchange. Science. 2000;288:1211–1215. doi: 10.1126/science.288.5469.1211.
  30. Mark Welch DB, Mark Welch JL, Meselson M. Evidence for degenerate tetraploidy in bdelloid rotifers. Proceedings of the National Academy of Sciences of the United States of America. 2008;105:5145–5149. doi: 10.1073/pnas.0800972105.
  31. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HYK, Leng J, Li R, Li Y, Lin C-Y, Luo R, Mu XJ, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO. Mapping copy number variation by population-scale genome sequencing. Nature. 2011;470:59–65. doi: 10.1038/nature09708.
  32. Morrison LJ, Tait A, McCormack G, Sweeney L, Black A, Truc P, Likeufack ACL, Turner CM, MacLeod A. Trypanosoma brucei gambiense type 1 populations from human patients are clonal and display geographical genetic differentiation. Infection, Genetics and Evolution. 2008;8:847–854. doi: 10.1016/j.meegid.2008.08.005.
  33. Ogbadoyi E, Ersfeld K, Robinson D, Sherwin T, Gull K. Architecture of the trypanosoma brucei nucleus during interphase and mitosis. Chromosoma. 2000;108:501–513. doi: 10.1007/s004120050402.
  34. Oliver R. The problem of the bantu expansion. The Journal of African History. 1966;7:361–376. doi: 10.1017/S0021853700006472.
  35. Omilian AR, Cristescu MEA, Dudycha JL, Lynch M. Ameiotic recombination in asexual lineages of daphnia. Proceedings of the National Academy of Sciences of the United States of America. 2006;103:18638–18643. doi: 10.1073/pnas.0606435103.
  36. Peacock L, Bailey M, Carrington M, Gibson W. Meiosis and haploid gametes in the pathogen trypanosoma brucei. Current Biology. 2014;24:181–186. doi: 10.1016/j.cub.2013.11.044.
  37. Ramírez JD, Llewellyn MS. Reproductive clonality in protozoan pathogens-truth or artefact? Molecular Ecology. 2014;23:4195–4202. doi: 10.1111/mec.12872.
  38. Schurko AM, Neiman M, Logsdon JM. Signs of sex: what we know and how we know it. Trends in Ecology & Evolution. 2009;24:208–217. doi: 10.1016/j.tree.2008.11.010.
  39. Schwander T, Henry L, Crespi BJ. Molecular evidence for ancient asexuality in timema stick insects. Current Biology. 2011;21:1129–1134. doi: 10.1016/j.cub.2011.05.026.
  40. Signorovitch A, Hur J, Gladyshev E, Meselson M. Allele sharing and evidence for sexuality in a mitochondrial clade of bdelloid rotifers. Genetics. 2015;200:581–590. doi: 10.1534/genetics.115.176719.
  41. Simarro PP, Cecchi G, Paone M, Franco JR, Diarra A, Ruiz JA, Fèvre EM, Courtin F, Mattioli RC, Jannin JG. The atlas of human african trypanosomiasis: a contribution to global mapping of neglected tropical diseases. International Journal of Health Geographics. 2010;9:57–75. doi: 10.1186/1476-072X-9-57.
  42. Sistrom M, Evans B, Bjornson R, Gibson W, Balmer O, Maser P, Aksoy S, Caccone A. Comparative genomics reveals multiple genetic backgrounds of human pathogenicity in the trypanosoma brucei complex. Genome Biology and Evolution. 2014;6:2811–2819. doi: 10.1093/gbe/evu222.
  43. Maynard Smith J. Evolution: contemplating life without sex. Nature. 1986;324:300–301. doi: 10.1038/324300a0.
  44. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  45. Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics. 2013;14:178–192. doi: 10.1093/bib/bbs017.
  46. Tibayrenc M, Kjellberg F, Ayala FJ. A clonal theory of parasitic protozoa: the population structures of entamoeba, giardia, leishmania, naegleria, plasmodium, trichomonas, and trypanosoma and their medical and taxonomical consequences. Proceedings of the National Academy of Sciences of the United States of America. 1990;87:2414–2418. doi: 10.1073/pnas.87.7.2414.
  47. Tibayrenc M, Ayala FJ. The clonal theory of parasitic protozoa: 12 years on. Trends in Parasitology. 2002;18:405–410. doi: 10.1016/S1471-4922(02)02357-7.
  48. Tibayrenc M, Ayala FJ. Reproductive clonality of pathogens: a perspective on pathogenic viruses, bacteria, fungi, and parasitic protozoa. Proceedings of the National Academy of Sciences of the United States of America. 2012;109:E3305–E3313. doi: 10.1073/pnas.1212452109.
  49. Tucker AE, Ackerman MS, Eads BD, Xu S, Lynch M. Population-genomic insights into the evolutionary origin and fate of obligately asexual daphnia pulex. Proceedings of the National Academy of Sciences of the United States of America. 2013;110:15740–15745. doi: 10.1073/pnas.1313388110.
  50. Uzureau P, Uzureau S, Lecordier L, Fontaine F, Tebabi P, Homblé F, Grélard A, Zhendre V, Nolan DP, Lins L, Crowet J-M, Pays A, Felu C, Poelvoorde P, Vanhollebeke B, Moestrup SK, Lyngsø J, Pedersen JS, Mottram JC, Dufourc EJ, Pérez-Morga D, Pays E. Mechanism of trypanosoma brucei gambiense resistance to human serum. Nature. 2013;501:430–434. doi: 10.1038/nature12516.
  51. Van Xong H, Vanhamme L, Chamekh M, Chimfwembe CE, Van Den Abbeele J, Pays A, Van Meirvenne N, Hamers R, De Baetselier P, Pays E. A VSG expression site–associated gene confers resistance to human serum in trypanosoma rhodesiense. Cell. 1998;95:839–846. doi: 10.1016/S0092-8674(00)81706-7.
eLife. 2016 Jan 26;5:e11473. doi: 10.7554/eLife.11473.024

Decision letter

Editor: Dominique Soldati-Favre

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for submitting your work entitled "Population genomics reveals the origin and asexual evolution of human infective trypanosomes" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Dominique Soldati-Favre as the Reviewing Editor and Diethard Tautz as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The study reports a population genomics approach applied to the examination of the irreversible accumulation of mutations in the asexually reproducing diploid Trypanosoma brucei gambiense. The authors provide persuasive evidence for the absence of sex and recombination during a substantial period of the evolution of T. brucei gambiense. This manuscript features a spectacular case of the Meselson effect at a genome-wide level. It clearly shows divergence between homologous chromosomes and loss of heterozygosity that the authors interpret as a compensatory mechanism for counteracting deleterious mutations.

As well as being important for understanding the evolution of this pathogen, it may become a textbook example of the population genetics of an asexual diploid, thus advancing the field of evolutionary biology.

Essential revisions:

The analysis is very comprehensive and accessible to non-specialists and the reviewers mainly identified some minor issues. However, the results presented here are far from being expected, since until now, this phenomenon has not been observed in other asexual organisms. In consequence, the authors should discuss the Meselson effect story much more extensively, taking the following points into consideration:

The authors somewhat underplay the surprise of this, characterizing the Meselson effect as "predicted". I'd strongly suggest a slightly fuller account of the history of work on the Meselson effect to provide clearer context of the importance of the present paper. Although the Meselson effect was indeed predicted at one time (Birky, 1996; Butlin, 2002; Welch and Meselson, 2000; Judson and Normark, 1996), empirically it has been elusive. The original empirical report of the Meselson effect in bdelloid rotifers (Welch and Meselson, 2000) was later shown to have been due to a different phenomenon entirely: cryptic tetraploidy (Mark Welch, D. B., J. L. Mark Welch, and M. Meselson. 2008. Evidence for degenerate tetraploidy in bdelloid rotifers. Proc. Natl. Acad. Sci. USA 105:5145-5149). The Meselson effect was also found to be absent in obligately apomictic Daphnia, due to the high rate of mitotic recombination in comparison to the mutation rate (Omilian, A., M. Cristescu, J. Dudycha, and M. Lynch. 2006. Ameiotic recombination in asexual lineages of Daphnia. Proc. Natl. Acad. Sci. USA 103:18638-18643). The Meselson effect appeared to require an unrealistically high ratio of mutation to mitotic recombination. The present manuscript is the most thorough and persuasive of the trickle of recent studies indicating that the Meselson effect can in fact happen, in lineages with a sufficiently low rate of mitotic recombination (cf. Schwander, T., L. Henry, and B. J. Crespi. 2011. Molecular Evidence for Ancient Asexuality in Timema Stick Insects. Curr. Biol. 21:1129-1134.) Indeed the swathes of Loss-of-Heterozygosity give a vivid sense of the interplay between recombination and mutation, in a lineage in which recombination is rare enough not to have erased the whole record.

Remarks:

LOH, generated by gene conversion, is hypothesized to be a mechanism for removing deleterious alleles. This seems plausible. However, to determine that LOH events are selected rather than neutral, a more comprehensive analysis of their distribution, size, gene content and rate of accumulation across time would be informative. For example, I wonder if it is possible to demonstrate that LOH specifically occurs in regions containing phylogenetically conserved genes that are intolerant to mutation accumulation. This is not critical for the current report, but would be well worth following up.

The fine-scale congruence between the phylogenies of the A and B genomes, documented in Figure 4, is impressive and goes a long way towards demonstrating that the observed heterozygosity represents evolution within the asexual lineage and is not inherited from, say, a hybrid ancestor. (Initially I was puzzled that it looked like there was incongruence between the chromosomes, but closer inspection shows that there is no such incongruence, just sporadic omission of isolates that failed to meet the 1000 SNP/chromosome cutoff.)

Author contributions – one author is listed who "commented on the analysis and manuscript". Acknowledgements would be more appropriate here unless a more substantial contribution can be provided.

eLife. 2016 Jan 26;5:e11473. doi: 10.7554/eLife.11473.025

Author response


Essential revisions: The analysis is very comprehensive and accessible to non-specialists, and the reviewers mainly identified some minor issues. However, the results presented here are far from expected, since until now this phenomenon has not been observed in other asexual organisms. Consequently, the authors should discuss the history of the Meselson effect much more extensively, taking the following points into consideration:

The authors somewhat underplay the surprise of this, characterizing the Meselson effect as "predicted". I'd strongly suggest a slightly fuller account of the history of work on the Meselson effect to provide clearer context of the importance of the present paper. Although the Meselson effect was indeed predicted at one time (Birky, 1996; Butlin, 2002; Welch and Meselson, 2000; Judson and Normark, 1996), empirically it has been elusive. The original empirical report of the Meselson effect in bdelloid rotifers (Welch and Meselson, 2000) was later shown to have been due to a different phenomenon entirely: cryptic tetraploidy (Mark Welch, D. B., J. L. Mark Welch, and M. Meselson. 2008. Evidence for degenerate tetraploidy in bdelloid rotifers. Proc. Natl. Acad. Sci. USA 105:5145-5149). The Meselson effect was also found to be absent in obligately apomictic Daphnia, due to the high rate of mitotic recombination in comparison to the mutation rate (Omilian, A., M. Cristescu, J. Dudycha, and M. Lynch. 2006. Ameiotic recombination in asexual lineages of Daphnia. Proc. Natl. Acad. Sci. USA 103:18638-18643). The Meselson effect appeared to require an unrealistically high ratio of mutation to mitotic recombination. The present manuscript is the most thorough and persuasive of the trickle of recent studies indicating that the Meselson effect can in fact happen, in lineages with a sufficiently low rate of mitotic recombination (cf. Schwander, T., L. Henry, and B. J. Crespi. 2011. Molecular Evidence for Ancient Asexuality in Timema Stick Insects. Curr. Biol. 21:1129-1134). Indeed, the swathes of loss of heterozygosity give a vivid sense of the interplay between recombination and mutation, in a lineage in which recombination is rare enough not to have erased the whole record.

We now discuss the history of the Meselson effect as requested (Discussion):

“Our data indicates the parasite population comprises just two independently evolving haplotypes; this remarkable observation confirms the Meselson effect at a whole-genome level for the first time. […] In that system, as in T.b. gambiense Group 1, the mitotic recombination rate is sufficiently low so as not to obscure the pattern of accumulating mutations, and this has underpinned our ability to detect and confirm the Meselson effect.”

Remarks: LOH, generated by gene conversion, is hypothesized to be a mechanism for removing deleterious alleles. This seems plausible. However, to determine that LOH events are selected rather than neutral, a more comprehensive analysis of their distribution, size, gene content and rate of accumulation across time would be informative. For example, I wonder if it is possible to demonstrate that LOH specifically occurs in regions containing phylogenetically conserved genes that are intolerant to mutation accumulation. This is not critical for the current report, but would be well worth following up.

We thank the editor/reviewers for this comment. We are currently performing further analysis on the LOH regions of the genome to investigate this issue and we hope to publish the results in the early New Year.

The fine-scale congruence between the phylogenies of the A and B genomes, documented in Figure 4, is impressive and goes a long way towards demonstrating that the observed heterozygosity represents evolution within the asexual lineage and is not inherited from, say, a hybrid ancestor. (Initially I was puzzled that it looked like there was incongruence between the chromosomes, but closer inspection shows that there is no such incongruence, just sporadic omission of isolates that failed to meet the 1000 SNP/chromosome cutoff.)

We again thank the editor/reviewers for this comment; we are very pleased that the A/B genome phylogeny comparison provides such clear evidence for haplotype co-evolution.

Author contributions – one author is listed who "commented on the analysis and manuscript". Acknowledgements would be more appropriate here unless a more substantial contribution can be provided.

This author’s contribution is more substantial than indicated: he provided expert support on applying classical population genetic analysis to genomic-scale data and provided extensive input on the direction of the manuscript, which focused our analysis on pursuing the Meselson effect.

Associated Data

    Supplementary Materials

    Figure 1—source data 1. Number of SNP loci with respect to different sub-species.

    DOI: http://dx.doi.org/10.7554/eLife.11473.004

    Figure 1—source data 2. Testing Hardy-Weinberg Equilibrium (HWE) across the T.b. gambiense Group 1 genome.

    DOI: http://dx.doi.org/10.7554/eLife.11473.005

    Figure 1—source data 3. FIS by sub-population.

    DOI: http://dx.doi.org/10.7554/eLife.11473.006

    Figure 1—source data 4. Number and type of T.b. gambiense Group 1 SNPs.

    DOI: http://dx.doi.org/10.7554/eLife.11473.007

    Figure 3—source data 1. Estimated time since the most recent common ancestor.

    DOI: http://dx.doi.org/10.7554/eLife.11473.015

    Figure 4—source data 1. Co-phylogenetic analysis.

    DOI: http://dx.doi.org/10.7554/eLife.11473.017

    Supplementary file 1. Isolates used in this study.

    For each isolate, the year of isolation, host, country and location are given, along with the result of the Blood Incubation Infectivity Test (BIIT), which determines human infectivity. The presence or absence of TgSGP (the T.b. gambiense Group 1 human serum resistance gene) and SRA (the T.b. rhodesiense human serum resistance gene) is indicated. The majority of samples in this study were T.b. gambiense Group 1, details of which have been previously published (Heitman, 2006; Thorvaldsdottir et al., 2013).

    DOI: http://dx.doi.org/10.7554/eLife.11473.021

    elife-11473-supp1.pdf (108.2KB, pdf)

    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
