Abstract
Schistosomes are obligately sexual blood flukes that can be maintained in the laboratory using freshwater snails as intermediate and rodents as definitive hosts. The genetic composition of laboratory schistosome populations is poorly understood: whether genetic variation has been purged due to serial inbreeding or retained is unclear. We sequenced 19 – 24 parasites from each of five laboratory Schistosoma mansoni populations and compared their genomes with published exome data from four S. mansoni field populations. We found abundant genomic variation (0.897 – 1.22 million variants) within laboratory populations: these carried on average 62% (π = 1.52e-04 – 7.15e-04) less nucleotide diversity than the four field parasite populations (π = 9.06e-03 – 2.24e-03). However, the pattern of variation was very different in laboratory and field populations. Tajima’s D was positive in all laboratory populations (except SmBRE), indicative of recent population bottlenecks, but negative in all field populations. Current effective population size estimates of laboratory populations were lower (2 – 258) compared to field populations (3,174 – infinity). The distance between markers at which linkage disequilibrium (LD) decayed to 0.5 was longer in laboratory populations (59 bp – 271 kb) compared to field populations (9 bp – 17.1 kb). SmBRE was the least variable laboratory population; this parasite also shows low fitness across the lifecycle, consistent with inbreeding depression. 
The abundant genetic variation present in most laboratory schistosome populations has several important implications: (i) measurement of parasite phenotypes, such as drug resistance, using laboratory parasite populations will determine average values and underestimate trait variation; (ii) genome-wide association studies (GWAS) can be conducted in laboratory schistosome populations by measuring phenotypes and genotypes of individual worms; (iii) genetic drift may lead to divergence in schistosome populations maintained in different laboratories. We conclude that the abundant genetic variation retained within many laboratory schistosome populations can provide valuable, untapped opportunities for schistosome research.
Author summary
Most protozoan, bacterial or viral pathogens are maintained as asexual lineages in the laboratory. This simplifies research, because single pathogen genotypes can be studied. In contrast, many helminth parasites have two sexes and are maintained as obligately sexual populations. These populations are often referred to as “strains” and treated akin to clonal bacteria, but the level of genetic variation retained within populations is typically unknown. The blood fluke Schistosoma mansoni (Phylum: Platyhelminthes; Class: Trematoda) is a commonly used laboratory helminth. We sequenced individual parasites from five different laboratory populations of Schistosoma mansoni maintained for up to 80 years in the laboratory to directly measure genetic variation. We found abundant variation in four of five laboratory populations studied, with up to 1.22 million variants per population; variation in the five laboratory populations was 62% less than that present in the four field collected populations. The abundant variation retained provides both opportunities and issues for researchers. On the positive side, genetically variable laboratory populations can be used to examine the genetic basis of important parasite traits, such as drug resistance or host specificity, using genome-wide association. On the negative side, measurement of parasite traits in worm populations will determine average values and underestimate trait variation for individual parasites. Furthermore, parasite populations may diverge due to genetic drift, resulting in poor repeatability between studies.
Introduction
Many viral, bacterial and protozoan pathogens can be cloned and maintained as asexual lineages in the laboratory. This has many advantages for research because experimental infections can be established using genetically homogeneous pathogens, and differences in biomedically important pathogen traits can be directly attributed to genetic differences between pathogen clones. In contrast, most helminth parasites (e.g., Trichuris spp, Heligmosomoides spp, filarial nematodes) used in biomedical research are obligately sexual. For example, our laboratory focuses on the blood fluke, Schistosoma mansoni, which has separate sexes (males are ZZ; females are ZW): these parasites are maintained as meiotically recombining populations in the laboratory, and individual parasites from each population may differ in genotype.
Cryopreservation of many viral, bacterial or protozoan pathogens simplifies research on these pathogens. However, while cryopreservation has been reported for schistosomes [1], it is inconsistent and cannot be used reliably for maintaining schistosome populations. Schistosome populations are therefore typically maintained in the laboratory by continuous passage through the aquatic snail intermediate host, where asexual proliferation of larval stages occurs, and the rodent definitive host, where adult males and females pair, meiosis and recombination occur and eggs are produced.
Schistosome populations have been maintained in the laboratory for up to 80 years [2]. For example, the SmNMRI parasite population maintained by the Biomedical Research Institute (BRI) [3] was originally isolated from Puerto Rico in the 1940s [2]. Our laboratory maintains four different parasite populations: SmEG from Egypt, collected at an undetermined date (possibly in the 1980s) by US researchers and then established at the Theodor Bilharz Research Institute in Cairo in 1990 [4,5]; SmLE, isolated in Brazil in 1965 [2]; SmBRE, acquired from Brazil in 1975 [6]; and SmOR, a descendant of SmHR, which was isolated in Puerto Rico in 1971 [7]. Assuming five complete parasite generations per year (sexual and asexual), these parasite populations have been maintained continuously for ~400 (SmNMRI), ~160 (SmEG), ~285 (SmLE), ~235 (SmBRE), and ~270 (SmOR) generations.
The genomic consequences of long-term laboratory passage in schistosomes are not known, but several authors investigated this question in the pre-genomic era. Fletcher et al. [8] examined enzyme polymorphism at 18 loci in individual worms. They measured mean heterozygosity per locus and observed that genetic variation within laboratory populations maintained for 1–40 generations was approximately half that observed in fresh parasite isolates. Minchella et al. [9] quantified genetic variation in a maternally inherited DNA element (pSM750) using restriction fragment length polymorphism (RFLP) of individual parasites from 14 laboratory isolates. They noted that parasites from the same laboratory isolate generally showed low variability. However, SmNMRI parasites exhibited extensive variation. Pinto et al. [10] used random amplified polymorphic DNA (RAPD) analysis with three different primer sets and found no variation between worms from a laboratory isolate (SmLE), but extensive variation among parasites derived from different Brazilian patients. Hence, these studies reached rather different conclusions.
Efforts to sequence the genome of S. mansoni provided further insights. The S. mansoni genome was initially sequenced from pools of parasites from the SmNMRI population [11]. The genetic variation present within these pools contributed to an imperfect genome assembly: the resultant assembly was fragmented into > 19,000 scaffolds [11,12]. As a consequence, subsequent work used DNA isolated from worms of a single genotype, produced by infecting snails with single miracidium larvae. This approach contributed to a much improved genome assembly, closing more than 40,000 gaps and assigning 81% of the data to chromosomes [13]. The current version 10 of the S. mansoni genome assembly was further improved using libraries constructed from schistosome infections derived from single miracidia; it comprises seven autosomes plus the Z and W sex chromosomes, and is ~391 Mb in size [14].
Phenotypic data provide further evidence that parasite populations may not be homogeneous. Davies et al. isolated parasites that shed low or high numbers of schistosome larvae (cercariae) from the snail host from the SmPR population [15]. Furthermore, they were able to select low and high shedding populations [16], indicating this phenotypic variation has a genetic basis. Similarly, Le Clec’h et al. [17] demonstrated that the SmLE-PZQ-R population, which was selected for resistance to praziquantel (PZQ) from the Brazilian SmLE population, comprises a mixture of PZQ-resistant and PZQ-sensitive parasites and retains abundant variation across the genome [18].
The present study was designed to directly measure genomic variation within five laboratory schistosome populations. We speculated that either i) a low number of founders or inbreeding due to repeated laboratory passage could result in bottlenecks and therefore a loss of genetic variation or ii) sexual outbreeding could be sufficient to retain high levels of genetic variation. We generated 117 independent genome sequences from four schistosome populations maintained in our laboratory and from the widely used SmNMRI population maintained at the BRI. We compared variation in these laboratory populations with published exome sequence data from field collected S. mansoni parasites from Brazil, Niger, Senegal, and Tanzania [19]. We observed abundant genetic variation within laboratory populations, albeit less than half the variation observed in field collected parasites. However, laboratory and field collected parasites showed dramatic differences in pattern of variation, including the allele frequency spectrum, linkage disequilibrium, and effective population size (Ne). We evaluate the implications of these results for schistosome research.
Results
Summary of sequence data
We sequenced the genomes of 117 independent S. mansoni genotypes from five populations. This was done by sequencing cercarial larvae shed from each of 19–24 snails infected with single miracidia (for SmNMRI, SmOR, SmEG and SmLE) and by sequencing individual adult worms (for SmBRE). We retained 103 of 117 generated genome sequences from laboratory samples after quality filtering: 18 for SmNMRI, 20 for SmOR, 18 for SmBRE, 24 for SmEG, and 23 for SmLE. The mean read depth for these samples was 32.8x (range: 10.0 – 143.8x), and we discovered 0.897 – 1.22 million variants (single nucleotide variants and indels) in the laboratory populations. Detailed information about these variants is listed in Table 1.
Table 1. Summary statistics of laboratory populations.
| Population | No. parasite genotypes | Mean coverage (range) | Total variants | SNVs | INDELS1 | Autosomal SNPs | Mitochondrial SNPs | SNPs MAF > 0.05 |
|---|---|---|---|---|---|---|---|---|
| BRE | 20 | 71.1 (47.8, 143.8) | 8.97E+05 | 8.11E+05 | 8.55E+04 | 7.37E+05 | 7 | 1.26E+05 |
| EG | 24 | 24.8 (17.3, 38.3) | 1.22E+06 | 1.11E+06 | 1.10E+05 | 1.03E+06 | 9 | 8.69E+05 |
| LE | 24 | 23.4 (10.5, 44.5) | 1.01E+06 | 9.15E+05 | 9.65E+04 | 8.62E+05 | 7 | 5.23E+05 |
| NMRI | 19 | 26.3 (15.9, 38.5) | 1.08E+06 | 9.83E+05 | 9.35E+04 | 9.36E+05 | 2 | 7.23E+05 |
| OR | 21 | 24.4 (10.0, 42.2) | 1.07E+06 | 9.55E+05 | 1.19E+05 | 9.23E+05 | 5 | 6.40E+05 |

| Population | Number of samples | Synonymous coding | Non-synonymous coding | Intron | Intergenic |
|---|---|---|---|---|---|
| BRE | 20 | 9.65E+03 | 1.33E+04 | 4.64E+05 | 4.27E+05 |
| EG | 24 | 1.30E+04 | 1.53E+04 | 6.19E+05 | 5.95E+05 |
| LE | 24 | 1.03E+04 | 1.26E+04 | 5.12E+05 | 4.96E+05 |
| NMRI | 19 | 1.08E+04 | 1.33E+04 | 5.57E+05 | 5.20E+05 |
| OR | 21 | 1.10E+04 | 1.33E+04 | 5.51E+05 | 5.20E+05 |

1Mean INDEL size = -98, range (-369, 406).
In addition, we utilized 124 previously generated exome sequences from field collected parasites from Brazil (n = 46), Niger (n = 9), Senegal (n = 23), and Tanzania (n = 45) [19]. To make field and laboratory samples directly comparable, we filtered genotyped laboratory and field samples jointly, keeping only variants that were genotyped in ≥ 80% of samples in each of the laboratory and field populations. This resulted in 131,207 autosomal variants (Table 2). Coverage statistics for each sample are listed in S1 Table.
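The joint call-rate filter described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the data layout (one dictionary of genotype calls per variant, with `None` marking a missing call) and the function name `passes_call_rate` are hypothetical.

```python
from collections import defaultdict

def passes_call_rate(genotypes, sample_pop, min_rate=0.8):
    """Return True if a variant is genotyped in >= min_rate of the samples
    in every population.

    genotypes  : dict sample -> genotype call, None = missing (hypothetical layout)
    sample_pop : dict sample -> population label
    """
    called = defaultdict(int)
    total = defaultdict(int)
    for sample, pop in sample_pop.items():
        total[pop] += 1
        if genotypes.get(sample) is not None:
            called[pop] += 1
    # The threshold must hold in each population separately
    return all(called[pop] / total[pop] >= min_rate for pop in total)

# Toy example: two populations of five samples each
sample_pop = {f"lab{i}": "SmEG" for i in range(5)}
sample_pop.update({f"field{i}": "Brazil" for i in range(5)})

variant_ok = {s: "0/1" for s in sample_pop}
variant_ok["lab0"] = None            # 4/5 called in SmEG, 5/5 in Brazil
variant_bad = dict(variant_ok)
variant_bad["field0"] = None
variant_bad["field1"] = None         # only 3/5 called in Brazil

kept = [passes_call_rate(v, sample_pop) for v in (variant_ok, variant_bad)]
```

Applying the threshold per population, rather than across all samples pooled, prevents a variant that is well covered in the large populations but poorly covered in a small one (such as Niger, n = 9) from passing the filter.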
Table 2. Summary of variants used for comparison of laboratory and field collected schistosomes.
| Population | Number of samples | Autosomal SNPs in CDS region | MAF filtered (> 0.05) |
|---|---|---|---|
| BRE | 18 | 6,790 | 399 |
| EG | 24 | 9,186 | 7,448 |
| LE | 23 | 7,203 | 4,040 |
| NMRI | 18 | 7,788 | 6,007 |
| OR | 20 | 7,669 | 5,283 |
| Brazil | 46 | 22,719 | 11,499 |
| Niger1 | 9 | 17,848 | 16,784 |
| Senegal | 23 | 27,820 | 7,260 |
| Tanzania | 45 | 71,865 | 22,433 |
1MAF filtering removes variants present in just one copy (“singletons”) in all populations except Niger. Presence of singletons may result in overestimation of variation in Niger.
Principal component analysis (PCA) and admixture
We generated a PCA plot using 1.24 million MAF filtered, autosomal variants (MAF > 0.05) from our laboratory genome sequences (Fig 1A). This analysis identified five distinct clusters. While SmOR, SmEG, SmNMRI, and SmLE were separated along principal component 2, they clustered along principal component 1 and were distant from SmBRE. We used ADMIXTURE and plotted five populations, as k = 5 resulted in the smallest cross-validation error (Fig 1B). This analysis confirmed the presence of five schistosome populations with distinct allelic components.
Fig 1. Population structure in S. mansoni laboratory populations.
(A) PCA plot showing clustering of sequenced S. mansoni laboratory populations. (B) Admixture analysis with k = 5 populations.
We also generated a PCA plot containing both field and laboratory populations using 131,207 exon variants only (S1 Fig). Principal component 1 explains 7.84% of the variation and separates the Tanzanian field population from the other 8 populations. Principal component 2 explains 2.05% of the variation with SmBRE on one extreme and a cluster containing two west African field populations (Niger and Senegal) on the other. All remaining lab (SmEG, SmLE, SmOR, SmNMRI) and field (Brazil) populations form a closely related intermediate cluster.
Nucleotide diversity in S. mansoni laboratory and field populations
The distribution of SNP variation across the genome in the laboratory populations is shown in Fig 2A. We calculated nucleotide diversity (π) in 25 kb windows (Fig 2B). Mean SNP numbers were similar across the chromosomes, but we observed an increase in the variance in SNP numbers at the chromosome ends (S2 Fig). This analysis revealed minimal diversity in the SmBRE population. While SmBRE had 1.26E+05 segregating SNPs (MAF > 0.05), equivalent numbers for the other laboratory populations were 8.69E+05 (SmEG), 5.23E+05 (SmLE), 6.40E+05 (SmOR) and 7.23E+05 (SmNMRI) (Table 1).
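For readers unfamiliar with windowed π, the calculation can be sketched as below. This is an illustrative simplification, not the software used for the analysis: it assumes biallelic SNPs, no missing genotypes, and that every base in a window is callable. The function name `windowed_pi` and the input layout are hypothetical.

```python
from collections import defaultdict

def windowed_pi(sites, n_chrom, window=25_000):
    """Nucleotide diversity per window.

    sites   : iterable of (position, alt_allele_count) for biallelic SNPs
    n_chrom : number of sampled chromosomes (2 x number of diploid parasites)
    Returns a dict mapping window start (1-based) -> pi per site.
    """
    sums = defaultdict(float)
    for pos, k in sites:
        # Probability that two randomly drawn chromosomes differ at this site
        site_pi = 2 * k * (n_chrom - k) / (n_chrom * (n_chrom - 1))
        sums[(pos - 1) // window] += site_pi
    # Divide by window length to express diversity per base pair
    return {w * window + 1: total / window for w, total in sorted(sums.items())}

# Toy example with 10 chromosomes: two SNPs in the first window,
# one fixed site (alt count = n) in the second window
pi = windowed_pi([(100, 5), (200, 1), (30_000, 10)], n_chrom=10)
```

Windows containing no listed sites are simply absent from the result; a full analysis would also report them as π = 0 and correct the denominator for uncallable bases.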
Fig 2. Comparable nucleotide diversity in field and laboratory populations.
(A) Average nucleotide diversity (π) across the whole genome for each laboratory population calculated in 25 kb windows and plotted for each autosome. The line indicates a LOESS smoothed curve. (B) Box and whisker plot showing nucleotide diversity (π) in 25 kb windows across the CDS in laboratory and field populations. Outliers are not shown. A Wilcoxon test compares π from laboratory and field populations. (C) Histogram showing numbers of derived fixed alleles in each of the laboratory and field populations. A Wilcoxon test compares numbers of derived fixed alleles from laboratory and field populations.
We compared diversity in laboratory and field samples using a filtered dataset (Fig 2B). Laboratory populations (π = 1.52e-04 – 7.15e-04) showed 62% lower diversity than field populations (π = 9.06e-03 – 2.24e-03) (W = 360188499, p < 0.001). As previously documented, samples from Tanzania had the highest nucleotide diversity of all populations [19]. When we removed the Tanzania field population, genetic diversity in laboratory populations was 50% of that found in field populations. The laboratory populations originating in Brazil (SmBRE and SmLE) showed 85% and 54% less variation, respectively, than the Brazilian field population (Brazil).
We calculated the number of fixed derived variants in each population (Fig 2C) by identifying sites at fixation in S. mansoni populations, but absent from the outgroup (S. rodhaini). We observed more fixed derived SNPs in laboratory populations than field populations (Wilcoxon test, W = 2, p = 0.063). SmBRE, which showed the lowest diversity, showed the most fixed SNPs, while SmEG, which showed the highest diversity among the laboratory populations, carried the fewest.
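With only five laboratory and four field populations, the population-level Wilcoxon comparisons in this section can take only a small set of p-values. The sketch below enumerates the exact null distribution; it assumes that the reported W is the Mann-Whitney U statistic for one group, that the test is two-sided, and that there are no ties. The function name `exact_rank_sum_p` is hypothetical.

```python
from itertools import combinations

def exact_rank_sum_p(u, n1, n2):
    """Two-sided exact p-value for a Mann-Whitney U statistic (no ties).

    u is the U statistic of the first group, which has n1 observations.
    """
    ranks = range(1, n1 + n2 + 1)
    offset = n1 * (n1 + 1) // 2
    # U for every possible assignment of ranks to the first group
    null = [sum(c) - offset for c in combinations(ranks, n1)]
    lower = sum(v <= u for v in null) / len(null)
    upper = sum(v >= u for v in null) / len(null)
    return min(1.0, 2 * min(lower, upper))

p_w2 = exact_rank_sum_p(2, 5, 4)     # statistic reported for fixed derived SNPs
p_w17 = exact_rank_sum_p(17, 5, 4)   # statistic reported for Tajima's D and LD0.5
p_min = exact_rank_sum_p(0, 5, 4)    # complete separation of the two groups
```

Under these assumptions the enumeration gives 8/126 for W = 2 and 14/126 for W = 17, matching the reported p = 0.063 and p = 0.111, and shows that even complete separation of five versus four populations cannot yield a p-value below 2/126.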
We also plotted the average number of nucleotide differences per site (DXY) for all pairwise combinations of populations to identify population specific regions of high or low diversity (S3 Fig). This includes 20 comparisons between laboratory and field populations, 6 comparisons between field populations and 10 comparisons between laboratory populations. These plots reveal genome regions on chromosomes 2, 3, 4 and 6 with elevated DXY values. The top 50 pairwise comparisons identify five 25 kb windows, which together contain six genes, including a mini-exon gene (Smp_159830; MEG 2.3 isoform 1) (S3 Fig).
Tajima’s D and allele frequency distributions
While four of five laboratory populations exhibited a positive Tajima’s D, all field populations showed negative values (Fig 3A; Wilcoxon test; W = 17, p = 0.111). The exception was SmBRE, which had a negative Tajima’s D like the field populations. We inspected allele frequency spectra in each population to better understand why Tajima’s D differs between populations. This revealed that SNPs at intermediate frequencies were common in SmEG, SmLE, SmOR, and SmNMRI, whereas field populations (and SmBRE) had a high frequency of rare alleles (S4 Fig). We plotted the empirical cumulative distribution function (ECDF) of allele frequencies for each population (Fig 3B) and measured Kolmogorov-Smirnov statistics for all pairwise comparisons. The observed mean statistic for comparisons of field and laboratory populations was higher than that observed in all but 12 of 10,000 permutations (one-sided p-value = 0.0012), demonstrating significant differences in the allele frequency spectra of field and laboratory parasite populations.
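The link between the allele frequency spectrum and the sign of Tajima’s D can be illustrated with a short calculation. The sketch below implements the standard Tajima (1989) estimator from alternate-allele counts at segregating sites; it is not the software used in the study, the function name `tajimas_d` is hypothetical, and the two toy spectra are invented purely for illustration.

```python
from math import sqrt

def tajimas_d(alt_counts, n):
    """Tajima's D from alt-allele counts at segregating biallelic sites.

    alt_counts : list of alternate-allele counts, one per segregating site
    n          : number of sampled chromosomes
    """
    S = len(alt_counts)
    # Mean pairwise differences summed over sites
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in alt_counts)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between pi and Watterson's estimator, scaled by its standard deviation
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy spectra for 10 chromosomes and 20 segregating sites
d_rare = tajimas_d([1] * 20, 10)   # all singletons, as after population expansion
d_mid = tajimas_d([5] * 20, 10)    # all intermediate-frequency alleles, as after a bottleneck
```

An excess of rare alleles pulls π below Watterson’s estimator and gives a negative D, whereas an excess of intermediate-frequency alleles does the reverse.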
Fig 3. Indicators of recent bottlenecks in laboratory populations.
(A) Bar plots showing mean and standard error of Tajima’s D in each population. We used a Wilcoxon test to compare means of Tajima’s D in field and laboratory populations. (B) Line plot showing the empirical cumulative distribution function (ECDF) of allele frequencies in each population. A permutation-based Kolmogorov-Smirnov test was used to compare field vs laboratory distributions (see S4 Fig).
Linkage disequilibrium in laboratory and field populations
We calculated linkage disequilibrium (LD) for each S. mansoni population and estimated LD decay with physical distance between markers from pairwise r² values. As we only retained 399 common (MAF > 0.05) exonic SNPs in the SmBRE population, we used all autosomal variants to calculate LD decay in the laboratory populations. Fig 4A shows slower LD decay in four out of the five laboratory populations compared to the field populations. To compare LD decay curves, we measured the distance at which LD is reduced to r² = 0.5 (LD0.5, Fig 4B) [20]. LD decayed extremely rapidly in the Tanzanian parasite population (LD0.5 = 9 bp). LD decayed uniformly in the Nigerien, Senegalese, and Brazilian populations, with LD0.5 ranging from 1,000 to 9,543 bp. LD decay was slower in the laboratory populations (Wilcoxon test, W = 17, p = 0.111), with LD0.5 ranging from 72 kb to 180 kb in SmEG, SmLE, SmNMRI, and SmOR. In stark contrast to other laboratory populations, SmBRE exhibited very rapid LD decay (LD0.5 = 59 bp). We also calculated LD using exonic SNPs only to ensure that the differences observed did not result from use of different marker sets in field and laboratory populations (S5 Fig). This confirmed slower LD decay in laboratory than field populations (Wilcoxon test, W = 17, p = 0.111), with the exception of SmBRE.
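The two quantities used here, pairwise r² and the distance at which it falls to 0.5, can be sketched as follows. This is a simplified illustration under stated assumptions: r² is computed from phased 0/1 haplotypes (the study’s data are unphased genotypes, for which dedicated software estimates r²), and LD0.5 is located by linear interpolation on a binned decay curve, which may differ from the curve fitting used for Fig 4. Both function names are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """r² between two biallelic loci from phased haplotypes coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b                      # coefficient of disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def ld_half_distance(curve, threshold=0.5):
    """Distance at which mean r² first drops below the threshold.

    curve : list of (distance_bp, mean_r2) sorted by increasing distance
    Returns the linearly interpolated distance, or None if never crossed.
    """
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 >= threshold > r1:
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None

# Toy data: perfectly linked and fully independent locus pairs
linked = r_squared([0, 0, 1, 1], [0, 0, 1, 1])
unlinked = r_squared([0, 1, 0, 1], [0, 0, 1, 1])
# Invented decay curve for illustration only
ld_half = ld_half_distance([(1000, 0.9), (2000, 0.7), (3000, 0.3)])
```

Both loci must be polymorphic in the sample, otherwise the denominator of r² is zero.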
Fig 4. Slower LD decay in laboratory populations.
(A) r² showing LD decay with physical distance between all autosomal SNPs in laboratory populations and exonic SNPs in field populations along the chromosomes. The points show mean values calculated over 1 kb windows while the fitted lines show the relationship between distance and LD decay in each population. The plots are on a log scale. (B) Bar plot showing the distance at which r² = 0.5 (LD0.5) for field and laboratory populations. A Wilcoxon test was used to compare LD0.5 in field and laboratory populations. S5 Fig shows LD calculated using exonic SNPs, while S6 Fig shows LD between unlinked markers on different chromosomes.
Population size
We used our sequencing data to predict the current effective population size (Ne) based on either linkage disequilibrium (NeEstimator) or sibship frequency (COLONY). NeEstimator computed effective population sizes ranging from 2 – 258 in the laboratory and 3,174 (Brazil) – infinity (Niger, Senegal, Tanzania) in the field populations (Fig 5A), while COLONY reported Ne values from 5 – 123 for laboratory populations and 3,612 – infinity for field populations (Fig 5B). Both NeEstimator and COLONY identified SmNMRI and SmLE as having the highest Ne estimates among the laboratory populations, while SmBRE had the lowest Ne. Ne estimates for laboratory populations using both approaches were correlated (R² = 0.96, p = 0.020). Ne estimates were at least 12-fold greater in field than in laboratory schistosome populations with NeEstimator and at least 29-fold greater with COLONY.
Fig 5. Reduced effective population size in laboratory populations.
Bar plots showing effective population size Ne calculated with (A) NeEstimator and (B) COLONY. The y-axis is split to show both high and low Ne values clearly. The error bars represent a 95% confidence interval. We used Wilcoxon tests to compare Ne in laboratory and field populations. Infinite values were set to 100,000 for this purpose.
Using our life cycle maintenance records, we estimated the census size (Nc) of our four laboratory schistosome populations over time and calculated the harmonic mean census size for each population [21]. This was done by estimating the number of parasite genotypes used to infect hamsters for each laboratory maintenance cycle over a seven-year period (S7 Fig). We did not have census data for the SmNMRI population maintained at BRI. Census size remained relatively consistent in SmLE, SmOR, and SmEG. However, population size increased in SmBRE parasites starting in 2021 (S7A Fig), as a result of a laboratory contamination event [22]. SmLE had the highest census size with 157 genotypes, followed by SmOR (137), SmEG (132), and SmBRE (93) (S7 Fig). Population size data are summarized in Table 3.
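The harmonic mean is used for census size because it is dominated by the smallest values, so a single small maintenance cycle lowers it far more than it lowers the arithmetic mean. A minimal sketch, using invented census counts that are not taken from our maintenance records:

```python
from statistics import harmonic_mean, mean

# Hypothetical numbers of parasite genotypes used in four successive
# maintenance cycles; one cycle is a transient bottleneck
census = [200, 50, 200, 200]

nc_harmonic = harmonic_mean(census)   # 4 / (1/200 + 1/50 + 1/200 + 1/200)
nc_arithmetic = mean(census)
```

In this toy series the single cycle of 50 genotypes pulls the harmonic mean to roughly 114, well below the arithmetic mean of 162.5.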
Table 3. Nc, Ne estimates and ratios.
| Population | Census | Census CI95 (L) | Census CI95 (U) | NeEstimator | NeEstimator CI95 (L) | NeEstimator CI95 (U) | COLONY | COLONY CI95 (L) | COLONY CI95 (U) | NeEstimator Ne/Nc | COLONY Ne/Nc |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BRE | 93 | 79 | 114 | 2 | 2 | 2 | 5 | 2 | 20 | 0.02 | 0.05 |
| EG | 132 | 111 | 161 | 81 | 80 | 81 | 55 | 33 | 112 | 0.61 | 0.42 |
| LE | 157 | 127 | 205 | 258 | 253 | 264 | 123 | 71 | 417 | 1.65 | 0.79 |
| OR | 137 | 112 | 174 | 42 | 42 | 42 | 45 | 27 | 99 | 0.31 | 0.33 |
| NMRI | | | | 237 | 232 | 242 | 114 | 60 | 581 | | |
| Brazil | | | | 3,174 | 3,064 | 3,291 | 3,612 | 1,211 | Infinite | | |
| Niger | | | | Infinite | Infinite | Infinite | Infinite | 1 | Infinite | | |
| Senegal | | | | Infinite | Infinite | Infinite | Infinite | 1 | Infinite | | |
| Tanzania | | | | Infinite | Infinite | Infinite | Infinite | 1 | Infinite | | |
Census population size (Nc) estimates are based on the estimated number of parasite genotypes present in infected snails used to infect hamsters. Nc estimates are derived from colony maintenance data collected over the past 7 years only, and are not available for SmNMRI or for field collected parasites. Effective population size (Ne) was estimated from sequence data using two different approaches (NeEstimator and COLONY).
Simulations of genomic diversity in populations of different size
We conducted simulations to examine how population size impacts retention of diversity in schistosomes (Fig 6). When the number of parasite genotypes (N) = 5, where N is the number of adult worm genotypes in each generation, simulations show that autosomal diversity is reduced by >99% in 91 generations. When N = 100, we observe a 63.4% reduction relative to the progenitor population after 400 generations. When N = 200, the reduction is 39.9% after 400 generations. The average effective population size (Ne) in our laboratory populations is 124, as calculated by NeEstimator, and 68, as calculated using COLONY (Table 3).
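As a rough cross-check on these figures, the classical expectation for loss of heterozygosity under drift, H_t/H_0 = (1 − 1/(2Ne))^t, can be evaluated directly. The sketch below sets Ne = 2N; that choice is an assumption made here only because it approximately matches the simulated reductions, and it is not a description of how the simulations were implemented. The function name `expected_reduction` is hypothetical.

```python
def expected_reduction(n_genotypes, generations):
    """Percent loss of heterozygosity after the given number of generations.

    Assumes H_t / H_0 = (1 - 1/(2*Ne))**t with Ne = 2 * n_genotypes
    (an illustrative assumption, see the text above).
    """
    remaining = (1 - 1 / (4 * n_genotypes)) ** generations
    return 100 * (1 - remaining)

loss_n5 = expected_reduction(5, 91)      # simulated: >99%
loss_n100 = expected_reduction(100, 400) # simulated: 63.4%
loss_n200 = expected_reduction(200, 400) # simulated: 39.9%
```

Under this assumption the closed-form values fall within about one percentage point of the simulated reductions for N = 100 and N = 200, and exceed 99% for N = 5 at 91 generations.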
Fig 6. Bottleneck simulation over 400 generations.
Line plot showing simulated reduction in genetic diversity of schistosome populations of different sizes over 400 generations. We used constant N ranging from 5 – 400. H0 indicates heterozygosity at time 0. The horizontal dashed line (38%) shows the diversity remaining in our laboratory populations relative to field populations, corresponding to the observed 62% reduction.
Discussion
High levels of genetic diversity in most laboratory schistosome populations
We sequenced parasites from five different laboratory-maintained S. mansoni populations and compared them to four field populations from Africa and South America. Our genomic data revealed 0.897 – 1.22 million variants segregating within the five laboratory populations. This is equivalent to one variant every 321–436 bp. Furthermore, our study revealed 62% lower nucleotide diversity (π) in exome data from laboratory-maintained schistosome populations than from field populations. Despite repeated passage over 30 – 80 years (~150–400 generations, assuming five generations per year), high levels of genetic diversity remain in many laboratory schistosome populations.
Studies of free-living animals and plants have compared the genetic composition of different wild and domesticated/farmed species. Domestication of animals or plants often involves a low number of founders and selection for specific traits [23–27]. Nucleotide diversity (π) is reduced in domesticated populations by between 33 and 98% relative to wild populations (S2 Table) [28,29]. Laboratory S. mansoni populations fall close to the center of this range, with π being 62% lower relative to wild populations, a level of diversity reduction comparable to Mediterranean brown trout [30] and sunflower [31]. The relatively high levels of variation observed in laboratory schistosome populations may result from: (i) the relatively large size of S. mansoni founder populations, as laboratory schistosome populations are typically founded by collecting eggs from one or more patients, each of which may be infected with hundreds of adult worms [32]; and (ii) the maintenance of laboratory schistosome populations at large sizes to prevent loss during laboratory passage. Our census estimates show that the number of independent schistosome genotypes used to infect hamsters ranges from 93 to 157 (harmonic means) in our laboratory populations.
The level of variation retained within populations is dependent on the size and duration of population bottlenecks, as demonstrated with our population bottleneck simulation [33]. Our Ne estimates are 2 – 258 with NeEstimator and 5 – 123 with COLONY, while our census (Nc) estimates range from 93 to 157. That laboratory populations show 62% less nucleotide diversity (π) compared to field populations is generally compatible with the simulation results when N ≈ 100. This is broadly consistent with the observed Nc and Ne estimates. However, we note that factors other than demographics may maintain genetic variation: both balancing selection [34] and preferential mating between unrelated parasites [35] may also act to retain genetic variation in laboratory parasite populations. Nevertheless, given sufficient breeding adults (~100) during laboratory passage, maintenance of genetic variation within these schistosome populations over multiple years of laboratory maintenance is perhaps not surprising.
While our study indicated lower levels of nucleotide diversity in laboratory compared with field schistosome populations, there were dramatic differences in the pattern of variation in laboratory and field populations. The patterns observed are consistent with strong bottlenecks during establishment and maintenance of S. mansoni colonies. We observed: (i) Lower numbers of rare alleles in laboratory populations: this is reflected in the positive Tajima’s D for four of the five laboratory populations, while the field populations show negative Tajima’s D. This is further confirmed by the allele frequency spectra, which show a comparative deficit in rare alleles and more alleles at intermediate frequencies compared to field populations. Population contraction in the laboratory is the most likely cause of the allele frequency spectra observed, as intermediate frequency alleles are more likely to survive bottlenecks [36]. (ii) More fixed derived SNPs in laboratory populations: this is consistent with increased drift in small populations. Notably, we observed the greatest number of fixed SNPs in SmBRE, which showed the lowest diversity and Ne estimates, and the lowest number of fixed SNPs in SmEG, which showed the highest diversity among the laboratory populations. (iii) Differences in LD decay: with the notable exception of SmBRE, we observed 6 – 19-fold slower decay of LD with physical distance on chromosomes in laboratory populations compared with field populations. This slower decay is expected given the increased levels of sib-mating, genetic drift, and reduced total number of meiotic recombination events in small laboratory populations.
The exception: SmBRE is depauperate and shows low fitness
We have studied the SmBRE laboratory population extensively. These parasites typically show reduced snail infectivity, lower cercarial shedding and virulence in the intermediate snail host, and reduced immunopathology in the mouse [37–40]. One possible explanation for low fitness of SmBRE is inbreeding depression. In line with this, we found that nucleotide diversity (π) was two- to threefold lower in SmBRE than in the other four laboratory populations; SmBRE also had the lowest estimates for effective population size (Ne), and elevated levels of fixed SNPs (Table 1). We sequenced adult worms for SmBRE rather than cercariae. We were concerned that some SmBRE adults sequenced might be derived from clonal cercariae emerging from a single snail, which would downwardly bias our diversity measures. However, none of the SmBRE parasites showed elevated relatedness (r > 0.9) so this explanation can be eliminated.
However, other population genetic results for SmBRE were entirely unexpected. While SmEG, SmNMRI, SmLE, and SmOR showed strongly positive Tajima’s D, SmBRE had a strongly negative Tajima’s D and site frequency spectra like the field collected populations. We do not know what features of SmBRE demography might have contributed to this.
LD analysis for SmBRE produced the most puzzling result and showed a pattern that was radically different from the other laboratory populations. Given the low genetic diversity and effective population size (Ne) in SmBRE, we had expected to see the slowest rate of LD decay in this population. However, LD decayed very rapidly in SmBRE, faster than in three of the four field populations examined. A genetic map for S. mansoni revealed that 1 cM = 227 kb (95% CI 181–309 kb) [41]. A possible explanation for the rapid decay of LD is that the recombination rate may be higher in SmBRE than in other laboratory populations. Analysis of further S. mansoni genetic crosses involving SmBRE could be used to explore this hypothesis. It is known that recombination rates can vary both across the genome and among populations of several species [42,43].
Implication for schistosome research
That four of five laboratory-maintained schistosome populations retain abundant genetic variation has several important implications for schistosome research:
Phenotype measures in individual worms.
Current research on schistosome parasites, including developmental, immunological, transcriptomic, or drug response studies, utilizes pools of genetically variable worms rather than homogeneous inbred parasite lines [44–47]. As a consequence, these studies capture average population phenotypes and underestimate variation in the traits studied. For example, our laboratory has recently documented a significant impact of parasite population on immunopathological parameters, including spleen/liver weight and fibrosis [37]. However, we recognize that these impacts are likely to be underestimated, as our studies, like many others, utilize genetically variable laboratory populations. Analysis of praziquantel (PZQ) response provides a dramatic example. The SmLE-PZQ-R laboratory population, selected for praziquantel resistance, shows a 14-fold increase in drug resistance relative to the SmLE population from which it was selected. However, SmLE-PZQ-R is a mixture of both PZQ-sensitive (PZQ-S) and PZQ-resistant (PZQ-R) parasites that differ by >377-fold in drug response [18]. We suspect other parasite phenotypes may show equally dramatic variation when measured in individual worms rather than diverse populations. Open source tools like the Single Worm Analysis of Movement Pipeline (SWAMP) [48] and wrmXpress [49] now offer the capability to accurately measure drug response phenotypes in individual worms, while transcriptomic variation can be measured in single worms or single cells [50–53]. We encourage researchers to shift their focus from genetically diverse populations to individual parasites for clearer measurement of parasite phenotypes.
Genome-wide association studies (GWAS).
Schistosome parasites show abundant phenotypic variation in a wide range of traits [54]. These include cercarial shedding [16,38,39,55–57], host specificity [58–60], and drug resistance [18,61–64]. Our laboratory is specifically interested in understanding the genetic basis of phenotypic traits in schistosomes, and we have primarily used genetic crosses and linkage analysis for this purpose [54]. That high levels of genetic variation are found within laboratory populations allows us to use a second powerful mapping approach (GWAS) to identify genes underlying specific traits. GWAS is considerably simpler than linkage analysis, because conducting two-generation (F2) genetic crosses is not required. Furthermore, GWAS examines variation across multiple individuals within populations, whereas genetic crosses examine differences between the two parents only and therefore sample genetic and phenotypic variation less effectively. Le Clec’h et al.’s [18] work on PZQ resistance provides strong proof-of-principle for use of GWAS approaches for schistosomes using laboratory populations. Their GWAS used single worm measures of drug response in the SmLE-PZQ-R population and then sequenced pools of PZQ-S and PZQ-R worms showing extreme drug response phenotypes to determine the genome regions involved [18,65].
GWAS relies on association (LD) between trait loci and surrounding genetic markers. We observed much slower decay in LD in four out of five laboratory-maintained schistosome populations than observed in the field. GWAS in laboratory populations is therefore likely to generate much broader peaks [66,67]. For example, in the GWAS of praziquantel resistance, the genome region mapped spanned 5.72 Mb and 137 genes [65]. Broad peaks have some advantages, as such peaks are unlikely to be missed if they are situated in genome regions that are difficult to genotype. However, broad peaks containing multiple genes make the task of identifying the causative locus much harder. We note that the extremely rapid decay in LD observed in some field populations (e.g., Tanzania) suggests that GWAS using freshly isolated parasite populations collected from infected patients may result in narrow peaks and allow identification of candidate regions with greater precision.
Reproducibility at different institutions.
Several laboratories maintain the schistosome populations examined here, and others maintain different populations. The literature often refers to these schistosome populations as strains or lines, akin to bacterial clones or inbred mice, and so the assumption is that they will produce similar results at different institutions. However, bottlenecks and low Ne will result in genetic drift, and divergence between populations at different institutions. Such changes are likely to affect reproducibility, as is the case with non-model rodents [68]. Accessing schistosome parasites through the BRI [3] increases short-term consistency and reliability, but even parasites obtained from BRI in different years may vary due to genetic drift. While genetically variable laboratory populations have advantages for some genetic analyses (e.g., see “Genome-wide association studies”), one possible solution to increase repeatability in laboratory experiments might be to establish inbred parasite lines by serial inbreeding over a minimum of seven generations to reach 99% homozygosity. Such inbred lines have been used for snails [69] and mice [70]: the addition of inbred schistosome lines would allow precise dissection of parasite-host interactions across the parasite life cycle. However, our experience with SmBRE illustrates that highly inbred populations may suffer from inbreeding depression and reduced fitness, posing a significant challenge to overcome.
Relevance to other helminths
Recent work suggests that the degree of genetic variation in other laboratory-maintained helminths could be underestimated as well. Stevens et al. [34] showed that Heligmosomoides bakeri, a commonly used model nematode of rodents, retains extensive genetic diversity despite laboratory maintenance for 70 years. Even in the selfing nematode C. elegans, long-term balancing selection maintains genetic variation to increase fitness and survival [71]. Studies of genetic variation in filarial nematodes are particularly informative [72] (Table 4). Populations of the mosquito-transmitted filarial nematode Brugia malayi established from microfilariae from a single infected human have been maintained in several laboratories since the 1960s, while several populations of B. pahangi (originally collected from a green leaf monkey) have been maintained since the 1970s. Yet these laboratory filarial parasites remain genetically diverse: laboratory B. malayi individuals carried on average 108,035–190,445 segregating SNPs, while laboratory B. pahangi colonies contained 204,101–219,183 SNPs. B. pahangi adults retrieved from cats provide a comparison from natural infections: these carried 507,464 segregating SNPs. Hence, B. pahangi maintained for ~50 years in the laboratory still carry 50–60% of the genetic variation found in field collected worms (Table 4), similar to our results from S. mansoni. We expect that sequencing of other laboratory-maintained helminths, such as Trichuris muris or Strongyloides spp., may also reveal substantial genetic diversity, providing new research opportunities for a wide range of model helminth parasites.
Table 4. Nucleotide diversity in Brugia malayi and B. pahangi.
| Parasite population | Origin | N | Sequence coverage | No. SNV | π | ± 1 SD | Accession |
|---|---|---|---|---|---|---|---|
| Brugia malayi | | | | | | | |
| CDRI, Lucknow, India | Lab^a | 4 | 95.94 | 108,035 | 0.0012 | 0.0015 | SRR3111731, SRR3111738, SRR3111864, SRR3112012 |
| TRS Jird1 | Lab^a | 4 | 95.86 | 87,661 | 0.0013 | 0.0015 | SRR111504, SRR3111510, SRR3111514, SRR3111517 |
| FR3 Jird Bmalayi | Lab^a | 4 | 95.26 | 88,357 | 0.0014 | 0.0015 | SRR3111544, SRR3111568, SRR3111579, SRR3111581 |
| Liverpool Jird | Lab^a | 4 | 92.23 | 88,920 | 0.0013 | 0.0015 | SRR3111629, SRR3111630, SRR3111634, SRR3111636, SRR3111640 |
| WashU Jird | Lab^a | 6 | 94.86 | 93,937 | 0.0013 | 0.0012 | SRR3111318, SRR3111319, SRR5190289, SRR5190290, SRR3111488, SRR3111493, SRR3111498, SRR5190287, SRR5190288 |
| Thai Jird | Lab^b | 4 | 92.08 | 190,455 | 0.0013 | 0.0015 | SRR12884294, SRR12884293, SRR12884292, SRR12884291 |
| Brugia pahangi | | | | | | | |
| FR3, University of Georgia, Athens, Georgia, USA via BEI | Lab^c | 3 | 87.07 | 219,183 | 0.0025 | 0.0031 | SRR13482041, SRR13482040, SRR13482039 |
| University of Malaya, Kuala Lumpur, Malaysia | Field^d | 3 | 90.97 | 507,464 | 0.0042 | 0.0044 | SRR7226912, SRR7227476, SRR7227477, SRR7227478, SRR7227479 |
| FR3, University of Wisconsin Oshkosh, Oshkosh, Wisconsin, USA | Lab^c | 4 | 94.56 | 204,101 | 0.0023 | 0.0030 | SRR12884296, SRR12884297, SRR12884298, SRR12884299 |
These data are reanalyzed from Mattick et al. [72]. Note that SNP calls in Mattick et al. [72] and this paper were made with different software tools (GATK for Mattick et al. [72] and bcftools v1.22 in this paper). Hence the number of SNPs called may not match precisely.
^a Originally established from an infected human from Malaysia in the 1960s, and distributed to several laboratories [73,74].
^b Originally established from an infected human from Thailand in the 1980s [75].
^c Originally established from a green leaf monkey in the 1970s [76].
^d Adult worms isolated from a naturally infected cat [77].
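As a worked check on the proportion of field diversity retained by laboratory B. pahangi, the ratio of laboratory to field π can be computed directly from Table 4. This is a minimal sketch that assumes the comparison is made on nucleotide diversity (π) rather than on SNP counts; the variable and function names are illustrative only.

```python
# Nucleotide diversity (π) values copied from Table 4 for B. pahangi.
pi_field = 0.0042  # University of Malaya, worms from a naturally infected cat
pi_lab = {
    "FR3 (University of Georgia via BEI)": 0.0025,
    "FR3 (University of Wisconsin Oshkosh)": 0.0023,
}


def retained_fraction(lab_pi: float, field_pi: float) -> float:
    """Fraction of field nucleotide diversity still present in a lab colony."""
    return lab_pi / field_pi


retained = {name: retained_fraction(pi, pi_field) for name, pi in pi_lab.items()}
for name, frac in retained.items():
    print(f"{name}: {frac:.0%} of field π")
```

On these values the two laboratory colonies retain roughly 55–60% of field π, which falls within the range quoted in the text.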
Limitations of this study
Paired field and laboratory populations from the same location are most informative for examining the impact of laboratory culture or domestication. These were not available here, so we compared laboratory sequence variation with that from previously published, but independent field collected samples. We used Illumina short read sequencing for this work. Highly variable genes are difficult to align to a reference sequence, so they are poorly genotyped in Illumina-based resequencing studies. It is therefore likely that we significantly underestimated diversity in this study. For a more exhaustive evaluation of genetic variation within laboratory schistosome populations, long read (Nanopore or PacBio) sequencing, Hi-C and de novo assembly will be needed [34]. For the same reason, our study was not well powered to detect islands of genetic variation, suggestive of balancing selection, as observed in C. elegans [71] and H. bakeri [34] (see “Relevance to other helminths”).
An additional potential concern is the comparison of whole genome sequence from laboratory samples with exome capture for field samples. Exome capture methods may poorly sequence divergent alleles resulting in underestimation of variation. In reality, this potential bias is expected to be minimal, because capture methods typically allow efficient sequencing of alleles ≤10% divergent from the RNA baits used [78].
Materials and methods
Ethics statement
We utilize Syrian hamsters as the rodent host for maintaining schistosome parasites. This study was performed in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Institutional Animal Care and Use Committee of Texas Biomedical Research Institute (permit number: 1419-MA).
Recovery of Schistosoma mansoni miracidia and snail infections for sample generation
For each parasite population (except SmBRE), we collected pools of cercariae shed from snails infected with single miracidia larvae. This ensures that each parasite sequenced is an independent genotype. We extracted gDNA from cercarial larvae in lieu of adult worms for two reasons: (i) adult schistosome females carry fertilized eggs which would result in mixed genotype sequences, and (ii) we wanted to avoid the sampling of identical adult worms derived from clonal cercariae from a single snail. In brief, we recovered S. mansoni eggs from livers of infected Golden Syrian hamsters as previously described [79] and infected Biomphalaria glabrata (line Bg36 for SmOR and SmLE) and B. alexandrina (for SmEG) snails by placing individuals in 24-well plates with a single miracidium. Plates were placed under a light source overnight before putting the snails in trays covered with a clear plastic lid. The lids were exchanged for a dark lid three weeks post infection to prevent cercarial shedding.
Sample generation for SmBRE
Preliminary analyses for this project revealed that our SmBRE population was contaminated with SmLE [80]. We therefore extracted gDNA from adult schistosome parasites previously collected during life cycle maintenance, prior to contamination. Individual male worms were processed as described below. To avoid obtaining mixed genotype eggs, we decapitated individual female worms and extracted gDNA for sequencing with Chelex solution following an established protocol [41]. All samples were whole-genome amplified as described below.
Collection of S. mansoni cercariae and gDNA extraction
We placed all snails into 24-well plates and shed them for 2 hours under light 28 days post infection. The content in each individual well was collected, transferred into microtubes, and spun down at 500 × g for 5 minutes to pellet the cercariae. We removed supernatant before flash-freezing cercariae in liquid nitrogen. Samples were stored at -80°C until gDNA extraction with the DNeasy Blood & Tissue Kit (Qiagen, Germantown, MD, USA) according to manufacturer instructions (tissue lysis for 2 hours at 56°C). We quantified gDNA using a Qubit dsDNA BR Assay Kit (Invitrogen, Carlsbad, CA, USA). We used the GenomiPhi V2 DNA Amplification Kit for whole genome amplification (WGA) of samples with gDNA yield < 200 ng (Cytiva, Marlborough, Massachusetts, USA). We have previously demonstrated high accuracy (99.55%) of genotyping from WGA of schistosome material, by comparing genotypes of F1 larval parasites with their parents [81].
gDNA Library preparation and sequencing
We used the KAPA HyperPlus Kit with library amplification (Roche, Indianapolis, IN, USA) to generate whole genome libraries with 200–400 ng of input material. We followed the manufacturer’s instructions with the following modifications: we fragmented the samples for 25 minutes, amplified libraries using six PCR cycles, and performed library size selection using a ratio of 0.6X (30 µl beads) for the first size cut and 0.8X (10 µl beads) for the second size cut. We assessed the library profile with TapeStation 4200 D1000 ScreenTape (Agilent, Santa Clara, CA, USA) (average library size: 455 bp) and quantified all libraries with the KAPA Library Quantification Kit (Roche, Indianapolis, IN, USA) (average library concentration: 43 nM). Pooled samples were sent to Admera Health and sequenced on a NovaSeq S4 (one pool with 40 samples) or NovaSeq X Plus (3 pools with 18–19 samples) platform (Illumina) using 150 bp paired-end reads.
Computational environment
We used conda version 23.1.0 to manage environments and download packages used in the analysis. Data were processed in R 4.2.0 using tidyverse v1.3.2, and plots were generated with ggplot2 v3.4.2. All shell and R scripts written for this project are available at https://github.com/kathrinsjutzeler/sm_single_gt and Zenodo https://doi.org/10.5281/zenodo.10672478.
Genotyping
We used trim_galore v0.6.7 [82] (-q 28 --illumina --max_n 1 --clip_R1 9 --clip_R2 9) for adapter and quality trimming before mapping the sequences to version 9 of the S. mansoni reference genome (GenBank assembly accession GCA_000237925.5) with BWA v0.7.17-r118 [83] and the default parameters. We used GATK v4.3.0.0 [84] for further processing of the sequences. First, we removed all optical/PCR duplicates with MarkDuplicates. Next, we called single nucleotide polymorphisms (SNPs) with HaplotypeCaller and GenotypeGVCFs on a contig-by-contig basis, which we combined for each individual and finally merged into a single VCF file for all sequences, including the ones from previously processed field samples [19]. At this point, we lifted the file over to v10 of the S. mansoni reference genome (Wellcome Sanger Institute, project PRJEA36577) using LiftoverVcf. We used VariantFiltration with the recommended parameters (FS > 60.0, SOR > 3.0, MQ < 40.0, MQRankSum < -12.5, ReadPosRankSum < -8.0, QD < 2.0) and VCFtools v0.1.16 [85] for quality filtering. For variant statistics of genomic data from laboratory populations, we removed sites with quality < 15, read depths < 10, and missingness > 20% and individuals with a genotyping rate < 50%. For the combined laboratory/field population analyses of exome data, we used bedtools intersect (v2.31.0) [86] to keep sites overlapping with the exome probes used for field samples. We then filtered each population individually by removing i) sites with quality < 15, read depth < 10 and > 20% missingness, and ii) individuals with a genotyping rate < 50%. Finally, we filtered [86] the population files to keep high quality sites that were scored in 80% of individual parasites from each of the laboratory and field populations.
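The site and individual filters described above can be illustrated on a toy call matrix. The following is a minimal sketch, not the VCFtools commands used in the study: it assumes that the depth threshold is applied per genotype (calls with fewer than 10 reads are treated as missing) and that individuals are assessed after site filtering. The function name and toy data are hypothetical.

```python
MIN_QUAL, MIN_DEPTH = 15, 10
MAX_SITE_MISSING, MIN_IND_RATE = 0.20, 0.50


def filter_calls(quals, depths):
    """Return (kept_site_indices, kept_individual_indices).

    quals  : one QUAL value per site
    depths : per site, a list of per-individual read depths (None = no call)
    """
    n_ind = len(depths[0])
    # Assumption: genotypes supported by fewer than MIN_DEPTH reads count as missing.
    called = [[d is not None and d >= MIN_DEPTH for d in row] for row in depths]
    # Site filter: drop low-quality sites and sites with > 20% missing genotypes.
    sites = [
        i for i, row in enumerate(called)
        if quals[i] >= MIN_QUAL and 1 - sum(row) / n_ind <= MAX_SITE_MISSING
    ]
    # Individual filter: keep individuals genotyped at >= 50% of retained sites.
    inds = [
        j for j in range(n_ind)
        if sites and sum(called[i][j] for i in sites) / len(sites) >= MIN_IND_RATE
    ]
    return sites, inds


# Toy data: 4 sites x 5 individuals (values invented for illustration).
quals = [50, 10, 40, 30]
depths = [
    [12, 15, 20, 4, 30],     # one low-depth call -> 20% missing, site kept
    [12, 15, 20, 11, 30],    # QUAL 10 -> site removed
    [12, 5, None, 11, 30],   # 40% missing -> site removed
    [12, 15, 20, 3, 30],     # one low-depth call -> 20% missing, site kept
]
kept_sites, kept_inds = filter_calls(quals, depths)
print(kept_sites, kept_inds)
```

In this toy example the fourth individual has no usable genotype at either retained site and is therefore dropped by the genotyping-rate filter.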
Principal component analysis (PCA) and admixture
We used the snpgdsPCA() function from the SNPRelate v1.30.1 [87] R package to generate the PCA matrix and ADMIXTURE v1.3.0 [88] to estimate population ancestry, examining models with k = 1 to k = 10 ancestral populations. We chose the model with the lowest cross-validation error and used its Q estimates as a proxy for ancestry fractions.
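Choosing k by cross-validation can be sketched as follows. This example assumes ADMIXTURE was run with its cross-validation option so that each log contains a line of the form "CV error (K=3): 0.412"; the log excerpt and its error values are invented for illustration and are not results from this study.

```python
import re

# Pattern for the cross-validation summary line written by ADMIXTURE.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text: str) -> int:
    """Return the K with the smallest cross-validation error in the log text."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    return min(errors, key=errors.get)


# Hypothetical concatenated log excerpt (values are made up).
example_log = """\
CV error (K=1): 0.612
CV error (K=2): 0.488
CV error (K=3): 0.412
CV error (K=4): 0.431
"""
chosen_k = best_k(example_log)
print(chosen_k)
```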
Summary statistics, Tajima’s D, and nucleotide diversity (π)
We calculated coverage statistics with samtools v1.9 [89] and mosdepth v0.3.6 [90]. We used VCFtools to calculate Tajima’s D in windows of 25 kb using autosomal variants in each population separately. We generated a VCF file containing both variant and non-variant sites from the genotyped GATK database to calculate nucleotide diversity (π) in 25 kb windows with pixy [91]. To compare relative nucleotide diversity (π) in laboratory and field populations, we calculated the mean π across laboratory populations (by calculating π for each laboratory population, then taking the average) and compared this to the mean π across field populations. We used the same VCF file to calculate the average number of nucleotide differences per site (DXY) for all pairwise combinations of populations using pixy [91]. We calculated numbers of fixed derived SNPs in each population by comparison to S. rodhaini (ERR114786, ERR310938, ERR7978134, ERR7978135, ERR7978144, SRR16526444, SRR16526443, SRR16526442, SRR16526441, SRR16526440, SRR16526439, SRR16526438, SRR16526437). To do so, we calculated allele frequencies in each population, including S. rodhaini, individually. We then merged and filtered these frequency tables to retain only those alleles absent in S. rodhaini (allele frequency = 0) and fixed in the respective focal population (allele frequency = 1).
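The final merge-and-filter step for fixed derived SNPs can be sketched as below. This is an illustrative reimplementation, not the project's script: allele frequency tables are represented as dictionaries keyed by (chromosome, position), and sites that are not scored in the outgroup are assumed to be excluded. All names and values are hypothetical.

```python
def fixed_derived_sites(focal_freqs, outgroup_freqs):
    """Sites where the allele is fixed in the focal population (frequency 1)
    and absent from the outgroup (frequency 0).

    Both arguments map (chromosome, position) -> frequency of the same allele.
    Sites missing from the outgroup table are skipped (an assumption).
    """
    return sorted(
        site
        for site, freq in focal_freqs.items()
        if freq == 1.0 and outgroup_freqs.get(site) == 0.0
    )


# Toy frequency tables (invented values).
focal = {("chr1", 100): 1.0, ("chr1", 250): 0.6, ("chr2", 40): 1.0, ("chr2", 90): 1.0}
outgroup = {("chr1", 100): 0.0, ("chr1", 250): 0.0, ("chr2", 40): 1.0}

fixed = fixed_derived_sites(focal, outgroup)
print(fixed)
```

Here only the first site qualifies: the second is still segregating in the focal population, the third carries the same allele in the outgroup, and the fourth has no outgroup call.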
Allele frequency spectrum and empirical cumulative distribution function (ECDF)
We used the site.spectrum() function from the pegas v1.2 R package [92] to compute the folded site frequency spectrum and bcftools v1.9 [89] to get overall allele frequency for SNPs in each individual population. We used stat_ecdf() from ggplot2 to calculate and plot ECDF for a statistical comparison of laboratory and field populations.
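For readers unfamiliar with the folded spectrum, the idea can be shown in a few lines. This is a minimal sketch of the concept rather than the pegas implementation: each site is scored by its minor allele count, monomorphic sites are ignored, and the spectrum is tabulated for counts 1 to half the number of sampled chromosomes. The input counts are invented.

```python
from collections import Counter


def folded_sfs(alt_counts, n_chrom):
    """Folded site frequency spectrum.

    alt_counts : alternate allele count at each site
    n_chrom    : number of sampled chromosomes (2 x individuals for diploids)
    Returns a list whose element i-1 is the number of sites with minor allele count i.
    """
    minor = (min(c, n_chrom - c) for c in alt_counts)
    tally = Counter(m for m in minor if m > 0)  # drop monomorphic sites
    return [tally.get(i, 0) for i in range(1, n_chrom // 2 + 1)]


# Four diploid individuals -> 8 chromosomes; toy alternate allele counts.
sfs = folded_sfs([1, 7, 4, 0, 8, 2, 6], n_chrom=8)
print(sfs)
```

An excess of low-count classes relative to intermediate ones corresponds to negative Tajima’s D, whereas a relative excess of intermediate-frequency classes corresponds to positive values.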
Linkage disequilibrium
We examined linkage disequilibrium (LD) between autosomal variants within each population with PLINK v1.90b6.21, making pairwise comparisons between SNPs within 1 Mb of one another (--ld-window-r2 0.0, --ld-window 1000000, --ld-window-kb 1000). We binned average r2 values using stats.bin() from the fields v14.1 R package [93] into 1,000 equal windows along the log scale, which were calculated with logseq() from the pracma v2.4.4 package [94]. Rare variants (MAF < 0.05) were excluded from this analysis. We fitted LD decay curves to these points using locally estimated scatterplot smoothing (LOESS). To compare LD decay curves, we measured the distance at which LD is reduced to r2 = 0.5 (LD0.5). We used a custom script to calculate LD between unlinked markers. Briefly, we assigned genomic in lieu of chromosome positions to 10,000 randomly selected variants in each population and used PLINK to calculate LD (--ld-window-r2 0.0, --ld-window 999999, --ld-window-kb 10000). We then reassigned chromosomes and excluded results where r2 was calculated between markers on the same chromosome.
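The binning and LD0.5 steps can be sketched as follows. This is a simplified illustration, not the R code used in the study: it bins pairwise r2 values into log-spaced distance windows and then locates the r2 = 0.5 crossing by linear interpolation on the log10 distance scale, which is swapped in here for the LOESS fit. Function names and the toy data are hypothetical.

```python
import bisect
import math


def logseq(start, stop, n):
    """n log-spaced values from start to stop (both inclusive)."""
    lo, hi = math.log10(start), math.log10(stop)
    return [10 ** (lo + i * (hi - lo) / (n - 1)) for i in range(n)]


def binned_mean_r2(distances, r2_values, edges):
    """Mean r2 per distance bin, returned as (geometric bin midpoint, mean r2)."""
    n_bins = len(edges) - 1
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for d, r in zip(distances, r2_values):
        if d < edges[0] or d > edges[-1]:
            continue  # outside the binned range
        b = min(bisect.bisect_right(edges, d) - 1, n_bins - 1)
        sums[b] += r
        counts[b] += 1
    return [
        (math.sqrt(edges[b] * edges[b + 1]), sums[b] / counts[b])
        for b in range(n_bins)
        if counts[b]
    ]


def ld_half_distance(points, threshold=0.5):
    """Distance at which mean r2 first drops to the threshold, or None."""
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 > threshold >= r1:
            frac = (r0 - threshold) / (r0 - r1)
            log_d = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    return None


# Toy pairwise LD values (distance in bp, r2); three bins instead of 1,000.
edges = logseq(1, 1000, 4)
points = binned_mean_r2([2, 5, 50, 500], [1.0, 0.8, 0.6, 0.2], edges)
ld_half = ld_half_distance(points)
print(points, ld_half)
```

With real data, a smoother such as LOESS reduces the influence of sparsely populated bins; the interpolation above is only meant to make the definition of LD0.5 concrete.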
Census and Ne estimation
Census: We estimated the number of adult genotypes each generation (census size) using detailed schistosome life cycle maintenance records we keep for each of our laboratory populations. Specifically, we estimated the number of parasite genotypes present within shedding snails used to infect the hamster hosts. Generally, we infect individual snails with five to ten miracidia and record the number of infected and uninfected snails at the time of the first shedding. Therefore, the probability of a snail not being infected is:
We then computed the probabilities that each shedding snail contains 1,2…n parasite genotypes utilizing a Poisson distribution with the dpois() function from the stats v4.2 package [95,96]. We note that this provides an upper limit on the number of adult worm genotypes within hamsters, because some cercarial genotypes may fail to establish.
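The census calculation can be illustrated with a short sketch. It rests on assumptions that are not stated explicitly in the text: that P(0) is estimated as the observed proportion of uninfected snails, that the Poisson mean follows from P(0) = exp(−λ), and that probabilities for shedding snails are conditioned on the snail being infected. The function names and snail counts are hypothetical, and poisson_pmf() stands in for R's dpois().

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events (Python stand-in for dpois)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def genotypes_per_shedding_snail(n_infected: int, n_uninfected: int, max_k: int = 10):
    """Estimate parasite genotypes per infected snail from infection counts.

    Requires at least one infected and one uninfected snail (0 < P(0) < 1).
    """
    p0 = n_uninfected / (n_infected + n_uninfected)  # assumed estimator of P(0)
    lam = -math.log(p0)                              # from P(0) = exp(-lambda)
    # Assumption: condition on infection (zero-truncated Poisson).
    probs = {k: poisson_pmf(k, lam) / (1 - p0) for k in range(1, max_k + 1)}
    mean_genotypes = lam / (1 - p0)
    return p0, lam, probs, mean_genotypes


# Hypothetical batch: 30 shedding snails and 10 uninfected snails.
p0, lam, probs, mean_genotypes = genotypes_per_shedding_snail(30, 10)
print(f"P(0) = {p0:.2f}, lambda = {lam:.3f}, mean genotypes/snail = {mean_genotypes:.2f}")
```

Multiplying the mean number of genotypes per shedding snail by the number of snails used for hamster infection would then give an upper bound on the census size for that generation.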
Ne estimation: We used two programs to determine effective population size: NeEstimator v2 [97], which relies on linkage disequilibrium between pairs of SNPs on different chromosomes to estimate Ne and COLONY v2 [98], which calculates Ne based on sibship inference. We used the R package radiator v1.2.8 [99] to convert working VCF files per population (14,073 – 119,643 loci) to suitable input files for each software and ran COLONY via the command line with default parameters. Additionally, we created an input file listing chromosomes and loci to run NeEstimator v2 with the “LD Locus Pairing” option which excludes the comparison of loci on the same chromosome.
Bottleneck simulation
We used vcfR v1.13.0 [100] to extract genotypes from a VCF file containing common variants in the Brazilian field population. We then randomly sampled 10,000 loci to generate an input file suitable for BottleSim v2.6 [101]. We simulated bottleneck events with the “Diploid multilocus, constant population size” option, assuming no overlap between generations, and dioecy with random mating. We ran this simulation for N = 400, 200, 100, 50, 25, and 5 for 400 generations with a 1:1 sex ratio. This simulation approach is applicable to any sexually reproducing organism and examines the loss in heterozygosity (He) over time in parasite populations maintained with differing numbers of breeding adult parasites.
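As a point of reference for these simulations, the expected loss of heterozygosity under drift in an idealized population can be computed analytically. The sketch below uses the standard expectation H_t = H_0 (1 − 1/(2N))^t; it is not BottleSim and ignores separate sexes and sampling variance, so it should be read only as an approximate benchmark for the same N values and 400 generations.

```python
def expected_heterozygosity(h0: float, n: int, generations: int) -> float:
    """Expected heterozygosity after drift in an idealized population of size n."""
    return h0 * (1 - 1 / (2 * n)) ** generations


# Proportion of initial heterozygosity retained after 400 generations.
retained_he = {
    n: expected_heterozygosity(1.0, n, 400) for n in (400, 200, 100, 50, 25, 5)
}
for n, frac in retained_he.items():
    print(f"N = {n:>3}: {frac:.3g} of initial He retained")
```

Under this approximation a population of 400 breeding adults keeps about 60% of its heterozygosity over 400 generations, while a population of 5 loses essentially all of it.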
Statistical analysis
We performed all statistical analyses with R package rstatix v0.7.2 [102] or stats v4.2. To compare field and laboratory populations, we used Student’s t-tests when data were normally distributed (Shapiro-Wilk test, p > 0.05) and Wilcoxon rank-sum tests otherwise. To compare empirical cumulative distributions (ECDFs) describing site frequency spectra, we used the Kolmogorov-Smirnov (K-S) statistic and permutation tests. We conducted pairwise comparisons between all populations examined (36 pairwise comparisons). We then calculated the mean K-S statistic for 16 comparisons of field and lab populations, and compared this to 10,000 randomly permuted datasets. We considered comparisons statistically significant when p < 0.05 [32].
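The permutation test on the mean K-S statistic can be sketched as below. The text does not specify what was permuted, so this example assumes that the laboratory/field labels are shuffled among populations and that the one-tailed p-value is the proportion of permutations with a mean between-group K-S statistic at least as large as the observed one. Population names and allele frequencies are invented.

```python
import random
from itertools import product


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: maximum ECDF difference."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


def mean_between_group_ks(samples, lab, field):
    """Mean K-S statistic over all lab-versus-field population pairs."""
    pairs = list(product(lab, field))
    return sum(ks_statistic(samples[l], samples[f]) for l, f in pairs) / len(pairs)


def permutation_test(samples, lab, field, n_perm=10_000, seed=1):
    """One-tailed permutation p-value obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = mean_between_group_ks(samples, lab, field)
    names = list(lab) + list(field)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(names)
        perm_lab, perm_field = names[:len(lab)], names[len(lab):]
        if mean_between_group_ks(samples, perm_lab, perm_field) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_perm


# Toy allele frequencies: lab populations skewed high, field populations skewed low.
samples = {
    "lab1": [0.4, 0.5, 0.6], "lab2": [0.45, 0.55, 0.65],
    "field1": [0.01, 0.02, 0.05], "field2": [0.02, 0.03, 0.04],
}
observed, p_value = permutation_test(
    samples, ["lab1", "lab2"], ["field1", "field2"], n_perm=1000
)
print(observed, p_value)
```

With only four toy populations there are few distinct label assignments, so the p-value cannot be small; the nine populations analysed here give a much larger permutation space.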
Supporting information
PCA plot showing clusters of all populations used in this study.
(TIFF)
We calculated nucleotide diversity (π) in 25kb windows separately in each laboratory S. mansoni population. The distance from each window to the nearest chromosome end was calculated as a percentage of the total chromosome length and binned to the nearest whole percentage point. Box plots indicate the range of π values across all populations at each particular distance bin. There was no relationship between mean π and proximity to the chromosome ends, but variance in π increased at the chromosome ends. This result was also obtained with windows of 5 and 2 kb, so is robust to window size.
(PNG)
Data from all pairwise combinations are plotted on the same graph: red dots indicate comparisons between lab and field populations, green dots field vs field populations and blue dots are lab vs lab populations. The gene content in the DXY peaks is shown in the table below.
(PNG)
(A) Histograms of folded allele frequency spectra of each S. mansoni population. (B) Permutation tests to compare ECDFs from field and laboratory populations. We conducted pairwise comparisons between all populations examined (36 pairwise comparisons). We then calculated the mean K-S statistic for 16 comparisons of field and lab populations, and compared this to 10,000 randomly permuted datasets. The histogram shows the distribution of permuted values, with the empirical value and one-tailed permutation test statistic marked by the red arrow.
(PNG)
(A) r2 showing LD decay with physical distance between exonic SNPs along the chromosomes. Mean was calculated over 1 kb windows following the log scale except for SmBRE for which all data points were plotted. (B) Bar plot showing position when r2 = 0.5 (LD0.5) for field and laboratory populations. A t-test was used to compare field and laboratory populations.
(TIFF)
Box and whisker plot showing LD (squared correlation coefficient, R2) of unlinked variants in each population.
(TIFF)
(A) Line plot showing estimated census size over time. We used detailed life cycle maintenance records to estimate P(0) and calculated numbers of parasites/snail assuming a Poisson distribution. Note that these Nc values are likely to be systematic overestimates. We conduct hamster infections with newly infected batches of snails to which we add surviving infected snails from the prior life cycle maintenance. Therefore, the proportion of uninfected snails (P(0)) will be underestimated, and Poisson estimates of numbers of parasite genotypes per snail will be overestimated. The actual Nc values are likely to be somewhat lower. (B) Bar plot showing the harmonic mean of the Nc for each population. The error bars represent a 95% confidence interval. (C) Scatter plot showing the relationship between Ne as calculated by COLONY (filled circle) and NeEstimator (open circle) for each population. The lines represent a linear regression model, and the corresponding Pearson correlation coefficients are displayed in accordance with the legend of the tool used.
(PDF)
Acknowledgments
Snails infected with SmNMRI parasites were provided by the Schistosomiasis Resource Center of the Biomedical Research Institute (Rockville, MD) through NIH-NIAID Contract HHSN272201700014I. We thank Sarah Schmid and Gabrielle Bate for conducting the monomiracidial snail infections and coordinating shipping and Dr. Margaret Mentink-Kane for her assistance.
Data Availability
The sequencing data generated for this project are available on Sequence Read Archive (SRA) under BioProject PRJNA1074697 (SmEG, SmOR, SmLE, SmNMRI) and PRJNA1170908 (SmBRE). Exome sequences from field samples have previously been published by Platt et al. [19] and are available on SRA under BioProjects PRJNA743359 (Brazil) and PRJNA560070 (Niger, Senegal, and Tanzania). All shell and R scripts written for this project are available at https://github.com/kathrinsjutzeler/sm_single_gt and Zenodo https://doi.org/10.5281/zenodo.10672478.
Funding Statement
This research was supported by a Graduate Research in Immunology Program training grant NIH T32 AI138944 (KSJ), and NIH R21 AI171601-02 (FDC, WL), R01 AI133749, R01 AI166049 (TJCA), and was conducted in facilities constructed with support from Research Facilities Improvement Program grant C06 RR013556 from the National Center for Research Resources. SNPRC research at Texas Biomedical Research Institute is supported by grant P51 OD011133 from the Office of Research Infrastructure Programs, NIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Stirewalt M, Lewis FA, Cousin CE, Leef JL. Cryopreservation of schistosomules of Schistosoma mansoni in quantity. Am J Trop Med Hyg. 1984;33(1):116–24. doi: 10.4269/ajtmh.1984.33.116
- 2. Lewis FA, Stirewalt MA, Souza CP, Gazzinelli G. Large-scale laboratory maintenance of Schistosoma mansoni, with observations on three schistosome/snail host combinations. J Parasitol. 1986;72(6):813–29. doi: 10.2307/3281829
- 3. Cody JJ, Ittiprasert W, Miller AN, Henein L, Mentink-Kane MM, Hsieh MH. The NIH-NIAID Schistosomiasis Resource Center at the Biomedical Research Institute: Molecular Redux. PLoS Negl Trop Dis. 2016;10(10):e0005022. doi: 10.1371/journal.pntd.0005022
- 4. Hassan AHM, Haberl B, Hertel J, Haas W. Miracidia of an Egyptian strain of Schistosoma mansoni differentiate between sympatric snail species. J Parasitol. 2003;89(6):1248–50. doi: 10.1645/GE-85R
- 5. Botros SS, Hammam OA, El-Lakkany NM, El-Din SHS, Ebeid FA. Schistosoma haematobium (Egyptian strain): rate of development and effect of praziquantel treatment. J Parasitol. 2008;94(2):386–94. doi: 10.1645/GE-1270.1
- 6. Fneich S, Théron A, Cosseau C, Rognon A, Aliaga B, Buard J, et al. Epigenetic origin of adaptive phenotypic variants in the human blood fluke Schistosoma mansoni. Epigenetics Chromatin. 2016;9:27. doi: 10.1186/s13072-016-0076-2
- 7. Rogers SH, Bueding E. Hycanthone resistance: development in Schistosoma mansoni. Science. 1971;172(3987):1057–8. doi: 10.1126/science.172.3987.1057
- 8. Fletcher M, LoVerde PT, Woodruff DS. Genetic variation in Schistosoma mansoni: enzyme polymorphisms in populations from Africa, Southwest Asia, South America, and the West Indies. Am J Trop Med Hyg. 1981;30(2):406–21. doi: 10.4269/ajtmh.1981.30.406
- 9. Minchella DJ, Lewis FA, Sollenberger KM, Williams JA. Genetic diversity of Schistosoma mansoni: quantifying strain heterogeneity using a polymorphic DNA element. Mol Biochem Parasitol. 1994;68(2):307–13. doi: 10.1016/0166-6851(94)90175-9
- 10. Pinto PM, Brito CF, Passos LK, Tendler M, Simpson AJ. Contrasting genomic variability between clones from field isolates and laboratory populations of Schistosoma mansoni. Mem Inst Oswaldo Cruz. 1997;92(3):409–14. doi: 10.1590/s0074-02761997000300019
- 11. Berriman M, Haas BJ, LoVerde PT, Wilson RA, Dillon GP, Cerqueira GC, et al. The genome of the blood fluke Schistosoma mansoni. Nature. 2009;460(7253):352–8. doi: 10.1038/nature08160
- 12. Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, et al. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 2012;6(1):e1455. doi: 10.1371/journal.pntd.0001455
- 13. Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, et al. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 2012;6(1):e1455. doi: 10.1371/journal.pntd.0001455
- 14. Buddenborg SK, Tracey A, Berger DJ, Lu Z, Doyle SR, Fu B, et al. Assembled chromosomes of the blood fluke Schistosoma mansoni provide insight into the evolution of its ZW sex-determination system. Genomics. 2021. doi: 10.1101/2021.08.13.456314
- 15. Davies CM, Webster JP, Woolhouse ME. Trade-offs in the evolution of virulence in an indirectly transmitted macroparasite. Proc Biol Sci. 2001;268(1464):251–7. doi: 10.1098/rspb.2000.1367
- 16.Gower CM, Webster JP. Fitness of indirectly transmitted pathogens: restraint and constraint. Evolution. 2004;58(6):1178–84. doi: 10.1111/j.0014-3820.2004.tb01698.x [DOI] [PubMed] [Google Scholar]
- 17.Couto FFB, Coelho PMZ, Araújo N, Kusel JR, Katz N, Jannotti-Passos LK, et al. Schistosoma mansoni: a method for inducing resistance to praziquantel using infected Biomphalaria glabrata snails. Mem Inst Oswaldo Cruz. 2011;106(2):153–7. doi: 10.1590/s0074-02762011000200006 [DOI] [PubMed] [Google Scholar]
- 18.Le Clec’h W, Chevalier FD, Mattos ACA, Strickland A, Diaz R, McDew-White M, et al. Genetic analysis of praziquantel response in schistosome parasites implicates a transient receptor potential channel. Sci Transl Med. 2021;13(625):eabj9114. doi: 10.1126/scitranslmed.abj9114 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Platt RN, Le Clec’h W, Chevalier FD, McDew-White M, LoVerde PT, de Assis RR, et al. Genomic analysis of a parasite invasion: colonization of the Americas by the blood fluke, Schistosoma mansoni. Mol Ecol. 2021. doi: 10.1101/2021.10.25.465783 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, et al. Linkage disequilibrium in the human genome. Nature. 2001;411(6834):199–204. doi: 10.1038/35075590 [DOI] [PubMed] [Google Scholar]
- 21.Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M. On the meaning and existence of an effective population size. Genetics. 2005;169(2):1061–70. doi: 10.1534/genetics.104.026799 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Jutzeler KS, Platt RN, Li X, Morales M, Diaz R, Le Clec’h W, et al. Molecular dissection of laboratory contamination between two schistosome populations. Parasit Vectors. 2024;17(1):528. doi: 10.1186/s13071-024-06588-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Dutrow EV, Serpell JA, Ostrander EA. Domestic dog lineages reveal genetic drivers of behavioral diversification. Cell. 2022;185(25):4737–4755.e18. doi: 10.1016/j.cell.2022.11.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Bower MA, McGivney BA, Campana MG, Gu J, Andersson LS, Barrett E, et al. The genetic origin and history of speed in the Thoroughbred racehorse. Nat Commun. 2012;3:643. doi: 10.1038/ncomms1644 [DOI] [PubMed] [Google Scholar]
- 25.Brotherstone S, Goddard M. Artificial selection and maintenance of genetic variance in the global dairy cow population. Philos Trans R Soc Lond B Biol Sci. 2005;360(1459):1479–88. doi: 10.1098/rstb.2005.1668 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Cole JB, Makanjuola BO, Rochus CM, van Staaveren N, Baes C. The effects of breeding and selection on lactation in dairy cattle. Anim Front. 2023;13: 55–63.doi: 10.1093/af/vfad044 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Núñez-León D, Cordero GA, Schlindwein X, Jensen P, Stoeckli E, Sánchez-Villagra MR, et al. Shifts in growth, but not differentiation, foreshadow the formation of exaggerated forms under chicken domestication. Proc Biol Sci. 2021;288(1953):20210392. doi: 10.1098/rspb.2021.0392 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Liu W, Chen L, Zhang S, Hu F, Wang Z, Lyu J, et al. Decrease of gene expression diversity during domestication of animals and plants. BMC Evol Biol. 2019;19(1):19. doi: 10.1186/s12862-018-1340-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Albert FW, Somel M, Carneiro M, Aximu-Petri A, Halbwax M, Thalmann O, et al. A comparison of brain gene expression levels in domesticated and wild animals. PLoS Genet. 2012;8(9):e1002962. doi: 10.1371/journal.pgen.1002962 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Leitwein M, Gagnaire P-A, Desmarais E, Guendouz S, Rohmer M, Berrebi P, et al. Genome-wide nucleotide diversity of hatchery-reared Atlantic and Mediterranean strains of brown trout Salmo trutta compared to wild Mediterranean populations. J Fish Biol. 2016;89(6):2717–34. doi: 10.1111/jfb.13131 [DOI] [PubMed] [Google Scholar]
- 31.Liu A, Burke JM. Patterns of nucleotide diversity in wild and cultivated sunflower. Genetics. 2006;173(1):321–30. doi: 10.1534/genetics.105.051110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Cheever AW, Kamel IA, Elwi AM, Mosimann JE, Danner R. Schistosoma mansoni and S. haematobium infections in Egypt. II. Quantitative parasitological findings at necropsy. Am J Trop Med Hyg. 1977;26(4):702–16. doi: 10.4269/ajtmh.1977.26.702 [DOI] [PubMed] [Google Scholar]
- 33.Nei M, Maruyama T, Chakraborty R. The bottleneck effect and genetic variability in populations. Evolution. 1975;29(1):1–10. doi: 10.1111/j.1558-5646.1975.tb00807.x [DOI] [PubMed] [Google Scholar]
- 34.Stevens L, Martínez-Ugalde I, King E, Wagah M, Absolon D, Bancroft R, et al. Ancient diversity in host-parasite interaction genes in a model parasitic nematode. Nat Commun. 2023;14(1):7776. doi: 10.1038/s41467-023-43556-w [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Beltran S, Cézilly F, Boissier J. Genetic dissimilarity between mates, but not male heterozygosity, influences divorce in schistosomes. PLoS One. 2008;3(10):e3328. doi: 10.1371/journal.pone.0003328 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Carlson CS, Thomas DJ, Eberle MA, Swanson JE, Livingston RJ, Rieder MJ, et al. Genomic regions exhibiting positive selection identified from dense genotype data. Genome Res. 2005;15(11):1553–65. doi: 10.1101/gr.4326505 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Jutzeler KS, Le Clec’h W, Chevalier FD, Anderson TJC. Contribution of parasite and host genotype to immunopathology of schistosome infections. Parasit Vectors. 2024;17(1):203. doi: 10.1186/s13071-024-06286-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Le Clec’h W, Chevalier FD, McDew-White M, Menon V, Arya G-A, Anderson TJC. Genetic architecture of transmission stage production and virulence in schistosome parasites. Virulence. 2021;12(1):1508–26. doi: 10.1080/21505594.2021.1932183 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Le Clecʼh W, Diaz R, Chevalier FD, McDew-White M, Anderson TJC. Striking differences in virulence, transmission and sporocyst growth dynamics between two schistosome populations. Parasit Vectors. 2019;12(1):485. doi: 10.1186/s13071-019-3741-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Le Clec’h W, Chevalier FD, Jutzeler K, Anderson TJC. No evidence for schistosome parasite fitness trade-offs in the intermediate and definitive host. Parasit Vectors. 2023;16(1):132. doi: 10.1186/s13071-023-05730-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Criscione CD, Valentim CLL, Hirai H, LoVerde PT, Anderson TJC. Genomic linkage map of the human blood fluke Schistosoma mansoni. Genome Biol. 2009;10(6):R71. doi: 10.1186/gb-2009-10-6-r71 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Dutheil JY. On the estimation of genome-average recombination rates. Genetics. 2024;227(2):iyae051. doi: 10.1093/genetics/iyae051 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Venu V, Harjunmaa E, Dreau A, Brady S, Absher D, Kingsley DM, et al. Fine-scale contemporary recombination variation and its fitness consequences in adaptively diverging stickleback fish. Nat Ecol Evol. 2024;8(7):1337–52. doi: 10.1038/s41559-024-02434-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Chalmers IW, McArdle AJ, Coulson RM, Wagner MA, Schmid R, Hirai H, et al. Developmentally regulated expression, alternative splicing and distinct sub-groupings in members of the Schistosoma mansoni venom allergen-like (SmVAL) gene family. BMC Genomics. 2008;9:89. doi: 10.1186/1471-2164-9-89 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Kalantari P, Shecter I, Hopkins J, Pilotta Gois A, Morales Y, Harandi BF, et al. The balance between gasdermin D and STING signaling shapes the severity of schistosome immunopathology. Proc Natl Acad Sci U S A. 2023;120(13):e2211047120. doi: 10.1073/pnas.2211047120 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Lu Z, Sankaranarayanan G, Rawlinson KA, Offord V, Brindley PJ, Berriman M, et al. The transcriptome of Schistosoma mansoni developing eggs reveals key mediators in pathogenesis and life cycle propagation. Front Trop Dis. 2021;2. doi: 10.3389/fitd.2021.713123 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Mukendi JPK, Nakamura R, Uematsu S, Hamano S. Interleukin (IL)-33 is dispensable for Schistosoma mansoni worm maturation and the maintenance of egg-induced pathology in intestines of infected mice. Parasit Vectors. 2021;14(1):70. doi: 10.1186/s13071-020-04561-w [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Chevalier FD. SWAMP: Single Worm Analysis of Movement Pipeline. Available from: https://github.com/fdchevalier/SWAMP
- 49.Wheeler NJ, Gallo KJ, Rehborg EJG, Ryan KT, Chan JD, Zamanian M. wrmXpress: A modular package for high-throughput image analysis of parasitic and free-living worms. PLoS Negl Trop Dis. 2022;16(11):e0010937. doi: 10.1371/journal.pntd.0010937 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Diaz Soria CL, Lee J, Chong T, Coghlan A, Tracey A, Young MD, et al. Single-cell atlas of the first intra-mammalian developmental stage of the human parasite Schistosoma mansoni. Nat Commun. 2020;11(1):6411. doi: 10.1038/s41467-020-20092-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Nanes Sarfati D, Li P, Tarashansky AJ, Wang B. Single-cell deconstruction of stem-cell-driven schistosome development. Trends Parasitol. 2021;37(9):790–802. doi: 10.1016/j.pt.2021.03.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Wendt G, Zhao L, Chen R, Liu C, O’Donoghue AJ, Caffrey CR, et al. A single-cell RNA-seq atlas of Schistosoma mansoni identifies a key regulator of blood feeding. Science. 2020;369(6511):1644–9. doi: 10.1126/science.abb7709 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Wendt GR, Reese ML, Collins JJ 3rd. SchistoCyte Atlas: a single-cell transcriptome resource for adult schistosomes. Trends Parasitol. 2021;37(7):585–7. doi: 10.1016/j.pt.2021.04.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Anderson TJC, LoVerde PT, Le Clec’h W, Chevalier FD. Genetic crosses and linkage mapping in schistosome parasites. Trends Parasitol. 2018;34(11):982–96. doi: 10.1016/j.pt.2018.08.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Webster JP, Gower CM, Blair L. Do hosts and parasites coevolve? Empirical support from the Schistosoma system. Am Nat. 2004;164 Suppl 5:S33–51. doi: 10.1086/424607 [DOI] [PubMed] [Google Scholar]
- 56.Webster JP, Davies CM. Coevolution and compatibility in the snail-schistosome system. Parasitology. 2001;123 Suppl:S41–56. doi: 10.1017/s0031182001008071 [DOI] [PubMed] [Google Scholar]
- 57.Théron A. Chronobiology of trematode cercarial emergence: from data recovery to epidemiological, ecological and evolutionary implications. Adv Parasitol. 2015;88:123–64. doi: 10.1016/bs.apar.2015.02.003 [DOI] [PubMed] [Google Scholar]
- 58.Mitta G, Gourbal B, Grunau C, Knight M, Bridger JM, Théron A. The Compatibility between Biomphalaria glabrata snails and Schistosoma mansoni: an increasingly complex puzzle. Adv Parasitol. 2017;97:111–45. doi: 10.1016/bs.apar.2016.08.006 [DOI] [PubMed] [Google Scholar]
- 59.Rollinson D, Stothard JR, Southgate VR. Interactions between intermediate snail hosts of the genus Bulinus and schistosomes of the Schistosoma haematobium group. Parasitology. 2001;123 Suppl:S245–60. doi: 10.1017/s0031182001008046 [DOI] [PubMed] [Google Scholar]
- 60.Theron A, Rognon A, Gourbal B, Mitta G. Multi-parasite host susceptibility and multi-host parasite infectivity: a new approach of the Biomphalaria glabrata/Schistosoma mansoni compatibility polymorphism. Infect Genet Evol. 2014;26:80–8. doi: 10.1016/j.meegid.2014.04.025 [DOI] [PubMed] [Google Scholar]
- 61.Valentim CLL, Cioli D, Chevalier FD, Cao X, Taylor AB, Holloway SP, et al. Genetic and molecular basis of drug resistance and species-specific drug action in schistosome parasites. Science. 2013;342(6164):1385–9. doi: 10.1126/science.1243106 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Greenberg RM. New approaches for understanding mechanisms of drug resistance in schistosomes. Parasitology. 2013;140(12):1534–46. doi: 10.1017/S0031182013000231 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Melman SD, Steinauer ML, Cunningham C, Kubatko LS, Mwangi IN, Wynn NB, et al. Reduced susceptibility to praziquantel among naturally occurring Kenyan isolates of Schistosoma mansoni. PLoS Negl Trop Dis. 2009;3(8):e504. doi: 10.1371/journal.pntd.0000504 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Mwangi IN, Sanchez MC, Mkoji GM, Agola LE, Runo SM, Cupit PM, et al. Praziquantel sensitivity of Kenyan Schistosoma mansoni isolates and the generation of a laboratory strain with reduced susceptibility to the drug. Int J Parasitol Drugs Drug Resist. 2014;4(3):296–300. doi: 10.1016/j.ijpddr.2014.09.006 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Chevalier FD, Le Clec’h W, Berriman M, Anderson TJC. A single locus determines praziquantel response in Schistosoma mansoni. Antimicrob Agents Chemother. 2024;68(3):e0143223. doi: 10.1128/aac.01432-23 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Christoforou A, Dondrup M, Mattingsdal M, Mattheisen M, Giddaluru S, Nöthen MM, et al. Linkage-disequilibrium-based binning affects the interpretation of GWASs. Am J Hum Genet. 2012;90(4):727–33. doi: 10.1016/j.ajhg.2012.02.025 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Joiret M, Mahachie John JM, Gusareva ES, Van Steen K. Confounding of linkage disequilibrium patterns in large scale DNA based gene-gene interaction studies. BioData Mining. 2019;12(1). doi: 10.1186/s13040-019-0199-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Brekke TD, Steele KA, Mulley JF. Inbred or Outbred? Genetic Diversity in Laboratory Rodent Colonies. G3 (Bethesda). 2018;8(2):679–86. doi: 10.1534/g3.117.300495 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Mulvey M, Woodruff DS, Carpenter MP. Linkage Relationships of Seven Enzyme and Two Pigmentation Loci in the Snail Biomphalaria glabrata. Journal of Heredity. 1988;79(6):473–6. doi: 10.1093/oxfordjournals.jhered.a110554 [DOI] [PubMed] [Google Scholar]
- 70.Casellas J. Inbred mouse strains and genetic stability: a review. Animal. 2011;5(1):1–7. doi: 10.1017/s1751731110001667 [DOI] [PubMed] [Google Scholar]
- 71.Lee D, Zdraljevic S, Stevens L, Wang Y, Tanny RE, Crombie TA, et al. Balancing selection maintains hyper-divergent haplotypes in Caenorhabditis elegans. Nat Ecol Evol. 2021;5(6):794–807. doi: 10.1038/s41559-021-01435-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Mattick J, Libro S, Bromley R, Chaicumpa W, Chung M, Cook D, et al. X-treme loss of sequence diversity linked to neo-X chromosomes in filarial nematodes. PLoS Negl Trop Dis. 2021;15(10):e0009838. doi: 10.1371/journal.pntd.0009838 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Taylor MJ, Bilo K, Cross HF, Archer JP, Underwood AP. 16S rDNA phylogeny and ultrastructural characterization of Wolbachia intracellular bacteria of the filarial nematodes Brugia malayi, B. pahangi, and Wuchereria bancrofti. Exp Parasitol. 1999;91(4):356–61. doi: 10.1006/expr.1998.4383 [DOI] [PubMed] [Google Scholar]
- 74.Ash LR, Riley JM. Development of subperiodic Brugia malayi in the jird, Meriones unguiculatus, with notes on infections in other rodents. J Parasitol. 1970;56(5):969–73. doi: 10.2307/3277515 [DOI] [PubMed] [Google Scholar]
- 75.Saeung A, Hempolchom C, Baimai V, Thongsahuan S, Taai K, Jariyapan N, et al. Susceptibility of eight species members in the Anopheles hyrcanus group to nocturnally subperiodic Brugia malayi. Parasit Vectors. 2013;6:5. doi: 10.1186/1756-3305-6-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Ash LR. Chronic Brugia pahangi and Brugia malayi infections in Meriones unguiculatus. J Parasitol. 1973;59(3):442–7. doi: 10.2307/3278769 [DOI] [PubMed] [Google Scholar]
- 77.Lau Y-L, Lee W-C, Xia J, Zhang G, Razali R, Anwar A, et al. Draft genome of Brugia pahangi: high similarity between B. pahangi and B. malayi. Parasit Vectors. 2015;8:451. doi: 10.1186/s13071-015-1064-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Bi K, Vanderpool D, Singhal S, Linderoth T, Moritz C, Good JM. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales. BMC Genomics. 2012;13:403. doi: 10.1186/1471-2164-13-403 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Tucker MS, Karunaratne LB, Lewis FA, Freitas TC, Liang Y-S. Schistosomiasis. Curr Protoc Immunol. 2013;103:19.1.1–19.1.58. doi: 10.1002/0471142735.im1901s103 [DOI] [PubMed] [Google Scholar]
- 80.Jutzeler KS, Platt RN, Li X, Morales M, Diaz R, Le Clec'h W, Chevalier FD, Anderson TJC. Molecular dissection of laboratory contamination between two schistosome populations. Parasit Vectors. 2024. Dec 22;17(1):528. doi: 10.1186/s13071-024-06588-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Valentim CLL, LoVerde PT, Anderson TJC, Criscione CD. Efficient genotyping of Schistosoma mansoni miracidia following whole genome amplification. Mol Biochem Parasitol. 2009;166(1):81–4. doi: 10.1016/j.molbiopara.2009.02.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Krueger F, James F, Ewels P, Afyounian E, Weinstein M, Schuster-Boeckler B. TrimGalore. Available from: https://github.com/FelixKrueger/TrimGalore
- 83.Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi: 10.1093/bioinformatics/btp324 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303. doi: 10.1101/gr.107524.110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156–8. doi: 10.1093/bioinformatics/btr330 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. doi: 10.1093/bioinformatics/btq033 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28(24):3326–8. doi: 10.1093/bioinformatics/bts606 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64. doi: 10.1101/gr.094052.109 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10(2):giab008. doi: 10.1093/gigascience/giab008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Pedersen BS, Quinlan AR. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics. 2018;34(5):867–8. doi: 10.1093/bioinformatics/btx699 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Korunes KL, Samuk K. pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Mol Ecol Resour. 2021;21(4):1359–68. doi: 10.1111/1755-0998.13326 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Paradis E. pegas: an R package for population genetics with an integrated-modular approach. Bioinformatics. 2010;26(3):419–20. doi: 10.1093/bioinformatics/btp696 [DOI] [PubMed] [Google Scholar]
- 93.Nychka D, Furrer R, Paige J, Sain S. Fields: Tools for spatial data. Boulder, CO, USA: University Corporation for Atmospheric Research; 2021. https://github.com/dnychka/fieldsRPackage [Google Scholar]
- 94.Borchers HW. pracma: Practical Numerical Math Functions. 2023. Available from: https://CRAN.R-project.org/package=pracma
- 95.Gourbière S, Morand S, Waxman D. Fundamental factors determining the nature of parasite aggregation in hosts. PLoS One. 2015;10(2):e0116893. doi: 10.1371/journal.pone.0116893 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.McVinish R, Lester RJG. Measuring aggregation in parasite populations. J R Soc Interface. 2020;17(165):20190886. doi: 10.1098/rsif.2019.0886 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Do C, Waples RS, Peel D, Macbeth GM, Tillett BJ, Ovenden JR. NeEstimator v2: re-implementation of software for the estimation of contemporary effective population size (Ne) from genetic data. Mol Ecol Resour. 2014;14(1):209–14. doi: 10.1111/1755-0998.12157 [DOI] [PubMed] [Google Scholar]
- 98.Jones OR, Wang J. COLONY: a program for parentage and sibship inference from multilocus genotype data. Mol Ecol Resour. 2010;10(3):551–5. doi: 10.1111/j.1755-0998.2009.02787.x [DOI] [PubMed] [Google Scholar]
- 99.Gosselin T. Thierrygosselin/radiator: update. Zenodo. 2020. doi: 10.5281/ZENODO.3687060 [DOI] [Google Scholar]
- 100.Knaus BJ, Grünwald NJ. vcfr: a package to manipulate and visualize variant call format data in R. Mol Ecol Resour. 2017;17(1):44–53. doi: 10.1111/1755-0998.12549 [DOI] [PubMed] [Google Scholar]
- 101.Kuo C‐H, Janzen FJ. Bottlesim: a bottleneck simulation program for long‐lived species with overlapping generations. Molecular Ecology Notes. 2003;3(4):669–73. doi: 10.1046/j.1471-8286.2003.00532.x [DOI] [Google Scholar]
- 102.Kassambara A. rstatix: Pipe-Friendly Framework for Basic Statistical Tests. 2023. Available from: https://CRAN.R-project.org/package=rstatix






