Genome Research. 2015 May;25(5):667–678. doi: 10.1101/gr.187237.114

Full-genome evolutionary histories of selfing, splitting, and selection in Caenorhabditis

Cristel G Thomas 1,10, Wei Wang 1,10, Richard Jovelin 1, Rajarshi Ghosh 2,3, Tatiana Lomasko 4, Quang Trinh 4, Leonid Kruglyak 2,5,6, Lincoln D Stein 4,7,8, Asher D Cutter 1,9
PMCID: PMC4417115  PMID: 25783854

Abstract

The nematode Caenorhabditis briggsae is a model for comparative developmental evolution with C. elegans. Worldwide collections of C. briggsae have implicated an intriguing history of divergence among genetic groups separated by latitude, or by restricted geography, that is being exploited to dissect the genetic basis to adaptive evolution and reproductive incompatibility; yet, the genomic scope and timing of population divergence is unclear. We performed high-coverage whole-genome sequencing of 37 wild isolates of the nematode C. briggsae and applied a pairwise sequentially Markovian coalescent (PSMC) model to 703 combinations of genomic haplotypes to draw inferences about population history, the genomic scope of natural selection, and to compare with 40 wild isolates of C. elegans. We estimate that a diaspora of at least six distinct C. briggsae lineages separated from one another approximately 200,000 generations ago, including the “Temperate” and “Tropical” phylogeographic groups that dominate most samples worldwide. Moreover, an ancient population split in its history approximately 2 million generations ago, coupled with only rare gene flow among lineage groups, validates this system as a model for incipient speciation. Low versus high recombination regions of the genome give distinct signatures of population size change through time, indicative of widespread effects of selection on highly linked portions of the genome owing to extreme inbreeding by self-fertilization. Analysis of functional mutations indicates that genomic context, owing to selection that acts on long linkage blocks, is a more important driver of population variation than are the functional attributes of the individually encoded genes.


The record of natural selection in shaping the genetic basis to organismal form and function of a species is inscribed in the genomes of its constituent individuals. Comparisons of genome sequences for each of the related nematodes Caenorhabditis elegans and C. briggsae have revealed powerful insights into the evolution of functional novelty and constraint in phenotypes and genetic pathways (Cutter et al. 2009; Marri and Gupta 2009; Thomas et al. 2012; Haag and Liu 2013; Verster et al. 2014). The high quality C. briggsae genome assembly facilitated such analysis (Stein et al. 2003; Hillier et al. 2007; Ross et al. 2011), and genomic analysis of populations of individuals provides a powerful means to further characterize evolution on contemporary timescales to understand the origins of novelty and constraint (The 1000 Genomes Project Consortium 2012; Langley et al. 2012; Brandvain et al. 2014). Indeed, key questions remain to be solved: How do genomes respond to the simultaneous pressures of mutation, natural selection, and genetic linkage—especially when a novel reproductive mode, facultative self-fertilization, has evolved in the ancestry of a species?

C. briggsae is similar to C. elegans in many ways, most notably in their streamlined morphology, their amenability to genetic and experimental manipulation, and in both being composed primarily of self-fertilizing hermaphrodites that are found around the globe. However, C. briggsae is distinctive in having more molecular and phenotypic wild diversity, which is divided along latitudinal phylogeographic lines (Cutter 2006; Raboin et al. 2010; Félix et al. 2013), and in being partly interfertile with its male–female (dioecious) sister species C. nigoni (Woodruff et al. 2010; Kozlowska et al. 2012; Félix et al. 2014). Some strain combinations within C. briggsae also appear to show reproductive incompatibilities and outbreeding depression (Dolgin et al. 2008; Ross et al. 2011; Baird and Stonesifer 2012). However, the extent of genetic exchange and admixture across the genome within this species, as well as a full depiction of its evolutionary history, has remained elusive. Moreover, the extensive linkage disequilibrium conferred on the genome by self-fertilizing reproduction is thought to interact with selection to shape chromosome-scale patterns of genetic diversity (Cutter and Choi 2010; Cutter and Payseur 2013). Consequently, selection, self-fertilization, and gene flow all likely interact to control diversity and divergence in ways requiring genomic-scale population information to discern.

These features make C. briggsae a powerful tool for dissecting evolutionary pattern and process in connection with trait divergence in nature, especially in combination with its deep experimental toolkit (Koboldt et al. 2010; Ross et al. 2011; Frøkjaer-Jensen 2013). Indeed, this species is now an active target of research into the molecular basis of trait variation and adaptation (Baird et al. 2005; Prasad et al. 2011; Ross et al. 2011; Stegeman et al. 2013), the evolution of development (Delattre and Félix 2001; Hill et al. 2006; Guo et al. 2009; Marri and Gupta 2009), and speciation (Woodruff et al. 2010; Baird and Stonesifer 2012; Kozlowska et al. 2012; Yan et al. 2012). Yet, the limited genomic scope of understanding for its natural variation constrains our ability to fully exploit it. Here we provide the population genomic framework for relating evolutionary pressures and demographic histories to their genomic signatures in a global sample of C. briggsae.

Results

Low recombination regions show drastic skews in polymorphism

We sequenced 37 wild isolate genomes of C. briggsae to high coverage (median 32×) (Supplemental Table 1) and identified a total of 2.70 million single nucleotide polymorphisms (SNPs) and 329,000 short indel variants. Because of the high rate of self-fertilization in this species, each strain's genome sequence is effectively homozygous and provides a single haploid genome. We first focused our analyses on the sample of 26 strains (including the reference strain genome) from the so-called “Tropical” circumglobal phylogeographic group (Supplemental Table 1), which excludes some genetically distinct strains also derived from low-latitude locations (Cutter et al. 2006; Félix et al. 2013). Polymorphisms in this Tropical sample showed a greater than fourfold enrichment on the chromosome arms compared to the central and tip chromosomal domains (Fig. 1). Both arm and center regions are euchromatic, but arms experience higher meiotic recombination rates and lower densities of genes (Supplemental Fig. 1; Stein et al. 2003; Cutter and Choi 2010; Ross et al. 2011). Moreover, SNPs in the central chromosome domains show a greater skew toward rare variants in the population than do SNPs in the high-recombination rate arm regions (Fig. 1). We quantified this skew in variant frequencies with Tajima's D (Fig. 1), the values of which fluctuate around the neutral expectation throughout much of the chromosome arm regions, whereas the chromosome centers approach the minimum possible value given the sample size (Supplemental Fig. 2). These patterns of polymorphism implicate selection having eliminated linked polymorphism across megabase spans and disproportionately so in low-recombination portions of chromosomes (Sella et al. 2009; Cutter and Payseur 2013).
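As an illustration of the frequency-spectrum summary used above, the following is a minimal sketch of Tajima's D computed from a set of haploid genomes. The function name and the 0/1 string encoding of alleles are assumptions made for the example; this is not the pipeline used in the study.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length strings of '0'/'1', one string per haploid genome.

    Returns None when D is undefined (fewer than 4 genomes or no segregating sites).
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # Count of the '1' allele at each site; keep only segregating sites.
    counts = [sum(h[i] == "1" for h in haplotypes) for i in range(length)]
    seg = [c for c in counts if 0 < c < n]
    s = len(seg)
    if n < 4 or s == 0:
        return None
    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in seg)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D contrasts pi with Watterson's estimator S / a1.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: singletons only (excess of rare alleles) versus balanced alleles.
rare = ["1000", "0100", "0010", "0001"]
balanced = ["11", "11", "00", "00"]
print(tajimas_d(rare), tajimas_d(balanced))
```

A sample dominated by rare variants, as described for chromosome centers, yields a negative D, whereas intermediate-frequency variants yield a positive D.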

Figure 1.

Patterns of genetic diversity depend on genomic architecture. (A) Nucleotide polymorphism of silent sites in 20-kb windows is depressed in chromosome centers (and tips) compared to arm domains for Tropical strains (median of 20-kb windows on arms πsil = 0.165%; centers πsil = 0.041%; Wilcoxon χ2 = 1996.3; P < 0.0001). However, absolute divergence (Dxy) between Tropical and the distant Kerala group strains is largely insensitive to chromosomal domain. The recombination-associated domain structure of the X chromosome is less pronounced than for autosomes (Cutter and Choi 2010; Ross et al. 2011). (B) Chromosome centers also show more skew in the site frequency spectrum, indicative of an excess of rare alleles, which is expected to result from the interaction of selection and linkage. These qualitative patterns are consistent with analysis restricted to synonymous sites (Supplemental Fig. 10), indicating that differential constraint in noncoding regions does not drive the observed genomic patterns of polymorphism.

To further understand these molecular evolutionary patterns, we quantified the genomic distribution of divergence of the Tropical population relative to the “Kerala” strains that are thought to have a relatively ancient split within C. briggsae (Cutter et al. 2010). With a molecular clock, we estimate the time to the most recent common ancestor of known C. briggsae strains to be approximately 2 million generations ago (∼200,000 yr, assuming 10 generations per yr). Using Kerala strains to infer ancestral and derived alleles in the Tropical sample allowed us to screen chromosomes for regions with an unusual incidence of new derived variants, which is indicative of recent positive selection (Fay and Wu 2000; Zeng et al. 2006). Interestingly, despite the overall skew toward rare variants in chromosome centers (Fig. 1), centers did not show a biased incidence of high-frequency derived variants (Supplemental Fig. 2). This result suggests that the genomic distributions of polymorphism in C. briggsae might more likely derive from linked selection associated with removal of deleterious mutations (“background selection”) (Charlesworth et al. 1993; Fay and Wu 2000) and contrasts with some features of the frequency spectrum in the self-fertile nematode Pristionchus pacificus (Rödelsperger et al. 2014).
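The molecular-clock conversion can be sketched as follows, assuming a strict clock in which divergence accumulates along both lineages (T = d / 2μ). The divergence and mutation-rate values in the usage lines are illustrative placeholders rather than values taken from this study; only the 10 generations per year comes from the text.

```python
def split_time_generations(divergence, mu):
    """Generations since two lineages split, under a strict molecular clock.

    divergence: per-site sequence divergence between the lineages
    mu: per-site, per-generation mutation rate (an assumed input)
    """
    return divergence / (2.0 * mu)


def generations_to_years(generations, gens_per_year=10.0):
    """Convert generations to years for a given number of generations per year."""
    return generations / gens_per_year


# Illustrative inputs only (hypothetical divergence and mutation rate).
t_gen = split_time_generations(divergence=0.0108, mu=2.7e-9)
print(round(t_gen), round(generations_to_years(t_gen)))
```

With 10 generations per year, 2 million generations correspond to 200,000 yr, as stated above.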

In contrast to patterns of polymorphism, we found that absolute sequence divergence between Tropical and Kerala strains does not differ markedly among chromosomal domains (Fig. 1). We can ask, however, whether the slight elevation in divergence on the arms could be explained by incomplete lineage sorting of ancestral polymorphism (median of 20-kb windows of silent sites on arms Dxy = 0.0122; centers Dxy = 0.0113). Presuming that the highly selfing C. briggsae common ancestor had levels of polymorphism equivalent to those of extant Tropical strains (median of 20-kb windows on arms πsil = 0.00165; centers πsil = 0.000409), divergence in chromosome centers would actually be predicted to be 10.2% less than on arms, compared with the observed 7.6% reduction. Consequently, this implies that C. briggsae populations were likely moderately larger and more diverse in the past compared to present-day Tropical strains (see subsequent PSMC analysis for evidence supportive of this hypothesis) (Supplemental Fig. 3). These findings are consistent with the lack of evidence for differences in mutational input for chromosome arms and centers, based on mutation accumulation experiments (Denver et al. 2009, 2012) and are inconsistent with recombination-associated mutation (Lercher and Hurst 2002; Cutter and Payseur 2013). Interestingly, some chromosomes’ arms show lower divergence than centers using relative metrics (i.e., FST) (Supplemental Fig. 4) in contrast to the opposite pattern with absolute divergence metrics (i.e., Dxy), which is consistent with differential gene flow and/or incomplete lineage sorting across the genome (Noor and Bennett 2009; Pease and Hahn 2013; Cruickshank and Hahn 2014). Altogether, we conclude that patterns of polymorphism and divergence for C. briggsae are most consistent with selection at linked sites having induced the distinct molecular evolutionary signatures in low-recombination centers relative to high-recombination arms.
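One way to reproduce the arithmetic behind the predicted reduction is sketched below. It assumes that absolute divergence is the sum of post-split mutations and ancestral polymorphism, so that the arm-versus-center gap in divergence equals the gap in ancestral polymorphism. This is one reading of the calculation; the function names are hypothetical.

```python
def predicted_center_reduction(pi_arm, pi_center, dxy_arm):
    """Expected fractional drop in center divergence relative to arms if the
    ancestor carried the same arm/center polymorphism as the extant sample."""
    return (pi_arm - pi_center) / dxy_arm


def observed_center_reduction(dxy_arm, dxy_center):
    """Observed fractional drop in center divergence relative to arms."""
    return (dxy_arm - dxy_center) / dxy_arm


# Median values for 20-kb windows of silent sites, as quoted in the text.
predicted = predicted_center_reduction(0.00165, 0.000409, 0.0122)
observed = observed_center_reduction(0.0122, 0.0113)
print(round(100 * predicted, 1), round(100 * observed, 1))
```

The rounded medians give a predicted reduction of about 10.2% and an observed reduction of about 7.4%, close to the reported 7.6%; the small gap presumably reflects rounding of the published medians.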

Within the Tropical group strains, the chromosomal extent of linkage disequilibrium (LD) in C. briggsae is less extreme than in 39 strains of C. elegans (Fig. 2). Nevertheless, LD in C. briggsae occurs between polymorphisms on different chromosomes to a greater extent than expected from sample size alone (Fig. 2) and much more than is observed in the selfing nematode P. pacificus (Rödelsperger et al. 2014). Such high interchromosomal LD is consistent with extreme self-fertilization as the dominant mode of reproduction in nature and a genetically effective outcrossing rate <0.0011 per generation, provided Ne > 10,000 (Supplemental Fig. 5).
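For readers unfamiliar with the r2 measure of linkage disequilibrium, a minimal sketch for two biallelic sites in a haploid sample follows. The 1/n background value is a common approximation for unlinked sites and is an assumption here; it may differ from the exact expression of Weir and Hill (1980).

```python
def r_squared(site_a, site_b):
    """r^2 between two biallelic sites; each argument lists 0/1 alleles,
    one entry per haploid genome, in the same strain order.

    Returns None if either site is monomorphic in the sample.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of the haplotype carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


def approx_background_r2(n_haploid):
    """Rough sampling expectation of r^2 for unlinked sites (assumed ~1/n)."""
    return 1.0 / n_haploid


print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))  # complete association
print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))  # no association
print(approx_background_r2(25))
```

Interchromosomal r2 values well above such a sampling baseline are what point to very low effective outcrossing.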

Figure 2.

The decay of linkage disequilibrium (r2) is more rapid in C. briggsae than C. elegans along every chromosome. The sex chromosome is not distinctive relative to autosomes in terms of linkage disequilibrium (LD) decay in either species (A), although C. elegans Chromosomes I and IV have elevated LD, and C. briggsae Chromosome II shows reduced LD compared to other chromosomes. The interchromosomal LD for C. briggsae (B) spans a narrower range of mean values among chromosome pairs than C. elegans (C), although both species have more LD between chromosomes than expected (horizontal lines). Horizontal lines indicate the background LD expected given the sample size (Weir and Hill 1980). C. briggsae strains include 25 Tropical strains (excluding reference strain AF16); C. elegans includes 39 strains (excluding Hawaiian CB4856). LD calculations exclude singleton polymorphisms.

A diaspora of C. briggsae populations

We next constructed neighbor networks to visualize genetic distances using a set of 439,139 SNPs with complete information across all 38 strains (Fig. 3). The genetic affiliations of strains imply a rapid succession of splitting events in a short time that gave rise to diverse genomic haplotypes, leading us to refer to the resulting populations as a “diaspora” of C. briggsae lineages. To explore the robustness of our inference of strain and phylogeographic affinity, we applied several alternative approaches that emphasize different features of the genomic data. First, we assessed population differentiation more formally with ADMIXTURE (Alexander et al. 2009) and identified up to eight possible subpopulations corresponding to genetic clusters observed in the neighbor network, with K = 4 minimizing the cross-validation error (Fig. 3). Moving from K = 2 to 8, first the large sample of Tropical strains is distinguished from all others, after which the Kerala samples are revealed as distinct, followed by the Taiwanese strain pair, with further subdivisions until K = 8 that separate the Montreal, Hubei, and Nairobi strains, as well as splitting the Tropical strains into two subgroups (Supplemental Fig. 6). The strains from Hubei–Montreal–Nairobi–Taiwan do not show evidence of simply being a group derived from admixture between the four Temperate and 26 Tropical clade strains, based on nonsignificance for the F(3) test of Reich et al. (2009) (F(3) = 0.009897; SE = 0.000741; Z = 13.35). Notably, chromosomes differ in the number of populations that minimize the cross-validation error identified by ADMIXTURE (K = 3 for Chromosomes I, III, IV, and V; K = 4 for Chromosomes II, X, and the full genome) (Supplemental Fig. 6), suggesting varying degrees of gene flow for different chromosomes. A caveat to the ADMIXTURE analysis is that the strong linkage disequilibrium across the C. briggsae genome is not properly accounted for, and several potential genetic populations are represented by only one or two strains; this may explain the grouping together of the Nairobi, Montreal, Hubei, and Taiwan strains that are separated by long genetic distances (Fig. 3). The pattern of genomic haplotype chunk sharing between strains, as identified with ChromoPainter (Lawson et al. 2012; Yahara et al. 2013), recapitulates these trends of genetic differentiation (Fig. 3). Further analysis using TreeMix (Pickrell and Pritchard 2012) implicates plausible cases of gene flow between the genomes of geographically and genetically distinct strains. Although TreeMix should not be sensitive to ancestral polymorphism for a single ancestral population (Pickrell and Pritchard 2012), the topological positions of highest-weight migration events suggest the possibility that this method detects incomplete sorting of ancestral polymorphism from ancient population subdivision (Fig. 3; Supplemental Fig. 7). These diverse genome-wide analyses are all consistent with, and extend, the findings from phylogeographic studies of small numbers of loci (Cutter 2006; Cutter et al. 2010; Raboin et al. 2010; Félix et al. 2013), reinforcing the identity of the Temperate and Tropical phylogeographic groups, clarifying the genomic make-up of geographically restricted strains, and underlining the great genetic distance of basal strains from Kerala, India, to other C. briggsae.
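The logic of the F(3) admixture test can be sketched as follows. This is a naive estimator from population allele frequencies, without the finite-sample bias correction or block-jackknife standard error that dedicated software applies; the function names are hypothetical. A significantly negative value for a target population C relative to sources A and B indicates admixture, whereas a positive value does not.

```python
def f3_statistic(freq_c, freq_a, freq_b):
    """Naive f3(C; A, B): mean over SNPs of (c - a) * (c - b),
    where c, a, b are allele frequencies in the three populations."""
    terms = [(c - a) * (c - b) for c, a, b in zip(freq_c, freq_a, freq_b)]
    return sum(terms) / len(terms)


def z_score(estimate, standard_error):
    """Z score of an estimate given its standard error."""
    return estimate / standard_error


# Toy SNP where C sits midway between A and B, as under simple admixture.
print(f3_statistic([0.5], [0.0], [1.0]))
# Z recomputed from the rounded estimate and standard error quoted in the text.
print(z_score(0.009897, 0.000741))
```

Recomputing Z from the rounded values quoted above gives approximately 13.36, in line with the reported 13.35; the statistic is strongly positive, so it provides no support for the admixture model.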

Figure 3.

Diverse genomic analyses affirm the genetic distinctiveness of phylogeographic groups within C. briggsae. A neighbor network for all chromosomes (A) discriminates phylogeographic groups of strains, corresponding to the pan-global “Tropical” clade (red and pink strain labels), pan-global “Temperate” clade (blue), and genomic haplotype groups that exhibit restricted geographic origins around the globe: (Quebec) purple; (Nairobi) black; (Hubei) orange; (Taiwan) yellow; (Kerala) green. (B) The ADMIXTURE program minimizes the cross-validation error of ancestral relationships when it identifies four genetic clusters in this data set. (C) Permitting migration in the genomic ancestry of the 37 C. briggsae strains with TreeMix suggests multiple plausible instances of migration, although incomplete sorting of ancestral polymorphism provides an alternate interpretation. Heatmap above the genealogy indicates residual fit to a model with five migration events. (D) Haplotype clustering of the phylogeographic groups is recapitulated in a similar manner in ChromoPainter's genome-wide coancestry matrix (Euclidean log2). Dendrogram on left indicates strain clustering with the unlinked model; top dendrogram indicates strain clustering with the linked model. All analyses used the set of 439,139 SNPs with allele information present in all strains.

Our sample of four strains from the “Temperate” circumglobal phylogeographic group of C. briggsae showed two- to threefold lower polymorphism across the genome than the Tropical population sample, consistent with previous findings based on a few loci (Cutter et al. 2006). As observed for the Tropical strains, diversity for the Temperate group is reduced in chromosome centers compared to arms (median of 20-kb windows on arms πsil = 0.00053; centers πsil = 0.00027) (Supplemental Fig. 4). We next quantified genetic differentiation between the Temperate and Tropical samples along their chromosomes. In contrast to our observations for absolute divergence, relative measures of differentiation (FST) indicate stronger differentiation in chromosome centers (Fig. 4; Supplemental Fig. 4). These findings are fully consistent with the effects of linked selection on genomic regions with high gene density and low recombination rates (Charlesworth et al. 1997; Charlesworth 1998) and corroborate observations outside of nematodes, including humans and other mammals, birds, insects, and plants (Keinan and Reich 2010; Geraldes et al. 2011; Nachman and Payseur 2012; Pease and Hahn 2013; Renaut et al. 2013; Cruickshank and Hahn 2014), as well as the prediction that the genomes of selfing species will be particularly affected by linked selection (Charlesworth et al. 1997).
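The contrast between relative and absolute measures can be illustrated with a small sketch. It uses a Hudson-style estimator, FST = 1 − π(within) / Dxy, which is an assumption for the example; the estimator used in the study is not specified in this section. The toy haplotypes are hypothetical.

```python
from itertools import combinations, product


def _differences(x, y):
    """Number of sites at which two haplotype strings differ."""
    return sum(a != b for a, b in zip(x, y))


def pi_within(haps):
    """Mean per-site pairwise difference among haplotypes of one population."""
    pairs = list(combinations(haps, 2))
    return sum(_differences(x, y) for x, y in pairs) / (len(pairs) * len(haps[0]))


def dxy(pop1, pop2):
    """Mean per-site pairwise difference between two populations (absolute divergence)."""
    pairs = list(product(pop1, pop2))
    return sum(_differences(x, y) for x, y in pairs) / (len(pairs) * len(pop1[0]))


def fst(pop1, pop2):
    """Relative differentiation: 1 - (mean within-population diversity) / Dxy."""
    within = (pi_within(pop1) + pi_within(pop2)) / 2.0
    return 1.0 - within / dxy(pop1, pop2)


# Two toy regions with identical Dxy but different within-population diversity.
arm_like = (["0011", "0000"], ["1111", "1100"])     # diversity retained
center_like = (["0000", "0000"], ["1110", "1110"])  # diversity purged
print(dxy(*arm_like), fst(*arm_like))
print(dxy(*center_like), fst(*center_like))
```

Both toy regions have the same absolute divergence, yet the region with less within-population diversity shows higher FST, which mirrors the argument that linked selection in chromosome centers inflates relative differentiation.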

Figure 4.

Demographic history of C. briggsae and C. elegans populations. (A) Relative measures of population differentiation (FST) are greater in chromosome centers between Temperate and Tropical phylogeographic groups. In contrast, the Dxy measure of absolute divergence between populations shows the opposite trend, indicative of selection at linked sites being stronger in the low recombination chromosome centers (Pease and Hahn 2013; Cruickshank and Hahn 2014). Window of 20 kb for silent sites along Chromosome I is shown as an exemplar of all chromosomes (Supplemental Fig. 2). Iterated pairwise sequential Markovian coalescent (PSMC) analysis of all C. briggsae (B,C) and C. elegans (D) genomes shows the history of population size change and population splitting. Each line represents the change in population size (Ne) through time inferred for a pair of genomes, with all pairs of haploid genomes among the 37 strains of C. briggsae and 40 strains of C. elegans superimposed to indicate biological replication in the inference of demographic patterns. PSMC curve profiles restricted to the upper right in B illustrate the deep divergence of Kerala strains to all others (approximately 2 million generations ago; green) and the more recent “diaspora” of several genetically distinct strain groups from each other nearly simultaneously 300,000–500,000 generations (Kgen) ago (purple). PSMC profiles within each of the Tropical (red, pink) and Temperate phylogeographic groups show Ne fluctuations in their past, with larger Ne in the distant past and a recent population split within the Tropical group (pink versus red). Only analyses of chromosome center domains are shown in B. PSMC profiles of C. briggsae strain pairs from within a phylogeographic group other than Temperate and Tropical are not shown. Rapid recent time Ne increases likely reflect an artifact of the PSMC algorithm in estimating Ne on short timescales (Li and Durbin 2011).
(C) PSMC profiles involving Tropical strain comparisons with all other strains partitioned according to chromosomal domain: (cyan) center domain; (magenta) arm domain. Low recombination chromosome centers have lower Ne and more recent coalescence, and the ancestral polymorphism that generates heterogeneity in the PSMC profiles of the “diaspora” differentiation of phylogeographic groups 300–500 Kgen ago is more constricted for chromosome centers. (D) PSMC analysis of C. elegans indicates a split of the Hawaiian CB4856 strain (blue; 30–50 Kgen ago) from all other strains in the sample (orange), and an overall strong decline in population size since then. Analyses of chromosome centers are shown, with analysis of arm regions in Supplemental Figure 8, excluding 14 of 703 C. briggsae strain pairs and 10 of 780 C. elegans strain pairs owing to spurious PSMC profiles.

Genomic histories of population size and differentiation are sensitive to selection and linkage

To investigate the population history of the Temperate and Tropical populations in relationship to representatives from geographically restricted genetic groups, we applied the pairwise sequential Markovian coalescent (PSMC) model of Li and Durbin (2011). We iteratively created 703 “pseudodiploid” genomes involving all pairwise combinations of the 38 strains (including the reference strain genome) to evaluate biological replication of the coalescent histories inferred by PSMC. Because of the strong recombination domain structure apparent in the C. briggsae genome, we performed analyses separately for arm and center chromosomal regions. The PSMC analysis revealed that the low-recombination centers of chromosomes have had a smaller effective population size (Ne) throughout their history and coalesce in the more recent past than chromosome arms (Fig. 4). This is consistent with our finding of greatly reduced polymorphism owing to linked selection in the central, low-recombination portions of chromosomes (Fig. 1).
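The construction of pseudodiploid genomes can be sketched as below. Enumerating all pairs of 38 haploid genomes gives the 703 combinations described above. The binned encoding ('K' for a window with at least one difference, 'T' otherwise) is loosely modeled on the input that PSMC consumes and is an assumption for the example, as are the function names and the handling of missing data (none here).

```python
from itertools import combinations


def pseudodiploid_pairs(strains):
    """All unordered pairs of haploid genomes, each pair forming one pseudodiploid."""
    return list(combinations(strains, 2))


def heterozygosity_bins(hap_a, hap_b, bin_size=100):
    """Encode a pseudodiploid as one character per window:
    'K' if the two haploid sequences differ anywhere in the window, else 'T'."""
    encoded = []
    for start in range(0, len(hap_a), bin_size):
        window_a = hap_a[start:start + bin_size]
        window_b = hap_b[start:start + bin_size]
        differs = any(x != y for x, y in zip(window_a, window_b))
        encoded.append("K" if differs else "T")
    return "".join(encoded)


# 38 C. briggsae genomes (37 isolates plus the reference); labels are placeholders.
strains = [f"strain_{i}" for i in range(38)]
print(len(pseudodiploid_pairs(strains)))
print(heterozygosity_bins("AAAACCCC", "AAAACCGC", bin_size=4))
```

Because each selfing strain is effectively homozygous, every "heterozygous" window in a pseudodiploid reflects a difference between two strains, which is what allows PSMC to recover their coalescent history.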

PSMC profiles for within-population genome pairs of Temperate and Tropical strains indicate elevated population size in the distant past (40,000–70,000 generations ago), with low population sizes (Ne < 20,000) in the more recent past (10,000–30,000 generations ago) (Fig. 4; Supplemental Fig. 8). The PSMC profiles ostensibly show rapid population growth within the last approximately 8000 generations, but this likely represents an artifact of PSMC having difficulty estimating Ne on very recent timescales (Li and Durbin 2011). As anticipated from patterns of polymorphism, PSMC indicates that the Tropical effective population size has been larger than the Temperate population through much of their histories (more than 30,000 generations ago) (Fig. 4).

We also used PSMC profiles to draw inferences about the timing of lineage splitting for geographically distinct genetic groups of C. briggsae. Lineage splitting can be inferred from the pseudodiploid PSMC profiles as a signal of rapid population size increase in the distant past, as the distinct genomic haplotypes from separated populations accrue mutations independently (Li and Durbin 2011). Our pseudodiploid genome combinations of genetically distinct strains revealed (1) an ancient split of strains from Kerala, India (more than 2 million generations ago); (2) a nearly simultaneous split of six distinct lineages occurring approximately 200,000–300,000 generations ago (Temperate, Tropical, and strains sampled in Montreal, Hubei, Taiwan, and Nairobi); and (3) a relatively recent separation of the Tropical strains into two subgroups 50,000–100,000 generations ago (Figs. 3, 4). Theory predicts that the low polymorphism chromosome centers should exhibit reduced heterogeneity in the timing of divergence among populations, owing to more complete lineage sorting (Charlesworth 1998; Cutter 2013; Pease and Hahn 2013). Indeed, interpopulation divergence of the chromosome center versus arm PSMC profiles is consistent with this expectation (Fig. 4; Supplemental Fig. 8). If C. briggsae populations pass through 10 generations per yr on average, then much of this dynamic worldwide intraspecific history that is detectable in the genome has taken place within just the last few hundred thousand years.

For comparison to the striking signatures of population structure in C. briggsae, we analyzed the PSMC profiles for all pairs of pseudodiploid genomes for 40 wild strains of C. elegans (Thompson et al. 2013) that exhibit little global phylogeographic structure (Sivasundar and Hey 2003; Barrière and Félix 2005; Cutter 2006; Andersen et al. 2012). This analysis revealed a modestly increasing population size for C. elegans during its ancient coalescent history, followed by a strong decline in effective size over the past approximately 10,000 generations, and a demographic split approximately 30,000 generations ago of the Hawaiian strain CB4856 from other wild isolates in the sample (Fig. 4; Supplemental Fig. 8). Like C. briggsae, chromosome centers in C. elegans show markedly depressed Ne and depth of coalescence compared to chromosome arms (Supplemental Fig. 8), indicative of the influence of especially strong linked selection in the gene dense and low recombination centers of chromosomes (Cutter and Payseur 2003; Rockman et al. 2010; Andersen et al. 2012).

Signatures of hyperdiversity in the dioecious ancestor

To further explore molecular evolutionary patterns in chromosomal centers compared to chromosomal arms, we quantified sequence divergence between species. No published genome annotation exists for C. nigoni, the sister species to C. briggsae (Kiontke et al. 2011; Félix et al. 2014). Therefore we computationally extracted and aligned a set of 6435 ortholog coding sequences for C. nigoni from draft genome sequence (Kumar et al. 2012), from which we computed rates of sequence divergence at synonymous sites (dS) and replacement sites (dN) (Supplemental Fig. 9). Synonymous site divergence between C. briggsae and C. nigoni averages 20.7%, and we estimate that C. briggsae shares a most recent common ancestor with C. nigoni approximately 35 million generations ago (3.5 million yr, assuming 10 generations per yr).

We found that synonymous-site interspecies divergence differs significantly among chromosomes (F(5,5085) = 72.0; P < 0.0001), with the X chromosome and Chromosome V having significantly higher rates of divergence than other chromosomes (Tukey's range test). The X chromosome also shows significantly higher divergence than autosomes for distant populations of C. briggsae (Tropical-Kerala silent-site Dxy; F(5,5256) = 294.7; P < 0.0001; Tukey's range test), suggesting that the mutation rate of the X chromosome exceeds that of the autosomes.

Curiously, genes in high-recombination chromosome arms exhibit higher dS than genes in low-recombination centers (dS = 0.259 versus 0.186; F(1,4854) = 766.8; P < 0.0001) (Fig. 5), a pattern also observed for the deeper divergence between C. briggsae and C. elegans (Cutter and Payseur 2003). Three scenarios could explain this disparity in synonymous site divergence between recombination domains: (1) stronger selection for biased codon usage that depresses dS among genes found in chromosome centers (Hershberg and Petrov 2008; Plotkin and Kudla 2011); (2) recombination-associated mutation (RAM) could elevate divergence in high-recombination regions (Lercher and Hurst 2002; Hellmann et al. 2003); or (3) stronger linked selection in low-recombination regions could lead to less ancestral polymorphism (AP) for a hyperdiverse ancestral species (Begun et al. 2007; Cutter and Choi 2010; Cruickshank and Hahn 2014), because observed neutral divergence is the sum of new mutational differences since speciation and the time to coalescence for polymorphisms in the ancestral species (Gillespie and Langley 1979). Codon bias is significantly stronger for genes in chromosome centers, but by only a small magnitude (mean ENCarm = 50.03, ENCcenter = 49.11; P < 0.0001). Current experimental data from Caenorhabditis do not support RAM (Denver et al. 2009, 2012), and divergence between distant populations within C. briggsae shows no similarly strong effect (Fig. 1; Supplemental Fig. 3). For the AP explanation to hold, we calculated that ancestral diversity would need to be reduced by >40% in chromosome centers, given ancestral hyperdiversity of 6%–8% at synonymous sites on chromosome arms (Supplemental Fig. 9; Cutter et al. 2013). Drosophila estimates indicate at least a 25% reduction, potentially >75% reduction, in diversity across the genome owing to linked selection (Comeron 2014; Elyashiv et al. 2014), but empirical estimates in outbreeding Caenorhabditis are not yet available.
It also remains possible that differing mutational or gene conversion dynamics of the recombination domains for selfing and outcrossing species could contribute to these observations, or chromosome rearrangements between species and the substantial genome reduction in C. briggsae (Thomas et al. 2012) could conspire to produce spurious ortholog inferences and artifactually strong disparity in dS between chromosome domains. Keeping these caveats in mind, the AP hypothesis appears to be consistent with the data, whereas codon bias and RAM seem less able to account for the magnitude of heterogeneity in dS.

Figure 5.

Divergence at synonymous sites for orthologs of C. briggsae and C. nigoni is higher for genes linked to arm domains than center domains for all chromosomes (all Wilcoxon test P ≤ 0.0035). The chromosomes also differ significantly from one another in average substitution rates (F(5,5085) = 72.0; P < 0.0001), with Chromosome I being lowest and Chromosomes V and X being highest (Tukey's range test). Horizontal lines indicate mean dS for genes in center domains (cyan, dS = 0.186) or arms (magenta, dS = 0.259) across all chromosomes. Loci with strong codon bias (ENC < 45) excluded from analysis.

Relaxed selection from high selfing is recorded in functional variation

In our panel of 38 C. briggsae genomes, we identified 209,482 SNPs that alter the amino acid sequence encoded by 18,697 genes. In addition, of the 329,281 small indel variants (1- to 60-bp long) that we identified, 10,250 occur in coding sequences and affect 6879 genes. Premature stop codons often arise in these genes as a result of frame-shift or non-frame-shift indel changes (5856 genes), and 320 genes have splice junction-spanning indels. Moreover, there are 1244 nonsense SNP variants that create premature stop codon (PSC) alleles in 1027 genes in addition to those genes with premature stops induced by indel variants (1736 genes have both indel- and SNP-induced PSCs). We also identify 356 genes with SNPs that change the stop codon in the reference genome gene annotation into an amino acid codon (stop codon losses [SCL]); given that the stop codon allele is usually extremely rare or is unique to the reference sequence (Fig. 6), we conclude that such sites identify premature stop mutations (or errors) in the reference gene annotation and that the true wild-type coding sequences extend farther downstream. In all, we find 18,697 of the 19,884 genes detected in the genome to harbor natural mutations that could alter the function of the encoded protein, complementing a similar resource for C. elegans (Thompson et al. 2013), to provide a valuable catalog of mutants for experimental analysis.
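The SNP classes described above (synonymous, missense, premature stop codon, stop codon loss) can be illustrated with a minimal sketch using the standard genetic code. The function name, the 0-based coordinate convention, and the toy coding sequence are assumptions for the example; the annotation pipeline used in the study is not reproduced here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding sequence (uppercase, length a multiple of 3)
    pos: 0-based position of the SNP within the CDS
    alt: alternate base
    Returns 'synonymous', 'missense', 'nonsense' (PSC), or 'stop_loss' (SCL).
    """
    offset = pos % 3
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop_loss"
    return "missense"


toy_cds = "ATGTGGTAA"  # Met-Trp-stop, a hypothetical example
print(classify_snp(toy_cds, 5, "A"))  # TGG -> TGA
print(classify_snp(toy_cds, 6, "C"))  # TAA -> CAA
print(classify_snp(toy_cds, 3, "C"))  # TGG -> CGG
print(classify_snp(toy_cds, 7, "G"))  # TAA -> TGA
```

The relative position of a nonsense SNP along the coding sequence, as plotted in Figure 6C, would then be 100 × pos / len(cds).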

Figure 6.

Missense and nonsense mutations are generally deleterious and selected against. (A) Windows of 20 kb sliding along Chromosome I indicate the lower polymorphism at nonsynonymous sites (πrep) than at synonymous sites (πsyn) for Tropical strains (see Supplemental Fig. 10 for all chromosomes). A slight trend of elevated πrep/πsyn in chromosome centers (B) is indicative of less effective selection in purging deleterious mutations from these regions of high linkage (median of 20-kb windows on arms πrep/πsyn = 0.259; centers πrep/πsyn = 0.290; Wilcoxon χ² = 15.84; P < 0.0001). The stacked histogram of πrep/πsyn in B shows the cumulative abundance of 20-kb windows with a given bin of πrep/πsyn across the six chromosomes, partitioned into chromosome arm and center domains. (C) Distribution of nonsense SNPs along coding sequences (CDS) expressed as percentage of the CDS length. (D) Distribution of the minor allele frequency (MAF) for different classes of SNPs: (red) synonymous; (blue) nonsynonymous; (orange) premature stop codons (PSC); (green) stop codon losses (SCL). PSC SNPs have significantly skewed low MAF values, indicating stronger selective constraints. In contrast, SCL SNPs have significantly higher MAF, suggesting misannotation of the reference stop codon or PSC mutations in the reference genome.

Our analysis of SNPs that create premature stop codons (PSC) indicates that such nonsense mutations are generally deleterious and disfavored by selection within wild contemporary C. briggsae populations. Specifically, nonsense alleles occur at significantly lower population frequency than do the minor alleles of missense or synonymous SNPs (χ² = 39.94, P < 0.0001; χ² = 56.62, P < 0.0001) (Fig. 6D). Moreover, PSC mutations are nonuniformly distributed along coding sequences (χ² = 92.37, P < 0.0001), being more prevalent in the last 10% of coding sequences (Fig. 6C) as seen in Drosophila (Hoehn et al. 2012; Lee and Reinhardt 2012). However, unlike in D. melanogaster (Lee and Reinhardt 2012), PSC-containing genes are enriched on the C. briggsae X chromosome (6.8% for the X versus 4.9% for autosomes; χ² = 17.46, P < 0.0001). Despite the general pattern of selection against PSC alleles, we find that they occur more commonly in genes that are expected to be subject to weaker selection in this species, namely, genes with low expression and genes with male-biased expression. Overall, the expression level of PSC-containing genes is 1.3-fold lower than other genes (t = 13.32, P < 0.0001) and their protein sequences evolve faster (dN/dS-PSC = 0.146; dN/dS-others = 0.106; t = 4.46, P < 0.0001). Moreover, male-biased genes contain a disproportionate load of these polymorphic nonsense mutations (χ² = 6.2987, P = 0.012), indicative of relaxed selection on male functions in C. briggsae.

To characterize genic influences of selection across the genome, we contrasted polymorphism (πrep, πsyn) within C. briggsae and divergence (dN, dS) relative to C. nigoni for nonsynonymous and synonymous coding sites (Fig. 6; Supplemental Fig. 10). Replacement site SNPs for the Tropical population sample are less common in chromosome centers than arms, mirroring the pattern for synonymous polymorphisms (median of 20-kb windows on arms πrep = 0.073%; centers πrep = 0.020%; Wilcoxon χ² = 1057.6; P < 0.0001). This finding is reminiscent of the observation in C. elegans that genomic context, owing to linked selection, is a more important driver of heritable variation in gene expression than are the functional attributes of the encoded genes (Rockman et al. 2010). Indeed, sliding windows of πrep/πsyn indicate that chromosome centers have an 11% relative excess of nonsynonymous SNPs, despite the paucity in absolute numbers of SNPs in centers (median of 20-kb windows on arms πrep/πsyn = 0.259; centers πrep/πsyn = 0.290; Wilcoxon χ² = 15.84; P < 0.0001) (Fig. 6). This relative overabundance of nonsynonymous SNPs in chromosome centers suggests that linkage interferes with the effective elimination of detrimental mutations to a greater extent in such regions. In contrast, divergence with C. nigoni at replacement sites does not depend on affiliation with chromosomal domains (median dN/dS arms = 0.0711; centers 0.0719; Wilcoxon χ² = 0.172; P = 0.68), implicating recent demography and linked selection within C. briggsae as being responsible for differential efficacy of selection on functional protein variation among chromosomal domains. Moreover, the distribution of πrep/πsyn relative to dN/dS and πrep/πsyn in related outbreeding species indicates that selection has failed to purge many deleterious nonsynonymous site mutations from the C. briggsae population (median πrep/πsyn = 0.275; dN/dS = 0.0727; mean C. brenneri [Dey et al. 2013] πrep/πneu = 0.026).

Discussion

This first full-genome characterization of Caenorhabditis population polymorphism reveals the striking evolutionary histories of self-fertilization, population splitting, and natural selection of C. briggsae and C. elegans. We demonstrate that natural selection and genetic linkage interact to eliminate both neutral and functional genetic variability in low-recombination regions of C. briggsae’s genome. Low-recombination chromosome centers are gene rich in this highly self-fertilizing species, thus compounding the influence of genetic hitchhiking and background selection owing to the high density of selected targets in strong genetic linkage (Maynard Smith and Haigh 1974; Charlesworth et al. 1993; Cutter and Payseur 2013). In contrast to C. elegans (Andersen et al. 2012) and P. pacificus (Rödelsperger et al. 2014), the absence of a strong signal of high-frequency derived polymorphisms in C. briggsae implicates the elimination of deleterious mutations by background selection in long linkage blocks as the primary force shaping genomic polymorphisms in this species (Kaiser and Charlesworth 2009; Cutter and Choi 2010; Cutter and Payseur 2013). This mark of selection on the genome also reveals itself in the profiles of coalescent history for both C. briggsae and C. elegans as analyzed with the Markovian coalescent modeling framework (Li and Durbin 2011). As a consequence, we find interpretations from PSMC to be sensitive to the interaction between selection and linkage, which may help to understand demographic histories under some circumstances (e.g., timing of population splitting). A further repercussion of such a strong interaction between linkage and selection in the selfing C. briggsae genome is that evolution on recent timescales will be determined to a large extent by where in the genome new mutations happen to arise, rather than by the individual functional effects of those mutations.

Plant species also reveal dramatic influences of self-fertilization on polymorphism in their genomes (Wright et al. 2013). The less diverse genome of self-fertile Mimulus nasutus, compared to outbreeding M. guttatus, also experiences strong linkage disequilibrium and an abundance of premature stop codon alleles (Brandvain et al. 2014). The recent origins of self-fertilization in Capsella rubella from its progenitor C. grandiflora, and between Arabidopsis thaliana and A. lyrata, have yielded similar effects (Foxe et al. 2009; Cao et al. 2011; Brandvain et al. 2013), as has the more distantly related selfing nematode P. pacificus (Rödelsperger et al. 2014). Unlike these species, however, we find that C. briggsae’s euchromatic genome architecture, which combines high gene density and low recombination rate, lends it especially pronounced signatures of linked selection, with the effect on the C. elegans genome even more extreme (Andersen et al. 2012). Whether the corresponding genomic regions in obligatorily outbreeding, often hyperdiverse, species of Caenorhabditis will exhibit similar evolutionary signatures remains an open question (Cutter and Payseur 2013; Cutter et al. 2013): To what extent will more effective recombination from outbreeding free the genome from the effects of linked selection, and might soft sweeps play an increased role in shaping genome evolution in such species? General answers to questions like these will benefit from the merger of population genomic analysis with phylogenetic comparative methods.

The globally distributed C. briggsae is comprised of differentiated lineages, most of which separated from one another approximately 200,000 generations ago (∼20 kya). We refer to these populations found in different parts of the world as a “diaspora” that share a near simultaneous time of separation from their common ancestor. Our iterated pairwise sequential Markovian coalescent analyses complemented more conventional procedures for inference of C. briggsae’s evolutionary history to reveal the timing of differentiation and population size change. We find little evidence for ongoing gene flow among these lineages, although most of the distinct lineages have been sampled only rarely in nature, with the “Temperate” and “Tropical” genetic groups predominating in wild collection efforts (Félix et al. 2013). Moreover, the deep split of “Kerala” strains from all other C. briggsae lineages approximately 2 million generations ago (∼200 kya), averaging 1.2% silent-site sequence divergence, is an especially striking feature of the evolutionary history of this species. This splitting reinforces a key role for C. briggsae in studying adaptive divergence (Prasad et al. 2011) and incipient speciation (Dolgin et al. 2008; Ross et al. 2011; Baird and Stonesifer 2012), by exploiting available genetic toolkits (Koboldt et al. 2010; Ross et al. 2011; Frøkjaer-Jensen 2013), and the genomic resource provided here.

In addition to revealing these evolutionary phenomena, the more than 3 million polymorphisms that we discovered in wild strains of C. briggsae, of which 7.25% alter the coding sequences of 94% of the genes in the genome, provide a trove of mutant alleles for functional analysis. As comparative molecular genetic and developmental studies accelerate our understanding of fundamental processes like developmental genetic networks, cell morphogenesis, and cell lineage (Wang and Chamberlin 2004; Zhao et al. 2008; Riche et al. 2013; Chen et al. 2014; Ellis and Lin 2014; Verster et al. 2014), such mutational inventories provide a crucial experimental resource (Thompson et al. 2013). This study of genomic polymorphism in wild populations of C. briggsae provides a foundation to probe the consequences of such genomic change for molecular function and for the divergence of traits.
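The summary percentages above can be checked against counts reported elsewhere in the text. The short sketch below assumes (this is an interpretation, not something the text states explicitly) that the "more than 3 million polymorphisms" is the sum of the 2,700,664 SNP sites and 329,281 small indels, and that the coding-altering fraction combines the 209,482 amino acid-altering SNPs with the 10,250 coding indels. The variable names are illustrative only.

```python
# Counts quoted in the Results and Methods text.
snp_sites = 2_700_664          # filtered SNP sites across all strains
small_indels = 329_281         # indel variants up to 60 bp
aa_altering_snps = 209_482     # SNPs that alter the encoded amino acid sequence
coding_indels = 10_250         # indels that fall in coding sequences
genes_affected = 18_697
genes_total = 19_884

# Assumption: the totals in the Discussion are simple sums of these counts.
total_polymorphisms = snp_sites + small_indels
coding_altering = aa_altering_snps + coding_indels

pct_coding = 100 * coding_altering / total_polymorphisms
pct_genes = 100 * genes_affected / genes_total

print(f"total polymorphisms: {total_polymorphisms:,}")
print(f"coding-altering: {pct_coding:.2f}%")
print(f"genes affected: {pct_genes:.0f}%")
```

Under these assumptions the sums reproduce the rounded figures quoted in the paragraph.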

Methods

Population genomic sequencing

We shotgun sequenced genomes to at least 15× coverage for 37 C. briggsae strains derived from seven previously hypothesized global phylogeographic groups (median 32× coverage per strain), with deepest geographic sampling to obtain 25 “Tropical” group strains (Cutter et al. 2006, 2010; Félix et al. 2013). Details about sequencing library and platform for Illumina sequencing for each strain are given in Supplemental Table 1.

Mapping and SNP determination

We performed a first-pass mapping of perfect reads using the Burrows-Wheeler Aligner v.0.2.2-r126 (Li and Durbin 2009) to the WS242 (cb4) reference genome assembly for “Tropical” C. briggsae strain AF16 (http://www.wormbase.org), which also was used later for annotated feature extraction. We followed this with Stampy v.1.020 for mapping the remaining divergent reads (Lunter and Goodson 2011). Picard tools v.1.96 was used for file format manipulation to integrate with the Genome Analysis ToolKit (GATK v.2.7-4) (Van der Auwera et al. 2013). For SNP calling, we applied the GATK UnifiedGenotyper with ploidy = 1, given the highly inbred nature of C. briggsae strains. After filtering to incorporate per strain mapping quality and depth of coverage, we derived a total of 2,700,664 sites with single nucleotide polymorphisms (SNPs) across all strains, of which 439,139 SNPs had high-quality allele calls in every strain. We applied BreakDancer (Chen et al. 2009) and Pindel (Ye et al. 2009) to identify short indel variants ≤60 bp long, requiring ≥3 supporting reads for inclusion in subsequent analyses. Repeats were identified in the cb4 reference genome using RepeatModeler and RepeatMasker.

Population genetic metrics

We computed standardized per-nucleotide measures of polymorphism (π, the average number of pairwise differences; θw, based on the number of segregating sites) in 5262 nonoverlapping 20-kb windows using a corrected site frequency spectrum (SFS) (Nielsen 2005; Wakeley 2009; Hufford et al. 2012). These metrics were calculated separately for nonsynonymous sites, synonymous sites, and all silent sites (synonymous sites plus nonrepetitive noncoding sites); only windows with at least 5000 bp of informative sequence passing our filters are shown in sliding-window plots. For polymorphism metrics of the Tropical population sample, we required ≥20 strains to have informative sequence data for a given site. The SFS of polymorphisms from the Tropical population sample was summarized for silent sites with unfolded metrics using the divergent Kerala strains as outgroup (Fay and Wu's H) (Fay and Wu 2000) and with folded metrics that do not require an outgroup (Tajima's D [Tajima 1989] and Schaeffer's D/Dmin [Schaeffer 2002]). Differentiation between Tropical and Temperate population samples was computed with FST (Hudson et al. 1992). We also calculated absolute divergence between C. briggsae population sets with Dxy (Nei and Li 1979). Per-gene metrics were calculated only for genes in which at least 100 bp, or 25% of the sequence length for genes whose CDS is shorter than 100 bp, had passed filters for informative sites. We excluded 789 annotated genes that lacked an ATG start codon for their coding sequence, lacked a termination codon, or had a mismatch in their annotated length. Correction for multiple hits was applied according to Jukes and Cantor (1969) and McVean et al. (2002).
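As an illustration of the two per-site diversity measures named above, the following minimal sketch computes π and Watterson's θw for a toy window of haploid sequences. It uses the textbook estimators only; it does not implement the corrected site frequency spectrum, missing-data filters, or multiple-hits correction described in the text, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_pi(haplotypes, n_sites):
    """Average number of pairwise differences, per site."""
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_diffs / len(pairs) / n_sites


def watterson_theta(haplotypes, n_sites):
    """Watterson's estimator from the number of segregating sites, per site."""
    n = len(haplotypes)
    segregating = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    a1 = sum(1 / i for i in range(1, n))  # harmonic-number scaling for sample size n
    return segregating / a1 / n_sites


# Toy window: four haploid sequences of 10 aligned sites (made-up data).
window = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi = pairwise_pi(window, n_sites=10)
theta_w = watterson_theta(window, n_sites=10)
print(pi, theta_w)
```

Treating each inbred strain as one haploid sequence matches the ploidy = 1 genotyping described under "Mapping and SNP determination".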

To compute linkage disequilibrium (r²) for C. briggsae with VCFtools (Danecek et al. 2011), we merged all 37 VCF files and extracted the 2.7 million SNP sites into a new VCF file, which was set as containing phased haplotypes for analysis. We then computed r² for the 25 sequenced “Tropical” group samples, excluding singleton SNPs. This procedure was also applied to 39 C. elegans strains (excluding Hawaiian CB4856) from Thompson et al. (2013), using the variant calls from this previous study. Interchromosomal linkage disequilibrium (LD) was calculated as the average r² for all pairs of sites between a given pair of chromosomes, computed for each of the 15 pairings of the six chromosomes. Outcrossing rate was then computed from interchromosomal LD using the approach of Cutter (2006), after first subtracting the expected value of r² given a sample size of 25 or 39 for C. briggsae or C. elegans, respectively (E[r²] ∼ 1/n) (Weir and Hill 1980).
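The r² statistic and the sample-size correction described above can be sketched for phased haploid data as follows. This is a plain-Python illustration rather than the VCFtools implementation; alleles are coded 0/1, the function names are hypothetical, and the conversion from corrected LD to outcrossing rate (Cutter 2006) is deliberately not reproduced here.

```python
def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites (0/1 haplotype codes)."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def corrected_ld(mean_r2, n_samples):
    """Subtract the approximate sampling expectation E[r^2] ~ 1/n."""
    return mean_r2 - 1 / n_samples


# Toy haplotypes for four strains at three sites (made-up data).
site1 = [1, 1, 0, 0]
site2 = [1, 1, 0, 0]   # perfectly associated with site1
site3 = [1, 0, 1, 0]   # statistically independent of site1

print(r_squared(site1, site2), r_squared(site1, site3))
print(corrected_ld(0.5, 25))  # hypothetical mean r^2 of 0.5 with n = 25
```

Both sites must be polymorphic in the sample, otherwise the denominator is zero.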

Ortholog identification and divergence time calculation

Sequences for C. nigoni (formerly known as C. sp. 9) (Félix et al. 2014) were extracted from the November 30, 2012 assembly from Kumar et al. (2012) (http://wormgenomes.caltech.edu). We ran TBLASTN v. 2.2.26+ against the C. nigoni genome using the C. briggsae CDS transcripts (version WS242 from wormbase.org for reference strain AF16), and then extracted the best-hit regions in the C. nigoni genome, requiring >80% overlap and collinearity to the corresponding C. briggsae gene, as well as requiring >80% sequence identity to exclude contaminating sequence from C. afra that is known to be present in the draft C. nigoni assembly (E Schwarz, pers. comm.). The aligned regions of each collinear best hit were then concatenated into a preliminary C. nigoni gene sequence, which we subjected to BLASTX using C. briggsae WS242 protein sequences to obtain gene pairs with mutual best hits between C. nigoni and C. briggsae for use in our 6435 ortholog gene set.
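The mutual-best-hit step at the end of this procedure can be sketched as below. This is a generic reciprocal-best-hit filter operating on already-parsed (query, subject, score) tuples; it does not run BLAST, and it omits the overlap, collinearity, and identity thresholds described above. All gene identifiers and scores are invented for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query_id, subject_id, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(forward_hits, reverse_hits):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical search results: species A queries vs species B, and the reverse.
a_vs_b = [("A1", "B1", 900.0), ("A1", "B2", 400.0), ("A2", "B2", 350.0), ("A3", "B2", 500.0)]
b_vs_a = [("B1", "A1", 880.0), ("B2", "A3", 510.0), ("B2", "A2", 300.0)]

orthologs = mutual_best_hits(a_vs_b, b_vs_a)
print(orthologs)
```

In this toy case A2 is dropped because its best hit, B2, prefers A3.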

We computed an estimate of the divergence time between C. briggsae and C. nigoni according to T = (dS − πanc)/(2μ) (Gillespie and Langley 1979), where dS-arm = 0.259, assuming silent-site polymorphism on the arms of the dioecious common ancestor is πanc-arm = 0.07 (AD Cutter, unpubl.) and a per-site mutation rate each generation of μ = 2.7 × 10⁻⁹ (Denver et al. 2009). This yields T = 35.0 million generations (Mgen) ago; lower πanc or μ would result in more ancient T. Divergence in low recombination center regions of chromosomes might better reflect the time since the cessation of gene flow between species, owing to reduced ancestral polymorphism (Cutter 2013; Cruickshank and Hahn 2014), so we also computed T given dS-center = 0.186 and assuming πanc-center = 0 to yield T = 34.8 Mgen ago; higher πanc or μ would result in more recent T. These estimates of neutral divergence excluded 958 loci with dS > 0.8 and ENC < 45 to mitigate against any remaining false ortholog assignments and genes with strong selection on codon usage. For divergence time estimation of the Kerala strains to other strains of C. briggsae, we used this same approach using silent-site Dxy instead of dS, with values of Dxy-center = 0.0113 and πanc-center = 0.00041 to yield T = 2.02 Mgen ago.
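The divergence-time formula can be evaluated directly from the values quoted in this paragraph. The sketch below is a worked check, with a hypothetical function name; the inputs are the rounded values given in the text.

```python
def divergence_time(divergence, pi_anc, mu=2.7e-9):
    """T = (d - pi_anc) / (2 * mu), in generations."""
    return (divergence - pi_anc) / (2 * mu)


t_arm = divergence_time(0.259, 0.07)          # C. briggsae vs C. nigoni, arms
t_center = divergence_time(0.186, 0.0)        # C. briggsae vs C. nigoni, centers
t_kerala = divergence_time(0.0113, 0.00041)   # Kerala vs other C. briggsae, centers

for label, t in [("arm", t_arm), ("center", t_center), ("Kerala", t_kerala)]:
    print(f"{label}: {t / 1e6:.2f} Mgen")
```

With these rounded inputs the arm and Kerala estimates match the reported 35.0 and 2.02 Mgen. The center estimate evaluates to about 34.4 Mgen, slightly below the reported 34.8 Mgen, which may reflect rounding of the inputs quoted in the text.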

Analysis of nonsense alleles

Analyses of premature stop codon (PSC) mutations within coding sequences (CDS) excluded genes containing indel-induced premature stops. For the corresponding 1027 genes, when multiple nonsense mutations occurred in a gene, we determined the most 5′ PSC incidence along the CDS in decile bins of length in order to capture the degree of CDS truncation. Divergence analyses between C. briggsae and C. nigoni excluded 29 genes with dS = 0 and 24 genes with dN/dS > 1. To estimate overall gene expression, we used the average expression level measured across 10 embryonic time points in strain AF16 from Levin et al. (2012). Sex-biased expression was inferred from expression levels of AF16 males and she-1 pseudofemales in Thomas et al. (2012).
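The decile binning of the most 5′ premature stop can be sketched as follows. The coordinate convention (1-based nucleotide positions along the CDS) and the function name are assumptions made for illustration; the text does not specify them.

```python
def psc_decile(psc_positions, cds_length):
    """Decile (1-10) of CDS length containing the most 5' premature stop.

    psc_positions: 1-based positions of nonsense SNPs along the CDS (assumed convention).
    """
    first = min(psc_positions)  # most 5' PSC determines the degree of truncation
    if not 1 <= first <= cds_length:
        raise ValueError("PSC position lies outside the CDS")
    # Integer arithmetic avoids floating-point edge cases at bin boundaries.
    return (first - 1) * 10 // cds_length + 1


print(psc_decile([950, 400], cds_length=1000))  # two PSCs; the 5'-most one is used
print(psc_decile([991], cds_length=1000))       # PSC in the last 10% of the CDS
```

A gene with several nonsense SNPs contributes a single value, set by the 5′-most one.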

Phylogeographic and demographic analysis

We performed phylogeographic analyses using the subset of 439,139 SNPs with perfect information in all strains. SplitsTree (Huson and Bryant 2006) was run separately for each chromosome as well as for a concatenation of sites from all chromosomes to create neighbor-network trees for strain genomes, with reticulation in the network indicating recombination and gene flow. We next ran ADMIXTURE on the samples (Alexander et al. 2009), varying population number K from 1 to 9 for data from each chromosome separately or all together. We applied the ChromoPainter program (Lawson et al. 2012; Yahara et al. 2013) to cluster strains. For this analysis, we set the recombination scaling start value to 0.2 and input the recombination rate of polymorphic sites based on linear interpolation of genetic and physical distances as reported previously (Cutter and Choi 2010; Ross et al. 2011). Based on the results of these previous analyses, for input into TreeMix (Pickrell and Pritchard 2012), we grouped the 38 strains (37 strains sequenced here plus reference genome of AF16) into seven groups (Supplemental Table 1), using the two Kerala strains as outgroup. We ran TreeMix allowing 1–5 migration events, using groups of 100 SNPs to account for linkage disequilibrium.

We conducted historical demographic analysis with PSMC (Li and Durbin 2011), considering each strain as a single genomic haplotype. We constructed pseudodiploid genomes for all 703 possible combinations of the 38 strains and restricted SNP identity to the set of 2.7 million SNPs. We excluded 14 pseudodiploid strain combinations from further analysis owing to spurious PSMC profiles. PSMC was run separately for arm and center chromosomal domains as defined in Ross et al. (2011). We ran the same pipeline on 40 C. elegans strain whole genome sequences and subchromosomal domains (Thompson et al. 2013). Of the 780 pseudodiploid C. elegans strain combinations, 10 were excluded owing to spurious PSMC profiles.
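The pair counts in this paragraph follow from choosing two strains at a time. A minimal sketch, using placeholder strain names rather than the real isolate identifiers:

```python
from itertools import combinations
from math import comb

# Placeholder names standing in for the 38 C. briggsae genomes (37 sequenced + AF16).
briggsae_strains = [f"Cbr_{i:02d}" for i in range(1, 39)]

# Every unordered pair of haploid genomes forms one pseudodiploid.
pseudodiploids = list(combinations(briggsae_strains, 2))

n_briggsae = len(pseudodiploids)   # pairs among 38 strains
n_elegans = comb(40, 2)            # pairs among 40 C. elegans strains

# Pairs retained after excluding the combinations with spurious PSMC profiles.
kept_briggsae = n_briggsae - 14
kept_elegans = n_elegans - 10

print(n_briggsae, n_elegans, kept_briggsae, kept_elegans)
```

This gives 703 and 780 pairs, of which 689 and 770 remain after the stated exclusions.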

Data access

Sequence data have been submitted to the NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) under accession number PRJNA274705. Variant calls have been submitted to dbSNP (http://www.ncbi.nlm.nih.gov/SNP/; SNP numbers 1699546001–1703191050), WormBase (http://www.wormbase.org), and can be found in the Supplemental Material.

Acknowledgments

This research was supported by funds to A.D.C. from the Natural Sciences and Engineering Research Council of Canada, a Canada Research Chair, and the National Institutes of Health (GM096008).

Footnotes

[Supplemental material is available for this article.]

Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.187237.114.

References

  1. The 1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65.
  2. Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19: 1655–1664.
  3. Andersen EC, Gerke JP, Shapiro JA, Crissman JR, Ghosh R, Bloom JS, Félix MA, Kruglyak L. 2012. Chromosome-scale selective sweeps shape Caenorhabditis elegans genomic diversity. Nat Genet 44: 285–290.
  4. Baird SE, Stonesifer R. 2012. Reproductive isolation in Caenorhabditis briggsae: dysgenic interactions between maternal- and zygotic-effect loci result in a delayed development phenotype. Worm 1: 189–195.
  5. Baird S, Davidson C, Bohrer J. 2005. The genetics of ray pattern variation in Caenorhabditis briggsae. BMC Evol Biol 5: 3.
  6. Barrière A, Félix M-A. 2005. High local genetic diversity and low outcrossing rate in Caenorhabditis elegans natural populations. Curr Biol 15: 1176–1184.
  7. Begun DJ, Holloway AK, Stevens K, Hillier LW, Poh YP, Hahn MW, Nista PM, Jones CD, Kern AD, Dewey CN, et al. 2007. Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol 5: e310.
  8. Brandvain Y, Slotte T, Hazzouri KM, Wright SI, Coop G. 2013. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella. PLoS Genet 9: e1003754.
  9. Brandvain Y, Kenney AM, Flagel L, Coop G, Sweigart AL. 2014. Speciation and introgression between Mimulus nasutus and Mimulus guttatus. PLoS Genet 10: e1004410.
  10. Cao J, Schneeberger K, Ossowski S, Günther T, Bender S, Fitz J, Koenig D, Lanz C, Stegle O, Lippert C, et al. 2011. Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nat Genet 43: 956–963.
  11. Charlesworth B. 1998. Measures of divergence between populations and the effect of forces that reduce variability. Mol Biol Evol 15: 538–543.
  12. Charlesworth B, Morgan MT, Charlesworth D. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.
  13. Charlesworth B, Nordborg M, Charlesworth D. 1997. The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet Res 70: 155–174.
  14. Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, McGrath SD, Wendl MC, Zhang Q, Locke DP, et al. 2009. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods 6: 677–681.
  15. Chen X, Shen Y, Ellis RE. 2014. Dependence of the sperm/oocyte decision on the nucleosome remodeling factor complex was acquired during recent Caenorhabditis briggsae evolution. Mol Biol Evol 31: 2573–2585.
  16. Comeron JM. 2014. Background selection as baseline for nucleotide variation across the Drosophila genome. PLoS Genet 10: e1004434.
  17. Cruickshank TE, Hahn MW. 2014. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol Ecol 23: 3133–3157.
  18. Cutter AD. 2006. Nucleotide polymorphism and linkage disequilibrium in wild populations of the partial selfer Caenorhabditis elegans. Genetics 172: 171–184.
  19. Cutter AD. 2013. Integrating phylogenetics, phylogeography and population genetics through genomes and evolutionary theory. Mol Phylogenet Evol 69: 1172–1185.
  20. Cutter AD, Choi JY. 2010. Natural selection shapes nucleotide polymorphism across the genome of the nematode Caenorhabditis briggsae. Genome Res 20: 1103–1111.
  21. Cutter AD, Payseur BA. 2003. Selection at linked sites in the partial selfer Caenorhabditis elegans. Mol Biol Evol 20: 665–673.
  22. Cutter AD, Payseur BA. 2013. Genomic signatures of selection at linked sites: unifying the disparity among species. Nat Rev Genet 14: 262–274.
  23. Cutter AD, Félix MA, Barrière A, Charlesworth D. 2006. Patterns of nucleotide polymorphism distinguish temperate and tropical wild isolates of Caenorhabditis briggsae. Genetics 173: 2021–2031.
  24. Cutter AD, Dey A, Murray RL. 2009. Evolution of the Caenorhabditis elegans genome. Mol Biol Evol 26: 1199–1234.
  25. Cutter AD, Yan W, Tsvetkov N, Sunil S, Félix MA. 2010. Molecular population genetics and phenotypic sensitivity to ethanol for a globally diverse sample of the nematode Caenorhabditis briggsae. Mol Ecol 19: 798–809.
  26. Cutter AD, Jovelin R, Dey A. 2013. Molecular hyperdiversity and evolution in very large populations. Mol Ecol 22: 2074–2095.
  27. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27: 2156–2158.
  28. Delattre M, Félix MA. 2001. Polymorphism and evolution of vulval precursor cell lineages within two nematode genera, Caenorhabditis and Oscheius. Curr Biol 11: 631–643.
  29. Denver DR, Dolan PC, Wilhelm LJ, Sung W, Lucas-Lledo JI, Howe DK, Lewis SC, Okamoto K, Thomas WK, Lynch M, et al. 2009. A genome-wide view of Caenorhabditis elegans base-substitution mutation processes. Proc Natl Acad Sci 106: 16310–16324.
  30. Denver DR, Wilhelm LJ, Howe DK, Gafner K, Dolan PC, Baer CF. 2012. Variation in base-substitution mutation in experimental and natural lineages of Caenorhabditis nematodes. Genome Biol Evol 4: 513–522.
  31. Dey A, Chan CKW, Thomas CG, Cutter AD. 2013. Nucleotide hyperdiversity defines populations of Caenorhabditis brenneri. Proc Natl Acad Sci 110: 11056–11060.
  32. Dolgin ES, Félix MA, Cutter AD. 2008. Hakuna nematoda: genetic and phenotypic diversity in African isolates of Caenorhabditis elegans and C. briggsae. Heredity 100: 304–315.
  33. Ellis RE, Lin SY. 2014. The evolutionary origins and consequences of self-fertility in nematodes. F1000Prime Rep 6: 62.
  34. Elyashiv E, Sattath S, Hu TT, Strustovsky A, McVicker G, Andolfatto P, Sella G. 2014. A genomic map of the effects of linked selection in Drosophila. arXiv: 1408.5461.
  35. Fay JC, Wu CI. 2000. Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
  36. Félix MA, Jovelin R, Ferrari C, Han S, Cho YR, Andersen EC, Cutter AD, Braendle C. 2013. Species richness, distribution and genetic diversity of Caenorhabditis nematodes in a remote tropical rainforest. BMC Evol Biol 13: 10.
  37. Félix MA, Braendle C, Cutter AD. 2014. A streamlined system for species diagnosis in Caenorhabditis (Nematoda: Rhabditidae) with name designations for 15 distinct biological species. PLoS One 9: e94723.
  38. Foxe JP, Slotte T, Stahl EA, Neuffer B, Hurka H, Wright SI. 2009. Recent speciation associated with the evolution of selfing in Capsella. Proc Natl Acad Sci 106: 5241–5245.
  39. Frøkjaer-Jensen C. 2013. Exciting prospects for precise engineering of Caenorhabditis elegans genomes with CRISPR/Cas9. Genetics 195: 635–642.
  40. Geraldes A, Basset P, Smith KL, Nachman MW. 2011. Higher differentiation among subspecies of the house mouse (Mus musculus) in genomic regions with low recombination. Mol Ecol 20: 4722–4736.
  41. Gillespie JH, Langley CH. 1979. Are evolutionary rates really variable? J Mol Evol 13: 27–34.
  42. Guo Y, Lang S, Ellis RE. 2009. Independent recruitment of F box genes to regulate hermaphrodite development during nematode evolution. Curr Biol 19: 1853–1860.
  43. Haag ES, Liu QW. 2013. Using Caenorhabditis to explore the evolution of the germ line. In Germ cell development in C. elegans (ed. Schedl T), pp. 405–425. Springer Science, NY.
  44. Hellmann I, Ebersberger I, Ptak SE, Pääbo S, Przeworski M. 2003. A neutral explanation for the correlation of diversity with recombination rates in humans. Am J Hum Genet 72: 1527–1535.
  45. Hershberg R, Petrov DA. 2008. Selection on codon bias. Annu Rev Genet 42: 287–299.
  46. Hill RC, Egydio de Carvalho C, Salogiannis J, Schlager B, Pilgrim D, Haag ES. 2006. Genetic flexibility in the convergent evolution of hermaphroditism in Caenorhabditis nematodes. Dev Cell 10: 531–538.
  47. Hillier LW, Miller RD, Baird SE, Chinwalla A, Fulton LA, Koboldt DC, Waterston RH. 2007. Comparison of C. elegans and C. briggsae genome sequences reveals extensive conservation of chromosome organization and synteny. PLoS Biol 5: e167.
  48. Hoehn K, McGaugh S, Noor MF. 2012. Effects of premature termination codon polymorphisms in the Drosophila pseudoobscura subclade. J Mol Evol 75: 141–150.
  49. Hudson RR, Slatkin M, Maddison WP. 1992. Estimation of levels of gene flow from DNA-sequence data. Genetics 132: 583–589.
  50. Hufford MB, Xu X, van Heerwaarden J, Pyhäjärvi T, Chia JM, Cartwright RA, Elshire RJ, Glaubitz JC, Guill KE, Kaeppler SM, et al. 2012. Comparative population genomics of maize domestication and improvement. Nat Genet 44: 808–811.
  51. Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23: 254–267.
  52. Jukes TH, Cantor CR. 1969. Evolution of protein molecules. In Mammalian protein metabolism (ed. Munro HN), pp. 21–132. Academic Press, New York.
  53. Kaiser VB, Charlesworth B. 2009. The effects of deleterious mutations on evolution in nonrecombining genomes. Trends Genet 25: 9–12.
  54. Keinan A, Reich D. 2010. Human population differentiation is strongly correlated with local recombination rate. PLoS Genet 6: e1000886.
  55. Kiontke K, Félix MA, Ailion M, Rockman M, Braendle C, Penigault JB, Fitch D. 2011. A phylogeny and molecular barcodes for Caenorhabditis, with numerous new species from rotting fruits. BMC Evol Biol 11: 339.
  56. Koboldt DC, Staisch J, Thillainathan B, Haines K, Baird SE, Chamberlin HM, Haag ES, Miller RD, Gupta BP. 2010. A toolkit for rapid gene mapping in the nematode Caenorhabditis briggsae. BMC Genomics 11: 236.
  57. Kozlowska JL, Ahmad AR, Jahesh E, Cutter AD. 2012. Genetic variation for post-zygotic reproductive isolation between Caenorhabditis briggsae and Caenorhabditis sp. 9. Evolution 66: 1180–1195.
  58. Kumar S, Koutsovoulos G, Kaur G, Blaxter M. 2012. Toward 959 nematode genomes. Worm 1: 42–50.
  59. Langley CH, Stevens K, Cardeno C, Lee YCG, Schrider DR, Pool JE, Langley SA, Suarez C, Corbett-Detig RB, Kolaczkowski B, et al. 2012. Genomic variation in natural populations of Drosophila melanogaster. Genetics 192: 533–598.
  60. Lawson DJ, Hellenthal G, Myers S, Falush D. 2012. Inference of population structure using dense haplotype data. PLoS Genet 8: e1002453.
  61. Lee YCG, Reinhardt JA. 2012. Widespread polymorphism in the positions of stop codons in Drosophila melanogaster. Genome Biol Evol 4: 533–549.
  62. Lercher MJ, Hurst LD. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet 18: 337–340. [DOI] [PubMed] [Google Scholar]
  63. Levin M, Hashimshony T, Wagner F, Yanai I. 2012. Developmental milestones punctuate gene expression in the Caenorhabditis embryo. Dev Cell 22: 1101–1108. [DOI] [PubMed] [Google Scholar]
  64. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475: 493–496. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Lunter G, Goodson M. 2011. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res 21: 936–939. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Marri S, Gupta BP. 2009. Dissection of lin-11 enhancer regions in Caenorhabditis elegans and other nematodes. Dev Biol 325: 402–411. [DOI] [PubMed] [Google Scholar]
  68. Maynard Smith J, Haigh J. 1974. Hitch-hiking effect of a favorable gene. Genet Res 23: 23–35. [PubMed] [Google Scholar]
  69. McVean G, Awadalla P, Fearnhead P. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160: 1231–1241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Nachman MW, Payseur BA. 2012. Recombination rate variation and speciation: theoretical predictions and empirical results from rabbits and mice. Philos Trans R Soc Lond B Biol Sci 367: 409–421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Nei M, Li WH. 1979. Mathematical-model for studying genetic-variation in terms of restriction endonucleases. Proc Natl Acad Sci 76: 5269–5273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Nielsen R. 2005. Molecular signatures of natural selection. Annu Rev Genet 39: 197–218. [DOI] [PubMed] [Google Scholar]
  73. Noor MAF, Bennett SM. 2009. Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species. Heredity 103: 439–444. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Pease JB, Hahn MW. 2013. More accurate phylogenies inferred from low-recombination regions in the presence of incomplete lineage sorting. Evolution 67: 2376–2384. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Pickrell JK, Pritchard JK. 2012. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet 8: e1002967. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Plotkin JB, Kudla G. 2011. Synonymous but not the same: the causes and consequences of codon bias. Nat Rev Genet 12: 32–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Prasad A, Croydon-Sugarman M, Murray RL, Cutter AD. 2011. Temperature-dependent fecundity associates with latitude in Caenorhabditis briggsae. Evolution 65: 52–63. [DOI] [PubMed] [Google Scholar]
  78. Raboin MJ, Timko AF, Howe DK, Félix MA, Denver DR. 2010. Evolution of Caenorhabditis mitochondrial genome pseudogenes and C. briggsae natural isolates. Mol Biol Evol 27: 1087–1096. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Reich D, Thangaraj K, Patterson N, Price AL, Singh L. 2009. Reconstructing Indian population history. Nature 461: 489–494. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Renaut S, Grassa CJ, Yeaman S, Moyers BT, Lai Z, Kane NC, Bowers JE, Burke JM, Rieseberg LH. 2013. Genomic islands of divergence are not affected by geography of speciation in sunflowers. Nat Commun 4: 1827. [DOI] [PubMed] [Google Scholar]
  81. Riche S, Zouak M, Argoul F, Arneodo A, Pecreaux J, Delattre M. 2013. Evolutionary comparisons reveal a positional switch for spindle pole oscillations in Caenorhabditis embryos. J Cell Biol 201: 653–662. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Rockman MV, Skrovanek SS, Kruglyak L. 2010. Selection at linked sites shapes heritable phenotypic variation in C. elegans. Science 330: 372–376. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Rödelsperger C, Neher RA, Weller AM, Eberhardt G, Witte H, Mayer WE, Dieterich C, Sommer RJ. 2014. Characterization of genetic diversity in the nematode Pristionchus pacificus from population-scale resequencing data. Genetics 196: 1153–1165. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Ross JA, Koboldt DC, Staisch JE, Chamberlin HM, Gupta BP, Miller RD, Baird SE, Haag ES. 2011. Caenorhabditis briggsae recombinant inbred line genotypes reveal inter-strain incompatibility and the evolution of recombination. PLoS Genet 7: e1002174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Schaeffer SW. 2002. Molecular population genetics of sequence length diversity in the Adh region of Drosophila pseudoobscura. Genet Res 80: 163–175. [DOI] [PubMed] [Google Scholar]
  86. Sella G, Petrov DA, Przeworski M, Andolfatto P. 2009. Pervasive natural selection in the Drosophila genome? PLoS Genet 5: e1000495. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Sivasundar A, Hey J. 2003. Population genetics of Caenorhabditis elegans: the paradox of low polymorphism in a widespread species. Genetics 163: 147–157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Stegeman GW, de Mesquita MB, Ryu WS, Cutter AD. 2013. Temperature-dependent behaviours are genetically variable in the nematode Caenorhabditis briggsae. J Exp Biol 216: 850–858. [DOI] [PubMed] [Google Scholar]
  89. Stein LD, Bao Z, Blasiar D, Blumenthal T, Brent MR, Chen N, Chinwalla A, Clarke L, Clee C, Coghlan A, et al. 2003. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLoS Biol 1: 166–192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Thomas CG, Li RH, Smith HE, Woodruff GC, Oliver B, Haag ES. 2012. Simplification and desexualization of gene expression in self-fertile nematodes. Curr Biol 22: 2167–2172. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Thompson O, Edgley M, Strasbourger P, Flibotte S, Ewing B, Adair R, Au V, Chaudry I, Fernando L, Hutter H, et al. 2013. The million mutation project: a new approach to genetics in Caenorhabditis elegans. Genome Res 23: 1749–1762. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, et al. 2013. From FastQ data to high-confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics 43: 11.10.11–11.10.33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Verster AJ, Ramani AK, McKay SJ, Fraser AG. 2014. Comparative RNAi screens in C. elegans and C. briggsae reveal the impact of developmental system drift on gene function. PLoS Genet 10: e1004077. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Wakeley J. 2009. Coalescent theory: an introduction. Roberts and Company Publishers, Greenwood Village, CO. [Google Scholar]
  96. Wang X, Chamberlin HM. 2004. Evolutionary innovation of the excretory system in Caenorhabditis elegans. Nat Genet 36: 231–232. [DOI] [PubMed] [Google Scholar]
  97. Weir BS, Hill WG. 1980. Effect of mating structure on variation in linkage disequilibrium. Genetics 95: 477–488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Woodruff GC, Eke O, Baird SE, Félix MA, Haag ES. 2010. Insights into species divergence and the evolution of hermaphroditism from fertile interspecies hybrids of Caenorhabditis nematodes. Genetics 186: 997–1012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Wright SI, Kalisz S, Slotte T. 2013. Evolutionary consequences of self-fertilization in plants. Proc Biol Sci 280: 20130133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Yahara K, Furuta Y, Oshima K, Yoshida M, Azuma T, Hattori M, Uchiyama I, Kobayashi I. 2013. Chromosome painting in silico in a bacterial species reveals fine population structure. Mol Biol Evol 30: 1454–1464. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Yan C, Bi Y, Yin D, Zhao Z. 2012. A method for rapid and simultaneous mapping of genetic loci and introgression sizes in nematode species. PLoS One 7: e43770. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. 2009. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25: 2865–2871. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Zeng K, Fu YX, Shi S, Wu CI. 2006. Statistical tests for detecting positive selection by utilizing high-frequency variants. Genetics 174: 1431–1439. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Zhao Z, Boyle TJ, Bao Z, Murray JI, Mericle B, Waterston RH. 2008. Comparative analysis of embryonic cell lineage between Caenorhabditis briggsae and Caenorhabditis elegans. Dev Biol 314: 93–99. [DOI] [PMC free article] [PubMed] [Google Scholar]


Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press
