Abstract
Rapid and inexpensive methods for genome-wide SNP discovery and genotyping are urgently needed for population management and conservation. In hybridized populations, genomic techniques that can identify and genotype thousands of species-diagnostic markers would allow precise estimates of population- and individual-level admixture, as well as identification of “super invasive” alleles, which show elevated rates of introgression above the genome-wide background (likely due to natural selection). Techniques like restriction-site associated DNA (RAD) sequencing can discover and genotype large numbers of SNPs, but they have been limited by the length of continuous sequence data they produce with Illumina short-read sequencing. We present a novel approach, overlapping paired-end RAD sequencing, to generate RAD contigs of >300-400bp. These contigs provide sufficient flanking sequence for design of high-throughput SNP genotyping arrays and strict filtering to identify duplicate paralogous loci. We applied this approach in five populations of native westslope cutthroat trout that previously showed varying (low) levels of admixture from introduced rainbow trout. We produced 77,141 RAD contigs and used these data to filter and genotype 3,180 previously identified species-diagnostic SNP loci. Our population-level and individual-level estimates of admixture were generally consistent with previous microsatellite-based estimates from the same individuals. However, we observed slightly lower admixture estimates from genome-wide markers, which might result from natural selection against certain genome regions, different genomic locations for microsatellites versus RAD-derived SNPs, and/or sampling error from the small number of microsatellite loci (n = 7). We also identified candidate adaptive super invasive alleles from rainbow trout that had excessively high admixture proportions in hybridized cutthroat trout populations.
Keywords: conservation genomics, next generation sequencing, hybridization, invasive species, super invasive genes, adaptive introgression, natural selection, salmonids
Introduction
Hybridization between native and introduced taxa is an increasing concern for conservation and legal assessments of threatened species (Allendorf et al. 2001). Hybridization can reduce fitness through outbreeding depression (Muhlfeld et al. 2009a), cause genomic extinction (Allendorf et al. 2001), and destroy important genetic and ecological adaptations (Muhlfeld et al. 2009b; Kelly et al. 2010). The loci most responsible for the genetic effects of hybridization may be outliers in their degree of introgression because of natural selection in admixed populations (“super invasive alleles”; Gompert & Buerkle 2009; Fitzpatrick et al. 2010; Teeter et al. 2010; Miller et al. 2012). As a result, estimates of admixture averaged across loci at the individual or population level may miss important genetic factors in conservation and management of native taxa. Current high-throughput sequencing techniques now allow genome scans for invasive alleles in natural populations of non-model species.
Anthropogenic hybridization is especially widespread in freshwater fishes due to decades of fish translocations and hatchery supplementation of wild populations. Rainbow trout (RBT, Oncorhynchus mykiss) is the most widely translocated and problematic invasive fish worldwide (Halverson 2010). RBT hybridize with cutthroat trout (O. clarkii), including the subspecies westslope cutthroat trout (WCT, O. c. lewisi). WCT is the most widely distributed of 12 extant cutthroat subspecies, and hybridization is the leading threat to persistence of genetically pure WCT populations (Shepard et al. 2005).
Management of WCT populations would benefit from detection of hybridization and introgression at low levels, and from the ability to precisely estimate individual-level admixture proportion. Previous work has used microsatellites and other loci to assess levels of admixture from RBT into native WCT populations (Hitt et al. 2003; Boyer et al. 2008; Muhlfeld et al. 2009a,c). Muhlfeld et al. (2009c) found that levels of RBT admixture were negatively related to distance from the source of RBT hybridization (Abbot Creek; see Fig. 1) and positively related to mean summer water temperature, suggesting potential for the existence of rainbow trout alleles that are adaptive to warm water temperatures (Perry et al. 2001; Narum et al. 2010). However, the low number of diagnostic markers available with microsatellites typically allows precise admixture estimates only at the population level, not at the individual or genome-scan level.
Fig. 1.
Map of the North Fork Flathead River study area, showing the five admixed WCT populations examined here plus the initial source of introduced RBT individuals (Abbot Creek; see Boyer et al. 2008, and Muhlfeld et al. 2009c for more information on these populations).
Single-nucleotide polymorphisms (SNPs) are ideal markers for hybridization assessment and monitoring because hundreds of SNPs can be rapidly, reliably, and cheaply genotyped using new genotyping platforms (Morin et al. 2004; Seeb et al. 2009, 2011a; Angeloni et al. 2011; Twyford & Ennos 2012). Much recent effort has been committed to assembling a set of diagnostic SNP loci for RBT and WCT (Finger et al. 2009; McGlaughlin et al. 2010; Harwood & Phillips 2011; Kalinowski et al. 2011; Amish et al. 2012; Campbell et al. 2012; Pritchard et al. 2012).
A high density of markers across the genome promises individual-level estimates of admixture proportion, as well as detection of super invasive alleles. However, SNP discovery in salmonid fish is especially challenging due to a recent genome duplication event, making it difficult to distinguish true SNPs from fixed sequence differences between homeologous duplicate chromosomal regions (Allendorf & Danzmann 1997; Everett et al. 2011; Seeb et al. 2011b) as well as more typical tandem duplicated paralogous regions. One way to filter out both paralogs and homeologs is to gather more sequence data around candidate SNP markers, which helps distinguish next-generation sequence reads that come from one locus from reads that come from two different loci.
We previously used restriction-site associated DNA (RAD) sequencing (Baird et al. 2008) to identify several thousand WCT-diagnostic SNPs (Hohenlohe et al. 2011). Those candidate diagnostic markers have shown a high rate of subsequent validation in microfluidic PCR-based genotyping assays (Amish et al. 2012). However, primer design for those genotyping assays required >50 bp of flanking sequence on each side of each SNP, which we obtained from previously published sequence data, reducing the number of candidate markers for which assays could be designed (Amish et al. 2012). In addition, our ability to distinguish duplicate sequence on the basis of flanking sequence was limited to the 54bp single-end Illumina read length in that study. The approach we present here can be used to simultaneously identify and genotype SNP markers, as well as gather substantial flanking sequence, in a single RAD sequencing experiment. The amount of flanking sequence is more than sufficient for primer design and also allows better discrimination of paralogous loci.
RAD sequencing is one of a family of genomic approaches that provide sequence data adjacent to restriction enzyme recognition sites (Davey et al. 2011). The primary difference between RAD and related techniques is that RAD incorporates a random shearing step in library preparation. As a result, while the forward reads are anchored at the restriction site, the reverse reads produced by paired-end Illumina sequencing of RAD libraries are staggered over a local genomic region (of several hundred base pairs). These staggered paired-end reads can be assembled into a “mini-contig,” a continuous stretch of genomic sequence that is longer than each individual read and potentially up to 1kb (Baxter et al. 2011; Etter et al. 2011; Willing et al. 2011; Etter & Johnson 2012). Here we designed our RAD libraries so that a substantial fraction of DNA fragments would produce overlapping paired-end reads, allowing assembly of contigs containing both the forward and reverse reads of each pair. These “RAD contigs” are anchored at one end by the restriction enzyme recognition site and contain several hundred base pairs of continuous genomic sequence data across dozens of individuals.
Our goals in this study were to: (1) assemble a large set of RAD contigs from a sample of low-admixture WCT populations; (2) provide flanking sequence for finer filtering of candidate diagnostic SNP markers between RBT and WCT; (3) genotype filtered diagnostic SNPs across five WCT populations to assess the ability of RAD sequencing compared to microsatellites to provide precise individual-level estimates of admixture; and (4) identify outlier loci exhibiting the signature of super invasive alleles.
Methods
Study system
We focus on WCT populations in tributaries to the North Fork of the Flathead River in northwestern Montana (Fig. 1). The North Fork Flathead River originates in Canada and forms the western border of Glacier National Park before joining the main-stem Flathead River, which flows into Flathead Lake. The presence of hybridization and RBT admixture were previously estimated in several populations using seven diagnostic microsatellite loci (Boyer et al. 2008; Muhlfeld et al. 2009c).
Here we use five of these populations (Meadow, Nicola, Dutch, Lower Hay, Tepee) for which estimates of the mean population-level admixture based on microsatellite loci ranged from 1.3 to 13.0 percent (see Boyer et al. 2008; Muhlfeld et al. 2009c for further information on these populations). We chose populations without F1 hybrids as identified in previous studies with the goal of using later-generation admixed populations to detect specific loci with elevated levels of introgression. We used preserved DNA samples, collected from 18-22 individuals in each population during 2003-2004 for the study by Boyer et al. (2008), in order to allow individual-level comparisons between SNP-based and microsatellite-based admixture estimates. We selected individuals across the range of admixture proportions previously estimated within each population.
RAD sequencing
We prepared RAD sequencing libraries for 97 samples from the five WCT populations described above, following the protocol of Etter et al. (2011). The RAD protocol produces libraries of genomic fragments bounded on one end by a restriction enzyme cut site (therefore common across individuals), with the other end randomly sheared. Typically fragments in RAD libraries are size-selected simply to optimize the efficiency of the Illumina sequencing process. Here we used the restriction enzyme SbfI and 6-nucleotide barcoded adaptors differing from each other by at least 3 nucleotides to identify individuals. We modified the standard protocol to target DNA fragments of 330-400bp during gel size selection, so that the size of genomic DNA inserts targeted the range 200-270bp, to produce overlapping paired-end reads for a large proportion of sequenced fragments (Fig. 2). We sequenced the RAD libraries in portions of two lanes (grouped with other RAD sequencing experiments) on an Illumina HiSeq sequencer at the University of Oregon, producing 153bp paired-end reads.
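The size-selection arithmetic above can be illustrated with a minimal Python sketch. It assumes that only the forward read carries the 6-nucleotide barcode (so it covers 147 bases of genomic insert), that the reverse read covers 153 bases starting at the sheared end, and that adaptor sequence is excluded from the insert length. The function name `paired_end_overlap` is hypothetical and is not part of any pipeline used in this study.

```python
def paired_end_overlap(insert_len, read_len=153, barcode_len=6):
    """Number of insert bases covered by both reads of a pair.

    Assumption: the forward read spends `barcode_len` bases on the barcode,
    while the reverse read covers `read_len` bases from the sheared end.
    A negative value is the size of the gap between the two reads.
    """
    forward_cover = min(read_len - barcode_len, insert_len)
    reverse_cover = min(read_len, insert_len)
    return forward_cover + reverse_cover - insert_len


# Targeted insert range (200-270bp) and a longer fragment that slips through
for insert_len in (200, 270, 330):
    print(insert_len, paired_end_overlap(insert_len))
# prints:
# 200 100
# 270 30
# 330 -30
```

Under these assumptions, every insert in the targeted 200-270bp range yields reads that overlap by at least 30 bases, whereas inserts longer than 300bp leave a gap between the forward and reverse reads.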
Fig. 2.
Schematic diagram of overlapping paired-end RAD sequencing. (a) RAD libraries are prepared according to Etter et al. (2011), with the exception that a smaller size range of fragments is selected to obtain overlapping reads. The green triangle indicates the restriction enzyme cut site, and fragments from only one side of the cut site are shown for three individuals (represented by different colors). (b) Libraries are sequenced by Illumina with paired-end reads. Loci are identified with Stacks software, using only the forward reads (solid lines) to cluster reads by locus. (c) Both the forward and reverse reads from each locus are pooled across a set of individuals and assembled into a RAD contig. The depth of sequencing coverage across overlapping paired-end RAD contigs has a unique signature. (d) Reads from each individual are separately aligned against the reference contig set and diploid SNP genotypes are called statistically. The length of genotyped sequence data may vary across individuals, and in some cases genotype data may have a gap where paired ends did not overlap.
We processed the sequence data and grouped the read pairs from all individuals into RAD loci using several modules from the Stacks software package, version 0.998 (Catchen et al. 2011). First, using the Stacks program process_radtags.pl, we sorted read pairs by barcode, filtered for read quality, and removed any pairs in which the forward read did not contain both a correct barcode and the remaining 6 bases of the SbfI recognition sequence. We then removed read pairs that represented PCR duplicates using the Stacks program clone_filter. The random shearing step in RAD sequencing produces staggered paired-end reads as described above, so that any set of read pairs that are identical across both the forward and reverse reads are likely PCR duplicates of a single original genomic DNA fragment (Davey et al. 2011). Because genotyping depends on using read counts of alternative alleles in a statistical sampling model, PCR duplicates can be misleading because they do not represent independent samples from the genomic pool of DNA.
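The logic of PCR duplicate removal described above can be sketched in a few lines of Python. This is an illustration of the principle only, not the implementation in the Stacks program clone_filter; the function name `remove_pcr_duplicates` and the representation of read pairs as tuples of strings are assumptions made for the example.

```python
def remove_pcr_duplicates(read_pairs):
    """Keep the first copy of each (forward, reverse) sequence pair.

    Because random shearing staggers the reverse reads, two pairs that are
    identical in both reads are treated as copies of one original fragment.
    Input order is preserved.
    """
    seen = set()
    unique = []
    for forward, reverse in read_pairs:
        key = (forward, reverse)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


pairs = [
    ("ACGTAC", "TTGACA"),  # original fragment
    ("ACGTAC", "TTGACA"),  # identical in both reads: likely PCR duplicate
    ("ACGTAC", "GGCATT"),  # same locus, different shear point: kept
]
print(len(remove_pcr_duplicates(pairs)))  # prints 2
```

Note that pairs sharing only the forward read are retained, since they are expected for independent fragments from the same RAD locus.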
We identified RAD loci by applying ustacks to the forward reads across all individuals. We enabled the Deleveraging and Removal algorithms in order to filter out highly repetitive, likely paralogous loci, and we used a maximum nucleotide distance between stacks of 4 to achieve a balance between filtering paralogs and maintaining true alleles at a single locus roughly consistent with the expected number of RAD loci (Hohenlohe et al. 2011; Miller et al. 2011). We created a catalog of RAD tag loci using cstacks and matched individuals against the catalog using sstacks. We populated and indexed a MySQL database of loci using load_radtags.pl and index_radtags.pl, and then exported the data using export_sql.pl. Finally we grouped the forward and reverse reads from each individual corresponding to each RAD locus using sort_read_pairs.pl.
Contig assembly
We pooled many individuals for contig assembly to increase sequence coverage of read pairs at each RAD locus (Fig. 2b). However, we also wanted to limit levels of polymorphism that could complicate assembly. Therefore we pooled data from 60 individuals from the three populations with the lowest level of admixture as estimated from previous microsatellite data (Boyer et al. 2008): Lower Hay, Nicola, and Tepee. We grouped the forward and reverse reads from all individuals in these populations into a separate file for each RAD locus, using the Stacks program sort_read_pairs.pl. We assembled the reads in each file separately to produce a set of RAD contigs (Fig. 2b), using both Velvet (Zerbino & Birney 2008) and CAP3 (Huang & Madan 1999) assembly software. Because CAP3 performed better (see Results), all further analyses below used the CAP3 assemblies. Because of our pooling strategy, the consensus sequences in this reference set of RAD contigs represent primarily WCT with minimal RBT admixture.
Genotyping and admixture estimates
We aligned the filtered read pairs for each individual from all five populations against the reference set of RAD contigs (Fig. 2c). (Three individuals with very low coverage were dropped: one each from Meadow, Nicola, and Tepee, leaving a total sample size of 94 individuals.) We used the alignment software Bowtie (Langmead et al. 2009), allowing up to 3 nucleotide mismatches in the first 30bp of each read and up to 15 mismatches over the total read. These parameters represent a compromise aimed at producing valid alignments to the reference, while minimizing bias against divergent RBT haplotypes. We chose them after aligning and genotyping a subset of the data across a wide range of parameter values, but we found that alignment parameters created only marginal differences in overall genotype calls (not shown). We retained only those read pairs that aligned uniquely to the reference contig set and that aligned in the expected orientation (i.e. the forward read aligns at position 0 of the contig, matching the position of the restriction enzyme cut site, and the reverse read aligns in the opposite direction along the same contig within a distance up to 750bp).
We assigned diploid genotypes to each nucleotide position for each individual using the maximum likelihood method of Hohenlohe et al. (2010), modified by bounds on the per-nucleotide sequencing error rate of 0.0001 < ε < 0.0025 and a significance level of α = 0.05 (custom software available at http://webpages.uidaho.edu/hohenlohe/software.html). These limits on ε have the effect of being more likely to call a heterozygous genotype. While in de novo genotyping these bounds on ε would increase the frequency of false alleles, here we are genotyping against previously identified WCT and RBT alleles (see below). This strategy and the relatively high significance threshold are also justified because of the quality filtering and removal of PCR duplicates described above, which increases confidence that each read represents a true independent sample of genomic sequence.
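To make the effect of the bounds on ε concrete, the following Python sketch calls a genotype at a single nucleotide position. It is not the custom software referenced above. It assumes a multinomial read-count model in which a homozygote produces its own nucleotide with probability 1 − 3ε/4 and each other nucleotide with probability ε/4, a heterozygote produces each of its two nucleotides with probability 1/2 − ε/4, and the two genotype hypotheses are compared with a likelihood ratio test against a chi-square distribution with one degree of freedom. The function name `call_genotype` and the input format are hypothetical.

```python
import math

CHI2_1DF_ALPHA_05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05


def _clip(x, lo, hi):
    return max(lo, min(hi, x))


def call_genotype(counts, eps_min=0.0001, eps_max=0.0025):
    """Call a diploid genotype from nucleotide read counts at one position.

    counts: dict mapping nucleotides to read counts, e.g. {"A": 12, "G": 9}.
    Returns a string such as "A/A" or "A/G", or None if neither genotype
    is significantly better supported (or there are no reads).
    """
    ranked = sorted(((counts.get(b, 0), b) for b in "ACGT"),
                    key=lambda t: (-t[0], t[1]))
    (n1, a1), (n2, a2) = ranked[0], ranked[1]
    n = sum(c for c, _ in ranked)
    if n == 0:
        return None

    # Homozygote a1/a1: the unconstrained ML error rate is 4(n - n1)/(3n).
    # The log-likelihood is concave in eps, so clipping gives the bounded optimum.
    eps_hom = _clip(4.0 * (n - n1) / (3.0 * n), eps_min, eps_max)
    ll_hom = (n1 * math.log(1.0 - 0.75 * eps_hom)
              + (n - n1) * math.log(eps_hom / 4.0))

    # Heterozygote a1/a2: the unconstrained ML error rate is 2(n - n1 - n2)/n.
    n_err = n - n1 - n2
    eps_het = _clip(2.0 * n_err / n, eps_min, eps_max)
    ll_het = ((n1 + n2) * math.log(0.5 - eps_het / 4.0)
              + n_err * math.log(eps_het / 4.0))

    if 2.0 * abs(ll_hom - ll_het) < CHI2_1DF_ALPHA_05:
        return None  # not significant: treat as missing
    if ll_hom > ll_het:
        return f"{a1}/{a1}"
    return "/".join(sorted((a1, a2)))


print(call_genotype({"A": 20}))           # prints A/A
print(call_genotype({"A": 10, "G": 10}))  # prints A/G
print(call_genotype({"A": 2}))            # prints None (too few reads)
```

Because ε cannot exceed 0.0025 in this sketch, a minority nucleotide seen in a few reads is expensive to explain as sequencing error under the homozygote model, which shifts support toward the heterozygote, in line with the behavior described above.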
We used previously identified species-diagnostic SNP loci to assess introgression from RBT into these WCT populations. From the RAD sequencing data in WCT and RBT published by Hohenlohe et al. (2011), we extracted all RAD loci in which there was either one SNP fixed between species and no other polymorphism in the 54bp sequence (2,923 loci), two fixed SNPs and no other polymorphism (643 loci), or one fixed SNP and one additional SNP polymorphic within either species (1,348 loci), for a total of 4,914 diagnostic SNPs (see Amish et al. 2012 for validation of some of these SNP markers). We aligned both the WCT and RBT alleles of these 54bp sequences against the new reference set of RAD contigs, using Bowtie (Langmead et al. 2009) and allowing up to 2 nucleotide mismatches. We retained only those diagnostic loci that aligned uniquely with up to 2 mismatches (for both the RBT and WCT alleles) to the reference contig set.
We then genotyped all individuals from the 5 admixed WCT populations in the current study as WCT, RBT, or heterozygous at each of these loci for which genotype calls were made above (any genotype calls that did not match previously identified alleles at these SNPs were treated as missing data). As a final filtering step for paralogous loci, we removed loci for which these genotypes exhibited observed heterozygosity > 0.5 and FIS < −0.5 (Hohenlohe et al. 2011). Using all such diagnostic SNPs for which at least half of the individuals (47 or more) were genotyped, we estimated proportion of admixture at the locus, individual, and population levels as the frequency of RBT alleles across diagnostic loci.
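The per-locus summary statistics used in this filtering step can be sketched as follows. The example assumes genotypes are coded as the number of RBT alleles (0 = homozygous WCT, 1 = heterozygous, 2 = homozygous RBT, None = missing) and that FIS is computed as 1 − Ho/He with He = 2p(1 − p). The function names are hypothetical, and the coding scheme is an assumption made for illustration.

```python
def locus_summary(genotypes):
    """Summarize one diagnostic locus.

    genotypes: list of RBT allele counts per individual (0, 1, 2) or None.
    Returns None if no individual was genotyped.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return None
    admixture = sum(called) / (2 * n)            # frequency of RBT alleles
    h_obs = sum(1 for g in called if g == 1) / n  # observed heterozygosity
    h_exp = 2 * admixture * (1 - admixture)       # expected heterozygosity
    f_is = 1 - h_obs / h_exp if h_exp > 0 else 0.0
    return {"n": n, "admixture": admixture, "h_obs": h_obs, "f_is": f_is}


def is_putative_paralog(summary, h_max=0.5, fis_min=-0.5):
    """Flag loci with excess heterozygosity and strongly negative FIS."""
    return summary["h_obs"] > h_max and summary["f_is"] < fis_min


# Nearly all individuals heterozygous: the signature of collapsed paralogs
suspect = locus_summary([1] * 19 + [0])
# Low-frequency RBT allele at a well-behaved locus
typical = locus_summary([0] * 18 + [1] * 2 + [None])

print(is_putative_paralog(suspect), is_putative_paralog(typical))
```

A locus would additionally need `summary["n"]` of 47 or more to meet the requirement that at least half of the individuals were genotyped.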
We applied the heterogeneity test of Long (1991) to test for super invasive alleles. This analysis tests whether the variance in admixture across loci exceeds that expected from random sampling as well as genetic drift across loci (other tests for admixture outliers do not account for drift and may suffer from a high false positive rate, so our approach is a conservative test; Fitzpatrick et al. 2009). Because this method cannot handle allele frequencies of 0.0, we used Bayesian estimates of allele frequencies with an uninformative prior (Fitzpatrick et al. 2009). We adjusted for differences in sample size of genotypes across loci, which affect the expected variance in allele frequency estimates, in equation 6 of Long (1991). For each locus in each population, we calculated a p-value for the deviation from expected admixture and adjusted for false discovery rate at a level of α = 0.05 within each population (Benjamini & Hochberg 1995). We identified candidate super invasive alleles as those with significantly elevated admixture proportions in two or more populations.
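The false discovery rate step can be illustrated with a short Python sketch of the Benjamini-Hochberg step-up procedure. It covers only the multiple-testing correction, not the heterogeneity test of Long (1991) that produces the per-locus p-values; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns (threshold, significant), where `threshold` is the largest
    p-value declared significant (None if there is none) and `significant`
    is a list of booleans in the same order as `p_values`.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank k with p_(k) <= alpha * k / m
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    threshold = p_values[order[k_max - 1]] if k_max else None
    return threshold, significant


# Hypothetical per-locus p-values within one population
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
threshold, significant = benjamini_hochberg(p)
print(threshold, sum(significant))  # prints 0.008 2
```

In this sketch the returned threshold plays the role of a per-population p-value cutoff: loci with p-values at or below it are declared significant at the chosen false discovery rate.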
Results
RAD sequencing and contig assembly
After filtering for read quality and presence of a correct barcode and SbfI recognition site, we generated 63,061,577 RAD sequence read pairs across 94 individuals in five admixed WCT populations. Of these, 22 percent represented PCR duplicates and were removed, leaving 49,248,922 unique read pairs. We identified a total of 222,830 putative RAD loci in Stacks using the forward reads of each pair across all individuals. Only 82,721 of these loci represented 8 or more read pairs across all individuals.
We pooled the read pairs corresponding to these 82,721 loci for individuals from three populations with the lowest previously estimated admixture proportions (Lower Hay, Nicola, Tepee). We conducted separate assemblies at each locus using both Velvet (Zerbino & Birney 2008) and CAP3 (Huang & Madan 1999). In Velvet, we used fixed k-mer lengths of 25, 35, 45, and 55bp as well as optimizing the k-mer length across these values independently at each locus. All of these assemblies failed to connect overlapping paired-end reads at many loci, and the maximum contig length per locus was only ~100-300bp (Fig. S1). Thus in many cases, the contigs assembled were smaller than the read length of 147bp (after trimming the barcode) for the forward reads (Fig. S1), meaning that sequences were broken into k-mers and unable to be reassembled. This difficulty in paired-end assembly of RAD data has been observed elsewhere (Davey et al. 2012), although that study had better success than we did in optimizing assembly parameters per locus. The general problem may be due to the unique signature of sequence coverage expected across contigs for overlapping paired-end RAD data (Fig. 2c; Etter et al. 2011; Fig. 1 of Davey et al. 2012).
In contrast, the simpler algorithm of CAP3 performed much better. While more computationally intensive, it is still feasible on a desktop computer because the locus identification from Stacks significantly reduces the complexity of each individual assembly. Of the 82,721 loci, 72,124 (87.2 percent) assembled into single contigs, all but one containing both the overlapping forward and reverse reads. An additional 5,017 loci assembled into two or more contigs, of which only the largest contig was anchored at the expected restriction enzyme recognition site. Of these, all but 151 contained both the forward and reverse reads. We combined these to produce our final reference set of RAD contigs, which contained 77,141 contigs from 82,721 loci (93.3 percent). Fragment size selection to produce overlapping paired-end reads was remarkably successful, so that over 93 percent of loci produced contigs spanning the forward and reverse reads. Contig lengths ranged from 147-519bp with most between 250 and 450bp (Fig. 3a), suggesting that longer fragments were carried through the gel-based size selection step. The mean number of read pairs contributing to each contig was 379.3. Contig length was positively related to the number of sequence pairs contributing to each assembly (Fig. 3b), so our strategy of pooling individuals to increase coverage at this consensus assembly step appears sound.
Fig. 3.
(a) Frequency histogram of consensus sequence lengths across 77,141 contigs assembled by CAP3 from overlapping paired-end RAD sequencing in admixed WCT populations. (b) Relationship between sequencing depth at each locus (number of sequence pairs from 60 pooled individuals) and RAD contig length.
Genotyping and admixture
We aligned 54bp RAD sequences for 4,914 previously identified SNP loci (Hohenlohe et al. 2011; Amish et al. 2012) against the reference RAD contig set. Of these, 3,456 (70.4 percent) aligned uniquely to a single contig in the reference set with up to 2 mismatches for both the RBT and WCT alleles. In addition, 392 (8.0 percent) aligned to multiple contigs with relatively few mismatches. These multiple contigs appear to represent genomic regions with duplicate sequence beyond the 54bp length of the previously identified RAD sequence. Figure S2 shows one such example in which sequences diverge relatively rapidly beyond the first 54bp, illustrating how longer reads and overlapping paired-end RAD sequencing may provide powerful tools for distinguishing paralogous sequence from polymorphism at homologous loci.
We genotyped each individual at all nucleotide positions aligned to the reference contig set using the maximum likelihood statistical approach described above. Of the 3,456 uniquely aligned diagnostic SNP loci, 3,182 had diploid genotype calls for at least half the individuals sampled. Two of these were likely paralogous loci, with elevated observed heterozygosity (0.95 and 0.80) and reduced FIS (−0.90 and −0.61, respectively), and these were removed from further analysis. The remaining 3,180 loci had observed heterozygosity less than 0.45 and FIS greater than −0.23, suggesting a clear break between them and the two presumptive paralogous loci. We translated genotypes for the final list of 3,180 loci into homozygous WCT, heterozygous, or homozygous RBT and assessed proportion of admixture as simply the frequency of RBT alleles.
For all of the individuals genotyped here, we also had individual-level estimates of admixture proportion based on 7 species-diagnostic microsatellite loci (Boyer et al. 2008). Our SNP-based estimates were highly correlated with previous microsatellite-based estimates overall and within each population (Table 1), although they tended to be slightly lower (Fig. 4a). We detected evidence of introgression in all 53 individuals for which no RBT alleles had been observed at the microsatellite loci. In these individuals, RBT alleles were detected at 1 to 235 loci, leading to individual admixture proportions ranging from 0.0013 to 0.0439 that were undetected in the microsatellite data. Average population-level admixture proportions are also consistent with microsatellite-based estimates (Pearson r = 0.99; p = 0.0013; Fig. 4b), in which Dutch and Meadow exhibited higher levels of admixture than the other three populations, although SNP-based estimates were lower than microsatellite estimates for four of the five populations.
Table 1.
Correlation between previous microsatellite and current SNP-based estimates of individual-level admixture proportions, and super invasive alleles exhibiting significantly elevated introgression with a false discovery rate corrected p-value.
| Population | SNP-microsatellite correlation: r | SNP-microsatellite correlation: p-value | # super invasive alleles | FDR p-value threshold |
|---|---|---|---|---|
| Meadow | 0.879 | <10⁻⁵ | 2 | 3.4 × 10⁻⁵ |
| Nicola | 0.837 | <10⁻⁵ | 5 | 8.0 × 10⁻⁵ |
| Dutch | 0.711 | 0.0009 | 1 | 2.0 × 10⁻³ |
| Lower Hay | 0.960 | <10⁻¹⁰ | 5 | 4.9 × 10⁻⁴ |
| Tepee | 0.844 | <10⁻⁴ | 4 | 4.2 × 10⁻⁴ |
| All 5 populations | 0.805 | <10⁻¹⁵ | 2 | 7.8 × 10⁻⁵ |
Fig. 4.
(a) Individual-level admixture proportions estimated from 7 diagnostic microsatellite loci (Boyer et al. 2008) versus current estimates from 3,180 SNP loci across 94 WCT individuals from 5 populations. Note that many of the points, particularly those with admixture proportions near 0.0, lie on top of each other. (b) Population-level admixture proportions estimated from the same two datasets, calculated using only the individuals genotyped by both Boyer et al. (2008) and the current study.
Comparing admixture proportions across SNP loci reveals a positively skewed distribution within each population and overall, with many loci showing little or no admixture and a small set of outlier loci (Fig. 5). Of the 3,180 diagnostic SNP loci genotyped, 634 showed no RBT alleles in any of the five populations. However, 94 loci exhibited admixture levels of 0.1 or greater across all five populations combined, up to a maximum of 0.542 (Fig. 5f). These are candidate super invasive alleles: RBT alleles that may have spread rapidly or have higher probabilities of persistence in WCT populations. Within each population, between one and five loci exhibited significantly elevated admixture proportions using the heterogeneity test of Long (1991), corrected for false discovery rate (Table 1). Three loci were significantly invasive in two or more populations, one of which was significant across all five populations (Fig. 5).
Fig. 5.
Frequency histograms of admixture proportion across 3,180 diagnostic SNP loci. (a) Meadow. (b) Nicola. (c) Dutch. (d) Lower Hay. (e) Tepee. (f) All 5 populations combined. Arrows indicate super invasive alleles - loci with significantly elevated admixture proportion (α= 0.05, corrected for false discovery rate) independently in two or more populations. (i) RAD locus 118904. (ii) RAD locus 117399. (iii) RAD locus 82847.
We conducted a translated nucleotide BLAST search using the RAD contig sequence for each of these three super invasive alleles. Two of them aligned closely to annotated genes whose function is consistent with selection in hybridized WCT populations. The locus significantly admixed in all five populations (RAD locus 118904) aligned significantly to the vertebrate gene latent transforming growth factor beta binding protein 2 (LTBP2), with the most significant hit in Bos taurus (E-value = 10⁻⁷). The second locus, significantly admixed in the Nicola and Tepee populations (RAD locus 117399), aligned to the vertebrate gene furry homolog-like (FRYL), with the most significant hit in zebrafish (Danio rerio, E-value = 10⁻⁹). It is worth noting that the BLAST alignments to these two annotated gene sequences began at nucleotide positions 191 and 210, respectively, of the RAD contigs, so that the identification of these candidate genes would not have been possible solely with single-end RAD sequence data.
Discussion
Genomic tools hold remarkable promise for conservation and management of many taxa. The ability to rapidly identify and genotype large numbers of genetic markers allows improved estimates of demographic parameters (gene flow, effective population size, population-level admixture), as well as identification of outlier loci (locally adapted genes, invasive alleles). Overlapping paired-end RAD sequencing offers advantages for rapid development of large numbers of candidate SNPs that can be used in high-throughput genotyping assays, particularly in the case of large or repetitive genomes.
In a specific application of this technique, here we assessed genomic patterns of introgression and were able to detect individuals with very low levels of admixture, precisely estimate individual- and population-level admixture, and detect candidate super invasive alleles driven to high frequency by selection. Below we discuss some general aspects of the sequencing technique for conservation genomics and lessons from its application to the genomics of hybridization.
Overlapping paired-end RAD for conservation genomics
By assembling contigs of 400bp or more adjacent to RAD loci, overlapping paired-end RAD provides sufficient flanking sequence for SNP assay design simultaneous with SNP discovery. The ability to generate sufficient flanking sequence has previously been a limitation of RAD sequencing for converting rapid SNP discovery to a set of high-throughput assays (Ogden 2011; Amish et al. 2012). Our approach can rapidly provide a multitude of candidate SNP markers for high-throughput assay development. Here we only analyzed a few thousand diagnostic markers that had been previously identified. In general, the majority of contigs of 300-400bp or longer would be expected to contain SNPs relevant for most population genomic or conservation applications.
Assembling RAD contigs provides more continuous genomic sequence data for discriminating paralogous loci. This is a particular challenge in salmonids because of their ancestral genome duplication, which created homeologous duplicate sequence across the genome (Allendorf & Danzmann 1997; Everett et al. 2011; Seeb et al. 2011b). Here we found examples of loci sharing very similar sequence over ~50bp, so that they were grouped together in previous analysis, but diverged beyond that length. As a result we were able to further screen the candidate diagnostic SNP loci we had previously identified (Hohenlohe et al. 2011; Amish et al. 2012) by removing the 8 percent that aligned to multiple RAD contigs. Ongoing validation of the reduced set will determine the success rate of these refined candidate markers.
Our approach to RAD contig assembly produced a single contig with high average read depth for most of our RAD loci. Nonetheless, the assembly and validation of RAD contigs can be challenging (Davey et al. 2012). Assemblies using the de Bruijn graph technique of Velvet (Zerbino & Birney 2008) produced consistently shorter contigs than a simpler (but more computationally intensive) assembly algorithm in CAP3 (Huang & Madan 1999) (compare Fig. 3 and Fig. S1). This contrasts with the results of Etter et al. (2011), who had better success with Velvet in assembling the reverse reads from non-overlapping paired-end RAD. Willing et al. (2011) used non-overlapping paired-end RAD in guppies and assembled the reverse reads for 91.3% of loci into a single contig with generally lower sequence coverage than used at the assembly step here. That study used the assembler LOCAS, specifically designed by one of the authors for low-coverage data. Davey et al. (2012) had poor results with LOCASOpt and Velvet in assembling paired-end RAD data from Heliconius butterflies, but better results using the computationally intensive VelvetOptimiser. In our trout dataset, over 87% of loci produced a single contig of both forward and reverse reads with CAP3, and many of the remainder could be filtered out as paralogs.
Techniques like overlapping paired-end RAD sequencing may provide new analytical power. Compared to other markers like microsatellites, SNPs can be limiting in that they typically exhibit only two alleles in natural populations. More power to understand population genetic processes would come from using multi-allelic haplotypes instead of SNPs in analyses of high-throughput sequence data (Gompert & Buerkle 2011; Buerkle et al. 2011). Because of the relatively long contigs that can be generated (Etter et al. 2011; Willing et al. 2011), and because haplotype phase is known across read pairs and thus can be inferred along the length of RAD contigs, paired-end RAD offers the possibility of using haplotype- rather than SNP-based analyses. Genealogical relationships among multiple haplotypes are very useful for inferring demographic and evolutionary history (Sunnucks 2000; Beaumont and Rannala 2004).
Assessing genome-wide patterns of introgression
Here we provide one of the first genome-wide assessments of human-mediated introgressive hybridization in salmonid fishes (see also Lamaze et al. 2012). Our results confirm previous patterns of hybridization between introduced RBT and native WCT in the North Fork Flathead system (Boyer et al. 2008; Muhlfeld et al. 2009c). Population-level admixture estimates were generally consistent for diagnostic microsatellites and RAD-based SNP loci, suggesting that thousands of diagnostic loci are generally unnecessary for approximate estimates of population-level admixture. However, one estimate did differ: the estimate for Dutch Creek was over 40 percent higher using the microsatellite data (Fig. 4b). This may be explained by selection against RBT alleles in chromosomal regions near RAD loci and/or sampling error from using only 7 diagnostic microsatellite loci, especially for populations with low levels of introgression. Given the variation in introgression we observed here among SNP loci, the genomic location of those microsatellite loci could also be a major source of variation.
Overestimation of admixture (by using only a handful of neutral loci) could cause populations to not be protected under conservation laws, such as the U.S. Endangered Species Act (ESA). For Lahontan cutthroat trout, listed under the ESA, 10 percent RBT admixture is the threshold for a population to be protected as if it were non-hybridized (pure native) Lahontan. Based on sampling theory for neutral loci, it is likely that 50-100 diagnostic loci would improve accuracy to levels approaching that of thousands of RAD loci, if those diagnostic loci are widely distributed across the genome (Amish et al. 2012).
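The sampling argument can be made concrete with a rough sketch. It assumes the simplest possible model, in which every allele copy at every diagnostic locus is an independent binomial draw with the same underlying admixture proportion q; this ignores linkage, selection, and variation in introgression among loci, all of which are discussed above. The function name and parameter values are illustrative only.

```python
import math

def admixture_standard_error(q, n_loci, n_individuals=1):
    """Binomial standard error of an admixture estimate.

    Simplifying assumption: all 2 * n_loci * n_individuals allele copies are
    independent draws with the same probability q of being a non-native allele.
    """
    copies = 2 * n_loci * n_individuals
    return math.sqrt(q * (1.0 - q) / copies)

# Illustrative comparison for a single individual with q = 0.10.
for n_loci in (7, 50, 100, 3180):
    se = admixture_standard_error(0.10, n_loci)
    print(f"{n_loci:>5} loci: SE = {se:.4f}")
```

Under this model the standard error shrinks with the square root of the number of loci, so most of the gain in precision comes from the first tens of loci, provided they are spread across the genome.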
At the individual level, overlapping paired-end RAD sequencing allowed detection of very low levels of RBT introgression. Here we detected RBT alleles in all 94 samples analyzed, over half of which did not exhibit RBT alleles at 7 microsatellite loci (Boyer et al. 2008). Some of the assumed RBT-diagnostic alleles could actually exist in non-hybridized WCT populations. Additional RAD sequencing of pure-native populations (e.g. isolated above barriers in the Flathead River) could help identify assumed diagnostic RBT alleles that might exist in WCT (e.g. due to maintenance of ancestral polymorphism).
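A simple calculation shows why a small marker panel can miss low-level introgression. The sketch below assumes that each of the 2L allele copies at L diagnostic loci in a diploid individual is independently non-native with probability q; the function name and the value q = 0.02 are illustrative assumptions, not estimates from this study.

```python
def prob_no_rbt_allele(q, n_loci):
    """Probability that a diploid individual with admixture proportion q shows
    no non-native allele at any of n_loci diagnostic loci, assuming all
    2 * n_loci allele copies are independent."""
    return (1.0 - q) ** (2 * n_loci)

# Illustrative comparison: 7 loci versus 3,180 loci for q = 0.02.
p_small_panel = prob_no_rbt_allele(0.02, 7)
p_large_panel = prob_no_rbt_allele(0.02, 3180)
```

With these assumed values, an individual would usually appear non-hybridized on the 7-locus panel, whereas the chance of observing no non-native allele across 3,180 loci is effectively zero.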
Genome-wide marker coverage is an important advance for conservation and management because it allows powerful screening of individuals to prevent inadvertent release of hybridized individuals into populations (e.g. during assisted migration, broodstock development, translocation and reintroduction), and identification of markers for rapid screening for early detection of hybridization. From a landscape genetics perspective, the ability to precisely estimate admixture would allow fine spatial mapping of hybridization and introgression patterns. This approach may be useful in monitoring and preventing the spread of invasive species and their alleles in many plant and animal species facing hybridization threats in nature (Schwartz et al. 2007).
Dense coverage of markers across the genome allows for detection of candidate super invasive alleles - alleles of an invasive taxon that rise to much higher frequency (level of introgression) than the genomic background, analogous to outlier loci in genome scans for selection (Luikart et al. 2003). Here we detected several candidate super invasive alleles, as evidenced by a long tail of outlier loci in the distribution of admixture proportions among SNPs in all populations. Several of these loci were consistent outliers across populations. Further study is needed to confirm that these are indeed RBT alleles that have introgressed into these WCT populations. The haplotype information provided by longer overlapping paired-end RAD (e.g. using 250bp reads as provided by Illumina MiSeq technology) may facilitate that analysis. Further study would also be needed to identify the phenotypic and fitness consequences of these invasive alleles.
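The idea of outliers that recur across populations can be sketched as follows. This is a rank-based illustration only, not the statistical procedure used in this study: it assumes per-locus admixture proportions are available for each population, flags the loci in the upper tail of each distribution, and keeps those flagged in several populations. The function names, the tail fraction, and the toy data are all hypothetical.

```python
from collections import Counter

def upper_tail_loci(admixture_by_locus, tail_fraction=0.01):
    """Return the loci in the upper tail of one population's distribution
    of per-locus admixture proportions (at least one locus is returned)."""
    ranked = sorted(admixture_by_locus, key=admixture_by_locus.get, reverse=True)
    n_tail = max(1, int(len(ranked) * tail_fraction))
    return set(ranked[:n_tail])

def shared_outliers(populations, tail_fraction=0.01, min_populations=2):
    """Return loci that fall in the upper tail in at least `min_populations`
    of the populations in the `populations` mapping."""
    counts = Counter()
    for admixture_by_locus in populations.values():
        counts.update(upper_tail_loci(admixture_by_locus, tail_fraction))
    return sorted(locus for locus, n in counts.items() if n >= min_populations)

# Hypothetical toy data: locus l1 is extreme in populations A and B.
toy = {
    "A": {"l1": 0.90, "l2": 0.05, "l3": 0.02, "l4": 0.01},
    "B": {"l1": 0.80, "l2": 0.03, "l3": 0.60, "l4": 0.00},
    "C": {"l1": 0.02, "l2": 0.01, "l3": 0.70, "l4": 0.00},
}
candidates = shared_outliers(toy, tail_fraction=0.25, min_populations=2)
```

A rank-based cutoff always flags some loci, so it is useful only for prioritizing candidates; a formal test against a neutral expectation would be needed before interpreting any locus as being under selection.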
BLAST searching revealed close sequence matches for two candidate invasive alleles to vertebrate genes (LTBP-2 and FRYL). Super invasive alleles may be under positive selection and increase fitness in hybridized populations. Alternatively, they may spread by having phenotypic effects on dispersal or through segregation distortion, despite reducing overall fitness from outbreeding depression (Shine 2011). The LTBP family of proteins interacts with TGF-beta and has a wide range of developmental and physiological functions, including effects on fertility (Morén et al. 1994; Öklü & Hesketh 2000; Kosova et al. 2012), although the specific relationship between LTBP-2 and TGF-beta is unclear (Hirani et al. 2007). In RBT, the related protein LTBP-3 and other related proteins have been implicated in early ovarian development and early embryonic development (Andersson & Eggen 2006; Lankford & Weber 2010; Gahr et al. 2012), suggesting the hypothesis that the RBT allele at this locus positively affects fecundity in admixed individuals. Future research and additional studies like this one should help clarify the mechanisms driving super invasive alleles and genome-wide introgression in natural populations.
Figure S1: Frequency histograms of contig length across all loci assembled from overlapping paired-end RAD sequencing, using Velvet. Shown are results for four different fixed k-mer lengths in Velvet assembly. (a) k = 25bp. (b) k = 35bp. (c) k = 45bp. (d) k = 55bp.
Figure S2: A representative example of discrimination of duplicate sequence with longer reads and paired-end RAD sequencing. A single previously identified 54bp RAD tag sequence aligned to 22 of the RAD contigs produced in the current study. To illustrate the sequence divergence across these contig sequences beyond the initial 54bp, the figure shows nucleotide differentiation in the form of expected heterozygosity. The previously identified sequence presumably corresponds to multiple locations across the genome. With paired-end contig assembly, it becomes clear that there are four polymorphisms within the duplicated 54bp region, but sequence among paralogs diverges widely farther from the RAD site.
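The per-site measure named in the Figure S2 caption can be sketched as follows, using the standard definition of expected heterozygosity, H = 1 - sum(p_i^2), where p_i is the frequency of base i at a site. The sketch assumes the contig sequences are already aligned and of equal length; the function names and toy sequences are hypothetical, and the treatment of gaps and ambiguous bases (ignored here) is an assumption.

```python
from collections import Counter

def expected_heterozygosity(column):
    """Expected heterozygosity, 1 - sum(p_i^2), for one alignment column.
    Characters other than A, C, G, T (e.g. gaps, N) are ignored."""
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    n = len(bases)
    return 1.0 - sum((count / n) ** 2 for count in Counter(bases).values())

def heterozygosity_profile(aligned_seqs):
    """Per-site expected heterozygosity across equal-length aligned sequences."""
    return [expected_heterozygosity("".join(col)) for col in zip(*aligned_seqs)]

# Hypothetical toy alignment of four contigs, four sites long.
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTA"]
profile = heterozygosity_profile(toy_alignment)
```

Sites where all contigs agree score 0, and the value rises as paralogous contigs diverge, which is the pattern the figure uses to show divergence beyond the initial 54bp.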
Supplementary Material
Acknowledgements
P.A.H. and M.D.D. received support from U.S. National Institutes of Health/NCRR grant P20RR16448 (L. Forney, PI). F.W.A. and G.L. were partially supported by U.S. National Science Foundation grant DEB-0742181. G.L. also received support from Montana Fish Wildlife and Parks and NSF grant DEB-1067613. This work was partially supported by BPA contract #199101993. We thank J.J. Giersch for providing the study area map, Robb Leary for helpful comments on the potential of some assumed diagnostic alleles for RBT to be present at low frequency in nonhybridized WCT, and Montana Fish Wildlife and Parks and Glacier National Park for support and help in sampling. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. This research was conducted in accordance with the Animal Welfare Act and its subsequent amendments.
Footnotes
Data Accessibility
Raw sequence data, RAD contig sequences, and genotype data: Dryad doi:10.5061/dryad.32b88.
References
- Angeloni F, Wagemaker N, Vergeer P, Ouborg J. Genomic toolboxes for conservation biologists. Evolutionary Applications. 2011;5:130–143. doi: 10.1111/j.1752-4571.2011.00217.x.
- Allendorf FW, Danzmann RG. Secondary tetrasomic segregation of MDH-B and preferential pairing of homeologues in rainbow trout. Genetics. 1997;145:1083–1092. doi: 10.1093/genetics/145.4.1083.
- Allendorf FW, Leary RF, Spruell P, Wenburg JK. The problems with hybrids: setting conservation guidelines. Trends in Ecology and Evolution. 2001;16:613–622.
- Allendorf FW, Hohenlohe PA, Luikart G. Genomics and the future of conservation genetics. Nature Reviews Genetics. 2010;11:697–709. doi: 10.1038/nrg2844.
- Amish SJ, Hohenlohe PA, Painter S, Leary RF, Muhlfeld C, Allendorf FW, Luikart G. RAD sequencing yields a high success rate for westslope cutthroat and rainbow trout species-diagnostic SNP assays. Molecular Ecology Resources. 2012;12:653–660. doi: 10.1111/j.1755-0998.2012.03157.x.
- Andersson ML, Eggen RI. Transcription of the fish latent TGFbeta-binding protein gene is controlled by estrogen receptor alpha. Toxicology in Vitro. 2006;20:417–425. doi: 10.1016/j.tiv.2005.08.010.
- Baird NA, Etter PD, Atwood TS, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS ONE. 2008;3:e3376. doi: 10.1371/journal.pone.0003376.
- Baxter SW, Davey JW, Johnston JS, et al. Linkage mapping and comparative genomics using next-generation RAD sequencing of a non-model organism. PLoS ONE. 2011;6:e19315. doi: 10.1371/journal.pone.0019315.
- Beaumont M, Rannala B. The Bayesian revolution in genetics. Nature Reviews Genetics. 2004;5:251–261. doi: 10.1038/nrg1318.
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. 1995;57:289–300.
- Boyer MC, Muhlfeld CC, Allendorf FW. Rainbow trout (Oncorhynchus mykiss) invasion and the spread of hybridization with native westslope cutthroat trout (Oncorhynchus clarkii lewisi). Canadian Journal of Fisheries and Aquatic Sciences. 2008;65:658–669.
- Buerkle CA, Gompert Z, Parchman TL. The n = 1 constraint in population genomics. Molecular Ecology. 2011;20:1575–1581. doi: 10.1111/j.1365-294X.2011.05046.x.
- Catchen JM, Amores A, Hohenlohe PA, Cresko WA, Postlethwait JH. Stacks: building and genotyping loci de novo from short-read sequences. G3 Genes Genomes Genetics. 2011;1:171–182. doi: 10.1534/g3.111.000240.
- Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics. 2011;12:499–510. doi: 10.1038/nrg3012.
- Davey JW, Cezard T, Fuentes-Utrilla P, et al. Special features of RAD sequencing data: implications for genotyping. Molecular Ecology. 2012 (this issue). doi: 10.1111/mec.12084.
- Etter PD, Preston JL, Bassham S, Cresko WA, Johnson EA. Local de novo assembly of RAD paired-end contigs using short sequencing reads. PLoS ONE. 2011a;6:e18561. doi: 10.1371/journal.pone.0018561.
- Etter PD, Bassham S, Hohenlohe PA, Johnson EA, Cresko WA. SNP discovery and genotyping for evolutionary genetics using RAD sequencing. In: Orgogozo V, Rockman MV, editors. Molecular Methods for Evolutionary Genetics. Humana Press; New York: 2011b. pp. 157–178.
- Etter PD, Johnson EA. RAD paired-end sequencing for local de novo assembly and SNP discovery in non-model organisms. In: Pompanon F, Bonin A, editors. Data Production and Analysis in Population Genomics: Methods and Protocols. Humana Press; New York: 2012. pp. 135–151.
- Everett MV, Grau ED, Seeb JE. Short reads and nonmodel species: exploring the complexities of next-generation sequence assembly and SNP discovery in the absence of a reference genome. Molecular Ecology Resources. 2011;11:93–108. doi: 10.1111/j.1755-0998.2010.02969.x.
- Finger AJ, Stephens MR, Clipperton NW, May B. Six diagnostic single nucleotide polymorphism markers for detecting introgression between cutthroat and rainbow trout. Molecular Ecology Resources. 2009;9:759–763. doi: 10.1111/j.1755-0998.2009.02532.x.
- Fitzpatrick BM, Johnson JR, Kump DK, Shaffer HB, Smith JJ, Voss SR. Rapid fixation of non-native alleles revealed by genome-wide SNP analysis of hybrid tiger salamanders. BMC Evolutionary Biology. 2009;9:176. doi: 10.1186/1471-2148-9-176.
- Fitzpatrick BM, Johnson JR, Kump DK, Smith JJ, Voss SR, Shaffer HB. Rapid spread of invasive genes into a threatened native species. Proceedings of the National Academy of Sciences USA. 2010;107:3606–3610. doi: 10.1073/pnas.0911802107.
- Gahr SA, Weber GM, Rexroad CE III. Identification and expression of Smads associated with TGF-β/activin/nodal signaling pathways in the rainbow trout (Oncorhynchus mykiss). Fish Physiology and Biochemistry. 2012;38:1233–1244. doi: 10.1007/s10695-012-9611-7.
- Gompert Z, Buerkle CA. A powerful regression-based method for admixture mapping of isolation across the genome of hybrids. Molecular Ecology. 2009;18:1207–1224. doi: 10.1111/j.1365-294X.2009.04098.x.
- Halverson A. An Entirely Synthetic Fish: How Rainbow Trout Beguiled America and Overran the World. Yale University Press; New Haven, CT: 2010.
- Hitt NP, Frissell CA, Muhlfeld CC, Allendorf FW. Spread of hybridization between native westslope cutthroat trout, Oncorhynchus clarki lewisi, and nonnative rainbow trout, Oncorhynchus mykiss. Canadian Journal of Fisheries and Aquatic Sciences. 2003;60:1440–1451.
- Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. Population genomic analysis of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genetics. 2010;6:e1000862. doi: 10.1371/journal.pgen.1000862.
- Hohenlohe PA, Amish SJ, Catchen JM, Allendorf FW, Luikart G. Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout. Molecular Ecology Resources. 2011;11:117–122. doi: 10.1111/j.1755-0998.2010.02967.x.
- Huang X, Madan A. CAP3: A DNA sequence assembly program. Genome Research. 1999;9:868–877. doi: 10.1101/gr.9.9.868.
- Kalinowski ST, Novak BJ, Drinan DP, Jennings deM, Vu NV. Diagnostic single nucleotide polymorphisms for identifying westslope cutthroat trout (Oncorhynchus clarki lewisi), Yellowstone cutthroat trout (Oncorhynchus clarki bouvieri) and rainbow trout (Oncorhynchus mykiss). Molecular Ecology Resources. 2011;11:389–393. doi: 10.1111/j.1755-0998.2010.02932.x.
- Kelly BP, Whiteley A, Tallmon D. The Arctic melting pot. Nature. 2010;468:891. doi: 10.1038/468891a.
- Kosova G, Scott NM, Niederberger C, Prins GS, Ober C. Genome-wide association study identifies candidate genes for male fertility traits in humans. The American Journal of Human Genetics. 2012;90:950–961. doi: 10.1016/j.ajhg.2012.04.016.
- Lamaze FC, Sauvage C, Marie A, Garant D, Bernatchez L. Dynamics of introgressive hybridization assessed by SNP population genomics of coding genes in stocked brook charr (Salvelinus fontinalis). Molecular Ecology. 2012;21:2877–2895. doi: 10.1111/j.1365-294X.2012.05579.x.
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- Lankford SE, Weber GM. Temporal mRNA expression of transforming growth factor-beta superfamily members and inhibitors in the developing rainbow trout ovary. General and Comparative Endocrinology. 2010;166:250–258. doi: 10.1016/j.ygcen.2009.09.007.
- Long JC. The genetic structure of admixed populations. Genetics. 1991;127:417–428. doi: 10.1093/genetics/127.2.417.
- Luikart G, England PR, Tallmon D, Jordan S, Taberlet P. The power and promise of population genomics: from genotyping to genome typing. Nature Reviews Genetics. 2003;4:981–994. doi: 10.1038/nrg1226.
- McGlauflin MT, Smith MJ, Wang JT, et al. High-resolution melting analysis for the discovery of novel single-nucleotide polymorphisms in rainbow and cutthroat trout for species identification. Transactions of the American Fisheries Society. 2010;139:676–684.
- Miller RR, Williams JD, Williams JE. Extinctions of North American fishes during the past century. Fisheries. 1989;14:22–38.
- Miller JM, Poissant J, Hogg JT, Coltman DW. Genomic consequences of genetic rescue in an insular population of bighorn sheep (Ovis canadensis). Molecular Ecology. 2012;21:1583–1596. doi: 10.1111/j.1365-294X.2011.05427.x.
- Miller MR, Brunelli JP, Wheeler PA, Liu S, Rexroad CE III, Palti Y, Doe CQ, Thorgaard GH. A conserved haplotype controls parallel adaptation in geographically distant salmonid populations. Molecular Ecology. 2011;21:237–249. doi: 10.1111/j.1365-294X.2011.05305.x.
- Morén A, Olofsson A, Stenman G, et al. Identification and characterization of LTBP-2, a novel latent transforming growth factor-β-binding protein. Journal of Biological Chemistry. 1994;269:32469–32478.
- Morin PA, Luikart G, Wayne RK, SNP workshop group. SNPs in ecology, evolution and conservation. Trends in Ecology and Evolution. 2004;19:208–216.
- Muhlfeld CC, Kalinowski ST, McMahon TE, Taper ML, Painter S, Leary RF, Allendorf FW. Hybridization rapidly reduces fitness of a native trout in the wild. Biology Letters. 2009a;5:328–331. doi: 10.1098/rsbl.2009.0033.
- Muhlfeld CC, McMahon TE, Belcer D, Kershner JL. Spatial and temporal spawning dynamics of native westslope cutthroat trout, Oncorhynchus clarkii lewisi, introduced rainbow trout, Oncorhynchus mykiss, and their hybrids. Canadian Journal of Fisheries and Aquatic Sciences. 2009b;66:1153–1168.
- Muhlfeld CC, McMahon TE, Boyer MC, Gresswell RE. Local habitat, watershed, and biotic factors influencing the spread of hybridizations between native westslope cutthroat trout and introduced rainbow trout. Transactions of the American Fisheries Society. 2009c;138:1036–1051.
- Narum SR, Campbell NR, Kozfkay CC, Meyer KA. Adaptation of redband trout in desert and montane environments. Molecular Ecology. 2010;19:4622–4637. doi: 10.1111/j.1365-294X.2010.04839.x.
- Ogden R. Unlocking the potential of genomic technologies for wildlife forensics. Molecular Ecology Resources. 2011;11:109–116. doi: 10.1111/j.1755-0998.2010.02954.x.
- Öklü R, Hesketh R. The latent transforming growth factor β binding protein (LTBP) family. Biochemical Journal. 2000;352:601–610.
- Perry GML, Danzmann RG, Ferguson MM, Gibson JP. Quantitative trait loci for upper thermal tolerance in outbred strains of rainbow trout (Oncorhynchus mykiss). Heredity. 2001;86:333–341. doi: 10.1046/j.1365-2540.2001.00838.x.
- Schwartz MK, Luikart G, Waples RS. Genetic monitoring as a promising tool for conservation and management. Trends in Ecology and Evolution. 2007;22:25–33. doi: 10.1016/j.tree.2006.08.009.
- Seeb JE, Pascal CE, Ramakrishnan R, Seeb LW. SNP genotyping by the 5′-nuclease reaction: advances in high throughput genotyping with non-model organisms. In: Komar A, editor. Methods in Molecular Biology, Single Nucleotide Polymorphisms. 2nd edn. Humana Press; New York: 2009. pp. 277–292.
- Seeb JE, Carvalho G, Hauser L, Naish K, Roberts S, Seeb LW. Single-nucleotide polymorphism (SNP) discovery and applications of SNP genotyping in nonmodel organisms. Molecular Ecology Resources. 2011a;11:1–8. doi: 10.1111/j.1755-0998.2010.02979.x.
- Seeb LW, Templin WD, Sato S, Abe S, Warheit K, Park JY, Seeb JE. Single nucleotide polymorphisms across a species’ range: implications for conservation studies of Pacific salmon. Molecular Ecology Resources. 2011b;11:195–217. doi: 10.1111/j.1755-0998.2010.02966.x.
- Shepard BB, May BE, Urie W. Status and conservation of westslope cutthroat within the western United States. North American Journal of Fisheries Management. 2005;25:1426–1440.
- Shine R. Invasive species as drivers of evolutionary change: cane toads in tropical Australia. Evolutionary Applications. 2011;5:107–116. doi: 10.1111/j.1752-4571.2011.00201.x.
- Sunnucks P. Efficient genetic markers for population biology. Trends in Ecology & Evolution. 2000;15:199–203. doi: 10.1016/s0169-5347(00)01825-5.
- Teeter KC, Thibodeau LM, Gompert Z, Buerkle CA, Nachman MW, Tucker PK. The variable genomic architecture of isolation between hybridizing species of house mice. Evolution. 2010;64:472–485. doi: 10.1111/j.1558-5646.2009.00846.x.
- Twyford AD, Ennos RA. Next-generation hybridization and introgression. Heredity. 2012;108:179–189. doi: 10.1038/hdy.2011.68.
- Willing E-M, Hoffmann M, Klein JD, Weigel D, Dreyer C. Paired-end RAD-seq for de-novo assembly and marker design without available reference. Bioinformatics. 2011;27:2187–2193. doi: 10.1093/bioinformatics/btr346.
- Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research. 2008;18:821–829. doi: 10.1101/gr.074492.107.