Abstract
Evolutionary biologists from Darwin forward have dreamed of having data that would elucidate our understanding of evolutionary history and the diversity of life. Sequence capture is a relatively old DNA technology, but its use is growing rapidly due to advances in: 1) massively parallel DNA sequencing approaches and instruments, 2) massively parallel bait construction, 3) methods to identify target regions, and 4) sample preparation. We give a little historical context to these developments, summarize some of the important advances reported in this special issue, and point to further advances that can be made to help fulfill Darwin’s dream.
Keywords: biotinylated baits, Illumina, massively parallel sequencing, next-generation sequencing, sequence capture, target enrichment
Introduction
One of Darwin’s primary goals was to understand and explain “the [evolutionary] bond” that connected the diversity of life (Darwin 1859). Although Darwin did not understand the explicit genetic mechanisms underlying shared evolutionary history, he was convinced that careful scientific study and testing of his theory would elucidate both the mechanisms and effects of evolution by natural selection. In pursuit of Darwin’s goal, Nuttall (1901) was among the first to use molecular genetic markers, decades before DNA was shown to carry genetic information, by studying blood protein interactions to infer evolutionary relationships. The discovery of the molecular structure of DNA by Crick, Franklin, Watson, and Wilkins (Watson et al. 1953; Franklin & Gosling 1953; Wilkins et al. 1953) made clear the mechanism of information transfer, Sanger et al. (1977) gave us a method for determining the sequence of DNA, and Mullis and Faloona (1987) provided an immensely powerful tool (PCR) for selectively amplifying DNA regions of interest that could be sequenced. Combining the power of PCR with conserved primers (e.g., Kocher et al. 1989) or hypervariable loci (e.g., Tautz 1989) facilitated decades of research. Evolutionary insight was gained through very hard work that applied these methods to collect data from relatively few loci that were often limited to the scope of the specific questions being addressed, whether they were phylogenetic, phylogeographic, population genetic, behavioral, or ecological.
The invention of massively parallel sequencing (MPS; Margulies et al. 2005; Bentley et al. 2008) fundamentally altered this status quo by providing a torrent of data while simultaneously dropping the cost of DNA sequencing to pennies per million bases (Glenn 2011, 2016). The ability to collect massive amounts of sequence data enabled many studies that were previously infeasible and changed a number of our assumptions about the universe of possible sequence data collection techniques (Tautz et al. 2010). Although MPS allows us to collect massive amounts of data at low cost, a variety of methodological, financial, and analytical limitations still impede our desire to simply sequence everything (Kahvejian et al. 2008; Koepfli et al. 2015; Jones & Good 2016).
In lieu of sequencing everything, many researchers want techniques that collect data from a large number of loci across many organisms at a low, per-sample cost. Many different groups have created new ways to sample consistent, multilocus subsets of the genome from many individuals (Hardenbol et al. 2003; Meyer et al. 2007). Broadly, these methods tend to fall into one of two categories. The first uses restriction-enzymes to sample a consistent portion of the genome (Baird et al. 2008; Elshire et al. 2011; Peterson et al. 2012), and these techniques have taken center-stage for non-model organism studies at the population level (Narum et al. 2013). The second, which is often used in biomedical research as well as some evolutionary studies, is sequence capture (Albert et al. 2007; Ng et al. 2009; Faircloth et al. 2012; Tennessen et al. 2012; Lemmon et al. 2012; Bi et al. 2012).
Sequence capture is a type of target enrichment (Mamanova et al. 2010) that hybridizes single-stranded DNA or RNA baits (also called probes) to target DNA regions, physically pulls-down the targeted DNA regions of interest, and washes away unwanted DNA fragments so that the targeted DNA can be sequenced (Karagyozov et al. 1993; Kandpal et al. 1994; Okou et al. 2007; Albert et al. 2007; Gnirke et al. 2009). The background and use of sequence capture in the context of molecular ecology have recently been reviewed well by Jones and Good (2016), with which we assume readers are familiar. Here, we discuss some historical context surrounding sequence capture, then we focus on the advances made by researchers with publications in this special issue, and we close with a discussion of questions and research avenues related to sequence capture that are yet to be explored.
History of Sequence Capture
Sequence capture is a relatively old DNA technology, with a clear history in the early 1990s, when many laboratories were developing microsatellite DNA loci (Tautz 1989; Ellegren 2004). Because microsatellites were relatively abundant in the first species investigated (~1% of cloned small DNA fragments contained microsatellite repeats in mammals), microsatellite loci could be identified by simple, though inefficient, hybridization-screening of bacterial clones and Sanger sequencing of positive colonies. As the desire grew to sequence larger numbers of microsatellite loci for genome mapping purposes, and as additional workers began to focus on non-mammalian taxa where microsatellite loci are less frequent, the need increased to develop more efficient methods for microsatellite discovery and characterization.
Brenig and Brem (1991) were the first to develop a microsatellite enrichment method. Their process attached oligonucleotides composed of microsatellite repeats to a physical surface. Hybridization of DNA when one strand is attached to a surface is not efficient, so Brenig and Brem’s method required large quantities of input DNA and only yielded ~10% of captured fragments with targeted microsatellites. Shortly thereafter, Ostrander et al. (1992) published a method that used special bacterial cloning techniques and methods derived from site-directed mutagenesis (Kunkel 1985) to enrich short-insert genomic libraries so that ~40-50% of clones contained microsatellites. Karagyozov et al. (1993) and Armour et al. (1994) then showed that oligonucleotide probes could simply be attached to nitrocellulose filters to enrich for DNA fragments containing microsatellite repeats, but again, hybridizing DNA to probes bound to a physical surface was not efficient, although it was much easier than the microbiology required for the Ostrander et al. (1992) approach.
Kandpal et al. (1994) first demonstrated an in-solution method where they hybridized DNA to biotinylated oligonucleotides of microsatellite repeats and used beads coated with streptavidin to pull down the biotinylated probes attached to the DNA during hybridization. Kijas et al. (1994) then modified the approach of Kandpal et al. (1994) to use streptavidin coated magnetic particles, which improved both its ease and efficiency. Further refinements of the Kijas et al. (1994) methods have been in widespread use for the past two decades (e.g., Hamilton et al. 1999; Zane et al. 2002; Glenn & Schable 2005), only recently being displaced by microsatellite characterization methods that simply sequence random genomic libraries using low-cost MPS (e.g., Castoe et al. 2012) rather than enriching, cloning, and Sanger sequencing microsatellite libraries or enrichment followed by MPS.
Shortly after MPS methods became available, the history of capturing DNA with synthetic probes was recapitulated, but this time, Albert et al. (2007) and others (see references in Jones & Good 2016) demonstrated that the probes could be thousands of oligonucleotides synthesized on microarrays. Gnirke et al. (2009) and others (see references in Jones & Good 2016) showed that cleaving the oligonucleotides from the surface of the microarray chips to facilitate in-solution hybridization of biotinylated oligonucleotides followed by pull-down with streptavidin-coated magnetic beads was superior to capturing desired targets with oligonucleotides attached to the solid surface of microarrays. In the years that have followed, nearly all companies and researchers have adopted these in-solution approaches while also adopting the term “baits” in place of “probes” (cf. Blumenstiel et al. 2010).
Special Issue Summary
The papers in this special issue demonstrate the varied and novel uses of in-solution sequence capture to collect genome-scale data from a large number of individuals across a variety of challenging scenarios. These difficult scenarios include data collection across different types of organisms (vertebrates, invertebrates, plants, bacteria, and viruses) from different types of target regions (exons, mitochondrial DNA, ultraconserved elements [UCEs], and dsRNA) across extraordinarily different genome sizes (viruses to salamanders and pine trees) using various types of capture baits (RNA, DNA, and monoclonal antibodies) from libraries made from DNA of variable quality (from high- to low-quality modern DNA to low-quality museum DNA). Each of these papers also focuses on specific methodological advances that were needed to tackle these challenging scenarios, several of which we highlight below.
Genome size and enrichment success
C-values of organismal genomes are the subject of much theoretical (Thomas 1971; Eddy 2012) and methodological consternation (Nystedt et al. 2013; Neale et al. 2014), and organisms having large genomes are particularly difficult to work with, even with the multitude of available techniques. Because sequence capture can explicitly target and enrich particular genomic regions, it may offer one of the few efficient ways to collect data from organisms having large genomes. However, exactly how to make sequence capture function optimally in these organisms is poorly understood. In this issue, several papers tackle this problem, demonstrating a variety of approaches to increase the efficiency of sequence capture from frogs (Portik et al. 2016), salamanders (McCartney-Melstad et al. 2016), and pines (Suren et al. 2016). Suren et al. (2016) also provide methods to deal with limitations imposed by incomplete reference genome assemblies, while Portik et al. (2016) and McCartney-Melstad et al. (2016) demonstrate the importance of different types of blocking DNA, which reduces the negative effect of non-specific hybridization, a particularly acute problem when collecting data from large-genome organisms. McCartney-Melstad et al. (2016) and Portik et al. (2016) also show the inverse relationship between input DNA concentration or library pooling and enrichment success. By contrast, Hoffberg et al. (2016) show that in organisms with smaller genomes and using libraries prepared with restriction enzymes, at least 96 samples can be pooled and successfully enriched for desired targets.
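To make the genome-size problem concrete, the following is a minimal Python sketch of one common way to express enrichment: the observed fraction of on-target reads divided by the fraction expected from unenriched shotgun sequencing (target size over genome size). The function names and all numbers are hypothetical and are not taken from any paper in this issue, and the sketch assumes that reads sample the genome uniformly in the absence of enrichment.

```python
def expected_on_target_fraction(target_bp: int, genome_bp: int) -> float:
    """Fraction of reads expected on target without enrichment,
    assuming uniform shotgun sampling of the genome (an assumption)."""
    if target_bp <= 0 or genome_bp <= 0 or target_bp > genome_bp:
        raise ValueError("need 0 < target_bp <= genome_bp")
    return target_bp / genome_bp


def fold_enrichment(on_target_reads: int, total_reads: int,
                    target_bp: int, genome_bp: int) -> float:
    """Observed on-target fraction divided by the unenriched expectation."""
    if total_reads <= 0 or not 0 <= on_target_reads <= total_reads:
        raise ValueError("need 0 <= on_target_reads <= total_reads, total_reads > 0")
    observed = on_target_reads / total_reads
    return observed / expected_on_target_fraction(target_bp, genome_bp)


# Hypothetical example: a 1 Mb target, 20% of reads on target,
# in a 1 Gb genome versus a 30 Gb genome.
for genome_bp in (1_000_000_000, 30_000_000_000):
    fold = fold_enrichment(200_000, 1_000_000, 1_000_000, genome_bp)
    print(f"genome = {genome_bp:,} bp, fold enrichment = {fold:,.0f}")
```

Under these assumptions, the same bait set must achieve roughly 30 times more enrichment in the 30 Gb genome than in the 1 Gb genome to return the same fraction of on-target reads, which is one way to see why blocking and input optimization matter most for large-genome organisms.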
Sample quality
Molecular ecologists frequently deal with DNA of suboptimal quantity and quality. These samples are sometimes collected under difficult circumstances in the field (Roffler et al. 2016), they may represent partially degraded DNA from gut contents (Campana et al. 2016), or they can be highly degraded, historical DNA extracted from museum samples (McCormack et al. 2015; Hawkins et al. 2016; Lim & Braun 2016). Even fresh samples collected from many invertebrates (Campana et al. 2016; Dowle et al. 2016; Teasdale et al. 2016; Yuan et al. 2016) and plants (Blouin et al. 2016; Schmickl et al. 2016; Suren et al. 2016; Hoffberg et al. 2016) are difficult to work with because the required extraction procedures physically damage the resulting DNA or leave impurities. Sequence capture makes these samples usable when other protocols, like RAD-seq, may not work well with damaged or impure DNA. Lim & Braun (2016) illustrate techniques that can be used to minimize the amount and effects of damaged DNA in the resulting data.
Baits
Most papers in this issue develop new bait sets, most frequently for exons (Bragg et al. 2016; McCartney-Melstad et al. 2016; Portik et al. 2016; Powell et al. 2016; Roffler et al. 2016; Suren et al. 2016; Teasdale et al. 2016; Yuan et al. 2016), and many of the papers investigate the taxonomic range across which these newly developed baits may be useful, often focusing on the level of sequence divergence that baits can tolerate while still producing a successful outcome (Bragg et al. 2016; Portik et al. 2016; Suren et al. 2016). Twelve of the 15 papers in this issue use RNA baits from MYcroarray (www.mycroarray.com), whereas two studies (Bragg et al. 2016; Suren et al. 2016) use DNA baits from NimbleGen (sequencing.roche.com), one (Powell et al. 2016) uses RNA baits from Agilent (www.genomics.agilent.com), and one (Blouin et al. 2016) uses a monoclonal antibody to dsRNA. Hoffberg et al. (2016) take a new approach to bait design that combines the positive aspects of RAD-seq with those of sequence capture to reliably enrich polymorphic, anonymous loci from hundreds of individuals using a protocol that is exceptionally fast, easy, and cost-effective.
Applications
It is possible to group the papers in this issue in a variety of ways, but regardless of grouping, it is clear that the applications of sequence capture represented by each are very diverse and include: genome mapping (Suren et al. 2016), population genetics and phylogeography (McCormack et al. 2015; Lim & Braun 2016; McCartney-Melstad et al. 2016; Hoffberg et al. 2016), parasite and disease detection and ecology (Blouin et al. 2016; Campana et al. 2016; Yuan et al. 2016), environmental monitoring (Dowle et al. 2016), phylogenetics (Bragg et al. 2016; Hawkins et al. 2016; Portik et al. 2016; Schmickl et al. 2016; Teasdale et al. 2016), and the identification of candidate genes influenced by selection (Powell et al. 2016; Roffler et al. 2016).
Open questions and future prospects
Despite the methodological and technical advances made by these and other publications, there are a number of unanswered questions that affect how we use sequence capture, and there are several promising avenues for future research that have not been investigated.
Not all baits are created equal
Although most researchers use in-solution biotinylated DNA or RNA baits, there is a wide variety of options for bait design and many unknowns regarding the optimality of both bait design and hybridization conditions. For example, we know little about the effects of tiling density (cf. Tewhey et al. 2009) or bait sequence on the downstream success of a given enrichment reaction. Similarly, we do not have a good understanding of the differences in efficiency between RNA baits (Gnirke et al. 2009) and DNA baits when their sequence is identical, and little empirical work focuses on understanding the relationship between bait concentration and enrichment success across organisms spanning a variety of genome sizes.
Bait length
The role of bait length relative to enrichment success is also understudied. The length of individual target enrichment baits matters, and unlike PCR primers, which are short and sensitive to 3’ mismatches, target enrichment baits are long (60-120 mer), insensitive to 3’ mismatches, and tolerant to mismatches with their desired targets as a function of their length. However, there are important trade-offs regarding bait length, because longer baits: 1) cost more to synthesize, 2) will contain more synthesis errors, 3) are limited in length by synthesis technologies, 4) have greater potential for secondary structures, and 5) take longer to hybridize. Many studies have settled on using 120-mer RNA baits to strike a balance among these factors, although in certain situations shorter baits would be a better choice for enrichments from degraded and/or formalin-fixed DNA whereas longer baits could pull down larger fragments of DNA that are suitable for sequencing with long-read technologies (Eid et al. 2009; Quick et al. 2014; Jain et al. 2015).
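As an illustration of how bait length and tiling density interact during bait design, here is a minimal Python sketch that tiles fixed-length baits across a single target sequence. The function name `tile_baits`, its parameters, and the definition of tiling density as the average number of baits covering each base are assumptions made for this example; real design pipelines also screen baits for repeats, GC content, and secondary structure, which this sketch ignores.

```python
def tile_baits(target: str, bait_length: int = 120,
               tiling_density: float = 2.0) -> list[tuple[int, str]]:
    """Return (start, sequence) pairs for baits tiled across `target`.

    tiling_density is treated here as the average number of baits covering
    each base (an assumed definition): 1.0 means end-to-end baits,
    2.0 means each bait overlaps its neighbor by half its length.
    """
    if bait_length <= 0:
        raise ValueError("bait_length must be positive")
    if tiling_density < 1:
        raise ValueError("tiling_density must be >= 1")
    if len(target) < bait_length:
        return []  # target too short to hold a single full-length bait

    step = max(1, round(bait_length / tiling_density))
    last_start = len(target) - bait_length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # add a final bait so the target end is covered
    return [(s, target[s:s + bait_length]) for s in starts]


# Hypothetical 360 bp target tiled with 120-mer baits at 1x and 2x density.
toy_target = "ACGT" * 90
for density in (1.0, 2.0):
    baits = tile_baits(toy_target, bait_length=120, tiling_density=density)
    print(f"{density}x tiling: {len(baits)} baits, starts = {[s for s, _ in baits]}")
```

Because the number of baits scales with both target length and tiling density, denser tiling or shorter baits increase the number of oligonucleotides that must be synthesized for the same set of targets.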
Bait targets and phylogenetic breadth of bait conservation
Although the use of 454 sequencing and hybridization on microarrays has faded into sequence capture history, the legacy of targeting exons remains with the field. Sequence capture of exons provides researchers a variety of well-known advantages (Jones & Good 2016), and exon enrichment is the focus of nine publications in this special issue. But exons are only one of many possible genomic targets, and sequence conservation analyses clearly demonstrate an inverse relationship between exon enrichment success and phylogenetic distance (McCormack et al. 2012; Bragg et al. 2016; Portik et al. 2016; Jones & Good 2016). The biological reality of exon molecular evolution, where the third position of codons is more likely to vary, distributes mismatches uniformly along the lengths of divergent target regions, making exons harder to capture among divergent taxa. Simply focusing on exonic sequence may also mislead certain types of inferences due to genome-wide convergence among coding sequence (Castoe et al. 2009; Jarvis et al. 2014).
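The effect of third-position variation on bait-target matching can be illustrated with a short Python sketch. It assumes the bait and target are already aligned without gaps and share a reading frame; the function names and the toy sequences are hypothetical and were chosen only to show synonymous third-position changes.

```python
def mismatch_by_codon_position(bait: str, target: str, frame: int = 0) -> dict[int, int]:
    """Count mismatches at codon positions 1, 2, and 3.

    `frame` is the index of the first base of the first complete codon
    (an assumed convention for this sketch).
    """
    if len(bait) != len(target):
        raise ValueError("sequences must be aligned and of equal length")
    counts = {1: 0, 2: 0, 3: 0}
    for i, (b, t) in enumerate(zip(bait.upper(), target.upper())):
        if b != t:
            counts[(i - frame) % 3 + 1] += 1
    return counts


def percent_identity(bait: str, target: str) -> float:
    """Percentage of aligned positions that match."""
    if len(bait) != len(target) or not bait:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(b == t for b, t in zip(bait.upper(), target.upper()))
    return 100.0 * matches / len(bait)


def longest_exact_run(bait: str, target: str) -> int:
    """Length of the longest uninterrupted stretch of matching bases."""
    best = run = 0
    for b, t in zip(bait.upper(), target.upper()):
        run = run + 1 if b == t else 0
        best = max(best, run)
    return best


# Toy example: every codon differs only at its third position
# (GCT/GCC, GAA/GAG, AAG/AAA are synonymous pairs).
bait = "ATGGCTGAAAAG"
target = "ATGGCCGAGAAA"
print(mismatch_by_codon_position(bait, target))
print(percent_identity(bait, target), longest_exact_run(bait, target))
```

In this toy case overall identity remains fairly high, yet the mismatches recur every few bases, so no long stretch of perfect complementarity remains. That pattern is the one the text describes for divergent exons.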
Ideally, many researchers would like to have bait sets that work across a wide range of species for a variety of purposes, so that orthologous sequence data can be collected at all levels of divergence from thousands of species. This desire has driven the development of bait sets that enrich sequence from hundreds or thousands of conserved genomic regions (Faircloth et al. 2012; Lemmon et al. 2012). Although it is logical to question the utility of enriching conserved loci when variable positions are needed, available evidence suggests that enriching and analyzing these loci and the sequence that surrounds them yields variable sequences among individuals at a variety of levels of divergence (Smith et al. 2014; Leaché et al. 2015; Manthey et al. 2016) that may be less biased than data obtained from exons (Jarvis et al. 2014). That said, the effects of purifying selection at these loci (Bejerano et al. 2004; Harvey et al. 2016; Jones & Good 2016) could introduce problems for certain types of analysis, not unlike those seen for exons.
Finally, there are questions where exons and conserved loci may be too conserved to be useful (Giarla & Esselstyn 2015) or where the phylogenetic breadth of the problem is too wide, requiring thousands or tens of thousands of variable sites at the species, population, and individual level. Here is where the third alternative of capturing and sequencing baits derived from variable, anonymous loci (Ali et al. 2016; Hoffberg et al. 2016) fills an important gap. First, these loci can be collected and sequenced when individual levels of resolution are needed and other marker types fail. Second, because sequence capture is such a flexible technique, it is straightforward to prepare cocktails of different bait sequences that target conserved regions, exons, and anonymous loci, providing the one-two-three punch of data collected at deep, moderate, and (very) shallow levels simultaneously. The techniques needed to optimize this latter approach deserve additional research.
Bait synthesis
Another avenue for future research involves the process of synthesizing and re-synthesizing target enrichment baits. Typically, most users order commercial baits from companies like Agilent, MYcroarray, and NimbleGen. These companies synthesize custom oligo pools with universal priming sites on each end, then use the universal primers to create RNA baits using in vitro transcription (Blumenstiel et al. 2010). The resulting pool of RNA baits is sold in limited quantity at relatively high cost. As an alternative to this full-service approach, individual researchers could obtain custom oligo pools from MYcroarray, CustomArray (www.customarrayinc.com), or others and create their own baits (Liu et al. 2016) in large quantities at low cost. When smaller numbers of baits are needed, it is possible to simply order biotinylated oligonucleotides from suppliers such as IDT (www.idtdna.com), Sigma-Aldrich (www.sigmaaldrich.com), and others and use them directly for enrichment. It is also possible to use short biotinylated primers to synthesize probes as part of the enrichment process (e.g., primer extension capture, PEC; Briggs et al. 2009) or to use biotinylated PCR products as baits (Maricic et al. 2010; Peñalba et al. 2014). Many additional possibilities await creative minds.
Library length
In addition to the length of baits, the length of fragments in the libraries being enriched plays a critical role in sequence capture experiments. It is well appreciated that as library insert length increases, the size of the contigs that can be assembled also increases (McCormack et al. 2015; Jones & Good 2016). However, it may not be as well appreciated that sequencing depth (see Figure 2a of Portik et al. 2016) and sequence length also play critical roles. Importantly, on Illumina sequencers, the real limitation has not been read length or depth of coverage, but instead the length of fragments that can be clonally amplified for successful cluster generation and sequencing (~800 bp). As Illumina develops new technologies for cluster generation and as researchers explore other sequencing approaches, such as PacBio or Oxford Nanopore, it will be important to determine the maximum size of fragments that can be captured and sequenced successfully. Long-read technologies will likely change the game with respect to how we apply sequence capture to a given question and how we use the resulting data.
Potpourri
There are many additional areas in which sequence capture techniques could and likely will be used in the future. Growth areas are sure to include microbiome characterization, environmental DNA assessments, multilocus capture-based barcoding, host-pathogen interactions, and pathogen discovery and diagnostics. To continue growth into these and other areas, it will be critical to better optimize enrichment, reduce the cost of baits and library preparations, and increase the availability of baits to user-groups. Experiments that investigate the effects of bait, blocker, and target composition and concentration will be critical to optimizing enrichment. We will need new approaches beyond dilution and sharing (see Heyduk et al. 2016) to reduce the cost of baits, especially when only modest numbers (dozens to hundreds) of biotinylated oligonucleotides are needed. Costs per sample also rely critically upon library preparation expenses (Meyer & Kircher 2010; Fisher et al. 2011; Head et al. 2014); thus, lower-cost, highly efficient library preparation techniques that allow multiplexing of hundreds to thousands of samples per sequencing run (e.g., Glenn et al. 2016) will be critical.
Back to the future
MPS and sequence capture are facilitating our ability to address Darwin’s questions of evolutionary bonds that connect the diversity of life by enabling efficient, genome-scale data collection across an enormous number of organisms. Sequence capture techniques let us work with sample types from species having enormous genomes and those rare or extinct species with only low-quality DNA sources. We can use the data collected to study evolutionary relationships ranging from those within families to those between species to those relating hosts and their pathogens. Sequence capture techniques also allow us to study patterns of heterozygosity, genome organization, and mutation where it was not possible before. We also wonder if the future of sequence capture will include a return to its humble roots, where a single low-cost mixture of biotinylated microsatellite repeats could serve as truly universal baits. Microsatellite baits would capture both highly variable repeats and less variable flanking DNA to help understand patterns of genetic variation within any eukaryote. Alternatively, as the history of microsatellite discovery has shown, sequence capture may eventually be subsumed by our ability to simply sequence everything. Regardless of the eventual outcome, we’re pretty sure Darwin would be captivated by what we can do now.
Acknowledgements
We thank the authors of each manuscript in this special issue for their contribution, A Moussalli for co-editing this special issue, and A Geraldes, A DeWoody, S Narum, and K Chambers for their assistance and patience in organizing and editing the special issue. Partial funding for our sequence capture work has been provided by DEB-1242260 (to BCF), DEB-1242241 (to TCG), and OISE 0730218 (to TCG) from the U.S. National Science Foundation and P20GM103395 from the U.S. National Institutes of Health. We thank and acknowledge our many collaborators, colleagues, and students who have generously assisted with our sequence capture experiments over the years. We have had a lot of fun and look forward to more in the future.
Footnotes
Conflicts of Interest
TCG declares competing interests. The EHS DNA lab provides oligonucleotide aliquots and services at cost, including some services referenced in this manuscript. BCF declares no competing interests.
References
- Albert T, Molla M, Muzny D, et al. Direct selection of human genomic loci by microarray hybridization. Nature Methods. 2007;4:903–905. doi: 10.1038/nmeth1111.
- Ali OA, O’Rourke SM, Amish SJ, et al. RAD Capture (Rapture): Flexible and Efficient Sequence-Based Genotyping. Genetics. 2016;202:389–400. doi: 10.1534/genetics.115.183665.
- Armour JA, Neumann R, Gobert S, Jeffreys AJ. Isolation of human simple repeat loci by hybridization selection. Human Molecular Genetics. 1994;3:599–605. doi: 10.1093/hmg/3.4.599.
- Baird N, Etter P, Atwood T, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS ONE. 2008;3:e3376. doi: 10.1371/journal.pone.0003376.
- Bejerano G, Pheasant M, Makunin I, et al. Ultraconserved elements in the human genome. Science. 2004;304:1321–1325. doi: 10.1126/science.1098119.
- Bentley DR, Balasubramanian S, Swerdlow HP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59. doi: 10.1038/nature07517.
- Bi K, Vanderpool D, Singhal S, et al. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales. BMC Genomics. 2012;13:403. doi: 10.1186/1471-2164-13-403.
- Blouin AG, Ross HA, Hobson-Peters J, et al. A new virus discovered by immunocapture of double-stranded RNA, a rapid method for virus enrichment in metagenomic studies. Molecular Ecology Resources. 2016;16:xxxx-xxxx, THIS ISSUE. doi: 10.1111/1755-0998.12525.
- Blumenstiel B, Cibulskis K, Fisher S, et al. Targeted exon sequencing by in-solution hybrid selection. Current Protocols in Human Genetics. 2010;Chapter 18:Unit 18.4. doi: 10.1002/0471142905.hg1804s66.
- Bragg JG, Potter S, Moritz C. Exon capture phylogenomics: efficacy across scales of divergence. Molecular Ecology Resources. 2016;16:xxxx-xxxx, THIS ISSUE. doi: 10.1111/1755-0998.12449.
- Brenig B, Brem G. Direct cloning of sequence tagged microsatellite sites by DNA affinity chromatography. Nucleic Acids Research. 1991;19:5441. doi: 10.1093/nar/19.19.5441.
- Briggs AW, Good JM, Green RE, et al. Primer extension capture: targeted sequence retrieval from heavily degraded DNA sources. Journal of Visualized Experiments: JoVE. 2009:1573. doi: 10.3791/1573.
- Campana MG, Hawkins MTR, Henson LH, et al. Simultaneous identification of host, ectoparasite and pathogen DNA via in-solution capture. Molecular Ecology Resources. 2016;16:xxxx-xxxx, THIS ISSUE. doi: 10.1111/1755-0998.12524.
- Castoe TA, de Koning APJ, Kim H-M, et al. Evidence for an ancient adaptive episode of convergent molecular evolution. Proceedings of the National Academy of Sciences of the United States of America. 2009;106:8986–8991. doi: 10.1073/pnas.0900233106.
- Castoe TA, Poole AW, de Koning APJ, et al. Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake. PLoS ONE. 2012;7:e30953. doi: 10.1371/journal.pone.0030953.
- Darwin CR. On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. John Murray; London: 1859.
- Dowle EJ, Pochon X, Banks JC, Wood SA. Targeted gene enrichment and high-throughput sequencing for environmental biomonitoring: a case study using freshwater macroinvertebrates. Molecular Ecology Resources. 2016;16:xxxx-xxxx, THIS ISSUE. doi: 10.1111/1755-0998.12488.
- Eddy SR. The C-value paradox, junk DNA and ENCODE. Current Biology: CB. 2012;22:R898. doi: 10.1016/j.cub.2012.10.002.
- Eid J, Fehr A, Gray J, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138. doi: 10.1126/science.1162986.
- Ellegren H. Microsatellites: simple sequences with complex evolution. Nature Reviews Genetics. 2004;5:435–445. doi: 10.1038/nrg1348.
- Elshire RJ, Glaubitz JC, Sun Q, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE. 2011;6:e19379. doi: 10.1371/journal.pone.0019379.
- Faircloth BC, McCormack JE, Crawford NG, et al. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology. 2012;61:717–726. doi: 10.1093/sysbio/sys004.
- Fisher S, Barry A, Abreu J, et al. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biology. 2011;12:R1. doi: 10.1186/gb-2011-12-1-r1.
- Franklin RE, Gosling RG. Molecular configuration in sodium thymonucleate. Nature. 1953;171:740–741. doi: 10.1038/171740a0.
- Nuttall GHF. The New Biological Test for Blood in Relation to Zoological Classification. Proceedings of the Royal Society of London. 1901;69:150–153.
- Giarla TC, Esselstyn JA. The challenges of resolving a rapid, recent radiation: empirical and simulated phylogenomics of Philippine shrews. Systematic Biology. 2015;64:727–740. doi: 10.1093/sysbio/syv029.
- Glenn TC. Field guide to next-generation DNA sequencers. Molecular Ecology Resources. 2011;11:759–769. doi: 10.1111/j.1755-0998.2011.03024.x.
- Glenn TC. Update to the field guide to next-generation DNA sequencers. 2016. molecologist.com.
- Glenn TC, Nilsen R, Kieran TJ, et al. Adapterama I: Universal stubs and primers for thousands of dual-indexed Illumina libraries (iTru & iNext) bioRxiv. 2016:049114. doi: 10.7717/peerj.7755. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Glenn T, Schable N. Isolating microsatellite DNA loci. Methods in Enzymology. 2005;395:202–222. doi: 10.1016/S0076-6879(05)95013-1.
- Gnirke A, Melnikov A, Maguire J, et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nature Biotechnology. 2009;27:182–189. doi: 10.1038/nbt.1523.
- Hamilton M, Pincus E, Fiore A, Fleischer R. Universal linker and ligation procedures for construction of genomic DNA libraries enriched for microsatellites. BioTechniques. 1999;27 doi: 10.2144/99273st03.
- Hardenbol P, Banér J, Jain M, et al. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nature Biotechnology. 2003;21:673–678. doi: 10.1038/nbt821.
- Harvey MG, Smith BT, Glenn TC, Faircloth BC, Brumfield RT. Sequence capture versus restriction site associated DNA sequencing for shallow systematics. Systematic Biology. 2016 doi: 10.1093/sysbio/syw036.
- Hawkins MTR, Hofman CA, Callicrate T, et al. In-solution hybridization for mammalian mitogenome enrichment: pros, cons and challenges associated with multiplexing degraded DNA. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12448. xxxx-xxxx, THIS ISSUE.
- Head SR, Komori HK, LaMere SA, et al. Library construction for next-generation sequencing: overviews and challenges. BioTechniques. 2014;56:61–68. doi: 10.2144/000114133.
- Heyduk K, Trapnell DW, Barrett CF, Leebens-Mack J. Phylogenomic analyses of species relationships in the genus Sabal (Arecaceae) using targeted sequence capture. Biological Journal of the Linnean Society. 2016;117:106–120.
- Hoffberg S, Kieran TJ, Catchen JM, et al. RADcap: Sequence Capture of Dual-digest RADseq Libraries with Identifiable Duplicates and Reduced Missing Data. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12566. xxxx-xxxx, THIS ISSUE.
- Jain M, Fiddes IT, Miga KH, et al. Improved data analysis for the MinION nanopore sequencer. Nature Methods. 2015;12:351–356. doi: 10.1038/nmeth.3290.
- Jarvis ED, Mirarab S, Aberer AJ, et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science. 2014;346:1320–1331. doi: 10.1126/science.1253451.
- Jones MR, Good JM. Targeted capture in evolutionary and ecological genomics. Molecular Ecology. 2016;25:185–202. doi: 10.1111/mec.13304.
- Kahvejian A, Quackenbush J, Thompson JF. What would you do if you could sequence everything? Nature Biotechnology. 2008;26:1125–1133. doi: 10.1038/nbt1494.
- Kandpal RP, Kandpal G, Weissman SM. Construction of libraries enriched for sequence repeats and jumping clones, and hybridization selection for region-specific markers. Proceedings of the National Academy of Sciences of the United States of America. 1994;91:88–92. doi: 10.1073/pnas.91.1.88.
- Karagyozov L, Kalcheva ID, Chapman VM. Construction of random small-insert genomic libraries highly enriched for simple sequence repeats. Nucleic Acids Research. 1993;21:3911–3912. doi: 10.1093/nar/21.16.3911.
- Kijas JM, Fowler JC, Garbett CA, Thomas MR. Enrichment of microsatellites from the citrus genome using biotinylated oligonucleotide sequences bound to streptavidin-coated magnetic particles. BioTechniques. 1994;16:656–660.
- Kocher TD, Thomas WK, Meyer A, et al. Dynamics of mitochondrial DNA evolution in animals: amplification and sequencing with conserved primers. Proceedings of the National Academy of Sciences of the United States of America. 1989;86:6196–6200. doi: 10.1073/pnas.86.16.6196.
- Koepfli KP, Paten B, Genome 10K Community of Scientists, O’Brien SJ. The genome 10K project: A way forward. Annual Review of Animal Biosciences. 2015;3:57–111. doi: 10.1146/annurev-animal-090414-014900.
- Kunkel TA. Rapid and efficient site-specific mutagenesis without phenotypic selection. Proceedings of the National Academy of Sciences of the United States of America. 1985;82:488–492. doi: 10.1073/pnas.82.2.488.
- Leaché AD, Chavez AS, Jones LN, et al. Phylogenomics of phrynosomatid lizards: conflicting signals from sequence capture versus restriction site associated DNA sequencing. Genome Biology and Evolution. 2015;7:706–719. doi: 10.1093/gbe/evv026.
- Lemmon AR, Emme SA, Lemmon EM. Anchored hybrid enrichment for massively high-throughput phylogenomics. Systematic Biology. 2012;61:727–744. doi: 10.1093/sysbio/sys049.
- Lim HC, Braun MJ. High-throughput SNP genotyping of historical and modern samples of five bird species via sequence capture of ultraconserved elements. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12568. xxxx-xxxx, THIS ISSUE.
- Mamanova L, Coffey AJ, Scott CE, et al. Target-enrichment strategies for next-generation sequencing. Nature Methods. 2010;7:111–118. doi: 10.1038/nmeth.1419.
- Manthey JD, Campillo LC, Burns KJ, Moyle RG. Comparison of target-capture and restriction-site associated DNA sequencing for phylogenomics: A test in cardinalid tanagers (Aves, Genus: Piranga). Systematic Biology. 2016;65:640–650. doi: 10.1093/sysbio/syw005.
- Margulies M, Egholm M, Altman WE, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–380. doi: 10.1038/nature03959.
- Maricic T, Whitten M, Pääbo S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE. 2010;5:e14004. doi: 10.1371/journal.pone.0014004.
- McCartney-Melstad E, Mount GG, Shaffer HB. Exon capture optimization in amphibians with large genomes. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12538. xxxx-xxxx, THIS ISSUE.
- McCormack JE, Faircloth BC, Crawford NG, et al. Ultraconserved elements are novel phylogenomic markers that resolve placental mammal phylogeny when combined with species tree analysis. Genome Research. 2012;22:746–754. doi: 10.1101/gr.125864.111.
- McCormack JE, Tsai WLE, Faircloth BC. Sequence capture of ultraconserved elements from bird museum specimens. Molecular Ecology Resources. 2015;16 doi: 10.1111/1755-0998.12466. xxxx-xxxx, THIS ISSUE.
- Meyer M, Kircher M. Illumina Sequencing Library Preparation for Highly Multiplexed Target Capture and Sequencing. Cold Spring Harbor Protocols. 2010;2010 doi: 10.1101/pdb.prot5448.
- Meyer M, Stenzel U, Myles S, Prüfer K, Hofreiter M. Targeted high-throughput sequencing of tagged nucleic acid samples. Nucleic Acids Research. 2007;35 doi: 10.1093/nar/gkm566.
- Mullis KB, Faloona FA. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods in Enzymology. 1987;155:335–350. doi: 10.1016/0076-6879(87)55023-6.
- Narum SR, Buerkle CA, Davey JW, Miller MR, Hohenlohe PA. Genotyping-by-sequencing in ecological and conservation genomics. Molecular Ecology. 2013;22:2841–2847. doi: 10.1111/mec.12350.
- Neale DB, Wegrzyn JL, Stevens KA, et al. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biology. 2014;15:R59. doi: 10.1186/gb-2014-15-3-r59.
- Ng SB, Turner EH, Robertson PD, et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009;461:272–276. doi: 10.1038/nature08250.
- Nystedt B, Street NR, Wetterbom A, et al. The Norway spruce genome sequence and conifer genome evolution. Nature. 2013;497:579–584. doi: 10.1038/nature12211.
- Okou DT, Steinberg KM, Middle C, et al. Microarray-based genomic selection for high-throughput resequencing. Nature Methods. 2007;4:907–909. doi: 10.1038/nmeth1109.
- Ostrander EA, Jong PM, Rine J, Duyk G. Construction of small-insert genomic DNA libraries highly enriched for microsatellite repeat sequences. Proceedings of the National Academy of Sciences of the United States of America. 1992;89:3419–3423. doi: 10.1073/pnas.89.8.3419.
- Peñalba JV, Smith LL, Tonione MA, et al. Sequence capture using PCR-generated probes: a cost-effective method of targeted high-throughput sequencing for nonmodel organisms. Molecular Ecology Resources. 2014;14:1000–1010. doi: 10.1111/1755-0998.12249.
- Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE. Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE. 2012;7:e37135. doi: 10.1371/journal.pone.0037135.
- Portik DM, Smith LL, Bi K. An evaluation of transcriptome-based exon capture for frog phylogenomics across multiple scales of divergence (Class: Amphibia, Order: Anura). Molecular Ecology Resources. 2016 doi: 10.1111/1755-0998.12541.
- Powell JH, Amish SJ, Haynes GD, Luikart G, Latch EK. Candidate adaptive genes associated with lineage divergence: Identifying SNPs via next-generation targeted re-sequencing in mule deer (Odocoileus hemionus). Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12572. xxxx-xxxx, THIS ISSUE.
- Quick J, Quinlan AR, Loman NJ. A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer. GigaScience. 2014;3:22. doi: 10.1186/2047-217X-3-22.
- Roffler GH, Amish SJ, Smith S, et al. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12560. xxxx-xxxx, THIS ISSUE.
- Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the United States of America. 1977;74:5463–5467. doi: 10.1073/pnas.74.12.5463.
- Schmickl R, Liston A, Zeisek V, et al. Phylogenetic marker development for target enrichment from transcriptome and genome skim data: the pipeline and its application in southern African Oxalis (Oxalidaceae). Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12487. xxxx-xxxx, THIS ISSUE.
- Smith BT, Harvey MG, Faircloth BC, Glenn TC, Brumfield RT. Target capture and massively parallel sequencing of ultraconserved elements (UCEs) for comparative studies at shallow evolutionary time scales. Systematic Biology. 2014;63:83–95. doi: 10.1093/sysbio/syt061.
- Suren H, Hodgins KA, Yeaman S, et al. Exome capture from the spruce and pine giga-genomes. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12570. xxxx-xxxx, THIS ISSUE.
- Tautz D. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Research. 1989;17:6463–6471. doi: 10.1093/nar/17.16.6463.
- Tautz D, Ellegren H, Weigel D. Next generation molecular ecology. Molecular Ecology. 2010;19(Suppl 1):1–3. doi: 10.1111/j.1365-294X.2009.04489.x.
- Teasdale LC, Köhler F, Murray KD, O’Hara T, Moussalli A. Identification and qualification of 500 nuclear, single-copy, orthologous genes for the Eupulmonata (Gastropoda) using transcriptome sequencing and exon-capture. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12552. xxxx-xxxx, THIS ISSUE.
- Tennessen JA, Bigham AW, O’Connor TD, et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science. 2012;337:64–69. doi: 10.1126/science.1219240.
- Tewhey R, Nakano M, Wang X, et al. Enrichment of sequencing targets from the human genome by solution hybridization. Genome Biology. 2009;10:R116. doi: 10.1186/gb-2009-10-10-r116.
- Thomas CA Jr. The genetic organization of chromosomes. Annual Review of Genetics. 1971;5:237–256. doi: 10.1146/annurev.ge.05.120171.001321.
- Watson JD, Crick FHC. Molecular structure of nucleic acids. Nature. 1953;171:737–738. doi: 10.1038/171737a0.
- Wilkins MHF, Stokes AR, Wilson HR. Molecular structure of deoxypentose nucleic acids. Nature. 1953;171:738–740. doi: 10.1038/171738a0.
- Yuan H, Jiang J, Jimenez FA, et al. Target gene enrichment in the cyclophyllidean cestodes, the most diverse group of tapeworms. Molecular Ecology Resources. 2016;16 doi: 10.1111/1755-0998.12532. xxxx-xxxx, THIS ISSUE.
- Zane L, Bargelloni L, Patarnello T. Strategies for microsatellite isolation: a review. Molecular Ecology. 2002;11:1–16. doi: 10.1046/j.0962-1083.2001.01418.x.