Skip to main content
Genome Research logoLink to Genome Research
. 2017 Jul;27(7):1230–1237. doi: 10.1101/gr.219675.116

Extending the spectrum of DNA sequences retrieved from ancient bones and teeth

Isabelle Glocke 1, Matthias Meyer 1
PMCID: PMC5495074  PMID: 28408382

Abstract

The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material.


Recent methodological advances in ancient DNA research have enabled the generation of genome-wide sequence data from hundreds of Holocene and Late Pleistocene biological remains at various levels of quality, including those of ancient humans (Rasmussen et al. 2010; Fu et al. 2013; Allentoft et al. 2015; Haak et al. 2015) and their extinct archaic relatives (Meyer et al. 2012; Prüfer et al. 2014). Nevertheless, successful retrieval of DNA sequences from the Middle Pleistocene, i.e., sequences older than ∼125,000 yr, has been reported in only a few studies. These include most prominently the genome sequence of a ∼700,000-yr-old horse from permafrost (Orlando et al. 2013), as well as sequences from cave bear and hominin remains from the ∼430,000-yr-old site of Sima de los Huesos in Northern Spain (Dabney et al. 2013; Meyer et al. 2014, 2016).

The difficulty of retrieving DNA sequences from very old material is not surprising, as DNA is known to degrade over time, resulting in fragmentation and chemical modifications of bases. The best characterized base damage in ancient DNA arises from hydrolytic deamination of cytosines to uracils, which occurs predominantly in single-stranded overhangs at the ends of DNA fragments and manifests as C to T substitutions in sequence alignments (Briggs et al. 2007; Brotherton et al. 2007). Fragmentation is thought to be driven mainly by depurination, i.e., the loss of guanines and adenines, which leaves chemically instable abasic sites that lead to hydrolysis of the DNA backbone via β-elimination (Lindahl 1993; Briggs et al. 2007). DNA fragmentation causes an excess of short molecules (Pääbo 1989; Glenn et al. 1999; Poinar et al. 2003), which can be described in many samples as an inverse exponential relationship between fragment length and abundance (Handt et al. 1994; Schwarz et al. 2009; Adler et al. 2011; Allentoft et al. 2012; Orlando et al. 2013). In extremely poorly preserved material, such as the Sima de los Huesos remains, almost no authentic ancient DNA fragments longer than 45 bp can be detected (Meyer et al. 2014), underlining the importance of recovering short DNA fragments from highly degraded material.

Techniques have been developed that minimize the loss of short DNA fragments during sample preparation for high-throughput sequencing. The first step in this process is DNA extraction. Lysis of bone or tooth powder is usually performed using an EDTA/proteinase K buffer (Krings et al. 1997; Rohland and Hofreiter 2007b), which degrades hydroxyapatite and collagen (the two major components of the bone or tooth matrix), releasing DNA from the sample. The DNA then needs to be purified from the lysis buffer reagents and substances that can inhibit downstream enzymatic reactions—for example humic and fulvic acids, tannins, porphyrin products, phenolic compounds, collagen type I, and Maillard products (Tuross 1994; Scholz and Pusch 1997; Poinar 1998; Kalmar et al. 2000). Several purification methods exist, including phenol/chloroform extraction followed by alcohol precipitation (Kurosaki et al. 1993; Hänni et al. 1995), concentration and desalting of DNA using centrifuge filtration columns with defined pore sizes (Hagelberg and Clegg 1991; Leonard et al. 2000), and the most commonly used method of binding DNA to silica. Silica-based DNA extraction has seen many different implementations, some based on silica suspensions (Rohland and Hofreiter 2007a; Allentoft et al. 2015), others on silica spin columns (Yang et al. 1998; Rohland et al. 2010; Dabney et al. 2013; Gamba et al. 2016), and yet others coupling it with additional DNA purification methods (Yang et al. 1998; Rasmussen et al. 2010). Only recently, however, have implementations been developed that allow efficient recovery of DNA fragments as short as 35 bp (Dabney et al. 2013; Allentoft et al. 2015). The second step of sample preparation, the preparation of DNA libraries, is also prone to losses of short molecules. However, it has been shown that their recovery is improved when using a single-stranded library preparation method, which, unlike double-stranded methods, omits size-selective purification steps (Meyer et al. 2012; Gansauge and Meyer 2013).

Despite these advances, sequence length distributions obtained with current sample preparation techniques deviate from the negative exponential relationship predicted by simple models of DNA decay (Allentoft et al. 2012) when examining molecules shorter than ∼35 bp. It is possible that many such molecules are lost in the sample preparation process; moreover, extremely short DNA fragments may not preserve well in ancient biological material. We thus set out to explore the lower size limits of DNA preservation in ancient bones and teeth. We describe the effects of DNA extraction on the size distribution of sequences obtained from high-throughput sequencing, patterns of DNA degradation, library yields, as well as the co-extraction of inhibitory substances. The results of this work have important implications for future attempts at recovering DNA sequences from extremely poorly preserved specimens.

Results

Recovering the shortest DNA fragments from ancient bones and teeth

Determining the true fragment size distribution in ancient biological material is a profound technical challenge as usually only small amounts of DNA can be isolated from such material. Furthermore, the reagents present in the lysate as well as macromolecules coreleased with the DNA preclude attempts at separating and visualizing ancient DNA fragments without prior purification, i.e., without introducing biases. Fragment size distributions inferred from high-throughput data (see Fig. 1 for an example) are similarly skewed by biases in both DNA extraction and library preparation. In the following, we explored these biases in detail, starting out with library preparation.

Figure 1.

Figure 1.

DNA fragment size distribution reconstructed from a hominin femur fragment from Sima de los Huesos. Sequences ≥25 bp from putatively endogenous ancient DNA fragments were isolated from published sequence alignments to the human reference genome (Meyer et al. 2016) by requiring a signal of deamination to be present, i.e., a terminal C to T substitution, and removing alignments with substitutions other than C to T. The first filter depletes human contamination, whereas the second reduces the impact of spurious alignments of microbial sequences. (A) Log-transformed fragment size distribution. Fragment sizes between 40 and 60 bp provide the best fit to an exponential model of DNA decay (dashed line). (B) The same fragment size distribution plotted on a linear scale to visualize more clearly the underrepresentation of short DNA fragments.

To obtain an equimolar mixture of short DNA fragments as substrate for library preparation, we digested pUC19 plasmid DNA with DpnI, a restriction enzyme that acts on a 4-bp recognition sequence. We then prepared single-stranded DNA libraries from 0.01, 0.1, and 1 pmol of digested plasmid. After amplification and sequencing of the libraries, we counted the number of full-length sequences representing each of the expected pUC19 fragments. Sequence coverage of the DNA fragments was relatively homogenous down to 17 bp, irrespective of the input amount used, demonstrating that single-stranded library preparation in principle allows for the recovery of extremely short DNA fragments (Fig. 2A).

Figure 2.

Figure 2.

Recovery of short DNA fragments in DNA extraction and library preparation. (A) Sequence representation of DNA fragments obtained after preparing single-stranded libraries from a restriction digestion of plasmid DNA. Restriction digestion is expected to create a pool of DNA fragments in equimolar concentration. (B) Recovery of double-stranded DNA from equal quantities of a size marker (L) with four different extraction methods (buffer exchange [BE] as well as silica-based methods A–C) as visualized on a 4% agarose gel. (C) Log-transformed size distribution of DNA fragments reconstructed by sequencing DNA isolated from six ancient bones and one tooth using buffer exchange. S1: cave bear bone (Gamsulzen cave), S2: cave bear bone (Sima de los Huesos), S3: cave bear bone (Vindija cave), S4: brown bear tooth (Denisova cave), S5: yak bone (Denisova cave), S6: bison bone (Yukon permafrost), S7: beluga whale bone (North Sea). Small quantities of a 40-nt control oligonucleotide were spiked into the DNA extracts prior to library preparation.

To determine whether the underrepresentation of extremely short fragments in ancient DNA sequence data is due to losses in DNA extraction, we next devised a minimal DNA extraction and desalting procedure (hereafter referred to as ‘buffer exchange’), in which we first digested the bone or tooth matrix with an EDTA/proteinase K lysis buffer and then concentrated the lysate and removed the EDTA using spin columns with 3-kDa molecular size filters. We then inactivated the proteinase K (or greatly reduced its activity) through incubation at 95°C, exploiting the fact that denatured DNA is a suitable substrate for single-stranded library preparation. Initial experiments with a DNA size marker showed that molecules as short as 10 bp are effectively retained by buffer exchange (Fig. 2B). We thus applied this procedure to six Holocene and Pleistocene bones and one tooth preserved under different environmental conditions. We then produced libraries using very small volumes of extract to minimize potential inhibitory effects from impurities retained during buffer exchange. The fragment size distributions obtained from sequencing show an inverse exponential correlation between fragment size and abundance down to ∼18 bp (Fig. 2C), matching closely the lower size limit of DNA recovery in single-stranded library preparation. We thus conclude that much shorter DNA fragments are preserved in ancient bones and teeth than were recovered with previous methods.

Optimizing the recovery of short DNA molecules in silica-based DNA extraction

Buffer exchange contains no DNA purification step and thus retains molecules with high molecular weight, e.g., humic acids (Tuross 1994), that can inhibit enzymes used in library preparation. The most commonly used purification method for ancient DNA is based on the binding of DNA to silica at low pH in the presence of high concentrations of salt. Even though many salts promote DNA binding to silica, guanidine salts are usually chosen for this purpose as they denature proteins and reduce the carry-over of inhibitors into DNA extracts compared to nonchaotropic salts (Rohland and Hofreiter 2007b). In the recent implementation of Dabney et al. (2013) (hereafter referred to as ‘method A’), efficient recovery of molecules as short as ∼35 bp could be achieved using a binding buffer containing 5 M guanidine hydrochloride and 40% isopropanol.

In an attempt to further reduce the size cutoff of silica-based DNA extraction, we carried out a series of experiments in which we tested the influence of various parameters on the recovery of short DNA fragments in DNA extraction using a size marker as a proxy (Supplemental Fig. S1; Supplemental Table S1). We found that short DNA fragments are more efficiently recovered when increasing the alcohol concentration in the binding buffer. Adversely, EDTA, one of the main components of the lysis buffer, interferes with the recovery of short DNA fragments. Based on these experiments, we devised a new extraction procedure (‘method B’), which uses a binding buffer composed of 2 M guanidine hydrochloride and 70% isopropanol for DNA binding and a higher ratio of binding to lysis buffers to reduce the EDTA concentration in the binding step. This method recovers DNA fragments as short as 20 bp (Fig. 2B). However, increasing the alcohol concentration to 70% required decreasing the guanidine concentration to 2 M, which could make method B more prone to copurification of inhibitory substances during DNA extraction. We therefore investigated a second approach (‘method C’) that was identical to method B in DNA binding but used the high-salt binding buffer of method A as an additional wash step. Even though this wash step shifted the recovery of short DNA fragments to ≥25 bp, recovery of short molecules was substantially better than the ≥35 bp achieved with method A (Fig. 2B).

Comparisons of DNA extraction methods using ancient DNA

Using the three silica-based methods described above, we generated further DNA extracts and libraries from the seven ancient specimens (Supplemental Table S2). Extraction was performed using aliquots of the same lysate of each specimen to allow direct comparison of the results (see Supplemental Fig. S2 for an overview of the experiment design). In addition, we implemented three quality control strategies to monitor potential inefficiencies in DNA extraction and library preparation (Fig. 3). First, to quantify the loss of DNA during extraction, small amounts of a 65-bp double-stranded DNA fragment were spiked into each lysate prior to DNA extraction and quantified by digital PCR before and after DNA extraction. Second, we converted four aliquots of each extract (using 1, 3, 9, and 27 µL of 100 µL total volume) into DNA libraries to assess whether input volumes and yield of library molecules are linearly correlated. Deviations from the expected linear input-output relationship indicate the presence of inhibitory substances, which are expected to more strongly affect libraries prepared from larger volumes of extract. Because this approach requires the generation of a large number of libraries, which is not feasible in routine work, we further devised a third quality control strategy where we spiked a 40-nt control oligonucleotide into the extract at low concentration and quantified its conversion into library molecules. In addition to these controls, we determined the overall yield of library molecules by qPCR and characterized the libraries by sequencing on Illumina's HiSeq platform.

Figure 3.

Figure 3.

Quality control strategies. (A) To determine the overall efficiency at which medium-sized DNA fragments are recovered in DNA extraction, a 65-bp double-stranded DNA fragment was added to each lysate prior to DNA extraction. The concentration of the fragment was then measured before and after DNA extraction using digital PCR. (B) The conversion rate of library preparation was determined by comparing library yields measured with qPCR obtained from 3, 9, or 27 µL of extract to those obtained from 1 µL of extract. (C) In an alternative approach, a 40-nt library control oligonucleotide was added to each aliquot of DNA extract used for library preparation and to the TT buffer used as input in library negative controls. The number of library molecules generated from the control oligonucleotide was determined using a probe-based qPCR assay specific to successfully converted oligonucleotides. Note that no selection is necessary on the P7 adapter sequence, as molecules without P7 adapters lack the biotin required for bead binding in library preparation. The efficiency of library preparation is determined by comparing the number of oligonucleotide library molecules generated in the sample libraries to those in extraction and library blanks.

The average recovery rates of the extraction spike-in were 81% for method A, 85% for method B, and 89% for method C (Supplemental Fig. S3; Supplemental Table S3). Interestingly, the recovery rate for buffer exchange was significantly lower (49% on average) compared to silica-based extraction (Mann-Whitney U test: p = 1.9 × 10−7), indicating that simple DNA concentration and desalting does not prevent losses of DNA. When using small volumes of extract (up to 3 µL), linear input-output relationships between extract volumes and library molecules were observed, indicating fully efficient conversion of extract into library (Supplemental Fig. S4). However, for larger input volumes, most notably 27 µL, we observed a substantial reduction in library preparation efficiency with all methods except method A. Similar results were obtained when calculating library preparation efficiency based on the conversion rate of the control oligonucleotide (Supplemental Fig. S5). This suggests that more inhibitory substances are carried over into the extract under the low salt and high alcohol binding conditions of methods B and C, and inhibition is not noticeably reduced by the additional wash step in method C. It remains unclear whether methods B and C reduce inhibition compared to buffer exchange, as smaller volumes of lysate were used as inputs for the latter method. While both measures of library preparation efficiency consistently detect inhibition in severe cases, i.e., where the yield of library molecules is reduced to less than half (Supplemental Fig. S6), smaller signals of inhibition are obscured by experimental noise (as indicated by efficiency estimates greater than 1). Despite these limitations, the fact that both measures are highly correlated suggests that the spike-in control represents an effective strategy for detecting inefficiencies in library preparation caused by inhibition.

The effect of DNA extraction on sequence characteristics

Based on library molecule counts and the distribution of full-length molecule sequences obtained from sequencing, we binned the number of DNA fragments recovered in the libraries by size (Fig. 4; see Supplemental Fig. S7 for a plot on logarithmic scale). Even though inhibition does not alter fragment size distributions in the libraries (Supplemental Fig. S8), it reduces the total yield of molecules, especially those carrying a base damage at their 3′ ends (Supplemental Fig. S9). We therefore focused this and subsequent analyses primarily on the sequences of libraries prepared from 3 µL DNA extract. Consistent with the initial experiments using a size marker, the recovery of very short DNA fragments from the ancient samples is most efficient with buffer exchange and least efficient with method A. Surprisingly, however, the loss of short molecules with method A extends well above 35 bp. According to a simple model of DNA fragmentation, the slope of the negative linear relationship between size and log-transformed molecule numbers provides a direct estimate of λ, the frequency of strand breaks in DNA (Deagle et al. 2006). We find that λ is substantially lower in the sequences obtained with method A than with the other methods; this also holds true when limiting the analysis to putatively endogenous sequences, i.e., those that align to the respective reference genome (Supplemental Fig. S10; Supplemental Table S3), implying that the DNA extracted with buffer exchange and methods B and C is more damaged.

Figure 4.

Figure 4.

Estimates of the number of molecules in each library prepared from 3 µL DNA extract binned by size. (A) Total number of molecules (note that peaks below 20 bp are due to artifacts from library preparation). (B) Number of ‘informative’ molecules, i.e., molecules producing sequences that can be aligned to the genome of a close relative. Numbers for the libraries prepared from buffer exchange extracts were multiplied by 2.5 to compensate for the smaller volume of lysate used. See Figure 2 legend for sample S1–S7 identities.

Among the noninhibited libraries, the sum of all nucleotides present in the library (‘total sequence content’) (Fig. 5) is highest with buffer exchange. However, as sequences shorter than 35 bp cannot always be reliably identified as endogenous to the organism under study with current analytical approaches (Meyer et al. 2016), we computed a second measure, ‘informative sequence content,’ which represents the sum of all nucleotides present in DNA fragments ≥35 bp whose sequences can be aligned to a respective reference genome (Fig. 5). By this measure, the performances of buffer exchange, method B, and method C are very similar, while yields are only about half with method A.

Figure 5.

Figure 5.

Sequence content of libraries obtained from extracts prepared with the four methods as inferred by qPCR and shotgun sequencing. (A) The sum of nucleotides in all library molecules relative to that of the best method. (B) The sum of ‘informative’ nucleotides in the libraries, i.e., the sum of nucleotides subsumed in library molecules ≥35 bp that produce sequence alignments, relative to that of the best method. Numbers for the libraries prepared from buffer exchange extracts were multiplied by 2.5 to compensate for the smaller volume of lysate used.

We next investigated whether the extraction method influences sequence characteristics other than size. We first found that aligned sequences from method A exhibit an increase in GC content toward shorter fragments, whereas the average GC content of sequences produced with the other methods is stable across a wide range of fragment sizes (Supplemental Fig. S11). Unexpectedly, DNA extraction also affected the frequency of C to T substitutions in the sequence alignments, which are nearly identical for all methods at the ends of the sequences but substantially higher (by a factor of 1.6, on average) in the interior of sequences from methods B, C, and buffer exchange (Supplemental Fig. S12). This observation hints at a better recovery of DNA fragments with single-strand breaks with these methods, as DNA strands opposing a nick or gap are expected to be more strongly affected by deamination due to the presence of single-stranded regions. DNA fragments carrying single-strand breaks may be prone to guanidine-induced DNA denaturation (Prevorovský and Puta 2003) and subsequent loss in the DNA purification step of method A, a hypothesis that is compatible with the lower recovery of DNA strands >35 bases and the smaller λ observed with method A.

Lastly, to determine whether the improved recovery of short and nicked molecules with methods B and C is limited to single-stranded library preparation, we prepared libraries from two of the samples using a double-stranded library preparation protocol (Meyer and Kircher 2010) and two input volumes of DNA extract (3 and 9 µL) (Supplemental Table S4). In agreement with previous observations (Bennett et al. 2014; Wales et al. 2015; Gansauge et al. 2017), the informative sequence content is substantially lower in the double-stranded than the single-stranded libraries (Supplemental Table S4; Supplemental Fig. S13). Moreover, we observed no substantial difference in informative sequence content among the double-stranded libraries prepared from the extracts of method A and methods B and C (Supplemental Fig. S14), indicating that short DNA fragments or molecules with nicks and gaps are not good substrates for double-stranded library preparation.

Discussion

In light of the current progress made in ancient DNA research, it is a tantalizing question whether DNA sequences from even older and more degraded material can be recovered in the future. As DNA inevitably degrades into shorter and shorter molecules over time, the possible recovery of sequences from such material relies on two conditions: First, more highly degraded DNA must be preserved in ancient skeletal remains than previously known; and second, this DNA must be made accessible by novel molecular techniques. By combining single-stranded library preparation with refined DNA extraction procedures, we have successfully demonstrated that a highly abundant and yet unexploited source of extremely short DNA fragments exists in ancient bones and teeth and that the previously described inverse exponential relationship between fragment size and abundance extends to fragments shorter than 20 bp in all ancient samples analyzed here. While many of these fragments may be directly preserved as double-stranded DNA, patterns of deamination suggest that at least some of them were part of longer double-stranded DNA fragments that carried single-strand breaks. Importantly, the methods described here do not only provide access to extremely degraded DNA fragments, they also increase by a factor of 2.5, on average, the yield of sequences from molecules longer than 35 bp, i.e., sequences that are sufficiently long to allow secure identification of endogenous DNA with current analytical approaches. This improvement immediately benefits work on precious samples or specimens that contain only small amounts of DNA, such as the Sima de los Huesos remains from which only a few million base pairs of sequence could be recovered to date (Meyer et al. 2016).

In line with previous studies (Kalmar et al. 2000; Rohland and Hofreiter 2007b), we found that inhibitors are not easily separated from DNA molecules during DNA extraction, especially when targeting extremely short DNA fragments. The introduction of an additional wash step in silica-based DNA extraction (method C) did not noticeably reduce the level of inhibition in the extracts. It remains unclear, in fact, whether inhibitory substances can be separated from the most highly degraded ancient DNA fragments by silica-based DNA extraction. Among the extraction methods presented, method B is the best choice for extracting DNA from highly degraded or precious samples as it improves the overall yield of molecules compared to previous methods and allows for evaluating sequences as short as 25 bp for patterns of damage-induced substitutions that are indicative of the presence of endogenous ancient DNA. The large fraction of molecules <25 bp obtained with buffer exchange consumes additional sequencing capacity and is unlikely to be informative in downstream analyses. The method of Dabney et al. (2013) (method A), which is more robust against inhibition, remains a viable option for material with moderate or good DNA preservation if larger samples can be taken to compensate for losses of molecules during DNA extraction.

We also show that inhibition can be monitored by spiking control DNA into the DNA extracts prior to library preparation. If it occurs, the problem can be alleviated by producing several libraries from smaller input volumes of DNA extract in subsequent experiments. Because inhibition and other sources of sporadic inefficiency (e.g., pipetting errors or saturation of reactions with excessive amounts of DNA) are sample-dependent, we recommend the spike-in strategy as a general means of quality control in library preparation. When applying a similar control strategy to silica-based DNA extraction, we found that recovery rates are much less variable compared to library preparation; in fact, at between 80 and 90%, they are consistently higher than reported in a previous study (Barta et al. 2014). Unlike in library preparation, we therefore consider spike-in controls unnecessary in DNA extraction.

As the molecular methods presented here greatly enhance the spectrum of DNA sequences that can be recovered from ancient biological remains, analytical strategies will have to be refined to harvest the full informational content residing in highly degraded DNA. Analyses of sequence data from the Sima de los Huesos remains have shown that, for material that is highly contaminated with microbial DNA, a confident distinction between endogenous and microbial sequences is difficult for sequences shorter than 35 bp using alignment parameters optimized for longer sequences (Meyer et al. 2016). As short sequences can now be generated in large numbers, more stringent alignment strategies should be developed—for example, by taking ancient DNA base damage into account, similar to the approach used in Figure 1. Furthermore, additional filtering strategies could be explored to suppress spurious alignments—for example, based on differences in GC content between endogenous sequences and microbial contamination. The work on the Sima de los Huesos remains have set an example for how the ability to isolate and sequence short DNA fragments can extend access to genetic data from nonpermafrost remains by hundreds of thousands of years. The methods described here, likewise, may provide the foundation for further expanding the temporal limits of ancient DNA research.

Methods

Preparation of bone/tooth powder lysate for DNA extraction

Lysate of bone/tooth powder was prepared from a set of six ancient bones and one tooth preserved under different environmental conditions (cave, permafrost, and underwater sites) (Supplemental Table S2). After removing a thin layer of surface, ∼200 mg of fine powder was obtained from each specimen using a dentistry drill set to the lowest speed. The powder was then dissolved by adding 5 mL lysis buffer (0.45 M EDTA, pH 8.0, 0.25 mg/mL proteinase K) and rotating the tubes for 16 h at 37°C. Residual powder was pelleted by centrifugation at 16,000g for 1 min and the supernatant (the lysate) transferred to a new tube. Aliquots of the lysate were then subjected to DNA extraction using the four different methods below.

DNA extraction

To monitor losses of DNA during extraction, a double-stranded 65-bp control DNA fragment was prepared by combining two oligonucleotides (CL200 and CL204) (Supplemental Table S5) in a 50-µL reaction containing 50 mM NaCl, 10 mM Tris-HCl (pH 8.0), 1 mM EDTA (pH 8.0), and 20 µM of each oligonucleotide. Hybridization was performed by incubation at 95°C for 10 sec, followed by a ramp to 14°C at a rate of 0.1°C/sec. The DNA fragment was then diluted to a concentration of 10 pM (corresponding to ∼6 × 106 molecules/µL) using TET buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, 0.05% Tween 20).

For ‘buffer exchange,’ Amicon Ultra-4 Centrifugal Filter Units with Ultracel-3 membranes (Millipore) were used to desalt and concentrate DNA. For this purpose, 200 µL of lysis buffer were supplemented with 1 µL control DNA fragment and 4 mL Tris buffer (10 mM Tris–HCl, pH 8.0), and transferred to the spin column. After centrifugation at 4000g for 90 min, the flow-through was discarded and the residual liquid above the membrane (45–59 µL) supplemented with 4 mL Tris buffer. After centrifugation for another 90 min, the retained liquid (39–49 µL) was collected and adjusted to a volume of 100 µL by adding Tris buffer and Tween 20 (final concentration 0.05%).

In addition, DNA extracts were prepared from 500 µL aliquots of the bone/tooth powder lysates using three silica-based methods. First, to match the previously published protocol of Dabney et al. (2013) (method A), we combined the lysate with 500 µL 0.45 M EDTA (pH 8.0) to adjust the volume of lysate to 1 mL. The lysate was then mixed with 1 µL control DNA fragment and 10.4 mL of binding buffer A (5 M guanidine hydrochloride, 40% vol/vol isopropanol, 115 mM sodium acetate, 0.05% Tween 20), and loaded onto a silica spin column pre-assembled with a volume extender (High Pure Viral Nucleic Acid Large Volume kit, Roche). These pre-assembled silica columns are more stable and convenient to use than the MinElute (Qiagen)/extender constructs described in the original implementation of the method and only marginally less efficient in recovering short molecules (Supplemental Fig. S1; Supplemental Table S1). After centrifugation for 4 min at 500g (1500 rpm in a centrifuge with a swing-bucket rotor), tubes were turned by 90° and centrifuged for an additional 2 min at the same speed. The flow-through was discarded and the extender removed. The silica membrane was then dry-spun for 1 min at 3400g (6000 rpm in a tabletop centrifuge) and washed twice with 750 µL PE buffer (Qiagen), which was spun through the column by centrifugation at 3400g for 30 sec, followed by a dry spin at 16,400g (13,200 rpm in a tabletop centrifuge) for 1 min. DNA was eluted by adding 100 µL TET buffer to the membrane, a 5-min incubation, and then spinning for 1 min at 16,400g. To maximize DNA recovery, elution was repeated by loading the eluate onto the membrane and repeating incubation and centrifugation. For the second method, we developed an extraction procedure (‘method B’) that differs from method A in that we combined 500 µL lysate and 1 µL control DNA fragment with 10 mL of binding buffer B (2 M guanidine hydrochloride, 70% vol/vol isopropanol, 0.05% Tween 20). No pH adjustment with sodium acetate is required for this buffer. The third method tested (method C) is identical to method B except that after DNA binding, silica columns were washed twice with 750 µL binding buffer A and spun at 3400g for 1 min before proceeding to the PE washes.

In addition to DNA extractions from bone/tooth powder lysates, 1 µg of a DNA size marker (GeneRuler Ultra Low Range DNA Ladder, Thermo Fisher Scientific) was purified using the four methods above as well as other binding buffers (Supplemental Fig. S1; Supplemental Table S1). DNA losses during extraction were determined by measuring the concentration of the control fragment CL200/204 before and after DNA extraction by digital PCR (QX200 system, Bio-Rad). Amplification was carried out according to the manufacturer's instructions using the QX200 ddPCR EvaGreen Supermix (Bio-Rad), primers CL201 and CL202 (200 nM each) (Supplemental Table S5), and 1 µL template.

Library preparation, quantification, and amplification

DNA libraries were prepared from 1-, 3-, 9-, and 27-µL aliquots of each DNA extract using a recently published single-stranded library preparation method (Gansauge et al. 2017) automated on a liquid handling system (Bravo NGS workstation B, Agilent Technologies). Aliquots of DNA extract prepared with buffer exchange were incubated at 95°C for 1 min to inactivate carry-over of proteinase K from the lysis buffer. One microliter of a 10 pM dilution of oligonucleotide CL304 in TET buffer (Supplemental Table S5) was added to each sample to measure the efficiency of library preparation. In addition, each library preparation experiment included negative controls (using TT buffer [10 mM Tris-HCl, pH 8.0, 0.05% Tween 20] instead of DNA extract) and positive controls (using 0.1 pmol of oligonucleotide CL104) (Gansauge and Meyer 2013). Total yields of library molecules were determined by qPCR (in single or replicate measurements) (Supplemental Table S3) using primers specific to the adapter sequences as described elsewhere (Gansauge and Meyer 2013). In addition, the yield of control library molecules was determined using the Maxima Probe qPCR Master Mix (Thermo Fisher Scientific) with 200 nM primer IS7 (Meyer and Kircher 2010), 200 nM primer CL107 (Gansauge and Meyer 2013), and 200 nM probe CL118 (Supplemental Table S5) using an annealing temperature of 60°C and otherwise following the supplier's instructions. In addition, aliquots of DNA extracts from two samples (Supplemental Table S2) were converted into double-stranded libraries using the protocol of Meyer and Kircher (2010). Quantification was performed as described above. Single-stranded libraries were amplified for 35 cycles using AccuPrime Pfx DNA polymerase (Thermo Fisher Scientific) under reaction conditions described elsewhere (Dabney and Meyer 2012) except that indices were introduced into both adapters (Kircher et al. 2012) and index primer concentration was increased to 1 µM. Double-stranded libraries were amplified using both AmpliTaq Gold DNA polymerase (Thermo Fisher Scientific) and AccuPrime Pfx DNA polymerase as described in Gansauge et al. (2017). Amplified libraries were purified using the MinElute PCR purification kit (Qiagen). Library pools for sequencing were created by combining equal volumes of the purified libraries. Heteroduplices that had formed in PCR plateaus were removed in a single-cycle-PCR using 500 ng of each library pool, primers IS5 and IS6 (Meyer and Kircher 2010), and otherwise the conditions above. Following purification, PCR products were quantified using a DNA-1000 chip on the Bioanalyzer 2100 (Agilent Technologies).

Further, to test the recovery of DNA fragments of different sizes in library preparation, pUC19 plasmid DNA (NEB) was digested with 40 units of DpnI (NEB) in a 100-µL reaction containing 1× CutSmart Buffer (NEB) and 0.5 µg plasmid for 15 min at 37°C to create an equimolar mixture of DNA fragments. The restriction enzyme was then inactivated by incubation at 80°C for 20 min. Fragmented DNA corresponding to 0.01, 0.1, and 1 pmol was then used as a substrate for single-stranded library preparation. Amplification and sequencing were performed as described above and below.

Sequencing and sequence analysis

Libraries were sequenced on Illumina's HiSeq 2500 and MiSeq platforms using recipes for 2×76-bp paired-end sequencing with two index reads (Kircher et al. 2012). Base calls for the HiSeq data were generated with FreeIbis (Renaud et al. 2013). Overlapping paired-end reads were merged into single sequences to reconstruct full-length molecules (Renaud et al. 2014). Perfect matches to one of the expected index combinations were required to assign sequences to the library of origin. Overlap-merged sequences ≥35 bp were aligned to an appropriate reference genome (ursMar0, bosTau6, turTru1.75, and hg19) using BWA (Li and Durbin 2009) with parameters adjusted to ancient DNA (Meyer et al. 2012). PCR duplicates were removed using bam-rmdup (https://bitbucket.org/ustenzel/biohazard-tools). Summary statistics were computed using custom Perl scripts. Due to the small number of aligned sequences obtained from sample 2 (between 179 and 243 sequences), this sample was omitted from all analyses involving aligned sequences.

The total sequence content of a library was calculated as follows: [number of overlap-merged sequences]/[number of raw sequences] × [qPCR molecule count] × [average length of all sequences]. The informative sequence content was calculated as follows: [number of aligned sequences ≥35 bp]/[number of raw sequences] × [qPCR molecule count] × [average length of aligned sequences ≥35 bp]. The frequency of DNA damage (λ) was computed from the slope of the linear regression between log-transformed molecule counts and molecule size (Deagle et al. 2006), taking into account only molecules between 51 and 70 bp to ensure linearity of the relationship. Overlap-merged sequences from the pUC19 libraries were assigned to the DNA fragments they originated from by requiring a perfect match in their first and last 9 bp to the sequences obtained from an in silico digestion of the circular pUC19 sequence.

Data access

The sequencing data from this study have been submitted to the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) under accession number PRJEB19470.

Supplementary Material

Supplemental Material

Acknowledgments

We thank Svante Pääbo, Petra Korlević, Marie-Theres Gansauge, Viviane Slon, Michael Dannemann, Steffi Grote, and especially Louisa Jauregui for helpful discussions and comments on the manuscript, Antje Weihmann and Barbara Höber for performing the sequencing runs, and Udo Stenzel for bioinformatic support. We also thank Gernot Rabeder, Nuria Garcia, Juan-Luis Arsuaga, Pavao Rudan, Christine Verna, Michael Shunkov, Grant Zazula, and Klaas Post for providing the samples. This work was supported by the Max-Planck-Förderstiftung (grant P.S.EVANLOMP).

Footnotes

[Supplemental material is available for this article.]

Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.219675.116.

References

  1. Adler CJ, Haak W, Donlon D, Cooper A. 2011. Survival and recovery of DNA from ancient teeth and bones. J Archaeol Sci 38: 956–964. [Google Scholar]
  2. Allentoft ME, Collins M, Harker D, Haile J, Oskam CL, Hale ML, Campos PF, Samaniego JA, Gilbert MTP, Willerslev E, et al. 2012. The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proc Biol Sci 279: 4724–4733. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Allentoft ME, Sikora M, Sjogren KG, Rasmussen S, Rasmussen M, Stenderup J, Damgaard PB, Schroeder H, Ahlstrom T, Vinner L, et al. 2015. Population genomics of Bronze Age Eurasia. Nature 522: 167–172. [DOI] [PubMed] [Google Scholar]
  4. Barta JL, Monroe C, Teisberg JE, Winters M, Flanigan K, Kemp BM. 2014. One of the key characteristics of ancient DNA, low copy number, may be a product of its extraction. J Archaeol Sci 46: 281–289. [Google Scholar]
  5. Bennett EA, Massilani D, Lizzo G, Daligault J, Geigl EM, Grange T. 2014. Library construction for ancient genomics: single strand or double strand? Biotechniques 56: 289–300. [DOI] [PubMed] [Google Scholar]
  6. Briggs AW, Stenzel U, Johnson PLF, Green RE, Kelso J, Prufer K, Meyer M, Krause J, Ronan MT, Lachmann M, et al. 2007. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci 104: 14616–14621. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Brotherton P, Endicott P, Sanchez JJ, Beaumont M, Barnett R, Austin J, Cooper A. 2007. Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of post mortem miscoding lesions. Nucleic Acids Res 35: 5717–5728. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Dabney J, Meyer M. 2012. Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. Biotechniques 52: 87–94. [DOI] [PubMed] [Google Scholar]
  9. Dabney J, Knapp M, Glocke I, Gansauge MT, Weihmann A, Nickel B, Valdiosera C, Garcia N, Paabo S, Arsuaga JL, et al. 2013. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc Natl Acad Sci 110: 15758–15763. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Deagle BE, Eveson JP, Jarman SN. 2006. Quantification of damage in DNA recovered from highly degraded samples – a case study on DNA in faeces. Front Zool 3: 11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Fu QM, Meyer M, Gao X, Stenzel U, Burbano HA, Kelso J, Paabo S. 2013. DNA analysis of an early modern human from Tianyuan Cave, China. Proc Natl Acad Sci 110: 2223–2227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Gamba C, Hanghoj K, Gaunitz C, Alfarhan AH, Alquraishi SA, Al-Rasheid KA, Bradley DG, Orlando L. 2016. Comparing the performance of three ancient DNA extraction methods for high-throughput sequencing. Mol Ecol Resour 16: 459–469. [DOI] [PubMed] [Google Scholar]
  13. Gansauge MT, Meyer M. 2013. Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA. Nat Protoc 8: 737–748. [DOI] [PubMed] [Google Scholar]
  14. Gansauge MT, Gerber T, Glocke I, Korlevic P, Lippik L, Nagel S, Riehl LM, Schmidt A, Meyer M. 2017. Single-stranded DNA library preparation from highly degraded DNA using T4 DNA ligase. Nucleic Acids Res 10.1093/nar/gkx033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Glenn TC, Stephan W, Braun MJ. 1999. Effects of a population bottleneck on whooping crane mitochondrial DNA variation. Conser Biol 13: 1097–1107. [Google Scholar]
  16. Haak W, Lazaridis I, Patterson N, Rohland N, Mallick S, Llamas B, Brandt G, Nordenfelt S, Harney E, Stewardson K, et al. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522: 207–211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Hagelberg E, Clegg JB. 1991. Isolation and characterization of DNA from archaeological bone. Proc Biol Sci 244: 45–50. [DOI] [PubMed] [Google Scholar]
  18. Handt O, Höss M, Krings M, Pääbo S. 1994. Ancient DNA: methodological challenges. Experientia 50: 524–529. [DOI] [PubMed] [Google Scholar]
  19. Hänni C, Brousseau T, Laudet V, Stehelin D. 1995. Isopropanol precipitation removes PCR inhibitors from ancient bone extracts. Nucleic Acids Res 23: 881–882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Kalmar T, Bachrati CZ, Marcsik A, Rasko I. 2000. A simple and efficient method for PCR amplifiable DNA extraction from ancient bones. Nucleic Acids Res 28: E67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Kircher M, Sawyer S, Meyer M. 2012. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res 40: e3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Krings M, Stone A, Schmitz RW, Krainitzki H, Stoneking M, Pääbo S. 1997. Neandertal DNA sequences and the origin of modern humans. Cell 90: 19–30. [DOI] [PubMed] [Google Scholar]
  23. Kurosaki K, Matsushita T, Ueda S. 1993. Individual DNA identification from ancient human remains. Am J Hum Genet 53: 638–643. [PMC free article] [PubMed] [Google Scholar]
  24. Leonard JA, Wayne RK, Cooper A. 2000. Population genetics of ice age brown bears. Proc Natl Acad Sci 97: 1651–1654. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Lindahl T. 1993. Instability and decay of the primary structure of DNA. Nature 362: 709–715. [DOI] [PubMed] [Google Scholar]
  27. Meyer M, Kircher M. 2010. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb Protoc 2010: pdb.prot5448. [DOI] [PubMed] [Google Scholar]
  28. Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prufer K, de Filippo C, et al. 2012. A high-coverage genome sequence from an archaic Denisovan individual. Science 338: 222–226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Meyer M, Fu Q, Aximu-Petri A, Glocke I, Nickel B, Arsuaga JL, Martinez I, Gracia A, de Castro JM, Carbonell E, et al. 2014. A mitochondrial genome sequence of a hominin from Sima de los Huesos. Nature 505: 403–406. [DOI] [PubMed] [Google Scholar]
  30. Meyer M, Arsuaga JL, de Filippo C, Nagel S, Aximu-Petri A, Nickel B, Martinez I, Gracia A, de Castro JM, Carbonell E, et al. 2016. Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins. Nature 531: 504–507. [DOI] [PubMed] [Google Scholar]
  31. Orlando L, Ginolhac A, Zhang GJ, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499: 74–78. [DOI] [PubMed] [Google Scholar]
  32. Pääbo S. 1989. Ancient DNA: extraction, characterization, molecular-cloning, and enzymatic amplification. Proc Natl Acad Sci 86: 1939–1943. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Poinar HN. 1998. Preservation of DNA in the fossil record. Acs Sym Ser 707: 132–146. [Google Scholar]
  34. Poinar H, Kuch M, McDonald G, Martin P, Pääbo S. 2003. Nuclear gene sequences from a Late Pleistocene sloth coprolite. Curr Biol 13: 1150–1152. [DOI] [PubMed] [Google Scholar]
  35. Prevorovský M, Puta F. 2003. A/T-rich inverted DNA repeats are destabilized by chaotrope-containing buffer during purification using silica gel membrane technology. Biotechniques 35: 698–702. [DOI] [PubMed] [Google Scholar]
  36. Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A, Renaud G, Sudmant PH, de Filippo C, et al. 2014. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505: 43–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Rasmussen M, Li YR, Lindgreen S, Pedersen JS, Albrechtsen A, Moltke I, Metspalu M, Metspalu E, Kivisild T, Gupta R, et al. 2010. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463: 757–762. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Renaud G, Kircher M, Stenzel U, Kelso J. 2013. freeIbis: an efficient basecaller with calibrated quality scores for Illumina sequencers. Bioinformatics 29: 1208–1209. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Renaud G, Stenzel U, Kelso J. 2014. leeHom: adaptor trimming and merging for Illumina sequencing reads. Nucleic Acids Res 42: e141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Rohland N, Hofreiter M. 2007a. Ancient DNA extraction from bones and teeth. Nat Protoc 2: 1756–1762. [DOI] [PubMed] [Google Scholar]
  41. Rohland N, Hofreiter M. 2007b. Comparison and optimization of ancient DNA extraction. Biotechniques 42: 343–352. [DOI] [PubMed] [Google Scholar]
  42. Rohland N, Siedel H, Hofreiter M. 2010. A rapid column-based ancient DNA extraction method for increased sample throughput. Mol Ecol Resour 10: 677–683. [DOI] [PubMed] [Google Scholar]
  43. Scholz M, Pusch C. 1997. An efficient isolation method for high-quality DNA from ancient bones. Technical Tips Online 2: 61–64. [Google Scholar]
  44. Schwarz C, Debruyne R, Kuch M, McNally E, Schwarcz H, Aubrey AD, Bada J, Poinar H. 2009. New insights from old bones: DNA preservation and degradation in permafrost preserved mammoth remains. Nucleic Acids Res 37: 3215–3229. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Tuross N. 1994. The biochemistry of ancient DNA in bone. Experientia 50: 530–535. [DOI] [PubMed] [Google Scholar]
  46. Wales N, Caroe C, Sandoval-Velasco M, Gamba C, Barnett R, Samaniego JA, Madrigal JR, Orlando L, Gilbert MT. 2015. New insights on single-stranded versus double-stranded DNA library preparation for ancient DNA. Biotechniques 59: 368–371. [DOI] [PubMed] [Google Scholar]
  47. Yang DY, Eng B, Waye JS, Dudar JC, Saunders SR. 1998. Technical note: improved DNA extraction from ancient bones using silica-based spin columns. Am J Phys Anthropol 105: 539–543. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplemental Material

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press

RESOURCES