ABSTRACT
Large‐scale DNA screening of palaeontological and archaeological collections remains a limiting and costly factor for ancient DNA studies. Several DNA extraction protocols are routinely used in ancient DNA laboratories and have even been automated on robotic platforms. Robots offer a solution for high‐throughput screening but the costs, as well as the necessity for trained technicians and engineers, can be prohibitive for some laboratories. Here, we present a high‐throughput alternative to robot‐based ancient DNA extraction using a 96‐column plate. When compared to routine single MinElute columns, we retrieved highly similar endogenous DNA contents, an important metric in ancient DNA screening. Mitogenomes with a coverage depth greater than 0.1× could be generated and allowed for taxonomic assignment. Average fragment lengths, DNA damage and library complexities differed significantly between methods, but these differences became nonsignificant after modification of our library purification protocol. Our high‐throughput extraction method allows generation of 96 extracts within approximately 4 hours of laboratory work while bringing the cost down by ~39% compared to using single columns. Additionally, we formally demonstrate that the addition of Tween‐20 during the elution step results in higher complexity libraries, thereby enabling higher genome coverage for the same sequencing effort.
Keywords: 96‐column plate, ancient DNA, DNA extraction, high‐throughput
1. Introduction
The field of ancient DNA (aDNA) continues to push the limit of DNA recovery further back in time, enabling more in‐depth studies of evolution. For example, retrieval of million‐year‐old mammoth DNA has provided an improved understanding of mammoth migration, speciation, hybridisation and adaptation to cold environments (van der Valk et al. 2021; Díez‐del‐Molino et al. 2023). Similarly, when studying human evolution, aDNA has shaped our knowledge of early modern humans (Homo sapiens) and their relationship to other archaic humans, such as Neanderthals and Denisovans, the latter of which were identified initially through aDNA analyses. The oldest human DNA comes from Sima de los Huesos in northern Spain (Meyer et al. 2016); however, it is Denisova Cave in Southern Siberia that has shed ample light on the evolutionary and genetic admixture history of the broader Homo lineage (Reich et al. 2010; Slon et al. 2018). This is where bone remains of Denisovans, Neanderthals and their hybrid offspring were discovered in the past 15 years, as well as their traces in sediments from the site (Slon et al. 2017; Zavala et al. 2021).
Bones excavated from caves are often highly fragmented and morphologically nonidentifiable. More than 15,000 such fragments have been screened from Denisova Cave to date using a palaeoproteomics method known as Zooarchaeology by Mass Spectrometry (ZooMS) (Buckley et al. 2009), which has enabled taxonomic assignment to the genus level for bones with good biomolecular preservation (Brown et al. 2021). Such bones enlarge the pool of material available for genomic analyses since they are already proven to preserve some biomolecules (collagen) and are binned into broad taxonomic groups (e.g., bovids, equids or carnivores) sharing similar collagen profiles. Ancient DNA screening of previously ZooMS‐analysed samples can be used to further resolve taxonomy from genus to species, and even population, level.
Research on the optimisation of protocols for maximising aDNA recovery from ancient bone has mostly revolved around human remains. This led to standardisation of the use of petrous bones, cochleas, molars and eventually middle ear ossicles as the top candidates for targeted aDNA analyses (Pinhasi et al. 2015; Sirak et al. 2020). However, the field has expanded dramatically in recent years and the scope of organisms targeted in palaeogenomic studies is broad (Brunson and Reich 2019). This means that the bone elements available are not always ideal candidates for DNA preservation, and fragmented bones are more abundant and easier to access for destructive sampling. For this reason, it is important to develop cost‐effective methods that allow screening of relatively large collections of bones to identify the best‐preserved specimens and materials.
Method development remains a central part of the aDNA field. Over the past decade, methodological developments both in the laboratory and in data processing and analysis now allow retrieval of more useful information from much older samples and very scarce data (Shapiro and Hofreiter 2014; Burrell, Disotell, and Bergey 2015; Meyer et al. 2016; Leonardi et al. 2017; Dalén et al. 2023). In the laboratory, these developments in part focus on the trade‐off between filtering out contamination while preserving short genuine endogenous aDNA sequences (Damgaard et al. 2015; Gamba et al. 2016; Korlević et al. 2015; Nieves‐Colón et al. 2018). However, current extraction methods (e.g., Dabney and Meyer 2019) are mainly based on low‐throughput single columns and are time‐consuming when applied to large numbers of samples. Optimisation of laboratory protocols for ancient DNA retrieval should therefore focus on several criteria, including not only effectiveness but also simplicity, as well as time‐ and cost‐efficiency of new approaches (Kapp, Green, and Shapiro 2021).
In this study, we develop and present an easy‐to‐set‐up 96‐column plate‐based extraction protocol which allows a large number of samples to be processed simultaneously. The new protocol is time‐efficient compared to single columns and represents an effective alternative to robot‐based extractions. To assess the method's accuracy, we compared the aDNA screening outcomes of subsampled lysates from the new 96‐column plate method and the single MinElute spin columns, using the DNA extraction approach of Dabney and Meyer (2019). The two methods will be referred to as high‐ and low‐throughput hereafter. DNA extracts generated using the two protocols were converted to Santa Cruz Reaction single‐strand libraries (Kapp, Green, and Shapiro 2021) with modifications described in Nguyen et al. (2023), a library preparation method that is fast, adapted to low‐input DNA and can be performed in 96‐well format. The new protocol was successfully tested on Holocene reindeer from Franz Josef Land (Russia), Pleistocene bovids from Denisova Cave (Siberia) and Late Pleistocene mammoth bones from Alaska and New Mexico (USA). This diverse set of samples, in terms of age, origin, organism and preservational environment, highlights our method's applicability to large‐scale sample screening for aDNA analysis.
2. Materials and Methods
2.1. Sample Information
Sixty‐two samples from three different megafaunal taxa were included in this work. Thirty‐one bone fragments attributed to the bovid genera Bos/Bison by previous ZooMS screening (Brown et al. 2021) are from Denisova Cave, Siberia, Russia. These were recovered from layers 12.1, 12.2, 14, 15 and 17.1 of the East Chamber, as well as layer 11 of the South Chamber, ranging in age from ~300 to 20 ka (thousand years ago) (Jacobs et al. 2019). In addition, we included 12 reindeer (Rangifer tarandus) bones and antler fragments from Franz Josef Land dating to the Holocene (Kellner et al. 2024; Hold et al. 2024) and 19 mammoth (Mammuthus sp.) bone fragments of Late Pleistocene age (18 from Alaska and 1 from New Mexico). Detailed information about each sample is available in Table S1. Negative controls were introduced at different stages of the laboratory workflow as per routine ancient DNA work.
2.2. Sampling and Bleach Pretreatment
All laboratory procedures were carried out in designated facilities for aDNA handling at the Centre for Palaeogenetics, Stockholm University, Sweden. A piece of each fragmented bone from Denisova Cave was removed, using either a cleaned circular saw fitted to a Dremel tool or cleaned wire‐cutting tools, and subsequently crushed to expose internal microfractures. Between 24 and 299 mg of the crushed bone was bleach pretreated in a < 0.5% sodium hypochlorite solution for approximately 4 min at room temperature (modified from Lord et al. 2022; Boessenkool et al. 2017). The sodium hypochlorite was subsequently washed off by rinsing the bone fragments three times in UltraPure DNase/RNase‐Free Distilled Water (Invitrogen). One sample (AG001) was not bleach pretreated because crushing left the bone overly pulverised. All mammoth and reindeer bone/antler fragments, regardless of their size, were drilled to obtain around 50 mg of bone powder and were not bleach pretreated. Input sample masses are available in Table S1.
2.3. Sample Lysis
Lysis and ensuing DNA extraction were based on a modified and downscaled version of a published protocol (Dabney and Meyer 2019). The bone fragments were mixed with ~1 mL of lysis buffer, composed of 900 μL EDTA (0.5 M, pH 8; final concentration: 0.45 M), 75 μL UltraPure DNase/RNase‐Free Distilled Water (Invitrogen), 0.5 μL Tween‐20 (final concentration: 0.05%) and 25 μL Proteinase K (10 μg/μL; final concentration: 0.25 μg/μL), and incubated under motion at 37°C. Tween‐20 was not added to mammoth samples. Incubation lasted from overnight to 72 h depending on the size of the bone fragments, with the goal of digesting as much material as possible. An additional 20 μL of Proteinase K was added to samples with visible bone pellets remaining after 48 h. Lysates were stored at −20°C until downstream DNA extraction.
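The final concentrations quoted for the lysis buffer follow from the usual dilution relation (final concentration = stock concentration × component volume / total volume). The short sketch below checks that arithmetic with the volumes given above; the assumption that Tween‐20 is added as a neat (100%) stock is ours and is not stated in the protocol.

```python
# Recipe values are taken from the lysis buffer description above.
# The Tween-20 stock strength (100%, i.e. neat) is an assumption.
LYSIS_BUFFER = [
    # (component, volume in uL, stock concentration, unit of the stock)
    ("EDTA", 900.0, 0.5, "M"),
    ("water", 75.0, None, None),
    ("Tween-20", 0.5, 100.0, "% v/v"),
    ("Proteinase K", 25.0, 10.0, "ug/uL"),
]


def final_concentrations(recipe):
    """Apply C_final = C_stock * V_component / V_total to every component."""
    total_ul = sum(volume for _, volume, _, _ in recipe)
    finals = {
        name: (stock * volume / total_ul, unit)
        for name, volume, stock, unit in recipe
        if stock is not None  # water has no concentration to report
    }
    return total_ul, finals


total_ul, finals = final_concentrations(LYSIS_BUFFER)
print(f"total volume: {total_ul:.1f} uL")
for name, (value, unit) in finals.items():
    print(f"{name}: {value:.3f} {unit}")
```

Rounded to two decimals, the computed values agree with the final concentrations stated in the recipe.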
2.4. Low‐Throughput DNA Extraction (Variant A in Figure 1)
FIGURE 1.

Summary of protocol variants tested (A, B, B2 and B3). Variants A and B represent low‐ (MinElute) and high‐throughput (QIAquick 96‐column plate) methods, respectively.
Lysates were thawed and centrifuged at 6000 g for 3 min to pellet any remaining undigested material. Binding buffer was prepared in a 35 mL stock as follows. We mixed 16.72 g of guanidine hydrochloride (GuHCl) with UltraPure DNase/RNase‐Free Distilled Water (Invitrogen) to generate 21 mL of concentrated GuHCl solution. We next added isopropanol to bring the total solution volume to 35 mL (final concentration: 5 M GuHCl, 40% v/v isopropanol) and added 17.5 μL Tween‐20. An aliquot of 100 μL of the lysate was combined with a mixture of 1 mL of binding buffer and 40 μL sodium acetate (3 M, pH 5.2) and thoroughly mixed by vortexing. The resulting lysis‐binding mixture was loaded onto a MinElute spin column (Qiagen) in two consecutive rounds of ~550 μL each. The spin column membrane then underwent two washes with 750 μL PE buffer (Qiagen), with a final dry spin at 13,000 rpm for 1 min to remove excess PE buffer. The columns were subsequently left open to allow traces of wash buffer to evaporate for 5 min. Elution was performed by adding 50 μL of TET buffer (Tris–HCl 10 mM (pH 8.0), EDTA 1 mM (pH 8.0), 0.05% Tween‐20; final concentrations in ultrapure water) to the spin column membrane, incubating at room temperature for 1 to 5 min, followed by centrifugation at 13,000 rpm for 1 min.
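As a consistency check on the binding buffer recipe, the sketch below recomputes the stated final concentrations from the quantities given above. The molar mass of guanidine hydrochloride (95.53 g/mol) is the only value not taken from the text, and the small volume contributed by the Tween‐20 itself is ignored.

```python
GUHCL_MOLAR_MASS = 95.53  # g/mol for guanidine hydrochloride (CH5N3.HCl)


def binding_buffer_summary(guhcl_g=16.72, guhcl_solution_ml=21.0,
                           total_ml=35.0, tween_ul=17.5):
    """Return (GuHCl molarity, isopropanol % v/v, Tween-20 % v/v) for the stock."""
    moles = guhcl_g / GUHCL_MOLAR_MASS
    guhcl_molar = moles / (total_ml / 1000.0)
    # Isopropanol makes up the difference between the GuHCl solution and the total.
    isopropanol_percent = 100.0 * (total_ml - guhcl_solution_ml) / total_ml
    # Tween-20 volume is treated as negligible relative to the 35 mL total.
    tween_percent = 100.0 * (tween_ul / 1000.0) / total_ml
    return guhcl_molar, isopropanol_percent, tween_percent


guhcl_molar, isopropanol_percent, tween_percent = binding_buffer_summary()
print(f"GuHCl: {guhcl_molar:.2f} M")
print(f"isopropanol: {isopropanol_percent:.1f} % v/v")
print(f"Tween-20: {tween_percent:.3f} % v/v")
```

The same function can be called with other masses or volumes to scale the stock while keeping the target concentrations in view.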
2.5. High‐Throughput DNA Extraction (Variant B in Figure 1)
To allow the use of 96‐column plates, the low‐throughput extraction method described above was modified according to recommendations from the QIAquick 96 PCR Purification Kit (Qiagen) (see Protocol S1). For each sample, a 100 μL aliquot of lysate, 1 mL of binding buffer and 40 μL of sodium acetate were mixed in the same way as for the low‐throughput extraction. The mixture was also loaded onto a column within the 96‐column plate in two rounds. In each round, the centrifugation steps were replaced by a vacuum manifold system QIAvac 96 (Qiagen) coupled to a vacuum pump, which was used to bind DNA to the columns. Columns were next washed twice with 900 μL of PE buffer (Qiagen). The maximum vacuum (around −300 mbar) was then applied for 10 min, and the plate was carefully ventilated onto absorbent paper to eliminate ethanol residues from the PE wash buffer. Elution in 60 μL of EB buffer (Qiagen; Tris–HCl 10 mM (pH 8.5)) on a provided elution plate was performed by opening the vacuum for 5 min after 1‐ to 5‐min incubation at room temperature. Some additional replicates were eluted in 60 μL of TET buffer to allow for improved comparison between the low‐ and high‐throughput methods and allow direct comparison of buffers EB and TET (see Table S2) (variant B2 in Figure 1). Negative controls were included in several randomly selected positions per plate.
2.6. Library Preparation and Sequencing
The concentration of the DNA extracts was assessed with a Qubit Flex Fluorometer and 1× dsDNA HS assay on 5 μL of DNA extract. Following this, 10 μL of the extracts was converted into libraries following the single‐strand Santa Cruz Reaction protocol for low‐input DNA (Kapp, Green, and Shapiro 2021) with modifications from Nguyen et al. (2023) up to the cleaning step. Library purification was carried out using either a MinElute spin column or a 96‐column QIAquick plate in order to match the method used for DNA extraction.
Low‐throughput library purification (variant A) was based on a Meyer and Kircher (2010) clean‐up. Briefly, 125 μL PB buffer (Qiagen) was mixed with 25 μL of the library except for 13 samples, which received 100 μL of PB buffer with no observed difference in screening metrics (see Table S2). After the wash and dry spin steps, the columns were left open to dry for 5 min. Elution was performed in 22 μL EB buffer (Qiagen) after incubation at 37°C for 5 min. High‐throughput library purification (variant B) was adjusted to follow manufacturer's recommendations (see Protocol S1). Following the method used in high‐throughput extraction, 75 μL PM buffer (Qiagen) was combined with 25 μL of the library. Elution was performed using 45 μL of EB buffer (Qiagen). The replicates that were eluted in the TET buffer during high‐throughput extraction were eluted in 45 μL of TET following the library clean‐up (variant B2).
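The two clean‐up variants differ in binding buffer volume and in elution volume. A small sketch of the resulting ratios, using only the volumes given above (variable names are ours), makes the differences explicit:

```python
# Volumes in microlitres, taken from the library purification description above.
LOW_THROUGHPUT = {"binding_buffer_ul": 125.0, "library_ul": 25.0, "elution_ul": 22.0}   # variant A, PB buffer
HIGH_THROUGHPUT = {"binding_buffer_ul": 75.0, "library_ul": 25.0, "elution_ul": 45.0}   # variant B, PM buffer


def buffer_to_library_ratio(variant):
    """Volume of binding buffer per volume of library."""
    return variant["binding_buffer_ul"] / variant["library_ul"]


low_ratio = buffer_to_library_ratio(LOW_THROUGHPUT)
high_ratio = buffer_to_library_ratio(HIGH_THROUGHPUT)
elution_ratio = HIGH_THROUGHPUT["elution_ul"] / LOW_THROUGHPUT["elution_ul"]

print(f"variant A: {low_ratio:.0f}:1 buffer to library")
print(f"variant B: {high_ratio:.0f}:1 buffer to library")
print(f"elution volume, B relative to A: {elution_ratio:.2f}x")
```

The roughly twofold difference in elution volume is why absolute molecule counts from the two variants cannot be compared without a correction.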
Library quality was assessed using quantitative polymerase chain reaction (qPCR) following recommendations from the Santa Cruz Reaction protocol (Kapp, Green, and Shapiro 2021) with modifications described in Nguyen et al. (2023) with the exception of sometimes using 10 μM of each primer instead of 20 μM. Twenty‐two samples were tested with both primer concentrations (10 μM and 20 μM) with no observed differences in amplification.
Indexing PCR was carried out in a 50 μL volume with 5 U of AmpliTaq Gold DNA Polymerase, 1× Taq Gold Buffer, 2.5 mM MgCl2 (Applied Biosystems), 0.4 μM of each indexing primer, 0.25 mM dNTPs and 5 μL of library. The PCR was run with the following program: initial denaturation at 94°C for 12 min, cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s and elongation at 72°C for 45 s, with a final extension at 72°C for 10 min. The number of PCR cycles was library‐specific and determined using qPCR Ct values.
The efficacy of indexing PCR was visually assessed by running products on an agarose gel. The indexed libraries were pooled at roughly equal concentrations based on gel band intensity. Pools were cleaned and size‐selected with either AMPure XP (Beckman Coulter) or Sera‐Mag Select (Cytiva) beads using 0.5× and 1.8 to 1.5× ratios to remove long and short fragments, respectively. The purity and concentration of pools were then assessed using a TapeStation (Agilent) and high‐sensitivity D1000 DNA screentape. Pools were sequenced on the Illumina NovaSeqX and NovaSeq 6000 platforms at the National Genomics Infrastructure (Science for Life Laboratory, Stockholm, Sweden) using 2 × 150 cycle paired‐end chemistry.
2.7. Binding Buffer Ratio Modification Experiment (Variant B3 in Figure 1)
To directly compare the DNA extraction section of both methods (prior to library preparation), we selected five samples previously eluted in TET during both the low‐ and high‐throughput DNA extractions (variant B2). These were library‐prepared following the same procedure described above, except that the high‐throughput method libraries were purified using the same binding buffer and ratio as the low‐throughput method (125 μL PB).
2.8. Data Processing
Raw reads were downsampled using seqtk (v.1.2‐r101) to the lowest sequencing depth among samples (2.67 million reads or 2.38 million reads for the binding ratio modification experiment) in order to make library complexity comparable across all samples (Daley and Smith 2013). Following this, reads were processed using the GenErode pipeline (v.0.6.0) (Kutschera et al. 2022) up to BAM deduplication and indel realignment following default settings if not otherwise specified. A modification to the source code of the fastp step set the minimum overlap for paired‐end merging at 15 bp (--overlap_len_require) and turned on polyG tail removal (--trim_poly_g). Preprocessed fragments were aligned using BWA aln to three different reference genomes depending on sample information. Bovine libraries were aligned to a composite reference composed of the beefalo genome assembly (Bos taurus × Bison bison; GenBank GCA_018282365.1 beefalo, USDA ARS) and a Bison priscus mitogenome (GenBank KX269145.1). Reindeer samples were aligned to a composite genome assembly of Rangifer tarandus platyrhynchus (GenBank GCA_951394145.1 mRanTar1.hap2.1, EBP Norway) and a Rangifer tarandus mitogenome (GenBank NC_007703.1). Mammoth samples were aligned to a composite genome assembly of Elephas maximus indicus (GenBank GCA_024166365.1 mEleMax1, Vertebrate Genomes Project), Homo sapiens (HumanG1Kv37 reference) as a decoy (Feuerborn et al. 2020) and a Mammuthus primigenius mitogenome (GenBank DQ188829.2). Sequence data from negative controls were mapped against all possible reference genomes relevant to the sample batch associated with the negative control (see Table S3). Mitochondrial references KX269145.1 (Bison priscus), NC_007703.1 (Rangifer tarandus) and NC_007596.2 (Mammuthus primigenius) were supplemented to the optional mitogenome mapping step in order to check for cross‐contamination when two sample groups were processed together in the laboratory.
Optional DNA damage calculation and base quality rescaling with mapDamage were implemented.
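For illustration, the downsampling and read‐merging settings described above could be expressed as command lines along the following lines. This is a sketch under stated assumptions and not the GenErode implementation: the file names, the seed and the exact read count (2,670,000 as a rounded stand‐in for 2.67 million) are hypothetical, and the pipeline may pass additional options. Only the seqtk `sample -s` option and the fastp options `--merge`, `--merged_out`, `--overlap_len_require` and `--trim_poly_g` are used.

```python
def seqtk_sample_cmd(fastq, n_reads, seed=11):
    """Build a `seqtk sample` call; use the same seed for both mates to keep pairs in sync."""
    return ["seqtk", "sample", "-s", str(seed), fastq, str(n_reads)]


def fastp_merge_cmd(r1, r2, merged_out, min_overlap=15):
    """Build a fastp call that merges read pairs and trims polyG tails."""
    return [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--merge",
        "--merged_out", merged_out,
        "--overlap_len_require", str(min_overlap),
        "--trim_poly_g",
    ]


# Hypothetical file names, for illustration only.
for mate in ("sample_R1.fastq.gz", "sample_R2.fastq.gz"):
    print(" ".join(seqtk_sample_cmd(mate, 2_670_000, seed=42)))
print(" ".join(fastp_merge_cmd("sub_R1.fastq", "sub_R2.fastq", "merged.fastq")))
```

Building the commands as argument lists keeps them easy to inspect and to hand to `subprocess.run` without shell quoting issues.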
2.9. Data Analysis
We assessed key metrics of sample quality (endogenous DNA content, library complexity, fragment lengths, cytosine deamination rates and mitogenome coverage depth) between the low‐ and high‐throughput‐prepared extracts. Endogenous DNA content was calculated as the number of mapped fragments (before duplicate removal) divided by the number of fragments that had passed the fastp preprocessing step. Library complexity was calculated as the number of unique fragments postdeduplication divided by the number of mapped fragments. To account for the 1:2 ratio in elution volume between methods during the library purification, absolute estimates of unique molecules were corrected. This was done by running Preseq (v.3.2) on sorted BAMs with options lc_extrap ‐v to estimate yields. The plateau value was estimated using a custom python script, and the high‐throughput extraction values were multiplied by two for correction. Thirteen samples were not included because the minimum required duplicate count from Preseq was not met (see Table S2 column ‘corrected preseq estimate’). Minimum, maximum and average fragment lengths were calculated from the deduplicated and indel‐realigned BAM files, after removal of alignments with mapping quality lower than 20, using a custom awk script. DNA damage was measured as the 5' C‐to‐T substitution rate at the first base position from the deduplicated and indel‐realigned BAM files and is reported as a percentage. For samples with two technical replicates per method (see Table S2), the average of the aforementioned metrics was calculated and is displayed in the results. Mitogenome coverage depth was computed with samtools (v.1.19) (depth ‐a ‐q 30 ‐Q 20) on the deduplicated and indel‐realigned BAM files and is displayed in Table S2 (if mitochondrial fragments were present). The PhyloP program from PHAST was used to compute p‐values of base conservation/acceleration between our mitochondrial references (Pollard et al. 2010). 
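The metric definitions above reduce to a few ratios. The sketch below restates them as functions; all function and argument names are ours, the counts in the usage lines are made‐up illustrative numbers, and the depth parser assumes the three‐column, tab‐separated output of samtools depth (reference name, position, depth).

```python
def endogenous_content(mapped_before_dedup, passed_fastp):
    """Mapped fragments (before duplicate removal) as a percentage of fastp-passed fragments."""
    return 100.0 * mapped_before_dedup / passed_fastp


def library_complexity(unique_after_dedup, mapped_before_dedup):
    """Unique fragments after deduplication as a percentage of mapped fragments."""
    return 100.0 * unique_after_dedup / mapped_before_dedup


def corrected_unique_molecules(preseq_plateau, high_throughput):
    """Double the high-throughput estimate to offset its roughly twofold elution volume."""
    return preseq_plateau * 2 if high_throughput else preseq_plateau


def mean_depth(depth_lines):
    """Mean depth over all positions reported by `samtools depth -a`."""
    depths = [int(line.split("\t")[2]) for line in depth_lines if line.strip()]
    return sum(depths) / len(depths) if depths else 0.0


# Illustrative numbers only, not data from this study.
print(endogenous_content(5_500, 1_000_000))       # percent endogenous
print(library_complexity(4_400, 5_500))           # percent unique
print(corrected_unique_molecules(10_000, True))   # plateau after correction
print(mean_depth(["chrM\t1\t0", "chrM\t2\t1", "chrM\t3\t2", "chrM\t4\t1"]))
```

Because `-a` reports zero‐depth positions as well, the mean is taken over the full reference length rather than only over covered sites.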
We investigated mapping bias and length distribution with the AMBER program (Dolenz et al. 2024) on deduplicated and indel‐realigned BAM files with a generated MD tag (samtools calmd).
2.10. Statistical Analyses
All datasets displayed in the results were tested for normality using a Shapiro–Wilk test. The nonparametric Wilcoxon signed‐rank test was used for statistical analysis of datasets found to be significantly nonnormally distributed, whereas the parametric t‐test was used for the normally distributed datasets. All Wilcoxon and t‐tests were paired based on the extraction method. For correlation, the Pearson test was applied to normally distributed variables and the Spearman test was applied to those that were nonnormally distributed. We used an alpha threshold of 0.05 for significance testing in all analyses.
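To make the paired test statistics concrete, the following standard‐library sketch computes the paired t statistic (with its degrees of freedom) and the Wilcoxon signed‐rank statistic V. It assumes V is defined in the common way, as the sum of the ranks of the positive paired differences with zero differences dropped and ties given average ranks; p‐values and the Shapiro–Wilk test are left to a statistics package. The input values are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Return (t, df) for a paired t-test on the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def wilcoxon_v(x, y):
    """Sum of ranks of positive differences (zeros dropped, ties share average ranks)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(rank for rank, diff in zip(ranks, diffs) if diff > 0)


# Illustrative paired measurements, not data from this study.
low = [5.0, 7.0, 3.0, 9.0]
high = [4.0, 9.0, 3.0, 5.0]
print(paired_t_statistic(low, high))
print(wilcoxon_v(low, high))
```

With n non‐zero pairs, V ranges from 0 to n(n + 1)/2, so a value near either extreme indicates that one method is consistently higher than the other.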
3. Results
The 62 samples ranged in endogenous DNA content from 0.02% to 83.60% with an average of 7.71%. Cytosine deamination‐induced DNA damage at the first base position was between 0% and 32.18% with an average of 11.13%. Library complexity was between 15.21% and 99.37% with an average of 81.43%. All average DNA fragment lengths were below 70 bp. Of note, the Holocene reindeer samples had DNA with higher endogenous content, longer fragments and lower damage, which is consistent with their younger age as compared to both the Pleistocene bovine and mammoth samples (Figures 2 and 3). Example misincorporation plots from all three sample groups are shown in Figure S1 to highlight the different damage patterns observed in these samples (see Table S1 for age estimates). Extended metrics per sample are shown in Table S2.
FIGURE 2.

Correlation plot of endogenous content between low‐ (MinElute) and high‐throughput (Plate) extraction methods (variants A and B). Values were log‐transformed to reduce skewness of the data, given that the median endogenous content is only 0.55% (see Figure S2). A linear regression and its equation as well as Spearman's coefficient of correlation and its p‐value are shown (S = 670.02). Confidence interval (95%) around the regression is shown in grey. Standard deviation is shown as bars for those data points with two technical replicates. Open circles are used for data points without technical replicates. Samples are colour coded by taxa. N = 62.
FIGURE 3.

Correlation plots of average fragment length (A) and DNA damage as percentage of 5' C‐to‐T substitution on the first base (%5pCtoT) (B) between low‐ (MinElute) and high‐throughput (Plate) extraction methods (variants A and B). A linear regression and its equation as well as Spearman's coefficient of correlation and p‐value are shown (S = 1912 and 3892.5 for average fragment length and DNA damage, respectively). Confidence interval (95%) around the regression is shown in grey. Standard deviation is shown as bars for those data points with two technical replicates. The dashed line stands as a 1:1 reference. Open circles are used for data points without technical replicates. Samples are colour coded by taxa. N = 62.
3.1. Endogenous Content
We found a strong linear correlation (Spearman's coefficient of correlation ρ = 0.98) in endogenous DNA content between the low‐ and high‐throughput extraction methods (variants A and B), across all three taxa (Figure 2). Moreover, the slope of the linear relationship was almost 1:1 and we found no significant difference in endogenous content recovered between the two methods (paired two‐tailed Wilcoxon test: V = 838, p‐value = 4.4e−1). This suggests that the high‐throughput extraction method yields endogenous DNA content values that are indistinguishable from the traditional low‐throughput method.
3.2. Average DNA Fragment Length and Cytosine Deamination Damage
We observed a linear correlation (Spearman's coefficient of correlation ρ = 0.95) in average fragment lengths recovered from both extraction methods (Figure 3A, variants A and B). The average fragment length recovered by the high‐throughput extraction method was significantly greater than for the low‐throughput extraction (paired one‐sided Wilcoxon test: V = 132, p‐value = 1.636e−9). The overall average fragment length for the high‐throughput extraction method was 46 bp compared to 43 bp for the low‐throughput method. For our measure of DNA damage, we obtained a good linear correlation (Spearman's coefficient of correlation ρ = 0.9) between both methods (Figure 3B). However, the high‐throughput method retained significantly less damage (paired one‐sided Wilcoxon test: V = 1444, p‐value = 5.299e−4). The average DNA damage for the high‐throughput method was 11% compared to 13% for the low‐throughput extraction.
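The S statistics given in the captions of Figures 2 and 3 can be related to the reported Spearman coefficients. Assuming S is the sum of squared rank differences, as commonly reported by statistical software, the coefficient is 1 − 6S / (n(n² − 1)). The sketch below applies this to the three S values with n = 62; the function name is ours.

```python
def spearman_rho_from_s(s, n):
    """Spearman's coefficient from S, the sum of squared rank differences, for n pairs."""
    return 1.0 - 6.0 * s / (n * (n ** 2 - 1))


N_SAMPLES = 62
# S values as printed in the captions of Figures 2 and 3.
S_VALUES = {
    "endogenous content": 670.02,
    "average fragment length": 1912.0,
    "DNA damage": 3892.5,
}

for metric, s in S_VALUES.items():
    print(f"{metric}: rho = {spearman_rho_from_s(s, N_SAMPLES):.2f}")
```

The recomputed coefficients round to 0.98, 0.95 and 0.90, in line with the values quoted in the text.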
However, when we used the same extract elution buffer (TET) and the modified binding buffer ratio for library prep purification, so that only the extraction step was performed independently (see Table S4, variant B3), we found no significant difference between the low‐ and high‐throughput methods for measures of both average fragment length and DNA damage (see Figure S3) (average fragment length: paired two‐tailed t‐test: t = −2.04 with [−1.64; 0.25] 95% confidence interval, df = 4, p‐value = 1.1e−1; DNA damage: t = −1.65 with [−4.61; 1.18] 95% confidence interval, df = 4, p‐value = 1.7e−1).
3.3. Library Complexity
We initially found a significantly lower estimated number of unique molecules in the libraries prepared using the high‐throughput method (Figure 4, variants A and B) (paired one‐sided Wilcoxon test: V = 1125, p‐value = 8.914e−9). However, as with average fragment length and DNA damage, library complexity was rescued by controlling for both extract elution and library prep purification binding buffers between the low‐ and high‐throughput methods (see Table S4, variant B3). Here, we could directly compare library complexity and found no significant difference between the low‐ and high‐throughput methods when the extraction step was the only variable (see Figure S5) (paired two‐tailed t‐test: t = −0.43 with [−0.86; 0.63] 95% confidence interval, df = 4, p‐value = 6.9e−1).
FIGURE 4.

Boxplots of Preseq absolute estimate of unique endogenous DNA molecules at plateau between low‐ (MinElute) and high‐throughput (Plate) extraction methods (variants A and B). p‐value for the Wilcoxon one‐sided paired test is displayed. Median and interquartile range (IQR) are displayed as a box with +/− 1.5 × IQR whiskers. The y‐axis is displayed in full with outliers in Figure S4. N = 49.
3.4. Elution Buffer Comparison
We tested whether the complexity in sequencing was impacted by using TET for elution instead of EB since it is common practice in aDNA laboratories to add Tween‐20 to the elution buffer in order to limit DNA adsorption to the plastic storage tube. Two subsamples of the same lysate were eluted with either TET or EB during high‐throughput extraction and library prep clean‐up for seven samples (marked TET in Table S2, variant B2). We found a significantly higher complexity in libraries eluted with TET compared to EB (Figure 5) (paired one‐tailed t‐test: t = −3.96 with [−∞; −2.40] 95% confidence interval, df = 6, p‐value = 3.7e−3). A significantly higher endogenous DNA content was also retrieved in libraries eluted with TET buffer (paired one‐tailed t‐test: t = 2.63 with [0.020; ∞] 95% confidence interval, df = 6, p‐value = 2.0e−2). Taken together, these results confirm that TET buffer limits DNA loss and results in more complex libraries, and its use is therefore strongly recommended.
FIGURE 5.

Boxplots of complexity in sequencing (%) between samples eluted with either EB or TET buffer during high‐throughput extraction and clean‐up (variant B2). p‐value for the one‐sided paired t‐test is displayed. Median and interquartile range (IQR) are displayed as a box with +/− 1.5 × IQR whiskers. N = 7.
3.5. Mitochondrial Coverage Depth
Given a sequencing effort of 2.67 M paired‐end reads per sample, we were able to recover mitogenomes over 0.1×, which could be used for taxonomic assignment (Table S2). However, mitochondrial coverage depth recovered from the low‐throughput method was significantly higher (paired one‐sided Wilcoxon test: V = 579, p‐value = 1.118e−2), likely due to the higher library complexity retained using this method. The average mitochondrial coverage depth from the high‐throughput extractions was 0.17× compared to 0.26× for the low‐throughput extractions, when using only samples that had any mapped mitochondrial reads across both methods.
3.6. Cross‐Contamination
We monitored sample cross‐contamination on the high‐throughput plate method by additionally mapping bovine libraries to a mammoth mitogenome, and vice versa, for the samples in sequencing batch P30254, which were processed together on a single plate. The same was done for bovine and reindeer libraries in batch P29464, although each group of samples was extracted on a separate plate. The count of fragments mapped on the second mitogenome is reported in column contam_mapped of Table S2. The highest number of mapped fragments occurred in samples with the highest endogenous content, which we attribute to mapping of regions conserved between these mammalian genomes. We found that 71.9% of bases were conserved between our mammoth and bovine mitochondrial references, using phyloP. Consistent with this interpretation, the same pattern was observed in both low‐ and high‐throughput‐extracted samples, and in both sequencing batches, even though only batch P30254 had bovine and mammoth samples processed together in neighbouring plate columns. The average count of so‐called contaminant fragments mapped was 8 across all samples, with a maximum of 88 fragments observed for a bovine sample in batch P29464 (AG050P2), which was not extracted on a plate together with reindeer samples. Layouts of sample positions on the plates are shown in Table S5.
All lysis, extraction and library preparation controls were sequenced and contained at most 1472 uniquely mapped fragments (see Table S3). This highest number of contaminant fragments was found in the high‐throughput extraction control PB3 of batch P30254. These reads result from mapping against the mammoth reference, which also includes the human genome, and 67% of these fragments were mapped to human chromosomes. However, 505 reads remained aligned to mammoth parts of the reference after deduplication and filtering for a mapping quality of 20. We investigated deamination patterns in several negative controls with elevated 5' C‐to‐T substitution on the first base, including PB3, and found no pattern of deamination at the end of reads, characteristic of ancient DNA (see Figure S6). Additionally, an elevated mismatch rate of short reads from PB3 suggests spurious mapping of nonendogenous short reads (see Figure S6). In other words, PB3 does not appear to contain any mammoth DNA. Thus, it appears that the detected contamination signal derives from nonmammoth reads that spuriously mapped to our reference genome. We caution that appropriate use and sequencing of negative controls are recommended to monitor for cross‐contamination when using the high‐throughput method.
4. Discussion
The 96‐column QIAquick, or high‐throughput variant B, extraction protocol presented here was developed as a scaled‐up and rapid tool for screening subfossil biological material in large scientific collections for endogenous DNA content and providing a taxonomic assignment. The method yielded endogenous DNA contents that are positively correlated with, and statistically indistinguishable from, those of a routine low‐throughput variant A single MinElute extraction method (Figure 2). However, the library complexity, in terms of total uniquely mappable molecules, of high‐throughput‐extracted libraries was significantly lower than the low‐throughput values (Figure 4). We also found that the average fragment length recovered by the high‐throughput method was significantly greater than that of the low‐throughput method (Figure 3A), whereas cytosine deamination rates were significantly higher in fragments recovered from the low‐throughput method than from the high‐throughput method (Figure 3B), suggesting that shorter fragments tend to retain more DNA damage. Although no significant cross‐contamination was detected in our samples extracted on the same plate, nor in the negative controls, it should be noted that reducing the lysate volume from 100 to 50 μL could be a strategy to further minimise the cross‐contamination risk.
While endogenous content and complexity retrieved from single MinElute and QIAquick spin columns were previously found to be similar, the average fragment length recovered was shown to differ significantly (Dehasque et al. 2022). Modification of the binding buffer ratio in the library clean‐up step of the high‐throughput protocol was found to remove the previously observed differences in recovered average fragment length, DNA damage and library complexity (Figures S3 and S5, variant B3). These variations can therefore be explained by the different binding buffers and their ratios (almost 2 PB:1 PM) used during clean‐up in the low‐ and high‐throughput methods. With modification of the protocol, the low‐ and high‐throughput methods recovered similar average fragment lengths, rates of DNA damage and library sequencing complexity (Figures S3 and S5). Yet, endogenous DNA content retrieved by the low‐throughput method remained significantly higher despite being linearly correlated with that of the high‐throughput method (Figure S7, paired one‐tailed t‐test: t = −3.86 with [−∞; −0.013] 95% confidence interval, df = 4, p‐value = 9.1e−3). We hypothesise that a larger sample size (here only N = 5) would remove the difference we observe in endogenous content and would strengthen the correlation in library sequencing complexity. In their study, Dehasque et al. (2022) also hypothesised that a larger sample size would uncover a significant increase in mitochondrial coverage depth from MinElute columns compared to QIAquick columns because of the higher retained average fragment length of the former. Our results indicate an opposite trend: although the high‐throughput method retained significantly longer fragments, mitochondrial coverage depth recovered from the low‐throughput method was significantly higher due to the higher library complexity of the latter method.
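The reported test statistic and p‐value can be cross‐checked with a short sketch that uses only the Python standard library. It assumes a Student's t distribution with df = N − 1 = 4 and obtains the lower‐tail probability by Simpson integration of the density. The function names are illustrative, and the per‐sample endogenous contents are not reproduced here, so `paired_t_statistic` is shown only to make explicit how t is derived from paired differences.

```python
import math


def student_t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def student_t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t), via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def paired_t_statistic(x, y):
    """t statistic and degrees of freedom for the paired differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1


# One-tailed (lower-tail) p-value for the values reported in the text.
p_lower = student_t_cdf(-3.86, 4)
print(f"one-tailed p = {p_lower:.4f}")
```

With t = −3.86 and df = 4, the lower‐tail probability evaluates to roughly 0.009, in line with the reported p‐value of 9.1e−3.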
Additionally, we found that using TET as an elution buffer generates significantly more complex libraries and retrieves more endogenous DNA content as compared to EB buffer (Figure 5, see Table S2, variant B2).
The high‐throughput plate extraction method developed here is a tool for rapid sample screening. For samples identified as having sufficient DNA preservation, it is therefore possible to subsequently extract the remaining 90% of lysate (900 μL in our protocol) in order to enable large‐scale palaeogenome reconstruction. It should, however, be noted that in some cases the high‐throughput method can be used directly for production sequencing. For example, our results suggest that it can generate sufficient mitochondrial coverage depths and data for molecular sexing, especially for medium‐ and good‐quality samples.
We coupled the high‐throughput extraction method with the Santa Cruz Reaction single‐strand library preparation protocol in order to accommodate potentially low‐input DNA from the downscaled extracts. However, this could be modified to use the different tiers of the Santa Cruz Reaction protocol that are adapted to various input DNA concentrations (Kapp, Green, and Shapiro 2021). Alternative library preparation protocols, such as the double‐stranded protocol of Meyer and Kircher (2010) or the BEST protocol (Carøe et al. 2018), could also be applied, potentially in conjunction with USER treatment to remove ancient DNA damage (Briggs et al. 2010).
Compared to low‐throughput single spin columns, the high‐throughput plate extraction coupled to a vacuum manifold is both faster and cheaper. Extraction cost per sample was reduced by ~39% with the high‐throughput method, in which 96 DNA extractions could be completed in approximately 4 hours of laboratory work. Because of the high number of samples processed in parallel during both high‐throughput DNA extraction and library preparation, we highly recommend multichannel pipetting to increase speed and to reduce the risk of sample mix‐ups and cross‐contamination. We also note that the high‐throughput method presented here can easily be robotised thanks to the already available QIAquick 96 PCR BioRobot Kit (Qiagen).
In conclusion, we present and showcase a fast and easy‐to‐set‐up extraction protocol for screening aDNA from large collections of samples. From two subsamples of the same lysate, our high‐throughput plate method recovered ancient DNA quality and quantity metrics comparable to those of traditional low‐throughput methods, especially after controlling for elution buffer and library purification binding buffer. We applied this high‐throughput extraction method to bone/antler samples spanning different geographic locations, time periods, species, materials and sampling strategies, and suggest that the method would be adaptable to more diverse samples as well as other ancient tissue types.
Author Contributions
L.D., A.G., E.L. and P.D.H. designed the study. A.G., E.L., P.D.H. and L.D. developed the original protocol. A.G., E.L., G.O.G. and G.X. performed laboratory work. K.D., M.J.W., T.R., M.D.M., M.L.M. and M.A. conducted fieldwork and provided samples. A.G. performed data analysis with contributions from E.L., G.O.G., P.D.H. and L.D. A.G. wrote the manuscript, with input from all other coauthors. All authors have read and agreed to the published version of the manuscript.
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Data S1.
Tables S1–S5.
Acknowledgements
L.D. acknowledges funding from the Swedish Research Council (2021‐00625), the European Union (ERC, PrimiGenomes, 101054984), and the Knut and Alice Wallenberg Foundation (KAW 2022.0033). P.D.H. acknowledges additional support from the Knut and Alice Wallenberg Foundation (KAW 2021.0048). M.D.M. was supported by grant 325589 from the Norwegian Research Council. M.J.W.'s involvement in the study was supported by a grant from the US National Science Foundation (OPP 2310505). Computation was enabled by resources in projects NAISS 2023/2‐9, 2023/5‐22 and 2024/5‐54, provided by the Swedish National Infrastructure for Computing (SNIC) at UPPMAX. We thank Børge Ousland for providing seven samples from his expedition to Franz Josef Land, on the return from the North Pole in 2007. We would like to thank the Denisova Cave team, Profs. M. Shunkov and A. Derevianko and Dr. Maxim Kozlikin, for allowing faunal material from the site to be included in this work. The identification of the Denisova Cave bones was funded by the ERC under the European Union's Horizon 2020 Research and Innovation Programme, grant agreement no. 715069 (FINDER) to K.D.
Handling Editor: Sebastien Calvignac‐Spencer
Funding: This study was supported by Norges Forskningsråd, 325589. Vetenskapsrådet, 2021‐00625. Knut och Alice Wallenbergs Stiftelse, KAW 2021.0048, KAW 2022.0033. European Research Council, 715069 (FINDER), PrimiGenomes, 101054984. Office of Polar Programs, OPP 2310505.
Contributor Information
Alexandre Gilardet, Email: alexandre.gilardet@su.se.
Peter D. Heintzman, Email: peter.d.heintzman@geo.su.se.
Love Dalén, Email: love.dalen@zoologi.su.se.
Data Availability Statement
Downsampled raw sequence data are publicly available on the European Nucleotide Archive (ENA BioProject PRJEB82618). The high‐throughput plate protocol is publicly available on protocols.io at https://doi.org/10.17504/protocols.io.36wgqn62ygk5/v1. The custom scripts used to estimate preseq plateau values (preseq_plateau.py) and calculate fragment lengths (calculate.awk) are available on GitHub at https://github.com/alexandregilardet/plate_aDNA_extraction.
References
- Boessenkool, S., Hanghøj K., Nistelberger H. M., et al. 2017. “Combining Bleach and Mild Predigestion Improves Ancient DNA Recovery From Bones.” Molecular Ecology Resources 17: 742–751. 10.1111/1755-0998.12623.
- Briggs, A. W., Stenzel U., Meyer M., Krause J., Kircher M., and Pääbo S. 2010. “Removal of Deaminated Cytosines and Detection of In Vivo Methylation in Ancient DNA.” Nucleic Acids Research 38: e87. 10.1093/nar/gkp1163.
- Brown, S., Wang N., Oertle A., et al. 2021. “Zooarchaeology Through the Lens of Collagen Fingerprinting at Denisova Cave.” Scientific Reports 11: 15457. 10.1038/s41598-021-94731-2.
- Brunson, K., and Reich D. 2019. “The Promise of Paleogenomics Beyond Our Own Species.” Trends in Genetics 35: 319–329. 10.1016/j.tig.2019.02.006.
- Buckley, M., Collins M., Thomas‐Oates J., and Wilson J. C. 2009. “Species Identification by Analysis of Bone Collagen Using Matrix‐Assisted Laser Desorption/Ionisation Time‐Of‐Flight Mass Spectrometry.” Rapid Communications in Mass Spectrometry 23: 3843–3854. 10.1002/rcm.4316.
- Burrell, A. S., Disotell T. R., and Bergey C. M. 2015. “The Use of Museum Specimens With High‐Throughput DNA Sequencers.” Journal of Human Evolution 1: 35–44. 10.1016/j.jhevol.2014.10.015.
- Carøe, C., Gopalakrishnan S., Vinner L., et al. 2018. “Single‐Tube Library Preparation for Degraded DNA.” Methods in Ecology and Evolution 9: 410–419. 10.1111/2041-210X.12871.
- Dabney, J., and Meyer M. 2019. “Extraction of Highly Degraded DNA From Ancient Bones and Teeth.” Methods in Molecular Biology 1963: 25–29. 10.1007/978-1-4939-9176-1_4.
- Dalén, L., Heintzman P. D., Kapp J. D., and Shapiro B. 2023. “Deep‐Time Paleogenomics and the Limits of DNA Survival.” Science 382: 48–53. 10.1126/science.adh7943.
- Daley, T., and Smith A. D. 2013. “Predicting the Molecular Complexity of Sequencing Libraries.” Nature Methods 10: 325–327. 10.1038/nmeth.2375.
- Damgaard, P. B., Margaryan A., Schroeder H., Orlando L., Willerslev E., and Allentoft M. E. 2015. “Improving Access to Endogenous DNA in Ancient Bones and Teeth.” Scientific Reports 5: 11184. 10.1038/srep11184.
- Dehasque, M., Pečnerová P., Kempe Lagerholm V., et al. 2022. “Development and Optimization of a Silica Column‐Based Extraction Protocol for Ancient DNA.” Genes (Basel) 13: 687. 10.3390/genes13040687.
- Díez‐del‐Molino, D., Dehasque M., Chacón‐Duque J. C., et al. 2023. “Genomics of Adaptive Evolution in the Woolly Mammoth.” Current Biology 33, no. 9: 1753–1764. 10.1016/j.cub.2023.03.084.
- Dolenz, S., van der Valk T., Jin C., et al. 2024. “Unravelling Reference Bias in Ancient DNA Datasets.” Bioinformatics 40: btae436. 10.1093/bioinformatics/btae436.
- Feuerborn, T. R., Palkopoulou E., van der Valk T., et al. 2020. “Competitive Mapping Allows for the Identification and Exclusion of Human DNA Contamination in Ancient Faunal Genomic Datasets.” BMC Genomics 21: 844. 10.1186/s12864-020-07229-y.
- Gamba, C., Hanghøj K., Gaunitz C., et al. 2016. “Comparing the Performance of Three Ancient DNA Extraction Methods for High‐Throughput Sequencing.” Molecular Ecology Resources 16: 459–469. 10.1111/1755-0998.12470.
- Hold, K., Lord E., Brealey J. C., et al. 2024. “Ancient Reindeer Mitogenomes Reveal Island‐Hopping Colonisation of the Arctic Archipelagos.” Scientific Reports 14: 4143. 10.1038/s41598-024-54296-2.
- Jacobs, Z., Li B., Shunkov M. V., et al. 2019. “Timing of Archaic Hominin Occupation of Denisova Cave in Southern Siberia.” Nature 565: 594–599. 10.1038/s41586-018-0843-2.
- Kapp, J. D., Green R. E., and Shapiro B. 2021. “A Fast and Efficient Single‐Stranded Genomic Library Preparation Method Optimized for Ancient DNA.” Journal of Heredity 112: 241–249. 10.1093/jhered/esab012.
- Kellner, F. L., Le Moullec M., Ellegaard M. R., et al. 2024. “A Palaeogenomic Investigation of Overharvest Implications in an Endemic Wild Reindeer Subspecies.” Molecular Ecology 33: e17274. 10.1111/mec.17274.
- Korlević, P., Gerber T., Gansauge M.‐T., et al. 2015. “Reducing Microbial and Human Contamination in DNA Extractions From Ancient Bones and Teeth.” BioTechniques 59: 87–93. 10.2144/000114320.
- Kutschera, V. E., Kierczak M., van der Valk T., et al. 2022. “GenErode: A Bioinformatics Pipeline to Investigate Genome Erosion in Endangered and Extinct Species.” BMC Bioinformatics 23: 228. 10.1186/s12859-022-04757-0.
- Leonardi, M., Librado P., Der Sarkissian C., et al. 2017. “Evolutionary Patterns and Processes: Lessons From Ancient DNA.” Systematic Biology 66: e1–e29. 10.1093/sysbio/syw059.
- Lord, E., Marangoni A., Baca M., et al. 2022. “Population Dynamics and Demographic History of Eurasian Collared Lemmings.” BMC Ecology and Evolution 22: 126. 10.1186/s12862-022-02081-y.
- Meyer, M., Arsuaga J.‐L., de Filippo C., et al. 2016. “Nuclear DNA Sequences From the Middle Pleistocene Sima de los Huesos Hominins.” Nature 531: 504–507. 10.1038/nature17405.
- Meyer, M., and Kircher M. 2010. “Illumina Sequencing Library Preparation for Highly Multiplexed Target Capture and Sequencing.” Cold Spring Harbor Protocols 6: prot5448. 10.1101/pdb.prot5448.
- Nguyen, R., Kapp J. D., Sacco S., Myers S. P., and Green R. E. 2023. “A Computational Approach for Positive Genetic Identification and Relatedness Detection From Low‐Coverage Shotgun Sequencing Data.” Journal of Heredity 114: 504–512. 10.1093/jhered/esad041.
- Nieves‐Colón, M. A., Ozga A. T., Pestle W. J., et al. 2018. “Comparison of Two Ancient DNA Extraction Protocols for Skeletal Remains From Tropical Environments.” American Journal of Physical Anthropology 166: 824–836. 10.1002/ajpa.23472.
- Pinhasi, R., Fernandes D., Sirak K., et al. 2015. “Optimal Ancient DNA Yields From the Inner Ear Part of the Human Petrous Bone.” PLoS One 10: e0129102. 10.1371/journal.pone.0129102.
- Pollard, K. S., Hubisz M. J., Rosenbloom K. R., and Siepel A. 2010. “Detection of Nonneutral Substitution Rates on Mammalian Phylogenies.” Genome Research 20: 110–121. 10.1101/gr.097857.109.
- Reich, D., Green R. E., Kircher M., et al. 2010. “Genetic History of an Archaic Hominin Group From Denisova Cave in Siberia.” Nature 468: 1053–1060. 10.1038/nature09710.
- Shapiro, B., and Hofreiter M. 2014. “A Paleogenomic Perspective on Evolution and Gene Function: New Insights From Ancient DNA.” Science 343: 1236573. 10.1126/science.1236573.
- Sirak, K., Fernandes D., Cheronet O., et al. 2020. “Human Auditory Ossicles as an Alternative Optimal Source of Ancient DNA.” Genome Research 30: 427–436. 10.1101/gr.260141.119.
- Slon, V., Hopfe C., Weiß C. L., et al. 2017. “Neandertal and Denisovan DNA From Pleistocene Sediments.” Science 356: 605–608. 10.1126/science.aam9695.
- Slon, V., Mafessoni F., Vernot B., et al. 2018. “The Genome of the Offspring of a Neanderthal Mother and a Denisovan Father.” Nature 561: 113–116. 10.1038/s41586-018-0455-x.
- van der Valk, T., Pečnerová P., Díez‐del‐Molino D., et al. 2021. “Million‐Year‐Old DNA Sheds Light on the Genomic History of Mammoths.” Nature 591: 265–269. 10.1038/s41586-021-03224-9.
- Zavala, E. I., Jacobs Z., Vernot B., et al. 2021. “Pleistocene Sediment DNA Reveals Hominin and Faunal Turnovers at Denisova Cave.” Nature (London) 595: 399–403. 10.1038/s41586-021-03675-0.
