Abstract
Molecular analysis of DNA samples with limited quantities can be challenging. Repeatedly sequencing the original DNA molecules from a given sample would overcome many issues related to accurate genetic analysis and mitigate issues with processing small amounts of DNA analyte. Moreover, an iterative, replicated analysis of the same DNA molecule has the potential to improve genetic characterization. Herein, we demonstrate that the use of “click”-based attachment of DNA sequencing libraries onto an agarose bead support enables repetitive primer extension assays for specific genomic DNA targets such as gene exons. We validated the performance of this assay for evaluating specific genetic alterations in both normal and cancer reference standard DNA samples. We demonstrate the stability of conjugated DNA libraries and related sequencing results over the course of independent serial assays spanning several months from the same set of samples. Finally, we applied this method to DNA derived from a tumor sample and demonstrated improved mutation detection accuracy.


Next generation sequencing (NGS) is increasingly used to analyze genomic DNA from a variety of tissue samples including clinical biopsies. One of the key challenges for NGS analysis involves the scarce quantities of nucleic acid material from specific samples; examples include biopsies of disease tissue or circulating DNA isolated from blood plasma. Limited amounts of tissue samples frequently yield enough DNA or RNA for only a single assay, limiting the breadth of analyses that can be performed. Because the majority of sequencing assays commonly employ a DNA polymerase to generate copies of a template molecule, we developed a process whereby the original material is preserved for use in subsequent polymerase reactions. In this study, we explore the application of biocompatible “click chemistry” reactions on solid support as a novel method to preserve DNA across multiple molecular assays.
In targeted sequencing, NGS is used to analyze specific segments of genomic DNA. Also referred to as deep sequencing, this approach enables sensitive detection of genetic variation even when these DNA alterations occur in a very low fraction of the available genomic DNA molecules. Oftentimes, these targeted sequencing assays and their various enzymatic steps deplete the available genomic DNA material from a tissue sample. Some samples, such as clinical biopsies, provide a limited amount of cellular material and nucleic acid. This limitation eliminates the possibility of performing subsequent replicate or alternative molecular assays.
“Click chemistry” provides an attractive method for conjugation of various biomolecules. Recent developments of this method have led to simple and rapid reactions between paired reactive species. Specifically, the inverse electron demand Diels–Alder cycloaddition (iEDDA) between tetrazines (Tz) and trans-cyclooctenes (TCO) is well suited for the bioconjugation of molecular species at dilute concentrations in the submicromolar range. Frequently, DNA extracted from low cellularity samples such as plasma falls into this concentration range. For this study, we describe a method utilizing iEDDA to covalently tether genomic DNA to a solid-phase support. This reaction generates a reusable DNA substrate for iterative polymerase-based enzymatic reactions. Referred to as APEX (attachment-based primer extension), this molecular sequencing assay provides DNA fragments for next-generation sequencing (Figure A).

Figure 1. Overview of APEX. A DNA sample library is covalently conjugated to functionalized agarose beads. Interrogation of genomic regions by primers and DNA polymerase creates copies of the conjugated template molecules. These copied fragments can then be eluted and sequenced. Because the DNA is covalently conjugated, the process can be repeated.
As a proof-of-concept demonstration, we validated APEX by performing highly multiplexed primer extension assays targeting the exons of 185 genes of interest. We show its robustness across reaction conditions and iterative experiments. We observed that covalently attached genomic DNA is stable on this substrate, and molecular assays can be performed over the course of months with negligible detection of degradation. Finally, to demonstrate its clinical applicability, we applied this technique on a patient-derived matched tumor sample.
APEX utilizes iEDDA “click chemistry” to conjugate genomic library fragments tailed with TCO-modified nucleotides to a corresponding Tz-functionalized cross-linked agarose substrate (Figure B). This variant of click chemistry is several orders of magnitude faster than the conventional copper-catalyzed version, obviates the need for copper ions, and is robust at the micromolar concentrations commonly observed in biomolecular samples.
The wide variety of buffer conditions in biomolecular assays required an initial validation of iEDDA. First, we characterized the Tz-TCO ligation reaction and verified its applicability for conjugating nucleic acids. Using a TCO-functionalized fluorescent Cy5 dye and dUTP nucleotide, we measured the conjugation efficiency on a corresponding Tz-functionalized cross-linked agarose support (Supporting Information, Methods). Using fluorescence and spectrophotometric measurements, respectively, we observed that over 99% of the TCO-functionalized molecules were conjugated after an overnight incubation. Extensive washing of the columns did not yield any fluorescent signal, indicating negligible nonspecific adsorption.
We validated the iEDDA conjugation performance on DNA from a control human DNA sample (NA12878 DNA) (Supporting Information, Methods). First, we enzymatically ligated only one of two required Illumina sequencing adapters (“P5-Read 1”) onto DNA that was sheared to approximately 500 bp. Both adapters (“P5-Read 1” and “P7-Read 2”) are required for a complete assayable molecule for Illumina NGS. We subsequently PCR amplified the ligated fragments to 1 μg of DNA. Second, to this modified partial DNA library, we used terminal transferase to add multiple TCO-functionalized dUTPs to the 3′-end of each DNA molecule in solution (Figure A). After reaction cleanup, the functionalized DNA was added to the Tz-functionalized cross-linked agarose in a spin column format. The spin column format additionally enabled streamlined operation without extensive technological infrastructure. The spin column was then end-capped and the mixture was incubated overnight at room temperature. Full details are available in the Methods (Supporting Information).

Figure 2. Preparation of DNA libraries and performance on control DNA. (A) DNA is fragmented and ligated with a single adapter. Terminal transferase adds TCO-functionalized nucleotides. The DNA library is then added to a spin column loaded with Tz-functionalized cross-linked agarose beads. The subsequent covalent reaction conjugates the DNA library to the beads. (B) Primer extension on NA12878 control genome libraries targeting 185 genes. The boxplot shows the depth of each target exon region for every target gene. Each replicate is a separate column containing NA12878 genomic DNA. (C) Cumulative distribution function of target exon coverage. Each line represents the cumulative fraction of the number of target exons with a given coverage yield.
We determined the amount of nonconjugated DNA that could be eluted from the agarose substrate after the conjugation reaction based on the amount of DNA in the eluent. By NanoDrop spectrophotometry, we calculated that the conjugation efficiency of sequencing libraries was 30.1 ± 1.6% (N = 4). Because enzymatic tailing of DNA library molecules with TCO-dUTP incorporates multiple modified nucleotides, steric hindrance and sequence context may contribute to variations in conjugation efficiency. This phenomenon was previously observed in other “click”-based biomolecular assays. Nevertheless, the amount of DNA library (hundreds of nanograms) conjugated onto agarose beads was sufficient for downstream assays.
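The efficiency calculation described above can be sketched as follows. This is a minimal illustration that assumes efficiency is the fraction of input DNA not recovered in the eluent; the function name and the nanogram values are hypothetical and are not measurements from this study.

```python
from statistics import mean, stdev


def conjugation_efficiency(input_ng, eluted_ng):
    """Percentage of input DNA retained on the beads (assumed definition)."""
    if input_ng <= 0:
        raise ValueError("input_ng must be positive")
    return 100.0 * (input_ng - eluted_ng) / input_ng


# Hypothetical input and eluent masses (ng) for four columns.
inputs = [1000.0, 1000.0, 1000.0, 1000.0]
eluted = [700.0, 690.0, 710.0, 700.0]

effs = [conjugation_efficiency(i, e) for i, e in zip(inputs, eluted)]
# Mean and sample standard deviation across replicate columns.
summary = (mean(effs), stdev(effs))
```

Reporting the sample standard deviation across replicate columns mirrors the "mean ± spread (N = 4)" style used in the text, although the source does not state which spread statistic was used.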
We validated the compatibility of conjugated DNA with NGS-based assays. An oligonucleotide primer containing the second “P7-Read 2” sequencing adapter is required to generate a complete library molecule compatible with sequencing; the second adapter is incorporated via a primer extension reaction. We performed highly multiplexed primer extension targeting of genomic regions of interest in conjugated DNA libraries inside the spin column. Approximately 12.4k unique oligonucleotides were generated by microarray synthesis (Supporting Information, Methods, Figure S-1, and Table S-2); these DNA primers hybridize to specific sequences flanking the exons of 185 genes, many of which play a role in cancer (Supporting Information, Table S-3). We designed these primers using a previously described strategy for enriching genomic targets. To expand this primer pool for multiplexed primer extension assays, we amplified the oligonucleotides with common flanking primers and subsequently digested them with a type-IIS restriction enzyme and lambda exonuclease to yield oligonucleotides with a common 5′-adapter region and a 40 bp 3′ priming sequence. Full experimental details are outlined in the Methods (Supporting Information). We confirmed the primer pool preparation by gel electrophoresis and tested for single-strandedness by exonuclease I digestion (Supporting Information, Figure S-1).
To perform the primer extension reaction, we incubated four columns, each containing a conjugated NA12878 library, with the oligonucleotide primer pool, followed by two wash steps and then a primer extension reaction (Supporting Information, Methods). We sequenced the eluted fragments (Supporting Information, Table S-4) and performed a series of sequence read alignment and processing procedures to evaluate the assay performance (Supporting Information, Methods). We observed excellent coverage of target exons (Figure B), with 98.0 ± 2.0% (N = 4) of regions being covered by at least one read (Figure C). Overall, the representation of individual primers in the sequence data is relatively even, with over 95% of target regions being within an order of magnitude of the median primer yield (Supporting Information, Figure S-2). However, the spread in coverage between exons will require between-sample normalization in order to accurately measure somatic mutations and copy number changes. We also measured the correlation in target yield between replicates of the four control libraries (Supporting Information, Figure S-3) and observed that performance was highly consistent across experiments. Overall, these results demonstrated that the conjugation and molecular assay process is reproducible from experiment to experiment.
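The two coverage metrics reported above (the fraction of regions covered by at least one read, and the fraction within an order of magnitude of the median yield) can be sketched as follows. The function name and the depth values are hypothetical, and "within an order of magnitude" is interpreted here as lying between one tenth of and ten times the median, which is an assumption.

```python
from statistics import median


def coverage_summary(depths):
    """Return (fraction with >= 1 read, fraction within 10x of the median)."""
    values = list(depths.values())
    if not values:
        raise ValueError("no target regions supplied")
    covered = sum(1 for d in values if d >= 1) / len(values)
    med = median(values)
    # Assumed interpretation: median / 10 <= depth <= median * 10.
    within = sum(1 for d in values if med / 10 <= d <= med * 10) / len(values)
    return covered, within


# Hypothetical per-exon read depths for one library.
example_depths = {"ex1": 100, "ex2": 120, "ex3": 0, "ex4": 15, "ex5": 2000}
covered_fraction, within_fraction = coverage_summary(example_depths)
```

In this toy input the median depth is 100, so one uncovered exon and one very deep exon fall outside the tenfold window.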
We determined the limit of detecting genetic variation from the APEX process. These experiments rely on a set of admixtures between two reference lymphocyte-derived DNA samples, NA12878 and NA24385; the two DNA samples were mixed together in different concentration ratios, covering a range from 0 to 50%. Detection of the minor DNA component (lower fraction) indicates the sensitivity of detection. The admixture DNA samples underwent the initial step of library processing, were conjugated into the spin column format, and then were subjected to targeted sequencing using the primer mixture previously described.
We used the sequencing data to determine the different admixture ratios composed of the two reference DNAs. To quantify these ratios, we relied on the fractional representation of genetic variations specific to the NA24385 DNA. These genetic variants are heterozygous single nucleotide variants (SNVs). Counting the sequence reads containing a specific SNV compared to the total number of sequence reads from a given target is a direct measurement of the DNA admixture ratio between the two samples.
From the sequencing data representing the target genes, we identified the fractional representation of those SNVs unique to NA24385, meaning that they are not present in the NA12878 reference DNA. For the various admixtures, the sequencing libraries had similar performance metrics and sequencing coverage distributions (Supporting Information, Figure S-4 and Table S-4). We observed that the mean read fraction corresponding to the SNVs unique to NA24385 correlated with the expected input admixture ratio (Figure A). Below a ∼1–5% SNV fraction, we observed a decline in correlation. At these lower admixture ratios, the variance may be due to errors introduced by the molecular assay and the sequencing process. In particular, the substantial error rate of Bst polymerase could be a major contributor to the reduction in performance at low minor allelic fractions.
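The read-counting measurement described above can be sketched as follows. This is a minimal illustration, not the study's analysis pipeline; the function names and read counts are hypothetical, and sites with no coverage are simply skipped, which is an assumption.

```python
def snv_read_fraction(alt_reads, total_reads):
    """Fraction of reads at a site that carry the SNV; None if uncovered."""
    if total_reads <= 0:
        return None
    return alt_reads / total_reads


def mean_snv_fraction(counts):
    """Mean read fraction over all covered SNV sites."""
    fractions = [
        f
        for f in (snv_read_fraction(alt, total) for alt, total in counts)
        if f is not None
    ]
    if not fractions:
        raise ValueError("no covered SNV sites")
    return sum(fractions) / len(fractions)


# Hypothetical (SNV-supporting reads, total reads) pairs for four sites.
example_counts = [(5, 100), (10, 100), (0, 0), (15, 100)]
example_mean = mean_snv_fraction(example_counts)
```

Averaging over many sample-specific SNVs reduces the influence of sampling noise at any single site, which matters most at low admixture ratios.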

Figure 3. Genetic variant analysis of admixtures in control and cancer genomes. (A) Admixtures of the NA24385 and NA12878 DNA samples for APEX processing and sequencing. The presence of known genetic variants (i.e., alleles) unique to NA24385 was measured and compared to the expected admixture ratio. Dashed line indicates perfect concordance. Pearson correlation: 0.878. (B) Admixtures of a cancer reference standard spiked into NA12878 at varying ratios. The presence of known variants unique to the cancer standard was measured and compared to the expected allele fraction. Dashed line indicates perfect concordance. Pearson correlation analysis: 10% spike-in, 0.787; 50% spike-in, 0.879; 100% spike-in, 0.941. Inset: boxplot of observed allelic fractions at 0% spike-in.
We performed additional validation experiments using admixtures of a cancer reference standard DNA and the NA12878 reference DNA. This cancer reference standard is derived from cancer cell lines; the cancer mutations have been verified and the fractions were confirmed with digital PCR. We performed several dilutions of the cancer reference DNA into the NA12878 control DNA and generated DNA libraries for conjugation onto agarose beads (Supporting Information, Methods). Conducting a similar analysis as previously described, we observed the correlations of measured versus expected SNV allele fractions to be 0.787, 0.879, and 0.941 for the 10%, 50%, and 100% cancer reference standard spike-in, respectively (Figure B).
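The Pearson correlations between expected and measured allele fractions reported above follow the standard definition, which can be sketched in a few lines. The function name and the example values are hypothetical and are not data from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expected vs. measured allele fractions.
expected = [0.01, 0.05, 0.10, 0.25, 0.50]
measured = [0.02, 0.04, 0.12, 0.22, 0.47]
r = pearson(expected, measured)
```

Note that a high correlation does not by itself imply that points lie on the line of perfect concordance, which is why the figure also shows that line.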
DNA conjugated with our APEX platform is stable across multiple molecular assays. To assess the stability of DNA fragments conjugated to the cross-linked agarose beads, we sought to measure the overall complexity of conjugated DNA molecules during the NGS assay. Higher complexity of DNA fragments across iterative assays indicates an overall retention of molecules. We used molecular barcodes in the form of random DNA tags that are ligated onto genomic DNA during library preparation. Molecular barcodes are indicators of single molecules independent of the actual identity of the DNA insert sequence. The identities and abundances of these tags are thus a marker of the stability of the conjugation: maintenance of tag complexity would indicate stable conjugation, while a reduction in complexity would indicate substantial losses (Figure A).

Figure 4. Measuring substrate stability with molecular barcodes. (A) Conjugated DNA libraries are ligated with an adapter containing a molecular barcode. Primer extension copies this molecular barcode sequence, and it can be measured across repeated assays. The maintenance of target yield and molecular barcode diversity indicates stable conjugation, while reductions in these metrics indicate loss of DNA material from the beads. (B) The depth per target exon normalized by the total number of reads is shown per repeated assay. The iteration number represents the number of repeat assays performed on the same column. (C) The number of unique molecular barcodes per exon normalized by the total number of reads is shown per repeated assay. The iteration number represents the number of repeat assays performed on the same column.
We conjugated libraries from NA12878 genomic DNA that contained molecular barcodes (Supporting Information, Methods). We performed a series of iterative primer extension reactions on the same substrate over the course of several months. To assess performance, we measured the overall depth of enriched target reactions normalized by sequencing depth (Figure B). This indicated that performance is maintained over repeated experiments. We also counted the overall molecular barcode diversity (Supporting Information, Methods). Here, we measured the number of unique molecular barcodes found in each target region normalized by the total number of sequenced reads (Figure C). We did not observe an overall decline in molecular barcode diversity, which indicates stable retention of DNA on the beads. We also measured the duplication rate of observed molecular barcodes at each exon; except for iterations 3 and 4 we did not measure a substantial change in the duplication rate (Supporting Information, Figure S-5).
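The barcode diversity and duplication metrics described above can be sketched as follows. The function name, input layout, and example reads are hypothetical; in particular, the duplication rate is taken here as one minus the ratio of unique barcodes to reads at an exon, which is an assumed definition rather than the one given in the Methods.

```python
from collections import defaultdict


def barcode_metrics(reads):
    """Per-exon barcode diversity from (exon, barcode) read records."""
    total_reads = len(reads)
    if total_reads == 0:
        raise ValueError("no reads supplied")
    per_exon = defaultdict(list)
    for exon, barcode in reads:
        per_exon[exon].append(barcode)
    metrics = {}
    for exon, barcodes in per_exon.items():
        unique = len(set(barcodes))
        metrics[exon] = {
            # Unique barcodes normalized by the total number of reads.
            "unique_per_read": unique / total_reads,
            # Assumed definition: share of reads that repeat a barcode.
            "duplication_rate": 1 - unique / len(barcodes),
        }
    return metrics


# Hypothetical reads from one assay iteration.
example_reads = [("E1", "AAC"), ("E1", "AAC"), ("E1", "GTT"), ("E2", "CCA")]
example_metrics = barcode_metrics(example_reads)
```

Computing the same metrics for each iteration and comparing them would show whether diversity declines, which is the signal of DNA loss from the beads discussed above.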
Lastly, we applied the APEX process for storing and analyzing patient-derived DNA samples. Frequently, clinical biopsies provide limited amounts of DNA and are thus a natural application for APEX. We conjugated DNA libraries derived from a matched-normal colorectal tumor sample (from the same patient) to cross-linked agarose beads (Supporting Information, Methods). Multiplexed primer extension yielded a performance similar to that of control samples, with excellent sequencing coverage uniformity and relative coverage of individual primers (Supporting Information, Figure S-6). We determined the presence of cancer-specific gene copy number changes. This analysis involved comparing the relative yields of each primer between the paired tumor and normal sample. Here, we observed evidence of increased copy number of several genes (Figure A) as indicated by the increased target sequence read counts found in the tumor DNA compared to the matched normal DNA sample. Notably, we observed increased read counts for multiple exons in NOTCH1, whose activation in the Wnt pathway results in increased proliferation in colorectal cancer. We also measured the dependence of candidate copy number variant regions on the sequenced depth in the normal sample (Supporting Information, Figure S-7). We did not observe a dropout in sequencing coverage for potential amplifications and vice versa for deletions, indicating that exon-to-exon variability in sequencing coverage did not affect our results.

Figure 5. Analysis of patient-derived samples. (A) Analysis of copy number variation by primer extension yield. The yield of target DNA from a matched tumor and normal sample was used to derive a heuristic copy number profile. Each point represents a separate exon target. Colors represent different chromosomes. Exon targets with a tumor-to-normal ratio of over 3 or less than 0.5 are labeled with their gene target. Only targets with a normalized depth of greater than 0.1 in the normal sample are considered as candidates. (B) Analysis of somatic mutations with iterative analysis. The allele frequency of somatic mutations is plotted for each iterative assay. Red points indicate somatic variants that were not detected in an iterative assay. Black points indicate variants found across both assays. Each of these points is labeled with its corresponding gene target.
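The heuristic copy number thresholds given in the caption above (tumor-to-normal ratio over 3 or under 0.5, with a minimum normalized depth of 0.1) can be sketched as follows. The normalization method is not stated in this text, so dividing each depth by the sample's median depth is an assumption, and the function names and depth values are hypothetical.

```python
from statistics import median


def normalize(depths):
    """Scale per-exon depths by the sample median (assumed normalization)."""
    med = median(depths.values())
    if med <= 0:
        raise ValueError("median depth must be positive")
    return {exon: depth / med for exon, depth in depths.items()}


def candidate_cnv(tumor, normal, high=3.0, low=0.5, min_normal=0.1):
    """Flag exons whose tumor-to-normal ratio passes the caption thresholds."""
    t_norm, n_norm = normalize(tumor), normalize(normal)
    calls = {}
    for exon in t_norm.keys() & n_norm.keys():
        if n_norm[exon] <= min_normal:
            continue  # too little depth in the comparison sample
        ratio = t_norm[exon] / n_norm[exon]
        if ratio > high:
            calls[exon] = ("gain", ratio)
        elif ratio < low:
            calls[exon] = ("loss", ratio)
    return calls


# Hypothetical per-exon depths for a matched pair.
normal_depths = {"A": 100, "B": 100, "C": 100, "D": 5, "E": 100}
tumor_depths = {"A": 100, "B": 400, "C": 40, "D": 50, "E": 100}
calls = candidate_cnv(tumor_depths, normal_depths)
```

Exon "D" has a tenfold ratio in this toy input but is excluded by the depth filter, which illustrates how the filter guards against ratios inflated by poorly covered targets.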
Iterative assays on the same sample provide an opportunity to improve the accuracy of DNA sequencing assays. We performed a cancer mutation analysis of the conjugated patient-derived sample across two iterative primer extension assays. When filtering for somatic variants observed in both assays, we observed concordance in a number of mutations (Figure B, Supporting Information, Table S-5). Remarkably, these mutations correlated strongly in allele frequency, implying that such assays can be effectively used to confirm somatic mutation calls. Discordant variant calls may be due to a combination of experimental and bioinformatic factors; in particular, the inherent error-prone nature of Bst polymerase may be a major contributing factor and area for future optimization.
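Filtering for variants observed in both iterative assays amounts to intersecting two call sets. A minimal sketch is shown below; the variant keys, allele frequencies, and function name are hypothetical and do not correspond to the mutations reported in Table S-5.

```python
def compare_calls(assay1, assay2):
    """Split variant calls into those seen in both assays and those seen in one.

    Each argument maps a variant key, e.g. (chrom, pos, ref, alt), to its
    allele frequency in that assay.
    """
    shared = assay1.keys() & assay2.keys()
    concordant = {v: (assay1[v], assay2[v]) for v in shared}
    discordant = sorted((assay1.keys() | assay2.keys()) - shared)
    return concordant, discordant


# Hypothetical somatic calls from two iterations on the same column.
iteration1 = {
    ("chr1", 1000, "A", "T"): 0.31,
    ("chr2", 2000, "G", "C"): 0.12,
    ("chr3", 3000, "C", "T"): 0.04,
}
iteration2 = {
    ("chr1", 1000, "A", "T"): 0.29,
    ("chr2", 2000, "G", "C"): 0.15,
    ("chr4", 4000, "T", "G"): 0.03,
}
concordant, discordant = compare_calls(iteration1, iteration2)
```

Keeping both allele frequencies for each concordant variant makes it straightforward to check how well they agree between iterations.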
Overall, our results demonstrate the potential of APEX in analyzing patient-derived clinical samples. Despite the wide range in DNA quantities across biopsy types (e.g., cell-free DNA to tissue biopsies), covalent attachment of such fragments with APEX will enable multiple replicate or orthogonal assays.
Supplementary Material
Acknowledgments
This work was supported by US National Institutes of Health grants NHGRI P01HG000205 (to B.T.L. and H.P.J.), NCI R33CA174575 (to H.P.J.), and NHGRI R01HG006137 (to H.P.J.). The American Cancer Society provided support to H.P.J. (Research Scholar grant, RSG-13-297-01-TBG). H.P.J. also received support from the Doris Duke Charitable Foundation, the Clayville Foundation, the Seiler Foundation, and the Howard Hughes Medical Institute.
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.8b05139.
List of primer probes (TXT)
Methods; primer probe generation (Figure S-1), gene exon yield uniformity (Figure S-2), target exon yield across replicates (Figure S-3), sequencing performance of NA12878 and NA24385 admixtures (Figure S-4), molecular barcode duplication rate (Figure S-5), sequencing performance of normal/tumor sample (Figure S-6), copy number quality control (Figure S-7); tables of oligonucleotides, target genes, sequencing metrics, and mutations (PDF)
B.T.L. performed the experiments and analyzed the data. B.T.L. and H.P.J. designed the experiments and wrote the manuscript. Both authors have given approval to the final version of this manuscript.
The authors declare no competing financial interest.
Raw sequence data is available on NCBI’s Sequence Read Archive under the accession number SRP167034.
References
- Garraway L. A. J. Clin. Oncol. 2013;31:1806–1814. doi: 10.1200/JCO.2012.46.8934.
- Hyman D. M., Taylor B. S., Baselga J. Cell. 2017;168:584–599. doi: 10.1016/j.cell.2016.12.015.
- Meyerson M., Gabriel S., Getz G. Nat. Rev. Genet. 2010;11:685–696. doi: 10.1038/nrg2841.
- Karver M. R., Weissleder R., Hilderbrand S. A. Bioconjugate Chem. 2011;22:2263–2270. doi: 10.1021/bc200295y.
- Khatwani S. L., Mullen D. G., Hast M. A., Beese L. S., Distefano M. D., Taton T. A. Bioorg. Med. Chem. 2012;20:4532–4539. doi: 10.1016/j.bmc.2012.05.017.
- van Buggenum J. A. G. L., Gerlach J. P., Eising S., Schoonen L., van Eijl R. A. P. M., Tanis S. E. J., Hogeweg M., Hubner N. C., van Hest J. C., Bonger K. M., Mulder K. W. Sci. Rep. 2016;6:22675. doi: 10.1038/srep22675.
- Shin G., Grimes S. M., Lee H., Lau B. T., Xia L. C., Ji H. P. Nat. Commun. 2017;8:14291. doi: 10.1038/ncomms14291.
- Hopmans E. S., Natsoulis G., Bell J. M., Grimes S. M., Sieh W., Ji H. P. Nucleic Acids Res. 2014;42:e88–e88. doi: 10.1093/nar/gku282.
- Myllykangas S., Buenrostro J. D., Natsoulis G., Bell J. M., Ji H. P. Nat. Biotechnol. 2011;29:1024–1027. doi: 10.1038/nbt.1996.
- de Bourcy C. F., De Vlaminck I., Kanbar J. N., Wang J., Gawad C., Quake S. R. PLoS One. 2014;9:e105585. doi: 10.1371/journal.pone.0105585.
- Potapov V., Fu X., Dai N., Correa I. R. Jr., Tanner N. A., Ong J. L. Nucleic Acids Res. 2018;46:5753–5763. doi: 10.1093/nar/gky341.
- Kivioja T., Vaharautio A., Karlsson K., Bonke M., Enge M., Linnarsson S., Taipale J. Nat. Methods. 2012;9:72–74. doi: 10.1038/nmeth.1778.
- Ishiguro H., Okubo T., Kuwabara Y., Kimura M., Mitsui A., Sugito N., Ogawa R., Katada T., Tanaka T., Shiozaki M., Mizoguchi K., Samoto Y., Matsuo Y., Takahashi H., Takiguchi S. Oncotarget. 2017;8:60378–60389. doi: 10.18632/oncotarget.19534.
- Zhang Y., Li B., Ji Z. Z., Zheng P. S. Cancer. 2010;116:5207–5218. doi: 10.1002/cncr.25449.
