Nucleic Acids Research. 2025 Nov 21;53(21):gkaf1143. doi: 10.1093/nar/gkaf1143

The A, B, C, D’s of replicative DNA polymerase fidelity: utilizing high-throughput single-molecule sequencing to understand the molecular basis for DNA polymerase accuracy

Leonardo Betancurt-Anzola 1,2,3,4, Killian C O’Connell 5, Vladimir Potapov 6, Jennifer L Ong 7, Nathan A Tanner 8, Ludovic Sauguet 9, Kelly M Zatopek 10
PMCID: PMC12634111  PMID: 41273172

Abstract

DNA polymerases (DNAPs) are indispensable enzymes that play central roles in biology, by replicating and repairing genetic material, as well as in biotechnology, by fueling such innovations as polymerase chain reaction (PCR), cloning, and DNA sequencing. Replicative DNAPs possess dual catalytic activities that work together for high-accuracy replication: a selective DNA-dependent DNA polymerase activity for synthesizing DNA and a proofreading exonuclease activity for removing misincorporated nucleotides. Despite their precision, DNA polymerases occasionally make errors, and understanding the mechanisms behind these mistakes is essential to fully leverage these enzymes. Indeed, measuring DNA polymerase fidelity not only reveals the basis of their accuracy, but also enables rational modulation of their fidelity. Here we employ a highly accurate Pacific Biosciences sequencing workflow that leverages long-read, non-PCR-based technology to measure DNAP error rates and profiles. Measuring the fidelity of the four primary replicative DNA polymerase families, A, B, C, and D, uncovered remarkably diverse family-specific error profiles. Factors that influence DNAP fidelity, such as deoxynucleoside triphosphate ratios, replication components, and exonuclease and polymerase active site mutations, are further explored. This work deepens our understanding of DNA replication and of the mechanisms that underlie DNA polymerase fidelity, and informs the development of advanced DNA polymerase-based tools for biotechnology.

Graphical Abstract


Introduction

During cellular division, the correct transfer of genetic information from one generation to the next is fundamental to life. This transfer, a process referred to as DNA replication, relies upon replicative DNA polymerases. These enzymes are responsible for copying entire genomes, from thousand base pair viral genomes to billion base pair eukaryotic genomes, with high accuracy. Importantly, errors incurred by replicative DNA polymerases can lead to detrimental consequences such as cellular apoptosis and cancer [1–3]. Nevertheless, random DNA polymerase errors are also the basis for evolution and speciation [4, 5]. In addition to their ubiquitous role in DNA replication, DNA polymerases are also indispensable tools in biotechnology, enabling a wide range of applications that rely on their ability to synthesize and amplify DNA with high accuracy, such as amplification, sequencing, and molecular diagnostics [6]. Therefore, the fidelity of DNA synthesis by DNA polymerases is critical for both biology and biotechnology. To synthesize DNA with high accuracy, replicative DNA polymerases possess two catalytic activities: a DNA-dependent DNA polymerase activity that must incorporate the correct deoxynucleoside triphosphates (dNTPs) and a proofreading exonuclease activity that removes incorrectly incorporated dNTPs (and occasionally rNTPs). These activities must work in concert to efficiently recognize and excise misincorporated nucleotides, ensuring accurate DNA synthesis.

Given the critical importance of accurate DNA synthesis in both biology and biotechnology, numerous methodologies have been developed to measure DNA polymerase fidelity [7–10]. These methods range from low-throughput single-nucleotide incorporation assays and mid-throughput plasmid-based LacZα gene assays [11–13] to high-throughput next generation sequencing assays [14–18]. Importantly, each of these methods provides a unique perspective and understanding of DNA polymerase fidelity. For example, single-nucleotide incorporation assays can provide highly accurate kinetic measurements of nucleotide affinity (Kd) and nucleotide incorporation rates (kpol) for correct and incorrect nucleotide incorporation of any DNA polymerase of interest. While these assays are highly sensitive and can detect subtle incorporation differences amongst different DNA polymerases and their variants, they are low-throughput, sequence context dependent, and unable to capture the contribution of the exonuclease site to DNA polymerase fidelity. Blue-white plasmid-based LacZ screening assays have been widely used to measure DNA polymerase error rates due to their low cost, ease of detection (blue vs white colonies), and adaptability for both in vitro and in vivo measurements. However, this method is also labor and time intensive, is biased toward detectable mutations, and is unable to provide an understanding of the DNA polymerase error profile (i.e. what specific mutations the DNA polymerase makes). More recently, next generation sequencing platforms have been employed to measure DNA polymerase fidelity, namely Illumina’s short-read sequencing-by-synthesis technologies [16–21]. Although this platform provides a high-throughput means of measuring DNA polymerase error rates, and provides error profiles, the technology is limited to short reads. Unfortunately, amplification of the DNA prior to sequencing, through library preparation, along with error inflation by sequencing artifacts, confounds the results, rendering the extraction of accurate error rates and profiles challenging.

To simplify and more directly measure DNA polymerase fidelity, we turned to Pacific Biosciences single-molecule real-time (SMRT) sequencing. SMRT sequencing is a long-read, non-polymerase chain reaction (PCR)-amplification-based platform that uses circular consensus sequencing (CCS) to repeatedly read the same DNA molecule, achieving extremely high accuracy [22]. By performing straightforward DNA polymerase primer extension assays under a desired condition, followed by PacBio library preparation, SMRT sequencing, and analysis of single-molecule reads, one can accurately measure the fidelity of any DNA polymerase, yielding both its error rate and error profile (Fig. 1) [23]. Further, due to the long-read nature of PacBio sequencing, fidelity measurements on long DNA molecules, such as those that represent a gene, or segment of a gene, can be carried out.

Figure 1.


Schematic of the primer extension assay utilized for DNA polymerase fidelity measurements. (A) A primer is annealed to 2 kb single-stranded DNA (ssDNA) followed by primer extension by a family A, B, C, or D DNA polymerase. Double stranded DNA is converted to PacBio libraries. (B) PacBio phi29-based sequencing polymerase performs CCS, allowing each DNA fragment to be sequenced multiple times. Downstream analysis identifies a consensus sequence for the initial 2 kb strand (black) and the primer extended strand (green) and compares them to identify DNA polymerase errors.

In this work, we show the utility of this methodology by measuring the fidelity of the four main families of replicative DNA polymerases: A, B, C, and D, to understand how diverse DNA polymerases enable accurate DNA replication. Indeed, although all four families are involved in genomic DNA replication and carry out the same two catalytic activities, nucleotidyl transfer for polymerization and hydrolysis for proofreading, they exhibit both sequence and structural diversity in both their polymerase and exonuclease domains (Fig. 2). For instance, families A and B possess a Klenow-like DNA polymerase active site, family C features a Polβ-like active site, and family D contains a double-Ψ-β-barrel (DPBB) polymerase active site [24–27]. Families A, B, and most of family C harbor a DnaQ-like exonuclease active site, whereas family D has a phosphodiesterase (PDE) exonuclease active site. Additionally, all family C DNA polymerases possess a polymerase and histidinol phosphatase (PHP) domain, which may be catalytically inactive, putatively active, or confirmed to have exonuclease activity [28]. For family C DNA polymerases with a DnaQ-like exonuclease active site, e.g. Escherichia coli Pol III, the PHP domain is inactive. Conversely, other family C DNA polymerases entirely lack the DnaQ-like exo domain and contain an active (or putatively active) PHP domain instead, including Mycobacterium tuberculosis PolC (Fig. 2). Finally, both family A and B house polymerase and exonuclease domains within a single subunit, while family D is a heterodimer of DP2 polymerase and DP1 exonuclease subunits. Family C is more diverse; for example, the E. coli Pol III core is a heterotrimer of α (DnaE) polymerase, ε (DnaQ) exonuclease, and θ (holE) subunits.

Figure 2.


Structural diversity across replicative DNA polymerases in Bacteria and Archaea. Schematic representation of bacterial Pol I and Pol III, and archaeal PolB and PolD, where colored circles represent polymerase (P) and the exonuclease (E) active sites. Three different polymerase folds exist in replicative DNA polymerases: Klenow-like (Families A and B), Polβ-like (family C) and DPBB (family D). Exonuclease domains can host a DnaQ-like fold (Pol I, Pol III, and PolB), PDE fold (PolD) or PHP fold (*inactive in E. coli Pol III).

To better understand how structurally diverse polymerase and exonuclease active sites contribute to accurate DNA replication and genome stability, we utilize PacBio sequencing to measure the fidelity of wt and exo- variants of bacterial family A and C and archaeal family B and D DNA polymerases, using E. coli and Pyrococcus abyssi as model systems. We further explore the role of critical polymerase active site residues in dictating both error rates and error profiles, revealing that the architecture of the DNA polymerase active site strongly influences the error profile. To further highlight the utility of this method, we explore how different factors, including replisome components and dNTP ratios, affect DNA polymerase fidelity in vitro and enable its rational modulation.

Materials and methods

All reagents and enzymes are from New England Biolabs, including E. coli Klenow Fragment, Klenow Fragment exo-, Taq DNA polymerase, and Dpo4 DNA polymerase, unless otherwise stated.

Plasmid construction and site-directed mutagenesis

Construction of plasmids encoding wild-type (wt) P. abyssi PolD and proliferating cell nuclear antigen (PCNA) has been described previously [29]. A gene encoding wt P. abyssi PolB was synthesized with an N-terminal His-tag and cloned into a pET29a vector by Genscript (Supplemental Table S1). Genes encoding E. coli Pol III α (dnaE), ε (dnaQ), θ (holE) subunits, and (dnaN) β-sliding clamp were amplified by PCR from T7 express cells and each cloned into a pET29a vector via Golden Gate Assembly using NEBridge® Golden Gate Assembly Mix with final constructs containing N-terminal His-tagged α (dnaE), His-MBP-SUMO ε (dnaQ), His-MBP-SUMO θ (holE) subunits, and N-terminal His-tagged (dnaN) β-sliding clamp (Supplemental Table S1). Final Golden Gate assembled clones were verified by whole plasmid sequencing utilizing Oxford Nanopore sequencing. Exonuclease deficient mutants of P. abyssi PolB and PolD, and E. coli Pol III ε subunit (D215A, H451A, and D12A/D14A, respectively) and steric gate mutants for P. abyssi PolB and PolD (Y410A and H923A, respectively) were created by site-directed mutagenesis (Supplemental Table S2) [26, 30–33]. Mutagenic PCR was performed with the NEB Q5 Site-Directed Mutagenesis kit according to the manufacturer’s instructions. Mutagenic primers were synthesized by Integrated DNA Technologies. After mutagenic PCR and KLD treatment, NEB 5α cells were transformed, plated on Lysogeny Broth-Kanamycin (LBK) plates, and incubated overnight at 37°C. Colonies were picked the following day and 10 ml of liquid LB medium with kanamycin was inoculated with a single colony and grown overnight. Plasmid miniprep was done with the NEB Monarch Plasmid Miniprep Kit. Purified plasmids were verified by whole plasmid sequencing utilizing Oxford Nanopore sequencing.

Protein purification (PolB, PolD, and mutants)

Pyrococcus abyssi PCNA was expressed and purified as previously described [29] (Supplemental Table S1). NEB NiCo21 cells (C2529H) were transformed with plasmids harboring P. abyssi PolB or PolD wt, exo-, or exo-/steric gate genes according to the manufacturer’s instructions, plated on LBK agar plates and incubated overnight at 37°C. A single clone for each mutant was used to start a 10 ml LBK culture that was incubated overnight at 37°C. The following day, 100 ml of LBK medium was inoculated with 5 ml of the overnight culture and grown at 37°C to an optical density (OD600) between 0.4 and 0.6 units. Cells were induced with 400 µM IPTG (final concentration) and incubated at 37°C for 3 h, followed by harvesting at 4500 × g for 30 min. Harvested cells were resuspended in 1 ml of buffer A (20 mM Tris–HCl, pH 7.5, 300 mM NaCl, 20 mM imidazole) and lysed by sonication (Q500 Sonicator, QSonica). Lysed cells were incubated at 80°C for 20 min followed by centrifugation at 20 000 × g for 20 min. Soluble fractions were then purified utilizing NEBExpress Ni Spin Columns following the manufacturer’s instructions but replacing lysis and wash buffer with buffer A. Elution buffer was replaced with buffer B (20 mM Tris–HCl, pH 7.5, 300 mM NaCl, 500 mM imidazole). Purified proteins were confirmed by sodium dodecyl sulphate (SDS)–polyacrylamide gel electrophoresis and dialyzed overnight in storage buffer [10 mM Tris–HCl, pH 7.4, 100 mM KCl, 1 mM dithiothreitol (DTT), 0.1 mM ethylenediaminetetraacetic acid (EDTA), 50% glycerol].

Protein purification (Pol III, Pol III mutants and β-sliding clamp)

Escherichia coli T7 Express lysY/Iq competent cells were transformed with a pET28a plasmid containing either N-terminal His-tagged α (dnaE), His-MBP-SUMO ε (dnaQ), or His-MBP-SUMO θ (holE) subunits of Pol III, or N-terminal His-tagged (dnaN) β-sliding clamp, as indicated by the manufacturer (Supplemental Table S1). Transformed cells were plated on an LBK agar plate and grown overnight at 37°C. A 10 ml pre-culture, grown overnight at 37°C in LBK, was used to inoculate 1 l of LBK media. The inoculated cultures were grown at 37°C to an OD600 of 0.6. Expression of each protein was induced with 0.2 µM final IPTG and growth was continued overnight at 20°C. Induced cells were harvested by centrifugation at 3000 × g for 30 min and resuspended with lysis buffer (50 mM Tris–HCl, pH 8.0, 300 mM NaCl, 0.1 mM EDTA, 5% glycerol). Resuspended cells were lysed with a cell-disruptor at 20 kpsi (ShearJet HL60 homogenizer, Dyhydromatics LLC). Lysate was clarified by centrifugation at 25 000 × g for 20 min. Lysate supernatant was loaded onto a 5 ml Ni-NTA column (Cytiva) pre-equilibrated with buffer A (20 mM Tris–HCl, pH 8.0, 500 mM NaCl, 1 mM tris(2-carboxyethyl)phosphine (TCEP), 0.1% Triton X-100, 30 mM imidazole), and eluted with buffer B (20 mM Tris–HCl, pH 8.0, 500 mM NaCl, 1 mM TCEP, 0.1% Triton X-100, 500 mM imidazole) with an imidazole gradient from 30–500 mM. Fractions were run on a 4%–20% Tris–glycine gel (Bio-Rad) to confirm the presence and purity of each subunit. Fractions containing each subunit were individually pooled and dialyzed into dialysis buffer (10 mM Tris–HCl, pH 7.4, 300 mM NaCl, 1 mM DTT, 0.1 mM EDTA, 50% glycerol) in either 1 kDa (subunit θ) or 8–10 kDa (subunits α and ε) dialysis tubing (SpectraPor) overnight at 4°C. Fractions containing Pol III subunits ε and θ were then incubated with 1 µM SENP1 protease for 4 h (Supplemental Table S1). Aliquots were run on a 4%–20% Tris–glycine gel to confirm the removal of the MBP tag. Pooled fractions were loaded onto a 5 ml Ni-NTA column and purified as described above, with an imidazole gradient from 0–500 mM. Escherichia coli Pol III clamp loader was a gift from the Loparo Laboratory, Harvard University [34].

Substrate preparation for PacBio sequencing

For Pacific Biosciences DNA polymerase fidelity measurement assays, a synthetic 2 kb DNA template was designed to contain an equal ratio of A:T:G:C in all sequence combinations and avoid homopolymer regions of >5 consecutive bases, with a single XmnI site just downstream and a T7 promoter upstream. The artificial DNA fragment was synthesized by GenScript and cloned into the EcoRV site of pUC57-simple. NEB 5α (C2987) cells were transformed with the plasmid, plated onto an LB-AMP agar plate, and incubated overnight at 37°C. After incubation, a single colony was used to inoculate a 100 ml culture of LB-AMP media. The plasmid was purified from the overnight culture using the Qiagen Plasmid Plus Maxi kit. Twenty micrograms of plasmid were linearized by incubation with 400 units of XmnI in 1 ml of 1× Thermopol Buffer [20 mM Tris–HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8 at 25°C] for 1 h at 37°C (Supplemental Fig. S1).

Linear DNA was repaired by adding 20 µl PreCR mix, 11 µl dNTPs (10 mM), 11 µl NAD+ (50 mM), 10 µl of 10× Thermopol buffer, and 48 µl ddH2O. The reaction was incubated at 37°C for 30 min. DNA was cleaned up using the Monarch PCR & DNA cleanup kit and eluted with 100 µl of elution buffer. The repaired, linearized substrate was used to generate single-stranded RNA (ssRNA) with the HiScribe® T7 kit, followed by DNase I treatment, as described by the manufacturer’s protocol. ssRNA was cleaned up using the Monarch RNA cleanup kit and eluted with 100 µl low Tris-EDTA (TE) buffer [10 mM Tris–HCl (pH 8.0), 0.1 mM EDTA]. ssRNA was quantitated using a Qubit Fluorometer (Thermo Fisher Scientific) and RNA HS Assay Kit (Thermo Fisher Scientific). Finally, ssRNA was reverse transcribed using ProtoScript II. The following steps yield enough complementary DNA (cDNA) for a single DNA polymerase fidelity assay. Ten micrograms of ssRNA were annealed to 10 µM of the reverse primer (5′-CTGACACTCATCCTTCAGA-3′) in a total volume of 40 µl by incubating at 65°C for 5 min followed by incubation on ice. Annealed substrate was reverse transcribed by adding 50 µl 2× ProtoScript II reaction mix and 10 µl 10× enzyme mix. The reaction was incubated at 42°C for 1 h followed by addition of 4 µl RNaseH and incubation at 37°C for 1 h. cDNA was cleaned up using NEB sample purification beads, followed by elution with 35 µl low TE buffer. ssDNA synthesis was scaled up by performing 96 parallel reactions, as described above. Groups of 12 reactions were pooled, cleaned up with NEB sample purification beads, and eluted with 420 μl low TE buffer.

Primer extension reactions utilizing the synthesized 2 kb ssDNA and wt and mutant P. abyssi PolB and PolD, and wt and exo- E. coli Pol III and Klenow Fragment, and wt Thermus aquaticus (Taq) DNA polymerase and Sulfolobus islandicus DNA polymerase IV (Dpo4) were performed by annealing 5 µl of 100 µM forward primer (5′-GGGAAGCAGACGTAATATATG-3′) to 35 µl cDNA from the previous step for each DNA polymerase. Annealing was performed by incubation at 65°C for 5 min followed by transfer to ice. Primer extension was carried out in a 50 µl reaction by combining 5 µl of 10× Thermopol reaction buffer (PolB, PolD, Klenow Fragment, Taq, and Dpo4) or 10× Pol III reaction buffer (50 mM HEPES, pH 7, 1 mM DTT, 12 mM magnesium acetate, 80 mM potassium acetate, 100 µg/ml bovine serum albumin), 200 µM final dNTPs (for equimolar dNTP experiments), 50 nM final DNA polymerase (E. coli β-clamp was present in all Pol III extension assays at a ratio of 1:1), 40 µl of annealed substrate, brought up to 50 µl with ddH2O and incubated for 1 h at 37°C for bacterial and 65°C for archaeal DNA polymerases. Escherichia coli Pol III extension assays were initially carried out in 1× Thermopol buffer, but due to reduced polymerase activity and incomplete primer extension, experiments were then performed in Pol III reaction buffer. For reactions in which PCNA was added with PolD, 150 nM final PCNA was included. For reactions in which the E. coli β-clamp loader was added, 25 nM final clamp loader was included. Extended products were cleaned up with NEB sample purification beads and eluted in 50 µl low TE buffer. Second-strand synthesis was carried out in triplicate for each DNA polymerase condition.

For experiments in which altered dNTP pools were utilized, five different pools were created: Low dATP, Low dCTP, Low dGTP, Low dTTP, and canonical P. abyssi dNTPs. For “Low” dNTP pools, three dNTPs were kept at 200 µM final concentration while one was lowered to 40 µM final concentration, which is just below the Km of dNTP binding for archaeal PolB [35]. For example, for the Low dATP pool, dTTP, dGTP, and dCTP were at 200 µM while dATP was at 40 µM. For canonical P. abyssi dNTPs, ratios were taken from previously reported in vivo P. abyssi nucleotide pool measurements. The highest in vivo concentration (dTTP) was kept at 200 µM final compared to final concentrations of 94 µM dATP, 33 µM dCTP, and 103 µM dGTP [36].
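The five pool compositions described above can be written out explicitly. The following is a minimal sketch that only restates the final concentrations given in the text; the names `low_pool`, `CANONICAL_PABYSSI`, and `relative_to_max` are illustrative and are not part of the published workflow.

```python
# Final dNTP concentrations in µM, restated from the text.
STANDARD_UM = 200.0   # concentration of the three non-limiting dNTPs
LOW_UM = 40.0         # concentration of the single limiting dNTP
DNTPS = ("dATP", "dCTP", "dGTP", "dTTP")


def low_pool(limiting):
    """Return a 'Low' pool: one dNTP at 40 µM, the other three at 200 µM."""
    if limiting not in DNTPS:
        raise ValueError(f"unknown dNTP: {limiting}")
    return {n: (LOW_UM if n == limiting else STANDARD_UM) for n in DNTPS}


# Canonical P. abyssi pool, using the final concentrations stated in the text.
CANONICAL_PABYSSI = {"dATP": 94.0, "dCTP": 33.0, "dGTP": 103.0, "dTTP": 200.0}


def relative_to_max(pool):
    """Express each dNTP as a fraction of the most abundant dNTP in the pool."""
    top = max(pool.values())
    return {n: c / top for n, c in pool.items()}


if __name__ == "__main__":
    for name in DNTPS:
        print(f"Low {name}:", low_pool(name))
    print("Canonical ratios:", relative_to_max(CANONICAL_PABYSSI))
```

Expressing the canonical pool relative to dTTP makes the imbalance easy to see: dCTP is present at roughly one sixth of the dTTP concentration.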

The 2 kb double-stranded DNA constructs were utilized to create PacBio libraries following the manufacturer’s protocol “Preparing multiplexed amplicon libraries using SMRTbell prep kit 3.0” and the PacBio SMRTbell prep kit 3.0 (102-182-700) with a few modifications. Step 1, “Input DNA and quality control”, was omitted; in Step 2.1, 2 μl of DNA repair mix was replaced with nuclease free water; and in Step 3.1 the SMRTbell Barcoded Adapter Plate 3.0 (102-009-200) was utilized to barcode individual libraries for multiplexing.

PacBio sequencing and analysis

PacBio libraries were prepared for sequencing according to the <3 kb Amplicons method utilizing the PacBio SMRTlink online tool, Sequel® II Binding Kit 3.1, and 150 pM concentration on plate. The sequencing run for Sequel® II was set up with the default parameters for the <3 kb Amplicons application and 20 h of movie time. Libraries were pooled for multiplexed sequencing, where a maximum of 45 libraries were pooled at once. Following sequencing, the data were analyzed as previously described for RNA polymerase (RNAP) and reverse transcriptase (RT) fidelity measurements to extract sequencing errors (Supplemental Table S3 and raw data in Data availability below) [23]. Briefly, strand-specific consensus sequences were built for DNAs from each sequencing well (i.e. zero-mode waveguide, ZMW). First, raw sequencing subreads were mapped to the reference to separate the first strand (i.e. the RT strand) from the corresponding second strand (i.e. the DNA polymerase strand). Second, the PacBio ccs command-line utility was used to build consensus sequences of first and second strands. Opposite strand consensus sequences were filtered to have at least 15 passes, overall read quality of 1.0, read length within 10 bp of the expected reference sequence length, and expected sequences at both 5′ and 3′ ends. For each ZMW, the first-strand consensus was aligned to the second-strand consensus, and the total number and identity of the mismatches were extracted. Scripts used in this analysis can be found at https://doi.org/10.5281/zenodo.17379561.
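The filtering and strand-comparison steps can be illustrated with a short sketch. This is not the deposited analysis code: the `Consensus` class and the functions `passes_filters` and `mispairs` are hypothetical, the end-sequence check is omitted, and the comparison assumes the two consensus sequences are already gap-free and of equal length (the real workflow performs an alignment).

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass
class Consensus:
    """Hypothetical container for one strand-specific consensus read."""
    seq: str        # consensus sequence, 5'->3'
    passes: int     # number of sequencing passes supporting the consensus
    quality: float  # overall read quality reported for the consensus


def passes_filters(c, ref_len, min_passes=15, min_quality=1.0, max_len_diff=10):
    """Apply the pass-count, quality, and length thresholds stated in the text."""
    return (
        c.passes >= min_passes
        and c.quality >= min_quality
        and abs(len(c.seq) - ref_len) <= max_len_diff
    )


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def mispairs(template, nascent):
    """List (position, 'dX:dY') mispairs, X = template base, Y = nascent base.

    template: first-strand consensus, 5'->3' (copied by the test polymerase).
    nascent:  second-strand consensus, 5'->3' (made by the test polymerase).
    Assumes equal-length, gap-free sequences (a simplification).
    """
    if len(template) != len(nascent):
        raise ValueError("this sketch requires equal-length sequences")
    # Bring the nascent strand into template orientation for comparison.
    nascent_rc = reverse_complement(nascent)
    found = []
    for i, (t, n_rc) in enumerate(zip(template, nascent_rc)):
        if t != n_rc:
            # Recover the base actually present in the nascent strand.
            n = n_rc.translate(COMPLEMENT)
            found.append((i, f"d{t}:d{n}"))
    return found
```

Because the template and nascent strands are compared within a single molecule, a mismatch is reported in the template:nascent convention used for the substitution spectra, which distinguishes, for example, dG:dA from dT:dC.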

Results

A highly accurate, high-throughput single molecule sequencing method to measure DNA polymerase fidelity

In the past decade, short-read and long-read high-throughput next-generation sequencing platforms have been employed to measure DNA polymerase and RNAP fidelity. Due to the requirement for PCR-compatible DNA polymerases and the complexity in separating PCR amplification errors and inherent sequencing polymerase errors from desired DNA polymerase fidelity measurements, we sought alternative methodologies to directly measure DNA polymerase fidelity. In this work, we employ PacBio SMRT sequencing to measure replicative DNA polymerase fidelity. Previous work (Potapov et al.) [23] created a Pacific Biosciences SMRT-sequencing based strategy to measure RNAP and RT cDNA and DNA second strand synthesis fidelity. We have adopted the second-strand DNA synthesis workflow to measure the in vitro fidelity of replicative DNA polymerases from Bacteria and Archaea (Fig. 1). Here, 2 kb ssDNA is created through an IVT/RT-based workflow (Supplemental Fig. S1). This ssDNA contains an equal ratio of A:T:G:C in all sequence combinations and avoids homopolymer regions of >5 consecutive bases to enable fidelity measurements free of sequence bias. Following primer annealing, the ssDNA is replicated with the DNA polymerase of interest to create 2 kb dsDNA constructs (Fig. 1A). PacBio libraries are then created from these dsDNA constructs by ligation of SMRTbell adapters followed by sequencing using PacBio SMRT CCS. In CCS, the phi29-based sequencing polymerase copies both strands of the 2 kb fragments multiple times, which enables extremely high accuracy sequencing output, currently making PacBio SMRT-sequencing the highest fidelity high-throughput sequencing platform [37]. Here, the sequencing polymerase must complete 15 passes of each strand for a molecule to be utilized for fidelity analysis, ensuring accurate fidelity measurements. The consensus sequence of the strand synthesized by the replicative DNA polymerase (green, Fig. 1B) is then directly compared to the consensus RT replicated strand (black, Fig. 1B), for each individual molecule, and misincorporation errors are extracted. Importantly, errors created by the RNAP or RT during creation of the ssDNA fragment do not confound extraction of replicative DNA polymerase fidelity, as the consensus sequence of the strand synthesized by the replicative DNA polymerase is directly compared to the consensus sequence of the strand synthesized by the RNAP/RT within each individual molecule and not compared to the initial reference sequence (Fig. 1B and Supplemental Fig. S2). Further, errors created by the PacBio phi29-based sequencing polymerase (blue X; Fig. 1B) do not affect extraction of the true errors created by the replicative DNA polymerase (red X; Fig. 1B), as CCS can distinguish replicative DNA polymerase errors, which are present in all reads, from stochastic phi29-based sequencing polymerase errors. In addition to the number of errors detected, the error profiles can also be extracted, enabling an understanding of which misincorporations the DNA polymerase is most prone to make, and under which sequence contexts.
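The reason repeated passes separate true polymerase errors from sequencing errors can be shown with a toy example. The sketch below uses a simple per-position majority vote over pre-aligned, equal-length passes; this is a deliberate simplification for illustration and is not how the PacBio ccs utility computes its consensus. The function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-position majority base across equal-length passes.

    A base fixed in the molecule (including a replicative polymerase error)
    appears in every pass and survives the vote; a stochastic sequencing
    error appears in only a few passes and is outvoted.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this sketch requires pre-aligned, equal-length passes")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


if __name__ == "__main__":
    # The molecule reads ACGTA; two passes each carry one random sequencing error.
    passes = ["ACGTA", "ACCTA", "ACGTA", "TCGTA", "ACGTA"]
    print(majority_consensus(passes))
```

With only five passes the two isolated sequencing errors are already outvoted; requiring at least 15 passes per strand, as in this work, makes it far less likely that a stochastic error reaches the consensus.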

Fidelity measurements of the four families of replicative DNA polymerases and the contribution of the exonuclease domains

To explore the fidelity of bacterial and archaeal replicative DNA polymerases, representative polymerases were chosen from two organisms: Pol I and Pol III from E. coli (families A and C, respectively), and PolB and PolD from P. abyssi (families B and D, respectively). Taq DNA polymerase, a family A DNA polymerase from the thermophilic bacterium T. aquaticus, was included as a control as its fidelity has been extensively measured using various techniques. Taq DNA polymerase therefore provides a reliable benchmark for DNA polymerase fidelity measurements [11, 38] (Fig. 3A). Further, a family Y translesion DNA polymerase, Dpo4, from the thermophilic archaeon S. islandicus, was included. Family Y DNAPs are notoriously error-prone due to their enlarged active site, which accommodates bypass of bulky template lesions, so Dpo4 provides a fidelity measurement for a highly error-prone DNA polymerase from an additional DNA polymerase family.

Figure 3.


DNA polymerase error rates and profiles. (A) Fidelity measurements, in substitutions per million bases, for the different families of DNA polymerases and their respective exo- mutants. (B) Substitution spectra for each DNA polymerase, with dA misincorporations in green, dC misincorporations in blue, dG misincorporations in pink, and dT misincorporations in orange. Legend corresponds to template strand:nascent strand mispair (i.e. dG:dA = template strand dG, nascent strand dA).

The number of errors produced by wt replicative polymerases varied, from as many as ∼150 errors per 1 million bases (1.5 × 10⁻⁴) for archaeal family D DNA polymerase, to as few as ∼25 errors per 1 million bases (2.5 × 10⁻⁵) for archaeal family B and bacterial family A (Klenow) DNA polymerases, which is congruent with previously reported Klenow fragment fidelity rates [39]. Pol III, a bacterial family C polymerase, was determined to be in the middle with ∼40 errors per 1 million bases (4.0 × 10⁻⁵) (Fig. 3A and Supplemental Table S3). Errors created by Taq DNA polymerase were ∼250 per million (2.5 × 10⁻⁴), which correlates well with previous next-generation sequencing (NGS)-derived fidelity measurements [14]. Interestingly, for both bacterial and archaeal polymerases, the major replicative polymerases (PolD and Pol III) exhibited lower fidelity in vitro, and therefore produced more errors, compared to the accessory polymerases that are implicated in Okazaki fragment maturation (PolB and Pol I) [40, 41]. PolD, the major replicative polymerase in P. abyssi, has ∼6-fold lower fidelity compared to the accessory PolB polymerase; from E. coli, the replicative Pol III has ∼1.5-fold lower fidelity compared to Pol I. Further, PolB and Pol I, the polymerases with the highest fidelity, contain homologous folds in polymerase and exonuclease active sites: a Klenow-like polymerase active site and a DnaQ-like exonuclease active site (Fig. 2). In contrast, PolD and Pol III have structurally distinct polymerase active sites, with PolD housing the DPBB active site and Pol III housing a Polβ-like polymerase active site (Fig. 2). Not surprisingly, the translesion family Y DNA polymerase, Dpo4, which lacks proofreading ability, was highly error-prone with ∼3000 errors per 1 million bases (3.0 × 10⁻³).
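The conversions between errors per million bases, error rates, and fold differences quoted above are simple ratios. The sketch below reproduces them from the approximate (rounded) values in the text; the dictionary and function names are illustrative, and exact values are in Supplemental Table S3.

```python
# Approximate substitution counts per 1 million bases, as rounded in the text.
APPROX_ERRORS_PER_MILLION = {
    "PolD (family D)": 150,
    "PolB (family B)": 25,
    "Klenow / Pol I (family A)": 25,
    "Pol III (family C)": 40,
    "Taq (family A)": 250,
    "Dpo4 (family Y)": 3000,
}


def error_rate(errors, bases=1_000_000):
    """Substitutions per base synthesized."""
    return errors / bases


def fold_difference(errors_a, errors_b):
    """How many times more errors polymerase A makes than polymerase B."""
    return errors_a / errors_b


if __name__ == "__main__":
    for name, errors in APPROX_ERRORS_PER_MILLION.items():
        print(f"{name}: {error_rate(errors):.1e} substitutions per base")
    # Replicative vs accessory polymerase within each organism.
    print("PolD vs PolB:", fold_difference(150, 25))
    print("Pol III vs Pol I:", fold_difference(40, 25))
```

With these rounded inputs the PolD to PolB ratio is 6 and the Pol III to Pol I ratio is 1.6, in line with the ∼6-fold and ∼1.5-fold differences reported from the unrounded data.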

To explore the contribution of the exonuclease activity to DNA polymerase fidelity, the exonuclease domain of each polymerase was mutated to render it catalytically inactive (Supplemental Table S2). Unexpectedly, the contribution of the exonuclease domain to polymerase fidelity varied widely amongst the different DNA polymerase families: inactivating it decreased fidelity by as little as two-fold in PolD and Pol III and by as much as 15-fold in PolB. These results suggest that the PolB polymerase active site is more prone to errors compared to the other DNA polymerase families and the PolB exonuclease domain is more efficient at removing polymerase errors; the overall higher fidelity observed for archaeal PolB is a result of highly efficient proofreading by the DnaQ-like active site (Supplemental Table S3). Although PolB, Pol I, and Pol III all contain a DnaQ-like exonuclease fold, the contribution of this domain to polymerase fidelity varies, suggesting more is at play than solely the architecture of the active site (Fig. 3A).

In addition to obtaining the total number of errors each DNA polymerase family creates, the nature of these errors can also be extracted from the sequencing data, providing an understanding of what type of mutations each polymerase is prone to making (Fig. 3B and Supplemental Fig. S3). Given that we know which strand was synthesized by our DNA polymerase of interest, we can extract all 12 possible misincorporations, not simply the nature of the mismatch present. Strikingly, the error profile for each DNA polymerase family is quite distinct, with bacterial family A most prone to dT:dG mispairs, where dGTP was misincorporated across template dT (∼50% of all substitutions), and dA:dA mispairs (15%). The archaeal family B substitution spectrum shows dG:dT as the most common mispair (∼50%) along with dT:dG (24%). This suggests both polymerases, which house similar polymerase (Klenow-like) and exonuclease (DnaQ-like) active sites in a single subunit, favor the creation of dG:dT mismatches. Further, the most common mispair for family Y Dpo4, which also houses a Klenow-like polymerase active site, is dT:dG (27%). In contrast, Pol III is most prone to dA:dA mispairs (52%) and dG:dA (∼22%), while PolD exhibits the most diverse error profile, with the two most common mispairs being dG:dT (27%) and dT:dG (15%). The dC:dC mispair is among the least common, and sometimes undetectable, for all polymerase families except the translesion family Y polymerase Dpo4 (6.5%). This suggests that replicative polymerase active sites strongly disfavor this mismatch, resulting in poor misincorporation or extension, consistent with previous in vitro single-nucleotide incorporation and primer-extension assays (Fig. 3B) [9, 42, 43]. Further, Pol III is the only polymerase for which the top three substitutions result in the misincorporation of dA, suggesting a higher affinity for this nucleotide at the active site, with the three mispairs dG:dA, dC:dA, and dA:dA representing 87.5% and 92.1% of all mispairs in Pol III wt and exo-, respectively. While we observed differences in the number of errors between wt and exo- variants for each DNA polymerase, the mutational spectra of wt and exo- variants are quite similar, suggesting that the exonuclease domain of each polymerase discriminates minimally among mismatches during removal and therefore does not dictate the error profile.
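The template:nascent bookkeeping described above can be illustrated with a minimal sketch. This is not the authors' analysis software (their tools are deposited on Zenodo); the function name `substitution_spectrum` and the toy event list are assumptions made purely for illustration. The sketch enumerates the 12 possible misincorporations and converts a list of observed substitution events into fractions of the total.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The 12 possible misincorporations, written template:nascent as in the text
# (e.g. "dT:dG" = template dT, dG misincorporated into the nascent strand).
MISPAIRS = [
    f"d{t}:d{n}" for t, n in product(BASES, repeat=2) if n != COMPLEMENT[t]
]


def substitution_spectrum(events):
    """Return the fraction of each mispair among all substitutions.

    `events` is an iterable of (template_base, nascent_base) tuples,
    one tuple per observed substitution (hypothetical input format).
    """
    counts = Counter()
    for template_base, nascent_base in events:
        if nascent_base == COMPLEMENT[template_base]:
            raise ValueError("a correct Watson-Crick pair is not a substitution")
        counts[f"d{template_base}:d{nascent_base}"] += 1
    total = sum(counts.values())
    if total == 0:
        return {m: 0.0 for m in MISPAIRS}
    return {m: counts[m] / total for m in MISPAIRS}


# Toy data only; these counts are invented and do not come from the paper.
toy_events = [("T", "G")] * 5 + [("A", "A")] * 3 + [("G", "T")] * 2
spectrum = substitution_spectrum(toy_events)
```

Because the synthesized strand is known, dT:dG and dG:dT are counted as separate classes here, which is what distinguishes a 12-class spectrum from a 6-class mismatch tally.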

To understand whether error profiles are polymerase-specific or whether trends hold for other members of the same polymerase family, the fidelity and error profile of several archaeal family B and bacterial family A DNA polymerases were measured. The family B DNAPs are from archaeal species that encode a family D DNA polymerase, and therefore likely function as an accessory replicative DNAP rather than the main replicative polymerase [41, 44]. Intriguingly, error profiles remained consistent across all bacterial family A DNA polymerases and across all archaeal family B DNA polymerases, suggesting these profiles might be conserved within families in the same domain of life (Supplemental Fig. S4A and B).

To observe whether sequence context contributed to the differences in the spectra, we identified the top three substitutions for each polymerase and extracted the sequences where the errors were made. These data, displayed as logo plots, are shown with 6 nucleotides upstream and 6 nucleotides downstream of the point of substitution (Supplemental Fig. S5). Misincorporations were found to be stochastic, as no sequence motif was observed that would result in a misincorporation for any polymerase. The only mispair for which a small contribution of the sequence is observed is dC:dA in bacterial Pol III. The misincorporation of an A across from a C is slightly more likely (∼5%) when the base upstream of the A is a C. It is important to note that the 2 kb sequence utilized in these studies lacks homopolymer regions and dinucleotide and trinucleotide repeat tracts, regions often found within genetic sequences that lead to polymerase mispairs and strand slippage, and therefore instability [45–47].
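A short sketch of how such context windows could be collected before plotting a logo is given below. It is an illustration under stated assumptions, not the published pipeline: `context_window` and `position_frequencies` are hypothetical helper names, positions are 0-based indices into the template, and substitutions too close to either end are skipped.

```python
from collections import Counter


def context_window(template, pos, flank=6):
    """Return the template bases from `flank` nt upstream to `flank` nt
    downstream of the substitution at 0-based index `pos`, or None if the
    window would run off either end of the template."""
    if pos - flank < 0 or pos + flank >= len(template):
        return None
    return template[pos - flank: pos + flank + 1]


def position_frequencies(windows):
    """Per-position base frequencies across equal-length windows; this is
    the kind of matrix a sequence-logo tool takes as input."""
    length = len(windows[0])
    freqs = []
    for i in range(length):
        column = Counter(w[i] for w in windows)
        freqs.append({b: column[b] / len(windows) for b in "ACGT"})
    return freqs


# Toy template and positions, invented for illustration only.
toy_template = "ACGTACGTACGTACGT"
window = context_window(toy_template, 7)
freqs = position_frequencies([window, window])
```

A flat frequency profile at every flanking position would correspond to the stochastic behaviour reported in the text, whereas an enriched base at one position would indicate a context effect such as the one noted for dC:dA in Pol III.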

An additional consideration is that major replicative polymerases operate within the context of many replisome components. For example, PolD functions with the minichromosomal maintenance helicase, PCNA processivity clamp, GINS complex and GINS-associated nuclease [29, 48]. Previous structural studies have shown that P. abyssi PolD contains two PIP-motifs that physically associate with PCNA, while biochemical studies report enhanced processivity of P. abyssi PolD in the presence of PCNA [29]. To explore whether PCNA also enhances PolD accuracy, fidelity measurements were performed in the presence of P. abyssi PCNA. However, no effect on polymerase fidelity or on the error profile was observed under the conditions tested (Supplemental Fig. S6A and C). In contrast, Pol III from E. coli requires the β-sliding clamp for processive DNA synthesis and will only synthesize 10–20 nucleotides in its absence; the β-clamp was therefore included in all Pol III fidelity measurements [49]. In the context of circular genomic DNA, the β-clamp requires an additional component, the clamp loader, that opens and loads the β-clamp onto DNA. Due to the nature of the substrate used in these fidelity measurements, i.e. linear DNA in which the primer is placed at the end of the construct, the clamp loader is not required for initial β-clamp loading, yet it may be important as the polymerase synthesizes away from a DNA end, potentially enhancing Pol III fidelity. Therefore, the fidelity of Pol III was also measured in the presence of the β-clamp loader. As with PCNA and PolD, this additional replisome component was not found to affect either polymerase fidelity or the error profile (Supplemental Fig. S6A and C).

Exploring the impact of dNTP pool imbalance on polymerase fidelity in archaeal family B DNA polymerase

For the DNA polymerase fidelity measurements described above, an equimolar dNTP pool ratio was chosen to enable direct comparison of replicative DNA polymerase error rates and profiles, which revealed that separate replicative DNA polymerase families are prone to unique error profiles. Further, an equimolar dNTP pool ratio is typically used in DNA amplification assays, such as PCR. Inspired by early RT and DNAP fidelity measurements that showed imbalanced dNTP pools can alter polymerase fidelity, we performed fidelity assays using imbalanced dNTP pool ratios and archaeal PolB as a representative replicative DNAP [50]. For each condition, three dNTPs were maintained at equimolar concentrations (200 µM final) while the fourth was reduced to one-fifth of that concentration (40 µM final). It is worth noting that while an equimolar dNTP ratio is typically used for biotechnological applications (such as PCR and sequencing), in vivo ratios are imbalanced. Therefore, we also measured archaeal PolB fidelity in the presence of previously determined in vivo P. abyssi ratios of dNTPs (∼2:1:1:0.3 dTTP:dGTP:dATP:dCTP, corresponding to 200 µM dTTP, 103 µM dGTP, 94 µM dATP, and 33 µM dCTP) [36].
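The pool compositions above reduce to simple arithmetic, sketched here for clarity. The concentrations are those stated in the text; the helper names `low_pool` and `pool_ratio` and the choice of 100 µM as the normalization reference are assumptions made for this illustration.

```python
DNTPS = ("dATP", "dCTP", "dGTP", "dTTP")

# Concentrations in µM, as stated in the text.
EQUIMOLAR = {n: 200.0 for n in DNTPS}
PAB_IN_VIVO = {"dTTP": 200.0, "dGTP": 103.0, "dATP": 94.0, "dCTP": 33.0}


def low_pool(limiting, high=200.0, fraction=0.2):
    """Pool with one dNTP reduced to `fraction` of the other three
    (200 µM -> 40 µM for the default one-fifth reduction)."""
    return {n: (high * fraction if n == limiting else high) for n in DNTPS}


def pool_ratio(pool, reference=100.0):
    """Express each concentration relative to a reference concentration.
    The 100 µM reference is an illustrative choice, not from the paper."""
    return {n: conc / reference for n, conc in pool.items()}


low_a = low_pool("dATP")
in_vivo_ratio = pool_ratio(PAB_IN_VIVO)
```

With this normalization the in vivo pool comes out at 2.00 : 1.03 : 0.94 : 0.33 for dTTP:dGTP:dATP:dCTP, which rounds to the ∼2:1:1:0.3 ratio quoted above.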

Taken together, five additional dNTP ratios were measured: Low A, Low G, Low C, and Low T, along with the in vivo P. abyssi (Fig. 4). PolB exhibited a ∼2-fold increase in error rates under Low A and Low C conditions, indicating that limiting dATP or dCTP compromises fidelity (Fig. 4A). In contrast, reducing dGTP or dTTP levels revealed a ∼2-fold decrease in error rates, enabling a slightly higher fidelity polymerase. Under conditions of equimolar dNTPs, PolB exhibits the highest propensity for misincorporating dG and dT (Equimolar; Fig. 4B) and accordingly it is possible to reduce the concentration of these highly misincorporated nucleotides to improve overall fidelity (LowG/LowT; Fig. 4A). In the presence of P. abyssi in vivo ratios of dNTPs we did not observe any statistically significant change in error rates compared to equimolar dNTP ratios (Fig. 4A).

Figure 4.

Archaeal PolB error rates and profiles with different nucleotide pool ratios. (A) Fidelity measurements, in substitutions per million bases, for PolB using different dNTP pools (equimolar: 200 µM each dNTP; Low N: 40 µM indicated dNTP and 200 µM other dNTPs; Pab in vivo derived ratio: 94 µM dATP, 200 µM dTTP, 33 µM dCTP, and 103 µM dGTP [31]). (B) Simplified substitution spectra for PolB with the different dNTP ratios. Substitutions are grouped by misincorporated base. For example, dG:dA, dC:dA, and dA:dA are grouped in the dA misincorporated group. (Full spectra are shown in Supplemental Fig. S7A.)

Overall, altered dNTP pool ratios induced consistent and predictable changes in the substitution spectra for PolB (Fig. 4B and Supplemental Fig. S7A and B). For example, reducing dATP levels not only decreased dA misincorporations, as expected, but also increased dG misincorporations, and vice versa (low dGTP decreased dG and increased dA misincorporations), indicating a purine-to-purine substitution bias (Fig. 4B and Supplemental Fig. S7A and B). Similarly, lowering dCTP levels reduced dC misincorporations while increasing dT misincorporations (and low dTTP increased dC misincorporations), reflecting a pyrimidine-to-pyrimidine substitution bias. Further, previously quantitated dNTP levels from P. abyssi revealed a 2:1:1:0.3 ratio of dTTP:dGTP:dATP:dCTP. The PolB substitution profile in the presence of in vivo P. abyssi levels is as expected, with dT misincorporations dominating errors, followed by dG, then dA, with minimal dC misincorporation. Previous studies have shown that a skewed nucleotide pool ratio (elevated dTTP and dCTP relative to dGTP and dATP) can alter the sequence context of mispairs, favoring regions that promote correct dTTP and dCTP incorporation in repair-deficient yeast strains [51]. We therefore examined the sequence context of all mispairs under all nucleotide pool conditions. We did not detect significant changes, likely because our construct lacks homopolymer regions and our pools were skewed by the reduction of a single nucleotide rather than by an overall imbalance (Supplemental Fig. S8). As expected, modulation of the dNTP pool composition can influence both the error rate and the mutational spectrum of DNA polymerases; this methodology enables accurate examination of these changes and offers a potential strategy for tailoring polymerase behavior in specific applications.
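The simplified spectra of Fig. 4B group the 12 substitution classes by the base that was misincorporated. A minimal sketch of that grouping follows; the function name `group_by_misincorporated_base` and the toy fractions are assumptions for illustration and are not values from the paper.

```python
def group_by_misincorporated_base(spectrum):
    """Collapse a template:nascent spectrum (e.g. {"dG:dA": 0.1, ...})
    into four groups keyed by the nascent, misincorporated base."""
    grouped = {"dA": 0.0, "dC": 0.0, "dG": 0.0, "dT": 0.0}
    for mispair, fraction in spectrum.items():
        _template, nascent = mispair.split(":")
        grouped[nascent] += fraction
    return grouped


# Toy fractions, invented for illustration only.
toy_spectrum = {
    "dG:dA": 0.10,
    "dC:dA": 0.05,
    "dA:dA": 0.05,
    "dG:dT": 0.50,
    "dT:dG": 0.30,
}
grouped = group_by_misincorporated_base(toy_spectrum)
```

In this toy case dG:dA, dC:dA, and dA:dA all land in the dA group, mirroring the example given in the figure legend. Comparing grouped values between pools makes the purine-to-purine and pyrimidine-to-pyrimidine shifts easy to see.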

Understanding the effect of the steric gate on fidelity of replicative DNA polymerases

PacBio fidelity assays revealed that different DNA polymerase families have different error profiles (Fig. 3B); we therefore aimed to understand what dictates this diversity. Our exonuclease-deficient polymerases exhibited error profiles nearly identical to those of their wt counterparts, suggesting that the exonuclease domain contributes minimally to polymerase error diversity. Therefore, it was of interest to understand whether the polymerase active site dictates the error profile of each DNA polymerase family, as has been previously suggested using single-nucleotide incorporation kinetics for family B RB69 DNA polymerase [52]. While all DNA polymerase families contain aspartic acid catalytic residues within the DNA polymerase active site, the surrounding amino acids that stabilize the incoming nucleotide, DNA primer, and DNA template are distinct. Furthermore, all replicative DNA polymerases contain a bulky noncatalytic residue, termed the steric gate, within the polymerase active site that is responsible for excluding rNTPs from binding [53]. In vivo, DNA polymerases must cope with the presence of rNTPs, which are at higher concentrations than dNTPs [36, 54, 55]. Archaeal family B polymerases contain a tyrosine steric gate residue, while family D polymerases contain a histidine (Fig. 5A) [32, 33]. If the steric gate is mutated to an alanine, the polymerase can incorporate a single rNTP as readily as a dNTP, consistent with observations in other DNA polymerase steric gate mutants [53]. Further, steric gate mutants have also been shown to incorporate a wide variety of modified nucleotides, suggesting this mutation opens the polymerase active site [56–59].

Figure 5.

The effect of the steric gate residue on archaeal replicative DNA polymerase fidelity. (A) Active site of PolB from T. kodakarensis (PDB: 5OMF), and PolD from P. abyssi (active site from PDB: 8PPT and nucleotide modeled from RNAP PDB: 2O5J). (B) Fidelity measurements, in substitutions per million bases, for archaeal PolB exo-, PolD exo-, and their corresponding steric gate mutants. (C) Substitution spectra for exo- and exo-/steric gate mutants of PolB and PolD. Legend corresponds to template strand:nascent strand mispair (i.e. dG:dA = template strand dG, nascent strand dA).

In order to understand the effect of the steric gate residue on DNA polymerase fidelity, and to probe whether the architecture of the active site alters DNA polymerase error profiles, we mutated the steric gate residue of archaeal PolB and PolD to alanine within the context of an exonuclease-deficient polymerase. When comparing PolB exo- to PolB SG- exo-, we observe a two-fold decrease in fidelity when the steric gate is mutated, with ∼390 errors/million (3.9 × 10⁻⁴) for PolB exo- and ∼756 errors/million (7.6 × 10⁻⁴) for PolB SG- exo- (Fig. 5B). Similar results were observed for both HIV-RT, an RNA-dependent DNA polymerase, and Klenow (Pol I), which have the same catalytic fold (Klenow-like) and steric gate residue (tyrosine) as PolB; mutating the steric gate to alanine resulted in a higher misincorporation rate and lower fidelity [60, 61]. Interestingly, PolD displays the opposite result, with the fidelity of the steric gate mutant PolD SG- exo- (∼15 substitutions/million, 1.5 × 10⁻⁵) >20-fold higher than the fidelity of PolD exo- (333 substitutions/million, 3.3 × 10⁻⁴) (Fig. 5B). The DPBB polymerase active site found within PolD is typically found in multi-subunit RNAPs [26]. To prevent ribonucleotide incorporation, PolD evolved a “PHT” selectivity loop within the active site, where H corresponds to the histidine steric gate residue [33]. It is plausible that PolD acquired the ability to exclude ribonucleotides at the cost of fidelity. We further measured the fidelity of alanine mutants of the proline and threonine residues of the selectivity loop (P922 and T924 in P. abyssi PolD, respectively), with P922A and T924A increasing fidelity by 2.4- and 3.6-fold, respectively (Supplemental Fig. S6B).
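The fold changes quoted above follow directly from the reported substitution counts per million bases. The short sketch below reproduces that arithmetic; the function names are illustrative and the input values are the approximate figures given in the text.

```python
def per_base_rate(substitutions_per_million):
    """Convert substitutions per million bases to a per-base error rate."""
    return substitutions_per_million / 1_000_000


def fold_change(rate_a, rate_b):
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


# Approximate values as reported in the text (substitutions per million bases).
polb_exo, polb_sg_exo = 390, 756
pold_exo, pold_sg_exo = 333, 15

polb_fold = fold_change(polb_sg_exo, polb_exo)   # error rate rises about 2-fold
pold_fold = fold_change(pold_exo, pold_sg_exo)   # error rate falls more than 20-fold
```

Here `polb_fold` is about 1.9 and `pold_fold` is about 22, matching the two-fold decrease in fidelity for PolB and the >20-fold increase for PolD described above.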

Notably, mutation of the steric gate not only altered the error rates of archaeal PolB and PolD but also remodeled their error profiles (Fig. 5C and Supplemental Fig. S9). For example, the dT:dG mispair becomes more frequent (from 19% to 41%) for PolB, ultimately dominating the error profile. For PolD, dA misincorporations are the most prevalent (making up ∼50% of all errors), with dA:dA mispairs appearing most often. This suggests that the steric gate residue plays an important role not only in preventing ribonucleotide incorporation, but also in modulating the fidelity of DNA replication. Interestingly, the mutational spectra for P922A and T924A of the PolD selectivity loop were similar to that of PolD wt, suggesting that only the steric gate (H923) of the selectivity loop dictates the substitution spectrum and not its neighboring amino acids (Supplemental Fig. S6D). Taken together, these results reveal that the DNA polymerase active site, and not the exonuclease active site, is responsible for dictating the error profile, while both play a role in overall error rates.

Discussion

Accurate DNA synthesis by replicative DNA polymerases is critical for genome stability and proper transfer of genetic information from mother cell to daughter cell. Further, accurate in vitro amplification of DNA is also critical for generating high-quality DNA for sequencing and nucleic-acid therapeutics, identification of sequence markers for disease characterization, and maintenance of sequence integrity for DNA assembly and cloning. Therefore, understanding the fidelity of DNA polymerases has important implications for both biology and biotechnology. Here, we have described the use and utility of a PacBio SMRT CCS sequencing workflow to accurately measure the fidelity of replicative DNA polymerases. This methodology overcomes many of the challenges inherent to previous NGS-based sequencing methods. First, many sequencing methods measure PCR errors produced by commercial polymerases during amplification, such as Taq, Q5, Kapa, and Phusion, requiring the polymerase of interest to be thermophilic and capable of thermocycling [13, 52, 43]. While other Illumina-based methods have been developed to measure the fidelity of nonthermophilic enzymes, PCR amplification steps are required prior to Illumina-sequencing [14, 18]. If an error is made within the early stages of amplification, this error will remain and be propagated in downstream cycles, accumulating as the DNA is amplified. Indeed, alternative sequencing methodologies, such as Duplex-sequencing, include the integration of unique barcodes prior to PCR amplification, making it possible to track and remove PCR amplification errors. However, this requires a specialized kit and analysis method, which are not always available [18]. Further, temperature changes, which occur throughout PCR amplification cycles, have previously been shown to alter DNA polymerase fidelity [39]. 
Finally, the cooling and heating cycles enacted during PCR cycling can induce DNA damage, such as cytosine or adenosine deamination, leading to the formation of uracil or inosine, respectively [62]. These deamination products are mutagenic, further confounding extraction of true polymerase errors from deamination events.

In this work, DNA polymerase synthesis is performed isothermally in a single step, followed by ligation of adapters and sequencing, leading to a simple method for highly accurate DNA polymerase fidelity measurements. Because the extension step is performed isothermally, it enables fidelity measurements of any DNA polymerase that is capable of synthesizing 2 kb fragments, additionally eliminating PCR polymerase errors and heat-induced DNA damage. Due to the inherent long-read sequencing nature of PacBio, the methodology described here is the only one that allows fidelity measurements on long DNA constructs which could, in theory, contain an entire gene, or portion of a gene, to understand the mutational spectrum of a DNA polymerase in a particular sequence context. While we used 2 kb fragments in this work, theoretically much longer substrates could be utilized as long as the read quality, as determined by PacBio CCS analysis, is high. Different polymerase mutants, DNA constructs, temperatures, buffers, metal ions, and noncanonical nucleotides are just a few of the variables that can be modulated during primer extension to investigate their effect on DNA polymerase fidelity. Due to barcoding capabilities, many different conditions can be explored and sequenced in a single sequencing cell. For example, here, 45 unique experiments were pooled and run in a single SMRT cell. Additionally, due to PacBio CCS, the limit of detection is dependent on the number of unique DNA molecules produced and sequenced and the accuracy of PacBio HiFi CCS analysis. Current PacBio HiFi CCS accuracy for deep sequencing, as used in this methodology, exceeds Q50 (i.e. 1 error per 100 000 bases), making it the highest-accuracy sequencing method to date, ∼100-fold more accurate than Oxford Nanopore sequencing (Q30, 1 error in 1000, for duplex ONT reads), the only other commercially available long-read sequencing technique, and about 10-fold more accurate than Illumina short-read sequencing (Q30–Q40, 1 error in 1000 to 1 error in 10 000) [63–66].
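The quality values quoted here map to error probabilities through the standard Phred relationship Q = −10·log10(p). The following sketch shows that conversion and reproduces the comparisons in the text; the function names are illustrative.

```python
import math


def phred_to_error(q):
    """Error probability for a Phred quality score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality score for an error probability: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


q50 = phred_to_error(50)  # about 1 error per 100 000 bases
q40 = phred_to_error(40)  # about 1 error per 10 000 bases
q30 = phred_to_error(30)  # about 1 error per 1000 bases

# Ratios behind the "∼100-fold" and "about 10-fold" comparisons in the text.
fold_vs_q30 = q30 / q50
fold_vs_q40 = q40 / q50
```

Since every 10 Phred units corresponds to a 10-fold change in error probability, Q50 is 100-fold more accurate than Q30 and 10-fold more accurate than Q40.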

While there are many clear advantages to the PacBio sequencing methodology described here, it is important to also point out pitfalls and bottlenecks. Currently, creation of the long ssDNA constructs required for this assay can be cumbersome, especially for ssDNAs that are not naturally occurring, like the construct used in this work. Indeed, we utilized an IVT/RT multistep workflow to create the desired 2 kb fragment, requiring up to one day to fully prepare. Bulk preparation of ssDNA, as described in the ‘Materials and methods’ section, can be done to reduce overall hands-on time and streamline subsequent experiments. Alternative methodologies have been utilized to synthesize long ssDNA, including asymmetric PCR, strand-selective exonuclease digestion, and more recently Methanol-Responsive Polymer PCR [4648]. The ease and utility of these methodologies will be explored in future experiments. Further, the workflow described here requires the DNAP of interest to be capable of synthesizing a 2 kb fragment. Indeed, we tried unsuccessfully to measure the fidelity of human Polβ, a family X DNAP involved in gap-filling. This failure was likely due to its nonprocessive nature. We are currently adapting this methodology, utilizing DNA constructs that contain small, randomized gaps, to measure the fidelity of nonprocessive DNA polymerases. It is important to note that access to PacBio sequencing is limited and its cost may prohibit widespread adoption and utility of this workflow. A lower cost, high-throughput benchtop PacBio sequencing instrument (named Vega) has recently become available. The applicability and compatibility of this methodology for such an instrument will also be explored. Finally, as with all in vitro fidelity assays, measurements are inherently dependent on conditions such as buffer composition, dNTP concentration, metal ions, and temperature. Careful consideration of these parameters is important to place fidelity measurements in the broader context of in vivo DNA replication.

Utilizing this methodology, we were able to directly compare the fidelity of four replicative DNA polymerase families from bacteria and archaea, which exhibit diversity in both polymerase and exonuclease domains. While it is not surprising that the exonuclease activity enhanced fidelity for all replicative DNA polymerases, as this has been extensively explored in previous DNA polymerase fidelity studies, this enhancement varied across polymerases. Interestingly, Pol I, Pol III, and PolB all contain a DnaQ-like exonuclease active site, but the contribution of the active site varied amongst the polymerases, with as high as 15-fold enhancement in PolB to as low as two-fold in Pol III. Other methodologies have examined the effect of the DnaQ-like exonuclease site on polymerase fidelity and have shown anywhere from 4- to 46-fold enhancement in fidelity, suggesting a high degree of variability depending on the DNA polymerase, experimental conditions, and the assay used to measure fidelity [8, 16, 39]. Future studies will measure the kinetic rate of mismatch removal by each DnaQ-like active site to ascertain whether faster cleavage rates confer increased fidelity.

We observed that different polymerase families have different error profiles. These profiles seem to be conserved within a family and domain of life and are a consequence of polymerase active site architecture, dNTP pool ratios, and likely many other factors, including buffer conditions and the nature of the active site metal ion. Further, for both bacterial and archaeal replicative DNA polymerases, the accessory polymerase implicated in Okazaki fragment maturation produced lower in vitro error rates than its major replicative polymerase counterpart, likely because these polymerases need to function on their own and not as part of the replisome [41, 44, 67]. Further, inherent DNAP processivity may also influence fidelity, as the processivity of the polymerases tested ranges from ∼10–50 nts for Klenow to thousands of nts for E. coli Pol III + β-clamp on linear substrates [68, 69]. In vivo mutational rates in various organisms have been reported to be on the order of 10⁻⁹ to 10⁻¹⁰, likely due to the presence of a highly organized and regulated replication system, inherent DNA repair pathways, and optimized buffer, metal ion and dNTP pool concentrations [70–72]. Due to the inherent high accuracy of this methodology, exploration and understanding of such high fidelity is possible and will be pursued further. In vivo, there is a delicate balance at play between DNA polymerase mutations, DNA repair pathways that remove these mutations, and persistence of mutations to drive evolution and selective advantage amongst populations. Indeed, archaea and bacteria contain two very different mismatch repair systems, with many archaea, Actinobacteria, and some Deinococcus-Thermus species containing NucS, and most bacteria and some Euryarchaeota containing MutS/MutL [73–75]. Interestingly, both repair systems exhibit a preference for removal of dG:dT mismatches, the dominant mispair created by three out of the four replicative DNA polymerase families, while the least incorporated mispair for all replicative polymerases, dC:dC, is resistant to cleavage by both repair systems, suggesting co-evolution of DNAP fidelity and DNA mismatch repair (Fig. 3) [74, 76].

All DNA polymerases make mistakes, which is an important aspect of biology as it leads to evolution and speciation. Random mismatch incorporation by DNA polymerases drives genetic variation, which is important for organism adaptation to external and internal pressures. Interestingly, low fidelity DNA polymerases are also present in nature and perform crucial roles. For instance, a bacterial C-family DNA polymerase, DnaE2, is a highly mutagenic DNA polymerase which lacks proofreading activity and drives antibiotic resistance [77, 78]. In higher eukaryotes, Polη, which is also highly mutagenic, creates variability among antibodies and is the basis of adaptive immunity [79, 80]. Quantifying and modulating DNA polymerase fidelity, as demonstrated in this work, enhances our understanding of the molecular mechanisms underlying both high- and low-fidelity DNA replication. This work has broad implications for studying species evolution and the origins of genetic mutations, as well as for optimizing the production of high-quality DNA for biotechnological and therapeutic applications.

Supplementary Material

gkaf1143_Supplemental_Files

Acknowledgements

We would like to thank Eric Beguec, as well as the leadership of New England Biolabs Inc., for fostering a supportive research environment. We wish to acknowledge and thank Sean Lund for construction of vectors containing E. coli Pol III α (dnaE), ε (dnaQ), θ (holE), and the β-sliding clamp (dnaN). We wish to thank Tasha José for thoughtful figure critique and guidance. We also thank Joseph Loparo and Seungwoo Chang at Harvard Medical School for providing us with the sliding clamp loader.

Author contributions: Leonardo Betancurt-Anzola (Conceptualization [supporting], Data curation [supporting], Formal analysis [equal], Investigation [lead], Methodology [supporting], Visualization [lead], Writing – original draft [equal], Writing – review & editing [equal]), Vladimir Potapov (Conceptualization [supporting], Data curation [lead], Formal analysis [lead], Methodology [equal], Software [lead], Validation [equal], Writing – review & editing [equal]), Killian C. O’Connell (Formal analysis [supporting], Investigation [supporting], Visualization [supporting], Writing – review & editing [equal]), Nathan A. Tanner (Supervision [supporting], Writing – review & editing [supporting]), Jennifer L. Ong (Methodology [lead]), Ludovic Sauguet (Conceptualization [equal], Funding acquisition [equal], Methodology [supporting], Supervision [equal], Writing – review & editing [equal]), and Kelly M. Zatopek (Conceptualization [equal], Funding acquisition [equal], Methodology [supporting], Supervision [equal], Visualization [equal], Writing – original draft [equal], Writing – review & editing [equal])

Contributor Information

Leonardo Betancurt-Anzola, Architecture and Dynamics of Biological Macromolecules, Institut Pasteur, Université Paris Cité, CNRS, UMR 3528, Paris, France; New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States; New England Biolabs France, 5 Rue Henri Auguste Desbruères, 91000 Évry-Courcouronnes, France; École Doctorale Complexité du vivant ED 515, Sorbonne Université, 7 Quai Saint-Bernard, 75005 Paris, France.

Killian C O’Connell, New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States.

Vladimir Potapov, New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States.

Jennifer L Ong, New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States.

Nathan A Tanner, New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States.

Ludovic Sauguet, Architecture and Dynamics of Biological Macromolecules, Institut Pasteur, Université Paris Cité, CNRS, UMR 3528, Paris, France.

Kelly M Zatopek, New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, United States.

Supplementary data

Supplementary data is available at NAR online.

Conflict of interest

K.M.Z., V.P., K.C.O., N.A.T., and J.L.O. are employed and funded by New England Biolabs, Inc., a manufacturer and vendor of molecular biology reagents, including DNA polymerases.

Funding

L.S. was supported by Agence Nationale de la Recherche [ANR-20-CE11-0003]; L.B.A was funded by a Cifre doctoral grant from Association Nationale de la Recherche et de la Technologie [2021/0125]. K.C.O., V.P., J.L.O., N.A.T., and K.M.Z. were funded through New England Biolabs. Funding to pay the Open Access publication charges for this article was provided by New England Biolabs.

Data availability

Raw PacBio sequencing data pertaining to this study have been deposited in the Sequence Read Archive under accession number PRJNA1306533. Custom software tools for processing raw PacBio sequencing data to extract DNA polymerase fidelity parameters are available on Zenodo at https://doi.org/10.5281/zenodo.17379561. Processed PacBio sequencing data are available in online supplementary material.

References

  • 1. Henninger EE, Pursell ZF. DNA polymerase ε and its roles in genome stability. IUBMB Life. 2014;66:339–51. doi: 10.1002/iub.1276.
  • 2. Rayner E, van Gool IC, Palles C. et al. A panoply of errors: polymerase proofreading domain mutations in cancer. Nat Rev Cancer. 2016;16:71–81. doi: 10.1038/nrc.2015.12.
  • 3. Preston BD, Albertson TM, Herr AJ. DNA replication fidelity and cancer. Semin Cancer Biol. 2010;20:281–93. doi: 10.1016/j.semcancer.2010.10.009.
  • 4. Schroeder JW, Hirst WG, Szewczyk GA. et al. The effect of local sequence context on mutational bias of genes encoded on the leading and lagging strands. Curr Biol. 2016;26:692–7. doi: 10.1016/j.cub.2016.01.016.
  • 5. Loh E, Salk JJ, Loeb LA. Optimization of DNA polymerase mutation rates during bacterial evolution. Proc Natl Acad Sci USA. 2010;107:1154–9. doi: 10.1073/pnas.0912451107.
  • 6. Aschenbrenner J, Marx A. DNA polymerases and biotechnological applications. Curr Opin Biotechnol. 2017;48:187–95. doi: 10.1016/j.copbio.2017.04.005.
  • 7. Bloom LB, Chen X, Fygenson DK. et al. Fidelity of Escherichia coli DNA polymerase III holoenzyme. The effects of β, γ complex processivity proteins and ε proofreading exonuclease on nucleotide misincorporation efficiencies. J Biol Chem. 1997;272:27919–30. doi: 10.1074/jbc.272.44.27919.
  • 8. Bebenek K, Joyce CM, Fitzgerald MP. et al. The fidelity of DNA synthesis catalyzed by derivatives of Escherichia coli DNA polymerase I. J Biol Chem. 1990;265:13878–87. doi: 10.1016/S0021-9258(18)77430-9.
  • 9. Xia S, Konigsberg WH. RB69 DNA polymerase structure, kinetics, and fidelity. Biochemistry. 2014;53:2752–67. doi: 10.1021/bi4014215.
  • 10. Eger BT, Benkovic SJ. Minimal kinetic mechanism for misincorporation by DNA polymerase I (Klenow fragment). Biochemistry. 1992;31:9227–36. doi: 10.1021/bi00153a016.
  • 11. Tindall KR, Kunkel TA. Fidelity of DNA synthesis by the Thermus aquaticus DNA polymerase. Biochemistry. 1988;27:6008–13. doi: 10.1021/bi00416a027.
  • 12. Keith BJ, Jozwiakowski SK, Connolly BA. A plasmid-based lacZα gene assay for DNA polymerase fidelity measurement. Anal Biochem. 2013;433:153–61. doi: 10.1016/j.ab.2012.10.019.
  • 13. Eckert KA, Kunkel TA. DNA polymerase fidelity and the polymerase chain reaction. PCR Methods Appl. 1991;1:17–24. doi: 10.1101/gr.1.1.17.
  • 14. Potapov V, Ong JL. Examining sources of error in PCR by single-molecule sequencing. PLoS One. 2017;12:e0169774. doi: 10.1371/journal.pone.0169774.
  • 15. Hestand MS, Van Houdt J, Cristofoli F. et al. Polymerase specific error rates and profiles identified by single molecule sequencing. Mutat Res. 2016;784–785:39–45. doi: 10.1016/j.mrfmmm.2016.01.003.
  • 16. Lee DF, Lu J, Chang S. et al. Mapping DNA polymerase errors by single-molecule sequencing. Nucleic Acids Res. 2016;44:e118. doi: 10.1093/nar/gkw436.
  • 17. Zamft BM, Marblestone AH, Kording K. et al. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing. PLoS One. 2012;7:e43876. doi: 10.1371/journal.pone.0043876.
  • 18. Kennedy SR, Schmitt MW, Fox EJ. et al. Detecting ultralow-frequency mutations by duplex sequencing. Nat Protoc. 2014;9:2586–606. doi: 10.1038/nprot.2014.170.
  • 19. Schmitt MW, Kennedy SR, Salk JJ. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci USA. 2012;109:14508–13. doi: 10.1073/pnas.1208715109.
  • 20. de Paz AM, Cybulski TR, Marblestone AH. et al. High-resolution mapping of DNA polymerase fidelity using nucleotide imbalances and next-generation sequencing. Nucleic Acids Res. 2018;46:e78. doi: 10.1093/nar/gky296.
  • 21. Kinde I, Wu J, Papadopoulos N. et al. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci USA. 2011;108:9530–5. doi: 10.1073/pnas.1105422108.
  • 22. Wenger AM, Peluso P, Rowell WJ. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37:1155–62. doi: 10.1038/s41587-019-0217-9.
  • 23. Potapov V, Fu X, Dai N. et al. Base modifications affecting RNA polymerase and reverse transcriptase fidelity. Nucleic Acids Res. 2018;46:5753–63. doi: 10.1093/nar/gky341.
  • 24. Steitz TA. DNA polymerases: structural diversity and common mechanisms. J Biol Chem. 1999;274:17395–8. doi: 10.1074/jbc.274.25.17395.
  • 25. Lamers MH, O’Donnell M. A consensus view of DNA binding by the C family of replicative DNA polymerases. Proc Natl Acad Sci USA. 2008;105:20565–6. doi: 10.1073/pnas.0811279106.
  • 26. Sauguet L, Raia P, Henneke G. et al. Shared active site architecture between archaeal PolD and multi-subunit RNA polymerases revealed by X-ray crystallography. Nat Commun. 2016;7:12227. doi: 10.1038/ncomms12227.
  • 27. Raia  P, Delarue  M, Sauguet  L. An updated structural classification of replicative DNA polymerases. Biochem Soc Trans. 2019;47:239–49. 10.1042/BST20180579. [DOI] [PubMed] [Google Scholar]
  • 28. Barros  T, Guenther  J, Kelch  B.  et al.  A structural role for the PHP domain in E. coli DNA polymerase III. BMC Struct Biol. 2013;13:8. 10.1186/1472-6807-13-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Madru  C, Henneke  G, Raia  P.  et al.  Structural basis for the increased processivity of D-family DNA polymerases in complex with PCNA. Nat Commun. 2020;11:1591. 10.1038/s41467-020-15392-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Gouge  J, Ralec  C, Henneke  G.  et al.  Molecular recognition of canonical and deaminated bases by P. abyssi family B DNA polymerase. J. Mol. Biol.  2012;423:315–36. 10.1016/j.jmb.2012.07.025. [DOI] [PubMed] [Google Scholar]
  • 31. Bernad  A, Blanco  L, Lázaro  JM.  et al.  A conserved 3’-5’ exonuclease active site in prokaryotic and eukaryotic DNA polymerases. Cell. 1989;59:219–28. 10.1016/0092-8674(89)90883-0. [DOI] [PubMed] [Google Scholar]
  • 32. Gardner  A. Determinants of nucleotide sugar recognition in an archaeon DNA polymerase. Nucleic Acids Res. 1999;27:2545–53. 10.1093/nar/27.12.2545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Zatopek  KM, Alpaslan  E, Evans  TC.  et al.  Novel ribonucleotide discrimination in the RNA polymerase-like two-barrel catalytic core of family D DNA polymerases. Nucleic Acids Res. 2020;48:12204–18. 10.1093/nar/gkaa986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Tanner  NA, Hamdan  SM, Jergic  S.  et al.  Single-molecule studies of fork dynamics in Escherichia coli DNA replication. Nat Struct Mol Biol. 2008;15:170–6. 10.1038/nsmb.1381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Kong  H, Kucera  RB, Jack  WE. Characterization of a DNA polymerase from the hyperthermophile archaea Thermococcus litoralis. Vent DNA polymerase, steady state kinetics, thermal stability, processivity, strand displacement, and exonuclease activities. J Biol Chem. 1993;268:1965–75. 10.1016/S0021-9258(18)53949-1. [DOI] [PubMed] [Google Scholar]
  • 36. Lemor  M, Kong  Z, Henry  E.  et al.  Differential activities of DNA polymerases in processing ribonucleotides during DNA synthesis in Archaea. J Mol Biol. 2018;430:4908–24. 10.1016/j.jmb.2018.10.004. [DOI] [PubMed] [Google Scholar]
  • 37. Eid  J, Fehr  A, Gray  J.  et al.  Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–8. 10.1126/science.1162986. [DOI] [PubMed] [Google Scholar]
  • 38. Eckert  KA, Kunkel  TA. High fidelity DNA synthesis by the Thermus aquaticus DNA polymerase. Nucleic Acids Res. 1990;18:3739–44. 10.1093/nar/18.13.3739. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Xue  Y, Braslavsky  I, Quake  SR. Temperature effect on polymerase fidelity. J Biol Chem. 2021;297:101270. 10.1016/j.jbc.2021.101270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Okazaki  R, Arisawa  M, Sugino  A. Slow joining of newly replicated DNA chains in DNA polymerase I-deficient Escherichia coli mutants. Proc Natl Acad Sci USA. 1971;68:2954–7. 10.1073/pnas.68.12.2954. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Greenough  L, Kelman  Z, Gardner  AF. The roles of family B and D DNA polymerases in Thermococcus species 9°N okazaki fragment maturation. J Biol Chem. 2015;290:12514–22. 10.1074/jbc.M115.638130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Joyce  CM, Sun  XC, Grindley  ND. Reactions at the polymerase active site that contribute to the fidelity of Escherichia coli DNA polymerase I (Klenow fragment). J Biol Chem. 1992;267:24485–500. 10.1016/S0021-9258(18)35792-2. [DOI] [PubMed] [Google Scholar]
  • 43. Rejali  NA, Moric  E, Wittwer  CT. The effect of single mismatches on primer extension. Clin Chem. 2018;64:801–9. 10.1373/clinchem.2017.282285. [DOI] [PubMed] [Google Scholar]
  • 44. Cubonová  L, Richardson  T, Burkhart  BW.  et al.  Archaeal DNA polymerase D but not DNA polymerase B is required for genome replication in Thermococcus kodakarensis. J Bacteriol. 2013;195:2322–8. 10.1128/JB.02037-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Streisinger  G, Okada  Y, Emrich  J.  et al.  Frameshift mutations and the genetic code. Cold Spring Harb Symp Quant Biol. 1966;31:77–84. 10.1101/SQB.1966.031.01.014. [DOI] [PubMed] [Google Scholar]
  • 46. Lovett  ST. Encoded errors: mutations and rearrangements mediated by misalignment at repetitive DNA sequences. Mol Microbiol. 2004;52:1243–53. 10.1111/j.1365-2958.2004.04076.x. [DOI] [PubMed] [Google Scholar]
  • 47. Lujan  SA, Zhou  Z-X, Kunkel  TA. Evidence that transient replication errors initiate nuclear genome mutations. Nucleic Acids Res. 2025;53:gkaf679. 10.1093/nar/gkaf679. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Oki  K, Nagata  M, Yamagami  T.  et al.  Family D DNA polymerase interacts with GINS to promote CMG-helicase in the archaeal replisome. Nucleic Acids Res. 2022;50:3601–15. 10.1093/nar/gkab799. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Fay  PJ, Johanson  KO, McHenry  CS.  et al.  Size classes of products synthesized processively by DNA polymerase III and DNA polymerase III holoenzyme of Escherichia coli. J Biol Chem. 1981;256:976–83. 10.1016/S0021-9258(19)70075-1. [DOI] [PubMed] [Google Scholar]
  • 50. Bebenek  K, Roberts  JD, Kunkel  TA. The effects of dNTP pool imbalances on frameshift fidelity during DNA replication. J Biol Chem. 1992;267:3589–96. 10.1016/S0021-9258(19)50565-8. [DOI] [PubMed] [Google Scholar]
  • 51. Watt  DL, Buckland  RJ, Lujan  SA.  et al.  Genome-wide analysis of the specificity and mechanisms of replication infidelity driven by imbalanced dNTP pools. Nucleic Acids Res. 2016;44:1669–80. 10.1093/nar/gkv1298. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Xia  S, Wang  J, Konigsberg  WH. DNA mismatch synthesis complexes provide insights into base selectivity of a B family DNA polymerase. J Am Chem Soc. 2013;135:193–202. 10.1021/ja3079048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Brown  JA, Suo  Z. Unlocking the sugar ‘steric gate’ of DNA polymerases. Biochemistry. 2011;50:1135–42. 10.1021/bi101915z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Buckstein  MH, He  J, Rubin  H. Characterization of nucleotide pools as a function of physiological state in Escherichia coli. J Bacteriol. 2008;190:718–26. 10.1128/JB.01020-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Ferraro  P, Franzolin  E, Pontarin  G.  et al.  Quantitation of cellular deoxynucleoside triphosphates. Nucleic Acids Res. 2010;38:e85. 10.1093/nar/gkp1141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Nevin  P, Engen  JR, Beuning  PJ. Steric gate residues of Y-family DNA polymerases DinB and pol κ are crucial for dNTP-induced conformational change. DNA Repair (Amst). 2015;29:65–73. 10.1016/j.dnarep.2015.01.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Donigan  KA, McLenigan  MP, Yang  W.  et al.  The steric gate of DNA polymerase ι regulates ribonucleotide incorporation and deoxyribonucleotide fidelity. J Biol Chem. 2014;289:9136–45. 10.1074/jbc.M113.545442. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Vaisman  A, Kuban  W, McDonald  JP.  et al.  Critical amino acids in Escherichia coli UmuC responsible for sugar discrimination and base-substitution fidelity. Nucleic Acids Res. 2012;40:6144–57. 10.1093/nar/gks233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Gardner  AF, Jackson  KM, Boyle  MM.  et al.  Therminator DNA polymerase: modified nucleotides and unnatural substrates. Front Mol Biosci. 2019;6:28. 10.3389/fmolb.2019.00028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Cases-Gonzalez  CE, Gutierrez-Rivas  M, Ménendez-Arias  L. Coupling ribose selection to fidelity of DNA synthesis. The role of tyr-115 of human immunodeficiency virus type 1 reverse transcriptase. J Biol Chem. 2000;275:19759–67. 10.1074/jbc.M910361199. [DOI] [PubMed] [Google Scholar]
  • 61. Minnick  DT, Liu  L, Grindley  NDF.  et al.  Discrimination against purine-pyrimidine mispairs in the polymerase active site of DNA polymerase I: a structural explanation. Proc Natl Acad Sci USA. 2002;99:1194–9. 10.1073/pnas.032457899. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Chen  L, Liu  P, Evans  TC, Ettwiller  LM, DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification. Science. 2017;355:752–6. [DOI] [PubMed] [Google Scholar]
  • 63. Uelze  L, Borowiak  M, Bönn  M.  et al.  German-wide interlaboratory study compares consistency, accuracy and reproducibility of whole-genome short read sequencing. Front. Microbiol.  2020;11:573972. 10.3389/fmicb.2020.573972. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Oxford Nanopore Technologies . Oxford Nanopore announces technology updates at Nanopore Community Meeting. 2021. https://nanoporetech.com/news/news-oxford-nanopore-announces-technology-updates-nanopore-community-meeting  (6 November 2025, date last accessed). [Google Scholar]
  • 65. Deamer  D, Akeson  M, Branton  D. Three decades of nanopore sequencing. Nat Biotechnol. 2016;34:518–24. 10.1038/nbt.3423. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Kingan  S, Kronenberg  Z, Wenger  A. Beyond contiguity: evaluating the accuracy of de novo genome assemblies. 2020. https://www.pacb.com/wp-content/uploads/Kingan-PAG-2020-Beyond-contiguity-evaluating-the-accuracy-of-de-novo-genome-assemblies.pdf  (6 November 2025, date last accessed). [Google Scholar]
  • 67. Łazowski  K, Woodgate  R, Fijalkowska  IJ. Escherichia coli DNA replication: the old model organism still holds many surprises. FEMS Microbiol Rev. 2024;48:fuae018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. O’Donnell  ME, Kornberg  A. Dynamics of DNA polymerase III holoenzyme of Escherichia coli in replication of a multiprimed template. J Biol Chem. 1985;260:12875–83. [PubMed] [Google Scholar]
  • 69. Bell  JB, Eckert  KA, Joyce  CM.  et al.  Base miscoding and strand misalignment errors by mutator Klenow polymerases with amino acid substitutions at tyrosine 766 in the O helix of the fingers subdomain. J Biol Chem. 1997;272:7345–51. [DOI] [PubMed] [Google Scholar]
  • 70. Drake  JW. A constant rate of spontaneous mutation in DNA-based microbes. Proc Natl Acad Sci USA. 1991;88:7160–4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Lee  H, Popodi  E, Tang  H.  et al.  Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc Natl Acad Sci USA. 2012;109:E2774–2783. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Carlson  J, Locke  AE, Flickinger  M.  et al.  Extremely rare variants reveal patterns of germline mutation rate heterogeneity in humans. Nat Commun. 2018;9:3753. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Sachadyn  P. Conservation and diversity of MutS proteins. Mutat Res. 2010;694:20–30. [DOI] [PubMed] [Google Scholar]
  • 74. Ishino  S, Nishi  Y, Oda  S.  et al.  Identification of a mismatch-specific endonuclease in hyperthermophilic Archaea. Nucleic Acids Res. 2016;44:2977–86. 10.1093/nar/gkw153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Castañeda-García  A, Prieto  AI, Rodríguez-Beltrán  J.  et al.  A non-canonical mismatch repair pathway in prokaryotes. Nat Commun. 2017;8:14246. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Iyer  RR, Pluciennik  A, Burdett  V.  et al.  DNA mismatch repair: functions and mechanisms. Chem Rev. 2006;106:302–23. 10.1021/cr0404794. [DOI] [PubMed] [Google Scholar]
  • 77. Boshoff  HIM, Reed  MB, Barry  CE.  et al.  DnaE2 polymerase contributes to in vivo survival and the emergence of drug resistance in Mycobacterium tuberculosis. Cell. 2003;113:183–93. 10.1016/S0092-8674(03)00270-8. [DOI] [PubMed] [Google Scholar]
  • 78. Ditse  Z, Lamers  MH, Warner  DF. DNA replication in Mycobacterium tuberculosis. Microbiol Spectr. 2017;5:5.2.20. 10.1128/microbiolspec.TBTB2-0027-2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Bertocci  B, De Smet  A, Berek  C.  et al.  Immunoglobulin κ light chain gene rearrangement is impaired in mice deficient for DNA polymerase μ. Immunity. 2003;19:203–11. [DOI] [PubMed] [Google Scholar]
  • 80. Weill  J-C, Reynaud  C-A. DNA polymerases in adaptive immunity. Nat Rev Immunol. 2008;8:302–12. 10.1038/nri2281. [DOI] [PubMed] [Google Scholar]

Data Availability Statement

Raw PacBio sequencing data pertaining to this study have been deposited in the Sequence Read Archive under accession number PRJNA1306533. Custom software tools for processing raw PacBio sequencing data to extract DNA polymerase fidelity parameters are available on Zenodo at https://doi.org/10.5281/zenodo.17379561. Processed PacBio sequencing data are available in the online supplementary material.


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press