Skip to main content
The Journal of Chemical Physics logoLink to The Journal of Chemical Physics
. 2012 Dec 7;137(21):214903. doi: 10.1063/1.4766363

The effects of diffusion on an exonuclease/nanopore-based DNA sequencing engine

Joseph E Reiner 1, Arvind Balijepalli 2,3,2,3, Joseph W F Robertson 2, Bryon S Drown 4, Daniel L Burden 4, John J Kasianowicz 2,a)
PMCID: PMC4108639  PMID: 23231259

Abstract

Over 15 years ago, the ability to electrically detect and characterize individual polynucleotides as they are driven through a single protein ion channel was suggested as a potential method for rapidly sequencing DNA, base-by-base, in a ticker tape-like fashion. More recently, a variation of this method was proposed in which a nanopore would instead detect single nucleotides cleaved sequentially by an exonuclease enzyme in close proximity to one pore entrance. We analyze the exonuclease/nanopore-based DNA sequencing engine using analytical theory and computer simulations that describe nucleotide transport. The available data and analytical results suggest that the proposed method will be limited to reading <80 bases, imposed, in part, by the short lifetime each nucleotide spends in the vicinity of the detection element within the pore and the ability to accurately discriminate between the four mononucleotides.

INTRODUCTION

The ability to sequence DNA rapidly, at low cost, is of significant interest because of its potential positive impact on health care and in many other applications. Over four decades ago, Maxam and Gilbert devised a method for sequencing DNA in which the biopolymer is chemically modified with labels and subsequently cleaved at specific bases with an organic compound.1 Sanger and colleagues developed another method based on dideoxynucleoside triphosphates as DNA chain terminators.2 Modern variations of the Sanger method use contemporary technology to sequence DNA templates by capillary electrophoresis with optical detection of fluorescent bases, and requires time consuming procedures.3

High-throughput methods based on sequencing by synthesis4 were refined in the 1990s, and are typically performed with instruments that are relatively large and expensive to operate. The method was recently miniaturized via microtiter plate technology that reads DNA sequences optically from fluorescent tags3,5 or electronically with CMOS/ISFET technology.6–8 The latter technique is particularly significant because the device is massively parallel and label-free.

The next logical steps in genomic DNA sequencing include reducing the number of copies needed (ideally, the limit of detection would be a single molecule, to avoid the need for PCR amplification9 with its attendant errors10), increasing the read length (to decrease the complexity of genome sequence reconstruction), increasing the measurement accuracy, and decreasing the time to read each base. Towards that goal, it was suggested that individual molecules of DNA might be sequenced in a ticker-tape fashion by threading them through a single nanopore and measuring the change each base makes to the pore's ionic current.11 However, this conceptually simple approach has been elusive,12 in part, because understanding the physics of polymer translocation through a single nanopore and controlling the rate of polynucleotide translocation have both been challenging.13–18

One variation on nanopore-based DNA sequencing suggested the use of an exonuclease attached to the cap domain of the α-hemolysin channel.19 In this scheme, the enzyme would cleave nucleotides, one at a time, from duplex DNA. Conceptually, each base would migrate serially into the pore, bind to a detection element there, and be uniquely identified by the degree of ionic current blockade. Because β-cyclodextrin binds mononucleotides,20–22 and fits well inside the nanopore formed by Staphylococcus aureus α-hemolysin, it was a logical choice for use as DNA base detection element (Fig. 1(a)). Based on the degree by which the four DNA mononucleotides reduced the conductance of the α-hemolysin channel with a β-cyclodextrin adapter in it (Fig. 1(b)),23 an excellent degree of separation of the four bases was achieved. The actual mean overlap between the peaks (8%) determined using a fit to the data with Voigt functions24 (Figs. 1(b)–1(e), indigo) is somewhat greater than the 1% reported with a Gaussian mixture model fit (Figs. 1(b)–1(e), orange).23 Nevertheless, the relatively high degree of accuracy of the β-cyclodextrin adapter suggested that the concept has merit and deserves a detailed theoretical analysis. Here, we study the critical parts of this system's performance, determine the probability that a given length polynucleotide can be accurately sequenced with the technology, and suggest how the method might be implemented.

FIG. 1.

FIG. 1.

Electrical discrimination between mononucleotides by a β-cyclodextrin adapter in a nanopore.23 (a) DNA sequencing engine19 based on exonuclease attached to the cap domain of the α-hemolysin ion channel. The enzyme cleaves DNA (red sphere), one base at a time, near the pore mouth. In order to correctly sequence DNA, every cleaved base has to enter the pore sequentially, reach the base discrimination site (dark blue), and transport completely through the pore. (b) A histogram of the ionic current blockades caused by dGMP, dTMP, dAMP, and dCMP binding to a β-cyclodextrin adapter in the pore are fit with four Gaussian (orange) or Voigt (indigo) functions.24 (c) The resultant fits demonstrate that the Voigt function better describes the data, and fully accounts for the overlap between neighboring peaks. The use of Gaussian functions overestimates the system's ability to separate the mononucleotides. The overlaps between neighboring fitted peaks for the Gaussian (d) and Voigt functions (e) are represented by their respective probability density functions on a log-linear scale. The fit of the Voigt function to the data shows that the accuracies of identifying the mononucleotides are G: 94%, T: 93%, A: 85%, and C: 95%, with an average accuracy of 92%.

MODEL

One-dimensional diffusion model and the probability of capture

First, we analyze theoretically the capture of single mononucleotides that are released near one entrance of a nanopore. The time dependent capture probability for an idealized diffusing particle is computed using solutions of the Fokker-Planck drift-diffusion equation.25–27 Closed form analytical solutions for this equation can usually be found for geometries with simple boundary conditions. The more complicated case of a particle diffusing in a semi-infinite solution in the vicinity of a nanopore has no known closed form solution because it requires solving and matching boundary conditions between two distinct regions (i.e., the nanopore and the bulk). The problem was simplified recently by assuming that a one-dimensional line can represent the nanopore with a perfect absorber at one end (the particle detector/reader inside the nanopore) and a radiating boundary (which describes the escape of a particle from the nanopore to the bulk) at the other.28,29

Each particle is initially located at the entrance of the nanopore, and the Fokker-Planck equation is solved to estimate the capture time probability distribution and from it the time-integrated capture probability. Figure 2 shows a schematic of the analytical model system. The diffusing particle propagator G under a biased flow in a one-dimensional nanopore is

Gt+νdGz=D2Gt2,z0,L, (1)

where vd and D are the particle drift velocity and diffusion coefficient, respectively. The drift includes electrophoretic and electro-osmotic (EO) contributions, which are functions of the applied transmembrane potential, V (see Appendix). A radiating boundary condition is applied at the pore exit (z = L) to model the loss of nucleotide particles into the bulk solution

κG(L,t)=DG(z,t)zz=L+νdG(L,t), (2)

where κ is an effective rate constant.28 For a pore coupled to a three-dimensional bulk solution, κ = 8Dd29 where d is the pore diameter. The location of the particle sink (the β-cyclodextrin adapter), which defines L, is assumed to be that defined in the original experiment.23

FIG. 2.

FIG. 2.

A schematic of the analytical model for the exonuclease/nanopore-based DNA sequencing engine. Each base (red) is released at a point above a pore of length L. The bulk solution is assumed to be infinite and the particle is captured at z = 0.

The boundary condition for particle capture by the detector located at z = 0 is G(z = 0, t) = 0. We assume the particles are initially uniformly distributed at the nanopore entrance. The localization is further assumed to occur over a region of thickness δ, such that

Gz,0=0,z0,Lδ1/δ,zLδ,L. (3)

The parameter of interest is the survival time probability density; ϕt=DGzz=0. G is solved from Eq. (1) by applying the absorption and radiating boundary conditions and working in Laplace space (see Appendix). This leads to the following expression for ϕ^s the Laplace transform of the survival time probability density:

ϕ^s=expα2coshy+β+α2sinhyy, (4)

where α = |vd| L/D, y2 = α2/4 + 2τs,τ = L2/2D, and β = κL/D.

RESULTS AND DISCUSSION

The capture probability density function, ϕ(t), is estimated by numerically inverting Eq. (4).40 Figure 3 shows a comparison between these capture probabilities and results from numerical simulations at applied potentials of V = 0 mV and 300 mV. The results show that an exonuclease-based nanopore sequencing device will be rate-limited by the turnover of exonuclease (∼1 ms to 10 ms) and not the capture rate of each base by the nanopore (∼10 ns to 100 ns). Therefore, we estimate the capture probability from the time-integrated capture probability density, which follows from Eq. (4) because ϕ^s=0=0ϕtdt,

ϕ^0=Pcap(V)=expα2coshα2+1+2βαsinhα2. (5)

The two parameters that describe the total capture probability are the nanopore's aspect ratio (L/d) and the voltage-dependent drift velocity vd. Figure 4 illustrates the voltage dependence of the time integrated capture probability estimated from the analytical model and numerical simulations with no free parameters. The results demonstrate the difficulty in capturing small diffusive particles. However, they also show that increasing the potential could increase the probability of capturing a given cleaved base. Thus, Fig. 4 suggests that a near certain capture probability can be achieved with transmembrane potentials that greatly exceed 1 V. Implicit in this discussion is the notion that the membrane is impervious to dielectric breakdown at extreme voltages.30–33 However, there are other experimental issues that preclude the use of such high voltages. For example, only a relatively small electrostatic potential drop (∼1% of the total field34–36) biases the random walk of each negatively charged mononucleotide into the pore, and most of the potential will drop across the β-cyclodextrin adapter in the nanopore (see Fig. 2 in Clarke et al.23). Thus, the corresponding mean residence time of each mononucleotide on the detector will decrease if the applied potential is increased.

FIG. 3.

FIG. 3.

Distributions of capture times for an individual DNA mononucleotide estimated from the inverse Laplace transform of Eq. (4) (solid lines) and numerical simulations of a three-dimensional model of the system (circles, squares) for two different applied potentials. The peak heights obtained with the analytical model were scaled to coincide with those estimated from the numerical simulations. Increasing the potential both increases the likelihood of capture and shifts the distribution to shorter capture times. A vast majority of capture events occur within the first 100 ns after cleavage of a mononucleotide. The same parameters were used in both the analytical model (Eq. (4)) and the numerical simulations. They are the mononucleotide electrophoretic mobility (μ = −2.4 × 10−4 cm2/Vs), the distance between the pore entrance and the capture point (L = 6 nm), the single nanopore ionic conductance (g = 295 pS), the number of water molecules that hydrate mobile cations and anions (NW = 10), the molar concentration of water (cW = 3.4 × 1028 m−3), the electronic charge (e = 1.6 × 10−19 C), the ratio of the nanopore's permeabilities to cations and anions (P+/P = 0.1), the DNA base diffusion coefficient (D = 3 × 10−6 cm2/s), and the nanopore diameter (d = 2.6 nm).

FIG. 4.

FIG. 4.

The capture probability as a function of the applied voltage estimated from simulations (open circles) and the analytical model (solid line), which has no adjustable parameters and uses the same values from Fig. 3 for the radiating boundary condition29 κ = 8Dd; β = 8Ld = 5.88. A one-parameter least squares fit of Eq. (5) to the simulated results with β = 4.7 ± 0.1 is also shown (dashed line). The slight disagreement between the simulation and theoretical model may result from the analytical model's assumption that the nanopore can be represented as a one-dimensional line.

Assuming that the reaction between the mononucleotides and the molecular adapter proceed with first order kinetics, the mean residence time of the mononucleotides in the αHL nanopore should decrease exponentially with potential, i.e., τ(V)=τ0eV/Vc, where Vc is the mean potential of the distribution. That was indeed the case for thymine with applied potentials between 80 mV and 200 mV (τ0 = 0.55 s and Vc = 47 mV),23 but not for the other three mononucleotides (see molecular adapter chemistry below). Thus, for relatively small applied potentials, the normalized probability density of the residence times will be ρV(t;τ(V))=et/τ(V)/τ(V), for t ⩾ 0 and where t is the time from capture for a given nucleotide. We make the simplifying assumption that a given base cannot be read if it is bound to the adapter for times shorter than the inverse bandwidth (B) of the detection system. This absolute limit ignores the possible increase in noise with detector bandwidth that may further limit the probability of correctly reading a base. Nevertheless, the assumption is instructive and points to a limitation of increasing the capture probability through increased voltage. The total probability of successfully reading a base follows from the normalized probability density,

Pread(V;B)=1/BρV(t;τ(V))dt=expexpVVcBτo. (6)

Assuming that the processes governing the capture and read of individual nucleotides are independent, we can multiply Eqs. (5) and (6) to calculate the total probability of sensing a single base

P(V;B)=expα2expexpV/VcBτocoshα2+1+2βαsinhα2. (7)

Figure 5 shows plots of the combined detection probability as a function of the transmembrane potential (Eq. (7)) for three detection bandwidths (B = 5 kHz, 30 kHz, and 500 kHz). The optimum probability, Pmax, follows from Eq. (7) by setting ∂P/∂V = 0 and is shown as a function of bandwidth in Fig. 5 (inset). The optimum read probability initially increases as a function of bandwidth, but saturates at ∼0.75. This result suggests that increasing the detection bandwidth beyond 500 kHz will have a limited impact on the efficiency of reading a given nucleotide and will most likely be worse given that detector noise will grow with increasing bandwidth. However, at high potentials, the mean residence time distributions for the negatively charged nucleotides on the molecular adapter may only decrease linearly with voltage, not exponentially, if the potential drop across the adapter is much greater than the binding energy. The effect of this transition between the two regimes will cause the capture probability to decrease more slowly with increasing bandwidth than is shown in Fig. 5. Nevertheless, the probability of detecting a base will be limited in this extreme voltage regime.

FIG. 5.

FIG. 5.

The probability of reading a single base correctly is dependent on the applied potential and the system bandwidth. At low applied potential, the capture process dominates the probability. At higher potential, the mean residence time of a base on the detector decreases exponentially, which significantly decreases the probability of identifying a given base. Increasing the system bandwidth (5 kHz blue; 30 kHz green; and 500 kHz red) increases the probability of correctly reading a base because it permits a greater potential to be used (which increases capture probability) and shorter-lived blockade events to be accurately measured. (Inset) Ignoring several sources of noise, the optimum read probability increases with increasing system bandwidth, but saturates at ∼0.75.

In addition to the system bandwidth, which limits the fastest blockade events that can be accurately measured, the relatively small number of ions that probe each base during fast events will also be a limiting factor. For example, ∼160 ions flow through the pore per microsecond under the experimental conditions reported elsewhere (400 mM KCl, 180 mV23), and therefore events that are measured at the 500 kHz bandwidth limit will have sufficiently large statistical counting error (∼8%).

Molecular adapter chemistry

The choice of β-cyclodextrin as a molecular adapter for detecting and discriminating the four DNA mononucleotides23 was sensible because the two species have affinity for each other.20–22 However, the question remains whether that particular adapter has precisely the right chemistry for sequencing DNA.

Experimentally, it was determined that 40 μM total bulk mononucleotide concentration causes ∼30 blockades s−1 of the αHL channel modified with the β-cyclodextrin adapter.23 An estimate of the collision rate between point particles in the bulk and pore mouth can be obtained using a relationship derived by Berg and Purcell.37 That rate (2DdC0, where D, d, and C0 are the mononucleotide diffusion coefficient, pore diameter, and mononucleotide concentration in the bulk, respectively) is ∼58 000 blockades s−1. Assuming that ∼60% of these particles reach the β-cyclodextrin adapter when V = 180 mV (Fig. 4), ∼30 000 events/second should have been observed. Thus, one might conclude that only ∼1 out of 1000 particles that collide with the pore mouth reached the adapter. However, it is more likely that many blockade events are simply too short-lived to detect given the electronic measurement bandwidth in the nanopore apparatus (∼10 kHz) and the additional digital filtering used to present the data (2 kHz).23

Indeed, the binding constants for the reactions between mononucleotides with β-cyclodextrin in the bulk are weak20–22 (i.e., K = koff/kon ∼ 10 mM to 100 mM, where koff and kon are the dissociation and association rate constants, respectively). This suggests the reaction will have relatively short mean lifetimes (τ = 1/koff ∼ 0.1 μs to 1 μs, assuming kon ∼ 109 M−1 s−1), consistent with experimentally measured values (τ = 0.75 μs to 2 μs21,22). If the binding constant for the complex in the pore is similar to that in the bulk, then most of the blockades would be too short-lived to resolve at 2 kHz bandwidth.23 Because Clarke et al. observed ∼0.1% of the events (see above), we estimate that the mean lifetimes are ∼70-fold greater when the adapter is attached to the inside of the pore, assuming the residence times of the bases on the adapter are exponentially distributed. The increase in mean residence time is due to the amines on the β-cyclodextrin.41 Finally, we note that the addition of one or more fixed positive charges on the adapter should facilitate considerably stronger binding with the mononucleotides, and thereby markedly increase the observed apparent capture rate.

Analysis of the exonuclease/nanopore-based DNA sequencing algorithm

Figure 5 illustrates the limitations on capturing and reading a single base cleaved by the exonuclease, but suggests that for a given electronic measurement bandwidth, we can select an optimum voltage to maximize the likelihood of correctly reading a single base with probability p. Assuming the probability of discriminating between the four bases is unity (but see Fig. 1) and that the system treats all bases alike, the probability of correctly sequencing the first m bases of a polynucleotide is pm. However, because p < 0.75 (Fig. 5), accurately sequencing a single polynucleotide molecule in this manner is not practical. The statistics could of course be improved by reading many identical copies of the molecule by either the same nanopore or by an array of identical pores.

We next describe two algorithms to address the question of how many identical DNA copies must be measured to deduce the polynucleotide's sequence with a given degree of accuracy. We first adapt a “survivor” algorithm38 in which a number of sequence measurements (i.e., subsequences), c, are aligned from one end, and an assignment of the sequence is based on the majority of assignments from each subsequence. On average, only cp subsequences survive the first “vote,” and the remaining c(1 − p) subsequences are discarded. For the second letter in the sequence, the cp remaining subsequences will have, on average, cp2 majority “votes” and the remaining cp(1 − p) subsequences are discarded. Therefore, the average number of remaining subsequences after T “votes” is NT = cpT and the maximum expected number of correctly sequenced letters Tmax is

Tmax=logNT/clogp. (8)

Figure 6 shows the results of a numerical simulation using this sequencing algorithm. As an example, we generate Nseq subsequences, each with an average accuracy of 80% (p = 0.8, assuming random deletions of letters, but no misreads) for a 100-mer DNA strand. We then calculate the probability of accurately reconstructing the original sequence using the algorithm described above. Finally, we fit the simulated results using Eq. (8) with NT as an adjustable parameter and a combined capture and discrimination probability of p = 0.8 (a slight overestimate of this sequencing engine's potential, Fig. 5).

FIG. 6.

FIG. 6.

The probability of reading a DNA sequence correctly, Pmax, as a function of the number of identical polynucleotides read, Nseq, evaluated using two algorithms described in the text. The results from the numerical simulations for the “survivor” algorithm (which discards a particular polynucleotide sequence measurement if its base assignment disagrees with that of the majority's consensus, open squares) and a “majority rules” algorithm (which makes use of the information in the minority strand sequence measurements, open circles, see text). In both cases, the probability of capturing a given nucleotide is assumed to be 0.8. The solid line is a least-square fit of Eq. (8) (with NT = 4.3 ± 0.2) to the “survivor” algorithm simulation data. The “majority rules” simulation was performed 20 times for each Nseq to generate statistics. The error bars shown are the standard error.

The curve in Fig. 6 (open squares) highlights the drawbacks of discarding sequences, especially as the algorithm progresses. For 103 subsequences (potentially obtained from reading identical copies using either a single nanopore or an array of nanopores), the amount of correctly read bases is only 25%, which increases to 45% and 80% with the use of 105 and 108 nanopores, respectively (not shown). Given the increased uncertainty arising from discarding data, this method may not be amenable to accurately measuring long sequences. However, it may be possible to improve this approach by breaking the original sequence into a sufficient number of overlapping sequences of shorter lengths. These shorter strands could be sequenced and further analyzed with existing sequencing methodologies based on DNA subset overlapping.

The above algorithm is vastly improved by retaining information in the “discarded” subsequences. As before, if we assume that the errors in reading a given base are from deletions only (i.e., perfect discrimination between the bases), then at the end of one voting round we left-shift every subsequence that agrees with the majority and hold all other sequences at their previous positions. The results of numerical simulations show that this relatively simple modification permits the accurate reading of ∼85 bases of a 100 mer's sequence (p = 0.8) with only ∼200 subsequences (Fig. 6, open circles).

Less than perfect discrimination between nucleotides

Because the mononucleotide discrimination afforded by the β-cyclodextrin adapter is not perfect (Figs. 1(a)–1(e)), the probability of correctly reading DNA sequences with the exonuclease/nanopore-based technique would be proportionally lower than the estimates described above. How to achieve better discrimination with an internal molecular adapter is not yet clear. Further study into this problem would be helpful for both the separation of the four bases and the development of an adapter with sufficiently strong affinity for the mononucleotides.

CONCLUSIONS

An analysis of a proposed exonuclease/nanopore-based DNA sequencing engine illustrates that the system has significant merit, but several aspects of the technique need improvement. Our analysis shows that many bases will escape to the bulk before being detected in the proper sequence. The results also show that increasing the applied transmembrane potential, which also increases the potential just outside the pore, will increase the cleaved mononucleotide capture efficiency. However, the increased potential decreases the lifetime of a given mononucleotide on the molecular adapter, which will therefore decrease the probability of reading each base correctly (or even detecting them). The results also show that because of the relatively low probability of capturing a given base (∼74%), the number of DNA molecules that would need to be read scales exponentially with the number of bases. Based on the ability of a molecular adapter in the αHL nanopore23 to discriminate between different mononucleotides (average accuracy = 92%), the probability of correctly reading a given base is at best 68%, which is likely insufficient for sequencing even moderate length strands of DNA.

Those conclusions notwithstanding, there are several possibilities for improving the exonuclease/nanopore-based DNA sequencing method. For example, the capture probability (Pcap) might be increased by orienting the exonuclease in such a manner that the cleaved mononucleotide is closer to the pore mouth. Conceivably, Pcap could also be increased by increasing the access resistance on the side that contains the exonuclease, but engineering that capability in a massively parallel array would need to be developed. Finally, the use of computer simulations may lead to the design of more accurate molecular adapters, if the required stereochemistry can be rationally designed and produced.

ACKNOWLEDGMENTS

This work was supported in part by grants from National Institute of Standards and Technology (NIST) Office of Law Enforcement Standards (J.J.K., J.W.F.R., J.E.R.), the Wheaton College Alumni Foundation (D.L.B., B.S.D.), the National Science Foundation (NSF) (D.L.B.), and NRC/NIST NIH Research Fellowship (A.B.). We thank Dr. James Russo at Columbia University for helpful comments on the paper.

APPENDIX: DETAILS OF THE CALCULATIONS

Drift velocity

The drift velocity of each base originates from electrophoretic and electro-osmotic effects. The electrophoretic velocity of a charged particle in an electric field is vep = μE, where μ is the electrophoretic mobility.

A net electro-osmotic velocity will occur in an ion selective nanopore. The net flux of ions through a nanopore under an applied potential is39

J±=gVqP+/P1P+/P+1, (A1)

where J± is the net flux of cations or anions through the nanopore, g is the nanopore conductance, V is the applied transmembrane potential, q is the electric charge on each ion, and P+/P is the nanopore ion selectivity ratio. Each ion is hydrated by a number of water molecules, NW, as it flows through the pore. Therefore, the flux of water molecules through the nanopore is given by JW = NWJ± and the resulting net electro-osmotic velocity veo of water through the nanopore veo = Jw / cwAp, where cw is the molar concentration of water and Ap is the average cross-sectional area of the nanopore.

We assume the applied potential creates a uniform electric field inside the nanopore. This condition is relaxed in another report, in which numerical simulations consider more complex electric field distributions. Nevertheless, the salient features of the electric field dependence can be understood here.

Combining the electro-osmotic and electrophoretic terms leads to the following expression for the drift velocity in terms of the applied transmembrane:

vd=vep+veo=μL+gNWcWApeP+/P1P+/P+1V. (A2)

For comparison to numerical simulations, we use the following parameters: μ = −2.4 × 10−4 cm2/Vs, L′ = 7.5 × 10−7 cm, g = 2.95 × 10−10 C/Vs, NW = 10, cW = 3.4 × 1022 cm−3, Ap = 5.3 × 10−14 cm2, e = 1.6 × 10−19 C, P+/P = 0.1. The computer simulation assumes there is a small drop of the potential outside the pore. Thus L′ > L. Substitution of these values into Eq. (A2) leads to the following drift velocity voltage dependence:

vd=328 cm / Vs V. (A3)

It is worth noting that the largest possible electro-osmotic contribution that would enhance the drift of nucleotides through the nanopore occurs when P+/P = 0 (i.e., purely anion selective). In this case, the prefactor in Eq. (A3) would be slightly greater (−330 cm/Vs).

Details of the model

ϕ^s=0expstϕtdt=DdG^z,sdzz=0. (A4)

In the given geometry, the drift velocity points towards z = 0 so v < 0. The Laplace transform of Eqs. (1) and (2) leads to

sG^z,sGz,0vddG^z,sdz=Dd2G^z,sdz2 (A5)

and

DdG^z,sdzz=LvdG^L,s=κG^L,s. (A6)

We solve for G and ϕ(s) subject to the absorption and radiating boundary conditions, setting G^Lδ,sl=G^Lδ,sr and G^Lδ,sl=G^Lδ,sr (where l and r refer to the limit from the left and right side of L-s and the prime refers to differentiation with respect to z), and taking the limit as δ → 0. This leads to the Eq. (4) in the text.

Probability of discriminating mononucleotides

Each mononucleotide can be uniquely identified by the degree to which it blocks the ionic current in the pore (Fig. 1). The probability of identifying each nucleotide (A, T, G, or C) is calculated by fitting the data in the figure to four Voigt profiles (Igor Pro, WaveMetrics, Inc., Lake Oswego, OR, USA). Equation (A7) gives the probability density of each peak, where i represents the blockade current, α is the amplitude, σ is the width, μ is the location of the peak, and γ is a shape parameter:24

Φ(i;α,σ.μ,γ)=αγπ3/2ln2×exp(t2)γ2+σ(iμ)t2dt. (A7)

The probability of uniquely discriminating each base is given by the area under each peak in Fig. 1(b) using the expression, Pd=i1i2Φ(i;α,σ,μ,γ)di, with the limits of integration (i1 and i2) set to the points of intersection between neighboring peaks. The probabilities of uniquely identifying a mononucleotide are then calculated to be 0.94, 0.93, 0.85, and 0.95 for dGMP, dTMP, dAMP, and dCMP, respectively. Finally, the average probability of uniquely discriminating a single mononucleotide is 0.92.

Numerical simulations

The diffusion simulator uses bulk physical parameters (such as the analyte diffusion constant, electrophoretic mobility, nanopore selectivity, total transmembrane voltage, etc.) to perform a random walk in 3D, constrained by the protein and membrane volumes. No periodic boundary conditions are implemented, because each particle is allowed to translate until it is successfully captured, or a specified maximum run time is exceeded (typically 1 μs). Each diffusing species moves with a randomly determined cardinal direction and a step length determined by sampling from a Gaussian distribution with a mean of zero and a standard deviation (Eq. (A8)), dx, which is related to the characteristic diffusion coefficient:37

dx=6Ddt. (A8)

As long as the simulation time step (dt) is small, translational motion inside the nanopore can be modeled accurately. A time step of 1 ps (step size = 40 pm) produces trajectories with adequate spatial resolution. Drift distance arising from electrophoretic and EO flow (Eq. (A2)) over the 1 ps time step is added to dx to give the particle displacement.

Nucleotides are simulated as point particles released from z = 0 (Fig. 1) and reflect off the ion channel lumen walls. The top exterior of the nanopore extends radially to infinity and is also assumed to be reflective. Nucleotides freely migrate above the nanopore in bulk solution without geometric constraints. The capture probability is defined as the ratio of the number of successful nucleotide captures to the total number of nucleotides used in the simulation.

REFERENCES

  • 1.Maxam A. M. and Gilbert W., Proc. Natl. Acad. Sci. U.S.A. 74, 560 (1977). 10.1073/pnas.74.2.560 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Sanger F., Nicklen S., and Coulson A. R., Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977). 10.1073/pnas.74.12.5463 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Wheeler D. A., Srinivasan M., Egholm M., Shen Y., Chen L., McGuire A., He W., Chen Y.-J., Makhijani V., Roth G. T., Gomes X., Tartaro K., Niazi F., Turcotte C. L., Irzyk G. P., Lupski J. R., Chinault C., Song X.-Z., Liu Y., Yuan Y., Nazareth L., Qin X., Muzny D. M., Margulies M., Weinstock G. M., Gibbs R. A., and Rothberg J. M., Nature (London) 452, 872 (2008). 10.1038/nature06884 [DOI] [PubMed] [Google Scholar]
  • 4.Guo J., Yu L., Turro N. J., and Ju J., Acc. Chem. Res. 43, 551 (2010). 10.1021/ar900255c [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Margulies M., Egholm M., Altman W. E., Attiya S., Bader J. S., Bemben L. A., Berka J., Braverman M. S., Chen Y.-J., Chen Z., Dewell S. B., Du L., Fierro J. M., Gomes X. V., Godwin B. C., He W., Helgesen S., Ho C. H., Irzyk G. P., Jando S. C., Alenquer M. L. I., Jarvie T. P., Jirage K. B., Kim J.-B., Knight J. R., Lanza J. R., Leamon J. H., Lefkowitz S. M., Lei M., Li J., Lohman K. L., Lu H., Makhijani V. B., McDade K. E., McKenna M. P., Myers E. W., Nickerson E., Nobile J. R., Plant R., Puc B. P., Ronan M. T., Roth G. T., Sarkis G. J., Simons J. F., Simpson J. W., Srinivasan M., Tartaro K. R., Tomasz A., Vogt K. A., Volkmer G. A., Wang S. H., Wang Y., Weiner M. P., Yu P., Begley R. F., and Rothberg J. M., Nature (London) 437, 376 (2005). 10.1038/nature03959 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Bergveld P., IEEE Trans. Biomed Eng. BM17, 70 (1970). 10.1109/TBME.1970.4502688 [DOI] [PubMed] [Google Scholar]
  • 7.Bergveld P., Sens. Actuators B 88, 1 (2003). 10.1016/S0925-4005(02)00301-5 [DOI] [Google Scholar]
  • 8.Rothberg J. M., Hinz W., Rearick T. M., Schultz J., Mileski W., Davey M., Leamon J. H., Johnson K., Milgrew M. J., Edwards M., Hoon J., Simons J. F., Marran D., Myers J. W., Davidson J. F., Branting A., Nobile J. R., Puc B. P., Light D., Clark T. A., Huber M., Branciforte J. T., Stoner I. B., Cawley S. E., Lyons M., Fu Y., Homer N., Sedova M., Miao X., Reed B., Sabina J., Feierstein E., Schorn M., Alanjary M., Dimalanta E., Dressman D., Kasinskas R., Sokolsky T., Fidanza J. A., Namsaraev E., McKernan K. J., Williams A., Roth G. T., and Bustillo J., Nature (London) 475, 348 (2011). 10.1038/nature10242 [DOI] [PubMed] [Google Scholar]
  • 9.Mullis K., Faloona F., Scharf S., Saiki R., Horn G., and Erlich H., Cold Spring Harb. Symp. Quant. Biol. 51(Pt 1), 263 (1986). 10.1101/SQB.1986.051.01.032 [DOI] [PubMed] [Google Scholar]
  • 10.Kong Y., J. Comput. Biol. 16, 817 (2009). 10.1089/cmb.2008.0215 [DOI] [PubMed] [Google Scholar]
  • 11.Kasianowicz J. J., Brandin E., Branton D., and Deamer D., Proc. Natl. Acad. Sci. U.S.A. 93, 13770 (1996). 10.1073/pnas.93.24.13770 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Kasianowicz J. J., Robertson J. W. F., Chan E. R., Reiner J. E., and Stanford V. M., Annu. Rev. Anal. Chem. 1, 737 (2008). 10.1146/annurev.anchem.1.031207.112818 [DOI] [PubMed] [Google Scholar]
  • 13.Lubensky D. and Nelson D., Biophys. J. 77, 1824 (1999). 10.1016/S0006-3495(99)77027-X [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Wong C. T. A. and Muthukumar M., J. Chem. Phys. 128, 154903 (2008). 10.1063/1.2897932 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Muthukumar M., Polymer Translocation (CRC, Boca Raton, FL, 2011). [Google Scholar]
  • 16.Wong C. T. A. and Muthukumar M., J. Chem. Phys. 133, 045101 (2010). 10.1063/1.3464333 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Panwar A. S. and Muthukumar M., J. Am. Chem. Soc. 131, 18563 (2009). 10.1021/ja904047q [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Butler T. Z., Pavlenok M., Derrington I. M., Niederweis M., and Gundlach J. H., Proc. Natl. Acad. Sci. U.S.A. 105, 20647 (2008). 10.1073/pnas.0807514106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Branton D., Deamer D. W., Marziali A., Bayley H., Benner S. A., Butler T., Di Ventra M., Garaj S., Hibbs A., Huang X., Jovanovich S. B., Krstic P. S., Lindsay S., Ling X. S., Mastrangelo C. H., Meller A., Oliver J. S., Pershin Y. V., Ramsey J. M., Riehn R., Soni G. V., Tabard-Cossa V., Wanunu M., Wiggin M., and Schloss J. A., Nat. Biotechnol. 26, 1146 (2008). 10.1038/nbt.1495 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Formoso C., Biochem. Biophys. Res. Commun. 50, 999 (1973). 10.1016/0006-291X(73)91505-2 [DOI] [PubMed] [Google Scholar]
  • 21.Britz P.-McKibbin and Chen D., J. Chromatogr. A 781, 23 (1997). 10.1016/S0021-9673(97)00484-6 [DOI] [Google Scholar]
  • 22.Bae J.-R. and Lee C., Bull. Korean Chem. Soc. 30, 145 (2009). 10.5012/bkcs.2009.30.1.145 [DOI] [Google Scholar]
  • 23.Clarke J., Wu H.-C., Jayasinghe L., Patel A., Reid S., and Bayley H., Nat. Nanotechnol. 4, 265 (2009). 10.1038/nnano.2009.12 [DOI] [PubMed] [Google Scholar]
  • 24.Armstrong B., J. Quant. Spectrosc. Radiat. Transf. 7, 61 (1967). 10.1016/0022-4073(67)90057-X [DOI] [Google Scholar]
  • 25.Chandrasekhar S., Rev. Mod. Phys. 21, 383 (1949). 10.1103/RevModPhys.21.383 [DOI] [Google Scholar]
  • 26.Weissbluth M., Photon-Atom Interactions (Academic, 1989), pp. 1–37 [Google Scholar]
  • 27.Risken H., The Fokker-Planck Equation: Methods of Solution and Applications (Springer, New York, 1996). [Google Scholar]
  • 28.Berezhkovskii A. M., Barzykin A. V., and Zitserman V. Y., J. Chem. Phys. 130, 245104 (2009). 10.1063/1.3160546 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Bezrukov S., Berezhkovskii A., Pustovoit M., and Szabo A., J. Chem. Phys. 113, 8206 (2000). 10.1063/1.1314862 [DOI] [Google Scholar]
  • 30.Winterhalter M. and Helfrich W., Phys. Rev. A 36, 5874 (1987). 10.1103/PhysRevA.36.5874 [DOI] [PubMed] [Google Scholar]
  • 31.Wilhelm C., Winterhalter M., Zimmermann U., and Benz R., Biophys. J. 64, 121 (1993). 10.1016/S0006-3495(93)81346-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Klotz K., Winterhalter M., and Benz R., Biochim. Biophys. Acta 1147, 161 (1993). 10.1016/0005-2736(93)90327-V [DOI] [PubMed] [Google Scholar]
  • 33.Winterhalter M., Klotz K. H., Benz R., and Arnold W. M., IEEE Trans. Ind. Appl. 32, 125 (1996). 10.1109/28.485823 [DOI] [Google Scholar]
  • 34.Hall J. E., J. Gen. Physiol. 66, 531 (1975). 10.1085/jgp.66.4.531 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Bezrukov S. M., Vodyanoy I., Brutyan R., and Kasianowicz J. J., Macromolecules 29, 8517 (1996). 10.1021/ma960841j [DOI] [Google Scholar]
  • 36.Aksimentiev A. and Schulten K., Biophys. J. 88, 3745 (2005). 10.1529/biophysj.104.058727 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Berg H. and Purcell E., Biophys. J. 20, 193 (1977). 10.1016/S0006-3495(77)85544-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Levenshtein V. I., J. Comb. Theory, Ser. A 93, 310 (2001). 10.1006/jcta.2000.3081 [DOI] [Google Scholar]
  • 39.Gu L.-Q., Cheley S., and Bayley H., Proc. Natl. Acad. Sci. U.S.A. 100, 15498 (2003). 10.1073/pnas.2531778100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Hollenbeck K. J., INVLAP.M: A matlab function for numerical inversion of Laplace transforms by the de Hoog algorithm, see http://mipav.cit.nih.gov/documentation/api/gov/nih/mipav/model/algorithms/InverseLaplace.html
  • 41.Astier Y., Braha O., and Bayley H., J. Am. Chem. Soc. 128, 1705 (2006). 10.1021/ja057123+ [DOI] [PubMed] [Google Scholar]

Articles from The Journal of Chemical Physics are provided here courtesy of American Institute of Physics

RESOURCES