Author manuscript; available in PMC: 2017 Dec 6.
Published in final edited form as: Anal Chem. 2016 Nov 15;88(23):11900–11907. doi: 10.1021/acs.analchem.6b03725

MOSAIC: A Modular Single-Molecule Analysis Interface for Decoding Multistate Nanopore Data

Jacob H Forstater, Kyle Briggs, Joseph W F Robertson, Jessica Ettedgui, Olivier Marie-Rose, Canute Vaz, John J Kasianowicz, Vincent Tabard-Cossa, Arvind Balijepalli
PMCID: PMC5516951  NIHMSID: NIHMS875049  PMID: 27797501

Abstract

Biological and solid-state nanometer-scale pores are the basis for numerous emerging analytical technologies for use in precision medicine. We developed Modular Single-Molecule Analysis Interface (MOSAIC), an open source analysis software that improves the accuracy and throughput of nanopore-based measurements. Two key algorithms are implemented: ADEPT, which uses a physical model of the nanopore system to characterize short-lived events that do not reach their steady-state current, and CUSUM+, a version of the cumulative sum statistical method optimized for longer events that do. We show that ADEPT detects previously unreported conductance states that occur as double-stranded DNA translocates through a 2.4 nm solid-state nanopore and reveals new interactions between short single-stranded DNA and the vestibule of a biological pore. These findings demonstrate the utility of MOSAIC and the ADEPT algorithm, and offer a new tool that can improve the analysis of nanopore-based measurements.



Protein and solid-state nanopores (Figure 1A) are the basis for single-molecule measurements of a variety of analytes including ions,1,2 single-stranded RNA and DNA,3–10 double-stranded DNA,11,12 proteins,13–19 synthetic polymers,20–23 and metallic nanoparticles.24,25 The method is conceptually simple. An electric potential applied across a nanopore that spans two electrically isolated chambers (filled with electrolyte solutions) results in an ionic current with a mean value 〈i0〉 (Figure 1B). Single molecules that reversibly partition into the pore cause a series of pulses or current blockades (Figure 1B). The change in pore conductance is caused by the volume exclusion of mobile ions from the pore20,21 and interactions between the ions and the analyte.5,21,26 The change in conductance4,20,21,26,27 and the residence time of analytes in the pore8,20,21 are used to estimate the analyte size,20,21 effective charge,21 and dipole moment.28

Figure 1.

(A) Schematic illustration of DNA translocation through a solid-state nanopore. An electric potential applied across the pore produces an ionic current. (B) The partitioning of DNA into the pore causes well-defined current reductions with different mean current blockade amplitudes (e.g., 〈ia〉 and 〈ib〉). (C) A single-level event is characterized by the ratio of the mean currents for the occupied and fully open pore (〈ia〉/〈i0〉) and the level residence time (Δt). (D) A histogram illustrating the relative current blockade depth for two species obtained by analyzing the events. (E) Residence time distributions for the two blockade depth populations. The mean residence times are estimated from fits to the distributions.

Analyte-induced events appear as single or multiple conductance state levels, arising from changes in the analyte conformation or interactions in the pore9,11,20,21,27,29,30 (Figure 1C). Multiple conductance level events differ from the gating of ion channels, where the channel fluctuates between two states, open and closed.31 These fluctuations are well characterized using hidden Markov models32–34 and kinetic simulations.34,35 In contrast, several techniques have been applied to nanopore-based single-molecule data, including threshold detection,4,21 slope- or area-based techniques,36,37 the cumulative sum (CUSUM) algorithm,38 charge conservation,39 and probabilistic machine-learning techniques.20,40 While these approaches are effective when the residence times of analytes in the nanopore are long (compared to the characteristic time constant of the system), they are not useful for characterizing short-lived events. To more accurately characterize short events, we developed a technique that models the ionic current response with an equivalent electrical circuit.26,41 This algorithm, when applied to the interaction of a polydisperse mixture of a synthetic polymer with the Staphylococcus aureus α-hemolysin (αHL) nanopore, recovered 18-fold more events per unit time at high measurement bandwidth (B = 100 kHz), reduced the constraints on data acquisition by permitting polymers to be separated at lower bandwidth (B = 10 kHz), and improved the resolving power in the low mass regime (to polymers with molecular weight ≈ 370 g/mol).

Here, we describe Modular Single-Molecule Analysis Interface (MOSAIC), an improved data analysis tool for analyzing nanopore data. We implemented two algorithms in MOSAIC: ADEPT, the equivalent electrical circuit model described above,26,41 and an improved version of CUSUM that is suitable for analyzing events with relatively long residence times in the pore. The software is extensible, and allows many commonly used data formats, signal conditioning, and data processing algorithms to be seamlessly integrated. Below, we demonstrate the features and utility of MOSAIC when applied to data measured with both biological and solid-state nanopores.

MATERIALS AND METHODS

Solid-State Nanopore Measurements

Solid-state nanopore data were reanalyzed from Briggs et al.42 Briefly, nanopores were fabricated in 50 μm × 50 μm, 10 nm thick low-stress silicon nitride (SiNx) TEM windows (Norcada, Canada) via the controlled breakdown (CBD) method47 in a 1 M NaCl buffer (pH 10, 10 mM NaHCO3). Nanopore measurements of double-stranded DNA (dsDNA) were performed in 3.6 M LiCl, 10 mM HEPES (pH 8) using highly purified 50 base pair (bp) dsDNA fragments (NoLimits no. SM1421, Life Technologies). Data were low-pass filtered at 100 kHz with a hardware 4-pole Bessel filter (Axopatch 200B) and digitized using a National Instruments USB-6351 DAQ card (Austin, TX) at a sampling rate, Fs = 500 kHz.

Biological Nanopore Measurements

Nanopore measurements were performed using quartz capillaries with a (≈ 1 μm diameter) aperture on one end,26,43,44 within a custom polycarbonate test cell with ≈ 200 μL volume (Electronic Biosciences, San Diego, CA). Analytes were dissolved in the working buffer and added directly into the capillary or to the external test cell.

For single-stranded DNA measurements, the quartz capillary was filled with a 10–20 μM solution of different length homopolymeric adenosine dA20, dA40, or dA100 (Integrated DNA Technologies, Coralville, IA) dissolved in 1 M NaCl, 1× TE buffer (10 mM Tris, 1 mM EDTA in DNase-free water, titrated to pH 7.2 with 3 M HCl).

For poly(ethylene glycol) measurements (PEG), data from two previous studies that span a wide range of polymer sizes were combined.22,26 In both cases, the capillary was filled with a solution containing a combination of polydisperse PEG (Fluka, Switzerland) and a highly purified calibration standard (Polypure, Oslo, Norway), dissolved in 4 M KCl (Sigma-Aldrich, St. Louis, MO), buffered with 10 mM Tris (Schwarz/Mann Biotech, Cleveland, OH) and titrated to pH 7.2 with 3 M citric acid. The two different solutions were as follows: (a) 20 μM PEG-600 (MWavg = 600 g/mol), 40 μM PEG-400 (MWavg = 400 g/mol) and 2 μM purified PEG-502 (Mw = 502 g/mol) or (b) 30 μM PEG-1000 (MWavg = 1000 g/mol), 30 μM PEG-1500 (MWavg = 1500 g/mol), and 1 μM purified PEG-1251 (Mw = 1251 g/mol).

Planar lipid bilayers were formed across the quartz capillary aperture using a 10 mg/mL solution of 1,2-diphytanoyl-sn-glycero-3-phosphatidylcholine (DPhyPC; Avanti Polar Lipids, Alabaster, AL) in n-decane (Sigma-Aldrich).26 Subsequently, wild-type S. aureus α-hemolysin (αHL) was introduced to the test cell by adding a solution containing either ≈250 ng of monomeric αHL (List Biological Laboratories, Campbell, CA) or ≈2.5 ng of purified preformed heptamers. To facilitate channel incorporation, the bilayer was thinned and enlarged by applying a transmembrane potential of ≈300 mV and a static back pressure within the capillary. Following the insertion of a single channel, the static pressure was reduced and the voltage decreased to the value used for the measurement to prevent further channel incorporation.

The potential was applied across the membrane by a pair of Ag/AgCl electrodes. Immediately prior to use, the electrode placed within the capillary was prepared by abrading an Ag wire (Alfa Aesar) with 600 grit sandpaper and soaking it in bleach for ≈10 min. The external electrode in the test cell bath was a 2 mm Ag/AgCl disk electrode (E202, In Vivo Metric). Data were acquired with a custom high-impedance amplifier system (Electronic BioSciences, San Diego, CA) and conditioned with a low-pass antialiasing filter. The analog signal was digitized by a National Instruments PCI-6120 DAQ card with a sampling rate (Fs) of 1 MHz, further conditioned using a software-based 8-pole low-pass Bessel filter with a cutoff frequency of 100 kHz, and resampled at 500 kHz.

Data Processing and Analysis

Nanopore data were processed using a Python based program (MOSAIC) developed in-house. The software implements the ADEPT and CUSUM+ algorithms, which are described below. The compiled program and source code are freely available at https://pages.nist.gov/mosaic/. MOSAIC consists of a modular data processing pipeline which allows users to analyze ionic current data from single-molecule nanopore experiments. The software is designed using object-oriented principles, which ensures that modules remain interoperable. This also makes it straightforward to implement new features into the software, such as alternative analysis algorithms or custom data formats. In many cases, users can interact with MOSAIC using a front end graphical user interface.

Algorithms

MOSAIC consists of a pipeline with five modules: (i) load data, (ii) optional signal conditioning and filtering, (iii) event detection, (iv) event analysis, and (v) results storage. A detailed description of the software architecture is presented in the Supporting Information.
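
One way to picture the five stages is as a chain of interchangeable callables. The sketch below is illustrative only and assumes nothing about MOSAIC's real classes: `Pipeline`, `toy_detect`, and every field name are hypothetical, and the input trace is made-up toy data.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


# Hypothetical names throughout; this is NOT MOSAIC's API.
@dataclass
class Pipeline:
    load: Callable[[], List[float]]                          # (i) load data
    detect: Callable[[Sequence[float]], List[List[float]]]   # (iii) event detection
    analyze: Callable[[Sequence[float]], dict]               # (iv) event analysis
    condition: Callable[[List[float]], List[float]] = lambda x: x  # (ii) optional filter
    results: List[dict] = field(default_factory=list)        # (v) results storage

    def run(self) -> List[dict]:
        trace = self.condition(self.load())
        for event in self.detect(trace):
            self.results.append(self.analyze(event))
        return self.results


def toy_detect(trace):
    """Toy detector: each run of normalized samples below 0.5 is one event."""
    events, current = [], []
    for x in trace:
        if x < 0.5:
            current.append(x)
        elif current:
            events.append(current)
            current = []
    if current:
        events.append(current)
    return events


pipe = Pipeline(
    load=lambda: [1.0, 1.0, 0.2, 0.2, 1.0, 0.4, 1.0],  # toy normalized current
    detect=toy_detect,
    analyze=lambda ev: {"depth": sum(ev) / len(ev), "n": len(ev)},
)
out = pipe.run()
```

Because each stage is just a callable with a fixed input and output, swapping in a different detector or analysis routine does not touch the other stages, which is the kind of interoperability the modular design aims for.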

In this section, we discuss two algorithms implemented in MOSAIC: (i) CUSUM+, an improved version of the CUSUM algorithm38 that provides robust statistical analysis of events which converge to a steady-state, and (ii) ADEPT, an implementation of a previously developed theory26,41 that uses a physical model of the nanopore system to accurately characterize very short events that do not approach a steady-state ionic current.

Cumulative Sum Analysis (CUSUM+)

Cumulative Sum (CUSUM) is a commonly used method to detect step-like changes in time-series data45 but was only recently implemented in nanopore analysis.38 It assumes that the interaction of an analyte with the pore causes a series of instantaneous changes in the ionic current from its baseline value (defined as states and well-approximated by step functions45), and that the ionic current noise follows a known distribution (e.g., Gaussian). A statistical test identifies when the current level changes. The instantaneous log-likelihood ratios of sequential data points for both positive and negative step changes are calculated. The positive values of these ratios are independently summed, and a negative log-likelihood resets the sum to zero. These form a two-sided decision function, which detects level changes that correspond to either an increase or decrease in the current level. A new state is identified when one of the decision functions exceeds a threshold determined automatically by the software. The locations of state changes are determined from the minima of related functions,45 and the mean ionic current between sequential states is calculated and used to determine the local blockade depth, defined as the ratio of the ionic current when the pore is occupied to that of the open pore (〈i〉/〈i0〉, Figure 1C).
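
The two-sided decision function described above can be sketched for Gaussian noise as follows. This is a textbook form of CUSUM under stated assumptions, not MOSAIC's implementation: the function name, the parameters `delta` (smallest step worth detecting) and `h` (decision threshold), and the toy traces are all hypothetical, and the sketch returns the sample at which the alarm fires rather than the refined change location.

```python
def cusum_two_sided(x, mu0, sigma, delta, h):
    """Return the index at which a level change is first flagged, or None.

    x     : sequence of current samples
    mu0   : mean of the current level before the change
    sigma : standard deviation of the (assumed Gaussian) noise
    delta : smallest step size worth detecting (assumed known)
    h     : decision threshold (hypothetical; chosen by hand here)
    """
    g_pos = g_neg = 0.0
    for k, xk in enumerate(x):
        # Instantaneous log-likelihood ratios for an upward and a downward step.
        s_pos = (delta / sigma**2) * (xk - mu0 - delta / 2.0)
        s_neg = -(delta / sigma**2) * (xk - mu0 + delta / 2.0)
        # Accumulate each side; a negative running sum is reset to zero.
        g_pos = max(0.0, g_pos + s_pos)
        g_neg = max(0.0, g_neg + s_neg)
        if g_pos > h or g_neg > h:
            return k
    return None


# Noise-free toy traces (arbitrary units): one downward and one upward step.
down = [100.0] * 50 + [60.0] * 50
up = [100.0] * 10 + [140.0] * 10
flat = [100.0] * 30

idx_down = cusum_two_sided(down, mu0=100.0, sigma=2.0, delta=10.0, h=20.0)
idx_up = cusum_two_sided(up, mu0=100.0, sigma=2.0, delta=10.0, h=20.0)
idx_flat = cusum_two_sided(flat, mu0=100.0, sigma=2.0, delta=10.0, h=20.0)
```

With noisy data the alarm typically fires a few samples after the true change, which is why the change location is estimated separately from the minima of related functions.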

We implemented an improved version of the CUSUM algorithm in MOSAIC (CUSUM+), which is less sensitive to artifacts that can be falsely identified as a state change. This is achieved by specifying a minimum time between successive triggers (to exclude transients from the state change detection and blockade depth calculations) and by requiring that identified state levels differ by a minimum value (corresponding to a physically significant change). The efficiency is improved by eliminating or reducing redundant computations (e.g., maintaining running calculations of the mean and variance).
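
A minimal sketch of the two ideas in this paragraph, assuming a simple form for each: Welford's online update stands in for "running calculations of the mean and variance", and a small predicate combines the minimum-time and minimum-step guards. The names `RunningStats`, `accept_state_change`, `min_samples`, and `min_step` are hypothetical and do not come from MOSAIC.

```python
class RunningStats:
    """Welford's online update: mean and variance without re-scanning the data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self._m2 += d * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; defined as 0.0 until two samples have been seen.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def accept_state_change(k, last_k, new_level, old_level, min_samples, min_step):
    """Accept a trigger only if it is far enough in time from the previous
    trigger AND the level changed by at least min_step."""
    return (k - last_k) >= min_samples and abs(new_level - old_level) >= min_step


rs = RunningStats()
for sample in [1.0, 2.0, 3.0, 4.0]:
    rs.update(sample)
```

Each new sample costs a constant amount of work, so the statistics of a long state never need to be recomputed from scratch.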

Adaptive Time-Series Analysis (ADEPT)

For very long events (>5τ, Figure 2 left), the blockade depth is easily estimated (e.g., with CUSUM+). However, that process fails for short-lived events (<5τ, Figure 2 right) that do not reach a steady-state mean value. In this case, the blockade depths are estimated by fitting the data to an electrical circuit model of the nanopore (implemented as ADEPT26,41 in MOSAIC). The algorithm assumes a molecule partitioning into the nanopore instantaneously increases the nanopore resistance (Rp) by ΔR. However, the system capacitance causes the ionic current change to occur over a finite time. For a constant applied voltage, Va, the predicted ionic current is $i(t) = i_0 - \beta\,(1 - e^{-t/\tau})$, where $i_0 = V_a/(R_s + R_p)$, $\beta = V_a\,\Delta R/[(R_s + R_p + \Delta R)(R_s + R_p)]$, and $\tau = C_m R_s (R_p + \Delta R)/(R_s + R_p + \Delta R)$, so that the current relaxes from $i_0$ toward the steady-state value $i_0 - \beta = V_a/(R_s + R_p + \Delta R)$. The difference between the time constants (τ) leading up to and following an event is much shorter than the typical sampling interval (see the Supporting Information for the calculation). We therefore use a single fit parameter for τ, which reduces the number of degrees of freedom. There is an option to override this constraint.
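
The circuit model for a single resistance step can be evaluated directly, which also shows where the 5τ rule of thumb comes from. The sketch below assumes the current relaxes exponentially from i0 toward Va/(Rs + Rp + ΔR); the function name and all component values are hypothetical, chosen only so that τ comes out near 10 μs, and no fitting is performed.

```python
import math


def step_response(t, va, rs, rp, dr, cm):
    """Current at time t after the pore resistance steps from rp to rp + dr.

    Returns (i_t, i0, beta, tau). Units: volts, ohms, farads, seconds, amperes.
    """
    i0 = va / (rs + rp)                             # open-pore current
    beta = va * dr / ((rs + rp + dr) * (rs + rp))   # steady-state current drop
    tau = cm * rs * (rp + dr) / (rs + rp + dr)      # relaxation time constant
    i_t = i0 - beta * (1.0 - math.exp(-t / tau))
    return i_t, i0, beta, tau


# Hypothetical component values (not taken from the measurements).
params = dict(va=0.1, rs=1e6, rp=1e9, dr=1e9, cm=1e-11)

_, i0, beta, tau = step_response(0.0, **params)
i_start, _, _, _ = step_response(0.0, **params)
i_5tau, _, _, _ = step_response(5.0 * tau, **params)
i_late, _, _, _ = step_response(100.0 * tau, **params)

# Fraction of the full step reached after 5 tau: 1 - exp(-5), about 0.993.
settled_5tau = (i0 - i_5tau) / beta
```

After 5τ the current is within about 0.7% of its steady-state value, so events longer than this can be characterized from their mean level, whereas shorter events need the model fit.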

Figure 2.

ADEPT and CUSUM+ analysis applied to a simulated nanopore measurement (gray). Two events with the same current blockade (red) but different residence times (tres), with respect to the system characteristic relaxation time (τ) are shown. (Left) For a long event (tres ≥ 5τ), the ionic current converges close to its steady-state value, and the current levels estimated by ADEPT and CUSUM+ are equivalent. (Right) For short events (e.g., tres ≈ 2τ), the current does not reach the steady-state value of the idealized pulse (red). In this case, CUSUM+ and other algorithms used in nanopore analysis systematically underestimate the steady-state current (blue) by an amount Δi (gray; dashed). In contrast, the physical model underlying ADEPT allows the algorithm to accurately estimate an event’s steady-state current.

RESULTS AND DISCUSSION

Analysis of Short dsDNA Fragments Measured with SiNx Nanopores

We compared the results of CUSUM+ and ADEPT on measurements of 50 base pair (bp) double-stranded DNA (dsDNA) translocating through a ≈2.4 nm diameter SiNx nanopore (≈2800 events).42 At an applied potential of 400 mV, the mean residence time of dsDNA in the pore is ≈440 μs, more than an order of magnitude longer than the characteristic time constant of the system (τ = 10 μs; B = 100 kHz). Both algorithms produce two distinct peaks in the blockade depth histogram (〈i〉/〈i0〉 = 0.070 ± 0.001 and 0.488 ± 0.004). Peak positions were obtained using an error-weighted Gaussian fit and are reported with an expanded uncertainty, k = 2 (see Supporting Information for a full listing of the analysis and fit parameters). The leftmost peak corresponds to DNA translocation, whereas the rightmost peak is likely due to the helical structure of dsDNA unwinding in a transition from B-form to S-form dsDNA,46,47 in which the chain elongates 1.7-fold because of the strong electric field gradient across the pore.42

At 800 mV, the blockade depth histogram produced by ADEPT has two overlapping peaks (Figure 3B) consisting of a narrow peak (〈i〉/〈i0〉 = 0.080 ± 0.002), the expected location for B-form dsDNA,46 and a broader peak (〈i〉/〈i0〉 = 0.138 ± 0.002). The latter consists of short, single-level events, which likely result from transient interactions between the dsDNA and the access region outside the pore.48 This was not accurately identified in our previous analysis42 (see below). A third, low amplitude, broad peak is visible at 〈i〉/〈i0〉 = 0.226 ± 0.018. It is probable that this peak is associated with these access-region interactions. Alternatively, it is possible that these events could represent partial unwinding of the dsDNA secondary structure at forces below the B–S stretching transition threshold.47 As seen in Figure 3B, CUSUM+ also detects the first peak, 〈i〉/〈i0〉 = 0.090 ± 0.002 (albeit slightly shifted compared to the ADEPT value). However, it only characterizes ≈20% of the events in the second peak (〈i〉/〈i0〉 = 0.134 ± 0.010) where the mean residence time of the events (〈tres〉 = 47 ± 4 μs) is less than 5τ. In addition, CUSUM+ misses most of the events from the third peak detected with ADEPT. While CUSUM+ could be allowed to characterize events with lifetimes less than 5τ, it will underestimate the blockade depth ratios (see the Supporting Information, Figure S3). Both algorithms identify the fourth peak (〈i〉/〈i0〉 ≈ 0.56), which arises from the stretching transition of dsDNA noted above (S-form of dsDNA46,47). CUSUM+ recovers more events here than ADEPT, which utilizes a fitting routine that may not converge for some very long events (tres > 25 ms; 50 000 points; Fs = 500 kHz). The results would therefore improve if ADEPT were used for relatively short-lived events and CUSUM+ were used on events with residence times >5τ (Supporting Information, Figure S4). This functionality will be implemented in a future version of MOSAIC.
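
The 5τ criterion discussed here amounts to a one-line dispatch rule. The sketch below is only an illustration of that rule, not the planned MOSAIC feature: both function names are hypothetical, and the shortfall estimate assumes the single-exponential relaxation of the circuit model. The residence times and τ are the values quoted in the text.

```python
import math


def choose_algorithm(t_res, tau, factor=5.0):
    """Pick an analysis routine from the event residence time (hypothetical rule)."""
    return "CUSUM+" if t_res > factor * tau else "ADEPT"


def unreached_fraction(t_res, tau):
    """Fraction of the steady-state current step not yet reached when an event
    of duration t_res ends, assuming single-exponential relaxation."""
    return math.exp(-t_res / tau)


TAU = 10e-6  # characteristic time constant quoted in the text (seconds)

long_event = choose_algorithm(440e-6, TAU)   # mean residence time at 400 mV
short_event = choose_algorithm(47e-6, TAU)   # second-peak events at 800 mV
missing_at_2tau = unreached_fraction(2 * TAU, TAU)
```

For an event lasting 2τ, roughly 14% of the step has not yet developed when the molecule leaves, which is the origin of the systematic underestimate made by level-averaging methods on short events.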

Figure 3.

Blockade depth histograms for a 50 base pair double-stranded DNA measured with a 2.4 nm diameter SiNx nanopore.42 (A) At 400 mV, the CUSUM+ (gray) and ADEPT (red) results are in excellent agreement. (B) At 800 mV, the mean residence time decreases to ≈36 μs (estimated from ADEPT analysis). The analyses by ADEPT and CUSUM+ are markedly different.

While both CUSUM+ and ADEPT produce comparable results for events with residence times >5τ, CUSUM+’s statistical approach is on average ≈10× faster than the Levenberg–Marquardt least-squares fitting used in ADEPT.48 Therefore, CUSUM+ is preferred for events with mean residence times considerably longer than the recovery time of the system (≫5τ). Furthermore, the processing time per event for each algorithm scales linearly with the residence time, and therefore the number of data points in an event (see Supporting Information, Figure S5).

Analysis of Single-Stranded DNA Oligonucleotides with ADEPT

We use ADEPT to determine the blockade depth ratio histograms for three different length single-stranded DNA (ssDNA) homopolymers (dA100, dA40, and dA20) entering an αHL nanopore from the cis side.49 We consider events with up to 6 discrete states, with each state containing at least 5 data points (tres > 10 μs, Fs = 500 kHz). Events are partitioned from the time series data with a thresholding algorithm that identifies when the current deviates by more than 5 standard deviations from the mean open channel current. A complete listing of the analysis parameters is shown in Supporting Information, Table S3.
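
The thresholding step can be sketched as follows, assuming the open-channel mean and standard deviation are already known. The function name and the toy trace are hypothetical, and for simplicity the minimum-length filter is applied to whole events rather than to individual states as in the analysis above.

```python
def partition_events(trace, i0_mean, i0_std, n_sigma=5.0, min_points=5):
    """Return (start, stop) index pairs for runs of samples that deviate from
    the open-channel mean by more than n_sigma standard deviations.

    stop is exclusive; runs shorter than min_points samples are discarded.
    """
    events, start = [], None
    for k, x in enumerate(trace):
        outside = abs(x - i0_mean) > n_sigma * i0_std
        if outside and start is None:
            start = k                      # event begins
        elif not outside and start is not None:
            if k - start >= min_points:    # event ends; keep it if long enough
                events.append((start, k))
            start = None
    if start is not None and len(trace) - start >= min_points:
        events.append((start, len(trace)))  # event still open at end of trace
    return events


# Toy trace (arbitrary units): one 6-sample event and one 3-sample blip.
toy = [100.0] * 10 + [50.0] * 6 + [100.0] * 10 + [50.0] * 3 + [100.0] * 5
found = partition_events(toy, i0_mean=100.0, i0_std=1.0)

# Five samples at Fs = 500 kHz span 10 microseconds.
min_duration_s = 5 / 500e3
```

The 3-sample blip is rejected by the length filter, while the 6-sample blockade is returned as a single event spanning samples 10 through 15.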

The current blockades observed with poly(dA) appear as either one level (a shallow or deep blockade) or two levels (a shallow blockade followed by a deeper one8,50,51). Approximately 60% of the events have two levels. As shown in the blockade depth ratio histograms for dA100, dA40, and dA20 (Figure 4), these two observed levels comprise several different states. Figure 4A shows that the blockade depth histogram for dA100 has three peaks, 〈i〉/〈i0〉 = (0.11 ± 0.03), (0.15 ± 0.05), and (0.50 ± 0.05). The first two peaks (denoted 〈i〉/〈i0〉3′ and 〈i〉/〈i0〉5′) are consistent with the dependence of the blockade depth on the orientation of the leading end of the DNA (3′ vs 5′) entering the pore.4,52,53 The location of these two peaks agrees with previous measurements of dA100 where two highly overlapping peaks were observed at these locations.6 In contrast to earlier measurements, we resolve the 3′ and 5′ events with a separation better than 3σ.

Figure 4.

Normalized blockade depth histograms for (A) dA100, (B) dA40, and (C) dA20 single-stranded DNA interacting with the αHL nanopore estimated using the ADEPT algorithm. The applied potential is V = 140 mV and the polynucleotides are added to the cis side of the pore.49

The differences between the 3′ and 5′ blockade depth peaks (Figure 4A, two leftmost peaks) are progressively more difficult to discern for the shorter polynucleotides (Figure 4B,C). The amplitude of the 5′ peak decreases substantially for dA40 (Figure 4B) and is not resolved for dA20 (Figure 4C). These results are likely due to the lower probability of the 5′-end entering the pore and the decreasing residence time of shorter ssDNA molecules.4,53

Interestingly, the shallow blockade level (〈i〉/〈i0〉 ≈ 0.5) is characterized by a single peak for dA100 and dA40 and two peaks for dA20. Previous studies have either not reported this peak6 or only noted it for molecules as short as dA50.50 Furthermore, the sharp decrease in residence time associated with shorter polymers complicates their analysis and has thus far limited the analysis of polynucleotides as short as dA20.

The algorithms within MOSAIC improve the characterization of short polynucleotides. Figure 5 shows the voltage-dependent behavior of the dA20 shallow blockade peaks and their residence time distributions. The shallow blockade depth distributions are qualitatively different than those measured for longer polymers (Figure 4), as noted previously. Furthermore, we observe a change in the morphology of the peaks with increasing voltage as seen in Figure 5A. In particular, increasing the magnitude of the applied potential: (i) shifts the peaks to smaller 〈i〉/〈i0〉 values, i.e., the polynucleotide blocks more current (Figure 5A), (ii) increases the residence times (Figure 5B) (in contrast to the mean residence time of the deep blockades in Figure 4 that are associated with translocation),4,5 and (iii) increases the number of events observed per unit time (capture rate) (see the Supporting Information, Figure S7). Moreover, with increasing voltage, the shallow blockades were more likely to exhibit two states with the shallow blockade preceding a deep blockade.

Figure 5.

Voltage dependence of dA20 shallow blockade depth and residence time. (A) The normalized blockade depth histogram as a function of voltage yields peaks with changing morphology. (B) Joint residence time-blockade depth distribution (log–linear) as a function of voltage. Z-scale (color) was normalized and smoothed using a Gaussian interpolation.

The above results strongly suggest that the shallow blockade peaks correspond to dA20 interacting with the vestibule but not translocating through the pore. Furthermore, the observed increase in the residence time of the shallow blockade (Figure 5B) with voltage suggests that the change in the leftmost peak amplitude in Figure 5A is likely due to interactions between the analyte and different regions of the vestibule, rather than a loss of signal. Interestingly, of the three measured analytes (dA100, dA40, dA20), this voltage-dependent change in peak structure was only observed with dA20, indicating that the phenomenon may be length-dependent.49,54

Example of Using MOSAIC: Single Molecule Mass Spectrometry with a Biological Nanopore

We show a typical analysis using MOSAIC’s graphical user interface. Specifically, we use both CUSUM+ and ADEPT to separate monomers in polydisperse PEG samples with an αHL nanopore.20–22,26,55,56 Previous studies showed that the blockade depth ratio (〈i〉/〈i0〉) and mean residence time of these events scale monotonically with polymer size.

Analysis of the PEG data is set up using the GUI shown in Figure 6 and is configured using drop-down menus. After selecting a data source (MOSAIC accepts most common electrophysiology data formats: Axon ABF, QUB QDF, as well as raw binary and comma-separated value, CSV, data), a segment of the time series is displayed, which assists in determining the mean open channel current (〈i0〉), noise (σi0), and threshold values for preliminary event identification.

Figure 6.

Graphical user interface (GUI) used to set up and run an analysis in MOSAIC. A main settings window is used to configure the analysis. The results of an analysis are displayed in real time in the adjacent panels. Completed analyses can also be reexamined within the GUI by opening the saved database files.

A key feature of MOSAIC is the ability to integrate custom algorithms into the processing pipeline. Within the GUI, the user can select the analysis algorithm. The PEG data were analyzed independently using both the ADEPT and CUSUM+ algorithms. The blockade depth histogram of the events and the processing statistics are presented in real time. Fits of the physical model (ADEPT) or detected states (CUSUM+) of individual analyzed events are also displayed to monitor the progress and quality of the analysis. The results are stored in a SQLite database (or can be exported as a CSV file from within the GUI) for further analysis.
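
Because the results live in a SQLite database, they can be queried from outside the GUI with standard tools. The sketch below uses Python's built-in `sqlite3` module against an in-memory stand-in; the table name `events`, its column names, and the three rows are hypothetical toy values, so the real schema should be taken from the MOSAIC documentation.

```python
import sqlite3

# In-memory stand-in for a saved results file; schema and rows are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (blockade_depth REAL, residence_time_us REAL)")
con.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(0.08, 40.0), (0.14, 47.0), (0.56, 900.0)],  # toy values, not measurements
)

# Select the blockade depths of events longer than 5 tau (tau = 10 us assumed).
rows = con.execute(
    "SELECT blockade_depth FROM events "
    "WHERE residence_time_us > ? ORDER BY blockade_depth",
    (50.0,),
).fetchall()
con.close()
```

Filtering in SQL before histogramming keeps post-processing independent of the analysis run that produced the database.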

This example further illustrates the differences between the CUSUM+ and ADEPT algorithms. Only events that deviate from the open channel current baseline by at least 2.7σ were analyzed. Events shorter than 5τ were excluded from the CUSUM+ analysis (the default value when running CUSUM+ from the GUI). On the other hand, when using ADEPT, we excluded events shorter than 2.5τ (25 μs) to minimize fitting errors (set with the Advanced Settings dialogue in the GUI).

Both algorithms produce well-resolved peaks for PEGs larger than 17-mers (Figure 7 and Supporting Information, Figure S8). For smaller PEGs that have shorter mean residence times, CUSUM+ shows a single broad peak whereas ADEPT easily identifies additional individual species. Here, CUSUM+ recovers significantly fewer events than ADEPT because only a fraction of the total events (those with tres > 5τ) are considered. This effect is particularly significant because it amounts to examining the tail of the exponentially distributed lifetimes.20,21
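
The size of this effect follows from the exponential distribution itself: if lifetimes are exponentially distributed with mean 〈tres〉, the fraction of events longer than a cutoff t is exp(−t/〈tres〉). The short sketch below evaluates that expression for two illustrative cases; the function name and the chosen mean lifetimes are hypothetical, with τ = 10 μs as quoted in the text.

```python
import math


def fraction_longer_than(t_cut, mean_t_res):
    """Survival function of an exponential lifetime distribution."""
    return math.exp(-t_cut / mean_t_res)


TAU = 10e-6            # characteristic time constant from the text (seconds)
CUTOFF = 5 * TAU       # CUSUM+ exclusion limit

# Illustrative mean lifetimes (hypothetical): equal to 5 tau, and equal to tau.
kept_when_mean_is_5tau = fraction_longer_than(CUTOFF, 5 * TAU)
kept_when_mean_is_tau = fraction_longer_than(CUTOFF, TAU)
```

When the mean lifetime equals the cutoff, only about 37% of events survive the filter, and when the mean lifetime equals τ, fewer than 1% do, so the surviving events sample only the far tail of the distribution.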

Figure 7.

Comparison of ADEPT (gray) and CUSUM+ (blue) analyses of a polydisperse polyethylene glycol (PEG) solution measured with an αHL nanopore. Blockade depth histograms of the events recovered by each algorithm show that the number of events recovered by CUSUM+ decreases sharply for small polymers (n < 20) that exhibit short residence times (<5τ) in the pore.

The histograms in Figure 7 are fit to a sum of Lorentzian functions using Igor Pro 6.3 (Wavemetrics Inc., Portland, OR). To directly compare the blockade depth histograms, we use the peak positions from the ADEPT data set as initial guesses for a fit of the CUSUM+-derived blockade depth distribution (Figure 7, gray). As expected, for both algorithms, where the peaks are resolved, the peak positions are in good agreement. For PEGs (n < 17), the signal-to-noise ratio of the CUSUM+ peaks is lower than those recovered by ADEPT (Supporting Information, Figure S8), consistent with the lower number of events recovered by CUSUM+ in this region.

CONCLUSIONS

We developed a new open source platform for the analysis of single-molecule data (MOSAIC) and implemented two robust optimized algorithms (ADEPT and CUSUM+) for biological or solid-state nanopore measurements. When applied to dsDNA measurements with a 2.4 nm SiNx nanopore, MOSAIC found previously undetected states most likely arising from the transient interactions between dsDNA and the access region of the solid-state pore. Additionally, when measuring short oligonucleotides poly(dA)n, MOSAIC accurately analyzed events with residence times <5τ, thereby characterizing previously unreported interactions of dA20 with the αHL nanopore. Such analysis can be used to provide greater insight into the underlying physics of analyte-nanopore interactions.


Acknowledgments

This work was supported in part by Grant R01HG007415 from the National Human Genome Research Institute (J.J.K.) and by the Natural Sciences and Engineering Research Council of Canada (NSERC). K.B. acknowledges the financial support provided by NSERC and the Vanier CGS program for postgraduate fellowships. We gratefully acknowledge the Ju Laboratory (Columbia University) for providing the preheptamerized alpha-hemolysin used in some of these studies. We thank the numerous research groups who provided feedback on the software during its development.

Footnotes

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Certain commercial entities, equipment, or materials may be identified in this document in order to describe an experimental procedure or concept adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the entities, materials, or equipment are necessarily the best available for the purpose.

The authors declare no competing financial interest.

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.6b03725.

Additional analysis of ssDNA and dsDNA blockade depth histograms and extended discussion of MOSAIC software contents and structure (PDF)

References

  • 1.Bezrukov SM, Kasianowicz JJ. Phys Rev Lett. 1993;70(15):2352–2355. doi: 10.1103/PhysRevLett.70.2352. [DOI] [PubMed] [Google Scholar]
  • 2.Kasianowicz JJ, Bezrukov SM. Biophys J. 1995;69(1):94–105. doi: 10.1016/S0006-3495(95)79879-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Zahid OK, Wang F, Ruzicka JA, Taylor EW, Hall AR. Nano Lett. 2016;16:2033. doi: 10.1021/acs.nanolett.6b00001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Kasianowicz JJ, Brandin E, Branton D, Deamer DW. Proc Natl Acad Sci U S A. 1996;93(24):13770–13773. doi: 10.1073/pnas.93.24.13770. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Henrickson SE, Misakian M, Robertson B, Kasianowicz JJ. Phys Rev Lett. 2000;85(14):3057–3060. doi: 10.1103/PhysRevLett.85.3057. [DOI] [PubMed] [Google Scholar]
  • 6.Meller A, Nivon L, Brandin E, Golovchenko J, Branton D. Proc Natl Acad Sci U S A. 2000;97(3):1079–1084. doi: 10.1073/pnas.97.3.1079. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Clarke J, Wu H-C, Jayasinghe L, Patel A, Reid S, Bayley H. Nat Nanotechnol. 2009;4:265–270. doi: 10.1038/nnano.2009.12. [DOI] [PubMed] [Google Scholar]
  • 8.Henrickson SE, DiMarzio EA, Wang Q, Stanford VM, Kasianowicz JJ. J Chem Phys. 2010;132(13):135101. doi: 10.1063/1.3328875. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Manrao EA, Derrington IM, Laszlo AH, Langford KW, Hopper MK, Gillgren N, Pavlenok M, Niederweis M, Gundlach JH. Nat Biotechnol. 2012;30(4):349–353. doi: 10.1038/nbt.2171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Cherf GM, Lieberman KR, Rashid H, Lam CE, Karplus K, Akeson M. Nat Biotechnol. 2012;30(4):344–348. doi: 10.1038/nbt.2147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Merchant CA, Healy K, Wanunu M, Ray V, Peterman N, Bartel J, Fischbein MD, Venta K, Luo Z, Johnson ATC, Drndic M. Nano Lett. 2010;10(8):2915–2921. doi: 10.1021/nl101046t. [DOI] [PubMed] [Google Scholar]
  • 12.Rodríguez-Manzo JA, Puster M, Nicolaï A, Meunier V, Drndic M. ACS Nano. 2015;9:6555–6564. doi: 10.1021/acsnano.5b02531.
  • 13.Marshall MM, Ruzicka J, Zahid OK, Henrich VC, Taylor EW, Hall AR. Langmuir. 2015;31(15):4582–4588. doi: 10.1021/acs.langmuir.5b00457.
  • 14.Kasianowicz JJ, Henrickson SE, Weetall HH, Robertson B. Anal Chem. 2001;73(10):2268–2272. doi: 10.1021/ac000958c.
  • 15.Oukhaled G, Mathé J, Biance AL, Bacri L, Betton JM, Lairez D, Pelta J, Auvray L. Phys Rev Lett. 2007;98(15):158101. doi: 10.1103/PhysRevLett.98.158101.
  • 16.Oukhaled A, Cressiot B, Bacri L, Pastoriza-Gallego M, Betton J-M, Bourhis E, Jede R, Gierak J, Auvray L, Pelta J. ACS Nano. 2011;5(5):3628–3638. doi: 10.1021/nn1034795.
  • 17.Pastoriza-Gallego M, Rabah L, Gibrat G, Thiebot B, van der Goot FG, Auvray L, Betton J-M, Pelta J. J Am Chem Soc. 2011;133(9):2923–2931. doi: 10.1021/ja1073245.
  • 18.Rotem D, Jayasinghe L, Salichou M, Bayley H. J Am Chem Soc. 2012;134(5):2781–2787. doi: 10.1021/ja2105653.
  • 19.Larkin J, Henley RY, Muthukumar M, Rosenstein JK, Wanunu M. Biophys J. 2014;106(3):696–704. doi: 10.1016/j.bpj.2013.12.025.
  • 20.Robertson JWF, Rodrigues CG, Stanford VM, Rubinson KA, Krasilnikov OV, Kasianowicz JJ. Proc Natl Acad Sci U S A. 2007;104(20):8207–8211. doi: 10.1073/pnas.0611085104.
  • 21.Reiner JE, Kasianowicz JJ, Nablo BJ, Robertson JWF. Proc Natl Acad Sci U S A. 2010;107(27):12080–12085. doi: 10.1073/pnas.1002194107.
  • 22.Balijepalli A, Robertson JWF, Reiner JE, Kasianowicz JJ, Pastor RW. J Am Chem Soc. 2013;135(18):7064–7072. doi: 10.1021/ja4026193.
  • 23.Baaken G, Halimeh I, Bacri L, Pelta J, Oukhaled A, Behrends JC. ACS Nano. 2015;9(6):6443–6449. doi: 10.1021/acsnano.5b02096.
  • 24.Angevine CE, Chavis AE, Kothalawala N, Dass A, Reiner JE. Anal Chem. 2014;86(22):11077–11085. doi: 10.1021/ac503425g.
  • 25.Ettedgui J, Kasianowicz JJ, Balijepalli A. J Am Chem Soc. 2016;138(23):7228–7231. doi: 10.1021/jacs.6b02917.
  • 26.Balijepalli A, Ettedgui J, Cornio AT, Robertson JWF, Cheung KP, Kasianowicz JJ, Vaz C. ACS Nano. 2014;8(2):1547–1553. doi: 10.1021/nn405761y.
  • 27.Movileanu L, Cheley S, Bayley H. Biophys J. 2003;85(2):897–910. doi: 10.1016/S0006-3495(03)74529-9.
  • 28.Yusko EC, Bruhn BR, Eggenberger O. arXiv.org. 2015 arXiv:1510.01935.
  • 29.Schneider GF, Kowalczyk SW, Calado VE, Pandraud G, Zandbergen HW, Vandersypen LMK, Dekker C. Nano Lett. 2010;10(8):3163–3167. doi: 10.1021/nl102069z.
  • 30.Barati Farimani A, Min K, Aluru NR. ACS Nano. 2014;8:7914–7922. doi: 10.1021/nn5029295.
  • 31.Bezanilla F. Physiol Rev. 2000;80(2):555–592. doi: 10.1152/physrev.2000.80.2.555.
  • 32.Rabiner LR, Juang B. ASSP Magazine, IEEE. 1986;3(1):4–16.
  • 33.Qin F, Auerbach A, Sachs F. Biophys J. 2000;79(4):1915–1927. doi: 10.1016/S0006-3495(00)76441-1.
  • 34.Magleby KL, Weiss DS. Biophys J. 1990;58(6):1411–1426. doi: 10.1016/S0006-3495(90)82487-5.
  • 35.Colquhoun D. J Physiol. 2003;547(3):699–728. doi: 10.1113/jphysiol.2002.034165.
  • 36.Pedone D, Firnkes M, Rant U. Anal Chem. 2009;81(23):9689–9694. doi: 10.1021/ac901877z.
  • 37.Gu Z, Ying Y-L, Cao C, He P, Long Y-T. Anal Chem. 2015;87(2):907–913. doi: 10.1021/ac5028758.
  • 38.Raillon C, Granjon P, Graf M, Steinbock LJ, Radenovic A. Nanoscale. 2012;4(16):4916. doi: 10.1039/c2nr30951c.
  • 39.Gu Z, Ying Y-L, Cao C, He P, Long Y-T. Anal Chem. 2015;87:10653–10656. doi: 10.1021/ac5028758.
  • 40.Schreiber J, Karplus K. Bioinformatics. 2015;31(12):1897–1903. doi: 10.1093/bioinformatics/btv046.
  • 41.Balijepalli A, Ettedgui J, Cornio AT, Robertson J, Cheung KP, Kasianowicz JJ, Vaz C. ACS Nano. 2015;9(12):12583–12583. doi: 10.1021/acsnano.5b06216.
  • 42.Briggs K, Kwok H, Tabard-Cossa V. Small. 2014;10(10):2077–2086. doi: 10.1002/smll.201303602.
  • 43.White RJ, Ervin EN, Yang T, Chen X, Daniel S, Cremer PS, White HS. J Am Chem Soc. 2007;129(38):11766–11775. doi: 10.1021/ja073174q.
  • 44.Kumar S, Tao C, Chien M, Hellner B, Balijepalli A, Robertson JWF, Li Z, Russo JJ, Reiner JE, Kasianowicz JJ, Ju J. Sci Rep. 2012;2:684. doi: 10.1038/srep00684.
  • 45.Page ES. Biometrika. 1954;41(1/2):100.
  • 46.Cluzel P, Lebrun A, Heller C, Lavery R, Viovy JL, Chatenay D, Caron F. Science. 1996;271(5250):792–794. doi: 10.1126/science.271.5250.792.
  • 47.Strick TR, Allemand JF, Bensimon D, Bensimon A, Croquette V. Science. 1996;271(5257):1835–1837. doi: 10.1126/science.271.5257.1835.
  • 48.Newville M, Stensitzki T, Allen DB, Ingargiola A. LMFIT: non-linear least-square minimization and curve-fitting for Python; Zenodo. 2014 doi: 10.5281/zenodo.11813.
  • 49.Song LZ, Hobaugh MR, Shustak C, Cheley S, Bayley H, Gouaux JE. Science. 1996;274(5294):1859–1866. doi: 10.1126/science.274.5294.1859.
  • 50.Butler TZ, Gundlach JH, Troll M. Biophys J. 2007;93(9):3229–3240. doi: 10.1529/biophysj.107.107003.
  • 51.Butler TZ, Gundlach JH, Troll MA. Biophys J. 2006;90(1):190–199. doi: 10.1529/biophysj.105.068957.
  • 52.Mathé J, Aksimentiev A, Nelson DR, Schulten K, Meller A. Proc Natl Acad Sci U S A. 2005;102(35):12377–12382. doi: 10.1073/pnas.0502947102.
  • 53.Muzard J, Martinho M, Mathé J, Bockelmann U, Viasnoff V. Biophys J. 2010;98(10):2170–2178. doi: 10.1016/j.bpj.2010.01.041.
  • 54.Mathé J, Visram H, Viasnoff V, Rabin Y, Meller A. Biophys J. 2004;87(5):3205–3212. doi: 10.1529/biophysj.104.047274.
  • 55.Krasilnikov OV, Rodrigues CG, Bezrukov SM. Phys Rev Lett. 2006;97(1):018301. doi: 10.1103/PhysRevLett.97.018301.
  • 56.Bezrukov SM, Vodyanoy I, Brutyan RA, Kasianowicz JJ. Macromolecules. 1996;29(26):8517–8522.
