2023 May 9:1–15. Online ahead of print. doi: 10.1038/s41576-023-00600-1

Beyond assembly: the increasing flexibility of single-molecule sequencing technology

Paul W Hook, Winston Timp
PMCID: PMC10169143  PMID: 37161088

Abstract

The maturation of high-throughput short-read sequencing technology over the past two decades has shaped the way genomes are studied. Recently, single-molecule, long-read sequencing has emerged as an essential tool in deciphering genome structure and function, including filling gaps in the human reference genome, measuring the epigenome and characterizing splicing variants in the transcriptome. With recent technological developments, these single-molecule technologies have moved beyond genome assembly and are being used in a variety of ways, including to selectively sequence specific loci with long reads, measure chromatin state and protein–DNA binding in order to investigate the dynamics of gene regulation, and rapidly determine copy number variation. These increasingly flexible uses of single-molecule technologies highlight a young and fast-moving part of the field that is leading to a more accessible era of nucleic acid sequencing.

Subject terms: DNA sequencing, Epigenomics, Epigenetics, Genomics


Hook and Timp describe increasingly flexible ways in which single-molecule sequencing technologies are being used to analyse genomes. Examples include targeted genome sequencing, analysis of chromatin state and protein–DNA interactions, and sequencing of short reads.

Introduction

Since the beginning of the Human Genome Project in 1990, there has been a close pairing between technological innovation driving science and science demanding technological innovation. This drive led to next-generation, short-read sequencing methods dominating the field of nucleic acid sequencing (reviewed in ref. 1). However, short-read sequencing is fundamentally limited in read length (<1000 bp reported1) owing to cycle dephasing and the resulting drops in read quality over length2,3. By contrast, single-molecule sequencing methods, especially platforms from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), are not subject to this limitation and allow for the sequencing of long reads (>10 kb). Perhaps the most important difference between these platforms is that PacBio performs sequencing-by-synthesis whereas ONT uses a protein nanopore to characterize the molecule through electrolytic current modulation4. Though both technologies had initial issues with read accuracy (PacBio continuous long read accuracy 85–89%5; ONT R6 accuracy 67%6) and yield (PacBio RS II ~500–1000 Mb; ONT R6 yield ~250 Mb), these features have improved substantially over the past eight years. Both technologies can now achieve impressive accuracies — ~98% for ONT and 99% for PacBio4,7 — and an ONT PromethION device can generate in excess of 100 Gb per flow cell, whereas a PacBio Sequel II HiFi run can generate over 30 Gb4. These output levels put the cost per Gb of PacBio (US$65) and ONT (US$17) sequencing closer to that of short-read instruments such as the Illumina NovaSeq 6000 (US$6) (Supplementary Note).

Long reads have already changed the landscape of genomics, expanding our knowledge by exploring areas that were previously unattainable with short reads. Long reads allow for more complete genome assemblies8, highlighted by their use in the assembly of the first telomere-to-telomere human genome9. Many more structural variants and repetitive areas can be probed with long reads because of their ability to map through the variant10,11, leading to the use of long-read sequencing for surveying structural variants in human populations12,13. Single-molecule sequencing even allows for native measurement of DNA methylation14, including in previously inaccessible regions such as centromeres15,16. Aside from DNA, long reads have also been used to explore RNA, providing information about full-length transcript isoforms including allele-specific expression, poly(A) tail length and RNA modifications17–19.

The increasing accuracy and affordability of single-molecule, long-read sequencing has resulted in the accelerated development of methods that apply it to new problems in biology. Here, we review a selection of emerging methods and applications using commercially available single-molecule platforms. First, we review methods used for targeted sequencing of long reads, which harness the advantages of long-read sequencing without the need for whole-genome sequencing, thereby improving coverage and affordability. Next, we focus on assays for mapping protein–DNA interactions, which in addition to ascertaining information already revealed by short reads also provide previously unknown insights into genome organization. Last, we cover the sequencing of short reads with single-molecule platforms, a suite of methods that seek to increase the accessibility of sequencing and the amount of information that can be gained from a single sequencing run.

Insights without whole-genome sequencing

Costs for whole-genome sequencing have dropped substantially during the past decade, but even with the lower cost there are biological questions for which focused, high-depth sequencing is needed. For example, somatic variant calling and epigenetic sequencing of heterogeneous samples require high sequencing depth to enable low-frequency variants or rare epigenetic states to be measured with confidence. Alternatively, when sequencing large sample sets such as complex disease cohorts, cost per sample becomes an important factor. In these scenarios, depth or sample number may be more important than unbiased genome-wide analysis, so targeting specific regions can drive down cost. Specific regions of interest (for example, promoters or exons of protein-coding genes) can be selectively targeted for sequencing. Such targeted sequencing methods, including PCR amplicon sequencing and hybridization capture, have been extensively used in concert with short-read sequencing. These same methods have been adapted for long-read sequencing, in addition to the emergence of novel methods taking advantage of the PacBio and ONT platforms.
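The link between variant frequency and required depth can be illustrated with a small calculation. The sketch below assumes a simple binomial model in which each read independently carries the variant with probability equal to its allele fraction, and it ignores sequencing errors and mapping bias. The function names, the threshold of five supporting reads and the 95% confidence level are illustrative assumptions, not values taken from the studies cited here.

```python
from math import comb


def prob_at_least_k(depth: int, allele_fraction: float, k: int) -> float:
    """Probability of observing at least k variant-supporting reads.

    Assumes a binomial model: each of `depth` reads independently carries
    the variant with probability `allele_fraction`.
    """
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


def min_depth(allele_fraction: float, k: int = 5, confidence: float = 0.95,
              max_depth: int = 100_000):
    """Smallest depth at which at least k variant reads are seen with the
    requested confidence, or None if max_depth is not enough."""
    for depth in range(k, max_depth + 1):
        if prob_at_least_k(depth, allele_fraction, k) >= confidence:
            return depth
    return None


if __name__ == "__main__":
    # Hypothetical allele fractions, chosen only to show the trend.
    for fraction in (0.5, 0.1, 0.01):
        print(fraction, min_depth(fraction))
```

Under this model the required depth grows steeply as the allele fraction falls, which is the situation in which concentrating reads on a few loci becomes attractive.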

PCR enrichment

PCR enrichment, also known as amplicon sequencing, allows for targeted sequencing by simply designing primers flanking regions of interest. PCR enrichment is a mature method with low DNA input requirements and low hands-on time, which enables multiplexing of as many as 24,000 amplicons in one reaction with carefully designed commercial primer panels (Ion AmpliSeq assays20). Overlapping amplicons can be tiled across regions much longer than the amplicon length, with a recent example targeting genomic regions >40 kb21. PCR enrichment can be adapted to long-read sequencing (Fig. 1a) owing in part to the commercial availability of DNA polymerases that can amplify amplicons greater than 10 kb22,23. However, as the length of an amplicon increases, PCR becomes less efficient and requires optimization for each new reaction24. Amplicons greater than 7 kb and long amplicons with high GC content are difficult to consistently amplify25. PCR can also introduce errors (mainly substitutions)25, which can be an issue when probing rare mutations26. Amplifying DNA with PCR erases native DNA modifications, eliminating one of the key advantages of single-molecule platforms (Table 1). Notably, amplicon approaches often require sets of primers to be split into multiple pools owing to possible interactions between primer pairs, thus requiring multiple, optimized PCRs. This makes scaling PCR amplicons to multiple regions difficult. This is especially true for schemes that attempt to tile overlapping amplicons across large regions, as demonstrated in peer-reviewed and preprint studies21,27. Despite these caveats, amplicon sequencing has been used with ONT to detect structural variant frequency in genes frequently mutated in pancreatic cancer (CDKN2A and SMAD4)28 and with PacBio to identify disease-causing variants in a gene frequently mutated in autosomal-dominant polycystic kidney disease (PKD1)29. 
Outside human genetics, as demonstrated in both peer-reviewed30 and preprint27 articles, tiled amplicons have been used for low-cost, portable, infectious disease outbreak monitoring with ONT for a host of viruses including Zika30, Ebola31 and SARS-CoV-227, underscoring the utility of this method (Table 1).
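The tiling idea described above can be sketched in a few lines. This is a minimal illustration of laying overlapping amplicons across a region and alternating them between two primer pools so that neighbouring, overlapping amplicons are never amplified in the same reaction. The two-pool layout, the function name and the example coordinates are assumptions made for illustration; real panel design also has to account for primer melting temperature, primer–primer interactions and sequence variation, none of which is modelled here.

```python
def tile_amplicons(start: int, end: int, amplicon_len: int, overlap: int):
    """Tile overlapping amplicons across [start, end) and assign pools.

    Amplicons alternate between pool 1 and pool 2. Requiring the overlap to
    be at most half the amplicon length guarantees that amplicons within
    the same pool do not overlap each other.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if overlap < 0 or 2 * overlap > amplicon_len:
        raise ValueError("overlap must be between 0 and amplicon_len / 2")

    step = amplicon_len - overlap
    amplicons = []
    pos, index = start, 0
    while True:
        amp_end = min(pos + amplicon_len, end)
        amplicons.append({"start": pos, "end": amp_end, "pool": 1 + index % 2})
        if amp_end >= end:
            break
        pos += step
        index += 1
    return amplicons


if __name__ == "__main__":
    # Hypothetical 10-kb region covered by 2.5-kb amplicons with 500-bp overlaps.
    for amplicon in tile_amplicons(0, 10_000, 2_500, 500):
        print(amplicon)
```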

Fig. 1. Long-read targeted sequencing methods.

Long-read targeted enrichment methods fall within broad categories including PCR enrichment, hybridization capture, Cas-mediated enrichment and adaptive sampling. a, PCR enrichment uses specific primers to amplify regions of interest before library preparation. b, Hybridization capture uses biotinylated antisense probes designed against regions of interest to isolate DNA fragments containing the targets. PCR and hybridization capture enrichment methods are both commonly used with short-read sequencing and have been adapted to long-read sequencing. c, Cas-mediated enrichment uses Cas ribonuclear complexes (most commonly Cas9) to cut on either side of regions of interest. Cut fragments are selectively sequenced owing to preferential adapter ligation to the freshly cut ends55. Targeted fragments can be further enriched through depletion of off-target fragments56–58. d, Enrichment using adaptive sampling is a nanopore sequencing method in which regions of interest are selectively sequenced by controlling the voltage at individual pores to eject unwanted fragments. ONT, Oxford Nanopore Technologies.

Table 1.

Summary of long-read DNA enrichment methods

| Method | Typical read length | Typical coverage | Example of percent reads on-target | Example of number of targets | Platform | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PCR | 7 kb | 50–1,000× | >96% (ref. 141) | 19 amplicons (ref. 31) | ONT, PacBio | Simple design; High enrichment; Low hands-on time; Easy to multiplex samples; Easy to multiplex targets; Sensitive; Low input | Erases native DNA modifications; Limited fragment length; Lengthy optimization may be required; Introduction of PCR errors; Multiple reactions often needed |
| Hybridization capture | 5 kb | 200–1,000× | 66.3% (ref. 42) | 4,800 genes (ref. 34) | ONT, PacBio | Simple design; High enrichment; Easy to multiplex targets; Easy to multiplex samples; High scalability; Single reaction needed | Erases native DNA modifications; Limited fragment length; Laborious protocols with high hands-on time; Includes multiple PCR steps |
| Cas-mediated enrichment | Up to 100 kb | 50–1,000× | 4.6% (ref. 55) | 10 genes (ref. 55) | ONT, PacBio | Preserves DNA modifications; Long read lengths; No PCR | High DNA input; Less ability to multiplex |
| Adaptive sampling | 10–20 kb | 15–40× | ~5% (ref. 68) | 717 genes (ref. 68) | ONT | Simple design; Preserves DNA modifications; Long read lengths; No PCR; No additional molecular biology steps besides library prep | Low enrichment; Limited to ONT nanopore sequencing; Multiple sequencing runs to obtain maximum output; Computationally intensive |

ONT, Oxford Nanopore Technologies; PacBio, Pacific Biosciences.

Hybridization capture sequencing

Hybridization capture sequencing uses tagged, antisense oligonucleotide probes against regions of interest. Genomic DNA is denatured using a combination of heat and chemical methods, probes are hybridized against it, probe-bound DNA is captured and unbound DNA is washed away32 (Fig. 1b). This method can be more easily scaled than PCR amplicons and often only requires one reaction, though probes are expensive and the resulting on-target rate tends to be lower (Table 1). Hybridization capture probes can also be used to enrich across large, contiguous target regions (for example, ~750,000 bp33) by tiling probes across the region in one reaction. Multiple separate locations are easily targeted, as exemplified by a study targeting 4,800 genes simultaneously with nanopore sequencing (Table 1), even though reads were only ~1,000 bp34. Though long-read hybridization capture methods have been applied successfully even in human cohorts to resolve complex structural variants leading to disease35–38, they have key limitations (Table 1). The lengths of sequenced fragments are typically shorter than those in the original library, suggesting bias towards shorter fragments38. This observation has been consistent across long-read hybridization capture experiments37,39–42 and is attributed to the hybridization capture step41. We and others have found large fragments more difficult to capture, with the most efficient capture size found to be about 5 kb43–45. As with PCR amplicons, amplification (pre-capture or post-capture) can lead to errors in reads; for example, errors in AT-rich regions led to gaps in assembled haplotypes of a complex genomic region containing the natural killer-cell immunoglobulin-like receptor (KIR) gene family46. Hybridization capture is a lengthy protocol (often >3 days42) independent of the long-read platform used, though automation and high throughput (96 samples) are possible with liquid-handling robotics.
Despite these limitations, hybridization capture can produce deep on-target coverage with one study reporting 1099-fold enrichment from a single run on an ONT MinION device37.

Cas-mediated enrichment

Though powerful, amplicon and hybridization capture have key limitations in read length and maintenance of modification state: to fully capitalize on the potential of single-molecule targeted sequencing, methods need to be designed from the ground up with this in mind. A bacterial defence system, clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins, though primarily used for genome editing47, can be adapted to enrich long fragments (Table 1). In Cas-mediated enrichment, the CRISPR–Cas system is used to induce double-stranded breaks flanking the regions of interest, which produces long fragments with ends amenable to downstream applications. Initially used to clone large fragments48,49, Cas9-assisted targeting of chromosome segments (CATCH) was adapted so that the cut fragments were instead gel-isolated by size and sequenced on an ONT MinION flow cell, achieving ~25–70× mean coverage tiled across a 200-kb region encompassing the hereditary cancer gene BRCA1 (ref. 50). Unfortunately, so little DNA was recovered after gel isolation that amplification was required, removing native DNA modifications and resulting in read lengths less than 5 kb50.

Subsequent methods have instead used preferential ligation at freshly cut sites flanking the regions of interest to remove the size selection step and have been used with both PacBio51,52 and ONT sequencing53–58. Typically in these approaches, Cas cleavage occurs before library preparation and the first step is to passivate existing DNA ends by dephosphorylating them, which prevents random ligation. DNA is then cut by a Cas protein–guide RNA complex, either on one side or flanking a region of interest, to create 5′ phosphorylated ends. Sequencing adapters are then ligated to the freshly cut and phosphorylated sites to enable selective sequencing of fragments containing the area of interest (Fig. 1c). Exemplifying this strategy is nanopore Cas9-targeted sequencing (nCATS), which achieved up to 1,000× coverage at loci on an ONT MinION sequencer55. However, without multiplexing, only a fraction of the flow cell capacity is used in this method because of the low molarity of resulting library molecules55 (Table 1). Furthermore, this method seems to work best when two cut sites are generated. Additionally, obtaining read lengths greater than 50 kb was difficult, which may be attributed to the isolation of fragmented DNA during purification55. This affects the ability to obtain single reads that span larger regions.

Additional methods have been developed in an attempt to improve upon these caveats. For example, the affinity-based Cas9-mediated enrichment method (ACME) removes non-target fragments (increasing the molarity of library molecules) via bead-based pulldown of a His-tagged Cas9, which remains bound to non-target fragments after cutting56. Data presented in a preprint article demonstrated that ACME excelled in enriching for single reads spanning the entire length of large target regions (~100 kb)56. Cas-mediated enrichment has also been demonstrated on a completed PacBio sequencing library. As presented in a preprint article from 2017, a special capture adapter can be ligated to cut sites after Cas-mediated digestion51, allowing for a bead-based pull-down enrichment approach similar to ACME. This optimized PacBio approach was able to achieve 9% on-target reads, greater than reported with ACME (<1%)51,56. Alternatively, exonucleases can be used to digest off-target fragments, as in Cas9-based background elimination (CaBagE)57, Negative Enrichment58 and PacBio No-Amp52. These exonuclease-based methods can produce high coverage at target loci (~400× for small targets) with a high percentage of reads spanning the entire target region57. Furthermore, as shown in both published and preprint work, the size of target regions can be increased by tiling guide RNAs across a region59,60, similar to tiling methods used with PCR amplicons or hybridization capture. By using a pool of in vitro transcribed guide RNAs tiled across the region, a recent preprint study demonstrated the ability to enrich reads across a region as large as 9 Mb60.

Adaptive sampling

All the methods mentioned above include additional molecular biology steps involving targeted probes, primers or guide RNAs, which can add time and cost. An enrichment approach that does not include additional manipulations makes single-molecule targeted sequencing more accessible. Nanopore sequencing offers a unique opportunity in this regard: as the molecule is sequenced, a decision can be made to eject the molecule by flipping the voltage if the data do not match a database of targets, a process called adaptive sampling (Fig. 1d). Initially, adaptive sampling was implemented by matching the real-time electrical signal to a reference genome using dynamic time warping with the ‘Read Until’ approach61, but was limited to small reference genomes. As a result, improved algorithms for mapping electrical signal were developed62–67, exemplified by UNCALLED, which demonstrated real-time enrichment of 148 human cancer genes with an average coverage of ~30× (5.5-fold enrichment over non-enriched) using an ONT MinION flow cell62 (Fig. 1d). Alternatively, improvements to the speed of the basecaller enabled the development of tools that align basecalled reads against a reference to decide whether or not a molecule should be sequenced68–70. These tools are exemplified by readfish, which demonstrated enrichment of the genomic sequence of ~700 genes associated with human cancer (~30× mean coverage)68. A version of these sequence-based methods has been directly incorporated into the ONT sequencing software (MinKNOW), making it easy for end-users to employ.
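The per-molecule decision at the heart of the sequence-based approach can be sketched as follows. This is a simplified illustration, not the readfish or MinKNOW implementation: it assumes that an upstream basecaller and aligner have already mapped the first chunk of a read to a contig and position (or failed to map it), and it only shows the lookup against a target list and the resulting action. The class and function names, the `wait` action for unmapped chunks and the `max_chunks` limit are hypothetical choices made for this example.

```python
from bisect import bisect_right


class TargetIndex:
    """Lookup of positions against non-overlapping target intervals."""

    def __init__(self, targets):
        # targets: dict mapping contig -> list of (start, end), half-open,
        # assumed not to overlap within a contig.
        self._intervals = {c: sorted(ivs) for c, ivs in targets.items()}
        self._starts = {c: [s for s, _ in ivs] for c, ivs in self._intervals.items()}

    def contains(self, contig: str, pos: int) -> bool:
        intervals = self._intervals.get(contig)
        if not intervals:
            return False
        i = bisect_right(self._starts[contig], pos) - 1
        return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def decide(index: TargetIndex, mapping, chunks_seen: int, max_chunks: int = 4) -> str:
    """Return 'sequence', 'unblock' (eject the molecule) or 'wait'.

    mapping is (contig, position) for the first part of the read, or None
    if the chunk could not be aligned yet.
    """
    if mapping is None:
        # Assumption: give unmapped reads a few more chunks, then eject them.
        return "unblock" if chunks_seen >= max_chunks else "wait"
    contig, pos = mapping
    return "sequence" if index.contains(contig, pos) else "unblock"


if __name__ == "__main__":
    # Hypothetical target region on a hypothetical contig.
    index = TargetIndex({"chrA": [(1_000_000, 1_200_000)]})
    print(decide(index, ("chrA", 1_100_000), chunks_seen=1))
    print(decide(index, ("chrB", 5_000), chunks_seen=1))
```

In practice the decision has to be returned within the time it takes the pore to read a few hundred bases, which is why the speed of mapping or basecalling determined which reference sizes were feasible.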

Compared to other methods, adaptive sampling can target large regions of interest without additional expense or optimization of primers, probes or guide RNAs. Even entire human chromosomes can be targeted68, which can be ideal for biological questions such as exploring putative X chromosome-linked disorders. However, in order to achieve enrichment, sequenced fragments must be a sufficient length (>5 kb)71; the longer the ‘rejected’ molecule, the more time is saved by not sequencing it and hence the higher the enrichment of ‘accepted’ sequences. Best results are typically achieved for fragment sizes >10 kb62,68,72. Samples with damaged DNA (for example, formalin-fixed, paraffin-embedded tissue) typically have DNA lengths below this threshold, which may hinder their use with adaptive sampling. Finally, targeting either too low a percentage (<1%) or too high a percentage (>10%) of the genome will also lead to less enrichment: if too much time or not enough time is spent rejecting molecules, the resulting on-target sequence yield will not be sufficient.
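The dependence of enrichment on fragment length and target size can be made concrete with a toy time-budget model. The sketch below assumes that every pore alternates between capturing a molecule and reading it, that on-target molecules are read in full and that off-target molecules are ejected after a fixed number of bases. The default translocation speed, decision length and capture time are illustrative assumptions rather than measured values, and the model deliberately ignores pore blockage and loss of active pores, so it does not reproduce the penalty for targeting very small fractions of the genome and should be read as an upper bound on the trend.

```python
def expected_enrichment(read_len: float, target_fraction: float,
                        decision_bases: float = 500.0,
                        bases_per_second: float = 400.0,
                        capture_seconds: float = 1.0) -> float:
    """Ratio of on-target yield with adaptive sampling to yield without it.

    Without adaptive sampling every molecule occupies a pore for
    read_len / speed + capture time. With adaptive sampling, off-target
    molecules only occupy it for decision_bases / speed + capture time.
    """
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    full = read_len / bases_per_second + capture_seconds
    rejected = min(decision_bases, read_len) / bases_per_second + capture_seconds
    mean_time_with_sampling = (target_fraction * full
                               + (1.0 - target_fraction) * rejected)
    return full / mean_time_with_sampling


if __name__ == "__main__":
    # Compare short and long fragments for a hypothetical 5% target.
    for length in (2_000, 5_000, 10_000, 20_000):
        print(length, round(expected_enrichment(length, 0.05), 2))
```

Because the time saved per rejected molecule scales with its length, the model returns no enrichment when fragments are shorter than the decision length and progressively more as fragments get longer, and enrichment shrinks as the targeted fraction grows.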

Though easy to use, adaptive sampling methods result in lower coverage and a lower percentage of on-target reads than other enrichment methods (Table 1). Encouragingly, data presented in a recent preprint article demonstrated that readfish multiplexed sequencing on the ONT PromethION flow cell yielded 25–50× coverage for three human samples (5–6× enrichment over theoretical whole-genome sequencing), further reducing cost and indicating that higher depth is achievable72. Currently, adaptive sampling requires relatively substantial computational resources, including access to graphics processing units (NVIDIA 2060 series or better with CUDA capability) or powerful central processing units to achieve the analysis speed needed for enrichment. Finally, pores become inactive more quickly during adaptive sampling than during standard nanopore sequencing runs, possibly owing to DNA blockages62. Maximum output can be achieved by performing a nuclease flush of the flow cells to remove blockages and a reload of the flow cell with fresh library62,68,72, but this increases the amount of DNA, reagents and hands-on time required for these experiments.

Additional methods

There are other approaches for long-read enrichment that do not fit into the above categories. For example, Xdrop partitions long DNA molecules into droplets with locus-specific primers, followed by droplet digital PCR. Droplets containing the loci of interest are isolated with flow sorting, and DNA is amplified73. This amplified DNA can then be sequenced with short-read or long-read platforms. This method requires a specialized microfluidic apparatus whereas the methods described above need only standard molecular biology tools.

Mapping protein–DNA interactions

For decades, researchers have tried to understand not just the sequence of DNA, but how DNA is organized within the nucleus and how that organization affects cellular function, development, gene regulation and disease (reviewed in ref. 74). State-of-the-art genomics methods including microarrays and next-generation sequencing have been leveraged to study chromatin state and protein–DNA binding (reviewed in refs. 75–77), even down to the single-cell level (reviewed in ref. 78). Most of these assays rely on PCR enrichment for states of interest (such as open chromatin or bound protein), requiring input controls to correct for PCR bias and thereby making quantification difficult. These methods also typically fragment the DNA to small sizes to provide resolution, making it impossible to study the coordination of chromatin states at adjacent loci on the same single molecule of DNA. Short reads also make it difficult to assign reads to haplotypes given the infrequency of variants on short fragments. As emphasized above, PCR erases native DNA modifications, making additional steps necessary in order to measure methylation and protein–DNA interactions or chromatin state simultaneously79–81.

Specific short-read methods using methyltransferase footprinting have set the stage for long-read approaches to explore protein–DNA binding. Emerging from the observation that methyltransferase enzymes preferentially label accessible DNA82, methyltransferase footprinting assays were developed to measure nucleosome positioning and protein–DNA interactions83–86. Such assays can even determine protein binding through the protection from labelling; though the identity of the protein is not known, it can be inferred from the size of the protected areas (nucleosomes) or motifs in the protected areas87. Chemical bisulfite conversion of unmethylated bases followed by next-generation sequencing allowed these footprinting assays to be applied to panels of promoters88, to genome-wide footprinting89, and down to single molecules with short reads90. These methods have now been combined with single-molecule platforms to begin to probe unknown aspects of gene regulation (Fig. 2).

Fig. 2. Long-read, single-molecule methyltransferase footprinting methods can reveal heterogeneity and coordination of chromatin states.

a, In methyltransferase footprinting assays, a methyltransferase enzyme deposits exogenous methylation on accessible DNA, which may include linker DNA between histones, open chromatin regions or regions surrounding transcription factors bound to DNA. b, When this exogenous labelling is performed on long, single molecules, the heterogeneity of nucleosome positioning, open or closed chromatin and protein–DNA binding can be measured on single molecules. c, With long molecules that span multiple regulatory elements, the coordination between adjacent sites can be measured, potentially revealing unknown aspects of gene regulation. d, Antibody-directed methyltransferase labelling builds on methyltransferase footprinting by concentrating labelling around binding sites of specific proteins. The methyltransferase is fused to protein A, protein G or both, which bind to IgG antibodies.

Measuring chromatin accessibility with methyltransferase footprinting

Three methods have been developed that combine 5-methylcytosine (5mC) labelling with ONT sequencing to assay nucleosome positioning and open chromatin (Table 2). Two methods focused on yeast: one measured nucleosome positioning, with methyltransferase treatment followed by single-molecule long-read sequencing (MeSMLR-seq) using the GpC methyltransferase M.CviPI91; the other measured nucleosome occupancy via DNA methylation and high-throughput sequencing (ODM-seq) using both M.CviPI and the CpG methyltransferase M.SssI92. These methods were shown to correlate well with micrococcal nuclease (MNase) digestion sequencing (MNase-seq), a classic method for measuring nucleosome positioning. Using MeSMLR-seq data, over 300 inferred nucleosomes were phased on a single read and it was found that the number of molecules with open chromatin at a given promoter correlates with the expression of its corresponding gene91. ODM-seq estimated the number of nucleosomes across the entire genome in a yeast cell and quantified protein binding in nucleosome-free regions92. Methyltransferase footprinting has also been applied to human samples. Nanopore sequencing of nucleosome occupancy and methylome (nanoNOMe), adapted from NOMe-seq89, used M.CviPI to simultaneously call accessible chromatin (GC 5mC) and native CpG methylation, allowing for footprinting of proteins bound to DNA in bulk and on single reads93. NanoNOMe made use of the advantages of long reads by exploring chromatin state in repetitive elements and phasing reads to measure allele-specific chromatin accessibility and CpG methylation93. In particular, nanoNOMe was able to quantitatively examine protein binding at known motifs, such as CTCF sites, by examining the inferred footprint at these locations. Unsurprisingly, this revealed that traditional chromatin immunoprecipitation followed by sequencing (ChIP–seq) methods are semi-quantitative and that a ChIP–seq peak can represent a large range of fractional binding states. 
Later work combining nanoNOMe with Cas-mediated enrichment for higher depth found that different CTCF-binding sites have very different percentages of reads (5–70%) supporting CTCF binding94.
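The way a footprint and a fractional binding state can be read off single molecules is sketched below. This is a simplified illustration and not the nanoNOMe analysis pipeline: it assumes that each read has already been reduced to a list of labelling-site positions with a binary methylation call, treats any sufficiently long run of unmethylated (protected) sites as a footprint, and counts the fraction of reads whose footprint spans a motif. The function names, the 20-bp minimum footprint length and the example reads are hypothetical.

```python
def call_footprints(calls, min_len: int = 20):
    """Return (start, end) spans of consecutive unmethylated sites.

    calls: iterable of (position, is_methylated) for one read, where a
    methylated site marks accessible DNA and an unmethylated site marks
    DNA protected from the exogenous methyltransferase.
    """
    footprints = []
    run_start = run_end = None
    for pos, methylated in sorted(calls):
        if not methylated:
            if run_start is None:
                run_start = pos
            run_end = pos
            continue
        if run_start is not None and run_end - run_start + 1 >= min_len:
            footprints.append((run_start, run_end))
        run_start = run_end = None
    if run_start is not None and run_end - run_start + 1 >= min_len:
        footprints.append((run_start, run_end))
    return footprints


def fraction_bound(reads, motif_start: int, motif_end: int, min_len: int = 20):
    """Fraction of reads covering the motif that show a footprint over it."""
    covering = bound = 0
    for calls in reads:
        positions = [pos for pos, _ in calls]
        if not positions or min(positions) > motif_start or max(positions) < motif_end:
            continue  # read does not span the motif
        covering += 1
        if any(start <= motif_start and end >= motif_end
               for start, end in call_footprints(calls, min_len)):
            bound += 1
    return bound / covering if covering else None


if __name__ == "__main__":
    # Two hypothetical reads over the same motif at positions 20-30.
    protected = [(0, True), (10, False), (25, False), (40, False), (55, True)]
    accessible = [(0, True), (10, True), (25, True), (40, True), (55, True)]
    print(fraction_bound([protected, accessible], 20, 30))
```

Counting bound and unbound molecules directly is what makes the single-molecule readout quantitative in a way that an enrichment-based peak is not.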

Table 2.

Summary of long-read footprinting assays

| Method | Enzymes used | Theoretical resolution in humans (a) | Sequencing platform | Demonstrated uses |
| --- | --- | --- | --- | --- |
| Methyltransferase treatment followed by single-molecule long-read sequencing (MeSMLR-seq) (ref. 91) | M.CviPI (5mC GC) | Medium (>20 bp) | ONT | Nucleosome positioning, chromatin accessibility, chromatin coordination on single molecules |
| Occupancy measurement via DNA methylation and high-throughput sequencing (ODM-seq) (ref. 92) | M.CviPI (5mC GC), M.SssI (5mC CG) | Medium (>20 bp) | ONT | Nucleosome positioning, chromatin accessibility, transcription factor footprinting |
| Nanopore sequencing of nucleosome occupancy and methylome (nanoNOMe) (ref. 93) | M.CviPI (5mC GC) | Medium (>20 bp) | ONT | Nucleosome positioning, chromatin accessibility, transcription factor footprinting |
| Single-molecule long-read accessible chromatin mapping sequencing assay (SMAC-seq) (ref. 96) | M.CviPI (5mC GC), M.SssI (5mC CG), EcoGII (m6dA on all A) | High (<5 bp) | ONT | Nucleosome positioning, chromatin accessibility, transcription factor footprinting, chromatin coordination on single molecules |
| Fiber-seq (ref. 97) | Hia5 (m6dA on all A) | High (<5 bp) | PacBio | Nucleosome positioning, chromatin accessibility, transcription factor footprinting, chromatin coordination on single molecules |
| Single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA) (ref. 99) | EcoGII (m6dA on all A), MNase | High (<5 bp) | PacBio | Oligonucleosome patterns, chromatin accessibility, transcription factor footprinting |
| Tagmentation-assisted single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA-Tag) (ref. 100) | EcoGII (m6dA on all A), Tn5 transposase | High (<5 bp) | PacBio | Nucleosome positioning, chromatin accessibility, transcription factor footprinting, PacBio library preparation |
| Directed methylation with long-read sequencing (DiMeLo-seq) (ref. 104) | pA–Hia5 or pAG–Hia5 | Low (within 200 bp) | PacBio, ONT | Protein–DNA interactions, coordination between binding sites on single molecules |
| BIND&MODIFY (ref. 105) | pA–EcoGII (m6dA on all A) | Low (within 100 bp) | ONT | Measuring protein–DNA interactions |

5mC, 5-methylcytosine; m6dA, N6-methyladenine; MNase, micrococcal nuclease; ONT, Oxford Nanopore Technologies; PacBio, Pacific Biosciences; pA–EcoGII, fusion of protein A and EcoGII; pAG–Hia5, fusion of proteins A and G with Hia5. (a) Estimates of theoretical resolution are based on analysis described previously96.

The absence of recognition motifs for these 5mC methyltransferases can limit their ability to label some parts of the genome, such as AT-rich regions. Thus, other methods have leveraged N6-methyladenine (m6dA, also known as 6mA) methyltransferases (Table 2) for labelling, as m6dA is either absent from or present only at low levels in the genomes of eukaryotes95. The single-molecule long-read accessible chromatin mapping sequencing assay (SMAC-seq) uses a combination of methyltransferases (including M.CviPI, M.SssI and EcoGII (m6dA on all adenines)) to achieve high-resolution (<5 bp) mapping in order to study chromatin states and the coordination of regulatory elements on single molecules using ONT nanopore sequencing96 (Fig. 2). Fiber-seq used the Hia5 methyltransferase (m6dA on all adenines) with readout from PacBio sequencing97. Both methods were developed using model organisms with small genomes: SMAC-seq was developed using yeast and Fiber-seq using the Drosophila melanogaster S2 cell line. Both showed high correlation with existing open-chromatin data and the ability to study the coordination of chromatin state between adjacent regulatory sites (Fig. 2). More recently, a preprint article has described the use of Fiber-seq in human samples, leveraging improvements in single-molecule yield to profile the chromatin state of telomeres98.

Methyltransferase labelling has been further extended by combining it with other methods that can reveal protein–DNA interactions (Table 2). The single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA) combines EcoGII-mediated m6dA labelling with MNase digestion99, which targets reads to accessible regions. Footprinting information can be obtained both from the molecule ends and from m6dA labelling. A recent preprint described tagmentation-assisted SAMOSA (SAMOSA-Tag)100 in which the MNase is replaced with Tn5 transposase, commonly used in the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and cleavage under targets and tagmentation (CUT&Tag)101. Importantly, the authors demonstrate identification of m6dA labelling and native 5mC CpG modifications, showing that SAMOSA-Tag can assay protein–DNA interactions, epigenetic modifications and primary DNA sequence simultaneously with PacBio sequencing.

Directly mapping protein–DNA interactions

In an extension of footprinting, m6dA labelling has been used within the framework of cleavage under targets & release using nuclease (CUT&RUN) and CUT&Tag methods101,102 to directly measure interactions between specific proteins and DNA (Table 2). In these approaches, a protein of interest is bound by specific antibodies (Fig. 2d). These antibodies are bound by bacterial proteins that bind tightly to IgG (protein A, protein G or both)103 fused to methyltransferases, thereby concentrating methyltransferase activity (and m6dA labelling with S-adenosylmethionine) around protein binding sites (Fig. 2d). This approach has been implemented for Hia5 (ref. 104) and EcoGII105 and can map protein–DNA binding with a resolution of 100–200 bp. Directed methylation with long-read sequencing (DiMeLo-seq) uses Hia5 and is the most extensively tested and optimized approach: it has been used to measure protein–DNA interactions across repetitive regions of the genome, study the coordination and heterogeneity of adjacent binding sites and phase reads to study allele-specific protein–DNA binding104.

Although single-molecule approaches for measuring protein–DNA binding unlock the ability to explore previously intractable biological questions, the way the interactions are measured is fundamentally different from established short-read methods (such as ChIP-seq, CUT&RUN and CUT&Tag). Short-read methods enrich bound regions, producing peaks of enrichment that cover a small percentage of the genome (<10%106,107) but often contain >50% of sequenced reads (the so-called fraction of reads in peaks)108. By contrast, the single-molecule methods discussed above have no built-in enrichment step, and although this makes them more quantitative and removes bias, it also requires whole-genome sequencing in order to obtain the same genome-wide signal. Fortunately, recent efforts have shown that these labelling techniques can be combined with enrichment methods for long reads94,104, allowing cost-effective profiling.
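The fraction of reads in peaks referred to above is a simple ratio, and a minimal Python sketch may make the contrast with unenriched single-molecule data more concrete. The sketch assumes that each read has been reduced to a single (chromosome, position) coordinate and that peaks are non-overlapping, half-open intervals; the function name and input layout are illustrative only and are not taken from any of the cited tools.

```python
from bisect import bisect_right


def fraction_of_reads_in_peaks(read_positions, peaks):
    """Return the fraction of reads whose coordinate falls inside a peak.

    read_positions: list of (chrom, pos) tuples, one per read (assumed layout).
    peaks: list of (chrom, start, end) half-open intervals, assumed not to overlap.
    """
    # Index peaks by chromosome and sort them by start coordinate.
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    in_peaks = 0
    for chrom, pos in read_positions:
        intervals = by_chrom.get(chrom, [])
        # Locate the last peak that starts at or before this position.
        i = bisect_right(intervals, (pos, float("inf"))) - 1
        if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
            in_peaks += 1
    return in_peaks / len(read_positions) if read_positions else 0.0


example_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
example_reads = [("chr1", 150), ("chr1", 250), ("chr1", 500), ("chr2", 150)]
frip = fraction_of_reads_in_peaks(example_reads, example_peaks)
print(frip)  # 0.5
```

For an enrichment-based assay this value is often above 0.5, whereas for an unenriched whole-genome library it would be expected to sit much closer to the fraction of the genome covered by peaks.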

Measuring chromosome conformation

Moving to a larger scale, there is an interplay between DNA methylation, chromatin state, protein–DNA interactions and DNA organization in the nucleus. The three-dimensional organization of the genome plays a critical role in gene regulation, development and human disease (reviewed in refs. 109,110). Primary methods used to measure three-dimensional organization rely on proximity ligation and are known as chromatin conformation capture (3C) assays (reviewed in ref. 111). Most of these methods measure pairwise interactions with short-read sequencing and fail to capture information about potential cooperation between multiple loci112. Although methods that do not rely on proximity ligation make it possible to measure multi-way contacts113, long-read sequencing platforms have the potential to read long fragments from 3C-based experiments that represent multi-way interactions and have been employed in a variety of methods. PacBio sequencing was initially employed by a method measuring chromosomal walks in which 3C DNA was directly sequenced114. However, the long-read data were mostly used to validate short-read data, the reads were not very long (<8 kb) and the data produced represented <0.5× coverage of the mouse and human genomes, limiting what information could be gleaned114. Multi-contact circular chromosome conformation capture (MC-4C) employed circular chromosome conformation capture combined with Cas9 targeting to measure all interactions at one locus (a so-called ‘one versus all approach’) with ONT sequencing115,116. Again, the average sequenced read size was not very long (~2 kb), owing in part to the use of PCR, with most reads measuring three-way or four-way contacts and some measuring ten contacts115. Genome-wide methods such as multi-contact 3C (MC-3C)117 and Pore-C118 do not employ PCR and are ‘all versus all’ methods (that is, all contacts at all loci are measured) like Hi-C and chromosomal walks. MC-3C used PacBio, whereas Pore-C used ONT. 
Of these two methods, the data from Pore-C best demonstrate the potential of these approaches owing to extremely deep sequencing (up to >132× genome coverage)118. With high-depth data, the authors were able to explore CpG methylation on haplotype-specific, multi-way interactions on single molecules. In a good example of how quickly this area is moving, Pore-C has already been modified to reduce cost and improve throughput with a method termed high-throughput Pore-C (HiPore-C)119.

Short reads on single-molecule platforms

Although single-molecule sequencing typically emphasizes read length, both PacBio and ONT technologies can sequence short nucleic acid fragments. Despite Illumina (and other short-read sequencers) dominating the short-read sequencing field, approaches that sequence short reads on ONT and PacBio have gained traction. The portability, low physical footprint and ability to analyse sequencing data in real-time make ONT sequencing devices ripe for use with short reads directly at the bench or in the field, without the need for a sequencing core. Single-molecule sequencing can reduce cost as multiple types of -omics data (for example, methylation and genetic variation) can be gleaned from a single sequencing run. The increases in throughput and accuracy of these single-molecule platforms provide advantages that have made them even more attractive for short-read sequencing. These advantages fall into the ‘iron triangle’ of project management: fast, good or cheap.

Fast: portability and speed

Recent attempts to detect chromosomal abnormalities by optimizing short-read sequencing on ONT highlight the advantage of the low cost and small size of the ONT sequencing devices, especially the ONT Flongle flow cells and ONT MinION flow cells. These aspects could make sequencing more accessible for environments with limited resources and bring these assays from centralized cores to the laboratory benchtop. Additionally, real-time sequencing with ONT enables rapid turnaround times compared to waiting for a completed sequencing run120,121. Chromosomal abnormalities, including aneuploidies and copy number variants (CNVs), play a role in human disease and are commonly screened for during pregnancy and in cancer (reviewed in refs. 122,123). Multiple studies have shown that short-read sequencing can be optimized for the portable ONT MinION device to detect aneuploidies124,125 and CNVs126,127. These approaches showed that sequencing libraries could be multiplexed, detected abnormalities were concordant with Illumina sequencing, only 0.5–2 million reads were required and sufficient reads could be obtained in under 3 hours (Fig. 3a). Additionally, similar CNV estimates were observed on the same sequencing device with short or long reads, underscoring the flexibility of these devices126.
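Read-count-based copy number estimation of the kind described above can be illustrated with a short Python sketch. It is not the pipeline used in the cited studies: it assumes that reads from a single chromosome have already been reduced to start coordinates, that a copy-number-neutral diploid control is available, and that most bins are unchanged so that the median ratio can serve as the baseline. All names and the bin size are hypothetical.

```python
from collections import Counter
from statistics import median


def bin_counts(read_starts, bin_size):
    """Count reads per fixed-size genomic bin (bin index = start // bin_size)."""
    return Counter(pos // bin_size for pos in read_starts)


def copy_number_estimates(sample_starts, control_starts, bin_size, ploidy=2):
    """Estimate copy number per bin from sample and control read counts.

    Assumption: the control is copy-number neutral and most bins are unchanged,
    so the median sample/control ratio is taken as the neutral baseline.
    """
    sample = bin_counts(sample_starts, bin_size)
    control = bin_counts(control_starts, bin_size)
    # Only bins covered in the control can be normalized.
    ratios = {b: sample.get(b, 0) / c for b, c in control.items()}
    baseline = median(ratios.values())
    if baseline == 0:
        raise ValueError("too few sample reads to establish a baseline")
    return {b: ploidy * r / baseline for b, r in ratios.items()}


# Toy data: four 1-kb bins, with a single-copy gain in the third bin of the sample.
control_reads = [b * 1000 + i for b in range(4) for i in range(10)]
sample_reads = [b * 1000 + i for b in range(4) for i in range(30 if b == 2 else 20)]
estimates = copy_number_estimates(sample_reads, control_reads, bin_size=1000)
print(estimates)  # {0: 2.0, 1: 2.0, 2: 3.0, 3: 2.0}
```

Because the estimate depends only on how many reads fall in each bin, and not on read length, it is plausible that a few hundred thousand to a few million short reads are enough for this type of analysis.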

Fig. 3. Applications of shorter-read sequencing (<5 kb) on single-molecule platforms.

a, Short reads can be quickly sequenced on portable Oxford Nanopore sequencing devices, returning real-time information about copy number variants and aneuploidy in 3 h or less. b, Primary sequence, fragment patterns and endogenous methylation can be measured simultaneously with single-molecule platforms, and that information can be used to assign reads to tissues of origin. c, Accuracy of short reads on single-molecule platforms can be improved by correcting for errors by reading the same molecule multiple times. d, The cost of sequencing short fragments on single-molecule platforms can be decreased by combining multiple different short molecules into a single, long molecule.

Good: multimodal measurements

An important advantage for single-molecule platforms is that base modification information is acquired for free (not counting computational requirements) alongside the primary sequence. Specifically, short-read single-molecule assays can take advantage of modification data to measure cell-free DNA (cfDNA), which is fragmented DNA found in plasma that is usually the same length as DNA wrapped around a nucleosome (~150 bp). cfDNA has become a popular diagnostic tool owing to the relative ease of collection (via blood draws or ‘liquid biopsies’) and has been used to analyse fetal DNA during pregnancy, circulating tumour DNA and donor-derived DNA in transplant patients (reviewed in refs. 128,129). As reported in both published and preprint articles, cfDNA has been sequenced with PacBio and ONT to detect fetal DNA in maternal blood130,131 and assay circulating tumour DNA132–135. The ability to measure native CpG methylation and patterns from fragment ends (known as ‘fragmentomics’129) has been used to classify placental and maternal DNA130, show that tumour-derived DNA had lower methylation than non-tumour-derived DNA132, estimate tissue-of-origin and cell-type proportions (Fig. 3b), footprint transcription factor binding sites and measure nucleosome positioning133. ONT and PacBio platforms can also capture any longer fragments in these liquid biopsies, revealing previously unknown biology. For example, long reads (>1 kb) can constitute a large proportion (up to ~41%) of cfDNA reads in maternal plasma and the percentage of long reads increases as pregnancy progresses130.

Though exogenous labelling methods are a focus of single-molecule chromatin assay development (see ‘Mapping protein–DNA interactions’), methods sequencing short fragments from chromatin assays have also emerged. For example, Array-seq simply sequences the typical MNase digestion ladder to measure nucleosome positioning with ONT136 and short fragments from native ChIP-seq without amplification have been sequenced with PacBio137, allowing for both protein binding and native DNA modifications to be measured simultaneously. Another example is DamID, which uses exogenous DNA adenine methyltransferase (Dam) labelling and methylation-sensitive restriction enzyme digestion to probe protein–DNA interactions138. DamID output has been directly sequenced with ONT both with amplification (RNA Pol DamID (RAPID))139 and without amplification (nanopore-DamID)140, the latter reported in a recent preprint. These approaches have been shown to benefit from the single-molecule platforms that can sequence longer reads, measuring binding sites in repetitive sequences and segmental duplications as well as simultaneously investigating protein–DNA binding and native methylation140.

Good: accuracy

Two primary methods have been used to improve the accuracy of reads on single-molecule platforms: consensus methods and molecular indexing methods. Consensus methods have received the most attention, and various approaches exist for both ONT and PacBio. PacBio sequencing natively supports consensus sequencing (‘circular consensus sequencing’ (CCS) with PacBio HiFi) and has been used on both short fragments (<1,000 bp)141 and long fragments (>13 kb)142 to generate highly accurate (99.8%)142 consensus reads. As ONT does not sequence circular molecules, a variety of methods have been developed using rolling circle amplification to generate linear molecules composed of concatemers of the original molecule (Fig. 3c). These methods usually begin with linear fragments of DNA that are circularized by intramolecular ligation143, molecular inversion probes144, ligation into a backbone145, or by using Gibson assembly and a common DNA splint146. The circular molecules are then amplified using the phi29 polymerase to create long concatemerized molecules. After sequencing, concatemers are identified and a consensus sequence of the original molecule is constructed (Fig. 3c). Even though long reads could be used with these methods, their development has focused on short reads (<1,000 bp), down to 52 bp144. All of these methods show increased accuracy (for example, improving from 74% to >95% accuracy144) when consensus molecules are constructed, with a recent publication reporting the added benefit of increasing the sequencing yield compared to sequencing the short fragments directly146.
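The consensus step can be illustrated with a deliberately simplified Python sketch. It assumes that the repeated copies of the original molecule have already been split out of the concatemer and aligned to the same length without insertions or deletions; the published tools instead work from alignments and handle indels, which are common in single-molecule reads. The function name is hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Build a consensus sequence by per-position majority vote.

    copies: equal-length strings, one per pass over the original molecule.
    Simplifying assumption: the copies are already aligned and contain
    substitution errors only (no insertions or deletions).
    """
    if not copies:
        raise ValueError("at least one copy is required")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("copies must be pre-aligned to the same length")
    # zip(*copies) walks the alignment column by column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


# Five noisy copies of the same 6-bp fragment, each with at most one substitution.
noisy_copies = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC"]
consensus = majority_consensus(noisy_copies)
print(consensus)  # ACGTAC
```

Random errors that occur in a minority of copies at a given position are voted out, which is why accuracy rises with the number of copies. Errors that recur at the same position in most copies are not removed by this kind of vote.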

In addition to consensus sequencing, unique molecular identifiers (UMIs) have been developed for single-molecule platforms and incorporated into amplicon sequencing147. UMIs were shown to improve the error rate of both ONT and PacBio (all >99.5% accuracy) and remove PCR chimeras that may arise during amplification. Although the UMIs were shown to work with long amplicons (>4,000 bp), they have the potential to be used in short-read methodologies as well.

Regardless of the approach used to improve accuracy, systematic errors in sequencing reads from these single-molecule platforms will prevent all errors from being corrected. For example, nanopore sequencing is error-prone in low-complexity sequences148 and homopolymer sequences, even with the latest commercially available pores7. PacBio is more accurate than ONT in general, but also shows systematic errors in homopolymer regions147,149. That said, further improvement is possible as indicated by recent efforts combining PacBio CCS with UMIs that resulted in very few errors147 and the improvement of accuracy seen by retraining nanopore basecallers with troublesome sequences150.

Cheap: increasing throughput

Both PacBio and ONT typically produce fewer reads per sequencing run than an Illumina device, affecting the cost of these platforms for read-counting applications such as assaying CNVs and RNA-seq. Because of this, a set of methods has been developed to increase the yield of short reads on single-molecule platforms. The methods are similar to approaches used to increase Sanger sequencing throughput in the 1990s151,152 and rely on concatenating short fragments into artificial, long fragments to increase throughput using either Gibson assembly153 or sticky-end ligation154–156 (Fig. 3d). For example, a method published in a recent preprint article, multiplexed arrays sequencing of isoforms (MAS-ISO-seq), shows ~15–25× increase in throughput with PacBio156 and sampling molecules using re-ligated fragments (SMURF-seq) achieves a ~3× increase on ONT155. Based on the gain in sequencing output, both methods can reduce the cost per million reads or full-length transcripts from >US$883 (PacBio) and >US$415 (ONT) to <US$56 (PacBio) and <US$146 (ONT) (see Supplementary Note and Supplementary Data). These approaches have been used in a variety of ways including identifying cancer variants153,155, measuring CNVs154 and sequencing RNA isoforms156,157.
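The relationship between throughput gain and cost can be checked with a few lines of arithmetic. The Python sketch below assumes that the cost per million reads scales inversely with the number of reads obtained per run and that run and library costs are otherwise unchanged. Because the figures quoted above are bounds (> and <), the ratios are indicative only; the function names are illustrative.

```python
def implied_fold_reduction(cost_before, cost_after):
    """Fold reduction in cost per million reads implied by two cost figures."""
    return cost_before / cost_after


def cost_after_gain(cost_before, fold_gain):
    """Cost per million reads after a fold_gain increase in reads per run.

    Assumption: run cost is fixed, so cost per read scales as 1 / yield.
    """
    return cost_before / fold_gain


pacbio_fold = implied_fold_reduction(883, 56)
ont_fold = implied_fold_reduction(415, 146)
print(round(pacbio_fold, 1))  # 15.8
print(round(ont_fold, 1))     # 2.8

# Forward direction: a 3x gain applied to the ONT starting cost.
ont_cost = cost_after_gain(415, 3)
print(round(ont_cost, 2))     # 138.33
```

Under these assumptions, the implied reductions of roughly 16-fold for PacBio and roughly 3-fold for ONT are in line with the reported ~15–25× and ~3× throughput gains.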

It is currently unclear whether any biases are introduced during these concatemerization methods and how they may affect the resulting data. Two of the methods recently described in preprints, MAS-ISO-seq156 and HIT-scISOseq157, show depletion of longer spike-in RNA variants relative to shorter transcripts when compared with PacBio Iso-Seq. This could be due to any step in those protocols, including PCR, uracil digestion or ligation. Furthermore, the ligases used in these assays may have some GC bias, as was shown for serial analysis of gene expression (SAGE)151,158,159. Finally, these concatemerization methods rely on being able to accurately identify the junction sites between molecules in order to split them into individual fragments. Although most of these methods are paired with software for resolving concatemers, the base pair accuracy of these methods has not been fully elucidated. For example, ConcatSeq showed a small distribution of fragments deviating from the expected fragment length153. We expect that benchmarking and further exploration of these data will elucidate any sources of bias.

Conclusions and future perspectives

The increasing use of single-molecule sequencing platforms in genomics has led to an increase in applications beyond typical use cases. As they enter the mainstream, the number of creative uses of these platforms will increase and the methods detailed in this Review will be optimized, refined and expanded. If anything, development will be accelerated in coming years owing to the massive increase in the use of ONT sequencing to monitor the SARS-CoV-2 pandemic, as illustrated by ~50% of COVID-19 sequencing across the African continent being performed with ONT160. This increase will give an expanded population of researchers ready access to single-molecule sequencing technology.

Targeted sequencing methods will be improved to capture longer reads to take full advantage of these platforms. The optimization of these methods will lead to greater read depths and lengths, enabling applications that need ultra-high-depth sequencing such as identifying somatic mosaic variants or intratumoural heterogeneity. Further developments in combining methods, such as Cas-mediated enrichment with adaptive sampling161, will improve on-target rates and drive costs even lower. Targeted long reads are likely to generate new insights into the direct molecular impact of mutations and alterations as their single-molecule nature is a proxy for cellular heterogeneity in complex clinical samples.

Since their inception, short-read assays measuring protein–DNA binding have been developed to reduce input even to the single-cell level (reviewed in ref. 78) and to measure multiple protein–DNA interactions simultaneously162,163. We expect single-molecule methods to follow the same trajectory as they offer an appealing route to quantitative methods for measuring these interactions. Early work on the coordination of epigenetic marks on long, single reads — in some cases as long as 100 kb — offers tantalizing views into exploring epigenetic heterogeneity, such as examining the temporal dynamics of T cell activation94. However, determining whether exogenous labelling variation is biological or technical requires careful molecular controls. Potential confounding technical aspects include the extent to which both protein and antibody penetrate cells and/or nuclei and their binding efficiencies, fidelity of modification calling and enzyme labelling efficiencies.

Although the throughput of short reads on single-molecule platforms is improving, the cost per million reads remains relatively high for counting applications, such as RNA-seq, CNV analysis and CUT&RUN. Improvements that increase the number of short reads obtained in a single sequencing run will enable sample multiplexing, driving down the cost of sequencing. With increasing throughput, we expect more short reads from a variety of assays to be sequenced on these long-read platforms owing to decreasing cost, increased speed and portability, and the ability to gain multimodal information.

Although we focus on DNA-based methods in this Review, we believe the ability to sequence RNA directly will also have an important role in a variety of methods going forward. However, at this time, direct RNA sequencing lags behind DNA sequencing and will require improvement in many aspects, including accuracy, to spur further use164. Similarly, we expect the young field of protein sequencing on nanopores to continue to advance165, eventually completing our ability to measure the central dogma in its entirety.

Finally, we imagine these advances could be combined with parallel advances in the portability and flexibility of sample collection166 and data analysis167,168. This is an especially exciting prospect when considering their use with portable ONT sequencing, which could lead to sequencing assays leaving core facilities for use directly at the bench or even in the field. Improvements and future developments in these methods set the stage for a more flexible and accessible field of genomics, pushing it into a new and exciting era.

Acknowledgements

This work was supported by funding from the National Institutes of Health (grant no. R01 HG009190; National Human Genome Research Institute).

Glossary

Basecaller

An algorithm that converts the raw signal from nucleic acid sequencing into the bases that the signal represents.

Centromeres

The region of a chromosome where the kinetochore attaches during cell division, typically an extremely repetitive region.

Chemical bisulfite conversion

A method used to measure the DNA modifications 5-methylcytosine and 5-hydroxymethylcytosine. DNA is treated with sodium bisulfite, which converts unmodified cytosine to uracil, whereas modified cytosines are protected from conversion. Following conversion and PCR, unmodified cytosines are read as thymines when sequenced, whereas modified cytosines remain cytosines.

Chromatin immunoprecipitation followed by sequencing

(ChIP–seq). A method for directly measuring protein–DNA binding with antibody-mediated immunoprecipitation of protein–DNA complexes.

Cleavage under targets & release using nuclease

(CUT&RUN). A method for directly measuring protein–DNA binding with antibody-guided DNA digestion with a micrococcal nuclease.

Cleavage under targets and tagmentation

(CUT&Tag). A method for directly measuring protein–DNA binding with antibody-guided transposition and fragmentation (tagmentation) with Tn5 transposase.

Cycle dephasing

Mechanism of error that affects sequencing devices using polymerase colonies (polonies). This occurs when clonal molecules within the same cluster are not all elongated in a given extension step, diluting the sequencing signal during subsequent cycles as the molecules become out of phase. More molecules become ‘dephased’ with each additional sequencing cycle, leading to increasingly lower sequencing quality as different positions on the template contribute to the signal.

Dynamic time warping

An algorithm for measuring similarity between two time series. In this context it refers to matching experimental nanopore data to a modelled electrical signal from a reference DNA sequence to identify the correct sequence from a database.

Human Genome Project

An international effort launched in 1990 with the primary goal of assembling the human genome. The project was completed in 2003.

ONT Flongle flow cell

Low-throughput flow cell (<1 Gb) from Oxford Nanopore Technologies. This flow cell can be sequenced on MinION or GridION sequencing devices.

ONT MinION

Hand-held sequencing device from Oxford Nanopore Technologies that can perform sequencing with MinION or Flongle flow cells.

ONT MinION flow cell

Medium-throughput (2–20 Gb) flow cell from Oxford Nanopore Technologies. This flow cell can be sequenced on MinION or GridION sequencing devices.

ONT PromethION

High-throughput sequencing device from Oxford Nanopore Technologies that can perform sequencing with PromethION flow cells.

ONT PromethION flow cell

High-throughput (50–100+ Gb) flow cell from Oxford Nanopore Technologies. This flow cell can be sequenced on PromethION sequencing devices.

PacBio RS II

Sequencing device released by Pacific Biosciences in 2013 that can perform single-molecule, real-time sequencing.

PacBio Sequel II

Sequencing device released by Pacific Biosciences in 2019 that can perform single-molecule, real-time sequencing.

Sequencing depth

The number of reads that map to a given locus, also known as sequencing coverage. This is usually represented as an average, and a locus can refer to a single nucleotide, region(s) of interest, entire chromosome(s) or entire genomes. We would consider ‘high’ coverage or depth as >100× for most assays.

Telomeres

Repetitive regions at the end of chromosomes.

Tn5 transposase

A bacterial protein that facilitates the movement of DNA sequences through a ‘cut and paste’ mechanism. This protein has become a valuable molecular biology tool with its uses ranging from efficient library preparation to probing chromatin state.

Unique molecular identifiers

(UMIs). Short sequences of random nucleotides that tag individual nucleic acid molecules. UMIs can be used to identify subsequently amplified fragments that arose from the same original molecule, mitigating bias introduced during PCR and allowing for more accurate quantification.

Whole-genome sequencing

A sequencing approach that attempts to obtain reads that map to all bases in the genome.

Author contributions

The authors contributed equally to all aspects of the article.

Peer review

Peer review information

Nature Reviews Genetics thanks Matthew Loose and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Competing interests

W.T. has two patents (8,748,091 and 8,394,584) licensed to ONT. W.T. has received travel funds to speak at symposia organized by ONT. P.H. declares no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

The online version contains supplementary material available at 10.1038/s41576-023-00600-1.

References

  • 1.Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 2016;17:333–351. doi: 10.1038/nrg.2016.49. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Erlich Y, Mitra PP, delaBastide M, McCombie WR, Hannon GJ. Alta-Cyclic: a self-optimizing base caller for next-generation sequencing. Nat. Methods. 2008;5:679–682. doi: 10.1038/nmeth.1230. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Metzker ML. Sequencing technologies — the next generation. Nat. Rev. Genet. 2010;11:31–46. doi: 10.1038/nrg2626. [DOI] [PubMed] [Google Scholar]
  • 4.Logsdon GA, Vollger MR, Eichler EE. Long-read human genome sequencing and its applications. Nat. Rev. Genet. 2020;21:597–614. doi: 10.1038/s41576-020-0236-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Rhoads A, Au KF. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics. 2015;13:278–289. doi: 10.1016/j.gpb.2015.08.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Timp W, et al. Think small: nanopores for sensing and synthesis. IEEE Access. 2014;2:1396–1408. doi: 10.1109/ACCESS.2014.2369506. [DOI] [Google Scholar]
  • 7.Sereika M, et al. Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. Nat. Methods. 2022;19:823–826. doi: 10.1038/s41592-022-01539-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Rhie A, et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature. 2021;592:737–746. doi: 10.1038/s41586-021-03451-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Nurk S, et al. The complete sequence of a human genome. Science. 2022;376:44–53. doi: 10.1126/science.abj6987. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Chaisson MJP, et al. Resolving the complexity of the human genome using single-molecule sequencing. Nature. 2015;517:608–611. doi: 10.1038/nature13907. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Aganezov S, et al. Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing. Genome Res. 2020;30:1258–1273. doi: 10.1101/gr.260497.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Ebert P, et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science. 2021;372:eabf7117. doi: 10.1126/science.abf7117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Beyter D, et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat. Genet. 2021;53:779–786. doi: 10.1038/s41588-021-00865-4. [DOI] [PubMed] [Google Scholar]
  • 14.Simpson JT, et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods. 2017;14:407–410. doi: 10.1038/nmeth.4184. [DOI] [PubMed] [Google Scholar]
  • 15.Altemose N, et al. Complete genomic and epigenetic maps of human centromeres. Science. 2022;376:eabl4178. doi: 10.1126/science.abl4178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Gershman A, et al. Epigenetic patterns in a complete human genome. Science. 2022;376:eabj5089. doi: 10.1126/science.abj5089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Workman RE, et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat. Methods. 2019;16:1297–1305. doi: 10.1038/s41592-019-0617-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Glinos DA, et al. Transcriptome variation in human tissues revealed by long-read sequencing. Nature. 2022;608:353–359. doi: 10.1038/s41586-022-05035-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Pratanwanich PN, et al. Identification of differential RNA modifications from nanopore direct RNA sequencing with xPore. Nat. Biotechnol. 2021;39:1394–1402. doi: 10.1038/s41587-021-00949-w. [DOI] [PubMed] [Google Scholar]
  • 20.Gampawar P, et al. Evaluation of the performance of AmpliSeq and SureSelect exome sequencing libraries for ion proton. Front. Genet. 2019;10:856. doi: 10.3389/fgene.2019.00856. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Togi S, Ura H, Niida Y. Optimization and validation of multimodular, long-range PCR-based next-generation sequencing assays for comprehensive detection of mutation in tuberous sclerosis complex. J. Mol. Diagn. 2021;23:424–446. doi: 10.1016/j.jmoldx.2020.12.009. [DOI] [PubMed] [Google Scholar]
  • 22.Barnes WM. PCR amplification of up to 35-kb DNA with high fidelity and high yield from lambda bacteriophage templates. Proc. Natl Acad. Sci. USA. 1994;91:2216–2220. doi: 10.1073/pnas.91.6.2216. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Jia H, Guo Y, Zhao W, Wang K. Long-range PCR in next-generation sequencing: comparison of six enzymes and evaluation on the MiSeq sequencer. Sci. Rep. 2014;4:5737. doi: 10.1038/srep05737. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Walczak M, et al. Long-range PCR libraries and next-generation sequencing for pharmacogenetic studies of patients treated with anti-TNF drugs. Pharmacogenomics J. 2019;19:358–367. doi: 10.1038/s41397-018-0058-9.
  • 25.Brait N, Külekçi B, Goerzer I. Long range PCR-based deep sequencing for haplotype determination in mixed HCMV infections. BMC Genomics. 2022;23:31. doi: 10.1186/s12864-021-08272-z.
  • 26.Potapov V, Ong JL. Examining sources of error in PCR by single-molecule sequencing. PLoS ONE. 2017;12:e0169774. doi: 10.1371/journal.pone.0169774.
  • 27.Tyson JR, et al. Improvements to the ARTIC multiplex PCR method for SARS-CoV-2 genome sequencing using nanopore. bioRxiv. 2020. doi: 10.1101/2020.09.04.283077v1.
  • 28.Norris AL, Workman RE, Fan Y, Eshleman JR, Timp W. Nanopore sequencing detects structural variants in cancer. Cancer Biol. Ther. 2016;17:246–253. doi: 10.1080/15384047.2016.1139236.
  • 29.Borràs DM, et al. Detecting PKD1 variants in polycystic kidney disease patients by single-molecule long-read sequencing. Hum. Mutat. 2017;38:870–879. doi: 10.1002/humu.23223.
  • 30.Quick J, et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat. Protoc. 2017;12:1261–1276. doi: 10.1038/nprot.2017.066.
  • 31.Quick J, et al. Real-time, portable genome sequencing for Ebola surveillance. Nature. 2016;530:228–232. doi: 10.1038/nature16996.
  • 32.Turner EH, Ng SB, Nickerson DA, Shendure J. Methods for genomic partitioning. Annu. Rev. Genomics Hum. Genet. 2009;10:263–284. doi: 10.1146/annurev-genom-082908-150112.
  • 33.Gnirke A, et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 2009;27:182–189. doi: 10.1038/nbt.1523.
  • 34.Leung AW-S, et al. ECNano: a cost-effective workflow for target enrichment sequencing and accurate variant calling on 4800 clinically significant genes using a single MinION flowcell. BMC Med. Genomics. 2022;15:43. doi: 10.1186/s12920-022-01190-3.
  • 35.Zhang L, et al. Efficient CNV breakpoint analysis reveals unexpected structural complexity and correlation of dosage-sensitive genes with clinical severity in genomic disorders. Hum. Mol. Genet. 2017;26:1927–1941. doi: 10.1093/hmg/ddx102.
  • 36.Beck CR, et al. Megabase length hypermutation accompanies human structural variation at 17p11.2. Cell. 2019;176:1310–1324.e10. doi: 10.1016/j.cell.2019.01.045.
  • 37.Yamaguchi K, et al. Application of targeted nanopore sequencing for the screening and determination of structural variants in patients with Lynch syndrome. J. Hum. Genet. 2021;66:1053–1060. doi: 10.1038/s10038-021-00927-9.
  • 38.Wang M, et al. PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations. BMC Genomics. 2015;16:214. doi: 10.1186/s12864-015-1370-2.
  • 39.Giolai M, et al. Targeted capture and sequencing of gene-sized DNA molecules. Biotechniques. 2016;61:315–322. doi: 10.2144/000114484.
  • 40.Bethune K, et al. Long-fragment targeted capture for long-read sequencing of plastomes. Appl. Plant Sci. 2019;7:e1243. doi: 10.1002/aps3.1243.
  • 41.Lefoulon E, et al. Large enriched fragment targeted sequencing (LEFT-SEQ) applied to capture of Wolbachia genomes. Sci. Rep. 2019;9:5939. doi: 10.1038/s41598-019-42454-w.
  • 42.Steiert TA, et al. High-throughput method for the hybridisation-based targeted enrichment of long genomic fragments for PacBio third-generation sequencing. NAR Genom. Bioinform. 2022;4:lqac051. doi: 10.1093/nargab/lqac051.
  • 43.Karamitros T, Magiorkinis G. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits. Nucleic Acids Res. 2015;43:e152. doi: 10.1093/nar/gkv773.
  • 44.Karamitros, T. & Magiorkinis, G. in Next Generation Sequencing: Methods and Protocols (eds Head, S. R., Ordoukhanian, P. & Salomon, D. R.) 43–51 (Springer New York, 2018).
  • 45.Lee, I., Workman, R. E., Wang, J. Z. & Timp, W. Use of Agilent SureSelect to perform targeted long-read nanopore sequencing. Agilent Application Note (Agilent Technologies, 2017).
  • 46.Roe D, et al. Efficient sequencing, assembly, and annotation of human KIR haplotypes. Front. Immunol. 2020;11:582927. doi: 10.3389/fimmu.2020.582927.
  • 47.Doudna JA, Charpentier E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science. 2014;346:1258096. doi: 10.1126/science.1258096.
  • 48.Lee NCO, Larionov V, Kouprina N. Highly efficient CRISPR/Cas9-mediated TAR cloning of genes and chromosomal loci from complex genomes in yeast. Nucleic Acids Res. 2015;43:e55. doi: 10.1093/nar/gkv112.
  • 49.Jiang W, et al. Cas9-assisted targeting of chromosome segments CATCH enables one-step targeted cloning of large gene clusters. Nat. Commun. 2015;6:8101. doi: 10.1038/ncomms9101.
  • 50.Gabrieli T, et al. Selective nanopore sequencing of human BRCA1 by Cas9-assisted targeting of chromosome segments (CATCH). Nucleic Acids Res. 2018;46:e87. doi: 10.1093/nar/gky411.
  • 51.Tsai Y-C, et al. Amplification-free, CRISPR-Cas9 targeted enrichment and SMRT sequencing of repeat-expansion disease causative genomic regions. bioRxiv. 2017. doi: 10.1101/203919v1.
  • 52.Tsai, Y.-C. et al. in Genomic Structural Variants in Nervous System Disorders (ed Proukakis, C.) 95–120 (Springer, 2022).
  • 53.Watson CM, et al. Cas9-based enrichment and single-molecule sequencing for precise characterization of genomic duplications. Lab. Invest. 2020;100:135–146. doi: 10.1038/s41374-019-0283-0.
  • 54.Giesselmann P, et al. Analysis of short tandem repeat expansions and their methylation state with nanopore sequencing. Nat. Biotechnol. 2019;37:1478–1481. doi: 10.1038/s41587-019-0293-x.
  • 55.Gilpatrick T, et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nat. Biotechnol. 2020;38:433–438. doi: 10.1038/s41587-020-0407-5.
  • 56.Iyer SV, Kramer M, Goodwin S, McCombie WR. ACME: an affinity-based Cas9 mediated enrichment method for targeted nanopore sequencing. bioRxiv. 2022. doi: 10.1101/2022.02.03.478550v2.
  • 57.Wallace AD, et al. CaBagE: a Cas9-based background elimination strategy for targeted, long-read DNA sequencing. PLoS ONE. 2021;16:e0241253. doi: 10.1371/journal.pone.0241253.
  • 58.Stevens RC, et al. A novel CRISPR/Cas9 associated technology for sequence-specific nucleic acid enrichment. PLoS ONE. 2019;14:e0215441. doi: 10.1371/journal.pone.0215441.
  • 59.Bruijnesteijn J, van der Wiel M, de Groot NG, Bontrop RE. Rapid characterization of complex killer cell immunoglobulin-like receptor (KIR) regions using Cas9 enrichment and nanopore sequencing. Front. Immunol. 2021;12:722181. doi: 10.3389/fimmu.2021.722181.
  • 60.Gilpatrick T, et al. IVT generation of guideRNAs for Cas9-enrichment nanopore sequencing. bioRxiv. 2023. doi: 10.1101/2023.02.07.527484v1.
  • 61.Loose M, Malla S, Stout M. Real-time selective sequencing using nanopore technology. Nat. Methods. 2016;13:751–754. doi: 10.1038/nmeth.3930.
  • 62.Kovaka S, Fan Y, Ni B, Timp W, Schatz MC. Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. Nat. Biotechnol. 2021;39:431–441. doi: 10.1038/s41587-020-0731-9.
  • 63.Zhang H, et al. Real-time mapping of nanopore raw signals. Bioinformatics. 2021;37:i477–i483. doi: 10.1093/bioinformatics/btab264.
  • 64.Bao Y, et al. SquiggleNet: real-time, direct classification of nanopore signals. Genome Biol. 2021;22:298. doi: 10.1186/s13059-021-02511-y.
  • 65.Han R, Wang S, Gao X. Novel algorithms for efficient subsequence searching and mapping in nanopore raw signals towards targeted sequencing. Bioinformatics. 2020;36:1333–1343. doi: 10.1093/bioinformatics/btz742.
  • 66.Masutani B, Morishita S. A framework and an algorithm to detect low-abundance DNA by a handy sequencer and a palm-sized computer. Bioinformatics. 2019;35:584–592. doi: 10.1093/bioinformatics/bty663.
  • 67.Dunn, T. et al. in MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture 535–549 (Association for Computing Machinery, 2021).
  • 68.Payne A, et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat. Biotechnol. 2021;39:442–450. doi: 10.1038/s41587-020-00746-x.
  • 69.Edwards HS, et al. Real-time selective sequencing with RUBRIC: read until with basecall and reference-informed criteria. Sci. Rep. 2019;9:11475. doi: 10.1038/s41598-019-47857-3.
  • 70.Ulrich J-U, Lutfi A, Rutzen K, Renard BY. ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing. Bioinformatics. 2022;38:i153–i160. doi: 10.1093/bioinformatics/btac223.
  • 71.Martin S, et al. Nanopore adaptive sampling: a tool for enrichment of low abundance species in metagenomic samples. Genome Biol. 2022;23:11. doi: 10.1186/s13059-021-02582-x.
  • 72.Payne A, et al. Barcode aware adaptive sampling for GridION and PromethION Oxford Nanopore sequencers. bioRxiv. 2022. doi: 10.1101/2021.12.01.470722v2.
  • 73.Madsen EB, Höijer I, Kvist T, Ameur A, Mikkelsen MJ. Xdrop: targeted sequencing of long DNA molecules from low input samples using droplet sorting. Hum. Mutat. 2020;41:1671–1679. doi: 10.1002/humu.24063.
  • 74.Rivera CM, Ren B. Mapping human epigenomes. Cell. 2013;155:39–55. doi: 10.1016/j.cell.2013.09.011.
  • 75.Minnoye L, et al. Chromatin accessibility profiling methods. Nat. Rev. Methods Prim. 2021;1:10. doi: 10.1038/s43586-020-00008-9.
  • 76.Klemm SL, Shipony Z, Greenleaf WJ. Chromatin accessibility and the regulatory epigenome. Nat. Rev. Genet. 2019;20:207–220. doi: 10.1038/s41576-018-0089-8.
  • 77.Furey TS. ChIP-seq and beyond: new and improved methodologies to detect and characterize protein–DNA interactions. Nat. Rev. Genet. 2012;13:840–852. doi: 10.1038/nrg3306.
  • 78.Preissl S, Gaulton KJ, Ren B. Characterizing cis-regulatory elements using single-cell epigenomics. Nat. Rev. Genet. 2022. doi: 10.1038/s41576-022-00509-1.
  • 79.Brinkman AB, et al. Sequential ChIP-bisulfite sequencing enables direct genome-scale investigation of chromatin and DNA methylation cross-talk. Genome Res. 2012;22:1128–1138. doi: 10.1101/gr.133728.111.
  • 80.Clark SJ, et al. scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat. Commun. 2018;9:781. doi: 10.1038/s41467-018-03149-4.
  • 81.Luo C, et al. Single nucleus multi-omics identifies human cortical cell regulatory genome diversity. Cell Genom. 2022;2:100107. doi: 10.1016/j.xgen.2022.100107.
  • 82.Fehér Z, Kiss A, Venetianer P. Expression of a bacterial modification methylase gene in yeast. Nature. 1983;302:266–268. doi: 10.1038/302266a0.
  • 83.Singh J, Klar AJ. Active genes in budding yeast display enhanced in vivo accessibility to foreign DNA methylases: a novel in vivo probe for chromatin structure of yeast. Genes Dev. 1992;6:186–196. doi: 10.1101/gad.6.2.186.
  • 84.Kladde MP, Simpson RT. Positioned nucleosomes inhibit Dam methylation in vivo. Proc. Natl Acad. Sci. USA. 1994;91:1361–1365. doi: 10.1073/pnas.91.4.1361.
  • 85.Kladde MP, Xu M, Simpson RT. Direct study of DNA-protein interactions in repressed and active chromatin in living cells. EMBO J. 1996;15:6290–6300. doi: 10.1002/j.1460-2075.1996.tb01019.x.
  • 86.Xu M, Simpson RT, Kladde MP. Gal4p-mediated chromatin remodeling depends on binding site position in nucleosomes but does not require DNA replication. Mol. Cell. Biol. 1998;18:1201–1212. doi: 10.1128/MCB.18.3.1201.
  • 87.Sönmezer C, et al. Molecular co-occupancy identifies transcription factor binding cooperativity in vivo. Mol. Cell. 2021;81:255–267.e6. doi: 10.1016/j.molcel.2020.11.015.
  • 88.Nabilsi NH, et al. Multiplex mapping of chromatin accessibility and DNA methylation within targeted single molecules identifies epigenetic heterogeneity in neural stem cells and glioblastoma. Genome Res. 2014;24:329–339. doi: 10.1101/gr.161737.113.
  • 89.Kelly TK, et al. Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Res. 2012;22:2497–2506. doi: 10.1101/gr.143008.112.
  • 90.Kleinendorst RWD, Barzaghi G, Smith ML, Zaugg JB, Krebs AR. Genome-wide quantification of transcription factor binding at single-DNA-molecule resolution using methyl-transferase footprinting. Nat. Protoc. 2021;16:5673–5706. doi: 10.1038/s41596-021-00630-1.
  • 91.Wang Y, et al. Single-molecule long-read sequencing reveals the chromatin basis of gene expression. Genome Res. 2019;29:1329–1342. doi: 10.1101/gr.251116.119.
  • 92.Oberbeckmann E, et al. Absolute nucleosome occupancy map for the Saccharomyces cerevisiae genome. Genome Res. 2019;29:1996–2009. doi: 10.1101/gr.253419.119.
  • 93.Lee I, et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Nat. Methods. 2020;17:1191–1199. doi: 10.1038/s41592-020-01000-7.
  • 94.Battaglia S, et al. Long-range phasing of dynamic, tissue-specific and allele-specific regulatory elements. Nat. Genet. 2022;54:1504–1513. doi: 10.1038/s41588-022-01188-8.
  • 95.Kong Y, et al. Critical assessment of DNA adenine methylation in eukaryotes using quantitative deconvolution. Science. 2022;375:515–522. doi: 10.1126/science.abe7489.
  • 96.Shipony Z, et al. Long-range single-molecule mapping of chromatin accessibility in eukaryotes. Nat. Methods. 2020;17:319–327. doi: 10.1038/s41592-019-0730-2.
  • 97.Stergachis AB, Debo BM, Haugen E, Churchman LS, Stamatoyannopoulos JA. Single-molecule regulatory architectures captured by chromatin fiber sequencing. Science. 2020;368:1449–1454. doi: 10.1126/science.aaz1646.
  • 98.Dubocanin D, et al. Single-molecule architecture and heterogeneity of human telomeric DNA and chromatin. bioRxiv. 2022. doi: 10.1101/2022.05.09.491186v1.
  • 99.Abdulhay NJ, et al. Massively multiplex single-molecule oligonucleosome footprinting. eLife. 2020;9:e59404. doi: 10.7554/eLife.59404.
  • 100.Nanda AS, et al. Sensitive multimodal profiling of native DNA by transposase-mediated single-molecule sequencing. bioRxiv. 2022. doi: 10.1101/2022.08.07.502893v2.
  • 101.Henikoff S, Henikoff JG, Kaya-Okur HS, Ahmad K. Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation. eLife. 2020;9:e63274. doi: 10.7554/eLife.63274.
  • 102.Skene PJ, Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife. 2017;6:e21856. doi: 10.7554/eLife.21856.
  • 103.Eliasson M, Andersson R, Olsson A, Wigzell H, Uhlén M. Differential IgG-binding characteristics of staphylococcal protein A, streptococcal protein G, and a chimeric protein AG. J. Immunol. 1989;142:575–581. doi: 10.4049/jimmunol.142.2.575.
  • 104.Altemose N, et al. DiMeLo-seq: a long-read, single-molecule method for mapping protein–DNA interactions genome wide. Nat. Methods. 2022;19:711–723. doi: 10.1038/s41592-022-01475-6.
  • 105.Weng Z, et al. BIND&MODIFY: a long-range method for single-molecule mapping of chromatin modifications in eukaryotes. Genome Biol. 2023;24:61. doi: 10.1186/s13059-023-02896-y.
  • 106.Gopi LK, Kidder BL. Integrative pan cancer analysis reveals epigenomic variation in cancer type and cell specific chromatin domains. Nat. Commun. 2021;12:1419. doi: 10.1038/s41467-021-21707-1.
  • 107.Battle SL, et al. Enhancer chromatin and 3D genome architecture changes from naive to primed human embryonic stem cell states. Stem Cell Rep. 2019;12:1129–1144. doi: 10.1016/j.stemcr.2019.04.004.
  • 108.Kaya-Okur HS, et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 2019;10:1930. doi: 10.1038/s41467-019-09982-5.
  • 109.Zheng H, Xie W. The role of 3D genome organization in development and cell differentiation. Nat. Rev. Mol. Cell Biol. 2019;20:535–550. doi: 10.1038/s41580-019-0132-4.
  • 110.Schoenfelder S, Fraser P. Long-range enhancer-promoter contacts in gene expression control. Nat. Rev. Genet. 2019;20:437–455. doi: 10.1038/s41576-019-0128-0.
  • 111.Kempfer R, Pombo A. Methods for mapping 3D chromosome architecture. Nat. Rev. Genet. 2020;21:207–226. doi: 10.1038/s41576-019-0195-2.
  • 112.McCord RP, Kaplan N, Giorgetti L. Chromosome conformation capture and beyond: toward an integrative view of chromosome structure and function. Mol. Cell. 2020;77:688–708. doi: 10.1016/j.molcel.2019.12.021.
  • 113.Quinodoz SA, et al. Higher-order inter-chromosomal hubs shape 3D genome organization in the nucleus. Cell. 2018;174:744–757.e24. doi: 10.1016/j.cell.2018.05.024.
  • 114.Olivares-Chauvet P, et al. Capturing pairwise and multi-way chromosomal conformations using chromosomal walks. Nature. 2016;540:296–300. doi: 10.1038/nature20158.
  • 115.Allahyar A, et al. Enhancer hubs and loop collisions identified from single-allele topologies. Nat. Genet. 2018;50:1151–1160. doi: 10.1038/s41588-018-0161-5.
  • 116.Vermeulen C, et al. Multi-contact 4C: long-molecule sequencing of complex proximity ligation products to uncover local cooperative and competitive chromatin topologies. Nat. Protoc. 2020;15:364–397. doi: 10.1038/s41596-019-0242-7.
  • 117.Tavares-Cadete F, Norouzi D, Dekker B, Liu Y, Dekker J. Multi-contact 3C reveals that the human genome during interphase is largely not entangled. Nat. Struct. Mol. Biol. 2020;27:1105–1114. doi: 10.1038/s41594-020-0506-5.
  • 118.Deshpande AS, et al. Identifying synergistic high-order 3D chromatin conformations from genome-scale nanopore concatemer sequencing. Nat. Biotechnol. 2022;40:1488–1499. doi: 10.1038/s41587-022-01289-z.
  • 119.Zhong J-Y, et al. High-throughput Pore-C reveals the single-allele topology and cell type-specificity of 3D genome folding. Nat. Commun. 2023;14:1250. doi: 10.1038/s41467-023-36899-x.
  • 120.Magi A, et al. Nano-GLADIATOR: real-time detection of copy number alterations from nanopore sequencing data. Bioinformatics. 2019;35:4213–4221. doi: 10.1093/bioinformatics/btz241.
  • 121.Munro R, et al. MinoTour, real-time monitoring and analysis for nanopore sequencers. Bioinformatics. 2021. doi: 10.1093/bioinformatics/btab780.
  • 122.Ben-David U, Amon A. Context is everything: aneuploidy in cancer. Nat. Rev. Genet. 2020;21:44–62. doi: 10.1038/s41576-019-0171-x.
  • 123.Zack TI, et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 2013;45:1134–1140. doi: 10.1038/ng.2760.
  • 124.Wei S, Williams Z. Rapid short-read sequencing and aneuploidy detection using MinION nanopore technology. Genetics. 2016;202:37–44. doi: 10.1534/genetics.115.182311.
  • 125.Wei S, Weiss ZR, Williams Z. Rapid multiplex small DNA sequencing on the MinION nanopore sequencing platform. G3. 2018;8:1649–1657. doi: 10.1534/g3.118.200087.
  • 126.Baslan T, et al. High resolution copy number inference in cancer using short-molecule nanopore sequencing. Nucleic Acids Res. 2021;49:e124. doi: 10.1093/nar/gkab812.
  • 127.Martignano F, et al. Nanopore sequencing from liquid biopsy: analysis of copy number variations from cell-free DNA of lung cancer patients. Mol. Cancer. 2021;20:32. doi: 10.1186/s12943-021-01327-5.
  • 128.Corcoran RB, Chabner BA. Application of cell-free DNA analysis to cancer treatment. N. Engl. J. Med. 2018;379:1754–1765. doi: 10.1056/NEJMra1706174.
  • 129.Lo YMD, Han DSC, Jiang P, Chiu RWK. Epigenetics, fragmentomics, and topology of cell-free DNA in liquid biopsies. Science. 2021;372:eaaw3616. doi: 10.1126/science.aaw3616.
  • 130.Yu SCY, et al. Single-molecule sequencing reveals a large population of long cell-free DNA molecules in maternal plasma. Proc. Natl Acad. Sci. USA. 2021;118:e2114937118. doi: 10.1073/pnas.2114937118.
  • 131.Cheng SH, et al. Noninvasive prenatal testing by nanopore sequencing of maternal plasma DNA: feasibility assessment. Clin. Chem. 2015;61:1305–1306. doi: 10.1373/clinchem.2015.245076.
  • 132.Choy LYL, et al. Single-molecule sequencing enables long cell-free DNA detection and direct methylation analysis for cancer patients. Clin. Chem. 2022;68:1151–1163. doi: 10.1093/clinchem/hvac086.
  • 133.Katsman E, et al. Detecting cell-of-origin and cancer-specific methylation features of cell-free DNA from Nanopore sequencing. Genome Biol. 2022;23:158. doi: 10.1186/s13059-022-02710-1.
  • 134.Lau BT, et al. Single molecule methylation profiles of cell-free DNA in cancer with nanopore sequencing. bioRxiv. 2022. doi: 10.1101/2022.06.22.497080v1.
  • 135.Sampathi S, et al. Nanopore sequencing of clonal IGH rearrangements in cell-free DNA as a biomarker for acute lymphoblastic leukemia. Front. Oncol. 2022;12:958673. doi: 10.3389/fonc.2022.958673.
  • 136.Baldi S, Krebs S, Blum H, Becker PB. Genome-wide measurement of local nucleosome array regularity and spacing by nanopore sequencing. Nat. Struct. Mol. Biol. 2018;25:894–901. doi: 10.1038/s41594-018-0110-0.
  • 137.Wu TP, et al. DNA methylation on N6-adenine in mammalian embryonic stem cells. Nature. 2016;532:329–333. doi: 10.1038/nature17640.
  • 138.Aughey GN, Southall TD. Dam it’s good! DamID profiling of protein-DNA interactions. Wiley Interdiscip. Rev. Dev. Biol. 2016;5:25–37. doi: 10.1002/wdev.205.
  • 139.Gómez-Saldivar G, et al. Tissue-specific transcription footprinting using RNA PoI DamID (RAPID) in Caenorhabditis elegans. Genetics. 2020;216:931–945. doi: 10.1534/genetics.120.303774.
  • 140.Cheetham SW, et al. Single-molecule simultaneous profiling of DNA methylation and DNA-protein interactions with Nanopore-DamID. bioRxiv. 2022. doi: 10.1101/2021.08.09.455753v2.
  • 141.Hebert PDN, et al. A sequel to Sanger: amplicon sequencing that scales. BMC Genomics. 2018;19:219. doi: 10.1186/s12864-018-4611-3.
  • 142.Wenger AM, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 2019;37:1155–1162. doi: 10.1038/s41587-019-0217-9.
  • 143.Li C, et al. INC-Seq: accurate single molecule reads using nanopore sequencing. Gigascience. 2016;5:34. doi: 10.1186/s13742-016-0140-7.
  • 144.Wilson BD, Eisenstein M, Soh HT. High-fidelity nanopore sequencing of ultra-short DNA targets. Anal. Chem. 2019;91:6783–6789. doi: 10.1021/acs.analchem.9b00856.
  • 145.Marcozzi A, et al. Accurate detection of circulating tumor DNA using nanopore consensus sequencing. NPJ Genom. Med. 2021;6:106. doi: 10.1038/s41525-021-00272-y.
  • 146.Zee A, et al. Sequencing Illumina libraries at high accuracy on the ONT MinION using R2C2. Genome Res. 2022;32:2092–2106. doi: 10.1101/gr.277031.122.
  • 147.Karst SM, et al. High-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. Nat. Methods. 2021;18:165–169. doi: 10.1038/s41592-020-01041-y.
  • 148.Timp W, Comer J, Aksimentiev A. DNA base-calling from a nanopore using a Viterbi algorithm. Biophys. J. 2012;102:L37–L39. doi: 10.1016/j.bpj.2012.04.009.
  • 149.Mikheenko A, Prjibelski AD, Joglekar A, Tilgner HU. Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns. Genome Res. 2022;32:726–737. doi: 10.1101/gr.276405.121.
  • 150.Tan K-T, Slevin MK, Meyerson M, Li H. Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres. Genome Biol. 2022;23:180. doi: 10.1186/s13059-022-02751-6.
  • 151.Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270:484–487. doi: 10.1126/science.270.5235.484.
  • 152.Andersson B, et al. Adaptor-based uracil DNA glycosylase cloning simplifies shotgun library construction for large-scale sequencing. Anal. Biochem. 1994;218:300–308. doi: 10.1006/abio.1994.1182.
  • 153.Schlecht U, Mok J, Dallett C, Berka J. ConcatSeq: a method for increasing throughput of single molecule sequencing by concatenating short DNA fragments. Sci. Rep. 2017;7:5252. doi: 10.1038/s41598-017-05503-w.
  • 154.Prabakar RK, Xu L, Hicks J, Smith AD. SMURF-seq: efficient copy number profiling on long-read sequencers. Genome Biol. 2019;20:134. doi: 10.1186/s13059-019-1732-1.
  • 155.Thirunavukarasu D, et al. Oncogene concatenated enriched amplicon nanopore sequencing for rapid, accurate, and affordable somatic mutation detection. Genome Biol. 2021;22:227. doi: 10.1186/s13059-021-02449-1.
  • 156.Al’Khafaji AM, et al. High-throughput RNA isoform sequencing using programmable cDNA concatenation. bioRxiv. 2021. doi: 10.1101/2021.10.01.462818v1.
  • 157.Zheng Y-F, et al. HIT-scISOseq: high-throughput and high-accuracy single-cell full-length isoform sequencing for corneal epithelium. bioRxiv. 2020. doi: 10.1101/2020.07.27.222349v1.
  • 158.Margulies EH, Kardia SL, Innis JW. Identification and prevention of a GC content bias in SAGE libraries. Nucleic Acids Res. 2001;29:E60–E60. doi: 10.1093/nar/29.12.e60.
  • 159.Bilotti K, et al. Mismatch discrimination and sequence bias during end-joining by DNA ligases. Nucleic Acids Res. 2022;50:4647–4658. doi: 10.1093/nar/gkac241.
  • 160.Tegally H, et al. The evolving SARS-CoV-2 epidemic in Africa: insights from rapidly expanding genomic surveillance. Science. 2022;378:eabq5358. doi: 10.1126/science.abq5358.
  • 161.Rubben K, et al. Cas9 targeted nanopore sequencing with enhanced variant calling improves CYP2D6–CYP2D7 hybrid allele genotyping. PLoS Genet. 2022;18:e1010176. doi: 10.1371/journal.pgen.1010176.
  • 162.Gopalan S, Wang Y, Harper NW, Garber M, Fazzio TG. Simultaneous profiling of multiple chromatin proteins in the same cells. Mol. Cell. 2021;81:4736–4746.e5. doi: 10.1016/j.molcel.2021.09.019.
  • 163.Stuart T, et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. Nat. Biotechnol. 2022. doi: 10.1038/s41587-022-01588-5.
  • 164.Jain M, Abu-Shumays R, Olsen HE, Akeson M. Advances in nanopore direct RNA sequencing. Nat. Methods. 2022;19:1160–1164. doi: 10.1038/s41592-022-01633-w.
  • 165.Brinkerhoff H, Kang ASW, Liu J, Aksimentiev A, Dekker C. Multiple rereads of single proteins at single-amino acid resolution using nanopores. Science. 2021;374:1509–1513. doi: 10.1126/science.abl4381.
  • 166.Bhamla MS, et al. Hand-powered ultralow-cost paper centrifuge. Nat. Biomed. Eng. 2017;1:0009. doi: 10.1038/s41551-016-0009.
  • 167.Samarakoon H, et al. Genopo: a nanopore sequencing analysis toolkit for portable Android devices. Commun. Biol. 2020;3:538. doi: 10.1038/s42003-020-01270-z.
  • 168.Palatnick A, Zhou B, Ghedin E, Schatz MC. iGenomics: comprehensive DNA sequence analysis on your smartphone. Gigascience. 2020;9:giaa138. doi: 10.1093/gigascience/giaa138.

Associated Data

Supplementary Materials

Supplementary Information (353.1KB, pdf)

Articles from Nature Reviews. Genetics are provided here courtesy of Nature Publishing Group
