Abstract
Analysis of structural variations (SVs) is important for understanding the mutations underlying genetic disorders and pathogenic conditions. However, characterizing SVs using short-read, high-throughput sequencing technology is difficult. Although long-read sequencing technologies are increasingly employed in characterizing SVs, their low throughput and high costs discourage widespread adoption. Sequence motif-based optical mapping in nanochannels is useful in whole-genome mapping and SV detection, but it cannot precisely locate breakpoints or estimate copy numbers. We present here a universal multicolor mapping strategy in nanochannels that combines a conventional sequence-motif labeling system with Cas9-mediated target-specific labeling of any 20-base sequence (20mer) to create custom labels and detect new features. The sequence motifs are labeled with green fluorophores and the 20mers are labeled with red fluorophores. Using this strategy, it is possible not only to detect SVs but also to use custom labels to interrogate features not accessible to motif labeling, locate breakpoints, and precisely estimate copy numbers of genomic repeats. We validated our approach by quantifying the D4Z4 copy number, a known biomarker for facioscapulohumeral muscular dystrophy (FSHD), and by estimating the telomere length, a clinical biomarker for assessing disease risk in aging-related diseases and malignant cancers. We also demonstrate the application of our methodology in discovering transposable long interspersed element 1 (LINE-1) insertions across the whole genome.
More and more structural variations (SVs) are being associated with complex, multifactorial disorders.1–7 However, SVs in the human genome have been difficult to characterize using short-read, high-throughput sequencing technologies. Long-read sequencing technologies have been increasingly employed in identifying SVs.8 However, their low throughput and high cost prevent them from being widely adopted in these applications. Optical mapping technology uses single molecules longer than 300 kbp, which allows it to capture long-range information, and it has found wide application in genome research. Recently, several groups reported the application of optical mapping with the direct labeling scheme (Bionano Genomics) to detect somatic cancer markers and large SVs.9–15
Typically, optical mapping is based on mapping specific 6-base to 8-base sequence motifs across the whole genome.16–19 However, these motifs are unevenly spread along the genome and are often absent in repetitive regions. Furthermore, motif-based optical mapping lacks base-level information making it impossible to precisely locate SV breakpoints. To achieve a site-specific fluorophore tagging on DNA, McCaffrey et al. leveraged the specificity of the Cas9 enzyme together with traditional motif-mapping approach and developed a nickase-based strategy for target-specific labeling of any 20 bases across the whole genome.20 This approach has found wide applications, especially in the characterization of SVs.21–26
To further expand its applications, we developed an enzymatic labeling strategy for multicolor whole-genome mapping by combining direct label enzyme (DLE-1, Bionano Genomics) with Cas9-mediated nick-labeling reaction. Using this universal strategy, it is possible to target and fluorescently label any 20mer or the combination of multiple 20mers across the whole genome, especially in repetitive regions lacking DLE motifs. Custom maps can be generated to enable precise detection of breakpoints and interrogate the repetitive sequences; this enables more in-depth analysis of SVs than was previously possible.
We validated our approach by quantifying the number of D4Z4 repeats in chromosome 4q, detecting long interspersed element 1 (LINE-1) insertions, and estimating the telomere length. D4Z4 is a 3.3 kbp repeat sequence associated with facioscapulohumeral muscular dystrophy (FSHD). The repeats occur at the 4q35 and 10q26 loci, which lack the motifs targeted by the DLE enzyme and the nickase (Nt. BspQI) used for conventional mapping. Similarly, telomeres in humans are chromosome-capping (TTAGGG)n repeats with varying lengths up to 20 kbp, and they also occur in genomic regions lacking labeling motifs. LINE-1 elements are transposable elements that are frequently inserted across the genome. Optical mapping with DLE alone does not differentiate LINE-1s from other insertions. With our DLE–Cas9 methodology, we can fluorescently tag specific sequences to differentiate LINE-1 insertions from others, quantify the copy numbers of D4Z4 repeats, and estimate the telomere length.
DLE-1-based whole-genome mapping is gradually replacing Nt. BspQI-based whole-genome mapping and is becoming widely adopted. The nickase-based (Nt. BspQI–Cas9) multicolor labeling approach has found certain applications, especially in the characterization of SVs.21–26 However, limitations such as the breakage of long DNA molecules by Nt. BspQI and a tedious protocol prevent its widespread adoption. Our new approach of combining DLE-1 and Cas9 labeling offers several advantages.
First, the DLE–Cas9-based labeling strategy preserves DNA integrity better and results in longer DNA molecules than Nt. BspQI-based labeling. This is critical in studying long structural variants. For instance, the D4Z4 locus and its flanking regions span ~250 kbp. DLE–Cas9 retains more long molecules spanning the repeat region with less coverage than Nt. BspQI–Cas9 labeling and allows reliable quantification of D4Z4 repeats. In telomere length characterization, the nickase-based Nt. BspQI–Cas9 labeling approach characterized 36 out of 46 telomeres. Among the uncharacterized telomeres, four chromosome arms (16p, 17p, 19q, 22q) lacked data due to DNA breakage at fragile sites in the subtelomeric regions arising from Nt. BspQI sites occurring in close proximity. The rest were not characterized because of missing reference sequences or insufficient data, which prevented assembly of these subtelomeric regions.25 With our DLE–Cas9 labeling approach, five of the missing 10 telomeres (16p, 17p, 19q, 22q, and 23p) can be characterized (Figure S1). Second, in the Nt. BspQI–Cas9 method, both Nt. BspQI labeling and Cas9 labeling are nick-labeling chemistries. One must remove the first-color nucleotides before introducing the second color, which makes the second-color labeling less efficient.25 This often results in incomplete labeling and characterization of specific targets. The DLE–Cas9 labeling approach is simpler and has a higher efficiency of second-label incorporation. By using the DLE–Cas9 methodology, we could detect all the LINE-1 insertions across the NA12878 human genome and reliably quantify the D4Z4 copy numbers. Finally, the entire Nt. BspQI–Cas9 labeling process is more tedious because it subjects the DNA to additional steps, which increases the likelihood of shortening the DNA molecules. In contrast, the DLE-1 labeling chemistry and the DLE–Cas9 protocol are simple and preserve DNA integrity.
Our flexible and simple DLE–Cas9 enzymatic strategy for multicolor labeling leverages the global haplotyping characteristic by DLE-1 labeling and programmable Cas9-mediated nick-labeling. This universal strategy can be used to target any 20mer across the whole genome, all in a single tube reaction. Furthermore, the simultaneous detection of multiple targets within a single reaction enables customizable interrogations for applications such as breakpoint detection, repetitive sequence characterization, investigating mutagenesis, and so forth.
EXPERIMENTAL SECTION
DNA Preparation.
High-molecular-weight gDNA was purified either from cells embedded in agarose-gel plugs using commercial kits as per the manufacturer’s specifications (BioRad no. 170–3592) or via Nanobind disk-based solid phase extraction (Bionano Genomics). The DNA samples were then quantified on a Qubit using an AccuGreen Broad Range dsDNA Quantitation Kit (Biotium). DNA samples with concentrations in the range of 36–150 ng/μL were used for labeling.
Guide RNA Sequences.
Telomere, 4qD4Z4, and 10qD4Z4 probes were ordered from Integrated DNA Technologies (IDT) as crRNA. The LINE-1 single guide RNA (sgRNA) mix was synthesized in the lab. The four sgRNAs (sgRNA_1 to sgRNA_4) were designed to target 20 bases starting at positions 97, 1425, 3660, and 5841, respectively, in a full-length LINE-1 reference (L1.3; GenBank: L19088). For LINE-1 insertion detection, we performed the experiment using LINE-1 and telomere guide RNAs. The same experiment also provided the data for the telomere analysis reported in this paper. For D4Z4 characterization, we performed the experiment using three guide RNAs (4qD4Z4, 10qD4Z4, and telomere). Here, the telomere guide RNA was included as a control for the second labeling step but was not analyzed. In another experiment, we combined all gRNAs listed in Table 1; it generated similar results, but the analysis is not included in this paper.
Table 1.
Targets Used in DLE–Cas9 Labeling of NA12878
| guide RNAs | 20-base recognition sequences |
|---|---|
| LINE-1 sgRNA_1 | GGTACCGGGTTCATCTCACT |
| LINE-1 sgRNA_2 | CAAGTTGGAAAACACTCTGC |
| LINE-1 sgRNA_3 | GCTTATCCACCATGATCAAG |
| LINE-1 sgRNA_4 | GAAGGGGAATATCACACTCT |
| telomere | TTAGGGTTAGGGTTAGGGTT |
| 4qD4Z4 | TGGGAGAGCGCCCCGTCCGG |
| 10qD4Z4 | GAGAGCGAAGGCACCGTGCC |
Single Guide RNA Synthesis.
Four LINE-1-specific targets (Table 1) were each encoded on a 55-base DNA oligo along with a T7 promoter (5′-TTCTAATACGACTCACTATAG-3′) and an overlap sequence (5′-GTTTTAGAGCTAGA-3′) and ordered from IDT. An 80-base complementary oligo designed to hybridize to the overlap sequence was also ordered from IDT (5′-AAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC-3′). A 10 μM equimolar pool of the 4 oligos was first made and mixed with 10 μM complementary oligo in the presence of 1× NEBuffer 2.0 (New England Biolabs, NEB) and 2 mM dNTPs. The mix was incubated at 90 °C for 15 s followed by 43 °C for 5 min to promote hybridization. Double-stranded DNA was then synthesized by adding 5 U of Klenow exo- (NEB) to the mix and incubating at 37 °C for 1 h. Any remaining single-stranded DNA was degraded by adding 10 U of Exonuclease I (NEB) in 1× Exonuclease buffer and incubating at 37 °C for 1 h. The synthesized dsDNA was purified using a QIAquick Nucleotide Removal Kit (Qiagen), quantified by absorbance spectroscopy, and used for RNA synthesis. The sgRNA mix of 4 LINE-1 targets was synthesized from this dsDNA following the manufacturer’s instructions for the NEB HiScribe T7 High Yield RNA Synthesis Kit. After transcription and DNase I (NEB) treatment, the sgRNA was purified using spin columns (Monarch RNA Cleanup Kit T2030, NEB) and quantified by absorbance spectroscopy before use in the labeling reactions.
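The template design described above can be checked with a few lines of code. The following Python snippet is a minimal sketch rather than the procedure actually used: it assumes the oligo is laid out 5′ to 3′ as T7 promoter, 20-base target, overlap (the text lists the components but not their order), and the function names are hypothetical. It confirms that each assembled oligo is 55 bases long and that the overlap is the reverse complement of the 3′ end of the complementary oligo, which is what allows the two strands to anneal before Klenow extension.

```python
# Sequences quoted in the text and Table 1 (all written 5'->3').
T7_PROMOTER = "TTCTAATACGACTCACTATAG"
OVERLAP = "GTTTTAGAGCTAGA"
COMPLEMENTARY_OLIGO = (
    "AAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTG"
    "CTATTTCTAGCTCTAAAAC"
)
LINE1_TARGETS = {
    "sgRNA_1": "GGTACCGGGTTCATCTCACT",
    "sgRNA_2": "CAAGTTGGAAAACACTCTGC",
    "sgRNA_3": "GCTTATCCACCATGATCAAG",
    "sgRNA_4": "GAAGGGGAATATCACACTCT",
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_template_oligo(target):
    """Assemble promoter + target + overlap (ASSUMED order, see lead-in)."""
    if len(target) != 20:
        raise ValueError("expected a 20-base target sequence")
    return T7_PROMOTER + target + OVERLAP


oligos = {name: build_template_oligo(t) for name, t in LINE1_TARGETS.items()}

# The overlap anneals if its reverse complement is the 3' end of the
# complementary oligo.
overlap_anneals = COMPLEMENTARY_OLIGO.endswith(reverse_complement(OVERLAP))

for name, oligo in oligos.items():
    print(name, len(oligo), oligo)
print("overlap anneals to complementary oligo:", overlap_anneals)
```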
DLE–Cas9 Labeling.
First, 750 ng of genomic DNA was labeled with a DLS labeling kit (Bionano Genomics) as per the manufacturer’s recommendations. In the second step, 300 ng of DLE-1 labeled DNA was nicked with Cas9D10A and subsequently labeled with Taq DNA polymerase. The crRNA and/or sgRNA used for the Cas9-mediated nicking reactions are listed in Table 1.
Briefly, a direct labeling enzyme master mix was prepared with Bionano Genomics’ DLE kit components (direct labeling enzyme, 1× DLE reaction buffer, and DL-green labeling mix) and added to DNA. The reaction was mixed well and incubated at 37 °C for 2 h. After this incubation, excess protein, fluorescent entities, and salt in the reaction volume were depleted by membrane dialysis for up to 2 h at room temperature in dark. A 100 nm hydrophilic membrane (EMD Millipore, VCWP04700) was chosen for efficient diffusion. Following this, the recovered DNA was once again quantified on a Qubit before proceeding to the second step.
For the second step, 0.5 μL of 50 μM crRNA and 0.5 μL of 0.5 μM tracrRNA (IDT) were first mixed and incubated on ice for 30 min. This incubation was omitted when using the synthesized guide RNA. Then, 200 ng of Cas9D10A was added to 25 pmol of RNA and incubated in 1× NEBuffer 3.1 for 15 min at 37 °C. Next, 300 ng of DLE-1 labeled DNA was added to this mixture and a nicking reaction was performed at 37 °C for 1 h. The nicked DNA was then labeled in the presence of 67 nM of nucleotides (Atto647 dUTP, Atto647 dATP, dGTP, dCTP) with 5 U of Taq DNA polymerase for 1 h at 72 °C in 1× Thermopol buffer (NEB). The nick-labeled sample was treated with Proteinase K (Qiagen) at 50 °C for 30 min and prepared for loading on the nanochannels: a staining mix (with flow buffer, DTT, and DNA stain from a Bionano Genomics DLS kit) was prepared according to the Bionano Prep Labeling NLRS Protocol (30024, Rev K; bionanogenomics.com), added to the sample, and incubated overnight at room temperature.
Imaging on Bionano Nanochannels.
The labeled sample was loaded on the Bionano Saphyr G1.2 chip and imaged using a “dual-labeled sample” workflow. Red and green labels were sequentially excited with 637 and 532 nm lasers, respectively, and the YOYO-1-stained DNA backbone was then excited with a 473 nm laser. For each experiment, we collected 480 Gb of data. The raw molecule images were converted into BNX files and saved on Bionano Access. The molecules were first de novo assembled based on the green channel (DLE-1) reference. The red labels were later identified based on their expected locations on the genome and further analyzed.
Two-Color Data Analysis.
The red label locations, identified with “1” in the “LabelChannel” column in the Cmap files in this assembly, were extracted. This information, however, is not listed in the Xmap files since the de novo assembly is performed based on the green-channel map. The locations for these labels relative to other green labels on the same molecule are found in the BNX file as well as the Cmap files. Shortlisted molecules for analysis containing the expected pattern of green and red labels were extracted from both these files. The raw molecules from the BNX file without stretch-match were used to generate the histograms.
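The extraction step can be illustrated with a small parser. This Python snippet is a sketch under stated assumptions, not the analysis code used here: it assumes a tab-delimited CMAP whose column names are given on a `#h` header line and include `CMapId`, `LabelChannel`, and `Position`, and the sample rows are invented for illustration. The channel value "1" follows the description above and is exposed as a parameter in case a given assembly encodes channels differently.

```python
def extract_labels(cmap_lines, channel="1"):
    """Return (CMapId, Position) for every label in the requested channel.

    Assumes column names are listed on a '#h' header line and that all
    other lines starting with '#' are comments.
    """
    header = None
    labels = []
    for raw in cmap_lines:
        line = raw.strip()
        if line.startswith("#h"):
            header = line[2:].split()  # column names
            continue
        if not line or line.startswith("#"):
            continue
        if header is None:
            raise ValueError("no '#h' header line found before data rows")
        row = dict(zip(header, line.split()))
        if row["LabelChannel"] == channel:
            labels.append((row["CMapId"], float(row["Position"])))
    return labels


# Invented example rows; real files carry more columns and many more sites.
SAMPLE_CMAP = "\n".join([
    "# CMAP File Version:\t0.1",
    "#h CMapId\tContigLength\tNumSites\tSiteID\tLabelChannel\tPosition",
    "#f int\tfloat\tint\tint\tint\tfloat",
    "1\t50000.0\t3\t1\t2\t10000.0",
    "1\t50000.0\t3\t2\t1\t21680.0",
    "1\t50000.0\t3\t3\t1\t23360.0",
    "1\t50000.0\t3\t4\t0\t50000.0",
])

red_labels = extract_labels(SAMPLE_CMAP.splitlines())
print(red_labels)
```

Reading the column names from the header, rather than hard-coding column indices, keeps the sketch usable if the file carries extra columns.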
RESULTS AND DISCUSSION
Quantification of D4Z4 Copy Numbers in 4q35.
The D4Z4 locus on the 4q35 chromosome arm is composed of tandemly repeated 3.3 kbp units, and D4Z4 copy number variation in the 4qA haplotype is thought to be responsible for FSHD presentation.27,28 However, the D4Z4 repeats share high sequence homology (99.9%) with repeats on 10q26 and with a 9.5 kbp region on Chr Y.29 This complicates the determination of D4Z4 copy numbers in these regions. Optical mapping relies on long single molecules of 300 kb, about 10 times the average read length of long-read sequencing methods.
In this experiment, we used three guide RNAs (4qD4Z4, 10qD4Z4, and telomere). The DNA was labeled at the DLE recognition motif (CTTAAG) with green fluorophores using the DLE enzyme. The D4Z4 repeat array on the 4q chromosome arm was targeted with red fluorophores using two guide RNAs, 4qD4Z4 and 10qD4Z4 (Table 1). The telomere guide RNA served as an internal control for the second labeling step. Based on the hg38 reference of the 4q D4Z4 locus, the two target sites (“4qD4Z4” and “10qD4Z4”) lie about 1648 bp apart within each repeat unit. When one probe, that is, “4qD4Z4”, is used, a 3.3 kbp repeating label pattern is detected, which gives a detection limit of one repeat unit. When both probes are used, a ~1.68 kbp repeating label pattern is detected and the resolution improves to half a repeat unit, which increases the accuracy of the copy number estimate.
De novo assembled contigs spanning across the D4Z4 regions are shown in Figure 1A. DLE labels allow mapping not only to distinguish the 4q35 and 10q26 regions of D4Z4, but also separate the two haplotypes of 4qA and 4qB based on DLE signature (pink boxes in Figure 1A) (Bionano Solve Theory of Operation EnFocus FSHD Analysis Documentation, bionanogenomics.com). The molecules from 10q and 4q are already separated based on the DLE labels. The gRNAs were designed specifically to quantify the copy numbers of D4Z4 on the 4q chromosome.
Figure 1.

(A) De novo assembled optical maps of DLE–Cas9 labeled D4Z4 array on chromosome 4q in NA12878. On the top, 4qA haplotype is seen and on the bottom, 4qB haplotype can be seen. The wide green bar at the top denotes the hg38 reference. The wide blue bar below the reference represents consensus contigs from the de novo assembly. Individual molecules are represented by the thin yellow lines arranged under the consensus contigs. Dark blue vertical ticks on the single molecules (yellow lines) indicate labeled DLE sites and the red vertical ticks in the subtelomeric region indicate D4Z4 target-specific red labels. The figures show only a part of all labeled molecules aligned to 4qA and 4qB. (B) Graph of distances between the red labels plotted against their frequencies. Here, the X-axis indicates the distances between the two closest red labels which occurred along the length of the D4Z4 array of a molecule and the Y-axis indicates the frequency of the recorded distances across all mapped molecules.
The D4Z4 repeats labeling is shown as red ticks in Figure 1A. More red labels are present in the 4qA haplotype across longer distances than the 4qB haplotype. Varying distances between the neighboring red labels are observed.
Figure 1B shows the histogram of all recorded distances between neighboring red labels obtained from all molecules that span the entire D4Z4 region. We then performed Gaussian fitting of each peak and found peak locations at ~1.68, 3.36, 5.0, 6.6, 9.9, and 13.2 kbp. The peak at ~1.68 kbp, shorter than the expected full D4Z4 repeat length, indicates the distance between an on-target label and an off-target label. Longer distances, such as 6.6, 9.9, and 13.2 kbp, indicate that expected red labels were missing. The average spacing between the peaks of haplotype 4qA, 1.68 kbp, was taken as the average length of half a D4Z4 repeat unit. The same 1.68 kbp spacing was obtained for the 4qB haplotype. This is exactly half of the 3.36 kbp unit because of the off-target labeling by the 10qD4Z4 probe. The red labeling at ~190 Mb in Figure 1 is probably due to telomere-like sequence or off-target labeling by the 4qD4Z4 guide RNA.
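The relationship between the histogram peaks and missing labels can be made concrete with a short calculation. The Python snippet below is an illustrative sketch only: it treats every neighbor distance as a multiple of the 1.68 kbp half unit and infers how many expected labels are absent in each gap. The label positions are hypothetical, and the simple rounding stands in for, but does not reproduce, the Gaussian peak fitting described above.

```python
HALF_UNIT_KB = 1.68  # half of a D4Z4 repeat unit, as reported in the text


def neighbor_distances(positions_kb):
    """Distances between consecutive red labels on one molecule."""
    ordered = sorted(positions_kb)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def missing_labels(distance_kb, unit_kb=HALF_UNIT_KB):
    """Number of expected labels absent inside a gap of the given length."""
    n_units = max(1, round(distance_kb / unit_kb))
    return n_units - 1


# Hypothetical red-label positions (kb) along one molecule.
positions = [0.0, 1.7, 5.0, 6.7, 13.4]
distances = neighbor_distances(positions)
gaps = [missing_labels(d) for d in distances]

for d, g in zip(distances, gaps):
    print(f"gap of {d:.2f} kb -> {g} missing label(s)")
```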
We reasoned that the D4Z4 copy number can be estimated accurately by dividing the total length of the D4Z4 array, measured from the first to the last detected red label, by the 1.68 kb repeating unit. Using 1.68 kb as the repeating unit could increase the accuracy. To calculate the total length of the D4Z4 repeats, we needed to determine the “TRUE” first and last red labels, since the overall labeling efficiency within this array was not 100% and many molecules missed the first or last red label. We measured the distance from the first red label of each molecule to the left flanking DLE site (arrows in Figure 1A). The shortest such distance, shared by 75% of the molecules belonging to the 4qA haplotype, was 7.7 ± 2 kb. The same percentage of molecules on 4qA showed a distance of 1 ± 2 kb between the last red label and the right flanking DLE site. Only the molecules containing both the “TRUE” first red label and the “TRUE” last red label were used to calculate the total length of the D4Z4 repeats. In total, 37 molecules in 4qA and 44 molecules in 4qB were used for our D4Z4 copy number analysis.
Taken together, we estimated that 4qA has an average of 96 copies of the 1.68 kb unit, that is, 48 ± 0.94 copies of the 3.36 kb unit. 4qB was estimated to have 38 copies of the 1.68 kb unit, that is, 19 ± 0.29 copies of the 3.36 kb unit. This is consistent with the numbers reported in previous studies.30–32 Here, we showed an accuracy of better than a single copy.
FSHD is conventionally diagnosed using Southern blotting, which offers only semi-quantitative results.33 In a small set of specimens (n = 87), Southern blotting produced indeterminate results in 23% of the cases.34 As a result, alternative approaches based on molecular combing,35,36 optical mapping,13 and long-read sequencing37,38 are gaining popularity for more efficient diagnosis of FSHD. Although long-read sequencing read lengths have improved significantly since their inception, whole-genome sequencing is expensive, while targeted sequencing of long regions such as the D4Z4 repeats remains infeasible. Optical mapping can address some of these issues with long molecules, but because the array lacks motifs, D4Z4 repeats are estimated from the distance between the closest DLE sites, which leads to inaccuracies.32 For more direct quantification, a specific enzyme, Nb. BssSI, is needed to tag each repeat with fluorophores.13 Our DLE–Cas9 method is more universal and versatile, and it can be used to tag any target or multiple targets simultaneously. The number of repeats that we estimate is comparable to earlier reports for healthy samples (between 10 and 240).13,28,32,35,36 For the first time, we quantify the standard deviation of our method, 0.97 repeats for 4qA, which makes it possible to resolve differences of less than one D4Z4 repeat unit for 4qA (the pathogenic haplotype). This is especially important for FSHD cases, where fewer than 8–10 repeats need to be counted accurately to differentiate the phenotypes.39
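The copy number calculation described in this paragraph can be sketched as follows. This Python snippet is a hedged illustration rather than the pipeline used for the reported numbers: the molecule measurements are hypothetical, the function names are invented, and the ±2 kb windows around 7.7 kb and 1 kb are taken from the text as the acceptance criteria for the “TRUE” first and last labels.

```python
from statistics import mean, stdev

HALF_UNIT_KB = 1.68  # spacing between neighboring red labels (from the text)


def has_true_end_labels(left_gap_kb, right_gap_kb,
                        left_expected=7.7, right_expected=1.0, tol=2.0):
    """Keep a molecule only if its first and last red labels sit at the
    expected distances from the flanking DLE sites."""
    return (abs(left_gap_kb - left_expected) <= tol
            and abs(right_gap_kb - right_expected) <= tol)


def half_unit_count(array_span_kb, unit_kb=HALF_UNIT_KB):
    """Array length (first to last red label) in 1.68 kb units."""
    return array_span_kb / unit_kb


def full_repeat_count(array_span_kb):
    """Array length expressed in full 3.36 kb D4Z4 repeat units."""
    return half_unit_count(array_span_kb) / 2


# Hypothetical molecules: (left gap, first-to-last label span, right gap), kb.
molecules = [
    (7.9, 161.3, 0.8),
    (8.1, 159.6, 1.5),
    (14.6, 154.5, 1.1),  # first label missing, so this molecule is rejected
    (7.2, 163.0, 0.4),
]

kept = [m for m in molecules if has_true_end_labels(m[0], m[2])]
copies = [full_repeat_count(span) for _, span, _ in kept]
print(f"{len(kept)} molecules kept; "
      f"{mean(copies):.2f} ± {stdev(copies):.2f} repeats")
```

A span of 161.28 kb, for example, corresponds to 96 half units or 48 full repeats, which is the scale of the 4qA estimate reported in this section.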
Telomere Labeling and Length Estimation.
Telomere length is a recognized clinical biomarker for aging and aging-related diseases. Several published studies correlate the unregulated telomere length to malignant cancers (bladder, esophageal, gastric, head, breast, neck, ovarian, renal, and endometrial).40–42 We previously demonstrated an optical mapping approach to estimate the individual telomere length by combining the conventional nickase-labeling with Cas9 labeling.25 However, only 36 (out of 46) could be mapped in the subtelomeric regions due to limitations such as fragile sites (nick sites occurring close to each other on the opposite strand). The two successive nicking reactions in the previous method are also laborious and cause DNA damage. To adequately address the above challenges, we apply our new DLE–Cas9 methodology to perform a telomere length measurement assay.
In this assay, we first used direct label enzyme (DLE-1, Bionano Genomics) to globally tag DNA at all DLE-specific motifs with green fluorophores. For telomere-specific labeling, we performed a Cas9 nick-labeling reaction.20 The Cas9 nickase was directed to telomere repeats by a 20-base synthetic guide RNA ordered from IDT (telomere, Table 1) to create nicks and telomeric repeats were then labeled with a red fluorescent dye. The labeled DNA molecules were imaged using high-throughput nanochannel arrays on the Bionano Saphyr system.43 A de novo assembly was performed based on the DLE-labels and the assemblies were aligned to hg38 reference. Individual molecules with red telomere labels at the ends were identified and used for the quantification of telomere lengths.
In Figure 2A, the de novo assembled contigs of 14q and 20q with their long single molecules are shown aligned to the hg38 reference. The wide green bar at the top denotes the hg38 reference. The wide blue bar below the reference represents the consensus contigs from the de novo assembly. The consensus contigs of both 14q and 20q matched well with the hg38 reference map. Individual molecules are represented by the thin yellow lines arranged under the consensus contigs. The dark blue vertical ticks on the single molecules (yellow lines) indicate labeled DLE sites and the red vertical ticks indicate target-specific red labels (shown by arrows). These red labels are clearly at the ends of the molecules, indicating that the telomere repeats were labeled. In the bottom panel of Figure 2A, the labeling at ~64.27 Mb is due to the presence of telomere-like sequences in the subtelomeric region. As a proof of principle, we then quantified the total intensity of telomere labels from the molecules belonging to the 14q and 20q arms. Figure 2B shows a plot of the measured intensities of the red labels on single molecules containing telomere termini. Each filled circle represents the total red label intensity of a single molecule. The 14q arm had an average intensity of 4.79 ± 4.81, while 20q had an average intensity of 3.0 ± 2.6. The high standard deviations of intensity reflect the heterogeneity in telomere lengths from different cells within a sample.44 Fragmentation of either the 5′ or 3′ telomere ends could affect the quantification. However, such fragmentation is rare among telomere-containing molecules and much less frequent than DNA fragmentation in the middle of the molecules, away from the telomeres.
Moreover, we did not observe any telomere loss (no telomere) in normal cell lines, as opposed to the telomere loss observed in cancer or aging cell lines.25,45 To translate the intensity to absolute base pairs, one needs a standard containing a known number of telomere repeats and a known system optical specificity.25 The lack of such system information for the commercial instrument makes it difficult to provide base pair information.
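The per-arm summary statistics reported above (mean ± standard deviation of the total red intensity per molecule) can be computed as in the following Python sketch. The intensity values are hypothetical placeholders, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(intensities):
    """Return (mean, sample standard deviation) of per-molecule intensities."""
    return mean(intensities), stdev(intensities)


# Hypothetical total red-label intensities per molecule, grouped by arm.
intensities_by_arm = {
    "14q": [1.2, 3.5, 9.8, 4.1],
    "20q": [2.0, 3.0, 4.0],
}

summary = {arm: summarize(vals) for arm, vals in intensities_by_arm.items()}
for arm, (avg, sd) in summary.items():
    print(f"{arm}: {avg:.2f} ± {sd:.2f}")
```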
Figure 2.

(A) De novo assembled optical maps of DLE–Cas9 labeled telomeric repeats array on chromosome 14q (top panel) and 20q (bottom panel) in NA12878. The wide green bar at the top denotes the hg38 reference. The wide blue bar below the reference represents consensus contigs from the de novo assembly. Individual molecules are represented by the thin yellow lines arranged under the consensus contigs. Dark blue vertical ticks on the single molecules (yellow lines) indicate labeled DLE sites and the red vertical ticks at the ends of single molecules indicate telomere red labels. Only a part of all aligned single molecules (yellow lines) are shown in the maps. (B) Plot with measured intensities of red labels at telomere-termini containing single molecules from 14q and 20q arms. Each filled circle represents the total red label intensity of a single molecule. The horizontal bar represents the average measured intensity.
Common telomere length assays include terminal restriction fragment (TRF)46 and qPCR.47 Both methods estimate the average telomere length. Single telomere length analysis (STELA)48 and quantitative fluorescence in situ hybridization (Q-FISH)49 were developed to detect and measure the length of specific telomeres. However, STELA can only measure a limited number of chromosomes, and Q-FISH is limited to the analysis of cells currently in metaphase and is unable to measure telomeres in terminally senescent cells or cells that are no longer able to divide.46,50
The optical-mapping-based telomere characterization assay by McCaffrey et al. can address the above challenges but, because of fragile sites, has been successful in measuring only 36 of 46 telomere lengths. We demonstrate the application of our new DLE–Cas9 methodology to label and characterize telomeres. Using our assay, we were able to label and measure telomeric intensities in all chromosome arms except those of the 5 acrocentric chromosomes (data not shown). The lack of hg38 reference sequences makes it especially difficult to characterize the telomeres of the 5 remaining short acrocentric chromosome arms (13p, 14p, 15p, 21p, and 22p). With our methodology, we also demonstrated the ability to multiplex targets in a single assay. We combined all gRNAs listed in Table 1 to label multiple targets in a single assay and it generated similar results (data not included). In an earlier report, we demonstrated the synthesis and use of up to 200 sgRNAs in a single tube reaction.24
Detecting Long Interspersed Elements with DLE–Cas9 Multicolor Mapping.
LINE-1 insertions make up ~17% of the human genome. These insertions have been associated with various cancers, hemophilia, muscular dystrophy, and other genetic disorders.51–57 An individual is thought to have 80–100 active LINE-1 insertions responsible for most of the human retrotransposon activity. These active LINE-1s are ~6 kbp in length and are thought to differ between individuals.58–60
Optical mapping with sequence motifs, such as DLE, is very efficient in detecting insertions.26 When the size distribution of all the insertions from the whole genome assembly is plotted, a peak at 6 kb is always observed, which could be mostly attributed to the full-length LINE-1 insertions. However, optical mapping cannot differentiate other 6 kb insertions from LINE-1 insertions because mapping does not provide base-by-base information. As a proof of concept, we employed our DLE–Cas9 method to tag and detect LINE-1 insertions in the NA12878 sample.
We specifically designed and synthesized 4 single guide RNAs (Table 1) to target 4 different 20-base sequences on the LINE-1 reference at locations 97, 1425, 3660, and 5841 and separated by 1328, 2235, and 2181 bp. These sites were labeled with red fluorescent nucleotides. A de novo assembly was performed based on the DLE labels and the assemblies were aligned to the hg38 reference. A typical LINE-1 insertion detected using our DLE–Cas9 mapping is shown in Figure 3. Here, both DLE and the red labels have been stretch-matched and aligned to the reference.
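The expected label spacing follows directly from the guide start positions. The short Python sketch below reproduces that arithmetic and also lists the spacing order expected for an insertion in the opposite orientation; the dictionary and function names are illustrative only.

```python
# Start positions of the four guide targets in the LINE-1 reference (bp).
GUIDE_STARTS = {"sgRNA_1": 97, "sgRNA_2": 1425, "sgRNA_3": 3660, "sgRNA_4": 5841}


def expected_spacings(starts):
    """Distances between consecutive target sites, in reference order."""
    ordered = sorted(starts.values())
    return [b - a for a, b in zip(ordered, ordered[1:])]


forward = expected_spacings(GUIDE_STARTS)
reverse = forward[::-1]  # order seen if the insertion is inverted

print("forward orientation:", forward)
print("reverse orientation:", reverse)
```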
Figure 3.

LINE-1 insertions detected in a Chr4 haplotype using our DLE–Cas9 approach. Both DLE and red labels are stretch-matched in the figure. (A) Haplotype with the 6 kbp LINE-1 insertion. (B) Second haplotype with no insertion at the same genomic region.
Two haplotypes were observed in this region, with a 6 kb insertion carrying red labels detected from 146,303,137 to 146,312,443 bp in haplotype 1 (Figure 3A) and no insertion at the same location in haplotype 2 (Figure 3B). The average distances between the red labels in haplotype 1 were measured to be 1.5, 2.3, and 2.2 kb, which match the distances between the four designed guide RNA targets in the LINE-1 reference. The sequential 1.5–2.3–2.2 kb order also indicates that the orientation of the insertion matches the reference. Moreover, the distances between the two unmatched DLE motifs (yellow vertical lines on the contig) inside the insertion also match the LINE-1 reference. Taken together, we designated this insertion as a LINE-1 insertion. The other haplotype, shown without a LINE-1 insertion (Figure 3B), may still contain some LINE-1-like sequences because some red labels are present.
Figure 3 also shows some red labels in a neighboring location (from 146,347,677 to 146,357,405 bp) but without any detected insertion. These indicate the presence of some LINE-1 sequences in this location, near the LINE-1 insertion. Interestingly, many of the LINE-1 insertions occurred in the locations in the vicinity of LINE-1 sequences.
We then scanned the whole genome looking for insertions with red labels that are separated by 1.5 ± 0.5, 2.3 ± 0.3, and 2.3 ± 0.3 kb; only molecules with three red labels were used in the analysis. We discovered 55 LINE-1 insertion sites of NA12878. We compared our results with a recent study by Zhou et al. that identified 52 LINE-1 insertions in NA12878 using PacBio sequencing data.61 Our method was able to identify 51 of 52 of these insertions and four additional locations that were not reported by Zhou et al. On further investigation, we discovered that the one location we missed (chr2: 131243591–131243683) was not a true LINE-1 insertion since the optical maps did not show any insertions in this location nor were any red labels found. The four additional LINE-1 insertions all passed our pipeline. Table S1 (Supporting Information) lists all the locations with the zygosity and orientation where LINE-1 insertions were found. The DNA molecules in the nanochannels are typically stretched to 85% of their theoretical maximum length.43 Factors such as the width of the nanochannel, salt concentration and voltage changes can cause localized variations in this stretching factor.62,63 However, a stretch-match function provided by Bionano Genomics was used to normalize the label locations shown in Figure 3.64 The stretch-match of the red labels in Figure 3 should not affect the LINE-1 detection. As we used four guide RNAs specific to LINE-1 sequences, the mere presence of the red labels together with the 6 kbp insertions detected by DLE labels should be enough to confirm that the insertions are LINE-1 sequences. In conclusion, our gRNA, labeling, and pipeline successfully detected all the LINE-1 insertions found by Zhou et al. and found four new previously unidentified locations.
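The spacing criterion used in this scan can be expressed as a small pattern matcher. The Python snippet below is one possible reading of that criterion, not the pipeline itself: it assumes that a candidate needs at least three red labels, that the gaps between consecutive labels must fall inside a contiguous run of the stated windows, and that a reversed window order indicates an inverted insertion. The function name and example coordinates are hypothetical.

```python
# (center, tolerance) windows in kb, as stated in the text.
EXPECTED_GAPS_KB = [(1.5, 0.5), (2.3, 0.3), (2.3, 0.3)]


def _gaps_match(gaps, windows):
    """True if every gap lies inside its corresponding window."""
    return all(abs(g - c) <= tol for g, (c, tol) in zip(gaps, windows))


def line1_orientation(label_positions_kb, min_labels=3):
    """Return 'forward', 'reverse', or None for a set of red labels.

    Forward is tested first, so a pattern compatible with both
    orientations (for example two gaps of 2.3 kb) is reported as forward.
    """
    pos = sorted(label_positions_kb)
    if len(pos) < min_labels:
        return None
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    if len(gaps) > len(EXPECTED_GAPS_KB):
        return None
    orientations = (("forward", EXPECTED_GAPS_KB),
                    ("reverse", EXPECTED_GAPS_KB[::-1]))
    for name, pattern in orientations:
        # Slide the observed gaps along the expected pattern.
        for start in range(len(pattern) - len(gaps) + 1):
            if _gaps_match(gaps, pattern[start:start + len(gaps)]):
                return name
    return None


print(line1_orientation([10.0, 11.5, 13.8, 16.0]))  # four labels, forward
print(line1_orientation([10.0, 12.2, 14.5, 16.0]))  # four labels, reverse
print(line1_orientation([0.0, 5.0, 10.0]))          # spacing does not match
```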
Active LINE-1 insertions are frequent, non-static SVs associated with cancer and with neurological and genetic disorders.65 Their mobile nature and variability between individuals make them challenging to study. Although long-read sequencing is widely used to characterize LINE-1 insertions, its low throughput and high cost may prevent its application in detecting specific LINE-1 insertions. Sequence-motif-based optical mapping, such as DLE and nickase labeling,16,18 does not provide sequence-level information for the identification of LINE-1 insertions. We demonstrated the applicability of our DLE–Cas9 methodology for the detection and characterization of full-length LINE-1 insertions, including their zygosity and orientation. Our approach can benefit clinical investigations by providing haplotype-resolved and structurally accurate LINE-1 consensus maps for genomic analysis.
CONCLUSIONS
Long-read sequencing technologies have progressed tremendously since their inception.8 However, their lower throughput, high cost, high error rate, and still relatively short average read length limit their application. For example, in estimating the D4Z4 repeat copy numbers, the read length must exceed 300 kb, including the upstream and downstream sequences, to separate the different haplotypes. Optical mapping can read single molecules with an average length of 300 kb. Optical mapping also offers a cost advantage: one can obtain 200× coverage for about $500, compared with $10,000–20,000 for whole-genome sequencing with long-read technologies. Targeted sequencing of D4Z4 is still challenging, as no commercially available enrichment kit can capture D4Z4.1
For the first time, we demonstrated the technological feasibility of combining DLE sequence-specific labeling and Cas9-mediated target-specific labeling to target any sequence in the genome. This is a universal and versatile methodology that can be used in the simultaneous analysis of multiple targets. In an earlier report, we demonstrated the synthesis and use of up to 200 sgRNAs in a single tube reaction;24 custom synthesis of the sgRNAs significantly reduces the cost of assays. In this paper, we were able to detect LINE-1 insertions and to estimate the copy numbers of D4Z4 repeats and telomere length in a single tube reaction, using either crRNA or sgRNA. More importantly, the whole assay is built on a commercial instrument and assay kit, so the method should be accessible for general use in most laboratories.
ACKNOWLEDGMENTS
This work is supported by grants from NIH (R01HG005946 and R01HG009708).
Footnotes
The authors declare no competing financial interest.
Complete contact information is available at: https://pubs.acs.org/10.1021/acs.analchem.1c01373
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.analchem.1c01373.
Locations of LINE-1 insertions detected with the DLE–Cas9 labeling method and telomeres detected with the DLE–Cas9 labeling method (PDF)
Contributor Information
Lahari Uppuluri, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania 19104, United States.
Tanaya Jadhav, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania 19104, United States.
Yilin Wang, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania 19104, United States.
Ming Xia, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania 19104, United States; Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, Pennsylvania 19104, United States.
REFERENCES
- (1).Macintyre G; Ylstra B; Brenton JD Trends Genet. 2016, 32, 530–542.
- (2).Sleegers K; Brouwers N; Gijselinck I; Theuns J; Goossens D; Wauters J; Del-Favero J; Cruts M; Duijn C. M. v.; Broeckhoven CV Brain 2006, 129, 2977–2983.
- (3).Rovelet-Lecrux A; Hannequin D; Raux G; Meur NL; Laquerrière A; Vital A; Dumanchin C; Feuillette S; Brice A; Vercelletto M; Dubas F; Frebourg T; Campion D Nat. Genet. 2006, 38, 24–26.
- (4).Talkowski ME; Rosenfeld JA; Blumenthal I; Pillalamarri V; Chiang C; Heilbut A; Ernst C; Hanscom C; Rossin E; Lindgren AM; Pereira S; Ruderfer D; Kirby A; Ripke S; Harris DJ; Lee J-H; Ha K; Kim H-G; Solomon BD; Gropman AL; Lucente D; Sims K; Ohsumi TK; Borowsky ML; Loranger S; Quade B; Lage K; Miles J; Wu B-L; Shen Y; Neale B; Shaffer LG; Daly MJ; Morton CC; Gusella JF Cell 2012, 149, 525–537.
- (5).Osei-Owusu IA; Norris AL; Joynt AT; Thorpe J; Cho S; Tierney E; Schmidt J; Hagopian L; Harris J; Pevsner J Mol. Case Stud. 2020, 6, a005884.
- (6).Peng Y; Yuan C; Tao X; Zhao Y; Yao X; Zhuge L; Huang J; Zheng Q; Zhang Y; Hong H; Chen H; Sun Y Transl. Lung Cancer Res. 2020, 9, 670.
- (7).Lupski JR Environ. Mol. Mutagen. 2015, 56, 419–436.
- (8).Chaisson MJP; Sanders AD; Zhao X; Malhotra A; Porubsky D; Rausch T; Gardner EJ; Rodriguez OL; Guo L; Collins RL Nat. Commun. 2019, 10, 1784.
- (9).Ebert P; Audano PA; Zhu Q; Rodriguez-Martin B; Porubsky D; Bonder MJ; Sulovari A; Ebler J; Zhou W; Mari RS Science 2021, 372, No. eabf7117.
- (10).Goldrich DY; LaBarge B; Chartrand S; Zhang L; Sadowski HB; Zhang Y; Pham K; Way H; Lai C-YJ; Pang AWC; Clifford B; Hastie AR; Oldakowski M; Goldenberg D; Broach JR J. Personalized Med. 2021, 11, 142.
- (11).Margalit S; Abramson Y; Sharim H; Manber Z; Bhattacharya S; Chen YW; Vilain E; Barseghyan H; Elkon R; Sharan R bioRxiv 2021, DOI: 10.1101/2021.01.28.428654.
- (12).Sharim H; Grunwald A; Gabrieli T; Michaeli Y; Margalit S; Torchinsky D; Arielly R; Nifker G; Juhasz M; Gularek F; Almalvez M; Dufault B; Chandra SS; Liu A; Bhattacharya S; Chen Y-W; Vilain E; Wagner KR; Pevsner J; Reifenberger J; Lam ET; Hastie AR; Cao H; Barseghyan H; Weinhold E; Ebenstein Y Genome Res. 2019, 29, 646–656.
- (13).Dai Y; Li P; Wang Z; Liang F; Yang F; Fang L; Huang Y; Huang S; Zhou J; Wang D; Cui L; Wang K J. Med. Genet. 2020, 57, 109–120.
- (14).Zheng Y; Kong L; Xu H; Lu Y; Zhao X; Yang Y; Yu G; Li P; Liang F; Jin H; Kong X Prenat. Diagn. 2020, 40, 317–323.
- (15).Pastor S; Tran O; Jin A; Carrado D; Silva BA; Uppuluri L; Abid HZ; Young E; Crowley TB; Bailey AG Sci. Rep. 2020, 10, 12235.
- (16).Xiao M; Phong A; Ha C; Chan T-F; Cai D; Leung L; Wan E; Kistler AL; DeRisi JL; Selvin PR; Kwok P-Y Nucleic Acids Res. 2007, 35, No. e16.
- (17).Schwartz D; Li X; Hernandez L; Ramnarain S; Huff E; Wang Y Science 1993, 262, 110–114.
- (18).Grunwald A; Dahan M; Giesbertz A; Nilsson A; Nyberg LK; Weinhold E; Ambjörnsson T; Westerlund F; Ebenstein Y Nucleic Acids Res. 2015, 43, No. e117.
- (19).Barseghyan H; Tang W; Wang RT; Almalvez M; Segura E; Bramble MS; Lipson A; Douine ED; Lee H; Délot EC; Nelson SF; Vilain E Genome Med. 2017, 9, 90.
- (20).McCaffrey J; Sibert J; Zhang B; Zhang Y; Hu W; Riethman H; Xiao M Nucleic Acids Res. 2016, 44, No. e11.
- (21).Heft IE; Mostovoy Y; Levy-Sakin M; Ma W; Stevens AJ; Pastor S; McCaffrey J; Boffelli D; Martin DI; Xiao M; Kennedy MA; Kwok P-Y; Sikela JM Genetics 2020, 214, 179–191.
- (22).Young E; Pastor S; Rajagopalan R; McCaffrey J; Sibert J; Mak ACY; Kwok P-Y; Riethman H; Xiao M Nucleic Acids Res. 2017, 45, No. e73.
- (23).Young E; Abid HZ; Kwok P-Y; Riethman H; Xiao M PLoS Genet. 2020, 16, No. e1008347.
- (24).Abid HZ; Young E; McCaffrey J; Raseley K; Varapula D; Wang H-Y; Piazza D; Mell J; Xiao M Nucleic Acids Res. 2021, 49, No. e8.
- (25).McCaffrey J; Young E; Lassahn K; Sibert J; Pastor S; Riethman H; Xiao M Genome Res. 2017, 27, 1904–1915.
- (26).Levy-Sakin M; Pastor S; Mostovoy Y; Li L; Leung AKY; McCaffrey J; Young E; Lam ET; Hastie AR; Wong KHY Nat. Commun. 2019, 10, 1025.
- (27).Wijmenga C; Brouwer OF; Padberg GW; Frants RR Lancet (British edition) 1992, 340, 985–986.
- (28).Lemmers RJLF; de Kievit P; Sandkuijl L; Padberg GW; van Ommen G-JB; Frants RR; van der Maarel SM Nat. Genet. 2002, 32, 235–236.
- (29).Jiang G; Yang F; van Overveld PGM; Vedanarayanan V; van der Maarel S; Ehrlich M Hum. Mol. Genet. 2003, 12, 2909–2921.
- (30).Lyle R; Wright TJ; Clark LN; Hewitt JE Genomics 1995, 28, 389–397.
- (31).Winokur S; Bengtsson U; Vargas JC; Wasmuth JJ; Altherr MR Hum. Mol. Genet. 1996, 5, 1567–1575.
- (32).Jian Wang ETL; Andy WCP; Tom W; Zhang D; Sadowski HB; Hastie AR; Oldakowski M High throughput analysis of tandem repeat contraction associated with Facioscapulohumeral Muscular Dystrophy (FSHD) by optical mapping. Annual Meeting 2019, American Society of Human Genetics, 2019.
- (33).Lemmers RJLF; van der Wielen MJR; Bakker E; Padberg GW; Frants RR; van der Maarel SM Ann. Neurol. 2004, 55, 845–850.
- (34).Yanoov-Sharav M; Leshinsky-Silver E; Cohen S; Vinkler C; Michelson M; Lerman-Sagie T; Ginzberg M; Sadeh M; Lev D J. Genet. Counsel. 2012, 21, 557–563.
- (35).Vasale J; Boyar F; Jocson M; Sulcova V; Chan P; Liaquat K; Hoffman C; Meservey M; Chang I; Tsao D; Hensley K; Liu Y; Owen R; Braastad C; Sun W; Walrafen P; Komatsu J; Wang J-C; Bensimon A; Anguiano A; Jaremko M; Wang Z; Batish S; Strom C; Higgins J Neuromuscul. Disord. 2015, 25, 945–951.
- (36).Nguyen K; Walrafen P; Bernard R; Attarian S; Chaix C; Vovan C; Renard E; Dufrane N; Pouget J; Vannier A; Bensimon A; Lévy N Ann. Neurol. 2011, 70, 627–633.
- (37).Mitsuhashi S; Nakagawa S; Takahashi Ueda M; Imanishi T; Frith MC; Mitsuhashi H Sci. Rep. 2017, 7, 14789.
- (38).Morioka MS; Kitazume M; Osaki K; Wood J; Tanaka Y PLoS One 2016, 11, No. e0151963.
- (39).Butz M; Koch MC; Muller-Felber W; Lemmers RJLF; van der Maarel SM; Schreiber H J. Neurol. 2003, 250, 932–937.
- (40).Blackburn EH; Epel ES; Lin J Science 2015, 350, 1193–1198.
- (41).Wentzensen IM; Mirabello L; Pfeiffer RM; Savage SA Cancer Epidemiol. Biomark. Prev. 2011, 20, 1238–1250.
- (42).Benati M; Montagnana M; Danese E; Mazzon M; Paviati E; Garzon S; Laganà AS; Casarin J; Giudici S; Raffaelli R; Ghezzi F; Franchi M; Lippi G Pathol. Oncol. Res. 2020, 26, 2281–2289.
- (43).Lam ET; Hastie A; Lin C; Ehrlich D; Das SK; Austin MD; Deshpande P; Cao H; Nagarajan N; Xiao M; Kwok P-Y Nat. Biotechnol. 2012, 30, 771–776.
- (44).Britt-Compton B; Rowson J; Locke M; Mackenzie I; Kipling D; Baird DM Hum. Mol. Genet. 2006, 15, 725–733.
- (45).Abid HZ; McCaffrey J; Raseley K; Young E; Lassahn K; Varapula D; Riethman H; Xiao M BMC Genom. 2020, 21, 485.
- (46).Aubert G; Hills M; Lansdorp PM Mutat. Res. Fund Mol. Mech. Mutagen 2012, 730, 59–67.
- (47).Cawthon RM Nucleic Acids Res. 2009, 37, No. e21.
- (48).Baird DM; Rowson J; Wynford-Thomas D; Kipling D Nat. Genet. 2003, 33, 203–207.
- (49).Lansdorp P; Verwoerd NP; Van De Rijke FM; Dragowska V; Little M-T; Dirks RW; Raap AK; Tanke HJ Hum. Mol. Genet. 1996, 5, 685–691.
- (50).Montpetit AJ; Alhareeri AA; Montpetit M; Starkweather AR; Elmore LW; Filler K; Mohanraj L; Burton CW; Menzies VS; Lyon DE; Jackson-Cook CK Nurs. Res. 2014, 63, 289.
- (51).Takasu M; Hayashi R; Maruya E; Ota M; Imura K; Kougo K; Kobayashi C; Saji H; Ishikawa Y; Asai T; Tokunaga K Tissue Antigens 2007, 70, 144–150.
- (52).Stacey SN; Kehr B; Gudmundsson J; Zink F; Jonasdottir A; Gudjonsson SA; Sigurdsson A; Halldorsson BV; Agnarsson BA; Benediktsdottir KR; Aben KKH; Vermeulen SH; Cremers RG; Panadero A; Helfand BT; Cooper PR; Donovan JL; Hamdy FC; Jinga V; Okamoto I; Jonasson JG; Tryggvadottir L; Johannsdottir H; Kristinsdottir AM; Masson G; Magnusson OT; Iordache PD; Helgason A; Helgason H; Sulem P; Gudbjartsson DF; Kong A; Jonsson E; Barkardottir RB; Einarsson GV; Rafnar T; Thorsteinsdottir U; Mates IN; Neal DE; Catalona WJ; Mayordomo JI; Kiemeney LA; Thorleifsson G; Stefansson K Hum. Mol. Genet. 2016, 25, 1008–1018.
- (53).Hancks DC; Kazazian HH Mobile DNA 2016, 7, 9–28.
- (54).Nakamura Y; Murata M; Takagi Y; Kozuka T; Nakata Y; Hasebe R; Takagi A; Kitazawa J.-i.; Shima M; Kojima T Int. J. Hematol. 2015, 102, 134–139.
- (55).Qian Y; Mancini-DiNardo D; Judkins T; Cox HC; Daniels C; Holladay J; Ryder M; Coffee B; Bowles K; Roa B Identification Of Retrotransposon Insertion Mutations in Hereditary Cancer. 65th Annual Meeting of the American Society of Human Genetics, 2015.
- (56).Peixoto A; Pinheiro M; Massena L; Santos C; Pinto P; Rocha P; Pinto C; Teixeira MR J. Hum. Genet. 2013, 58, 78–83.
- (57).Gonçalves A; Oliveira J; Coelho T; Taipa R; Melo-Pires M; Sousa M; Santos R Genes 2017, 8, 253.
- (58).Brouha B; Schustak J; Badge RM; Lutz-Prigge S; Farley AH; Moran JV; Kazazian HH Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 5280–5285.
- (59).Sassaman DM; Dombroski BA; Moran JV; Kimberland ML; Naas TP; DeBerardinis RJ; Gabriel A; Swergold GD; Kazazian HH Nat. Genet. 1997, 16, 37–43.
- (60).Dombroski B; Mathias S; Nanthakumar E; Scott A; Kazazian H Science 1991, 254, 1805–1808.
- (61).Zhou W; Emery SB; Flasch DA; Wang Y; Kwan KY; Kidd JM; Moran JV; Mills RE Nucleic Acids Res. 2019, 48, 1146–1163.
- (62).Jo K; Dhingra DM; Odijk T; de Pablo JJ; Graham MD; Runnheim R; Forrest D; Schwartz DC Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 2673–2678.
- (63).Reccius CH; Stavis SM; Mannion JT; Walker LP; Craighead HG Biophys. J. 2008, 95, 273–286.
- (64).Shelton JM; Coleman MC; Herndon N; Lu N; Lam ET; Anantharaman T; Sheth P; Brown SJ BMC Genom. 2015, 16, 734.
- (65).Zhang X; Zhang R; Yu J Front. Cell Dev. Biol. 2020, 8, 657.
