Abstract
Long-read RNA sequencing (RNA-seq) is emerging as a powerful and versatile technology for studying human transcriptomes. By enabling the end-to-end sequencing of full-length transcripts, long-read RNA-seq opens up avenues for investigating various RNA species and features that cannot be reliably interrogated by standard short-read RNA-seq methods. In this review, we present an overview of long-read RNA-seq, delineating its strengths over short-read RNA-seq, as well as summarizing recent advances in experimental and computational approaches to boost the power of long-read-based transcriptomics. We describe a wide range of applications of long-read RNA-seq, and highlight its expanding role as a foundational technology for exploring transcriptome variations in human diseases.
Keywords: long-read, RNA-seq, transcriptomics, isoform, RNA modification
Graphical abstract
Wang, Lin, and colleagues review long-read RNA sequencing, which is emerging as a powerful and versatile technology for studying human transcriptomes. The authors outline the strengths, recent advances, and diverse applications of long-read RNA sequencing, highlighting its expanding role as a foundational technology for exploring transcriptome variations in human diseases.
Introduction
The human genome contains approximately 20,000 protein-coding genes, yet it is estimated to encode over 300,000 unique protein isoforms,1 with approximately 160,000 protein-coding transcript isoforms currently annotated by GENCODE v.41.2 This transcript isoform diversity results in part from various forms of alternative transcription or RNA processing, such as alternative transcriptional start site usage, alternative splicing, and alternative polyadenylation (Figure 1A). More than 95% of multi-exon genes undergo alternative splicing, by which various combinations of exons and splice sites contribute to distinct mature transcript products.3,4 Common alternative splicing events include exon skipping, alternative 5′- and 3′-splice site usage, mutually exclusive exon usage, and intron retention.5 Human genes also have an average of four different transcriptional start sites,6 and alternative polyadenylation occurs in over 70% of human genes.7 The resulting transcript isoforms may differ in their regulatory properties, such as subcellular localization, stability, and translational efficiency, and may encode protein isoforms with different structures and functions. Human RNAs also undergo a wide variety of chemical modifications.8 Some of the modified bases that occur frequently in mRNAs include 5-methylcytosine (m5C), N1-methyladenosine (m1A), pseudouridine (ψ), N6-methyladenosine (m6A), and inosine produced by A-to-I RNA editing (Figure 1B). These RNA modifications broadly impact the mRNA life cycle, influencing nearly every aspect of RNA processing and metabolism.
Figure 1.
The complexity of the human transcriptome
(A) Common patterns of alternative splicing, alternative transcription start site, and alternative polyadenylation. Dark blue boxes, constitutively spliced exons; red and yellow boxes, alternatively spliced exons; light blue box, retained intron; horizontal lines, introns. (B) RNA modifications. Top: common chemical modifications in human mRNAs. m5C, 5-methylcytosine; m1A, N1-methyladenosine; ψ, pseudouridine; m6A, N6-methyladenosine; CDS, coding sequence; UTR, untranslated region. Bottom: adenosine to inosine (A-to-I) RNA editing. ADAR, adenosine deaminase acting on RNA. (C) Chimeric RNAs generated by either trans-splicing (top) or readthrough and cis-splicing (bottom). Different colored boxes, exons from different parental genes; horizontal lines, introns. (D) Circular RNAs generated through back-splicing: a downstream splice donor site is joined to an upstream splice acceptor site (right). Linear RNAs generated through canonical splicing of pre-mRNAs (left). (E) Transposable element (TE)-derived transcripts. Different TE loci may contain discriminative single-nucleotide polymorphisms (SNPs). Magenta boxes, TE loci and corresponding TE-derived transcripts; diamonds, discriminative SNPs.
The complexity of the human transcriptome is further increased by a diverse array of transcripts that result from non-canonical transcription or splicing events. For example, chimeric RNAs can be generated from trans-splicing between distant genes or from readthrough and cis-splicing between adjacent genes (Figure 1C).9 Circular RNAs can be formed through back-splicing of pre-mRNAs (Figure 1D).10 In addition, transposable elements (TEs), which are typically silenced in the genome, can be activated under specific physiological or pathological conditions to produce TE-derived transcripts (Figure 1E).11 While some of these non-canonical transcripts are protein coding, most are likely non-coding RNAs that play regulatory roles in gene expression.
Short-read vs. long-read RNA sequencing
Over the past 15 years, massively parallel RNA sequencing (RNA-seq), primarily utilizing the Illumina-based “short-read” sequencing platform, has become the standard strategy for transcriptome profiling.12 Contemporary short-read sequencers offer high throughput and high base accuracy, providing a powerful tool for exploring the complexity of transcriptomes in both healthy and diseased conditions. In addition to characterizing the overall expression levels of protein-coding genes, short-read RNA-seq can also interrogate a wide range of RNA species as well as diverse mechanisms of RNA processing and regulation.13,14 Despite its widespread use and impact, short-read RNA-seq has a fundamental limitation due to its short read length, typically ranging from 50 to 300 bp, which is significantly shorter than the average size of human mRNAs (approximately 3 kb).15 As a result, short-read RNA-seq workflows require the fragmentation of mRNA molecules before sequencing and cannot directly sequence full-length transcript isoforms. As the connectivity between distant exons is lost, inferring full-length transcript isoforms from short-read RNA-seq data is a significant challenge (Figure 2A).16 In addition, the short read length poses read mapping challenges for certain RNA species, such as circular RNAs or TE-derived transcripts.17,18
Figure 2.
Short-read vs. long-read RNA-seq
(A) Schematic comparison of short-read vs. long-read RNA-seq alignments to a gene with multiple transcript isoforms. Top: a gene structure with exons (black boxes) and introns (horizontal black lines). Middle: short reads (short gray boxes) and long reads (long colored boxes) aligned to the gene. Reads spanning multiple exons are connected by dashed lines. Bottom: three distinct transcript isoforms resulting from alternative splicing. (B) Major long-read sequencing platforms. Left: Pacific Biosciences (PacBio) sequencing uses a zero-mode waveguide with an immobilized DNA polymerase at the bottom. The DNA polymerase binds to a circularized DNA molecule and synthesizes the new strand. Fluorescently labeled nucleotides are incorporated and emit specific fluorescent signals that are detected in real time (top). Fluorescent pulses corresponding to nucleotide incorporations are recorded, with different colors representing different nucleotides (bottom). Right: Oxford Nanopore Technologies (ONT) sequencing utilizes a protein nanopore embedded in a membrane. DNA or RNA is unwound by a motor protein and translocates through the pore, causing changes in ionic current (top). Distinct ionic current blockades are recorded as different nucleotides pass through the pore and subsequently decoded into nucleotide sequences (bottom).
Long-read sequencing platforms, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long-read sequencers, enable end-to-end sequencing of full-length mRNA molecules, providing a transformative technology for transcriptome profiling (Figure 2).19 PacBio sequencing involves real-time detection of individual fluorescently labeled nucleotides as they are incorporated into a replicating strand.20 Each full-length molecule is processed by a single DNA polymerase immobilized within a nanowell. To increase read accuracy, the DNA is circularized via ligation of hairpin adaptors and sequenced in multiple continuous passes by the polymerase. The resulting concatemeric reads can be used to produce high-fidelity consensus sequences.21 ONT sequencing applies an electric field across a membrane-embedded protein pore and measures changes in the electric current as the nucleotide sequence of a single molecule travels through the pore.22 While both PacBio and ONT can sequence cDNAs, a unique feature of ONT is that it can also directly sequence native RNA molecules without cDNA conversion, allowing for detection of RNA modifications without additional chemical labeling.22
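To illustrate why repeated passes over the same circularized molecule raise accuracy, the following is a minimal toy sketch of consensus calling by per-position majority vote. It assumes the passes have already been aligned to equal length, which is a strong simplification; production consensus algorithms use probabilistic models and handle insertions and deletions. The function name `majority_consensus` and the example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-position majority base across pre-aligned passes.

    Toy assumption: every pass has the same length and is already aligned,
    so only substitution errors are modeled.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    consensus = []
    for column in zip(*passes):
        # most_common(1) returns the base observed most often in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical passes over one molecule, each with at most one error
passes = ["ACGTTGCA", "ACGATGCA", "ACGTTGCA", "ACCTTGCA", "ACGTTGCA"]
print(majority_consensus(passes))  # prints ACGTTGCA
```

Because errors in independent passes rarely fall on the same position, the vote recovers the underlying sequence even though two of the five passes contain a substitution.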
PacBio and ONT can generate reads longer than 10 kb,23 capturing full-length transcripts as single molecules rather than amplified clusters as in Illumina short-read sequencing. For almost a decade after the advent of long-read sequencing, the technology suffered from low throughput and low base accuracy, creating significant hurdles for its application.24 With recent improvements in sequencing chemistry and informatics, the throughput and accuracy of both PacBio and ONT platforms have improved substantially (Table 1). PacBio’s high-fidelity reads now offer exceptional base accuracy (99.9%) with read lengths up to 25 kb,26 while ONT can generate ultra-long reads up to 4 Mb with accuracy ranging from 95% to 99% using the R10.4 chemistry.27 ONT provides higher throughput (up to 277 Gb per PromethION flow cell) at a lower cost per Gb than PacBio (up to 90 Gb per SMRT cell).25 PacBio excels in applications requiring high base accuracy, such as detection of small sequence variants (SNPs, indels), while ONT’s ability to directly sequence native RNA molecules makes it uniquely suited for studying RNA modifications. These advancements are paving the way for broad applications of long-read RNA-seq in biomedical research.19
Table 1.
Technological features of short-read and long-read RNA-seq
Feature | Illumina short-read RNA-seq | PacBio long-read RNA-seq | ONT long-read RNA-seq |
---|---|---|---|
Read length | 50–300 bp (a) | up to 25 kb (b) | up to 4 Mb (c) |
Base accuracy | 99.9% (a) | 99.9% (b) | 95%–99% (R10.4 chemistry) (c) |
Throughput | 65–3,000 Gb per flow cell (a) | up to 90 Gb per SMRT cell (b) | up to 277 Gb per PromethION flow cell (c) |
Cost (e) | $12–$27 per Gb (d) | $65–$200 per Gb (d) | $22–$90 per Gb (d) |

(a) Data from the specification sheet for Illumina’s NovaSeq 6000 sequencing system (https://www.illumina.com/content/dam/illumina/gcs/assembled-assets/marketing-literature/novaseq-6000-spec-sheet-m-gl-00271/novaseq-6000-spec-sheet-m-gl-00271.pdf).
(b) Data from the introduction page for PacBio HiFi long-read sequencing (https://www.pacb.com/technology/hifi-sequencing/).
(c) Data from the introduction page for ONT sequencing platforms (https://nanoporetech.com/platform).
(d) Cost estimates from Scarano et al.25
(e) Costs may vary depending on sequencer models, kits, reagents, applications, and other factors.
Applications of long-read RNA sequencing
Long-read RNA-seq is a versatile tool for studying nearly all aspects of the transcriptome, spanning a variety of RNA species and features (Figure 1). In this section, we describe key applications of long-read RNA-seq and highlight recent advances in experimental and computational approaches that boost the power of long-read-based transcriptomics. The applications covered in the following subsections are representative rather than exhaustive.
Full-length transcript discovery and quantification
Long-read RNA-seq is expected to greatly facilitate the discovery and quantification of full-length transcripts. However, the high base error rate of long-read sequencing platforms in their early days—ranging between 5% and 20% on average for raw sequencing reads—historically posed a significant challenge for transcript analysis.28,29 In particular, reliable detection of splice sites from error-prone long RNA-seq reads is difficult, due to frequent sequencing errors around splice sites that confound state-of-the-art long-read RNA-seq aligners.30 Various strategies have been developed to mitigate this issue. Circular consensus sequencing, in which circularized cDNA molecules are sequenced multiple times, has been used to derive accurate consensus sequences from long concatemeric reads representing multiple copies of cDNA molecules.31,32 Hybrid sequencing, in which error-prone long RNA-seq reads are combined with high-accuracy short RNA-seq reads on the same RNA sample, has been used to improve splice site discovery and transcript analysis from long-read RNA-seq data.33,34 However, these strategies increase the complexity and cost of long-read RNA-seq studies and may introduce biases into the resulting data.
With continued improvement in the base accuracy of long-read sequencing platforms, a newer generation of computational tools has been developed for transcript analysis using raw long RNA-seq reads alone. Some of the representative tools published in recent years include StringTie2,35 FLAMES,36 ESPRESSO,37 IsoQuant,38 and Bambu.39 As an example, ESPRESSO considers error profiles of individual reads and aggregates information across multiple reads to refine the alignment of long RNA-seq reads to splice sites, thus improving the discovery and quantification of full-length transcript isoforms, including novel transcript isoforms not included in existing annotations.37 Most of these tools output count matrices of transcript isoforms, which can then be used as input for differential transcript expression and differential transcript splicing analysis, through tools like DESeq2,40 edgeR,41 or DRIMSeq.42
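As a minimal sketch of how an isoform count matrix feeds into downstream analysis, the example below converts per-isoform read counts from one sample into within-gene usage proportions, the kind of quantity that differential transcript usage methods compare between conditions. It does not reproduce the interface or statistical model of DESeq2, edgeR, or DRIMSeq; the function name `isoform_usage`, the identifiers, and the counts are all hypothetical.

```python
from collections import defaultdict


def isoform_usage(counts, isoform_to_gene):
    """Convert isoform read counts into within-gene usage proportions.

    counts: dict mapping isoform ID -> read count in one sample
    isoform_to_gene: dict mapping isoform ID -> parent gene ID
    Returns a dict mapping isoform ID -> proportion of its gene's reads,
    or None when the gene has no reads (usage is undefined).
    """
    gene_totals = defaultdict(int)
    for isoform, n in counts.items():
        gene_totals[isoform_to_gene[isoform]] += n
    usage = {}
    for isoform, n in counts.items():
        total = gene_totals[isoform_to_gene[isoform]]
        usage[isoform] = n / total if total > 0 else None
    return usage


# Hypothetical counts for one sample
counts = {"geneA.1": 30, "geneA.2": 10, "geneB.1": 0}
isoform_to_gene = {"geneA.1": "geneA", "geneA.2": "geneA", "geneB.1": "geneB"}
usage = isoform_usage(counts, isoform_to_gene)
print(usage)
```

Working with proportions rather than raw counts separates a change in isoform choice from a change in overall gene expression, which is why the two analyses are usually run as distinct tests.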
With many tools developed, there has been substantial interest in benchmarking long-read RNA-seq-based transcript analysis tools.43,44 Notably, the Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium systematically benchmarked 14 computational tools, using 427 million long RNA-seq reads generated by multiple PacBio and ONT protocols.43 No tool emerged as a clear frontrunner, and different tools may be recommended for different study objectives. For example, some tools are designed toward quantifying annotated transcript isoforms, while others are more receptive to discovering novel transcript isoforms. The authors also found that, as expected, obtaining full-length and highly accurate reads is more important for transcript discovery, while obtaining high read depth is more important for transcript quantification.
Transcripts derived from non-canonical transcription or splicing events
While conventionally used for profiling protein-coding mRNAs and long non-coding RNAs, long-read RNA-seq can be used to interrogate various RNA species derived from non-canonical transcription or splicing events, such as TE-derived transcripts, chimeric RNAs, and circular RNAs (Figure 1). The long read length is ideally suited for solving read mapping issues and determining full-length transcript sequences, which present challenges for short-read RNA-seq analysis of these RNA species. Specialized experimental protocols and computational tools have been developed to study such transcripts using long-read RNA-seq data.
Approximately 50% of the human genome is composed of repetitive TEs, and individual TE loci can share highly similar genomic sequences.45 While TEs are typically silenced transcriptionally by epigenetic mechanisms, TE-derived transcripts can be aberrantly activated and expressed in diseases such as cancer, highlighting their potential roles as biomarkers and regulators of disease.46 A major impediment to studying TE-derived transcripts and elucidating their biological functions is the analytic challenge of mapping RNA-seq reads unambiguously to individual TE loci. The repetitive nature of TEs poses a challenge for short-read RNA-seq, and standard short-read RNA-seq pipelines typically discard reads mapped to TE loci.17 Long-read RNA-seq has been used to increase resolution and mappability of TE-derived transcripts.47 Berrens et al. developed CELLO-seq for locus-specific single-cell analysis of TE-derived transcripts.48 Lee et al. developed LocusMasterTE, a computational tool for accurate TE expression quantification by integrating long-read and short-read RNA-seq data.49 Collectively, long-read RNA-seq may enable systematic investigations of TE-derived transcripts across diverse physiological and pathological conditions, unraveling their importance in human biology and disease.
Long-read RNA-seq has also been used for profiling transcripts derived from non-canonical splicing events, such as chimeric RNAs generated by gene fusions. Multiple computational tools, such as IDP-fusion,50 LongGF,51 PB_FLIP,52 JAFFAL,53 Genion,54 FusionSeeker,55 and CTAT-LR-fusion,56 have been designed to detect chimeric RNAs from long-read RNA-seq data, primarily in the context of cancer. Although each tool has been benchmarked by its developers, comprehensive and independent evaluations comparing all published methods are still lacking. Recent studies, like the one introducing CTAT-LR-fusion, have highlighted the advantage of long-read RNA-seq for detecting and characterizing gene fusions. Notably, long-read RNA-seq reveals full-length isoform structures and captures the isoform complexity of chimeric fusion transcripts, whereas short-read RNA-seq is typically restricted to detecting splicing events near fusion breakpoints.56
Circular RNAs represent another category of transcripts arising from non-canonical splicing events. Several long-read RNA-seq-based sequencing methods, such as isoCirc,57 CIRI-long,58 circNick-LRS,59 and circFL-seq,60 have been developed to determine the full-length transcript sequences and exact exonic compositions of circular RNAs. For example, isoCirc uses rolling circle amplification followed by ONT long-read sequencing to sequence full-length circular RNA isoforms.57 By sequencing 12 human tissues and one cell line (HEK293), the authors generated a catalog of 107,147 full-length circular RNA isoforms, including 40,628 isoforms over 500 nt in length. The full-length transcript sequences of such long circular RNAs are difficult to define using short-read RNA-seq.61 The authors also identified extensive alternative splicing within the internal part of circular RNAs, highlighting the isoform complexity of full-length circular RNAs. These studies have prompted experts in the circular RNA field to emphasize the importance of considering full-length transcript sequences in the naming conventions for circular RNAs.62
Variant calling and phasing
Long-read RNA-seq allows detection of small sequence variants (SNPs, indels) within RNA and subsequently linking such variants to specific full-length transcript isoforms, greatly enhancing the exploration of allelic variation in gene expression and splicing. The ability to phase variants at isoform resolution and observe the complete combination of splicing events and sequence variants within individual long reads overcomes the limitation of short-read RNA-seq for studying allele-specific transcriptome variation.63 A variety of variant callers have recently been developed for long-read DNA sequencing, such as DeepVariant,64 Longshot,65 iGDA,66 NanoCaller,67 PEPPER-Margin-DeepVariant,68 and Clair3.69 A recent study by de Souza et al. sought to benchmark and optimize long-read DNA sequencing-based variant callers for long-read RNA-seq data.70 The authors found that spliced alignments of long RNA-seq reads were not directly suitable for long-read DNA sequencing-based variant callers and proposed a pipeline to transform the original spliced alignments for variant calling. With this transformation, certain variant callers (Clair3 and DeepVariant) achieved a decent performance for calling SNPs and indels in long RNA-seq reads. Various factors, such as read coverage, distance to splice sites, presence of homopolymers, and allelic bias in gene expression, affected the performance of variant calling. Variant calling suffered near splice sites, within homopolymer regions, and within genes exhibiting allele-specific expression.70 More recently, variant callers designed specifically for long-read RNA-seq data have been released, such as Clair3-RNA and longcallR,71,72 but their performance has yet to be rigorously assessed. 
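One way a spliced alignment can be made digestible for a DNA-oriented variant caller, shown here as an assumption rather than as the pipeline of de Souza et al., is to split it at its intron gaps, which the SAM format encodes as `N` operations in the CIGAR string. The sketch below returns one intron-free block per exon segment. The function name `split_at_introns` and the example alignment are hypothetical, and a real pipeline would also need to split the read sequence, base qualities, and flags.

```python
import re

# SAM CIGAR operations; M, D, N, = and X consume reference positions
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def split_at_introns(pos, cigar):
    """Split a spliced alignment into intron-free blocks.

    pos: 1-based leftmost reference position of the alignment
    cigar: SAM CIGAR string, possibly containing N (skipped region) operations
    Returns a list of (block_start, block_cigar) tuples, one per exon segment.
    """
    blocks = []
    current_ops = []
    block_start = pos
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Close the current block and jump over the intron
            if current_ops:
                blocks.append((block_start, "".join(current_ops)))
            current_ops = []
            ref_pos += length
            block_start = ref_pos
            continue
        current_ops.append(f"{length}{op}")
        if op in REF_CONSUMING:
            ref_pos += length
    if current_ops:
        blocks.append((block_start, "".join(current_ops)))
    return blocks


# Hypothetical read spanning three exons with a 2-nt deletion in the second
blocks = split_at_introns(100, "50M1000N30M2D20M500N40M")
print(blocks)  # prints [(100, '50M'), (1150, '30M2D20M'), (1702, '40M')]
```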
Once called, individual variants can be phased into distinct haplotypes using tools like WhatsHap,73 HapCUT2,74 or LongPhase.75 These data can be used to define haplotype-resolved full-length transcriptomes, as demonstrated in a recent study on GTEx tissues and cell lines.76
RNA modifications
RNA modifications can be detected by long-read RNA-seq, either through sequencing of cDNAs (PacBio and ONT), or through direct sequencing of native RNAs (ONT). A-to-I RNA editing, a process in which adenosines are deaminated to inosines, is abundant in human transcriptomes.77 As inosine is recognized as guanosine during reverse transcription, an A-to-I RNA editing event in RNA can be detected as an A-to-G substitution in cDNA sequencing data. While A-to-I RNA editing is conventionally detected from Illumina RNA-seq data,78 a new tool, L-GIREMI, has been developed to identify A-to-I RNA editing events as A-to-G substitutions in PacBio cDNA sequencing data.79 In addition, leveraging ONT’s ability to directly sequence native RNAs, specialized tools such as Dinopore and DeepEdit have been developed to detect A-to-I RNA editing events, taking advantage of the characteristic electric current signal of sequences containing inosine and the increase in base calling errors surrounding edited bases in ONT direct RNA-seq data.80,81 L-GIREMI and DeepEdit can also resolve the phasing of A-to-I RNA editing events on a single transcript.79,81
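The principle of reading A-to-I editing as A-to-G substitutions in cDNA data can be sketched as a per-site tally. The example below is a toy illustration and not the algorithm of L-GIREMI or any other tool: it assumes gap-free alignments on the transcript strand, and it makes no attempt to separate editing from genomic SNPs or sequencing errors, which real methods must do. The function name `a_to_g_fraction` and all sequences are hypothetical.

```python
def a_to_g_fraction(reference, reads):
    """For every covered reference adenosine, report how often reads show G.

    reference: reference sequence on the transcript strand
    reads: list of (offset, sequence) pairs describing gap-free alignments,
           where offset is the 0-based reference position of the first base
    Returns a dict: reference position -> (g_count, coverage, g_fraction)
    """
    g_counts = {}
    coverage = {}
    for offset, seq in reads:
        for i, base in enumerate(seq):
            ref_pos = offset + i
            # Only reference adenosines can be A-to-I editing sites
            if ref_pos >= len(reference) or reference[ref_pos] != "A":
                continue
            coverage[ref_pos] = coverage.get(ref_pos, 0) + 1
            if base == "G":
                g_counts[ref_pos] = g_counts.get(ref_pos, 0) + 1
    return {
        pos: (g_counts.get(pos, 0), cov, g_counts.get(pos, 0) / cov)
        for pos, cov in sorted(coverage.items())
    }


# Hypothetical reference and four reads; position 4 shows frequent A-to-G
reference = "CATGACA"
reads = [(0, "CATGGCA"), (0, "CATGACA"), (1, "ATGGCA"), (3, "GGCA")]
sites = a_to_g_fraction(reference, reads)
print(sites)
```

In this toy data, three of the four reads covering position 4 carry a G, while the other adenosines show none, which is the pattern a candidate editing site would produce.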
ONT direct RNA-seq has also been used to detect and quantify various other RNA modifications, such as m6A and ψ.82,83 m6A is arguably the most extensively studied type of RNA modification by ONT direct RNA-seq, with many tools developed recently such as m6Anet,84 xPore,85 Nanocompore,86 and Remora,87 to name a few. These tools were designed based on either the interpretation of characteristic electric current signal of m6A or the recovery of m6A sites falsely denoted as base calling errors in ONT direct RNA-seq data.83 Zhong et al. conducted a comprehensive benchmark analysis of 10 tools and, after performing an integrative analysis of multiple tools, found likely authentic m6A sites that had not been detected previously by short-read RNA-seq methods.83 In addition, ONT direct RNA-seq has been used to study RNA modifications beyond mRNAs. For example, Nano-tRNAseq was developed to quantify the abundance and modifications of tRNAs.88
Looking ahead, an especially exciting opportunity lies in the simultaneous detection of multiple distinct types of RNA modifications, alongside sequence variants and splicing events, within full-length transcripts. An effort toward this goal is the recent development of CHEUI, a method to simultaneously detect m6A and m5C in RNA at single-molecule resolution via ONT direct RNA-seq.89 Similarly, TandemMod simultaneously detects several types of RNA modifications, including m6A and m5C, from ONT direct RNA-seq data.90 Ultimately, this isoform-resolved view of RNA modifications may offer deeper insights into their biogenesis, interactions, and regulatory consequences.
RNA structures
RNA molecules fold into complex secondary and tertiary structures through base pairing of nucleotides and formation of secondary structural elements (such as stems and loops) as well as various tertiary motifs.91 Structural information for most RNAs is incomplete, and most structural studies employing classical approaches such as X-ray crystallography and nuclear magnetic resonance spectroscopy only focus on a small number of RNAs.92 RNA structure can be elucidated through chemical probing, in which reagents that interact with unpaired bases generate identifiable adducts on high-throughput sequencing platforms, allowing RNA structures to be probed at scale.91,93 However, while chemical probing is traditionally coupled with short-read sequencing, an inherent limitation is that the structural information cannot be assigned to full-length transcript isoforms. Recently, long-read RNA-seq has been utilized for determining RNA structure at isoform resolution.94,95,96 PORE-cupine utilizes ONT direct RNA-seq with machine learning to determine isoform-specific secondary structures of cellular RNAs.94 Nano-DMS-MaP employs an ultra-processive reverse transcriptase to generate long cDNAs with mutational signatures at dimethylsulfate modification sites, followed by ONT cDNA sequencing and computational analysis to determine the secondary structures of individual transcript isoforms.96 Both studies demonstrate that transcript isoforms of the same gene can fold into different structures, contributing to distinct regulatory and functional properties.
Expanding applications of long-read RNA-seq to human diseases
As long-read RNA-seq becomes more versatile, accurate, high-throughput, and cost-effective, its applications to studies of human diseases have substantially expanded in recent years, deepening our understanding of their molecular basis and advancing the development of novel diagnostics and therapies. Below we describe the applications of long-read RNA-seq to three classes of human diseases: cancer, complex diseases, and rare Mendelian diseases.
Cancer
Cancer transcriptomes exhibit diverse forms of RNA dysregulation.97 Long-read RNA-seq has recently been used to profile the transcriptomes of a variety of cancer types.34,98,99,100,101,102,103,104,105,106 These studies have uncovered extensive isoform diversity in cancer transcriptomes, revealing transcript isoforms with potential biological and clinical significance. The ability of long-read RNA-seq to directly interrogate full-length transcript isoforms has been pivotal for elucidating the likely functions of isoform variation. For instance, long-read RNA-seq enables accurate inference of open reading frames, facilitating the identification of transcript isoforms harboring premature termination codons and targeted for nonsense-mediated mRNA decay (NMD). In a long-read RNA-seq analysis of 468 actionable cancer genes across 40 breast cancer cell lines, Wang et al. identified both subtype-specific and cell line-specific transcript isoforms, with a notable enrichment of NMD-targeted isoforms among cell line-specific isoforms of tumor suppressor genes (TSGs), suggesting a common RNA-associated mechanism for inactivating TSGs.105
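To make the link between full-length isoforms and NMD prediction concrete, the sketch below applies the widely used "50-nt rule" heuristic, under which a stop codon lying more than about 50 nt upstream of the last exon-exon junction is predicted to trigger NMD. This is an illustrative assumption and not necessarily the criterion used by Wang et al.; the function names, the distance convention (measured from the end of the stop codon), and the example transcripts are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_stop(transcript, cds_start):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(cds_start, len(transcript) - 2, 3):
        if transcript[i:i + 3] in STOP_CODONS:
            return i
    return None


def is_nmd_candidate(transcript, cds_start, exon_lengths, min_distance=50):
    """Apply the 50-nt rule to a spliced transcript sequence.

    transcript: spliced transcript sequence (exons concatenated 5' to 3')
    cds_start: 0-based index of the start codon
    exon_lengths: exon lengths in transcript order
    min_distance: stop-to-junction distance above which NMD is predicted
    """
    if sum(exon_lengths) != len(transcript):
        raise ValueError("exon lengths must sum to the transcript length")
    if len(exon_lengths) < 2:
        return False  # a single-exon transcript has no exon-exon junction
    stop = find_stop(transcript, cds_start)
    if stop is None:
        return False  # no in-frame stop codon found
    last_junction = sum(exon_lengths[:-1])
    # Distance from the end of the stop codon to the last junction
    return last_junction - (stop + 3) > min_distance


exons = [60, 90, 40]  # last exon-exon junction at transcript position 150
# Premature stop at position 30, far upstream of the last junction
premature = "ATG" + "GCT" * 9 + "TAA" + "GCT" * 52 + "G"
# Stop at position 159, inside the last exon
normal = "ATG" + "GCT" * 52 + "TAA" + "GCT" * 9 + "G"
print(is_nmd_candidate(premature, 0, exons))  # prints True
print(is_nmd_candidate(normal, 0, exons))     # prints False
```

The check depends on knowing both the reading frame and the complete exon structure of the same molecule, which is the information that a full-length read provides directly.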
Beyond profiling isoform diversity, long-read RNA-seq has also been used to study other features of cancer transcriptomes, including exonic mutations and fusion transcripts. For instance, PacBio long-read RNA-seq was used to detect mutations in BCR-ABL1 fusion transcripts that confer resistance to tyrosine kinase inhibitor therapy in patients with chronic myeloid leukemia.107 Similarly, it was used to determine the allelic configurations of double PIK3CA mutations in breast cancer, revealing that when both mutations occur on the same allele, oncogenicity is enhanced while cancer cells become more sensitive to PI3Kα inhibitors.108 Long-read RNA-seq has also facilitated the discovery of cancer-specific fusion transcripts and their associated structural variants, such as double-hop fusion transcripts resulting from complex structural variants in breast cancer,100 as well as fusion transcripts involving hepatitis B virus (HBV) and TEs in HBV-infected hepatocellular carcinoma.104
Recent studies have demonstrated the potential of long-read RNA-seq in identifying cancer biomarkers. For example, in human hepatocellular carcinoma, Chen et al. used PacBio long-read RNA-seq to identify tumor-specific transcript isoforms of genes such as ARHGEF2, DEK, ADRM1, and CD44.99 These transcript isoforms are highly expressed in tumors, but their expression levels are negligible in normal livers. Reggiardo et al. developed TE-aware profiling of cell-free RNAs (COMPLETE-seq), which uses ONT long-read RNA-seq to profile the transcriptome in cell-free plasma.109 Using this approach, the authors discovered that specific TE clades are enriched in cell-free RNAs from the blood plasma of cancer patients, offering disease-specific diagnostic biomarkers.
A promising application of long-read RNA-seq in cancer research is the discovery of novel therapeutic targets, particularly for immunotherapies. RNA dysregulation has emerged as an extensive but underexplored source of tumor antigens (TAs) and potential cancer immunotherapy targets.97 By facilitating full-length transcript and protein isoform analysis, long-read RNA-seq is uniquely positioned to discover TAs arising from RNA dysregulation in cancer cells, such as cancer-specific alternative splicing and TE expression. A long-read RNA-seq analysis of non-small-cell lung cancer aimed to discover aberrant transcript isoforms that may generate candidate TAs.102 As long-read RNA-seq data on cancer and normal tissues continue to accumulate and technologies continue to advance, this line of research is expected to accelerate, driving the discovery and targeting of this emerging class of TAs for various hard-to-treat cancers.
Complex diseases
Complex diseases, including various neurological and immunological disorders, are marked by complex interactions between multiple genetic and environmental factors.110,111 Using long-read RNA-seq, researchers have uncovered a vast array of transcript isoforms, including previously unannotated novel isoforms in medically important genes associated with various complex diseases.112,113,114,115,116,117 Hardwick et al. combined targeted RNA capture with nanopore long-read RNA-seq to profile 1,023 haplotype blocks across the human genome containing SNPs associated with neuropsychiatric traits and disorders.113 They found that 62% of these haplotype blocks, including 13% of intergenic blocks, expressed novel multi-exonic transcripts. Heberle et al. conducted deep nanopore long-read RNA-seq on 12 aged human frontal cortices, including 6 Alzheimer’s disease cases and 6 controls.117 They identified 1,917 medically important genes expressing multiple transcript isoforms, including 1,018 where transcript isoform variation altered protein-coding sequences. They also identified 99 transcript isoforms differentially expressed between Alzheimer’s disease cases and controls where the overall gene expression level was unchanged, highlighting the utility of isoform-level analysis of long-read RNA-seq data. Patowary et al. utilized PacBio long-read RNA-seq to examine the full-length transcriptome of the developing human neocortex.116 They identified over 200,000 distinct transcript isoforms, 72.6% of which were novel. Thousands of isoform switches were detected during cortical neurogenesis. Leveraging this catalog of transcript isoforms, the researchers reprioritized thousands of rare variants as potentially pathogenic for diseases such as autism spectrum disorders, intellectual disability, and neurodevelopmental disorders, highlighting the value of isoform discovery for variant interpretation.116
Quantitative trait locus (QTL) analysis of transcriptome profiles, such as expression QTL and splicing QTL analysis, is a powerful strategy for identifying genetic contributors to complex diseases.118 While these studies have traditionally relied on population-scale short-read RNA-seq data,5 recent work has begun to incorporate long-read RNA-seq into transcriptome QTL studies. For example, a recent study used nanopore direct RNA-seq to identify transcript QTLs and m6A QTLs across 60 lymphoblastoid cell lines from the 1000 Genomes Project.119 Long-read RNA-seq is expected to see wider adoption in transcriptome QTL studies of human tissues and cell types, offering valuable insights into the genetic mechanisms underlying complex diseases.
Rare Mendelian diseases
Many Mendelian diseases are individually rare but collectively common, with approximately 7,000 known rare Mendelian diseases affecting every organ system.120,121 Despite significant advances in clinical genetic testing through high-throughput DNA sequencing, diagnostic rates currently range from 20% to 60%, leaving many patients with suspected rare Mendelian diseases without a definitive genetic diagnosis.122,123 By uncovering the effects of genomic variants on RNA expression and processing, RNA-seq can improve genetic diagnosis and variant interpretation for rare Mendelian diseases, raising diagnostic rates by 7.5%–36% in various studies.124 While RNA-seq studies of rare Mendelian diseases have primarily relied on standard short-read sequencing, recent research has begun to incorporate long-read RNA-seq, primarily focusing on specific genes. For instance, researchers used long-read RNA-seq to identify pathogenic splice variants of LMNA in dilated cardiomyopathy and MYBPC3 in hypertrophic cardiomyopathy.125,126 Similarly, Chandrasekhar et al. used long-read RNA-seq to investigate splicing defects in USH2A associated with retinitis pigmentosa and type 2 Usher syndrome.127 Specifically, they detected various simple and complex mis-splicing patterns resulting from deep intronic variants as well as previously classified variants of uncertain significance, helping to confirm variant pathogenicity and resolve genetic diagnosis. Expanding from single genes to gene panels, Schwenk et al. developed a targeted RNA-seq approach for diagnosing Lynch syndrome, uncovering RNA splicing and gene expression defects in mismatch repair genes.128 Given its unique advantages in detecting complex transcript alterations and enabling haplotype-resolved full-length transcript analysis, long-read RNA-seq has the potential to be widely adopted in studies of rare Mendelian diseases, improving genetic diagnosis and enhancing variant interpretation.
Conclusion
Over the past three decades, the field of transcriptomics has undergone four major waves of technological disruption and innovation, marked by the development of expressed sequence tag sequencing, splicing-sensitive microarrays, short-read RNA-seq, and now long-read RNA-seq. Each wave has transformed research and unlocked new scientific opportunities that were previously out of reach. Today, transcriptomics is poised for another critical point of technological transformation. Although short-read RNA-seq has been the de facto standard for transcriptome analysis for the last 15 years, the rise of long-read RNA-seq is revolutionizing transcriptome research, offering new insights into the complexity of the human transcriptome and advancing our understanding of RNA biology in health and disease. Long-read RNA-seq technologies promise to profoundly impact disease research by improving diagnosis, uncovering mechanisms of pathogenesis and progression, and identifying novel therapeutic targets.
Currently, long-read RNA-seq still faces several challenges. Throughput and cost remain disadvantages compared with short-read RNA-seq, with long-read platforms typically producing fewer reads at a higher cost, which can limit the accuracy of transcript discovery and quantification. Obtaining high-quality RNA samples, particularly from clinical specimens, presents another obstacle, as degraded RNAs lead to incomplete transcripts, undermining full-length transcript analysis. In addition, long-read RNA-seq demands substantial computational resources for data storage and analysis, requiring robust infrastructure and specialized bioinformatics tools.
Despite these challenges, ongoing advancements in sequencing platforms, library preparation methods, and informatics tools are continually enhancing the accuracy, cost efficiency, and capabilities of long-read RNA-seq. Targeted long-read RNA-seq enables ultra-deep sequencing of full-length transcripts for any gene panel.105,129,130,131 Single-cell long-read RNA-seq allows full-length transcriptome analysis at single-cell resolution.36,132,133,134,135 The integration of long-read RNA-seq data with proteomics data can substantially enhance protein isoform discovery and characterization.1,136,137 We envision that, in the near future, long-read RNA-seq will replace short-read RNA-seq as the standard tool for transcriptome analysis in studies of human diseases.
Acknowledgments
This work was supported by a National Institutes of Health grant R01GM121827 (to L.L.). We thank Dr. Yi Xing for assistance with the manuscript.
Author contributions
Visualization, I.H.A. and F.W.; writing – original draft (equal), I.H.A. and N.D.; writing – review & editing (equal), F.W. and L.L.; funding acquisition, L.L.
Declaration of interests
The authors declare no competing interests.
Contributor Information
Feng Wang, Email: wangf3@chop.edu.
Lan Lin, Email: linlan@chop.edu.
References
- 1. Miller R.M., Jordan B.T., Mehlferber M.M., Jeffery E.D., Chatzipantsiou C., Kaur S., Millikin R.J., Dai Y., Tiberi S., Castaldi P.J., et al. Enhanced protein isoform characterization through long-read proteogenomics. Genome Biol. 2022;23:69. doi: 10.1186/s13059-022-02624-y.
- 2. Frankish A., Carbonell-Sala S., Diekhans M., Jungreis I., Loveland J.E., Mudge J.M., Sisu C., Wright J.C., Arnan C., Barnes I., et al. GENCODE: reference annotation for the human and mouse genomes in 2023. Nucleic Acids Res. 2023;51:D942–D949. doi: 10.1093/nar/gkac1071.
- 3. Wang E.T., Sandberg R., Luo S., Khrebtukova I., Zhang L., Mayr C., Kingsmore S.F., Schroth G.P., Burge C.B. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008;456:470–476. doi: 10.1038/nature07509.
- 4. Pan Q., Shai O., Lee L.J., Frey B.J., Blencowe B.J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 2008;40:1413–1415. doi: 10.1038/ng.259.
- 5. Park E., Pan Z., Zhang Z., Lin L., Xing Y. The Expanding Landscape of Alternative Splicing Variation in Human Populations. Am. J. Hum. Genet. 2018;102:11–26. doi: 10.1016/j.ajhg.2017.11.002.
- 6. Djebali S., Davis C.A., Merkel A., Dobin A., Lassmann T., Mortazavi A., Tanzer A., Lagarde J., Lin W., Schlesinger F., et al. Landscape of transcription in human cells. Nature. 2012;489:101–108. doi: 10.1038/nature11233.
- 7. Tian B., Manley J.L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 2017;18:18–30. doi: 10.1038/nrm.2016.116.
- 8. Roundtree I.A., Evans M.E., Pan T., He C. Dynamic RNA Modifications in Gene Expression Regulation. Cell. 2017;169:1187–1200. doi: 10.1016/j.cell.2017.05.045.
- 9. Jia Y., Xie Z., Li H. Intergenically Spliced Chimeric RNAs in Cancer. Trends Cancer. 2016;2:475–484. doi: 10.1016/j.trecan.2016.07.006.
- 10. Yang L., Wilusz J.E., Chen L.-L. Biogenesis and Regulatory Roles of Circular RNAs. Annu. Rev. Cell Dev. Biol. 2022;38:263–289. doi: 10.1146/annurev-cellbio-120420-125117.
- 11. Bourque G., Burns K.H., Gehring M., Gorbunova V., Seluanov A., Hammell M., Imbeault M., Izsvák Z., Levin H.L., Macfarlan T.S., et al. Ten things you should know about transposable elements. Genome Biol. 2018;19:199. doi: 10.1186/s13059-018-1577-z.
- 12. Wang Z., Gerstein M., Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009;10:57–63. doi: 10.1038/nrg2484.
- 13. Byron S.A., Van Keuren-Jensen K.R., Engelthaler D.M., Carpten J.D., Craig D.W. Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat. Rev. Genet. 2016;17:257–271. doi: 10.1038/nrg.2016.10.
- 14. Stark R., Grzelak M., Hadfield J. RNA sequencing: the teenage years. Nat. Rev. Genet. 2019;20:631–656. doi: 10.1038/s41576-019-0150-2.
- 15. Piovesan A., Antonaros F., Vitale L., Strippoli P., Pelleri M.C., Caracausi M. Human protein-coding genes and gene feature statistics in 2019. BMC Res. Notes. 2019;12:315. doi: 10.1186/s13104-019-4343-8.
- 16. Foord C., Hsu J., Jarroux J., Hu W., Belchikov N., Pollard S., He Y., Joglekar A., Tilgner H.U. The variables on RNA molecules: concert or cacophony? Answers in long-read sequencing. Nat. Methods. 2023;20:20–24. doi: 10.1038/s41592-022-01715-9.
- 17. Lanciano S., Cristofari G. Measuring and interpreting transposable element expression. Nat. Rev. Genet. 2020;21:721–736. doi: 10.1038/s41576-020-0251-y.
- 18. Nielsen A.F., Bindereif A., Bozzoni I., Hanan M., Hansen T.B., Irimia M., Kadener S., Kristensen L.S., Legnini I., Morlando M., et al. Best practice standards for circular RNA research. Nat. Methods. 2022;19:1208–1220. doi: 10.1038/s41592-022-01487-2.
- 19. van Dijk E.L., Naquin D., Gorrichon K., Jaszczyszyn Y., Ouazahrou R., Thermes C., Hernandez C. Genomics in the long-read sequencing era. Trends Genet. 2023;39:649–671. doi: 10.1016/j.tig.2023.04.006.
- 20. Eid J., Fehr A., Gray J., Luong K., Lyle J., Otto G., Peluso P., Rank D., Baybayan P., Bettman B., et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138. doi: 10.1126/science.1162986.
- 21. Rhoads A., Au K.F. PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics. 2015;13:278–289. doi: 10.1016/j.gpb.2015.08.002.
- 22. Wang Y., Zhao Y., Bollas A., Wang Y., Au K.F. Nanopore sequencing technology, bioinformatics and applications. Nat. Biotechnol. 2021;39:1348–1365. doi: 10.1038/s41587-021-01108-x.
- 23. Logsdon G.A., Vollger M.R., Eichler E.E. Long-read human genome sequencing and its applications. Nat. Rev. Genet. 2020;21:597–614. doi: 10.1038/s41576-020-0236-x.
- 24. Byrne A., Cole C., Volden R., Vollmers C. Realizing the potential of full-length transcriptome sequencing. Phil. Trans. R. Soc. B. 2019;374. doi: 10.1098/rstb.2019.0097.
- 25. Scarano C., Veneruso I., De Simone R.R., Di Bonito G., Secondino A., D’Argenio V. The Third-Generation Sequencing Challenge: Novel Insights for the Omic Sciences. Biomolecules. 2024;14:568. doi: 10.3390/biom14050568.
- 26. PacBio HiFi Sequencing: HiFi reads for highly accurate long-read sequencing. 2024. https://www.pacb.com/technology/hifi-sequencing/
- 27. ONT Discover the benefits of nanopore technology. 2024. https://nanoporetech.com/platform
- 28. Weirather J.L., de Cesare M., Wang Y., Piazza P., Sebastiano V., Wang X.-J., Buck D., Au K.F. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis. F1000Res. 2017;6:100. doi: 10.12688/f1000research.10571.2.
- 29. Rang F.J., Kloosterman W.P., de Ridder J. From squiggle to basepair: computational approaches for improving nanopore sequencing read accuracy. Genome Biol. 2018;19:90. doi: 10.1186/s13059-018-1462-9.
- 30. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–3100. doi: 10.1093/bioinformatics/bty191.
- 31. Sharon D., Tilgner H., Grubert F., Snyder M. A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol. 2013;31:1009–1014. doi: 10.1038/nbt.2705.
- 32. Volden R., Palmer T., Byrne A., Cole C., Schmitz R.J., Green R.E., Vollmers C. Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA. Proc. Natl. Acad. Sci. USA. 2018;115:9726–9731. doi: 10.1073/pnas.1806447115.
- 33. Au K.F., Sebastiano V., Afshar P.T., Durruthy J.D., Lee L., Williams B.A., van Bakel H., Schadt E.E., Reijo-Pera R.A., Underwood J.G., Wong W.H. Characterization of the human ESC transcriptome by hybrid sequencing. Proc. Natl. Acad. Sci. USA. 2013;110:E4821–E4830. doi: 10.1073/pnas.1320101110.
- 34. Tang A.D., Soulette C.M., van Baren M.J., Hart K., Hrabeta-Robinson E., Wu C.J., Brooks A.N. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. Nat. Commun. 2020;11:1438. doi: 10.1038/s41467-020-15171-6.
- 35. Kovaka S., Zimin A.V., Pertea G.M., Razaghi R., Salzberg S.L., Pertea M. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:278. doi: 10.1186/s13059-019-1910-1.
- 36. Tian L., Jabbari J.S., Thijssen R., Gouil Q., Amarasinghe S.L., Voogd O., Kariyawasam H., Du M.R.M., Schuster J., Wang C., et al. Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing. Genome Biol. 2021;22:310. doi: 10.1186/s13059-021-02525-6.
- 37. Gao Y., Wang F., Wang R., Kutschera E., Xu Y., Xie S., Wang Y., Kadash-Edmondson K.E., Lin L., Xing Y. ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data. Sci. Adv. 2023;9. doi: 10.1126/sciadv.abq5072.
- 38. Prjibelski A.D., Mikheenko A., Joglekar A., Smetanin A., Jarroux J., Lapidus A.L., Tilgner H.U. Accurate isoform discovery with IsoQuant using long reads. Nat. Biotechnol. 2023;41:915–918. doi: 10.1038/s41587-022-01565-y.
- 39. Chen Y., Sim A., Wan Y.K., Yeo K., Lee J.J.X., Ling M.H., Love M.I., Göke J. Context-aware transcript quantification from long-read RNA-seq data with Bambu. Nat. Methods. 2023;20:1187–1195. doi: 10.1038/s41592-023-01908-w.
- 40. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 41. Robinson M.D., McCarthy D.J., Smyth G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 42. Nowicka M., Robinson M.D. DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics. F1000Res. 2016;5:1356. doi: 10.12688/f1000research.8900.2.
- 43. Pardo-Palacios F.J., Wang D., Reese F., Diekhans M., Carbonell-Sala S., Williams B., Loveland J.E., De María M., Adams M.S., Balderrama-Gutierrez G., et al. Systematic assessment of long-read RNA-seq methods for transcript identification and quantification. Nat. Methods. 2024;21:1349–1363. doi: 10.1038/s41592-024-02298-3.
- 44. Su Y., Yu Z., Jin S., Ai Z., Yuan R., Chen X., Xue Z., Guo Y., Chen D., Liang H., et al. Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data. Nat. Commun. 2024;15:3972. doi: 10.1038/s41467-024-48117-3.
- 45. Wells J.N., Feschotte C. A Field Guide to Eukaryotic Transposable Elements. Annu. Rev. Genet. 2020;54:539–561. doi: 10.1146/annurev-genet-040620-022145.
- 46. Liang Y., Qu X., Shah N.M., Wang T. Towards targeting transposable elements for cancer therapy. Nat. Rev. Cancer. 2024;24:123–140. doi: 10.1038/s41568-023-00653-8.
- 47. Shahid S., Slotkin R.K. The current revolution in transposable element biology enabled by long reads. Curr. Opin. Plant Biol. 2020;54:49–56. doi: 10.1016/j.pbi.2019.12.012.
- 48. Berrens R.V., Yang A., Laumer C.E., Lun A.T.L., Bieberich F., Law C.-T., Lan G., Imaz M., Bowness J.S., Brockdorff N., et al. Locus-specific expression of transposable elements in single cells with CELLO-seq. Nat. Biotechnol. 2022;40:546–554. doi: 10.1038/s41587-021-01093-1.
- 49. Lee S., Barbour J.A., Tam Y.M., Yang H., Huang Y., Wong J.W.H. Integrating long-read RNA sequencing improves locus-specific quantification of transposable element expression. Preprint at bioRxiv. 2024. doi: 10.1101/2023.03.21.533716.
- 50. Weirather J.L., Afshar P.T., Clark T.A., Tseng E., Powers L.S., Underwood J.G., Zabner J., Korlach J., Wong W.H., Au K.F. Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing. Nucleic Acids Res. 2015;43. doi: 10.1093/nar/gkv562.
- 51. Liu Q., Hu Y., Stucky A., Fang L., Zhong J.F., Wang K. LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing. BMC Genomics. 2020;21:793. doi: 10.1186/s12864-020-07207-4.
- 52. Miller A.R., Wijeratne S., McGrath S.D., Schieffer K.M., Miller K.E., Lee K., Mathew M., LaHaye S., Fitch J.R., Kelly B.J., et al. Pacific Biosciences Fusion and Long Isoform Pipeline for Cancer Transcriptome-Based Resolution of Isoform Complexity. J. Mol. Diagn. 2022;24:1292–1306. doi: 10.1016/j.jmoldx.2022.09.003.
- 53. Davidson N.M., Chen Y., Sadras T., Ryland G.L., Blombery P., Ekert P.G., Göke J., Oshlack A. JAFFAL: detecting fusion genes with long-read transcriptome sequencing. Genome Biol. 2022;23:10. doi: 10.1186/s13059-021-02588-5.
- 54. Karaoglanoglu F., Chauve C., Hach F. Genion, an accurate tool to detect gene fusion from long transcriptomics reads. BMC Genomics. 2022;23:129. doi: 10.1186/s12864-022-08339-5.
- 55. Chen Y., Wang Y., Chen W., Tan Z., Song Y., Chen H., Chong Z., Chong Z. Gene Fusion Detection and Characterization in Long-Read Cancer Transcriptome Sequencing Data with FusionSeeker. Cancer Res. 2023;83:28–33. doi: 10.1158/0008-5472.CAN-22-1628.
- 56. Qin Q., Popic V., Yu H., White E., Khorgade A., Shin A., Wienand K., Dondi A., Beerenwinkel N., Vazquez F., et al. CTAT-LR-fusion: accurate fusion transcript identification from long and short read isoform sequencing at bulk or single cell resolution. Preprint at bioRxiv. 2024. doi: 10.1101/2024.02.24.581862.
- 57. Xin R., Gao Y., Gao Y., Wang R., Kadash-Edmondson K.E., Liu B., Wang Y., Lin L., Xing Y. isoCirc catalogs full-length circular RNA isoforms in human transcriptomes. Nat. Commun. 2021;12:266. doi: 10.1038/s41467-020-20459-8.
- 58. Zhang J., Hou L., Zuo Z., Ji P., Zhang X., Xue Y., Zhao F. Comprehensive profiling of circular RNAs with nanopore sequencing and CIRI-long. Nat. Biotechnol. 2021;39:836–845. doi: 10.1038/s41587-021-00842-6.
- 59. Rahimi K., Venø M.T., Dupont D.M., Kjems J. Nanopore sequencing of brain-derived full-length circRNAs reveals circRNA-specific exon usage, intron retention and microexons. Nat. Commun. 2021;12:4825. doi: 10.1038/s41467-021-24975-z.
- 60. Liu Z., Tao C., Li S., Du M., Bai Y., Hu X., Li Y., Chen J., Yang E. circFL-seq reveals full-length circular RNAs with rolling circular reverse transcription and nanopore sequencing. Elife. 2021;10. doi: 10.7554/eLife.69457.
- 61. Zheng Y., Ji P., Chen S., Hou L., Zhao F. Reconstruction of full-length circular RNAs enables isoform-level quantification. Genome Med. 2019;11:2. doi: 10.1186/s13073-019-0614-1.
- 62. Chen L.-L., Bindereif A., Bozzoni I., Chang H.Y., Matera A.G., Gorospe M., Hansen T.B., Kjems J., Ma X.-K., Pek J.W., et al. A guide to naming eukaryotic circular RNAs. Nat. Cell Biol. 2023;25:1–5. doi: 10.1038/s41556-022-01066-9.
- 63. Demirdjian L., Xu Y., Bahrami-Samani E., Pan Y., Stein S., Xie Z., Park E., Wu Y.N., Xing Y. Detecting Allele-Specific Alternative Splicing from Population-Scale RNA-Seq Data. Am. J. Hum. Genet. 2020;107:461–472. doi: 10.1016/j.ajhg.2020.07.005.
- 64. Poplin R., Chang P.-C., Alexander D., Schwartz S., Colthurst T., Ku A., Newburger D., Dijamco J., Nguyen N., Afshar P.T., et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 2018;36:983–987. doi: 10.1038/nbt.4235.
- 65. Edge P., Bansal V. Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing. Nat. Commun. 2019;10:4660. doi: 10.1038/s41467-019-12493-y.
- 66. Feng Z., Clemente J.C., Wong B., Schadt E.E. Detecting and phasing minor single-nucleotide variants from long-read sequencing data. Nat. Commun. 2021;12:3032. doi: 10.1038/s41467-021-23289-4.
- 67. Ahsan M.U., Liu Q., Fang L., Wang K. NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks. Genome Biol. 2021;22:261. doi: 10.1186/s13059-021-02472-2.
- 68. Shafin K., Pesout T., Chang P.-C., Nattestad M., Kolesnikov A., Goel S., Baid G., Kolmogorov M., Eizenga J.M., Miga K.H., et al. Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads. Nat. Methods. 2021;18:1322–1332. doi: 10.1038/s41592-021-01299-w.
- 69. Zheng Z., Li S., Su J., Leung A.W.-S., Lam T.-W., Luo R. Symphonizing pileup and full-alignment for deep learning-based long-read variant calling. Nat. Comput. Sci. 2022;2:797–803. doi: 10.1038/s43588-022-00387-x.
- 70. de Souza V.B.C., Jordan B.T., Tseng E., Nelson E.A., Hirschi K.K., Sheynkman G., Robinson M.D. Transformation of alignment files improves performance of variant callers for long-read RNA sequencing data. Genome Biol. 2023;24:91. doi: 10.1186/s13059-023-02923-y.
- 71. Huang N. 2024. huangnengCSU/longcallR.
- 72. HKU-BAL/Clair3-RNA. 2024. HKUCS Bioinformatics Algorithms and Core Technology Research Laboratory.
- 73. Martin M., Ebert P., Marschall T. Read-Based Phasing and Analysis of Phased Variants with WhatsHap. Methods Mol. Biol. 2023;2590:127–138. doi: 10.1007/978-1-0716-2819-5_8.
- 74. Edge P., Bafna V., Bansal V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. 2017;27:801–812. doi: 10.1101/gr.213462.116.
- 75. Lin J.-H., Chen L.-C., Yu S.-C., Huang Y.-T. LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants. Bioinformatics. 2022;38:1816–1822. doi: 10.1093/bioinformatics/btac058.
- 76. Glinos D.A., Garborcauskas G., Hoffman P., Ehsan N., Jiang L., Gokden A., Dai X., Aguet F., Brown K.L., Garimella K., et al. Transcriptome variation in human tissues revealed by long-read sequencing. Nature. 2022;608:353–359. doi: 10.1038/s41586-022-05035-y.
- 77. Bazak L., Haviv A., Barak M., Jacob-Hirsch J., Deng P., Zhang R., Isaacs F.J., Rechavi G., Li J.B., Eisenberg E., Levanon E.Y. A-to-I RNA editing occurs at over a hundred million genomic sites, located in a majority of human genes. Genome Res. 2014;24:365–376. doi: 10.1101/gr.164749.113.
- 78. Park E., Jiang Y., Hao L., Hui J., Xing Y. Genetic variation and microRNA targeting of A-to-I RNA editing fine tune human tissue transcriptomes. Genome Biol. 2021;22:77. doi: 10.1186/s13059-021-02287-1.
- 79. Liu Z., Quinones-Valdez G., Fu T., Huang E., Choudhury M., Reese F., Mortazavi A., Xiao X. L-GIREMI uncovers RNA editing sites in long-read RNA-seq. Genome Biol. 2023;24:171. doi: 10.1186/s13059-023-03012-w.
- 80. Nguyen T.A., Heng J.W.J., Kaewsapsak P., Kok E.P.L., Stanojević D., Liu H., Cardilla A., Praditya A., Yi Z., Lin M., et al. Direct identification of A-to-I editing sites with nanopore native RNA sequencing. Nat. Methods. 2022;19:833–844. doi: 10.1038/s41592-022-01513-3.
- 81. Chen L., Ou L., Jing X., Kong Y., Xie B., Zhang N., Shi H., Qin H., Li X., Hao P. DeepEdit: single-molecule detection and phasing of A-to-I RNA editing events using nanopore direct RNA sequencing. Genome Biol. 2023;24:75. doi: 10.1186/s13059-023-02921-0.
- 82. Begik O., Lucas M.C., Pryszcz L.P., Ramirez J.M., Medina R., Milenkovic I., Cruciani S., Liu H., Vieira H.G.S., Sas-Chen A., et al. Quantitative profiling of pseudouridylation dynamics in native RNAs with nanopore sequencing. Nat. Biotechnol. 2021;39:1278–1291. doi: 10.1038/s41587-021-00915-6.
- 83. Zhong Z.-D., Xie Y.-Y., Chen H.-X., Lan Y.-L., Liu X.-H., Ji J.-Y., Wu F., Jin L., Chen J., Mak D.W., et al. Systematic comparison of tools used for m6A mapping from nanopore direct RNA sequencing. Nat. Commun. 2023;14:1906. doi: 10.1038/s41467-023-37596-5.
- 84. Hendra C., Pratanwanich P.N., Wan Y.K., Goh W.S.S., Thiery A., Göke J. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. Nat. Methods. 2022;19:1590–1598. doi: 10.1038/s41592-022-01666-1.
- 85. Pratanwanich P.N., Yao F., Chen Y., Koh C.W.Q., Wan Y.K., Hendra C., Poon P., Goh Y.T., Yap P.M.L., Chooi J.Y., et al. Identification of differential RNA modifications from nanopore direct RNA sequencing with xPore. Nat. Biotechnol. 2021;39:1394–1402. doi: 10.1038/s41587-021-00949-w.
- 86. Leger A., Amaral P.P., Pandolfini L., Capitanchik C., Capraro F., Miano V., Migliori V., Toolan-Kerr P., Sideri T., Enright A.J., et al. RNA modifications detection by comparative Nanopore direct RNA sequencing. Nat. Commun. 2021;12:7198. doi: 10.1038/s41467-021-27393-3.
- 87. nanoporetech/remora. 2024. Oxford Nanopore Technologies.
- 88. Lucas M.C., Pryszcz L.P., Medina R., Milenkovic I., Camacho N., Marchand V., Motorin Y., Ribas de Pouplana L., Novoa E.M. Quantitative analysis of tRNA abundance and modifications by nanopore RNA sequencing. Nat. Biotechnol. 2024;42:72–86. doi: 10.1038/s41587-023-01743-6.
- 89. Acera Mateos P., J Sethi A., Ravindran A., Srivastava A., Woodward K., Mahmud S., Kanchi M., Guarnacci M., Xu J., W S Yuen Z., et al. Prediction of m6A and m5C at single-molecule resolution reveals a transcriptome-wide co-occurrence of RNA modifications. Nat. Commun. 2024;15:3899. doi: 10.1038/s41467-024-47953-7.
- 90. Wu Y., Shao W., Yan M., Wang Y., Xu P., Huang G., Li X., Gregory B.D., Yang J., Wang H., Yu X. Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing. Nat. Commun. 2024;15:4049. doi: 10.1038/s41467-024-48437-4.
- 91. Zhang J., Fei Y., Sun L., Zhang Q.C. Advances and opportunities in RNA structure experimental determination and computational modeling. Nat. Methods. 2022;19:1193–1207. doi: 10.1038/s41592-022-01623-y.
- 92. Wan Y., Kertesz M., Spitale R.C., Segal E., Chang H.Y. Understanding the transcriptome through RNA structure. Nat. Rev. Genet. 2011;12:641–655. doi: 10.1038/nrg3049.
- 93. Strobel E.J., Yu A.M., Lucks J.B. High-throughput determination of RNA structures. Nat. Rev. Genet. 2018;19:615–634. doi: 10.1038/s41576-018-0034-x.
- 94. Aw J.G.A., Lim S.W., Wang J.X., Lambert F.R.P., Tan W.T., Shen Y., Zhang Y., Kaewsapsak P., Li C., Ng S.B., et al. Determination of isoform-specific RNA structure with nanopore long reads. Nat. Biotechnol. 2021;39:336–346. doi: 10.1038/s41587-020-0712-z.
- 95. Bizuayehu T.T., Labun K., Jakubec M., Jefimov K., Niazi A.M., Valen E. Long-read single-molecule RNA structure sequencing using nanopore. Nucleic Acids Res. 2022;50. doi: 10.1093/nar/gkac775.
- 96. Bohn P., Gribling-Burrer A.-S., Ambi U.B., Smyth R.P. Nano-DMS-MaP allows isoform-specific RNA structure determination. Nat. Methods. 2023;20:849–859. doi: 10.1038/s41592-023-01862-7.
- 97. Pan Y., Kadash-Edmondson K.E., Wang R., Phillips J., Liu S., Ribas A., Aplenc R., Witte O.N., Xing Y. RNA Dysregulation: An Expanding Source of Cancer Immunotherapy Targets. Trends Pharmacol. Sci. 2021;42:268–282. doi: 10.1016/j.tips.2021.01.006.
- 98. Jing Y., Zhang Y., Zhu H., Zhang K., Cai M.-C., Ma P., Shen P., Zhang Z., Shao M., Wang J., et al. Hybrid sequencing-based personal full-length transcriptomic analysis implicates proteostatic stress in metastatic ovarian cancer. Oncogene. 2019;38:3047–3060. doi: 10.1038/s41388-018-0644-y.
- 99. Chen H., Gao F., He M., Ding X.F., Wong A.M., Sze S.C., Yu A.C., Sun T., Chan A.W.H., Wang X., Wong N. Long-Read RNA Sequencing Identifies Alternative Splice Variants in Hepatocellular Carcinoma and Tumor-Specific Isoforms. Hepatology. 2019;70:1011–1025. doi: 10.1002/hep.30500.
- 100. Namba S., Ueno T., Kojima S., Kobayashi K., Kawase K., Tanaka Y., Inoue S., Kishigami F., Kawashima S., Maeda N., et al. Transcript-targeted analysis reveals isoform alterations and double-hop fusions in breast cancer. Commun. Biol. 2021;4:1320. doi: 10.1038/s42003-021-02833-4.
- 101.Huang K.K., Huang J., Wu J.K.L., Lee M., Tay S.T., Kumar V., Ramnarayanan K., Padmanabhan N., Xu C., Tan A.L.K., et al. Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer. Genome Biol. 2021;22:44. doi: 10.1186/s13059-021-02261-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Oka M., Xu L., Suzuki T., Yoshikawa T., Sakamoto H., Uemura H., Yoshizawa A.C., Suzuki Y., Nakatsura T., Ishihama Y., et al. Aberrant splicing isoforms detected by full-length transcriptome sequencing as transcripts of potential neoantigens in non-small cell lung cancer. Genome Biol. 2021;22:9. doi: 10.1186/s13059-020-02240-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Veiga D.F.T., Nesta A., Zhao Y., Mays A.D., Huynh R., Rossi R., Wu T.-C., Palucka K., Anczukow O., Beck C.R., Banchereau J. A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer. Sci. Adv. 2022;8 doi: 10.1126/sciadv.abg6711. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Kiyose H., Nakagawa H., Ono A., Aikata H., Ueno M., Hayami S., Yamaue H., Chayama K., Shimada M., Wong J.H., Fujimoto A. Comprehensive analysis of full-length transcripts reveals novel splicing abnormalities and oncogenic transcripts in liver cancer. PLoS Genet. 2022;18 doi: 10.1371/journal.pgen.1010342.
- 105.Wang F., Xu Y., Wang R., Zhang B., Smith N., Notaro A., Gaerlan S., Kutschera E., Kadash-Edmondson K.E., Xing Y., Lin L. TEQUILA-seq: a versatile and low-cost method for targeted long-read RNA sequencing. Nat. Commun. 2023;14:4760. doi: 10.1038/s41467-023-40083-6.
- 106.Sun Q., Han Y., He J., Wang J., Ma X., Ning Q., Zhao Q., Jin Q., Yang L., Li S., et al. Long-read sequencing reveals the landscape of aberrant alternative splicing and novel therapeutic target in colorectal cancer. Genome Med. 2023;15:76. doi: 10.1186/s13073-023-01226-y.
- 107.Cavelier L., Ameur A., Häggqvist S., Höijer I., Cahill N., Olsson-Strömberg U., Hermanson M. Clonal distribution of BCR-ABL1 mutations and splice isoforms by single-molecule long-read RNA sequencing. BMC Cancer. 2015;15:45. doi: 10.1186/s12885-015-1046-y.
- 108.Vasan N., Razavi P., Johnson J.L., Shao H., Shah H., Antoine A., Ladewig E., Gorelick A., Lin T.-Y., Toska E., et al. Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kα inhibitors. Science. 2019;366:714–723. doi: 10.1126/science.aaw9032.
- 109.Reggiardo R.E., Maroli S.V., Peddu V., Davidson A.E., Hill A., LaMontagne E., Aaraj Y.A., Jain M., Chan S.Y., Kim D.H. Profiling of repetitive RNA sequences in the blood plasma of patients with cancer. Nat. Biomed. Eng. 2023;7:1627–1635. doi: 10.1038/s41551-023-01081-7.
- 110.Manolio T.A., Collins F.S., Cox N.J., Goldstein D.B., Hindorff L.A., Hunter D.J., McCarthy M.I., Ramos E.M., Cardon L.R., Chakravarti A., et al. Finding the missing heritability of complex diseases. Nature. 2009;461:747–753. doi: 10.1038/nature08494.
- 111.Boyle E.A., Li Y.I., Pritchard J.K. An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell. 2017;169:1177–1186. doi: 10.1016/j.cell.2017.05.038.
- 112.De Roeck A., Van den Bossche T., van der Zee J., Verheijen J., De Coster W., Van Dongen J., Dillen L., Baradaran-Heravi Y., Heeman B., Sanchez-Valle R., et al. Deleterious ABCA7 mutations and transcript rescue mechanisms in early onset Alzheimer's disease. Acta Neuropathol. 2017;134:475–487. doi: 10.1007/s00401-017-1714-x.
- 113.Hardwick S.A., Bassett S.D., Kaczorowski D., Blackburn J., Barton K., Bartonicek N., Carswell S.L., Tilgner H.U., Loy C., Halliday G., et al. Targeted, High-Resolution RNA Sequencing of Non-coding Genomic Regions Associated With Neuropsychiatric Functions. Front. Genet. 2019;10:309. doi: 10.3389/fgene.2019.00309.
- 114.Clark M.B., Wrzesinski T., Garcia A.B., Hall N.A.L., Kleinman J.E., Hyde T., Weinberger D.R., Harrison P.J., Haerty W., Tunbridge E.M. Long-read sequencing reveals the complex splicing profile of the psychiatric risk gene CACNA1C in human brain. Mol. Psychiatry. 2020;25:37–47. doi: 10.1038/s41380-019-0583-1.
- 115.Ma L., Semick S.A., Chen Q., Li C., Tao R., Price A.J., Shin J.H., Jia Y., Brandon N.J., Cross A.J., et al. Schizophrenia risk variants influence multiple classes of transcripts of sorting nexin 19 (SNX19). Mol. Psychiatry. 2020;25:831–843. doi: 10.1038/s41380-018-0293-0.
- 116.Patowary A., Zhang P., Jops C., Vuong C.K., Ge X., Hou K., Kim M., Gong N., Margolis M., Vo D., et al. Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. Science. 2024;384 doi: 10.1126/science.adh7688.
- 117.Aguzzoli Heberle B., Brandon J.A., Page M.L., Nations K.A., Dikobe K.I., White B.J., Gordon L.A., Fox G.A., Wadsworth M.E., Doyle P.H., et al. Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq. Nat. Biotechnol. 2024 doi: 10.1038/s41587-024-02245-9.
- 118.Brandt M., Lappalainen T. SnapShot: Discovering Genetic Regulatory Variants by QTL Analysis. Cell. 2017;171:980–980.e1. doi: 10.1016/j.cell.2017.10.031.
- 119.Réal A., Brown A., Yung G.P., Borel C., Lykoskoufis N., Seebach J., Dermitzakis E., Ramisch A., Viñuela A. 2024. Direct long-read RNA sequencing uncovers functional variation affecting transcript production and RNA modifications.
- 120.Haendel M., Vasilevsky N., Unni D., Bologa C., Harris N., Rehm H., Hamosh A., Baynam G., Groza T., McMurry J., et al. How many rare diseases are there? Nat. Rev. Drug Discov. 2020;19:77–78. doi: 10.1038/d41573-019-00180-y.
- 121.Nguengang Wakap S., Lambert D.M., Olry A., Rodwell C., Gueydan C., Lanneau V., Murphy D., Le Cam Y., Rath A. Estimating cumulative point prevalence of rare diseases: analysis of the Orphanet database. Eur. J. Hum. Genet. 2020;28:165–173. doi: 10.1038/s41431-019-0508-0.
- 122.Baxter S.M., Posey J.E., Lake N.J., Sobreira N., Chong J.X., Buyske S., Blue E.E., Chadwick L.H., Coban-Akdemir Z.H., Doheny K.F., et al. Centers for Mendelian Genomics: A decade of facilitating gene discovery. Genet. Med. 2022;24:784–797. doi: 10.1016/j.gim.2021.12.005.
- 123.Chung C.C., Hue S.P., Ng N.Y., Doong P.H., Chu A.T., Chung B.H., Chung B.H.Y. Meta-analysis of the diagnostic and clinical utility of exome and genome sequencing in pediatric and adult patients with rare diseases across diverse populations. Genet. Med. 2023;25 doi: 10.1016/j.gim.2023.100896.
- 124.Murdock D.R., Dai H., Burrage L.C., Rosenfeld J.A., Ketkar S., Müller M.F., Yépez V.A., Gagneur J., Liu P., Chen S., et al. Transcriptome-directed analysis for Mendelian disease diagnosis overcomes limitations of conventional genomic testing. J. Clin. Invest. 2021;131:e141500. doi: 10.1172/JCI141500.
- 125.Dainis A., Tseng E., Clark T.A., Hon T., Wheeler M., Ashley E. Targeted Long-Read RNA Sequencing Demonstrates Transcriptional Diversity Driven by Splice-Site Variation in MYBPC3. Circ. Genom. Precis. Med. 2019;12 doi: 10.1161/CIRCGEN.119.002464.
- 126.Sedaghat-Hamedani F., Rebs S., Kayvanpour E., Zhu C., Amr A., Müller M., Haas J., Wu J., Steinmetz L.M., Ehlermann P., et al. Genotype Complements the Phenotype: Identification of the Pathogenicity of an LMNA Splice Variant by Nanopore Long-Read Sequencing in a Large DCM Family. Int. J. Mol. Sci. 2022;23 doi: 10.3390/ijms232012230.
- 127.Chandrasekhar S., Lin S., Jurkute N., Oprych K., Estramiana Elorrieta L., Schiff E., Malka S., Wright G., Michaelides M., Mahroo O.A., et al. Investigating Splice Defects in USH2A Using Targeted Long-Read Sequencing. Cells. 2024;13:1261. doi: 10.3390/cells13151261.
- 128.Schwenk V., Leal Silva R.M., Scharf F., Knaust K., Wendlandt M., Häusser T., Pickl J.M.A., Steinke-Lange V., Laner A., Morak M., et al. Transcript capture and ultradeep long-read RNA sequencing (CAPLRseq) to diagnose HNPCC/Lynch syndrome. J. Med. Genet. 2023;60:747–759. doi: 10.1136/jmg-2022-108931.
- 129.Lagarde J., Uszczynska-Ratajczak B., Carbonell S., Pérez-Lluch S., Abad A., Davis C., Gingeras T.R., Frankish A., Harrow J., Guigo R., Johnson R. High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing. Nat. Genet. 2017;49:1731–1740. doi: 10.1038/ng.3988.
- 130.Sheynkman G.M., Tuttle K.S., Laval F., Tseng E., Underwood J.G., Yu L., Dong D., Smith M.L., Sebra R., Willems L., et al. ORF Capture-Seq as a versatile method for targeted identification of full-length isoforms. Nat. Commun. 2020;11:2326. doi: 10.1038/s41467-020-16174-z.
- 131.Wang J., Yang L., Cheng A., Tham C.-Y., Tan W., Darmawan J., de Sessions P.F., Wan Y. Direct RNA sequencing coupled with adaptive sampling enriches RNAs of interest in the transcriptome. Nat. Commun. 2024;15:481. doi: 10.1038/s41467-023-44656-3.
- 132.Gupta I., Collier P.G., Haase B., Mahfouz A., Joglekar A., Floyd T., Koopmans F., Barres B., Smit A.B., Sloan S.A., et al. Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells. Nat. Biotechnol. 2018;36:1197–1202. doi: 10.1038/nbt.4259.
- 133.Lebrigand K., Magnone V., Barbry P., Waldmann R. High throughput error corrected Nanopore single cell transcriptome sequencing. Nat. Commun. 2020;11:4025. doi: 10.1038/s41467-020-17800-6.
- 134.Volden R., Vollmers C. Single-cell isoform analysis in human immune cells. Genome Biol. 2022;23:47. doi: 10.1186/s13059-022-02615-z.
- 135.Shiau C.-K., Lu L., Kieser R., Fukumura K., Pan T., Lin H.-Y., Yang J., Tong E.L., Lee G., Yan Y., et al. High throughput single cell long-read sequencing analyses of same-cell genotypes and phenotypes in human tumors. Nat. Commun. 2023;14:4124. doi: 10.1038/s41467-023-39813-7.
- 136.Singh A. Enhanced protein isoform characterization. Nat. Methods. 2022;19:401. doi: 10.1038/s41592-022-01472-9.
- 137.Korchak J.A., Jeffery E.D., Bandyopadhyay S., Jordan B.T., Lehe M.D., Watts E.F., Fenix A., Wilhelm M., Sheynkman G.M. IS-PRM-Based Peptide Targeting Informed by Long-Read Sequencing for Alternative Proteome Detection. J. Am. Soc. Mass Spectrom. 2024;35:2614–2630. doi: 10.1021/jasms.4c00119.