2023 Jan 20;9(3):eabq5072. doi: 10.1126/sciadv.abq5072

Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

Fig. 2. — (A) Precision-recall curves for de novo SJ identification from raw long-read-to-genome alignments, combined over n = 3 direct RNA-seq replicates, using read count thresholds based on total aligned reads or perfectly aligned reads supporting a given SJ. (B) Distribution of transcript isoform categories among aligned reads, combined over n = 3 direct RNA-seq replicates, before and after de novo SJ correction. FSM or ISM indicates that all SJs in a read are consistent with those in an annotated SIRV transcript, with FSM and ISM reads representing full-length and fragmented reads, respectively. FSM reads are further partitioned into two subcategories (canonical and noncanonical) based on whether they contain SJs without the canonical splice site dinucleotide motif. NIC or NNC indicates a novel combination of annotated or novel splice sites, respectively, and reads classified as NIC or NNC have incorrect transcript structures with respect to SIRVs. NCD reads contain at least one putative SJ in the raw alignment that was evaluated as low-confidence but could not be corrected by ESPRESSO. (C) Sensitivity, precision, and F1 score of ESPRESSO and two other tools (StringTie2 and FLAMES) in discovering SIRV transcripts from direct RNA-seq data (n = 3), using random downsamples of different proportions of SIRV annotations as a guide. Each point represents the mean of three random samplings per downsampling level. (D) Box-and-whisker plots (median and interquartile range) and correlation (Pearson’s and Spearman’s) between known concentrations of 68 SIRV transcripts and their estimated abundances from ESPRESSO and five other tools (LIQA, NanoCount, FLAIR, StringTie2, and FLAMES). For each tool, transcript abundance is reported as the sum of assigned read counts over n = 3 direct RNA-seq replicates. Diameters of points in the box-and-whisker plots are scaled according to transcript length.
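
The sensitivity, precision, and F1 score reported in panel (C) can be illustrated with a minimal sketch. It assumes that a discovered transcript counts as correct only if it matches an annotated transcript exactly; the caption does not state the matching criterion used in the study. The function name `discovery_metrics` and the transcript identifiers are hypothetical and are not taken from ESPRESSO or the SIRV annotation.

```python
def discovery_metrics(predicted, annotated):
    """Return (sensitivity, precision, F1) for a set of discovered transcripts.

    Assumption: a prediction is a true positive only if its identifier
    (standing in for its full splice-junction chain) appears in the annotation.
    """
    predicted, annotated = set(predicted), set(annotated)
    true_positives = len(predicted & annotated)

    # Sensitivity (recall): fraction of annotated transcripts that were recovered.
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    # Precision: fraction of discovered transcripts that are annotated.
    precision = true_positives / len(predicted) if predicted else 0.0

    # F1: harmonic mean of precision and sensitivity.
    total = precision + sensitivity
    f1 = 2 * precision * sensitivity / total if total else 0.0
    return sensitivity, precision, f1


# Hypothetical identifiers, for illustration only.
annotated = {"T1", "T2", "T3", "T4"}
predicted = {"T1", "T2", "T5"}

sens, prec, f1 = discovery_metrics(predicted, annotated)
print(f"sensitivity={sens:.3f} precision={prec:.3f} F1={f1:.3f}")
```

Here two of the four annotated transcripts are recovered and two of the three predictions are correct, so sensitivity is 1/2, precision is 2/3, and F1 is 4/7.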