Nucleic Acids Res. 2018 Mar 13;46(8):e45. doi: 10.1093/nar/gky053

© The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Figure 1. The SEASTAR pipeline for the computational identification and quantitative analysis of first exons using RNA-seq data alone.
(A) Alternative transcription start sites (TSSs) can appear in two forms: alternative first exons (AFEs) and alternative tandem TSSs.
(B) Reference-guided transcript assembly: the reference annotation-based transcript (RABT) assembly method is used to assemble novel transcripts from RNA-seq reads, guided by the existing transcriptome annotation.
(C) Generation of non-redundant first exons (FEs): transcripts from all samples are merged to generate a non-redundant set of FEs.
(D) Quantitation of exon and splice junction coverage: reads mapped to each FE and to its downstream splice junction are counted as the coverage for that FE.
(E) Identification of bona fide FEs: five methods are designed and compared. The logistic regression model (highlighted in bold) is selected as the method of choice in SEASTAR because of its superior performance.
(F) Detection of differential AFE usage: the percent-spliced-in (PSI) value for each AFE in each sample is calculated from the read counts and effective lengths of all AFEs within the gene. The rMATS statistical test is used to determine whether an AFE has significantly different usage between two samples or two groups of samples.
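
The PSI calculation in panel (F) can be illustrated with a minimal sketch. The caption says only that PSI is derived from the read counts and effective lengths of all AFEs within a gene; the length-normalized fraction used here is an assumption, not necessarily the exact formula implemented in SEASTAR. The function name `afe_psi`, the input layout, and the example numbers are hypothetical, and the rMATS test is not reproduced.

```python
from typing import Dict, Tuple


def afe_psi(afes: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Return a PSI value per AFE for one gene in one sample.

    `afes` maps an AFE identifier to (read_count, effective_length).
    Assumed formula (not confirmed by the caption):
        PSI_i = (count_i / length_i) / sum_j (count_j / length_j)
    """
    densities = {}
    for afe_id, (count, eff_len) in afes.items():
        if eff_len <= 0:
            raise ValueError(f"effective length must be positive for {afe_id}")
        # Length-normalized read density of this first exon.
        densities[afe_id] = count / eff_len

    total = sum(densities.values())
    if total == 0:
        # No reads on any AFE of the gene: PSI is undefined.
        return {afe_id: float("nan") for afe_id in densities}

    return {afe_id: density / total for afe_id, density in densities.items()}


# Hypothetical gene with two alternative first exons.
example_gene = {"AFE1": (120, 200.0), "AFE2": (40, 100.0)}
psi = afe_psi(example_gene)
print(psi)
```

Under this assumed definition the PSI values of all AFEs in a gene sum to 1, so a shift in usage toward one first exon shows up as a matching decrease in the others.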