Figure - PMC

2020 Feb 10;18:63. doi: 10.1186/s12967-020-02247-6

© The Author(s) 2020

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Fig. 2. CLEAR workflow: bin-based coverage analysis by transcript expression.

a Data analysis workflow using CLEAR to preprocess lcRNA-seq data. Step 1: Trimmed lcRNA-seq reads are aligned to the reference genome. Step 2: μ_i, the mean of the positional distribution of aligned reads along each individual transcript, is determined. Step 3: Transcript positional means, μ_i (y-axis), are ranked and then binned by transcript read coverage (x-axis). When μ_i within a bin is ≈ 0, the read distribution is symmetrical along the length of the transcript. When μ_i within a bin develops a bimodal distribution with modes toward −1 (TSS) and +1 (TTS), its values deviate from 0. Step 4: All available transcripts, binned into groups of 250, are fitted to a bimodal distribution model. The emergence of a bimodal distribution identifies where the aggregate μ_i start to deviate from a unimodal distribution around the center of the transcripts, indicated by a change in the fitting parameters a and b. Step 5: When either model parameter exceeds a value of 2 (indicated by a gray line), transcripts beyond that point are excluded by CLEAR from differential gene expression and other downstream analysis. Step 6: CLEAR transcripts are used in downstream between-group analyses such as hierarchical clustering.

b Example lcRNA-seq read coverage plots. The read coverage plot for GAPDH depicts a transcript with μ_i ~ 0, RPS7 depicts a transcript close to the CLEAR cutoff, and DDAH2 depicts a transcript deemed too noisy by CLEAR.

c CLEAR profiles for 10-, 100- and 1000-pg input mass lcRNA-seq data. The value of μ_i is plotted for the 7000 highest expressed primary transcripts for three representative samples. The red line depicts the CLEAR filtering threshold.

d Violin plots of the same data as shown in c. The end marks indicate the window extrema and the middle bar indicates the mean.
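
Steps 2 and 3 of panel a can be illustrated with a minimal Python sketch. It is not the CLEAR implementation. The function names, the input layout (a dictionary mapping a transcript name to its read offsets and transcript length), and the rescaling of positions to the interval from −1 (TSS) to +1 (TTS) are assumptions made for illustration; only the bin size of 250 comes from the caption. The bimodal fit of steps 4 and 5 is deliberately left out, because the caption does not specify the model beyond its parameters a and b.

```python
from statistics import mean


def positional_mean(read_positions, transcript_length):
    """Return mu_i: the mean read position rescaled to [-1, 1].

    Assumption: positions are 0-based offsets from the TSS, so an offset of 0
    maps to -1 (TSS) and an offset of transcript_length maps to +1 (TTS).
    """
    if not read_positions:
        raise ValueError("transcript has no aligned reads")
    scaled = [2.0 * p / transcript_length - 1.0 for p in read_positions]
    return mean(scaled)


def bin_by_coverage(transcripts, bin_size=250):
    """Rank transcripts by read count (highest first) and group mu_i into bins.

    transcripts: hypothetical layout, name -> (read_positions, transcript_length).
    Returns a list of bins, each a list of mu_i values in rank order.
    """
    ranked = sorted(transcripts.items(),
                    key=lambda item: len(item[1][0]),
                    reverse=True)
    mus = [positional_mean(pos, length) for _, (pos, length) in ranked]
    return [mus[i:i + bin_size] for i in range(0, len(mus), bin_size)]


# Toy input: two made-up transcripts, each 100 nt long.
toy = {
    "even": ([0, 50, 100], 100),       # reads spread symmetrically
    "three_prime": ([90, 100], 100),   # reads piled up near the TTS
}
bins = bin_by_coverage(toy, bin_size=1)  # bin_size=1 only so the toy data yields two bins
```

In the toy input, the symmetric transcript has the higher read count, so it lands in the first bin with a μ_i of 0, while the transcript with reads near the TTS lands in the second bin with a μ_i close to +1. A real analysis would use the bin size of 250 stated in the caption and then examine how the distribution of μ_i changes from bin to bin.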