. 2018 May 1;19:56. doi: 10.1186/s13059-018-1432-2

© The Author(s). 2018

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Fig. 2. Performance of three alternative regression methods for inferring E–P models. a Performance of ordinary least squares (OLS), generalized linear model with negative binomial distribution (GLM.NB), and zero-inflated negative binomial (ZINB) regression using the binary test. Point (x,y) on a plot indicates that a fraction x of the models had −log₁₀[q-value] < y, computed by the Wilcoxon rank sum test. OLS yields a higher fraction of validated models at any q-value cutoff. b Same as panel a, but using the activity level validation test, with p values computed by the Spearman correlation test. Here too, OLS yields a higher fraction of validated models than the other methods. c Number of promoters whose OLS models passed (at q < 0.1) each of the tests (or none). d The distribution of the number of positive samples (samples in which the promoter is active, i.e., has RPKM ≥ 1) for promoters in each category. e Comparison between the R² values with and without cross-validation (CV). Each dot is a promoter model. Blue dots denote models with R² ≥ 0.5 and $R_{CV}^{2} \geq 0.25$. Red dots denote models with R² > 0.5 and $R_{CV}^{2} < 0.25$, corresponding to over-fitted models with low predictive power on novel samples. f A promoter whose model, as computed without CV, has a very high R² (left plot), but for which a low $R_{CV}^{2}$ is obtained when CV is applied (right plot). This example demonstrates the sensitivity of R² (and Pearson correlation) to outliers. ρ_s: Spearman correlation; Q-value: FDR-corrected p value
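The contrast drawn in panels e and f, between R² computed on the fitting samples and $R_{CV}^{2}$ computed on held-out samples, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' pipeline: the function names `ols_r2` and `ols_r2_cv`, the 5-fold split, the single synthetic predictor, and the outlier placed at (30, 30) are all hypothetical.

```python
import numpy as np


def ols_r2(X, y):
    """In-sample R^2 of an OLS fit with an intercept (no cross-validation)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def ols_r2_cv(X, y, n_folds=5, seed=0):
    """R^2 computed from out-of-fold predictions (k-fold cross-validation)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    pred = np.empty(len(y))
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)  # all samples outside the held-out fold
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(fold)), X[fold]])
        pred[fold] = A_test @ beta
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)


# Synthetic data (hypothetical): 29 samples with unrelated x and y,
# plus one extreme sample that dominates the in-sample fit.
rng = np.random.default_rng(1)
x = np.append(rng.normal(size=29), 30.0)
y = np.append(rng.normal(size=29), 30.0)

r2 = ols_r2(x, y)
r2_cv = ols_r2_cv(x, y)
print(f"R^2 without CV: {r2:.2f}, R^2 with CV: {r2_cv:.2f}")
```

In this toy setting the single extreme sample drives the in-sample fit, so R² is high. When that sample is held out, the model trained on the remaining samples predicts it poorly and the cross-validated value drops, which is the pattern the red dots in panel e flag as over-fitting.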