2015 Dec 23;6(2):435–446. doi: 10.1534/g3.115.023119

Copyright © 2016 Cross

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Statistics of uORF length and number in the genome, and in partially or fully randomized controls. (A) Class 1, but not Class 2 or Class 3, uORFs are longer than the random expectation. Cumulative length distribution of the three classes of uORFs for the genome (green) and for randomized or mutagenized controls (three replicates of each). ‘Scramble’: each 5UT sequence was randomized (yellow); ‘Rand(dinuc)’: sequences of the same length as the real 5UT sequences were constructed with dinucleotide frequencies identical to those of the overall ‘5UT-ome’ (red); ‘Mut 0.1/0.2/0.5’: the set of 5UT sequences was ‘mutagenized’ by replacing one in 10, one in five, or one in two nucleotides in each 5UT with random selections from the overall nucleotide frequency distribution of the complete collection of 5UT sequences. (Note: the randomized distribution for all classes is essentially identical to the length distribution of the actual genomic Class 3 sequences.) The boxed region in each graph is enlarged at right to show the high reproducibility of the randomized results across the three replicates. (B) Total numbers of uORFs with and without randomization. The small red bar represents a hypothetical standard deviation based on the assumption that the numbers in each category are Poisson-distributed (the square root of the number observed). Stars represent P-values for a t-test comparing each randomization to the genome, using these standard deviations: * P < 0.05; ** P < 0.01; *** P < 0.001. Randomizing by scrambling (shown) or by dinucleotide frequencies gave very similar results.
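The controls and the count comparison described in this legend can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's code. The function names (`scramble`, `mutagenize`, `poisson_compare`) are hypothetical. The sketch assumes that ‘one in 10’ means an independent per-position replacement probability of 0.1, that drawing from the pooled sequences reproduces the overall nucleotide frequencies, and that a normal approximation stands in for the t-test, since the legend does not give the degrees of freedom.

```python
import math
import random


def scramble(seq, rng):
    """'Scramble' control: shuffle one sequence, keeping its length and composition."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def mutagenize(seq, rate, pool, rng):
    """'Mut' control: replace each position with probability `rate`.

    `pool` is the concatenation of all sequences, so a uniform draw from it
    follows the overall nucleotide frequencies (assumption). A replacement
    may by chance pick the same base that was already there.
    """
    return "".join(
        rng.choice(pool) if rng.random() < rate else base for base in seq
    )


def poisson_compare(count_a, count_b):
    """Compare two counts, taking sqrt(count) as each standard deviation.

    Returns (z, p), where p is a two-sided P-value from a normal
    approximation (assumption; the legend reports a t-test).
    """
    se = math.sqrt(count_a + count_b)  # sqrt(sd_a**2 + sd_b**2)
    z = (count_a - count_b) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Made-up sequences and counts, for illustration only.
rng = random.Random(0)
utrs = ["ATGGCATAAGC", "CCATGAAATGA"]
pool = "".join(utrs)
scrambled = [scramble(u, rng) for u in utrs]
mutated = [mutagenize(u, 0.1, pool, rng) for u in utrs]
z, p = poisson_compare(400, 500)
```

Under these assumptions, equal counts give z = 0 and P = 1, and the difference between two counts is judged against the square root of their sum.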