PLoS Genet. 2015 Apr 15;11(4):e1005147. doi: 10.1371/journal.pgen.1005147

© 2015 Shalem et al

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.

Fig 3. (A) Heat map showing the mean effect of a mutation as a function of its location in the 3’ end sequence. Each row represents one sequence, and the color represents the mean expression fold change, across two replicates, between the mutated and wild type sequences. Rows are sorted by the location of the mutation with the maximal effect. (B) Heat map of predicted logistic values on a held-out test set (see main text and methods). Locations of subsequences correspond to those in Fig 3A. (C) Frequency of the AT dinucleotide, the highest weighted feature in the inferred model, in sliding windows of 20 bp. Locations of subsequences correspond to those in Fig 3A. (D) Table of the features that contribute most to the classification. Color represents the mean coefficient across the 10 cross-validation partitions. For each possible mono/di-nucleotide, three types of features were considered: ‘[0|1]’, a binary feature that is one if the specified mono/di-nucleotide occurs at least once in the sequence and zero otherwise; ‘#’, a count of the number of times the specified mono/di-nucleotide occurs in the sequence; and ‘%’, the percentage of nucleotides in the sequence that are part of an occurrence of the specified mono/di-nucleotide. (E) DNA sequence motif found to be enriched in the positive subsequence instances. (F) Distribution of distances between the location (center) of the mutation that resulted in the maximal reduction in expression and the location of the main polyadenylation site for the wild type sequence. (G) Results of YFP-specific 3’ RACE, where each lane represents four expression bins. The lowest lane displays long aberrant 3’UTRs that are not apparent in the higher expression bins.
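
The three feature types in panel (D) and the sliding-window frequency in panel (C) can be illustrated with a short Python sketch. This is not the authors' code: the function names `kmer_features` and `sliding_window_frequency` are hypothetical, and the sketch assumes that overlapping occurrences are counted separately and that the window frequency is the occurrence count divided by the number of possible start positions in the window. The caption does not specify either choice.

```python
def kmer_features(seq, kmer):
    """Return the '[0|1]', '#', and '%' features for one mono/di-nucleotide.

    Assumption (not stated in the caption): overlapping occurrences
    are each counted, e.g. 'AA' occurs twice in 'AAA'.
    """
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    # Start positions of every (possibly overlapping) occurrence.
    starts = [i for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer]
    # Positions covered by at least one occurrence, used for the '%' feature.
    covered = set()
    for s in starts:
        covered.update(range(s, s + k))
    percent = 100.0 * len(covered) / len(seq) if seq else 0.0
    return {
        "binary": int(bool(starts)),  # '[0|1]': occurs at least once
        "count": len(starts),         # '#': number of occurrences
        "percent": percent,           # '%': share of nucleotides covered
    }


def sliding_window_frequency(seq, kmer="AT", window=20):
    """Frequency of `kmer` in each window of `window` bp, stepping by 1 bp.

    Assumption: frequency = occurrences / possible start positions.
    """
    positions = window - len(kmer) + 1
    return [
        kmer_features(seq[i:i + window], kmer)["count"] / positions
        for i in range(len(seq) - window + 1)
    ]
```

For example, `kmer_features("ATATGC", "AT")` gives a binary value of 1, a count of 2, and a percentage of about 66.7, because four of the six nucleotides lie inside an AT occurrence.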