PLoS One. 2009 Aug 11;4(8):e6478. doi: 10.1371/journal.pone.0006478

Figure 6. Association of the top-scoring 5% of predictions with protein-coding genes.

Left: distance from intergenic predictions to the nearest protein-coding gene. “Upstream” means the prediction is upstream of the gene. The red bars show the empirically observed distribution; the grey bars show the distribution that would be expected if hits were uniformly distributed across the genome. We observe a clear depletion of intergenic predictions near the 3′ end of protein-coding genes. Right: frequencies of predictions enclosed completely by introns, separated by intron number (i.e. the position of the intron in the ordered list of introns associated with the parent gene). The blue bars show the empirically observed counts; the error bars show 99% bounds for a uniform random distribution of bases across all chromosomes (excluding bases outside introns). Introns shared by multiple transcripts were counted multiple times. There is a depletion of predictions in the first intron.
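The uniform null model behind the right-hand panel can be illustrated with a small sketch. The caption does not say how the 99% bounds were computed, so the following is only one plausible reading: each prediction is treated as a single point that falls on a uniformly chosen intronic base, which makes the count per intron number binomial. The function names and the toy base counts are hypothetical, and the per-intron-number base totals are assumed to follow the caption's convention of counting introns shared by multiple transcripts multiple times.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def binom_quantile(q, n, p):
    """Smallest k such that P(X <= k) >= q."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if cumulative >= q:
            return k
    return n


def uniform_null_bounds(intronic_bases, n_predictions, level=0.99):
    """Expected count and central `level` interval per intron number.

    intronic_bases: dict mapping intron number -> total bases in introns
                    with that number (hypothetical input format).
    Assumption: predictions are points placed uniformly over intronic bases.
    """
    total = sum(intronic_bases.values())
    tail = (1.0 - level) / 2.0
    bounds = {}
    for number, bases in sorted(intronic_bases.items()):
        p = bases / total  # chance that a random intronic base has this intron number
        bounds[number] = (
            n_predictions * p,                             # expected count
            binom_quantile(tail, n_predictions, p),        # lower bound
            binom_quantile(1.0 - tail, n_predictions, p),  # upper bound
        )
    return bounds


# Toy numbers for illustration only; not data from the paper.
toy_bases = {1: 500, 2: 300, 3: 200}
for number, (expected, low, high) in uniform_null_bounds(toy_bases, 100).items():
    print(f"intron {number}: expected {expected:.1f}, 99% bounds [{low}, {high}]")
```

Under this reading, an observed count below the lower bound for intron 1 would correspond to the depletion in the first intron that the caption reports. Treating predictions as points ignores their length, so real bounds for predictions that must fit entirely inside an intron could differ.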