Table 3.
Distributions of all 25-nt segments of the MAC genome beginning with 5′-TNG, and of 5′-UNG, 25-nt MAC scnRNAs, over annotated features of the MAC genome
Features | 5′-TNG in MAC genome: numbers | 5′-TNG in MAC genome: frequencies | 5′-UNG MAC scnRNAs: observed | 5′-UNG MAC scnRNAs: expected (e) | Statistical test: χ² | Statistical test: P-value |
---|---|---|---|---|---|---|
Intergenic regions (a) | 1265012 | 0.169 | 31 | 29.3 | 0.06 | 0.80 |
Coding exons (b), sense strand | 2922337 | 0.391 | 71 | 67.6 | 0.21 | 0.65 |
Introns (c), sense strand | 246029 | 0.033 | 5 | 5.7 | 0.01 | 0.94 |
Coding exons (b), antisense strand | 2859122 | 0.382 | 61 | 66.1 | 0.52 | 0.47 |
Introns (c), antisense strand | 181929 | 0.024 | 4 | 4.2 | 0.02 | 0.89 |
Other (d) | 6469 | 0.001 | 1 | 0.1 | 0.82 | 0.36 |
Total | 7480898 | 1.000 | 173 | 173.0 | | |
(a) Many scnRNAs mapping to intergenic regions may be part of non-coding exons, because intergenic regions are very short (352 bp on average) and most 5′ and 3′ UTRs have not been annotated.
(b) scnRNAs mapping in coding exons, including those overlapping 5′ or 3′ UTRs, but excluding those overlapping introns.
(c) scnRNAs overlapping intronic sequences. Because introns are very short (25 nt on average), very few scnRNAs are expected to be entirely within introns.
(d) All other cases. The one case observed overlaps the coding sequences of two closely spaced, convergent genes.
(e) Random expectation is based on the actual distribution of 25-nt, 5′-TNG segments of the MAC genome. The non-significant P-values in the last column (Pearson's χ² test with Yates' continuity correction) indicate that the fractions of scnRNAs mapping to the different types of annotated features are consistent with a random distribution. A χ² test comparing the overall observed and expected distributions confirms this (P = 0.35).
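
The arithmetic behind the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' script. The table does not spell out how each per-feature test was set up, so the sketch assumes a two-category comparison (scnRNAs in the feature versus all others) against the genome frequency, with the classic Yates correction (|O − E| − 0.5)², and it assumes the overall test is an uncorrected goodness-of-fit test with 5 degrees of freedom. The function names (`per_feature_tests`, `overall_test`, `chi2_sf_1df`, `chi2_sf_5df`) are hypothetical.

```python
import math

# Counts copied from Table 3:
# (feature, 25-nt 5'-TNG segments in the MAC genome, observed 5'-UNG scnRNAs)
ROWS = [
    ("Intergenic regions", 1265012, 31),
    ("Coding exons, sense strand", 2922337, 71),
    ("Introns, sense strand", 246029, 5),
    ("Coding exons, antisense strand", 2859122, 61),
    ("Introns, antisense strand", 181929, 4),
    ("Other", 6469, 1),
]


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_5df(x):
    """Upper-tail probability for 5 degrees of freedom (closed form for odd df)."""
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0)
    return chi2_sf_1df(x) + tail


def per_feature_tests(rows):
    """Return (name, frequency, expected, chi2, p) for each feature.

    Assumption: each test compares 'in feature' vs 'not in feature'
    and applies the classic Yates correction to both categories.
    """
    genome_total = sum(genome for _, genome, _ in rows)
    n_scnrna = sum(observed for _, _, observed in rows)
    results = []
    for name, genome, observed in rows:
        freq = genome / genome_total
        expected = n_scnrna * freq
        corrected_sq = (abs(observed - expected) - 0.5) ** 2
        chi2 = corrected_sq / expected + corrected_sq / (n_scnrna - expected)
        results.append((name, freq, expected, chi2, chi2_sf_1df(chi2)))
    return results


def overall_test(rows):
    """Uncorrected goodness-of-fit chi-square over all six features (assumed 5 df)."""
    genome_total = sum(genome for _, genome, _ in rows)
    n_scnrna = sum(observed for _, _, observed in rows)
    chi2 = 0.0
    for _, genome, observed in rows:
        expected = n_scnrna * genome / genome_total
        chi2 += (observed - expected) ** 2 / expected
    return chi2, chi2_sf_5df(chi2)


for name, freq, expected, chi2, p in per_feature_tests(ROWS):
    print(f"{name:32s} {freq:.3f} {expected:6.1f} {chi2:5.2f} {p:5.2f}")

chi2_all, p_all = overall_test(ROWS)
print(f"Overall: chi2 = {chi2_all:.2f}, P = {p_all:.2f}")
```

Under these assumptions the sketch reproduces the tabulated frequencies, expected counts, χ² values and P-values to the precision shown, including the overall P = 0.35. Note that some statistical packages cap the Yates correction at |O − E|, which would give slightly different values for rows where the deviation is below 0.5.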