Table 2.
Indel density for annotation features (across all 44 ENCODE regions)
Feature | Indels (n) | Indels (bp) | Rate (number per 100 kb) | 99% CI | Rate (bp per 100 kb) | 99% CI | Feature length (kb) |
Manual | 2,186 | 6,504 | 14.6 | 11.7 to 18.2 | 43.4 | 34.4 to 54.7 | 14,998 |
Random | 2,300 | 6,506 | 15.3 | 13.6 to 17.3 | 43.4 | 37.5 to 50.2 | 15,000 |
Overall | 4,486 | 13,010 | 15.0 | 13.4 to 16.7 | 43.4 | 38.3 to 49.1 | 29,998 |
RNA transcription | |||||||
CDS | 5 | 5 | 0.7 | 0.1 to 8.6 | 0.7 | 0.1 to 8.6 | 675 |
TSS | 2 | 2 | 3.3 | | 3.3 | | 61 |
RACEfrags | 9 | 28 | 2.1 | 0.8 to 5.4 | 6.6 | 1.3 to 33.9 | 425 |
TARs/transfrags | 37 | 78 | 5.8 | 3.5 to 9.6 | 12.3 | 6.8 to 22.3 | 634 |
Pseudo-exons | 9 | 26 | 6.6 | 2.6 to 16.6 | 19.1 | 5.8 to 63.3 | 136 |
3' UTR | 48 | 103 | 11.0 | 7.2 to 16.7 | 23.6 | 13.5 to 41.3 | 436 |
5' UTR | 7 | 32 | 6.0 | 1.6 to 22.3 | 27.4 | 3.8 to 198.7 | 117 |
TUF | 53 | 160 | 12.2 | 7.8 to 19.2 | 36.9 | 20.2 to 67.6 | 433 |
Open chromatin | |||||||
FAIRE-sites | 106 | 327 | 7.7 | 5.6 to 10.6 | 23.8 | 15.5 to 36.7 | 1,372 |
DHS (NHGRI) | 19 | 61 | 6.1 | 3.3 to 11.3 | 19.7 | 8.3 to 46.9 | 310 |
DHS (Regulome) | 43 | 135 | 8.6 | 5.3 to 14.0 | 27.0 | 13.4 to 54.4 | 499 |
DNA-protein interaction/transcript regulation | |||||||
HisPolTAF | 141 | 348 | 13.1 | 10.0 to 17.2 | 32.4 | 22.5 to 46.5 | 1,076 |
Seq_specific (all motifs) | 131 | 420 | 11.2 | 8.3 to 15.0 | 35.8 | 23.1 to 55.3 | 1,174 |
SeqSp (sequence specific factors) | 54 | 225 | 10.2 | 6.2 to 16.7 | 42.5 | 20.1 to 89.5 | 530 |
Ancestral repeats | 532 | 1,592 | 7.9 | 6.7 to 9.2 | 26.5 | 21.7 to 32.5 | 5,998 |
Evolutionary constraint | |||||||
MCS strict | 19 | 31 | 2.5 | 1.3 to 5.1 | 4.1 | 1.6 to 10.4 | 748 |
MCS moderate | 78 | 170 | 5.1 | 3.5 to 7.6 | 11.2 | 6.8 to 18.5 | 1,515 |
MCS loose | 356 | 960 | 9.8 | 8.2 to 11.7 | 26.4 | 20.9 to 33.4 | 3,637 |
Cell cycle | |||||||
EarlyRepSeg | 1,124 | 2,989 | 16.4 | 13.8 to 19.4 | 43.5 | 33.3 to 56.9 | 6,868 |
MidRepSeg | 1,190 | 3,352 | 15.4 | 13.5 to 17.5 | 43.2 | 35.3 to 53.0 | 7,751 |
LateRepSeg | 1,110 | 3,345 | 13.9 | 12.1 to 15.9 | 41.9 | 32.9 to 53.3 | 7,991 |
bp, base pairs; CDS, coding sequence; CI, confidence interval; DHS, DNase hypersensitive sites; ENCODE, Encyclopedia of DNA Elements; FAIRE, formaldehyde assisted isolation of regulatory elements; kb, kilobases; MCS, multi-species conserved sequence; NHGRI, National Human Genome Research Institute; transfrag, transcribed fragment; RACEfrag, rapid amplification of cDNA ends fragment; TAR, transcriptionally active region; TSS, transcription start site; TUF, transcripts of unknown function; UTR, untranslated region.
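The two rate columns appear to be simple normalizations of the indel count (n) and the indel length (bp) by the feature length. The Python sketch below reproduces the point estimates for three rows copied from the table. The function name `rate_per_100kb` and the `rows` dictionary are illustrative and not part of the original analysis. The 99% confidence intervals are not recomputed, because the table does not state how they were derived.

```python
def rate_per_100kb(count: float, feature_length_kb: float) -> float:
    """Return a count (indels or indel bp) normalized per 100 kb of feature."""
    return count / feature_length_kb * 100


# (indel count n, indel length in bp, feature length in kb), copied from Table 2
rows = {
    "Overall": (4486, 13010, 29998),
    "CDS": (5, 5, 675),
    "MCS strict": (19, 31, 748),
}

for name, (n, bp, length_kb) in rows.items():
    n_rate = rate_per_100kb(n, length_kb)
    bp_rate = rate_per_100kb(bp, length_kb)
    print(f"{name}: {n_rate:.1f} indels per 100 kb, {bp_rate:.1f} bp per 100 kb")
```

Rounded to one decimal place, these values match the tabulated rates for the three rows (15.0 and 43.4, 0.7 and 0.7, 2.5 and 4.1).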