Table 5.
Comparison of ENCODE and Bhangale et al. (ten ENCODE regions) indel data
| ENCODE (44 ENCODE regions/Baylor) | | | | Bhangale et al. (ten ENCODE regions/Baylor) | | | |
| Indels | | Rate (per 100 kb) | | Indels | | Rate (per 100 kb) | |
| n | bp | n | bp | n | bp | n | bp |
Manual | 2,186 | 6,504 | 14.6 | 43.4 | 362 | 1,122 | 13.0 | 40.4 |
Random | 2,300 | 6,506 | 15.3 | 43.4 | 502 | 1,350 | 14.3 | 38.6 |
Overall | 4,486 | 13,010 | 15.0 | 43.4 | 864 | 2,472 | 13.8 | 39.4 |
RNA transcription | ||||||||
CDS | 5 | 5 | 0.7 | 0.7 | 1 | 1 | 1.2 | 1.2 |
TSS | 2 | 2 | 3.3 | 3.3 | 0 | 0 | 0.0 | 0.0 |
RACEfrags | 9 | 28 | 2.1 | 6.6 | 0 | 0 | 0.0 | 0.0 |
TARs/transfrags | 37 | 78 | 5.8 | 12.3 | 6 | 11 | 7.5 | 13.7 |
Pseudo-exons | 9 | 26 | 6.6 | 19.1 | 2 | 10 | 9.7 | 48.7 |
3' UTR | 48 | 103 | 11.0 | 23.6 | 11 | 29 | 18.7 | 49.2 |
5' UTR | 7 | 32 | 6.0 | 27.4 | 4 | 8 | 37.3 | 74.6 |
TUF | 53 | 160 | 12.2 | 36.9 | 4 | 18 | 8.1 | 36.4 |
Open chromatin | ||||||||
FAIRE sites | 106 | 327 | 7.7 | 23.8 | 17 | 72 | 5.6 | 23.6 |
DHS (NHGRI) | 19 | 61 | 6.1 | 19.7 | 1 | 1 | 2.8 | 2.8 |
DHS (Regulome) | 43 | 135 | 8.6 | 27.0 | 15 | 40 | 8.5 | 22.6 |
DNA-protein interaction/transcript regulation | ||||||||
HisPolTAF | 141 | 348 | 13.1 | 32.4 | 32 | 114 | 12.8 | 45.5 |
Seq_specific (all motifs) | 131 | 420 | 11.2 | 35.8 | 28 | 122 | 33.4 | 145.3 |
SeqSp (sequence specific factors) | 54 | 225 | 10.2 | 42.5 | 9 | 45 | 5.1 | 25.6 |
Ancestral repeats | 532 | 1,592 | 7.9 | 26.5 | 110 | 280 | 8.7 | 22.1 |
Evolutionary constraint | ||||||||
MCS strict | 19 | 31 | 2.5 | 4.1 | 5 | 9 | 3.3 | 5.9 |
MCS moderate | 78 | 170 | 5.1 | 11.2 | 17 | 36 | 5.4 | 11.4 |
MCS loose | 356 | 960 | 9.8 | 26.4 | 63 | 136 | 8.4 | 18.1 |
Cell cycle | ||||||||
EarlyRepSeg | 1,124 | 2,989 | 16.4 | 43.5 | 161 | 495 | 16.4 | 50.4 |
MidRepSeg | 1,190 | 3,352 | 15.4 | 43.2 | 270 | 797 | 16.4 | 48.3 |
LateRepSeg | 1,110 | 3,345 | 13.9 | 41.9 | 300 | 819 | 11.3 | 31.0 |
Both datasets (Encyclopedia of DNA Elements [ENCODE] and that reported by Bhangale and coworkers [19]) are based on a subset of eight African Americans (the Baylor samples). bp, base pairs; CDS, coding sequence; CI, confidence interval; DHS, DNase hypersensitive sites; ENCODE, Encyclopedia of DNA Elements; FAIRE, formaldehyde assisted isolation of regulatory elements; kb, kilobases; MCS, multi-species conserved sequence; NHGRI, National Human Genome Research Institute; transfrag, transcribed fragment; RACEfrag, rapid amplification of cDNA ends fragment; SNP, single nucleotide polymorphism; TAR, transcriptionally active region; TSS, transcription start site; TUF, transcripts of unknown function; UTR, untranslated region.
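The count and rate columns can be related with a little arithmetic. The sketch below uses only the "Overall" row and assumes that each rate is the count (or affected base pairs) divided by the surveyed sequence length and scaled to 100 kb. The table does not state this definition explicitly, and the function and variable names are illustrative, not taken from the source.

```python
# Values copied from the "Overall" row of Table 5.
# Each entry: (indel count n, indel bp, rate per 100 kb by n, rate per 100 kb by bp)
OVERALL = {
    "ENCODE": (4486, 13010, 15.0, 43.4),
    "Bhangale et al.": (864, 2472, 13.8, 39.4),
}


def mean_indel_length(n, bp):
    """Average indel size in bp: total affected bases divided by indel count."""
    return bp / n


def implied_surveyed_kb(n, rate_per_100kb):
    """Surveyed length in kb, ASSUMING rate = n / length, scaled to 100 kb."""
    return n / rate_per_100kb * 100


for name, (n, bp, rate_n, rate_bp) in OVERALL.items():
    length = mean_indel_length(n, bp)
    surveyed = implied_surveyed_kb(n, rate_n)
    # If the assumed rate definition holds, rate_bp / rate_n should match
    # the mean indel length up to rounding in the published rates.
    print(f"{name}: mean length {length:.2f} bp, "
          f"rate ratio {rate_bp / rate_n:.2f}, "
          f"implied survey {surveyed:.0f} kb")
```

Under that assumption, the ratio of the two rate columns should agree with the mean indel length within rounding, which gives a quick consistency check for any row of the table.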