Table 3.
MIR | Expression in cell linesᵃ | Predicted length of primary transcript(s)ᵇ |
---|---|---|
MIR_dup2285 (chr16:22309780-22309939) | GM12878, H1-hESC, HeLa-S3, HepG2, HUVEC, K562, NHEK | 124/131 (T4), 223/230 (T3GT2), 243/250 (T4), 354/361 (T3CT, downstream of annotated MIR but within the cloned sequence) |
MIR_dup3493 (chr1:34943459-34943727) | GM12878, H1-hESC, K562, NHEK | 177 (TAT3), 213 (TAT3), 277 (T2AT2). Expected transcripts originating from terminators more downstream of the annotated MIR but within the cloned sequence: 305 (T2AT2), 365 (T2AT3), 393 (T4) |
MIRb_dup5848 (chr2:71762977-71763215) | H1-hESC, HepG2, NHEK | 119 (TCT3), 256 (T3CT), 358 (T5) |
MIRc_dup2189 (chr14:89445565-89445634) | H1-hESC, K562, NHEK | 137 (T3AT) and 140 (T4), 207 (T2GT3), 250 (T5) |
ᵃ This column lists, for each MIR element, the cell lines in which it was found to be expressed by ENCODE RNA-seq data analysis.
ᵇ The reported transcript lengths were calculated by taking as the TSS the A or G residue closest to the position 12 bp upstream of the A box. To estimate the 3’ end of the transcript, both canonical (Tn with n ≥ 4) and non-canonical T-rich Pol III terminators [17] were considered, both within and downstream of the MIR body sequence (the terminator is indicated in parentheses after each transcript length). For canonical terminators, the four Us corresponding to the first four Ts of the termination signal were counted as part of the transcript; for non-canonical terminators, all nucleotides of the terminator were counted as incorporated into the RNA. For MIR_dup2285, in which either of two possible A boxes could drive transcription, the expected lengths of both putative alternative transcripts are given, separated by a slash (Supplementary Fig. S1).
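
The length rule in footnote b can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the table: the function names, the 0-based coordinate convention, the tie-break between two equally distant A/G residues, and the idea of passing non-canonical terminators as a user-supplied list of motifs are all hypothetical. Overlaps between canonical and non-canonical terminators are not resolved.

```python
import re

# Canonical Pol III terminator: a run of four or more Ts (Tn, n >= 4).
CANONICAL = re.compile(r"T{4,}")


def find_tss(seq, abox_start, offset=12):
    """Index of the A or G closest to the position `offset` bp upstream of the A box.

    Coordinates are 0-based (an assumption of this sketch).
    """
    target = abox_start - offset
    candidates = [i for i, base in enumerate(seq[:abox_start]) if base in "AG"]
    if not candidates:
        raise ValueError("no A or G residue upstream of the A box")
    # Tie-break (assumption, not stated in the footnote): prefer the upstream residue.
    return min(candidates, key=lambda i: (abs(i - target), i))


def transcript_lengths(seq, abox_start, noncanonical=()):
    """Return sorted (length, terminator) pairs for terminators at or after the A box."""
    tss = find_tss(seq, abox_start)
    results = []
    for m in CANONICAL.finditer(seq, abox_start):
        # Only the first four Ts of a canonical run are counted as transcribed.
        results.append((m.start() + 4 - tss, "T%d" % len(m.group())))
    for motif in noncanonical:
        start = seq.find(motif, abox_start)
        while start != -1:
            # Every nucleotide of a non-canonical terminator is counted.
            results.append((start + len(motif) - tss, motif))
            start = seq.find(motif, start + 1)
    return sorted(results)


# Toy sequence (invented for illustration, not a real MIR locus).
toy = "CCC" + "A" + "C" * 11 + "C" * 6 + "TTTT" + "CCC" + "TTTCT" + "CC"
print(transcript_lengths(toy, abox_start=15, noncanonical=["TTTCT"]))
# prints [(22, 'T4'), (30, 'TTTCT')]
```

In the toy sequence the A at index 3 lies exactly 12 bp upstream of the assumed A box start (index 15), so it is taken as the TSS. The T4 run then gives a 22-nt transcript and the T3CT motif a 30-nt transcript, mirroring the "length (terminator)" notation used in the table.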