Table A.
Repeat Class^a | Sequence: Junction Fraction (%) | Sequence: Junction No. | Sequence: Control Fraction (%) | Sequence: Finished Fraction (%) | Enrichment: Junction vs. Control Ratio | Enrichment: Junction vs. Finished Ratio |
Repetitive repeats: | | | | | | |
SINE: | 24.6 | 2,680 | 15.3 | 12.1 | 1.6 | 2.0 |
Alu | 23.9^b | 2,525 | 14.2^c | 10.3 | 1.7^b | 2.3 |
MIR | .7 | 85 | 1.1 | 1.9 | .6 | .4 |
LINE: | 11.9 | 1,399 | 15.9 | 19.2 | .7 | .6 |
L1 | 11.0 | 1,304 | 14.3 | 16.5 | .8 | .7 |
L2 | .6 | 68 | 1.4 | 2.5 | .4 | .3 |
LTR: | 5.8 | 730 | 9.4 | 8.0 | .6 | .7 |
ERVL | .8 | 92 | .9 | 1.3 | .9 | .6 |
ERVK | .8 | 119 | 1.0 | .3 | .8 | 2.5 |
ERV1 | 2.9 | 371 | 4.8 | 2.9 | .6 | 1.0 |
MaLR | 1.2 | 147 | 2.6 | 3.5 | .5 | .4 |
DNA: | 1.3 | 147 | 2.1 | 2.7 | .6 | .5 |
MER1 type | .4 | 44 | .9 | 1.2 | .4 | .3 |
MER2 type | .8 | 87 | .9 | 1.0 | .9 | .8 |
Low complexity | .8 | 89 | .7 | .6 | 1.2 | 1.3 |
Other | .1 | 13 | .2 | .1 | .5 | .7 |
Satellite | 6.1^b | 616 | 3.0^c | .3 | 2.1^b | 21.9 |
Simple repeat | 1.3 | 149 | 1.6 | .9 | .8 | 1.5 |
Total repetitive | 52.0 | 5,829 | 48.2 | 44.0 | 1.1 | 1.2 |
GC Content | 44.5 | … | 43.9 | 40.6 | … | … |
Note.— A total of 2,366 alignments (9,464 junction positions) with 90%–99.8% sequence identity and length ⩾5 kb. The junction fraction was calculated as the average repeat content within the 9,464 junctions, each 10 bp in size, for a total of 94,640 bp of sequence. Control regions were defined as the duplicated sequence plus 1 kb of flanking sequence and totaled 108,720,232 bp of sequence. The control fraction, therefore, represented the repeat content within the duplicated portion of the genome. The finished fraction considered the repeat content within the finished portion of the human genome assembly (build 30). The enrichment compares the relative repeat content of the junction to the control or finished regions of the genome.
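The arithmetic described in the note can be sketched in a few lines of Python. This is only an illustration: the fractions are copied from three rows of Table A, and the names `TABLE_A`, `enrichment`, and `junction_sequence_bp` are hypothetical helpers, not part of the original analysis. The published ratios were presumably computed from unrounded fractions, so recomputing them from the rounded table values may differ slightly for some rows.

```python
# Fractions (percent) copied from Table A for three repeat classes.
TABLE_A = {
    "Alu": {"junction": 23.9, "control": 14.2, "finished": 10.3},
    "L1": {"junction": 11.0, "control": 14.3, "finished": 16.5},
    "Total repetitive": {"junction": 52.0, "control": 48.2, "finished": 44.0},
}


def enrichment(junction_pct, background_pct):
    """Ratio of junction repeat content to a background repeat content."""
    return junction_pct / background_pct


def junction_sequence_bp(n_junctions, window_bp):
    """Total sequence examined when each junction is a fixed-size window."""
    return n_junctions * window_bp


# 9,464 junction positions of 10 bp each, as stated in the note.
total_bp = junction_sequence_bp(9464, 10)
print(f"junction sequence examined: {total_bp:,} bp")

for repeat_class, row in TABLE_A.items():
    vs_control = enrichment(row["junction"], row["control"])
    vs_finished = enrichment(row["junction"], row["finished"])
    print(f"{repeat_class}: {vs_control:.1f} vs. control, {vs_finished:.1f} vs. finished")
```

A ratio above 1 indicates that the repeat class is overrepresented at junctions relative to the chosen background; a ratio below 1 indicates depletion.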
^a MIR = mammalian-wide interspersed repeat; ERV = endogenous retrovirus; MER = medium reiteration frequency sequence; MaLR = mammalian LTR retrosequence.
^b P<.0001, by random sampling of control regions (10,000 replicates; maximum simulated fraction 16.3% for Alu and 4.9% for satellite).
^c P<.0001, by random sampling of finished regions compared with the duplicated control regions.
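The random-sampling test referred to in the footnotes can be illustrated with a minimal sketch. It assumes a simple scheme that the table does not spell out: windows of the junction size are placed uniformly at random in the control sequence, the repeat fraction is recorded for each replicate, and the observed junction fraction is compared with the simulated distribution. The toy repeat mask, the reduced window and replicate counts, and the function names `simulate_fractions` and `count_at_least` are all assumptions made for illustration, not the authors' implementation.

```python
import random
from itertools import accumulate


def simulate_fractions(mask, n_windows, window_bp, replicates, seed=0):
    """Repeat fraction (%) of randomly placed windows, one value per replicate."""
    rng = random.Random(seed)
    # prefix[i] holds the number of repeat-masked bases in mask[:i].
    prefix = [0] + list(accumulate(mask))
    last_start = len(mask) - window_bp
    fractions = []
    for _ in range(replicates):
        covered = 0
        for _ in range(n_windows):
            start = rng.randint(0, last_start)
            covered += prefix[start + window_bp] - prefix[start]
        fractions.append(100.0 * covered / (n_windows * window_bp))
    return fractions


def count_at_least(observed_pct, simulated):
    """Number of replicates whose fraction reaches the observed value."""
    return sum(1 for value in simulated if value >= observed_pct)


# Toy control sequence (assumption): 14 repeat bases in every 100 bp.
toy_mask = [1 if i % 100 < 14 else 0 for i in range(100_000)]

# Reduced sizes so the sketch runs quickly; the table reports
# 9,464 junctions and 10,000 replicates.
simulated = simulate_fractions(toy_mask, n_windows=500, window_bp=10, replicates=200, seed=1)
hits = count_at_least(23.9, simulated)
print(f"max simulated fraction: {max(simulated):.1f}%")
print(f"replicates reaching the observed 23.9%: {hits} of {len(simulated)}")
```

Under this scheme, when none of N replicates reaches the observed fraction, the result is reported as P<1/N, which for 10,000 replicates corresponds to the P<.0001 given in the footnotes.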