Table 1. The exaptation of mobile element classes and superfamilies.
Class | Superfamily | Exapted Elements | Exapted Bases | Genomic Elements | Genomic Bases | Genomic Elements Per Exaptation | Genomic Bases Per Exapted Base
LINE | L1 | 67,103 | 1,963,366 | 937,370 | 511,375,943 | 13.9 | 260.4
LINE | L2 | 46,532 | 946,311 | 462,005 | 103,894,644 | 9.9 | 109.7
LINE | CR1 | 19,644 | 586,282 | 60,731 | 10,855,797 | 3 | 18.5
LINE | RTE | 6,218 | 156,338 | 17,696 | 3,652,083 | 2.8 | 23.3
LINE | Dong-R4 | 797 | 25,967 | 550 | 120,346 | 0.6 | 4.6
LINE | RTE-BovB | 398 | 12,401 | 659 | 74,688 | 1.6 | 6
LINE | L1-like | 44 | 1,715 | 83 | 6,788 | 1.8 | 3.9
LINE | (class total) | 140,760 | 3,695,873 | 1,479,094 | 629,957,456 | 10.5 | 170.4
SINE | MIR | 61,335 | 1,122,485 | 590,380 | 84,230,914 | 9.6 | 75
SINE | Deu | 1,815 | 70,613 | 1,266 | 178,943 | 0.6 | 2.5
SINE | Alu | 1,624 | 107,958 | 1,174,518 | 306,522,171 | 723.2 | 2,839.2
SINE | SINE | 1,602 | 66,502 | 964 | 161,994 | 0.6 | 2.4
SINE | tRNA | 1,026 | 27,838 | 1,652 | 229,877 | 1.6 | 8.2
SINE | (class total) | 67,418 | 1,397,359 | 1,768,780 | 391,323,899 | 26.2 | 280
DNA | hAT-Charlie | 23,994 | 515,751 | 251,682 | 44,862,356 | 10.4 | 86.9
DNA | TcMar-Tigger | 9,024 | 264,739 | 102,787 | 33,907,139 | 11.3 | 128
DNA | hAT-Tip100 | 2,663 | 64,363 | 30,206 | 6,602,950 | 11.3 | 102.5
DNA | TcMar-like | 2,380 | 127,346 | 3,426 | 624,957 | 1.4 | 4.9
DNA | DNA | 1,894 | 88,980 | 2,750 | 339,865 | 1.4 | 3.8
DNA | TcMar-Mariner | 1,496 | 39,382 | 16,229 | 2,815,735 | 10.8 | 71.4
DNA | TcMar-Tc2 | 1,463 | 34,606 | 8,083 | 1,664,901 | 5.5 | 48.1
DNA | hAT-Blackjack | 1,360 | 31,624 | 19,571 | 3,415,244 | 14.3 | 107.9
DNA | hAT | 767 | 12,459 | 12,421 | 1,673,724 | 16.1 | 134.3
DNA | TcMar | 674 | 17,320 | 1,940 | 319,735 | 2.8 | 18.4
DNA | PiggyBac-like | 458 | 18,964 | 239 | 44,436 | 0.5 | 2.3
DNA | hAT-like | 323 | 6,851 | 3,027 | 503,467 | 9.3 | 73.4
DNA | PiggyBac | 80 | 3,041 | 2,115 | 497,959 | 26.4 | 163.7
DNA | MuDR | 14 | 1,302 | 1,972 | 686,896 | 140.8 | 527.5
DNA | Merlin | 1 | 56 | 55 | 17,595 | 55 | 314.1
DNA | (class total) | 46,561 | 1,226,696 | 456,503 | 97,959,784 | 9.8 | 79.8
LTR | ERVL-MaLR | 14,468 | 289,612 | 343,284 | 110,688,741 | 23.7 | 382.1
LTR | ERVL | 8,441 | 185,880 | 157,889 | 56,087,725 | 18.7 | 301.7
LTR | ERV1 | 2,855 | 81,186 | 172,636 | 83,248,758 | 60.4 | 1,025.4
LTR | Gypsy | 1,815 | 38,904 | 10,760 | 2,295,297 | 5.9 | 58.9
LTR | Gypsy-like | 1,323 | 26,101 | 7,808 | 1,454,545 | 5.9 | 55.7
LTR | LTR | 837 | 22,332 | 2,196 | 472,591 | 2.6 | 21.1
LTR | ERVL-like | 320 | 6,700 | 1,782 | 413,433 | 5.5 | 61.7
LTR | ERV | 35 | 579 | 580 | 191,020 | 16.5 | 329.9
LTR | ERVK | 7 | 271 | 10,455 | 8,790,037 | 1,493.5 | 32,435.5
LTR | (class total) | 30,083 | 651,379 | 707,390 | 263,530,842 | 23.5 | 404.5
Total | (all classes) | 284,857 | 6,988,191 | 4,411,767 | 1,382,528,004 | 15.4 | 197.8
We show the contribution of mobile elements as a whole, and of the individual classes and superfamilies, to the creation of putative gene regulatory elements on the human lineage. The numbers for the superfamilies do not always add up exactly to the number for their class. This is because a CNEE is not counted as exapted from a given type of mobile element unless more than 50% of its bases are annotated as originating from that type. A CNEE in which 45% of the bases are annotated as coming from an L1 insertion and 45% from an L2 insertion will not appear as either an L1 or an L2 exaptation, but will be counted as a LINE exaptation. A very small number of bases are also annotated by RepeatMasker as having come from more than one mobile element. Both of these situations are rare in our set, and the difference in counting never amounts to more than 35 elements.
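The counting rule described in the legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `exaptation_levels`, the partial `SUPERFAMILY_TO_CLASS` lookup, and the input layout (a per-superfamily count of annotated bases for one CNEE) are all hypothetical. The sketch also assumes that each base is assigned to at most one superfamily, which sets aside the rare multi-annotation case mentioned above.

```python
from collections import defaultdict

# Hypothetical lookup from superfamily to class; only a few entries are shown.
SUPERFAMILY_TO_CLASS = {
    "L1": "LINE", "L2": "LINE", "CR1": "LINE",
    "MIR": "SINE", "Alu": "SINE",
}


def exaptation_levels(cnee_length, bases_by_superfamily):
    """Return the levels at which one CNEE would be counted as an exaptation.

    cnee_length: total number of bases in the CNEE.
    bases_by_superfamily: dict mapping a superfamily name to the number of
        CNEE bases annotated as that superfamily (assumed non-overlapping).
    """
    def is_majority(n_bases):
        # "More than 50%" as a strict integer comparison, avoiding floats.
        return 2 * n_bases > cnee_length

    # Pool the superfamily base counts into their parent classes.
    bases_by_class = defaultdict(int)
    for superfamily, n_bases in bases_by_superfamily.items():
        bases_by_class[SUPERFAMILY_TO_CLASS[superfamily]] += n_bases

    # At most one category per level can exceed 50%, so the first hit suffices.
    superfamily_hit = next(
        (sf for sf, n in bases_by_superfamily.items() if is_majority(n)), None)
    class_hit = next(
        (cl for cl, n in bases_by_class.items() if is_majority(n)), None)
    any_hit = is_majority(sum(bases_by_superfamily.values()))

    return {
        "superfamily": superfamily_hit,
        "class": class_hit,
        "mobile_element": any_hit,
    }


# The legend's example: 45% L1 plus 45% L2 in a 100-base CNEE.
# No superfamily passes the threshold, but the LINE class does.
print(exaptation_levels(100, {"L1": 45, "L2": 45}))
```

Under the same assumptions, a CNEE split between two classes (for example 30% L1 and 30% MIR) would pass the threshold only for mobile elements as a whole, which is one way the class rows could fall slightly short of the total row.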