Table 1.
| Dataset | Classification (Multiclass/Binary) | Number | Length (Sum) | Length (Average) | Length (Max) |
|---|---|---|---|---|---|
| Dataset_HERV (HERV) | HERV_Coding/HERV | 16,556 | 22,643,342 | 1,368 | 125,188 |
| | HERV_Non-Coding/HERV | 400,860 | 295,693,006 | 738 | 112,801 |
| | Non-HERV_Coding/Non-HERV | 18,994 | 23,490,411 | 1,237 | 83,426 |
| | Non-HERV_Non-Coding/Non-HERV | 343,913 | 251,939,874 | 733 | 93,893 |
| Dataset_Immuno (Immunogenetic) | Immuno_KIR/Immuno | 1,723 | 3,096,500 | 1,797 | 39,269 |
| | Immuno_Others/Immuno | 16,719 | 83,307,498 | 4,983 | 71,249 |
| | Non-Immuno | 18,326 | 78,935,043 | 4,307 | 93,519 |
| Dataset_Regulatory (Regulatory) | TF_binding_site/Regulatory | 22,860 | 13,391,832 | 586 | 93,893 |
| | Enhancer/Regulatory | 320,389 | 335,938,843 | 1,049 | 93,893 |
| | CTCF_binding_site/Regulatory | 94,738 | 52,273,466 | 552 | 93,893 |
| | Promoter/Regulatory | 52,413 | 79,591,023 | 1,519 | 99,371 |
| | Open_chromatin_region/Regulatory | 130,006 | 54,838,694 | 422 | 93,893 |
| | Non-Regulatory | 494,389 | 435,738,664 | 881 | 96,745 |
| Dataset_Diseases_GWAS (GWAS_loci/PrimateAI-3D scores) | Diseases-GWAS_Coding/Diseases_GWAS | 175,920 | 566,540,577 | 3,220 | 99,371 |
| | Diseases-GWAS_Non-Coding/Diseases_GWAS | 51,427 | 132,677,910 | 2,580 | 92,731 |
| | Non_Diseases-GWAS_Coding/Non_Diseases_GWAS | 148,235 | 550,157,564 | 3,711 | 93,893 |
| | Non_Diseases-GWAS_Non-Coding/Non_Diseases_GWAS | 31,298 | 104,453,319 | 3,337 | 93,893 |
| Dataset_Highly_Specifically_Gene (Defensins/Olfactory Receptor) | Defensins/Highly_Specifically_Gene | 415 | 418,877 | 1,009 | 9,191 |
| | Olfactory_Receptor/Highly_Specifically_Gene | 5,400 | 5,305,444 | 983 | 40,645 |
| | Others | 5,610 | 6,653,032 | 1,186 | 94,155 |