Table 1.
For each dataset (columns), we report the number of samples per class, as well as the cardinality of the dataset composed solely of enhancers and promoters (rows “Total E + P”), for genome version hg19 (top table) and genome version hg38 (bottom table).
Genome version | Labels | HepG2 | K562 | GM12878 | Total | HelaS3 |
---|---|---|---|---|---|---|
hg19 | Active enhancer (AE) | 1465 | 894 | 2878 | 5237 | 1847 |
hg19 | Inactive enhancer (IE) | 34,556 | 34,392 | 28,156 | 97,104 | 32,179 |
hg19 | Active promoter (AP) | 11,467 | 10,076 | 10,816 | 32,359 | 10,759 |
hg19 | Inactive promoter (IP) | 96,184 | 82,829 | 73,891 | 252,904 | 79,004 |
hg19 | Total E + P | 143,672 | 128,191 | 115,741 | 387,604 | 123,789 |
hg19 | Active exon (AX) | 9931 | 9033 | 8226 | | 9123 |
hg19 | Inactive exon (IX) | 19,071 | 20,261 | 19,078 | | 22,071 |
hg19 | Unknown (UK) | 79,417 | 78,081 | 80,004 | | 81,502 |
hg19 | Total | 252,091 | 235,566 | 223,049 | | 236,485 |
hg38 | Active enhancer (AE) | 7177 | 5524 | 11,589 | 24,290 | |
hg38 | Inactive enhancer (IE) | 56,108 | 57,761 | 51,696 | 165,565 | |
hg38 | Active promoter (AP) | 14,092 | 12,524 | 14,036 | 40,652 | |
hg38 | Inactive promoter (IP) | 85,789 | 87,357 | 85,845 | 258,991 | |
hg38 | Total E + P | 163,166 | 163,166 | 163,166 | 489,498 | |
Column “Total” allows comparison of the total cardinality of CRRs across the hg19 and hg38 datasets. Since we also have non-CRR regions for genome version hg19, row “Total” in the top table reports the total number of samples per cell line in the hg19 dataset.
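
The aggregate entries described above can be recomputed from the per-class counts. The following is a minimal sketch that uses the hg19 counts from Table 1. The dictionary and function names are hypothetical, and the choice of cell lines summed in column “Total” (HepG2, K562 and GM12878, the three also present for hg38) is an assumption inferred from the reported numbers rather than stated in the caption.

```python
# Per-class counts copied from the hg19 rows of Table 1.
# CRR classes: active/inactive enhancers (AE, IE) and promoters (AP, IP).
HG19_CRR = {
    "HepG2":   {"AE": 1465, "IE": 34556, "AP": 11467, "IP": 96184},
    "K562":    {"AE": 894,  "IE": 34392, "AP": 10076, "IP": 82829},
    "GM12878": {"AE": 2878, "IE": 28156, "AP": 10816, "IP": 73891},
    "HelaS3":  {"AE": 1847, "IE": 32179, "AP": 10759, "IP": 79004},
}

# Non-CRR classes (hg19 only): active/inactive exons (AX, IX) and unknown (UK).
HG19_NON_CRR = {
    "HepG2":   {"AX": 9931, "IX": 19071, "UK": 79417},
    "K562":    {"AX": 9033, "IX": 20261, "UK": 78081},
    "GM12878": {"AX": 8226, "IX": 19078, "UK": 80004},
    "HelaS3":  {"AX": 9123, "IX": 22071, "UK": 81502},
}

# Assumption: column "Total" sums only the cell lines shared with hg38.
SHARED_CELL_LINES = ("HepG2", "K562", "GM12878")


def total_e_plus_p(cell_line: str) -> int:
    """Row "Total E + P": all enhancer and promoter samples of one cell line."""
    return sum(HG19_CRR[cell_line].values())


def total_samples(cell_line: str) -> int:
    """Row "Total": CRR plus non-CRR samples of one cell line."""
    return total_e_plus_p(cell_line) + sum(HG19_NON_CRR[cell_line].values())


def column_total(label: str) -> int:
    """Column "Total": one CRR class summed over the shared cell lines."""
    return sum(HG19_CRR[cell_line][label] for cell_line in SHARED_CELL_LINES)


for cell_line in HG19_CRR:
    print(cell_line, total_e_plus_p(cell_line), total_samples(cell_line))
```

Calling `column_total("AE")` sums the active-enhancer counts of the three shared cell lines, which is the quantity that column “Total” makes comparable between hg19 and hg38.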