Table 2.
Dataset | UST: # strings | UST: #char/k-mer | ESS-Tip-Compress: # strings | ESS-Tip-Compress: #char/k-mer | ESS-Compress: # strings | ESS-Compress: #char/k-mer | Eq. (3.1) lower bound: #char/k-mer |
---|---|---|---|---|---|---|---|
R. sphaeroides | 240,562 | 2.22 | 61,909 | 1.38 | 36,456 | 1.29 | 1.28 |
Human RNA-seq | 4,098,389 | 2.22 | 1,834,945 | 1.60 | 1,098,938 | 1.42 | 1.39 |
Gingiva metagenome | 3,095,476 | 1.91 | 1,499,270 | 1.48 | 917,388 | 1.33 | 1.32 |
Soybean RNA-seq | 1,806,078 | 1.49 | 1,137,350 | 1.32 | 515,244 | 1.17 | 1.17 |
Tongue metagenome | 6,030,814 | 2.10 | 2,664,422 | 1.53 | 1,327,701 | 1.33 | 1.32 |
Whole human | 22,072,219 | 1.32 | 21,320,263 | 1.28 | 10,321,275 | 1.15 | 1.14 |
The rightmost column shows the lower bound computed by Eq. (3.1) in Sect. "The weight of the ESS-Compress representation". The weight of ESS-Compress was verified to be the same as predicted by Theorem 3.2.
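
To make the comparison with the lower bound concrete, the following minimal sketch computes how far the ESS-Compress weight sits above the Eq. (3.1) bound for each dataset. The numbers are copied from the two rightmost columns of Table 2; the names `table2` and `gap_percent` are illustrative and not from the paper. Because the table reports values rounded to two decimals, the resulting percentages are only approximate.

```python
# (ESS-Compress #char/k-mer, Eq. (3.1) lower bound #char/k-mer), copied from Table 2.
table2 = {
    "R. sphaeroides": (1.29, 1.28),
    "Human RNA-seq": (1.42, 1.39),
    "Gingiva metagenome": (1.33, 1.32),
    "Soybean RNA-seq": (1.17, 1.17),
    "Tongue metagenome": (1.33, 1.32),
    "Whole human": (1.15, 1.14),
}


def gap_percent(weight, bound):
    """Relative excess of `weight` over `bound`, in percent (hypothetical helper)."""
    return 100.0 * (weight - bound) / bound


# Gap of ESS-Compress above the lower bound, per dataset.
gaps = {name: gap_percent(w, b) for name, (w, b) in table2.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f}% above the lower bound")
```

The same helper can be applied to the UST or ESS-Tip-Compress columns to compare those representations against the bound.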