2021 Jun 21;16:10. doi: 10.1186/s13015-021-00192-7

Table 2.

The weights and sizes of various string set representations

| Dataset | UST # strings | UST #char/k-mer | ESS-Tip-Compress # strings | ESS-Tip-Compress #char/k-mer | ESS-Compress # strings | ESS-Compress #char/k-mer | Eq. (3.1) lower bound #char/k-mer |
|---|---|---|---|---|---|---|---|
| R. sphaeroides | 240,562 | 2.22 | 61,909 | 1.38 | 36,456 | 1.29 | 1.28 |
| Human RNA-seq | 4,098,389 | 2.22 | 1,834,945 | 1.60 | 1,098,938 | 1.42 | 1.39 |
| Gingiva metagenome | 3,095,476 | 1.91 | 1,499,270 | 1.48 | 917,388 | 1.33 | 1.32 |
| Soybean RNA-seq | 1,806,078 | 1.49 | 1,137,350 | 1.32 | 515,244 | 1.17 | 1.17 |
| Tongue metagenome | 6,030,814 | 2.10 | 2,664,422 | 1.53 | 1,327,701 | 1.33 | 1.32 |
| Whole human | 22,072,219 | 1.32 | 21,320,263 | 1.28 | 10,321,275 | 1.15 | 1.14 |

The rightmost column shows the lower bound computed by Eq. (3.1) in Sect. "The weight of the ESS-Compress representation". The weight of ESS-Compress was verified to be the same as predicted by Theorem 3.2.
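To make the comparison with the lower bound concrete, the following is a minimal sketch that recomputes, from the #char/k-mer values in Table 2, how far ESS-Compress sits above the Eq. (3.1) lower bound and how much it saves relative to UST. The names `TABLE2`, `overhead_pct`, and `summarize` are hypothetical and introduced only for illustration, and the percentages are derived here rather than reported in the source. Because the table values are rounded to two decimals, the results are approximate.

```python
# Characters per k-mer copied from Table 2, in the order:
# (UST, ESS-Tip-Compress, ESS-Compress, Eq. (3.1) lower bound).
TABLE2 = {
    "R. sphaeroides": (2.22, 1.38, 1.29, 1.28),
    "Human RNA-seq": (2.22, 1.60, 1.42, 1.39),
    "Gingiva metagenome": (1.91, 1.48, 1.33, 1.32),
    "Soybean RNA-seq": (1.49, 1.32, 1.17, 1.17),
    "Tongue metagenome": (2.10, 1.53, 1.33, 1.32),
    "Whole human": (1.32, 1.28, 1.15, 1.14),
}


def overhead_pct(weight, lower_bound):
    """Percentage by which `weight` exceeds `lower_bound` (illustrative metric)."""
    return 100.0 * (weight - lower_bound) / lower_bound


def summarize(table):
    """Per dataset: gap of ESS-Compress to the bound and its saving versus UST."""
    rows = {}
    for name, (ust, _tip, ess, bound) in table.items():
        rows[name] = {
            "gap_to_bound_pct": round(overhead_pct(ess, bound), 2),
            "reduction_vs_ust_pct": round(100.0 * (ust - ess) / ust, 2),
        }
    return rows


if __name__ == "__main__":
    for name, stats in summarize(TABLE2).items():
        print(f"{name}: {stats}")
```

On these rounded figures, the gap between ESS-Compress and the lower bound stays below 3% for every dataset, with Human RNA-seq showing the largest gap and Soybean RNA-seq showing none at two-decimal precision.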
