2022 Aug 24;38(20):4812–4813. doi: 10.1093/bioinformatics/btac564

Fig. 1.

(a) Linear increase in the wall-clock time required by ntHash2 to generate spaced seed hashes (Seeds 1–6) from 1 million random 1 kbp sequences (one hash value per seed). Hashing spaced seeds with more blocks and monomers takes more time. (b) Histogram of 1 million k-mer hashes generated by ntHash2 from random 100-mers. Hash values (H) are distributed uniformly in the normalized 64-bit word space (x-axis). The mean and standard deviation of the bin counts are 1000 ± 31.29, which is close to the ideal value of 1000 hashes per bin. (c) Average wall-clock time taken by ntHash2 and similar hashing algorithms on a unique dataset (10⁶ random 1 kbp sequences) over 3 runs. The standard deviation was negligible (<500 ms for all tools). Spaced seed patterns (Seeds 1–6) are described in Supplementary Section S6.
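The uniformity check in panel (b) can be illustrated with a small sketch. It does not call ntHash2: Python's `random.getrandbits` stands in for an ideal 64-bit hash, and the bin count of 1000 is an inference from the caption (1 million hashes at an ideal 1000 per bin), not a value stated in the source. Under those assumptions, the bin counts of a perfectly uniform hash would have a standard deviation near sqrt(n·p·(1−p)) ≈ 31.6, which is the baseline against which a reported spread of 31.29 reads as close to ideal.

```python
import random
import statistics

def bin_counts(hashes, num_bins=1000, word_bits=64):
    """Count hashes per equal-width bin of the normalized hash space [0, 1)."""
    counts = [0] * num_bins
    for h in hashes:
        # Integer arithmetic avoids float rounding near the top of the range.
        counts[(h * num_bins) >> word_bits] += 1
    return counts

# Stand-in for an ideal hash function (assumption: NOT ntHash2 output).
rng = random.Random(0)
num_hashes = 1_000_000
num_bins = 1000  # inferred from 1,000,000 hashes / 1000 per bin
hashes = [rng.getrandbits(64) for _ in range(num_hashes)]

counts = bin_counts(hashes, num_bins)
mean = statistics.fmean(counts)
std = statistics.pstdev(counts)

# Spread expected for a perfectly uniform hash (binomial count per bin).
p = 1 / num_bins
expected_std = (num_hashes * p * (1 - p)) ** 0.5

print(f"bin counts: {mean:.0f} ± {std:.2f} (ideal spread ≈ {expected_std:.2f})")
```

A spread far above the ideal value would suggest clustering of hash values, while a spread far below it would suggest the values are more regular than random.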