2019 Sep 13;20:199. doi: 10.1186/s13059-019-1809-x

Table 1.

Glossary of terms

| Term | Definition |
| --- | --- |
| Bit-pattern observable | A run of 0s in a binary string (e.g. the number of leading 0s) |
| Bit vector | An array data structure that holds bits |
| Canonical k-mer | The smaller of the hash values of a k-mer and its reverse complement |
| Hash function | A function that takes input data of arbitrary size and maps it to a fixed-size bit string that is typically smaller than the input |
| Jaccard similarity | A similarity measure defined as the size of the intersection of two sets divided by the size of their union |
| K-mer decomposition | The process of extracting all contiguous sub-sequences of length k from a sequence |
| Minimizer | The smallest hash value in a set |
| Multiset | A set that allows multiple instances of each of its elements (i.e. it records element frequency) |
| Register | A quickly accessible bit vector used to hold information |
| Sketch | A compact data structure that approximates a data set |
| Stochastic averaging | A process used to reduce the variance of an estimator |
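
Several of the glossary terms can be made concrete in a few lines of Python. The following is a minimal sketch under stated assumptions, not the implementation used by any particular tool: the function names (`kmers`, `hash64`, `canonical_hash`, `leading_zeros`, `jaccard`, `bottom_s_sketch`) are hypothetical, the 64-bit BLAKE2b hash is an arbitrary choice of hash function, and keeping the `s` smallest hash values is only one common way to build a sketch.

```python
import hashlib

# Translation table for DNA bases (assumes an uppercase ACGT alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """K-mer decomposition: all contiguous sub-sequences of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hash64(kmer):
    """Hash function: map a k-mer to a fixed-size (64-bit) integer.

    BLAKE2b is an illustrative choice, not one prescribed by the glossary.
    """
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def canonical_hash(kmer):
    """Canonical k-mer: the smaller hash of a k-mer and its reverse complement."""
    return min(hash64(kmer), hash64(reverse_complement(kmer)))


def leading_zeros(value, width=64):
    """Bit-pattern observable: number of leading 0s in a width-bit value."""
    return width - value.bit_length()


def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def bottom_s_sketch(hashes, s):
    """Sketch: keep only the s smallest hash values of a set."""
    return sorted(hashes)[:s]


if __name__ == "__main__":
    set_a = {canonical_hash(x) for x in kmers("ACGTACGTTGCA", 4)}
    set_b = {canonical_hash(x) for x in kmers("ACGTACGATGCA", 4)}
    print("minimizer of A:", min(set_a))
    print("exact Jaccard:", jaccard(set_a, set_b))
    print("sketch of A:", bottom_s_sketch(set_a, 3))
```

Because `canonical_hash` takes the minimum over both strands, a k-mer and its reverse complement always map to the same value, so a sequence and its reverse complement yield the same set of hashes. The minimizer of a set is simply its smallest hash value, which is also the first element of the sorted sketch.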