2017 May 16;8:15311. doi: 10.1038/ncomms15311

Figure 1. Overview of Balaur.

(a) Illustration of the MHG index construction (IC) and candidate contig selection (P1) protocols for a given reference window or read sequence, respectively. Given the sequence MinHash fingerprint (computed using L random hash functions h1, …, hL), we store the sequence in T MHG buckets according to the values of the T fingerprint projections (during IC), or we look up and merge the contents of the corresponding T buckets (during P1). (b) Illustration of the kmer voting protocol. The read and candidate contig sequences are decomposed into sets of kmers; initial LRS and GRS repeats are illustrated using identical shapes; the results of cryptographically hashing the kmers using different keys are shown using different colours; the results of masking LRS are shown using different stripe patterns. For simplicity, the voting results are shown without LRS neighbour masking and kmer position binning.
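The bucket scheme in panel (a) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Balaur's implementation: the class name `MHGIndex`, the parameter `b` (number of fingerprint entries per projection), the use of seeded BLAKE2b as the L hash functions, and the random choice of projection indices are all hypothetical choices made for this example.

```python
import hashlib
import random
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seeded_hash(kmer, seed):
    """One of the L hash functions: BLAKE2b over (seed, kmer) as a 64-bit int."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_fingerprint(seq, k, L):
    """Fingerprint = minimum hash value over the kmer set, for each of L hashes."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return tuple(min(seeded_hash(km, seed) for km in ks) for seed in range(L))


class MHGIndex:
    """Toy index with T tables; each table is keyed by one fingerprint projection."""

    def __init__(self, k=8, L=16, T=4, b=2, seed=0):
        self.k, self.L = k, L
        rng = random.Random(seed)
        # Assumption: each projection picks b of the L fingerprint entries at random.
        self.proj = [tuple(rng.sample(range(L), b)) for _ in range(T)]
        self.tables = [defaultdict(set) for _ in range(T)]

    def _keys(self, seq):
        fp = minhash_fingerprint(seq, self.k, self.L)
        return [tuple(fp[i] for i in idx) for idx in self.proj]

    def add(self, name, seq):
        """Index construction (IC): store the window in T buckets."""
        for table, key in zip(self.tables, self._keys(seq)):
            table[key].add(name)

    def query(self, seq):
        """Candidate selection (P1): look up T buckets and merge their contents."""
        hits = set()
        for table, key in zip(self.tables, self._keys(seq)):
            hits |= table.get(key, set())
        return hits


index = MHGIndex()
index.add("window_1", "ACGTACGTTAGCCGATAGGCTTAACGGATCCA")
candidates = index.query("ACGTACGTTAGCCGATAGGCTTAACGGATCCA")
```

A read that shares most of its kmers with an indexed window is likely to agree with it on at least one projection and therefore land in a common bucket; raising `T` increases that chance, while raising `b` makes each bucket more selective.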