Bioinformatics. 2023 Jun 30;39(Suppl 1):i260–i269. doi: 10.1093/bioinformatics/btad233

© The Author(s) 2023. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 3. The Matrix-SBWT k-mer index and the mapping to color sets. This figure continues from the example in Fig. 2. The columns of the SBWT matrix represent the k-mers of the input data, with technical dummy prefixes containing dollar symbols added to the k-mers ending in the first k positions of the input sequences. The k-mers are shown vertically at the top (for illustration purposes only; they are not explicitly stored), and the SBWT matrix is the binary matrix in the middle with 4 rows. Each row corresponds to a character of the alphabet, and a 1-bit at cell (i, j) indicates that the jth k-mer has a different (k-1)-suffix from the previous k-mer and has an outgoing edge such that the last character of the edge (k+1)-mer is the ith character of the alphabet. See Alanko et al. (2022) for a more in-depth explanation. The columns shaded in gray are the key k-mers, which are also marked in the bit vector below the SBWT matrix. The key k-mers are associated with the color sets at the bottom. The sparse sets are encoded as lists of integers, whereas the dense sets are encoded as bit maps. The mapping from key k-mers to the color sets, which is represented by lines in the figure, is implemented by marking in another bit vector (not pictured) whether each set is sparse or dense, and using a bit vector rank query to find the index of the set within the color sets of its type (sparse or dense). Color sets of a single type are stored in concatenated form, with pointers to the starts of the sets.
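
The sparse/dense lookup described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the class names `RankBitVector` and `ColorSetStore`, the `dense_threshold` parameter, and the use of plain Python lists with a prefix-sum table in place of succinct bit vectors are all hypothetical. It also assumes the color set identifier of a key k-mer is already known, and shows only how the type bit vector and a rank query locate that set inside the concatenated storage of its type.

```python
from itertools import accumulate


class RankBitVector:
    """Bit vector with a prefix-sum table so rank queries are table lookups."""

    def __init__(self, bits):
        self.bits = list(bits)
        # prefix[i] = number of 1-bits in bits[0:i]
        self.prefix = [0] + list(accumulate(self.bits))

    def rank1(self, i):
        """Number of 1-bits strictly before position i."""
        return self.prefix[i]

    def rank0(self, i):
        """Number of 0-bits strictly before position i."""
        return i - self.prefix[i]


class ColorSetStore:
    """Hybrid storage: sparse sets as integer lists, dense sets as bit maps."""

    def __init__(self, color_sets, n_colors, dense_threshold=0.5):
        # dense_threshold is an assumed cutoff; the caption does not give one.
        self.sparse_data, self.sparse_starts = [], [0]
        self.dense_data, self.dense_starts = [], [0]
        type_bits = []
        for s in color_sets:
            dense = len(s) > dense_threshold * n_colors
            type_bits.append(1 if dense else 0)
            if dense:
                # Bit map with one bit per color, appended to the dense block.
                self.dense_data.extend(1 if c in s else 0 for c in range(n_colors))
                self.dense_starts.append(len(self.dense_data))
            else:
                # Sorted list of color ids, appended to the sparse block.
                self.sparse_data.extend(sorted(s))
                self.sparse_starts.append(len(self.sparse_data))
        # 1 marks a dense set, 0 marks a sparse set.
        self.is_dense = RankBitVector(type_bits)

    def get(self, set_id):
        """Return the sorted list of colors in the set with the given id."""
        if self.is_dense.bits[set_id]:
            j = self.is_dense.rank1(set_id)  # index among the dense sets
            lo, hi = self.dense_starts[j], self.dense_starts[j + 1]
            return [c for c, b in enumerate(self.dense_data[lo:hi]) if b]
        j = self.is_dense.rank0(set_id)  # index among the sparse sets
        lo, hi = self.sparse_starts[j], self.sparse_starts[j + 1]
        return self.sparse_data[lo:hi]
```

The rank query is what lets the two kinds of sets live in separate concatenated arrays: the number of dense (or sparse) sets preceding a given set id is exactly that set's position among the sets of its own type, so one array of start pointers per type is enough.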