2017 Nov 24;34(10):1659–1665. doi: 10.1093/bioinformatics/btx753

Fig. 2.

Size of BAM files vs IKC files for each sample. Because low-frequency k-mers are removed, the size of an IKC file stops growing with read depth once a full set of representative k-mers is present. Because BAM files contain a record for each read, their size does increase with the number of reads. Samples removed for suspected contamination and low coverage are shown in red. Low-coverage samples lack a representative set of k-mers at sufficient frequency, so their file size falls below the distribution. Contaminated samples contain k-mers that do not belong to the sample, so their file size rises above the distribution. (Color version of this figure is available at Bioinformatics online.)
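
The saturation effect described in the caption can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the IKC format or the authors' tool: the random genome, the function name `count_kept_kmers`, and the values for read length, k, and the frequency cutoff `min_count` are all hypothetical choices made for illustration. The number of reads stands in for BAM size (one record per read), and the number of k-mers passing the cutoff stands in for IKC size.

```python
import random
from collections import Counter


def count_kept_kmers(genome, n_reads, read_len=50, k=11, min_count=3, seed=0):
    """Sample error-free reads uniformly from `genome` and return how many
    distinct k-mers occur at least `min_count` times across all reads."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        read = genome[start:start + read_len]
        for i in range(read_len - k + 1):
            counts[read[i:i + k]] += 1
    # Low-frequency k-mers are dropped, as in the caption.
    return sum(1 for c in counts.values() if c >= min_count)


# Hypothetical 2 kb random "genome" used only for this illustration.
genome_rng = random.Random(42)
genome = "".join(genome_rng.choice("ACGT") for _ in range(2000))
total_kmers = len({genome[i:i + 11] for i in range(len(genome) - 11 + 1)})

for n_reads in (50, 200, 800, 3200):
    kept = count_kept_kmers(genome, n_reads)
    # n_reads grows without bound; kept can never exceed total_kmers.
    print(f"reads={n_reads:5d}  kept k-mers={kept:5d}  genome k-mers={total_kmers}")
```

In this sketch the read count keeps growing while the kept k-mer count is capped by the number of distinct k-mers in the genome, which mirrors the plateau of IKC size against BAM size. A low-depth run stays well below that cap, and reads drawn from a second, unrelated genome would add k-mers beyond it, corresponding to the low-coverage and contaminated outliers in the figure.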