Author manuscript; available in PMC: 2016 Jan 20.
Published in final edited form as: Nat Biotechnol. 2015 Sep 14;33(10):1053–1060. doi: 10.1038/nbt.3329

Figure 3.

Latent Semantic Analysis pipeline. Metagenomic samples containing multiple species (depicted by different colors) are sequenced. Every k-mer in every sequencing read is hashed to one column of a matrix, and the values from each sample occupy a different row. Singular value decomposition of this k-mer abundance matrix defines a set of eigengenomes. K-mers are clustered across eigengenomes, and each read is partitioned based on the intersection of its k-mers with each of these clusters. Each partition contains a small fraction of the original data and can be analyzed independently of all others.
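
The steps in this caption can be illustrated with a toy sketch. It is not the authors' implementation: the k-mer length `K`, the column count `N_COLS`, the CRC32-based `column` hash, and the helper names are all hypothetical choices made for illustration. In particular, assigning each column to the eigengenome with the largest absolute loading is a simplified stand-in for the k-mer clustering step, and a read is assigned to the cluster that most of its k-mers fall into.

```python
import zlib
import numpy as np

K = 4        # toy k-mer length (assumption, far shorter than in practice)
N_COLS = 64  # number of hashed matrix columns (assumption)

def kmers(read, k=K):
    """Return every overlapping k-mer in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def column(kmer, n_cols=N_COLS):
    """Hash a k-mer to a column index (CRC32 is a stand-in hash)."""
    return zlib.crc32(kmer.encode()) % n_cols

def abundance_matrix(samples, n_cols=N_COLS):
    """Build the k-mer abundance matrix: one row per sample, one column per hash."""
    m = np.zeros((len(samples), n_cols))
    for row, reads in enumerate(samples):
        for read in reads:
            for km in kmers(read):
                m[row, column(km)] += 1
    return m

def cluster_columns(matrix, n_components=2):
    """Assign each column to the eigengenome with the largest absolute loading.

    The rows of vt from the SVD play the role of eigengenomes here;
    argmax over loadings is a simplification of the clustering step.
    """
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)
    return np.argmax(np.abs(vt[:n_components]), axis=0)

def partition_read(read, col_cluster):
    """Assign a read to the cluster that holds most of its k-mers."""
    n_clusters = int(col_cluster.max()) + 1
    votes = np.bincount([col_cluster[column(km)] for km in kmers(read)],
                        minlength=n_clusters)
    return int(np.argmax(votes))

# Three toy "samples", each a list of reads.
samples = [
    ["ACGTACGTAC", "TTGACCATGA"],
    ["ACGTACGTAC"],
    ["TTGACCATGA", "TTGACCATGA"],
]
matrix = abundance_matrix(samples)
col_cluster = cluster_columns(matrix, n_components=2)
partitions = [partition_read(r, col_cluster) for reads in samples for r in reads]
```

Because each read is assigned using only the fixed column-to-cluster map, reads can be partitioned one at a time, which is consistent with the caption's point that each partition can then be analyzed on its own.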