Author manuscript; available in PMC: 2014 Nov 1.
Published in final edited form as: Nat Biotechnol. 2014 Apr 20;32(5):462–464. doi: 10.1038/nbt.2862

Figure 1.

Overview of the Sailfish pipeline. Sailfish consists of an indexing phase (a) that is invoked via the command 'sailfish index' and a quantification phase (b) invoked via the command 'sailfish quant'. The Sailfish index has four components: (1) a perfect hash function mapping each k-mer in the transcript set to a unique integer between 0 and N – 1, where N is the number of unique k-mers in the set of transcripts; (2) an array recording the number of times each k-mer occurs in the reference set; (3) an index mapping each transcript to the multiset of k-mers that it contains; (4) an index mapping each k-mer to the set of transcripts in which it appears. The quantification phase consists of counting the indexed k-mers in the set of reads and then applying an expectation-maximization procedure to determine the maximum-likelihood estimates of relative transcript abundance.
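The two phases described in the caption can be illustrated with a toy Python sketch. It is not Sailfish's implementation: a plain dictionary stands in for the perfect hash function, the EM step is a simplified mixture model that assumes each observed k-mer is drawn uniformly from the k-mers of its source transcript, and the returned value is the estimated fraction of k-mers attributed to each transcript rather than a length-normalized abundance. All function names (`build_index`, `count_read_kmers`, `em_abundance`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(transcripts, k):
    """Build toy versions of the four index components from the caption."""
    # (3) transcript -> multiset of the k-mers it contains
    tx_kmers = {name: Counter(kmers(seq, k)) for name, seq in transcripts.items()}
    # (1) k-mer -> unique integer in [0, N - 1]; a dict stands in for the perfect hash
    kmer_id = {}
    for counts in tx_kmers.values():
        for km in counts:
            kmer_id.setdefault(km, len(kmer_id))
    # (2) array of occurrence counts in the reference, indexed by k-mer id
    ref_count = [0] * len(kmer_id)
    # (4) k-mer -> set of transcripts in which it appears
    kmer_tx = defaultdict(set)
    for name, counts in tx_kmers.items():
        for km, c in counts.items():
            ref_count[kmer_id[km]] += c
            kmer_tx[km].add(name)
    return kmer_id, ref_count, tx_kmers, dict(kmer_tx)


def count_read_kmers(reads, kmer_id, k):
    """Count only the indexed k-mers in the reads; unknown k-mers are skipped."""
    counts = [0] * len(kmer_id)
    for read in reads:
        for km in kmers(read, k):
            idx = kmer_id.get(km)
            if idx is not None:
                counts[idx] += 1
    return counts


def em_abundance(read_counts, kmer_id, tx_kmers, kmer_tx, iters=100):
    """Simplified EM for the fraction of observed k-mers owed to each transcript."""
    names = list(tx_kmers)
    length = {t: sum(tx_kmers[t].values()) for t in names}  # k-mers per transcript
    theta = {t: 1.0 / len(names) for t in names}            # uniform starting point
    for _ in range(iters):
        alloc = {t: 0.0 for t in names}
        # E-step: split each k-mer's read count among the transcripts containing it
        for km, idx in kmer_id.items():
            if read_counts[idx] == 0:
                continue
            weights = {t: theta[t] * tx_kmers[t][km] / length[t] for t in kmer_tx[km]}
            total = sum(weights.values())
            if total == 0:
                continue
            for t, w in weights.items():
                alloc[t] += read_counts[idx] * w / total
        # M-step: renormalize the allocated counts into mixture proportions
        norm = sum(alloc.values())
        if norm == 0:
            break
        theta = {t: alloc[t] / norm for t in names}
    return theta
```

A typical call would build the index once with `build_index(transcripts, k)`, pass the resulting `kmer_id` to `count_read_kmers(reads, kmer_id, k)`, and hand the counts to `em_abundance`. Separating the two steps mirrors the caption's split between 'sailfish index' and 'sailfish quant': the index depends only on the transcript set and can be reused across read sets.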