Figure 6.
Description of the core assembly algorithm. (A) Pseudocode overview of the steps during assembly of a single contig. The parameter o_min controls the stringency of the algorithm, and r denotes the read length. (B) Illustration of the elongation step. Contig C is to be elongated to the right. Read R, found in the data set of reads, is a candidate for elongation because its prefix (gray) matches the end of C perfectly. The suffix of read R (white) is the potential extension E for contig C. The length of the check region M is the sum of the read length r and the length of the extension E. Substrings of M and of its reverse complement are used to search the data set for reads with matching prefixes. C is extended by E only if all of these reads match M exactly.
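The elongation step in panel (B) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the stringency parameter is treated as a minimum overlap length (named `o_min` in the code), the check region M is taken to be the last r bases of C followed by E, reads are held in a plain set of equal-length strings, and the function names `try_extend` and `reads_agree_with` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


def reads_agree_with(region, reads, o_min):
    """Check every read whose first o_min bases occur in `region`.

    Assumption: a read "matches" if it is identical to the region over
    the whole stretch where the two overlap.
    """
    for i in range(len(region) - o_min + 1):
        seed = region[i:i + o_min]
        for read in reads:
            if read.startswith(seed):
                overlap = region[i:i + len(read)]
                if read[:len(overlap)] != overlap:
                    return False
    return True


def try_extend(contig, reads, o_min):
    """Try to elongate `contig` to the right by one extension E.

    Returns the extended contig, or the unchanged contig if no candidate
    read is found or if any read contradicts the check region M.
    """
    r = len(next(iter(reads)))  # assumes a non-empty set of equal-length reads
    longest = min(r - 1, len(contig))
    for overlap_len in range(longest, o_min - 1, -1):
        tail = contig[-overlap_len:]
        for read in sorted(reads):  # sorted only to make the sketch deterministic
            if read.startswith(tail):
                extension = read[overlap_len:]    # E: suffix of R
                check = contig[-r:] + extension   # M: up to r + |E| bases
                if (reads_agree_with(check, reads, o_min)
                        and reads_agree_with(reverse_complement(check), reads, o_min)):
                    return contig + extension
                return contig  # assumption: stop elongating on a conflict
    return contig


reads = {"ACGGTC", "CGGTCA", "GGTCAT", "GTCATT", "TCATTG"}
extended = try_extend("ACGGTC", reads, o_min=4)
blocked = try_extend("ACGGTC", reads | {"CGGTCT"}, o_min=4)
```

In this toy data set the first call extends the contig by a single base, because every read that overlaps M agrees with it. The second call adds a read that disagrees with M at its last base, so the contig is left unchanged. A practical implementation would presumably use an indexed structure for prefix lookup rather than scanning all reads.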