Skip to main content
. 2015 Feb 12;1(1):e1400234. doi: 10.1126/sciadv.1400234

Fig. 5. Long tandem repeats of the Cen1-like consensus are detected in PacBio single sequence reads.

Fig. 5

(A) Maps of BLASTN hits (boxes, where gray horizontal bars represent 100% identity, vertical red lines represent mismatches, and vertical black lines represent indels) in raw PacBio reads. Displayed are the 10 PacBio single sequence reads (indicated by their sequence read identifier) with the highest bit scores in a MegaBLAST search of SRR1304331 using the Cen1-like 340-bp query. Alternating hits are shown in two tiers for visual clarity. We attribute gaps in the array to the ~15% mostly indel error rate characteristic of PacBio raw data, an interpretation that is supported by the near-perfect alignment of BLAST hits to the 340-bp tiling shown as tandem black diamonds at bottom. (B) A consensus sequence was derived for each of the raw sequences indicated in (A) by automated alignment of the tandem BLAST hits, and a dendrogram was produced, rooting the tree with the Cen1-like consensus. (C) Alignment of the Cen1-like consensus (top sequence) identifies 44 ambiguous residues (indicated as “u” or “s”) and six indels (indicated as dashes) in the overall PacBio-derived consensus (bottom sequence) over the 340-bp sequence.