2004 Apr;14(4):721–732. doi: 10.1101/gr.2264004

Table 2.

Building eBACs

Number of                          Average (reads)   Std. Dev. (reads)
Distinct WGS in overlaps (a)       2342              2011
WGS passing rarity (b)             1431              352
WGS passing overlap qual (b)       2276              1947
WGS passing both (b)               1417              351
WGS binned with BAC (c)            1314              310
WGS binned + mates                 1757              390
WGS in Phrap contigs               1675              368
(a) Distinct WGS in all overlaps produced by the overlapper with 95% identity and 100 k-mer copies allowed.

(b) Filtering done in Binner based only on overlapper information. The repeat heuristic limits k-mer copies to 12 (three times the coverage). The overlap quality heuristic requires 3 × span / (3 + span − score) ≥ 35, where score is the banded alignment score; 2 × span / (2 + span − score) would approximate the average distance between discrepancies if there were only substitutions (indels carry added penalties).
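The two filters in this footnote can be illustrated with a minimal sketch. The function names (`overlap_quality`, `passes_quality`, `passes_rarity`) are hypothetical and not taken from Binner. The sketch assumes that "span-score" means span minus score, that the score never exceeds the span, and that the limit of 12 k-mer copies is inclusive.

```python
def overlap_quality(span: float, score: float, k: float = 3.0) -> float:
    """Quality value k * span / (k + span - score).

    Assumes score <= span, so the denominator is always at least k.
    """
    return k * span / (k + span - score)


def passes_quality(span: float, score: float, threshold: float = 35.0) -> bool:
    """Overlap quality heuristic: 3 * span / (3 + span - score) >= 35."""
    return overlap_quality(span, score, k=3.0) >= threshold


def passes_rarity(kmer_copies: int, max_copies: int = 12) -> bool:
    """Repeat heuristic: at most 12 k-mer copies (inclusive bound is an assumption)."""
    return kmer_copies <= max_copies


def passes_both(span: float, score: float, kmer_copies: int) -> bool:
    """An overlap counted under 'passing both' must satisfy both heuristics."""
    return passes_rarity(kmer_copies) and passes_quality(span, score)
```

Under these assumptions, a 500-base overlap whose score is 20 below its span gives 1500 / 23 ≈ 65 and passes, while a 100-base overlap with a score 10 below its span gives 300 / 13 ≈ 23 and fails.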

(c) Beyond the k-mer repeat and quality heuristics, only the top six (i.e., coverage × 1.5) WGS overlaps from each end of a BAC read are examined, and they are kept only if strictly better by the quality heuristic than the top discarded overlap.
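One possible reading of this selection rule is sketched below. The function `select_top_overlaps` is hypothetical, and the interpretation is an assumption: the overlaps at one end of a BAC read are ranked by quality value, the best six are examined, and each is kept only if its quality is strictly greater than that of the best overlap outside the top six.

```python
from typing import List


def select_top_overlaps(qualities: List[float], limit: int = 6) -> List[float]:
    """Keep up to `limit` best overlaps at one end of a BAC read.

    An examined overlap is kept only if it is strictly better than the
    best discarded overlap (the first one ranked below the limit).
    """
    ranked = sorted(qualities, reverse=True)
    examined, discarded = ranked[:limit], ranked[limit:]
    if not discarded:
        # Nothing was discarded, so there is no cutoff to compare against.
        return examined
    cutoff = discarded[0]
    return [q for q in examined if q > cutoff]
```

With eight overlaps of quality 90, 80, 70, 60, 50, 40, 40, and 30, the sixth-ranked overlap ties the best discarded one at 40, so under this reading only the first five are kept.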