2012 Jun 25;13(6):R56. doi: 10.1186/gb-2012-13-6-r56

Table 1.

Gap closure results obtained on the bacterial datasets

Method Original IMAGE SOAPdenovo GapFiller GapFiller-LC
Escherichia coli
 Genome size (bp) 4,478,287 4,530,961 4,490,973 4,490,638
 Scaffolds 179 179 179 179
 Gap count 544 291 16 11
 Total gap length (bp) 12,516 2,861 16 130
 Errors (SNPs) 12 40 33 22
 Errors (indels) 4 17 25 9
 Errors (misjoins) 1 1 1 1
 N50 50,557 50,558 50,558 50,558
Streptomyces coelicolor
 Genome size (bp) 8,558,275 8,576,331 8,557,720 8,558,333
 Scaffolds 115 115 115 115
 Gap count 158 63 60 23
 Total gap length (bp) 9,221 4,009 1,288 806
 Errors (SNPs) 299 423 406 280
 Errors (indels) 664 677 769 686
 Errors (misjoins) 12 17 18 18
 N50 173,822 173,822 173,822 173,822
Staphylococcus aureus
 Genome size (bp) 2,880,676 2,880,926 2,881,756 2,883,448
 Scaffolds 19 19 19 19
 Gap count 48 27 27 22
 Total gap length (bp) 9,900 1,547 5,508 1,861
 Errors (SNPs) 79 260 98 173
 Errors (indels) 16 53 26 37
 Errors (misjoins) 4 13 7 5
 N50 1,091,731 1,091,333 1,092,281 1,092,421
Rhodobacter sphaeroides
 Genome size (bp) 4,609,785 4,609,466 4,609,596 4,610,796
 Scaffolds 38 38 38 38
 Gap count 170 163 161 139
 Total gap length (bp) 21,409 14,166 20,667 17,625
 Errors (SNPs) 218 410 230 300
 Errors (indels) 187 294 190 199
 Errors (misjoins) 6 10 6 7
 N50 3,192,334 3,192,075 3,192,215 3,192,974

Gap closure results obtained on four bacterial datasets show that the GapFiller strategy yields the most accurate finished genomes and a lower gap count than the other methods. The IMAGE method significantly underperforms on all quality measures and is therefore not the preferred method. Differences between GapFiller and SOAPdenovo are smaller. Interestingly, whereas the gap count after closure is generally lower for GapFiller, SOAPdenovo yields a shorter total gap length in three cases, which suggests that SOAPdenovo is able to close larger gaps. Strikingly, however, the number of errors is significantly higher for SOAPdenovo regardless of the error type (SNPs, indels and misjoins). Even when less strict settings are applied for GapFiller (GapFiller-LC: minimum coverage o = 1, ratio r = 0.5) to shorten the total gap length, our method still yields significantly fewer errors.
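The distinction between gap count and total gap length can be illustrated with a minimal Python sketch. It assumes that a gap is a maximal run of N characters within a scaffold sequence, which is the usual convention but is not stated in the table itself. The function name gap_stats and the toy sequences are hypothetical and are not taken from the datasets above.

```python
import re

def gap_stats(scaffolds):
    """Return (gap_count, total_gap_length) for an iterable of scaffold sequences.

    Assumption: a gap is a maximal run of 'N' (or 'n') characters.
    """
    gap_run = re.compile(r"[Nn]+")
    gap_count = 0
    total_gap_length = 0
    for seq in scaffolds:
        for match in gap_run.finditer(seq):
            gap_count += 1
            total_gap_length += len(match.group())
    return gap_count, total_gap_length

# Toy scaffolds (made up for illustration, not from the paper's datasets).
many_small = ["ACGTNNACGTNNACGTNNACGT"]                 # three gaps of 2 bp each
few_large = ["ACGTNNNNNNNNNNACGTACGTNNNNNNNNNNACGT"]    # two gaps of 10 bp each

print(gap_stats(many_small))  # (3, 6)
print(gap_stats(few_large))   # (2, 20)
```

In this toy case the second assembly has fewer gaps but a longer total gap length, because the gaps that remain open are the large ones. This is the same pattern the note above describes when one method closes more gaps while another closes larger gaps.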
