Table 2.
Gap closure results obtained on the eukaryotic datasets
Method | Original | SOAPdenovo | GapFiller |
---|---|---|---|
Saccharomyces cerevisiae | |||
Genome size (bp) | 11,388,647 | 11,388,600 | 11,388,609 |
Scaffolds | 334 | 334 | 334 |
Gap count | 283 | 67 | 45 |
Total gap length (bp) | 19,358 | 994 | 2,873 |
Errors (SNPs) | 890 | 1,033 | 931 |
Errors (indels) | 565 | 754 | 648 |
Errors (misjoins) | 23 | 42 | 31 |
N50 | 84,640 | 84,640 | 84,649 |
Homo sapiens (chromosome 14) | |||
Genome size (bp) | 95,081,274 | 95,059,687 | 95,072,801 |
Scaffolds | 19,249 | 19,249 | 19,249 |
Gap count | 2,820 | 1,986 | 1,682 |
Total gap length (bp) | 949,137 | 423,107 | 699,550 |
Errors (SNPs) | 76,653 | 79,266 | 76,928 |
Errors (indels) | 21,261 | 23,144 | 22,338 |
Errors (misjoins) | 179 | 224 | 187 |
N50 | 7,748 | 8,262 | 8,469 |
The results of SOAPdenovo and GapFiller on the S. cerevisiae and human genomes show that both methods can also close gaps in eukaryotic assemblies. The patterns resemble those observed for bacteria: overall, GapFiller yields the most reliable results and the lowest gap count (45 vs. 67 in S. cerevisiae; 1,682 vs. 1,986 in human), whereas SOAPdenovo yields a substantially shorter total gap length (994 bp vs. 2,873 bp in S. cerevisiae; 423,107 bp vs. 699,550 bp in human), though at the cost of higher SNP, indel, and misjoin counts on both datasets. In human, the reduced genome size (21,587 bp shorter than the original) and total gap length obtained by SOAPdenovo, together with its increased indel and misjoin counts, might indicate that some gaps were closed by collapsing (repeated) elements.
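
The trade-off described above can be made concrete by expressing each method's values as a percent change relative to the original assembly. The following is a minimal sketch that only re-expresses the numbers in Table 2. The names `TABLE2`, `percent_change`, and `summarize` are illustrative choices and are not part of either tool.

```python
# Values copied from Table 2, ordered as (Original, SOAPdenovo, GapFiller).
# The dictionary layout and all names here are illustrative assumptions.
TABLE2 = {
    "S. cerevisiae": {
        "gap_count": (283, 67, 45),
        "gap_length_bp": (19_358, 994, 2_873),
        "snps": (890, 1_033, 931),
        "indels": (565, 754, 648),
        "misjoins": (23, 42, 31),
    },
    "H. sapiens chr14": {
        "gap_count": (2_820, 1_986, 1_682),
        "gap_length_bp": (949_137, 423_107, 699_550),
        "snps": (76_653, 79_266, 76_928),
        "indels": (21_261, 23_144, 22_338),
        "misjoins": (179, 224, 187),
    },
}

METHODS = ("SOAPdenovo", "GapFiller")


def percent_change(original, new):
    """Return the change from `original` to `new` in percent."""
    return 100.0 * (new - original) / original


def summarize(dataset):
    """Map each method to its percent change per metric, rounded to 0.1."""
    rows = TABLE2[dataset]
    summary = {}
    # Index 0 is the original assembly; methods start at index 1.
    for i, method in enumerate(METHODS, start=1):
        summary[method] = {
            metric: round(percent_change(values[0], values[i]), 1)
            for metric, values in rows.items()
        }
    return summary


if __name__ == "__main__":
    for dataset in TABLE2:
        for method, changes in summarize(dataset).items():
            print(dataset, method, changes)
```

On the human dataset, this gives a reduction in total gap length of about 55% for SOAPdenovo versus about 26% for GapFiller, while misjoins rise by about 25% versus about 4.5%.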