Table 2.
Comparison between genome assemblies
Sequence identity level | February 2002 assembly* | February 2003 assembly |
Duplication content (bp) | | |
90-92% | 4,966,470 | 3,543,429 |
92-94% | 15,685,840 | 13,981,642 |
94-96% | 17,533,730 | 17,970,287 |
96-98% | 11,539,392 | 11,731,958 |
98-99.5% | 5,865,024 | 5,487,899 |
†Potential sequence misassignment error detected (bp) | | |
99.5-100% | 4,832,594 | 18,456,096 |
Duplication content by sequence identity, and potential sequence misassignment errors, compared between the February 2002 (MGSCv3) and February 2003 (a hybrid assembly of MGSCv3 with 705 Mb of finished BAC sequence) genome assemblies. *Analysis of the duplication content for the February 2002 assembly can be found at [14]. †Duplications detected at extremely high percent identity (99.5-100%) are likely to be genome assembly artifacts and were not included in the duplication content shown in Table 1.
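As a rough aid to reading Table 2, the per-assembly totals can be recomputed from the rows. The sketch below only sums the values transcribed from the table and takes the ratio of the two misassignment figures. The variable and function names are illustrative and are not part of the original analysis, and treating the five identity bins from 90% to 99.5% as a single total is an assumption made here for convenience.

```python
# Values transcribed from Table 2, in base pairs.
# Each tuple is (February 2002 assembly, February 2003 assembly).
# All names here are illustrative, not taken from the original analysis.
DUPLICATION_BP = {
    "90-92%": (4_966_470, 3_543_429),
    "92-94%": (15_685_840, 13_981_642),
    "94-96%": (17_533_730, 17_970_287),
    "96-98%": (11_539_392, 11_731_958),
    "98-99.5%": (5_865_024, 5_487_899),
}

# The 99.5-100% row, flagged in the table as potential misassignment.
MISASSIGNED_BP = (4_832_594, 18_456_096)


def column_totals(rows):
    """Sum each assembly's column across all identity bins."""
    total_2002 = sum(values[0] for values in rows.values())
    total_2003 = sum(values[1] for values in rows.values())
    return total_2002, total_2003


total_2002, total_2003 = column_totals(DUPLICATION_BP)

# Ratio of flagged bases in the 2003 assembly relative to 2002.
fold_change = MISASSIGNED_BP[1] / MISASSIGNED_BP[0]

print(f"Duplication content, Feb 2002: {total_2002:,} bp")
print(f"Duplication content, Feb 2003: {total_2003:,} bp")
print(f"Misassignment, 2003 / 2002: {fold_change:.2f}x")
```

On these figures the duplication content in the 90-99.5% identity range differs by under 3 Mb between the two assemblies, while the sequence flagged at 99.5-100% identity is nearly four times larger in the February 2003 assembly.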