Sci Rep. 2015 Jun 3;5:10814. doi: 10.1038/srep10814

Table 1. Evaluation of the performance of de novo genome assembly using MIRA and Celera.

Assembly                        MIRA          Celera
Statistics of contig
  Number of contigs             1,972         3,094
  Total size of contigs (bp)    109,184,716   107,097,920
  Largest contig (bp)           717,688       650,163
  N50 contig length (bp)        109,277       86,600
  L50 count                     274           316
  Contig GC content (%)         35.51         35.48
Statistics of contig mapping
  Genome coverage (%)           96.48         97.18
  Duplication ratio             1.134         1.103
  NA50 contig length (bp)       82,984        78,179
  LA50 count                    349           372
  Relocations                   443           225
  Translocations                245           131
  Inversions                    40            28
  SNVs per 100 Kb               24.6          19.52
  Short indels (<9 bp)          0.01195%      0.00647%
  Long indels (>=9 bp)          0.000143%     0.000049%
  Fully unaligned contigs       0             8
  Partially unaligned contigs   6             29

N50 is the length of the contig such that contigs of that length or longer contain 50% of the total assembly length; L50 is the rank of that contig when all contigs are ordered from longest to shortest, that is, the smallest number of contigs needed to reach 50% of the assembly. NA50 and LA50 are analogous to N50 and L50, respectively, but are computed from the alignment of the contigs against the genome. A relocation is a mis-assembly event in which a single contig is “broken”, with a minimum interval size of 1 kbp, and its parts map to different regions of the same chromosome; a translocation is a mis-assembly event in which parts of a single contig map to different chromosomes. An inversion is a mis-assembly event in which parts of a contig align to opposite strands of the same chromosome. Duplication ratio is defined as the ratio of contig length to reference length.
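
The N50 and L50 definitions can be illustrated with a minimal Python sketch. It is not the procedure used to produce Table 1; the function name `n50_l50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50: length of the contig at which the running total of lengths,
         taken from longest to shortest, first reaches half the assembly.
    L50: rank (1-based) of that contig in the longest-to-shortest order.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    ordered = sorted(contig_lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2

    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, rank


# Toy assembly (hypothetical lengths, total 300 bp).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_contigs)
print(f"N50 = {n50} bp, L50 = {l50}")  # N50 = 70 bp, L50 = 2
```

Applying the same function to the lengths of aligned blocks, rather than whole contigs, would give values in the spirit of NA50 and LA50, although the exact alignment procedure behind those figures is not described in this footnote.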