Table 1.
Summary of the panda genome sequencing and assembly
Step | Paired-end insert size (bp)* | Sequence coverage (×)† | Physical coverage (×)† | N50 (bp)‡ | N90 (bp)‡ | Total length (bp) |
---|---|---|---|---|---|---|
Initial contig | | | | 1,483 | 224 | 2,021,639,596 |
Scaffold 1 | 110–230; 380–570 | 38.5 | 96 | 32,648 | 7,780 | 2,213,848,409 |
Scaffold 2 | Add 1,700–2,800 | 8.4 | 151 | 229,150 | 45,240 | 2,250,442,210 |
Scaffold 3 | Add 3,700–7,500 | 6.5 | 450 | 581,933 | 127,336 | 2,297,100,301 |
Scaffold 4 | Add 9,200–12,300 | 2.6 | 373 | 1,281,781 | 312,670 | 2,299,498,912 |
Final contig | All | 56.0 | 1,070 | 39,886 | 9,848 | 2,245,302,481 |
"Add" denotes cumulative use of libraries; for example, scaffold 2 uses the data from the 110–230, 380–570 and 1,700–2,800 bp libraries.
\* Approximate average insert size of the Illumina Genome Analyser sequencing libraries. The sizes were estimated by mapping the reads onto the assembled genome sequences.
† Only high-quality read sequences that were used in the assembly are counted. Coverage was estimated assuming a genome size of 2.4 Gb. Sequence coverage refers to the total length of generated reads, and physical coverage refers to the total length of sequenced clones of the libraries.
‡ The N50 size of contigs or scaffolds was calculated by ordering all sequences by length, then adding the lengths from longest to shortest until the summed length exceeded 50% of the total length of all sequences; the N50 is the length of the sequence at which this threshold is crossed. N90 is defined similarly, with a 90% threshold.
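
The N50 and N90 definitions above can be illustrated with a minimal Python sketch. The function name `nx_size` and the toy lengths are assumptions made for illustration only; they are not the authors' code or data. The sketch follows the footnote's wording literally, stopping when the running sum strictly exceeds the threshold (some tools use "reaches or exceeds" instead).

```python
def nx_size(lengths, percent):
    """Return the Nx size (e.g. percent=50 for N50, percent=90 for N90).

    Sequences are ordered from longest to shortest and their lengths are
    summed until the running total exceeds `percent`% of the total length;
    the length of the sequence that crosses the threshold is returned.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100 (exclusive)")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 > percent * total:
            return length
    # Not reached when all lengths are positive.
    raise ValueError("lengths must be positive")


# Hypothetical contig lengths (bp), total 300 bp.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx_size(toy_lengths, 50)
n90 = nx_size(toy_lengths, 90)
print(n50, n90)
```

With these toy lengths the cumulative sums are 80, 150, 200, 240, 270, 290 and 300, so the 50% threshold (150) is first exceeded at the 50 bp sequence and the 90% threshold (270) at the 20 bp sequence.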