Table 2.
Assembly version | Meraculous-2D | HiRise | PBJelly | Pilon iteration 1 | Pilon iteration 2
---|---|---|---|---|---
Assembly step | Shotgun assembly | Scaffolding | Gap filling | Error correcting | Error correcting
Input data | Illumina shotgun | CHiCago | ONT | Illumina shotgun | Illumina shotgun
Genome length (bp) | 2 086 202 895 | 2 215 996 330 | 2 232 824 675 | 2 232 974 266 | 2 232 736 973
Contigs | 180 766 | 183 631 | 166 073 | 163 790 | 163 790
Scaffolds | 139 421 | 12 402 | 12 402 | 12 402 | 12 402
Contig N50 (kb) | 20.1 | 20.3 | 21.8 | 23.2 | 23.5
Contig L50 | 31 094 | 30 649 | 28 724 | 27 028 | 26 657
Scaffold N50 (Mb) | 0.0262 | 20.92 | 21.12 | 21.12 | 21.12
Scaffold L50 | 23 560 | 29 | 29 | 29 | 29
Number of gaps | 41 345 | 171 229 | 162 727 | 153 671 | 151 388
Number of Ns | 1 438 142 | 131 322 142 | 125 625 584 | 124 345 965 | 123 688 823
% of Ns in genome | 0.00198 | 5.93 | 5.62 | 5.57 | 5.54
BUSCO C (%; n = 9229) | 54.9 | 87.9 | 88.0 | 88.1 | 88.0
BUSCO S (%) | 54.6 | 87.4 | 87.4 | 87.5 | 87.4
BUSCO D (%) | 0.3 | 0.5 | 0.6 | 0.6 | 0.6
BUSCO F (%) | 16.2 | 3.9 | 4.0 | 3.9 | 4.0
BUSCO M (%) | 28.9 | 8.2 | 8.0 | 8.0 | 8.0
Metrics of the Andean bear genome assembly at each stage of the assembly process. The last column (Pilon iteration 2) is the final assembly. Scaffolding with HiRise markedly improved scaffold N50, from 0.0262 Mb to 20.92 Mb. Gap filling with PBJelly decreased both the number of Ns and the number of strings of Ns (gaps). As a result of this conversion of Ns to useful sequence, the number of Illumina reads that mapped to the assembly increased. Full BUSCO scores are shown at the bottom, indicating the percentage of conserved single-copy mammalian orthologs that are complete (C), single-copy (S), duplicated (D), fragmented (F), or missing (M) at each stage of the assembly.
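For readers less familiar with the contiguity metrics in Table 2, the following is a minimal sketch of how N50, L50, and the percentage of Ns are conventionally computed. It is an illustration only, not the software used to produce the table; the function names `n50_l50` and `percent_n` and the toy length list are assumptions introduced here.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    assembly length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


def percent_n(n_count, genome_length):
    """Percentage of the assembly made up of N characters."""
    return 100 * n_count / genome_length


# Toy lengths for illustration (hypothetical, not from the bear assembly).
toy = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy))  # (70, 2)

# HiRise column of Table 2: number of Ns and genome length in bp.
print(round(percent_n(131_322_142, 2_215_996_330), 2))  # 5.93
```

Under this definition, a higher N50 and a lower L50 both indicate a more contiguous assembly, which is the pattern seen in the scaffold rows after the HiRise step.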