Table 2.
Assembly type | Species | Chromosomes (n) | Genome size (bp) | Total contig number | Total contig length (bp) | Assembled percent (%) | Average contig number | Contig N50 (bp) | Contig N90 (bp) |
---|---|---|---|---|---|---|---|---|---|
Simulated assembly | Human | 23 | 3,054,815,472 | 104 | 3,054,815,472 | 100% | 4.52 | 47,909,438 | 14,532,355 |
Simulated assembly | Rice | 12 | 373,094,580 | 42 | 373,094,580 | 100% | 3.50 | 11,071,427 | 5,000,429 |
Simulated assembly | Arabidopsis | 5 | 119,146,348 | 11 | 119,146,348 | 100% | 2.20 | 9,660,775 | 5,994,203 |
Hifiasm-assembly | Human | 23 | 3,054,815,472 | 82 | 3,007,080,905 | 98% | 3.57 | 89,131,734 | 28,203,557 |
Hifiasm-assembly | Great burdock | 18 | 1,720,000,000 | 30 | 1,709,056,189 | 99% | 1.67 | 74,692,580 | 38,981,084 |
Hifiasm-assembly | Water spinach | 15 | 485,000,000 | 29 | 480,197,403 | 99% | 1.61 | 23,511,778 | 9,860,712 |
Genome sizes are validated or estimated values. Only contigs larger than 1 Mb are included in the statistics in this table. Assembled percent = total contig length / genome size. Average contig number is the average number of contigs per chromosome. For simulated data, the reference genomes of human (CHM13 v1.1), rice (Nipponbare, ASM386523v1), and Arabidopsis thaliana (Columbia, TAIR10.1) were used to simulate large contigs: each chromosome of the reference genome was randomly split into 1–6 contigs, and contigs smaller than 1 Mb were not allowed, so all simulated contigs are larger than 1 Mb. For real data, the hifiasm-assembled human contigs were downloaded from https://zenodo.org/record/4393631/files/CHM13.HiFi.hifiasm-0.12.fa.gz, while the contigs of great burdock and water spinach were assembled with hifiasm using default parameters from HiFi reads downloaded from the NCBI SRA database (PRJNA764011 and PRJNA764042).
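
The column definitions and the simulation rule in the note can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`nx`, `assembly_stats`, `split_chromosome`) are hypothetical, N50/N90 follow the usual convention (the contig length at which the running total, largest contig first, reaches 50% or 90% of the total contig length), and the uniform choice of cut points is an assumption, since the note only states that each chromosome was randomly split into 1–6 contigs with none smaller than 1 Mb.

```python
import random

MIN_CONTIG = 1_000_000  # 1 Mb cutoff stated in the table note


def nx(lengths, fraction):
    """Contig length at which the running total (largest first)
    reaches `fraction` of the total length, e.g. 0.5 for N50."""
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no contigs given")


def assembly_stats(contig_lengths, genome_size, n_chromosomes):
    """Per-assembly statistics as defined in the table note."""
    kept = [n for n in contig_lengths if n > MIN_CONTIG]  # contigs > 1 Mb only
    total = sum(kept)
    return {
        "contigs": len(kept),
        "total_length": total,
        "assembled_percent": 100 * total / genome_size,
        "avg_contigs_per_chromosome": len(kept) / n_chromosomes,
        "n50": nx(kept, 0.5),
        "n90": nx(kept, 0.9),
    }


def split_chromosome(length, rng, max_pieces=6, min_size=MIN_CONTIG + 1):
    """Randomly split one chromosome into 1..max_pieces contigs,
    each at least `min_size` long (assumed scheme: uniform cut points)."""
    pieces = rng.randint(1, max(1, min(max_pieces, length // min_size)))
    # Reserve min_size for every piece, then distribute the remaining slack.
    slack = length - pieces * min_size
    cuts = sorted(rng.randint(0, slack) for _ in range(pieces - 1))
    bounds = [0] + cuts + [slack]
    return [min_size + (b - a) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is repeatable
    toy_chromosomes = [30_000_000, 20_000_000, 12_000_000]  # made-up lengths
    contigs = [c for chrom in toy_chromosomes for c in split_chromosome(chrom, rng)]
    print(assembly_stats(contigs, sum(toy_chromosomes), len(toy_chromosomes)))
```

Applied to the hifiasm human row, the same definitions give 3,007,080,905 / 3,054,815,472 ≈ 98% assembled and 82 / 23 ≈ 3.57 contigs per chromosome, matching the values in the table.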