Table 1.
Metric | EGYPT | EGYPT_wtdbg2 | EGYPT_falcon | AK1 | YORUBA |
---|---|---|---|---|---|
Assembly level | meta | contig | scaffold | scaffold | chromosome |
Effective genome size | 2,820,489,739 | 2,733,934,177 | 2,897,551,797 | NA | NA |
# Sequences | 3235 | 3106 | 1615 | 2832 | 1647 |
Longest sequence | 88,566,048 | 88,566,048 | 84,324,762 | 113,921,103 | 248,986,603 |
# N’s per 100 kbp | 0 | 0 | 209.01 | 1285.7 | 7180.2 |
Base level QV | 42.4 | 42.9 | 43.0 | 50.4a | NA |
# Genes (thereof # partial) | 20,908 (3226) | 20,613 (3229) | 21,176 (1578) | 21,047 (1396) | 21,077 (1721) |
Genome fraction w.r.t. GRCh38 (%) | 94.174 | 92.247 | 95.924 | 95.177 | 95.391 |
Duplication ratio w.r.t. GRCh38 | 1.01 | 0.999 | 1.018 | 1.023 | 1.088 |
Largest GRCh38 alignment | 75,492,126 | 75,492,126 | 56,458,009 | 58,219,133 | 65,512,502 |
Total GRCh38 aligned length | 2,800,100,449 | 2,713,712,375 | 2,865,356,241 | 2,829,006,639 | 2,832,740,986 |
NG50 w.r.t. GRCh38 | 20,857,787 | 20,857,787 | 28,071,354 | 39,609,866 | 145,208,384 |
LG50 w.r.t. GRCh38 | 35 | 35 | 33 | 24 | 9 |
NGA50 w.r.t. GRCh38 | 11,187,777 | 11,187,777 | 8,226,500 | 13,028,687 | 19,529,238 |
LGA50 w.r.t. GRCh38 | 71 | 71 | 95 | 66 | 43 |
# GRCh38 differences >1 kb (thereof # outside centromeres) | 1276 (1103) | 1276 (1103) | 3499 (2832) | 1952 (1685) | 1756 (1472) |
# GRCh38 mismatches per 100 kbp | 139 | 138.72 | 143.64 | 126.92 | 141.56 |
# GRCh38 indels per 100 kbp | 32.09 | 31.74 | 40.06 | 32.77 | 46.95 |
K-mer-based compl. w.r.t. GRCh38 (%) | 86.01 | 85.15 | 87.75 | 87.68 | 85.82 |
The table lists the final EGYPT meta assembly, the two alternative base assemblies EGYPT_wtdbg2 and EGYPT_falcon, and two publicly available assemblies of the genomes of a Korean (AK1) and a Yoruba (YORUBA) individual. Metrics were largely obtained with QUAST-LG. The complete QUAST-LG report and additional assembly metrics are provided in Supplementary Data 2.
a Based on the error rate estimated by the AK1 authors, Seo et al. (ref. 16).
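For readers unfamiliar with the contiguity and quality metrics in Table 1, the following minimal Python sketch illustrates the standard definitions of NG50, LG50 and the Phred-scaled base-level QV. It is an illustration under stated assumptions, not the QUAST-LG implementation: the function names `ng50_lg50`, `phred_qv` and `error_rate_from_qv` are hypothetical, and the sequence lengths in the example are toy values rather than data from the assemblies above.

```python
import math
from typing import Iterable, Tuple


def ng50_lg50(lengths: Iterable[int], reference_length: int) -> Tuple[int, int]:
    """Return (NG50, LG50) for a set of sequence lengths.

    NG50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the reference
    genome length; LG50 is the number of sequences in that set.
    """
    target = reference_length / 2
    covered = 0
    # Walk through the sequences from longest to shortest.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        covered += length
        if covered >= target:
            return length, count
    raise ValueError("sequences cover less than half of the reference length")


def phred_qv(error_rate: float) -> float:
    """Convert a per-base error rate into a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


def error_rate_from_qv(qv: float) -> float:
    """Convert a Phred-scaled quality value back into a per-base error rate."""
    return 10 ** (-qv / 10)


# Toy example (hypothetical lengths, not taken from Table 1).
toy_lengths = [50, 40, 30, 20, 10]
toy_ng50, toy_lg50 = ng50_lg50(toy_lengths, reference_length=200)
```

Applying the same calculation to the lengths of aligned blocks instead of whole sequences gives the NGA50 and LGA50 variants. On the Phred scale, a QV of 40 corresponds to one expected error per 10,000 bases, so higher values in the "Base level QV" row indicate fewer base-level errors.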