2020 Sep 18;11:4719. doi: 10.1038/s41467-020-17964-1

Table 1.

Comparison of main assembly characteristics and quality metrics.

| Metric | EGYPT | EGYPT_wtdbg2 | EGYPT_falcon | AK1 | YORUBA |
|---|---|---|---|---|---|
| Assembly level | meta | contig | scaffold | scaffold | chromosome |
| Effective genome size | 2,820,489,739 | 2,733,934,177 | 2,897,551,797 | NA | NA |
| # Sequences | 3235 | 3106 | 1615 | 2832 | 1647 |
| Longest sequence | 88,566,048 | 88,566,048 | 84,324,762 | 113,921,103 | 248,986,603 |
| # N’s per 100 kbp | 0 | 0 | 209.01 | 1285.7 | 7180.2 |
| Base level QV | 42.4 | 42.9 | 43.0 | 50.4^a | NA |
| # Genes (thereof # partial) | 20,908 (3226) | 20,613 (3229) | 21,176 (1578) | 21,047 (1396) | 21,077 (1721) |
| Genome fraction w.r.t. GRCh38 (%) | 94.174 | 92.247 | 95.924 | 95.177 | 95.391 |
| Duplication ratio w.r.t. GRCh38 | 1.01 | 0.999 | 1.018 | 1.023 | 1.088 |
| Largest GRCh38 alignment | 75,492,126 | 75,492,126 | 56,458,009 | 58,219,133 | 65,512,502 |
| Total GRCh38 aligned length | 2,800,100,449 | 2,713,712,375 | 2,865,356,241 | 2,829,006,639 | 2,832,740,986 |
| NG50 w.r.t. GRCh38 | 20,857,787 | 20,857,787 | 28,071,354 | 39,609,866 | 145,208,384 |
| LG50 w.r.t. GRCh38 | 35 | 35 | 33 | 24 | 9 |
| NGA50 w.r.t. GRCh38 | 11,187,777 | 11,187,777 | 8,226,500 | 13,028,687 | 19,529,238 |
| LGA50 w.r.t. GRCh38 | 71 | 71 | 95 | 66 | 43 |
| # GRCh38 differences >1 kb (thereof # outside centromeres) | 1276 (1103) | 1276 (1103) | 3499 (2832) | 1952 (1685) | 1756 (1472) |
| # GRCh38 mismatches per 100 kbp | 139 | 138.72 | 143.64 | 126.92 | 141.56 |
| # GRCh38 indels per 100 kbp | 32.09 | 31.74 | 40.06 | 32.77 | 46.95 |
| K-mer-based compl. w.r.t. GRCh38 (%) | 86.01 | 85.15 | 87.75 | 87.68 | 85.82 |

The table lists the final EGYPT meta assembly, the two alternative base assemblies EGYPT_wtdbg2 and EGYPT_falcon, and two publicly available assemblies of the genomes of a Korean individual (AK1) and a Yoruba individual (YORUBA). Metrics were largely obtained with QUAST-LG. The complete QUAST-LG report and additional assembly metrics are provided in Supplementary Data 2.

^a Based on the error rate estimated by the AK1 authors, Seo et al. [16].
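
For readers unfamiliar with the contiguity and accuracy metrics in the table, the following is a minimal sketch of how NG50/LG50 and a base-level QV are commonly defined. It is not the QUAST-LG implementation and not the authors' pipeline. It assumes the usual definitions: NG50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the reference length, LG50 is the number of sequences in that set, and QV is the Phred-scaled per-base error rate, QV = -10 log10(error rate). The function names and the toy sequence lengths are hypothetical and chosen only for illustration.

```python
import math


def ng50_lg50(lengths, reference_length):
    """Return (NG50, LG50) for sequence lengths against a reference length.

    Assumed definition: walk the sequences from longest to shortest and stop
    once their cumulative length reaches half of the reference length.
    Returns (None, None) if the assembly never covers half of the reference.
    """
    target = reference_length / 2
    covered = 0
    for rank, length in enumerate(sorted(lengths, reverse=True), start=1):
        covered += length
        if covered >= target:
            return length, rank
    return None, None


def qv_from_error_rate(error_rate):
    """Phred-scaled quality value for a per-base error rate (assumed definition)."""
    return -10 * math.log10(error_rate)


def error_rate_from_qv(qv):
    """Inverse of qv_from_error_rate: per-base error rate for a given QV."""
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    # Toy assembly (hypothetical lengths), reference length of 300 units.
    toy_lengths = [80, 70, 50, 40, 30, 20]
    print(ng50_lg50(toy_lengths, 300))  # (70, 2)

    # Error rates implied by two QV values from the table under this definition.
    for label, qv in [("EGYPT", 42.4), ("AK1", 50.4)]:
        print(label, error_rate_from_qv(qv))
```

Under these definitions, a higher NG50 paired with a lower LG50 indicates a more contiguous assembly, and each increase of 10 in QV corresponds to a tenfold lower per-base error rate.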