2020 Jul 14;585(7823):79–84. doi: 10.1038/s41586-020-2547-7

Table 1.

Assembly statistics for CHM13 and the human reference sorted by continuity

| Primary technology | Assembly | Size (Gb) | No. of contigs | NG50 (Mb) | BACs resolved (%) | BACs %idy all | BACs %idy uni |
|---|---|---|---|---|---|---|---|
| 56× Illumina linked reads | Supernova (this paper) | 2.95 | 42,828 | 0.21 | 17.3 | 99.975 | 99.985 |
| 76× PacBio CLR | FALCON (ref. 50) | 2.88 | 1,916 | 28.2 | 36.37 | 99.981 | 99.995 |
| 24× PacBio HiFi | Canu (ref. 22) | 3.03 | 5,206 | 29.1 | 45.46 | 99.979 | 99.997 |
| Sanger BACs | GRCh38p13 (ref. 2) | 3.27 | 1,590 | 56.4 | 85.63 | 99.731ᵃ | 99.768ᵃ |
| 39× Nanopore ultra-long | Canu (this paper) | 2.94 | 448 | 70.1 | 82.11 | 99.980 | 99.994 |

ᵃ GRCh38 is expected to have a lower identity to BACs derived from CHM13 as it represents a different human genome.

Primary technology: sequencing technology used for contig assembly. The PacBio CLR assembly was additionally polished using Illumina linked reads. The Nanopore ultra-long assembly was polished with the PacBio CLR and Illumina linked reads. GRCh38 is primarily based on Sanger-sequenced BACs, but has been continually curated and patched since the completion of the human genome project.

Assembly: assembler used and reference to the published assembly.

Size: sum of bases in the assembly in Gb, including N-bases. GRCh38 assembly size includes 110 Mb of alternative (ALT) sequences.

No. of contigs: total number of contigs in the assembly; scaffolds were split at three consecutive N-bases to obtain contigs.

NG50: half of the 3.09-Gb human genome size is contained in contigs of this length or greater, in Mb. Supernova NG50 statistics were identical between the two reported pseudo-haplotypes.

BACs resolved (%): percentage of 341 ‘challenging’ CHM13 BACs found intact in the assembly. BACs unresolved by the best CHM13 assembly either map across multiple contigs or map to a single contig with large structural variation, indicating an error in either the BAC or whole-genome assembly.

BACs %idy all: median alignment accuracy versus all validation BACs.

BACs %idy uni: median alignment accuracy versus the 31 validation BACs that occur outside of segmental duplications (Supplementary Note 4).
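The contig count and NG50 definitions above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the table. The function names `split_scaffold` and `ng50` are hypothetical, the toy sequences are invented for illustration, and the sketch assumes that "three consecutive N-bases" means any run of three or more N characters.

```python
import re

# Assumption: a run of three or more N-bases marks a scaffold gap.
GAP = re.compile(r"[Nn]{3,}")


def split_scaffold(seq):
    """Split one scaffold into contigs at runs of >= 3 N-bases."""
    return [contig for contig in GAP.split(seq) if contig]


def ng50(contig_lengths, genome_size):
    """Return the length L such that contigs of length >= L together
    cover at least half of genome_size. Returns 0 if the assembly is
    too small to reach half of the genome size."""
    half = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return 0


# Toy scaffolds (invented): the single N in the second one is not a gap.
scaffolds = ["ACGTACGTNNNNACGT", "ACNGTACG", "NNNACGTAC"]
contigs = [c for s in scaffolds for c in split_scaffold(s)]
lengths = [len(c) for c in contigs]

print(len(contigs))       # number of contigs after splitting
print(ng50(lengths, 30))  # NG50 against an assumed genome size of 30 bases
```

Unlike N50, which uses the assembly's own total length, NG50 is computed against a fixed genome size (3.09 Gb in the table), so assemblies of different total sizes stay comparable.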