Author manuscript; available in PMC: 2021 Apr 1.
Published in final edited form as: Nat Rev Genet. 2020 Jun 5;21(10):597–614. doi: 10.1038/s41576-020-0236-x

Table 2.

Statistics of human genome assemblies generated with various data types and assembly algorithms

| Genome assembly | Data type [coverage; read N50 (kb)] | Assembler | Size (Gb) | # of contigs | Contig N50 (Mb) | Estimated cost | Ref. |
|---|---|---|---|---|---|---|---|
| hg1 | Multi-technology | GigAssembler, PHRAP | 2.69 | 149,821 | 0.082 | $300,000,000 | 72 |
| hg38 | Multi-technology | Multiple algorithms | 3.01 | 998 | 57.88 | not determined | 161 |
| YH | Illumina (56-fold; <0.075) | SOAPdenovo | 2.91 | 361,157 | 0.02 | $1,600 (a) | 162 |
| CHM13 | PacBio CLR (77-fold; 17.5) | FALCON | 2.88 | 1,916 | 29.30 | $2,700 (b) | 30 |
| | PacBio HiFi (24-fold; 10.9) | FALCON | 3.00 | 2,116 | 31.92 | $4,100 (b) | 52 |
| | | Canu | 3.03 | 5,206 | 25.51 | | |
| | PacBio CLR (77-fold; 17.5) and ONT (50-fold; 70.4) | Canu | 2.94 | 590 | 72.00 | $55,000 (c) | 34 |
| HG002 | PacBio HiFi (28-fold; 13.5) | FALCON | 2.91 | 2,541 | 28.95 | $2,700 (b) | 53 |
| | PacBio HiFi (28-fold; 13.5) | Canu | 3.42 | 18,006 | 22.78 | | |
| | ONT (47-fold; 48.7) | Shasta | 2.80 | 1,847 | 23.34 | $5,000 (d) | 36 |
| | | Flye | 2.82 | 1,627 | 31.25 | | |
| | | Canu | 2.90 | 767 | 33.06 | | |
| NA12878 | Illumina (103-fold; 0.101) | ALLPATHS-LG | 2.79 | 231,194 | 0.02 | $2,900 (a) | 163 |
| | ONT (29-fold; 10.6 and 5-fold; 99.8) | Flye | 2.82 | 782 | 18.18 | $4,000 (d) | 76 |
| | | Canu | 2.82 | 798 | 10.41 | | 35 |
| NA12878 (phased) | PacBio HiFi (30-fold; 10.0) | Peregrine | 2.97 [H1]; 2.97 [H2] | 9,334 [H1]; 9,127 [H2] | 19.6 [H1]; 18.7 [H2] | $4,100 (b) | 22 |
| HG00733 | ONT (73-fold; 29.6) | Shasta | 2.78 | 2,150 | 24.43 | $6,000 (d) | 36 |
| | | Flye | 2.81 | 1,852 | 28.76 | | |
| | | Canu | 2.90 | 778 | 44.76 | | |
| HG00733 (phased) | PacBio HiFi (33-fold; 13.4) and Strand-seq (5-fold) | Peregrine | 2.90 [H1]; 2.91 [H2] | 2,618 [H1]; 2,557 [H2] | 28.0 [H1]; 29.2 [H2] | $9,000 (e) | 91 |

Multi-technology: Clone-by-clone hierarchical sequencing with short and Sanger reads.

H1 and H2 refer to the first and second haplotype in the diploid genome assembly, respectively.

(a) Current cost when generated on the NovaSeq using S4 flow cells and multiplexing.

(b) Current cost when generated on the Sequel II.

(c) Current cost when generated on the Sequel II (PacBio CLR) and GridION (ONT).

(d) Current cost when generated on the PromethION.

(e) Current cost when generated on the Sequel II (PacBio HiFi) and HiSeq 2500 (Illumina).

All estimates are listed in USD and exclude the cost for labor, instrumentation, maintenance, and computer resources.
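
For readers unfamiliar with the contig N50 column, the statistic can be sketched in a few lines of Python. This is a minimal illustration that assumes the conventional definition (the length L such that contigs of length L or longer together cover at least half of the total assembly length); it is not the code used to produce the table. The function name `n50` and the toy contig lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    Assumes the conventional definition: sort lengths from longest to
    shortest and return the length at which the running sum first
    reaches at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, arbitrary units): the total is 54,
# and the three longest contigs (10 + 9 + 8 = 27) reach half of it.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # prints 8
```

Because N50 is computed against the total assembly length, two assemblies of different total size are not strictly comparable by this number alone, which is one reason the table reports assembly size and contig count alongside it.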