Author manuscript; available in PMC: 2020 Feb 12.
Published in final edited form as: Nat Biotechnol. 2019 Aug 12;37(10):1155–1162. doi: 10.1038/s41587-019-0217-9

Table 2. Statistics for de novo assembly of CCS reads.

The “mixed” haplotype assemblies use all reads. The “maternal” and “paternal” assemblies use parent-specific reads from trio binning plus unassigned reads. HG002 concordance is measured against the Genome in a Bottle benchmark. BUSCO gene completeness uses the Mammalia ODB9 gene set. “Ensembl genes” is the number of Ensembl R94 genes that are full-length and single-copy in the assembly, expressed as a percentage of the full-length, single-copy count for GRCh38. Contigs shorter than 13 kb were excluded from the genome size and contiguity measurements; contigs shorter than 100 kb were excluded from the concordance measurement. “*” indicates polishing with Arrow.

| Haplotype | Assembler | Total size (Gb) | Contigs | N50 (Mb) | NG50 (Mb) | Max (Mb) | E-size (Mb) | HG002 concordance (Phred) | BUSCO genes | Ensembl genes |
|---|---|---|---|---|---|---|---|---|---|---|
| Mixed | Canu | 3.42 | 18,006 | 22.78 | 25.02 | 108.46 | 30.16 | 31.1 | 92.3% | 93.2% |
| Mixed | FALCON | 2.91 | 2,541 | 28.95 | 24.51 | 110.21 | 38.04 | 25.8 | 87.6% | 97.6% |
| Mixed | wtdbg2 | 2.79 | 1,554 | 15.43 | 12.62 | 84.67 | 22.61 | 44.6 | 94.2% | 96.1% |
| Maternal | Canu* | 3.04 | 5,854 | 18.02 | 17.04 | 48.81 | 19.78 | 47.2 | 94.1% | 98.1% |
| Maternal | FALCON* | 2.80 | 924 | 19.99 | 15.54 | 74.33 | 24.07 | 43.5 | 95.1% | 97.8% |
| Maternal | wtdbg2 | 2.75 | 2,637 | 12.10 | 9.29 | 66.34 | 16.55 | 43.5 | 93.8% | 95.6% |
| Paternal | Canu* | 2.96 | 6,868 | 16.14 | 14.90 | 64.83 | 20.19 | 47.7 | 93.4% | 98.2% |
| Paternal | FALCON* | 2.70 | 1,489 | 16.40 | 14.06 | 95.34 | 25.61 | 43.5 | 93.6% | 97.7% |
| Paternal | wtdbg2 | 2.67 | 1,444 | 13.96 | 10.86 | 50.51 | 15.36 | 42.1 | 92.6% | 95.3% |
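
The N50, NG50, E-size, and Phred columns follow conventions that the legend names but does not define. The Python sketch below shows one common way to compute them from a list of contig lengths. It is an illustration, not the authors' pipeline. The function names and the toy contig lengths are hypothetical, and the sketch makes three assumptions: 1 kb is 1,000 bp, E-size is the sum of squared contig lengths divided by the total assembly length, and the Phred score is -10·log10 of the discordance rate. The 13 kb cutoff is the one stated in the legend.

```python
import math


def filter_contigs(lengths, min_len=13_000):
    """Drop contigs shorter than min_len bp (13 kb cutoff from the legend)."""
    return [n for n in lengths if n >= min_len]


def _nx50(lengths, target_total):
    """Length of the contig at which the running sum of contigs,
    taken longest first, reaches half of target_total."""
    half = target_total / 2
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running >= half:
            return n
    return 0  # the assembly covers less than half of target_total


def n50(lengths):
    """N50: the half-way point is measured against the assembly's own size."""
    return _nx50(lengths, sum(lengths))


def ng50(lengths, genome_size):
    """NG50: the half-way point is measured against an assumed genome size."""
    return _nx50(lengths, genome_size)


def e_size(lengths):
    """E-size (assumed definition): sum of squared lengths / total length."""
    total = sum(lengths)
    return sum(n * n for n in lengths) / total if total else 0.0


def phred(discordance_rate):
    """Phred-scaled quality (assumed definition): -10 * log10(rate)."""
    return -10 * math.log10(discordance_rate)


if __name__ == "__main__":
    # Toy contig lengths in bp (hypothetical, not from the paper).
    contigs = filter_contigs([50_000, 40_000, 30_000, 20_000, 12_000, 5_000])
    print("N50:", n50(contigs))
    print("NG50:", ng50(contigs, genome_size=200_000))
    print("E-size:", round(e_size(contigs), 2))
    print("Phred at 0.1% discordance:", round(phred(0.001), 1))
```

Because NG50 is measured against a fixed genome size rather than the assembly's own total, it can exceed N50 when the assembly is larger than the genome size used, as in the mixed Canu row.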