Table 2.
Assembly statistics for the original reference genomes and the de novo long-read-derived genomes of Thalassiosira pseudonana and Phaeodactylum tricornutum. The mitochondrial and organellar genomes of both diatoms were assembled by Canu and Flye and were excluded from the assembly statistics. All Canu and Flye assemblies were corrected first with long reads using Racon and Nanopolish, then with Illumina short reads using Pilon (see Methods and Materials for more details). The BUSCO odb9 eukaryotic database (303 genes) was used to assess the different assemblies. BUSCO scores are reported as total complete (C), complete single-copy (S), complete duplicated (D), and fragmented (F) orthologs.
| Species | Assembly | Total length (Mbp) | Read depth coverage | No. contigs | Largest contig (Mbp) | Contig N50 (Mbp) | Contig L50 | No. scaffolds | Largest scaffold (Mbp) | Scaffold N50 (Mbp) | Scaffold L50 | G+C content (%) | % identity to reference | BUSCO completeness | ALE score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Phaeodactylum tricornutum | Reference (Bowler et al. 2008) | 27.4 | 9.6x | 179 | n/a | 0.42 | 20 | 88ᵃ | 2.53 | 0.95 | 11 | 48.8 | n/a | C:82.5% S:80.2% D:2.3% F:5.9% | n/a |
| | Canu | 57.0 | 40x | 291 | 2.51 | 0.25 | 43 | n/a | n/a | n/a | n/a | 48.7 | 99.3 | C:85.4% S:33.3% D:52.1% F:3.0% | -734,959,595 |
| | Flye | 33.5 | 72x | 196 | 1.66 | 0.36 | 24 | n/a | n/a | n/a | n/a | 48.7 | 99.1 | C:80.9% S:71.0% D:9.9% F:4.6% | -781,367,384 |
| | Canu-Bionano hybrid | 66.8 | n/a | n/a | n/a | n/a | n/a | 219ᵇ | 2.78 | 1.06 | n/a | n/a | n/a | n/a | n/a |
| Thalassiosira pseudonana | Reference (Armbrust et al. 2004, Bowler et al. 2008) | 32.4 | n/a | 115 | n/a | 1.27 | 8 | 64ᶜ | 3.04 | 1.99 | 7 | 46.9 | n/a | C:81.2% S:79.2% D:2.0% F:5.3% | n/a |
| | Canu | 47.3 | 40x | 222 | 2.77 | 0.98 | 14 | n/a | n/a | n/a | n/a | 46.9 | 99.4 | C:79.2% S:59.7% D:19.5% F:6.6% | -1,238,092,187 |
| | Flye | 33.8 | 48x | 52 | 2.76 | 1.38 | 8 | n/a | n/a | n/a | n/a | 47.0 | 99.4 | C:80.6% S:78.9% D:1.7% F:5.6% | -1,047,071,217 |

ᵃ The number of scaffolds reflects the 33 chromosome-level scaffolds and 55 unplaced, smaller contigs.
ᵇ The number of scaffolds for the Canu-Bionano hybrid includes both the 49 scaffolds that were assembled from the 138 long-read contigs that met the minimum length requirement (≥150 kb) for Bionano optical map anchoring and the 155 unanchored contigs <150 kb.
ᶜ The number of scaffolds reflects the 27 chromosome-level scaffolds and 37 unplaced, smaller contigs.
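
For readers unfamiliar with the N50 and L50 columns, the following is a minimal sketch of how these statistics are conventionally defined: N50 is the length of the sequence at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly length, and L50 is the number of sequences needed to reach that point. The function name `n50_l50` and the toy lengths are hypothetical illustrations. They are not taken from the assemblies in Table 2, and this is not necessarily how the tools used in the study compute these values.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Sort from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running sum to at least
        # half of the total defines N50 (its length) and L50 (its rank).
        if running >= half_total:
            return length, count


# Hypothetical toy lengths (arbitrary units), not data from the study.
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Under this definition, a more contiguous assembly has a larger N50 and a smaller L50, which is how the contig and scaffold columns in the table can be compared across assemblies.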