Table 1.
Genome assembly statistics for the TSUd_r3.0 assembly of subterranean clover (Trifolium subterraneum) compared with previous drafts.
| | TSUd_r1.1¹ | Tsub_Refv2.0² | TSUd_r3.0³ |
|---|---|---|---|
| Estimated Genome Size | |||
| K-mer (K = 17)⁴ | 552,423,008 | 552,423,008 | 552,423,008 |
| 1C genome size⁵ | 544,000,000 | 544,000,000 | 544,000,000 |
| Total Assembly Length (including N) | 471,834,188 | 512,439,000 | 531,039,069 |
| Estimated Genome Coverage | 86.7% | 94.2% | 97.6% |
| Assembly Metrics | |||
| Ps 1-8 (scaffolds in Ps-0) | 8 (27,416) | 1,545 | 8 (13,109) |
| Total Ps + scaffolds | 27,424 | 1,545 | 13,117 |
| % genome in pseudomolecules (excl. Ps-0) | 73.9% | 80.2% | 80.1% |
| Scaffold N50 (bp) | 47,721,588 | 410,493 | 56,229,069 |
| Pseudomolecule Metrics | |||
| Ps-1 length (bp) | 47,645,759 | 49,039,259 | 54,345,684 |
| Ps-2 length (bp) | 63,731,624 | 67,952,282 | 60,869,117 |
| Ps-3 length (bp) | 44,866,005 | 53,122,998 | 49,046,471 |
| Ps-4 length (bp) | 56,437,177 | 55,565,095 | 56,229,069 |
| Ps-5 length (bp) | 47,721,588 | 56,909,348 | 52,082,161 |
| Ps-6 length (bp) | 49,553,705 | 53,633,078 | 53,133,211 |
| Ps-7 length (bp) | 42,658,284 | 46,363,611 | 50,781,203 |
| Ps-8 length (bp) | 48,533,994 | 60,596,234 | 58,605,038 |
| Total length Ps 1-8 (bp) | 401,148,136 | 443,181,905 | 435,091,954 |
| GC (GATC) content (%); Gap Ratio (%) | 33.0; 13.7 | 33.3; 11.3 | 33.3; 7.1 |
| Ps-0 length (bp) | 70,686,052 | 69,257,095 | 95,947,115 |
| Ps-0 scaffolds | 27,416 | – | 13,109 |
| Average length Ps-0 scaffolds (bp) | – | – | 6,319 |
| Ps-0 scaffold max length (bp) | – | – | 3,018,486 |
| Ps-0 scaffold min length (bp) | 300 | 300 | 500 |
| Ps-0 scaffold N50 length | – | – | 259,414 |
| GC (GATC) content (%); Gap Ratio (%) | 34.8; 22.6 | – | 34.2; 12.8 |
| Total length Ps 1-8 + 0 | 471,834,188 | 512,439,000 | 531,039,069 |
| Annotation Summary | |||
| Number of predicted genes | 42,706 | 32,333 | 41,979 |
| Total length of predicted genes (bp) | 47,965,017 | 34,758,167 | 55,177,719 |
| Mean length of predicted genes (bp) | 1,123 | 1,075 | 1,314 |
| Length of genes (bp): Max; Min | 15,417; 150 | 15,309; 201 | 15,309; 90 |
| N50 of predicted genes (bp) | 1,548 | 1,437 | 1,767 |
| Proportion genes ≥ 1 kb | 42.1% | – | 53.7% |
| BUSCO⁶ scores of assembly (based on 1,440 reference genes) | |||
| Complete genes - total | 1,108 (77.0%) | 1,261 (87.6%) | 1,358 (94.4%) |
| Complete genes - single copy | 983 (68.3%) | 1,111 (77.2%) | 1,255 (87.2%) |
| Complete genes - duplicated | 125 (8.7%) | 150 (10.4%) | 103 (7.2%) |
| Fragmented genes | 138 (9.6%) | 59 (4.1%) | 25 (1.7%) |
| Missing genes | 194 (13.5%) | 120 (8.3%) | 57 (3.9%) |
Ps, pseudomolecule; bp, base pairs. ¹Hirakawa et al., 2016; ²Kaur et al., 2017a; ³This study.
⁴These assemblies are based on the same Illumina paired-end sequence data that was used to derive the K-mer estimate of genome size.
⁵Flow cytometry (Bennett and Leitch, 2011), or 1C = 0.55 pg DNA (Vižintin et al., 2006).
⁶Benchmarking Universal Single-Copy Orthologues (BUSCO) assessment (Simão et al., 2015).
Predicted genes exclude introns; gene lengths are based on cDNA.
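
Several rows of Table 1 appear to be derived from other rows. The following is a minimal Python sketch of that arithmetic, assuming that estimated genome coverage is the total assembly length (Ps 1-8 plus Ps-0) divided by the 1C genome size, and that mean gene length is the total predicted gene length divided by the number of predicted genes. The dictionary layout and function names are illustrative only and are not taken from the study.

```python
# Values copied from Table 1 (all lengths in bp).
ONE_C_GENOME_SIZE = 544_000_000  # "1C genome size" row

ASSEMBLIES = {
    "TSUd_r1.1": {
        "ps_1_8": 401_148_136, "ps_0": 70_686_052,
        "genes": 42_706, "gene_bp": 47_965_017,
    },
    "Tsub_Refv2.0": {
        "ps_1_8": 443_181_905, "ps_0": 69_257_095,
        "genes": 32_333, "gene_bp": 34_758_167,
    },
    "TSUd_r3.0": {
        "ps_1_8": 435_091_954, "ps_0": 95_947_115,
        "genes": 41_979, "gene_bp": 55_177_719,
    },
}


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def total_length(assembly: dict) -> int:
    """Total assembly length: pseudomolecules 1-8 plus Ps-0."""
    return assembly["ps_1_8"] + assembly["ps_0"]


def coverage_pct(assembly: dict) -> float:
    """Assumed definition: total assembly length over the 1C genome size."""
    return pct(total_length(assembly), ONE_C_GENOME_SIZE)


def mean_gene_length(assembly: dict) -> int:
    """Assumed definition: total predicted gene length over gene count."""
    return round(assembly["gene_bp"] / assembly["genes"])


if __name__ == "__main__":
    for name, assembly in ASSEMBLIES.items():
        print(name, total_length(assembly),
              coverage_pct(assembly), mean_gene_length(assembly))
```

Under these assumptions the sketch reproduces the tabulated total lengths, coverage values (86.7%, 94.2%, 97.6%) and mean gene lengths (1,123; 1,075; 1,314 bp). The same `pct` helper applied to a BUSCO count over the 1,440 reference genes gives, for example, 9.6% for the 138 fragmented genes in TSUd_r1.1.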