2023 Feb 15;14:1103857. doi: 10.3389/fpls.2023.1103857

Table 1.

Genome assembly statistics for the TSUd_r3.0 assembly of subterranean clover (Trifolium subterraneum) compared with previous drafts.

TSUd_r1.1¹ Tsub_Refv2.0² TSUd_r3.0³
Estimated Genome Size
K-mer (K = 17)⁴ 552,423,008 552,423,008 552,423,008
1C Genome size⁵ 544,000,000 544,000,000 544,000,000
Total Assembly Length (including N) 471,834,188 512,439,000 531,039,069
Estimated Genome Coverage 86.7% 94.2% 97.6%
Assembly Metrics
Ps 1-8 (scaffolds in Ps-0) 8 (27,416) 1,545 8 (13,109)
Total Ps + scaffolds 27,424 1,545 13,117
% genome in pseudomolecules (excl. Ps-0) 73.9% 80.2% 80.1%
Scaffold N50 (bp) 47,721,588 410,493 56,229,069
Pseudomolecule Metrics
Ps-1 length (bp) 47,645,759 49,039,259 54,345,684
Ps-2 length (bp) 63,731,624 67,952,282 60,869,117
Ps-3 length (bp) 44,866,005 53,122,998 49,046,471
Ps-4 length (bp) 56,437,177 55,565,095 56,229,069
Ps-5 length (bp) 47,721,588 56,909,348 52,082,161
Ps-6 length (bp) 49,553,705 53,633,078 53,133,211
Ps-7 length (bp) 42,658,284 46,363,611 50,781,203
Ps-8 length (bp) 48,533,994 60,596,234 58,605,038
Total length Ps 1-8 (bp) 401,148,136 443,181,905 435,091,954
GC (GATC) content (%); Gap Ratio (%) 33.0; 13.7 33.3; 11.3 33.3; 7.1
Ps-0 length (bp) 70,686,052 69,257,095 95,947,115
 Ps-0 scaffolds 27,416 13,109
 Average length Ps-0 scaffolds (bp) 6,319
 Ps-0 scaffold max length (bp) 3,018,486
 Ps-0 scaffold min length (bp) 300 300 500
 Ps-0 scaffold N50 length 259,414
GC (GATC) content (%); Gap Ratio (%) 34.8; 22.6 34.2; 12.8
Total length Ps 1-8 + 0 471,834,188 512,439,000 531,039,069
Annotation Summary
Number of predicted genes 42,706 32,333 41,979
Total length of predicted genes (bp) 47,965,017 34,758,167 55,177,719
Mean length of predicted genes (bp) 1,123 1,075 1,314
Length of genes (bp): Max; Min 15,417; 150 15,309; 201 15,309; 90
N50 of predicted genes (bp) 1,548 1,437 1,767
Proportion genes ≥ 1 kb 42.1% 53.7%
BUSCO⁶ Scores of Assembly (based on 1,440 reference genes)
Complete genes - total 1,108 (77.0%) 1,261 (87.6%) 1,358 (94.4%)
 Complete genes - single copy 983 (68.3%) 1,111 (77.2%) 1,255 (87.2%)
 Complete genes - duplicated 125 (8.7%) 150 (10.4%) 103 (7.2%)
Fragmented genes 138 (9.6%) 59 (4.1%) 25 (1.7%)
Missing genes 194 (13.5%) 120 (8.3%) 57 (3.9%)

Ps, pseudomolecule; bp, base pairs; ¹Hirakawa et al., 2016; ²Kaur et al., 2017a; ³This study.

⁴These assemblies are based on the same Illumina paired-end sequence that was used to derive the K-mer estimate of genome size.

⁵Flow cytometry (Bennett and Leitch, 2011) or 1C = 0.55 pg DNA (Vižintin et al., 2006).

⁶Benchmarking Universal Single-Copy Orthologues (BUSCO) assessment (Simão et al., 2015).

Predicted gene lengths exclude introns; gene length is based on cDNA.
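Several derived rows in Table 1 (total length, estimated genome coverage, mean gene length, BUSCO percentages) follow from simple arithmetic on the reported counts. The following is a minimal Python sketch of those checks, not the authors' method. It assumes that coverage is the total assembly length divided by the 1C genome size of 544,000,000 bp (the table does not state the denominator), and the names `assemblies`, `pct`, and `summarize` are hypothetical, introduced only for illustration. The BUSCO tuples hold the complete, fragmented, and missing counts transcribed from the table.

```python
# Values transcribed from Table 1. The denominator used for coverage is an
# assumption: the table does not say which genome-size estimate was used.
GENOME_SIZE_1C = 544_000_000  # 1C genome size (bp)
BUSCO_TOTAL = 1_440           # number of BUSCO reference genes

# busco = (complete, fragmented, missing) gene counts
assemblies = {
    "TSUd_r1.1": {
        "ps_1_8": 401_148_136, "ps_0": 70_686_052, "total": 471_834_188,
        "genes": 42_706, "gene_bp": 47_965_017, "busco": (1_108, 138, 194),
    },
    "Tsub_Refv2.0": {
        "ps_1_8": 443_181_905, "ps_0": 69_257_095, "total": 512_439_000,
        "genes": 32_333, "gene_bp": 34_758_167, "busco": (1_261, 59, 120),
    },
    "TSUd_r3.0": {
        "ps_1_8": 435_091_954, "ps_0": 95_947_115, "total": 531_039_069,
        "genes": 41_979, "gene_bp": 55_177_719, "busco": (1_358, 25, 57),
    },
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarize(a):
    """Recompute the derived Table 1 rows for one assembly record."""
    complete, fragmented, missing = a["busco"]
    return {
        # Ps 1-8 plus Ps-0 should equal the reported total length.
        "total_matches": a["ps_1_8"] + a["ps_0"] == a["total"],
        "coverage_pct": pct(a["total"], GENOME_SIZE_1C),
        "mean_gene_bp": round(a["gene_bp"] / a["genes"]),
        # The three BUSCO classes should account for all reference genes.
        "busco_sums": complete + fragmented + missing == BUSCO_TOTAL,
        "busco_pct": tuple(pct(n, BUSCO_TOTAL)
                           for n in (complete, fragmented, missing)),
    }


for name, record in assemblies.items():
    print(name, summarize(record))
```

Recomputed percentages can differ from the printed ones in the last digit, depending on how the source rounded its values.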