2020 Mar 27;10(5):1443–1455. doi: 10.1534/g3.119.400959

Table 1. Genome size estimates and assembly statistics.

| Species | Est. Genome Size (bp) | Total Scaffold Length (bp) | Scaffold NG50 (bp) | Longest Scaffold (bp) | Contig NG50 (bp) | Longest Contig (bp) | Total Gap Length (bp) |
|---|---|---|---|---|---|---|---|
| D. kanapiae | 155,490,160 | 152,203,088 | 389,587 | 2,274,126 | 301,459 | 2,274,126 | 205,953 |
| D. birchii | 169,148,727 | 156,593,892 | 211,718 | 1,501,252 | 164,580 | 1,183,551 | 81,207 |
| D. truncata | 190,688,284 | 167,897,087 | 95,737 | 830,117 | 73,889 | 827,712 | 375,696 |
| D. bocki | 155,095,574 | 151,202,254 | 78,068 | 785,450 | 65,314 | 785,450 | 69,231 |
| D. bunnanda | 181,250,127 | 151,823,105 | 76,713 | 1,142,480 | 65,403 | 1,127,760 | 152,423 |
| D. punjabiensis | 197,448,094 | 192,339,030 | 72,420 | 1,226,934 | 64,043 | 1,083,757 | 153,372 |
| D. jambulina | 179,468,675 | 163,991,206 | 71,637 | 873,064 | 61,348 | 756,687 | 61,365 |
| D. vulcana | 209,187,412 | 187,578,810 | 65,096 | 530,507 | 51,774 | 472,464 | 116,296 |
| D. seguyi | 206,814,592 | 178,856,532 | 63,109 | 891,413 | 54,123 | 891,413 | 233,653 |
| D. mayri | 223,398,425 | 167,807,061 | 62,249 | 2,219,437 | 43,922 | 1,355,909 | 364,662 |
| D. asahinai | 216,977,949 | 189,050,820 | 59,266 | 1,052,132 | 50,528 | 904,342 | 75,577 |
| D. serrata | 184,673,878 | 159,679,625 | 54,224 | 1,091,401 | 43,626 | 718,797 | 79,019 |
| D. lacteicornis | 203,475,870 | 182,681,050 | 53,799 | 1,044,495 | 44,105 | 766,914 | 60,537 |
| D. pectinifera | 220,219,034 | 149,209,000 | 52,632 | 528,734 | 41,478 | 467,725 | 58,142 |
| D. tani | 194,820,185 | 180,972,673 | 48,517 | 921,527 | 44,341 | 780,901 | 135,927 |
| D. rufa | 210,769,271 | 186,167,886 | 43,287 | 498,065 | 38,201 | 498,065 | 80,447 |
| D. watanabei | 182,199,997 | 196,825,890 | 40,952 | 1,045,963 | 36,818 | 656,929 | 135,921 |
| D. auraria | 220,036,088 | 197,420,731 | 38,365 | 491,046 | 35,679 | 491,046 | 130,216 |
| D. cf. bakoue | 219,308,053 | 187,248,584 | 28,924 | 1,045,797 | 27,520 | 598,596 | 58,713 |
| D. leontia | 162,918,854 | 164,601,511 | 26,461 | 331,217 | 23,897 | 301,031 | 91,716 |
| D. burlai | 198,129,694 | 175,666,184 | 24,417 | 628,960 | 23,536 | 628,960 | 153,793 |
| D. nikananu | 217,706,973 | 190,505,469 | 23,001 | 626,542 | 21,180 | 574,375 | 226,904 |
| D. triauraria | 217,036,792 | 197,369,186 | 17,513 | 590,840 | 16,493 | 576,941 | 156,570 |

Genome size estimates were calculated by SGA Preqc (Simpson 2014) based on the k-mer frequency spectrum of the unassembled reads. To calculate the scaffold NG50 (Earl et al. 2011; Bradnam et al. 2013), scaffold lengths were ordered from longest to shortest and then summed, starting with the longest scaffold. The NG50 was the scaffold length that brought the sum above 50% of the estimated genome size. Contig lengths were estimated by splitting scaffolds on every N, including single Ns. Species are listed in decreasing order of scaffold NG50.
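
The NG50 and contig-splitting procedures described in the note above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `ng50` and `contig_lengths` are hypothetical, the threshold is taken as strictly above 50% of the estimated genome size (following the wording of the note), and treating lowercase `n` the same as `N` is an assumption.

```python
import re
from typing import Iterable, List, Optional


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the NG50 of `lengths` relative to an estimated genome size.

    Lengths are sorted from longest to shortest and summed cumulatively.
    The NG50 is the length that first brings the running sum above half
    of `genome_size`. Returns None if the sum never exceeds that threshold.
    """
    half = genome_size / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running > half:
            return length
    return None


def contig_lengths(scaffold: str) -> List[int]:
    """Split a scaffold on every run of N (including single Ns).

    Assumption: lowercase 'n' is treated the same as 'N'.
    Empty pieces (from leading or trailing Ns) are dropped.
    """
    pieces = re.split(r"N+", scaffold.upper())
    return [len(piece) for piece in pieces if piece]


if __name__ == "__main__":
    # Toy data, not taken from the table.
    print(ng50([40, 30, 20, 10], genome_size=100))  # 30
    print(contig_lengths("ACGTNACNNNGG"))           # [4, 2, 2]
```

Passing contig lengths rather than scaffold lengths to `ng50` gives the contig NG50 under the same assumptions. Because the denominator is the estimated genome size rather than the assembly length, the function can return `None` for an assembly that covers less than half of the estimate.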