Table 3.
Summary of Bemisia tabaci s.l. structural annotation with four comparative insect genomes
| Category | Feature | B. tabaci SSA1-SG1-Ug | B. tabaci SSA1-SG1-Ng | B. tabaci SSA2-Ng | B. tabaci SSA3-Ng | B. tabaci Asia II-5 | B. tabaci Uganda-1 | B. argentifolii | B. tabaci s.s. | T. vap.ᵃ | A. pisum | A. gambiae | D. melanogaster | T. castaneum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gene | Total genes | 13,852 | 14,942 | 14,386 | 14,952 | 13,497 | 13,804 | 12,723 | 16,378 | 18,275 | 37,522 | 13,796 | 17,807 | 17,052 |
| | Protein coding | 12,710 | 13,661 | 12,928 | 13,463 | 12,289 | 12,749 | 12,077 | 15,786 | 18,275 | 36,195 | 13,057 | 13,947 | 16,590 |
| Transcript | Total transcripts | 29,757 | 29,919 | 29,035 | 30,073 | 28,862 | 25,101 | 26,475 | 30,266 | 18,275 | 37,522 | 15,718 | 34,920 | 18,996 |
| | Protein coding | 28,022 | 27,923 | 26,825 | 27,844 | 26,928 | 23,614 | 25,522 | 29,609 | 18,275 | 36,195 | 14,979 | 30,588 | 18,534 |
| | % Prot. coding | 94.17 | 93.33 | 92.39 | 92.59 | 93.30 | 94.08 | 96.40 | 97.83 | 100.00 | 96.46 | 95.30 | 87.59 | 97.57 |
| | Avg coding length | 1,893 | 1,805 | 1,604 | 1,575 | 2,019 | 1,383 | 2,255 | 1,506 | 1,382 | 1,983 | 2,508 | 2,280 | 1,765 |
| Exon | Total coding exons | 248,703 | 225,850 | 201,198 | 204,013 | 248,187 | 144,581 | 254,773 | 228,346 | 94,682 | 182,028 | 71,504 | 181,712 | 98,403 |
| | Avg translatable exon length | 219.12 | 202.2 | 173.69 | 172.47 | 198.64 | 165.05 | 213.73 | 200.71 | 308 | 208.55 | 371.24 | 378.82 | 288.26 |
| | Avg exons per transcript | 8.2 | 7.2 | 6.4 | 6.3 | 8.1 | 5.1 | 8.9 | 6.8 | 5.2 | 4.8 | 3.7 | 4.9 | 4.3 |
| Intron | Total introns | 220,681 | 197,927 | 174,373 | 176,169 | 221,259 | 120,967 | 229,251 | 198,737 | 92,160 | 145,833 | 56,525 | 151,124 | 79,869 |
| | Avg length | 3,661 | 3,051 | 3,325 | 3,148 | 3,641 | 2,113 | 3,409 | 2,143 | 1,846 | 1,658 | 1,862 | 1,606 | 1,424 |
| | Total length | 807,954,736 | 603,944,323 | 579,736,570 | 554,533,025 | 805,703,415 | 255,575,888 | 781,519,935 | 425,841,115 | 170,187,655 | 241,821,138 | 105,259,352 | 242,737,713 | 113,707,069 |
| 5'UTR (canonical) | Total 5'UTRs | 5,496 | 7,587 | 7,572 | 8,017 | 7,398 | 7,218 | 8,670 | 9,130 | 10,877 | 32,115 | 10,118 | 13,394 | 13,522 |
| | Avg length | 5,202 | 3,958 | 4,837 | 3,985 | 5,838 | 1,921 | 4,018 | 1,750 | 192 | 1,661 | 1,841 | 1,441 | 1,086 |
| | Complete length | 28,591,304 | 30,027,509 | 36,622,299 | 31,951,687 | 43,192,086 | 13,862,775 | 34,834,074 | 15,973,160 | 2,091,440 | 53,344,412 | 18,630,451 | 19,296,959 | 14,683,753 |
| 3'UTR (canonical) | Total 3'UTRs | 5,180 | 7,080 | 6,748 | 7,303 | 6,751 | 7,278 | 8,010 | 8,880 | 9,033 | 33,840 | 9,551 | 13,495 | 13,504 |
| | Avg length | 983 | 950 | 1,000 | 968 | 1,141 | 864 | 869 | 530 | 323 | 1,186 | 789 | 517 | 282 |
| | Complete length | 5,093,086 | 6,723,389 | 6,746,881 | 7,071,323 | 7,702,287 | 6,291,556 | 6,962,565 | 4,707,060 | 2,919,282 | 40,126,738 | 7,533,191 | 6,970,762 | 3,812,572 |
| BUSCO (Insecta n = 1,367) [%] | Complete BUSCO | 91.8 | 94.8 | 95.1 | 93.6 | 95.0 | 78.0 | 98.0 | 93.3 | 94.7 | 96.8 | 98.0 | 99.6 | 99.3 |
| | Single copy | 82.4 | 90.6 | 91.7 | 89.4 | 91.1 | 72.1 | 96.1 | 72.6 | 92.5 | 90.7 | 96.2 | 99.0 | 98.8 |
| | Duplicated | 9.4 | 4.2 | 3.4 | 4.2 | 3.9 | 5.9 | 1.9 | 20.7 | 2.2 | 6.1 | 1.8 | 0.6 | 0.5 |
| | Fragmented | 1.3 | 2.0 | 2.4 | 3.6 | 1.2 | 6.9 | 1.2 | 1.9 | 0.6 | 0.7 | 0.8 | 0.1 | 0.4 |
| | Missing | 6.9 | 3.2 | 2.5 | 2.8 | 3.8 | 15.1 | 0.8 | 4.8 | 4.7 | 2.5 | 1.2 | 0.3 | 0.3 |
ᵃ T. vap. = T. vaporariorum; D. melan. = D. melanogaster
Summary statistics for the B. tabaci s.l. genomes and four comparative insect genomes. The new B. tabaci s.l. genomes and those of B. argentifolii and B. tabaci s.s. were annotated using the same methodology (see M&M). Features are grouped into genes, transcripts, exons, introns, and 5' and 3' UTRs. In the BUSCO analysis against the Insecta ortholog set (OrthoDB v9; n = 1,658), the percentage of single-copy orthologs recovered was above 82% for all cassava whitefly populations. The B. tabaci Uganda-1 genome had the highest percentage of missing genes from the BUSCO ortholog set (15.1%). See Additional file 1: Table S3 for additional BUSCO ortholog set analyses: Arthropoda (n = 1,066), Metazoa (n = 978) and Eukaryota (n = 303).
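
The derived rows of Table 3 can be recomputed from the raw counts, which is a useful sanity check when reusing the numbers. The following is a minimal sketch in Python, not the authors' pipeline: the dictionary layout and the function names `pct_protein_coding` and `busco_is_consistent` are hypothetical, and only three of the thirteen genomes are transcribed. It assumes that "% Prot. coding" is protein-coding transcripts divided by total transcripts, that Complete BUSCO equals Single copy plus Duplicated, and that Complete, Fragmented and Missing sum to 100%.

```python
# Values transcribed from Table 3 for three genomes (illustrative subset).
TABLE3 = {
    "B. tabaci SSA1-SG1-Ug": {
        "transcripts": 29757, "coding_transcripts": 28022, "pct_coding": 94.17,
        "complete": 91.8, "single": 82.4, "duplicated": 9.4,
        "fragmented": 1.3, "missing": 6.9,
    },
    "B. tabaci Uganda-1": {
        "transcripts": 25101, "coding_transcripts": 23614, "pct_coding": 94.08,
        "complete": 78.0, "single": 72.1, "duplicated": 5.9,
        "fragmented": 6.9, "missing": 15.1,
    },
    "D. melanogaster": {
        "transcripts": 34920, "coding_transcripts": 30588, "pct_coding": 87.59,
        "complete": 99.6, "single": 99.0, "duplicated": 0.6,
        "fragmented": 0.1, "missing": 0.3,
    },
}


def pct_protein_coding(row):
    """Percentage of transcripts that are protein coding (assumed definition)."""
    return round(100 * row["coding_transcripts"] / row["transcripts"], 2)


def busco_is_consistent(row, tol=0.11):
    """Check the two BUSCO identities within a rounding tolerance.

    The tolerance allows for each reported percentage being rounded
    to one decimal place.
    """
    complete = row["single"] + row["duplicated"]
    total = row["complete"] + row["fragmented"] + row["missing"]
    return abs(complete - row["complete"]) <= tol and abs(total - 100.0) <= tol


if __name__ == "__main__":
    for name, row in TABLE3.items():
        print(name, pct_protein_coding(row), busco_is_consistent(row))
```

For the three genomes transcribed here, the recomputed percentages match the "% Prot. coding" row and both BUSCO identities hold within rounding.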