Table 2.
Transcriptome metrics | One-step assembly: de novo (Trinity) | Two-step assembly: EST-based | Two-step assembly: de novo (Velvet/Oases) | Final assembly: transcripts<sup>a</sup> | Final assembly: unigenes<sup>b</sup> |
---|---|---|---|---|---|
Sequence number | 255,105 | 27,179 | 51,038 | 84,882 | 77,022 |
Sequence sizes (%) | |||||
≤500 bp | 35.9 | 17.8 | 19.9 | 19.0 | 17.5 |
501–1000 bp | 24.7 | 37.4 | 35.5 | 32.1 | 31.7 |
1001–1500 bp | 15.7 | 25.8 | 23.8 | 20.7 | 21.1 |
1501–2000 bp | 10.6 | 12.2 | 11.9 | 13.2 | 13.8 |
2001–2500 bp | 5.9 | 4.3 | 5.2 | 7.2 | 7.6 |
2501–3000 bp | 3.2 | 1.7 | 2.1 | 3.6 | 3.8 |
>3000 bp | 4.0 | 0.9 | 1.5 | 4.2 | 4.4 |
N50 (bp) | 1586 | 1258 | 1318 | 1591 | 1611 |
N90 (bp) | 469 | 577 | 566 | 605 | 623 |
Mean contig length (bp) | 1048.0 | 1044.0 | 1065.2 | 1214.4 | 1235.2 |
Transcriptome size (Mb) | 267.4 | 28.4 | 54.4 | 103.1 | 96.1 |
Read mapping back (%) | |||||
Mapped | 96.2 | 48.4 | 69.5 | 95.9 | 94.2 |
Properly paired | 81.9 | 58.3 | 66.5 | 81.2 | 80.7 |
BUSCO evaluation (%) | |||||
Completeness | 89.9 | 20.2 | 58.7 | 89.8 | 89.6 |
Single copy | 4.2 | 13.9 | 50.3 | 65.6 | 73.8 |
Duplicated | 85.7 | 6.3 | 8.4 | 24.2 | 15.8 |
Fragmented | 5.1 | 8.5 | 13.3 | 3.9 | 3.9 |
Missing | 6.4 | 71.3 | 28.0 | 6.3 | 6.3 |

<sup>a</sup> Final output from merging the one-step and two-step assemblies.
<sup>b</sup> Contigs were clustered with CD-HIT; the longest transcript in each isoform cluster was selected as its representative (i.e., the unigene).
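
For readers who wish to recompute the length-based metrics reported in the table (N50, N90, mean contig length, transcriptome size), the following is a minimal sketch, not the pipeline used to produce the table. It assumes the common definition of Nx (the contig length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches x% of the total bases). The function names `nx` and `summarize` and the example lengths are hypothetical.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of contig lengths (bp).

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the running total first reaches
    x percent of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    threshold = sum(lengths) * x / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


def summarize(lengths):
    """Collect the length-based metrics for one assembly."""
    total = sum(lengths)
    return {
        "sequence_number": len(lengths),
        "n50_bp": nx(lengths, 50),
        "n90_bp": nx(lengths, 90),
        "mean_length_bp": total / len(lengths),
        "size_mb": total / 1e6,  # total assembled bases in megabases
    }


# Hypothetical contig lengths, for illustration only (not data from the table).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
stats = summarize(example_lengths)
```

In practice the list of lengths would be read from the assembly FASTA file; that step is omitted here to keep the sketch free of third-party dependencies.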