Table 1.
Dataset type | | Illumina-derived datasets | | | | | | Simulated datasets | |
---|---|---|---|---|---|---|---|---|---|
Species | Stepᵃ | L. stagnalis | S. cerevisiae | Caenorhabditis sp. | D. simulans | H. rhamnoides | N. benthamiana | D. melanogaster | C. elegans |
Number of concatenated transcripts | 6 | 576,412 | 25,854 | 152,491 | 184,892 | 278,987 | 885,944 | 42,535 | 15,340 |
CDS number | 7 | 139,727 | 22,180 | 112,813 | 81,598 | 137,601 | 379,596 | 37,920 | 41,103 |
uniCDS numberᵇ | 8 | 59,178 (58%) | 9,942 (55%) | 40,116 (64%) | 27,735 (66%) | 63,092 (54%) | 131,656 (65%) | 12,118 (68%) | 14,890 (64%) |
Total transcript number | 9 | 58,185 | 9,744 | 39,022 | 26,968 | 61,798 | 127,526 | 11,582 | 14,283 |
Total CDS number | 9 | 64,659 | 11,605 | 51,416 | 34,363 | 68,288 | 153,118 | 14,231 | 15,412 |
Transcripts with multiple CDSsᶜ | 9 | 5,759 (10%) | 1,529 (15%) | 9,756 (19%) | 5,838 (22%) | 5,999 (10%) | 21,060 (17%) | 2,218 (19%) | 949 (7%) |
Redundant CDSsᵈ | 9 | 5,481 (9%) | 1,663 (14%) | 11,300 (22%) | 6,628 (19%) | 5,196 (8%) | 21,462 (14%) | 2,113 (15%) | 522 (3%) |
Transcriptome size (bp) | 9 | 131,591,076 | 16,164,888 | 69,689,679 | 69,421,322 | 86,181,833 | 206,036,224 | 34,121,269 | 19,765,122 |
Smallest transcript (bp) | 9 | 300 | 300 | 300 | 300 | 300 | 300 | 300 | 300 |
Largest transcript (bp) | 9 | 35,470 | 15,061 | 21,466 | 51,362 | 13,117 | 19,833 | 29,220 | 26,756 |
N50 | 9 | 3,483 | 2,414 | 2,366 | 3,866 | 1,823 | 2,116 | 4,479 | 1,666 |
ᵃ Step number in Fig. 1
ᵇ The proportion of discarded CDSs is indicated in brackets
ᶜ The proportion of transcripts with >1 CDS is indicated in brackets
ᵈ The proportion of non-unique CDSs is indicated in brackets
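
The bracketed percentages in the uniCDS row can be checked directly from the table. The sketch below assumes that the discarded proportion is (CDS number − uniCDS number) / CDS number, rounded to the nearest percent. The table does not state this formula explicitly, but it reproduces all eight tabulated values. The function and variable names are illustrative only.

```python
# CDS number (step 7) and uniCDS number (step 8), copied from Table 1.
counts = {
    "L. stagnalis": (139_727, 59_178),
    "S. cerevisiae": (22_180, 9_942),
    "Caenorhabditis sp.": (112_813, 40_116),
    "D. simulans": (81_598, 27_735),
    "H. rhamnoides": (137_601, 63_092),
    "N. benthamiana": (379_596, 131_656),
    "D. melanogaster": (37_920, 12_118),
    "C. elegans": (41_103, 14_890),
}


def discarded_percent(cds_total: int, unicds: int) -> int:
    """Percentage of CDSs discarded when keeping only unique CDSs.

    Assumed formula: (total - unique) / total, rounded to a whole percent.
    """
    return round(100 * (cds_total - unicds) / cds_total)


percentages = {name: discarded_percent(*pair) for name, pair in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of CDSs discarded")
```

The same helper applies to any pair of before and after counts, but the denominators behind footnotes c and d are not stated in the table, so they are not reproduced here.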