2014 Jul 14;15(Suppl 5):S6. doi: 10.1186/1471-2164-15-S5-S6

Table 3.

Comparison of the Drosophila transcriptome assemblies produced by our postprocessing algorithm, Oases, and Trans-ABySS from six publicly available libraries, over different values of the k-mer coverage cutoff c.

postprocess
| k_c | initial nodes | largest tangle | largest SCC | splicing graphs | max length | N50 | >1-node graphs | max nodes | avg nodes | SNPs | total hits | unique hits | >1-hit graphs | max hits | time (mins) | memory (GB) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 35_3 | 227614 | 178545 | 88094 | 75367 | 10539 | 544 | 2048 | 124 | 6 | 16703 | 38448 | 10719 | 392 | 5 | 86,18 | 22,2 |
| 35_5 | 125414 | 87895 | 41654 | 47958 | 8678 | 705 | 1720 | 93 | 6 | 11334 | 27010 | 9889 | 429 | 13 | 86,17 | 22,2 |
| 35_10 | 57978 | 31785 | 12695 | 27695 | 6383 | 705 | 1020 | 63 | 6 | 5034 | 17271 | 8070 | 308 | 5 | 86,16 | 22,2 |

Oases

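| k_c | locus | max length | N50 | >1-trans locus | max trans | avg trans | total hits | unique hits | >1-hit locus | max hits | time (mins) | memory (GB) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 35_3 | 39584 | 15586 | 801 | 3824 | 13 | 3 | 29928 | 10898 | 256 | 4 | 94,28 | 29,32 |
| 35_5 | 28537 | 15586 | 936 | 2616 | 16 | 3 | 22460 | 10103 | 245 | 4 | 94,26 | 29,30 |
| 35_10 | 17075 | 11104 | 982 | 1377 | 14 | 3 | 13800 | 8201 | 185 | 5 | 94,24 | 29,26 |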

Trans-ABySS

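| k_c | trans | max length | N50 | >1-node trans | max nodes | avg nodes | total hits | unique hits | time (mins) | memory (GB) |
|---|---|---|---|---|---|---|---|---|---|---|
| 35_3 | 91365 | 15586 | 898 | 50467 | 60 | 8 | 33600 | 10639 | 205,1 | 4,1 |
| 35_5 | 55164 | 10582 | 997 | 27763 | 46 | 7 | 25779 | 9944 | 195,1 | 4,1 |
| 35_10 | 28455 | 8865 | 929 | 13665 | 43 | 6 | 16032 | 8154 | 178,1 | 4,1 |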

The k-mer length is fixed at 35 because Oases can assemble these libraries on machines with 32 GB of physical memory only when k is large. For our postprocessing algorithm, the notation is the same as in Table 1. For Oases: locus denotes the number of predicted loci; max length denotes the length of the longest predicted transcript; N50 denotes the N50 value of the longest transcript length in each predicted locus; >1-trans locus denotes the number of predicted loci with more than one transcript; max trans denotes the maximum number of transcripts in a predicted locus; avg trans denotes the average number of transcripts in predicted loci with more than one transcript; total hits denotes the total number of hits from a translated BLAST search of each predicted transcript against Drosophila (isoforms are considered the same gene, only the top hit with E-value below 10^-7 is considered for each transcript in a predicted locus, and hits from transcripts within the same predicted locus to the same gene are counted once); unique hits denotes the number of unique hits to different genes; >1-hit locus denotes the number of predicted loci that have BLAST hits to more than one gene; max hits denotes the maximum number of different genes that have BLAST hits to a predicted locus; time (mins) denotes the computational time in minutes, with the values to the left and right of the "," giving the running times of Velvet (without setting cov_cutoff) and Oases, respectively; and memory (GB) denotes the memory requirement in gigabytes, with the values to the left and right of the "," giving the memory requirements of Velvet (without setting cov_cutoff) and Oases, respectively.
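
The hit-counting rule in the note above can be sketched in a few lines. This is only an illustration of one reading of that rule, not the authors' pipeline: the function name `summarize_hits` and the input layout (one top BLAST hit per transcript, given as a locus identifier, a gene identifier with isoforms already collapsed, and an E-value) are assumptions made for the example.

```python
from collections import defaultdict

# E-value threshold stated in the table note (an upper bound, exclusive).
E_VALUE_CUTOFF = 1e-7


def summarize_hits(top_hits, cutoff=E_VALUE_CUTOFF):
    """Summarize BLAST hits per predicted locus.

    top_hits: iterable of (locus_id, gene_id, e_value) tuples, one top hit
    per transcript, with isoforms already mapped to their gene (hypothetical
    input layout chosen for this sketch).
    """
    genes_by_locus = defaultdict(set)
    for locus_id, gene_id, e_value in top_hits:
        if e_value < cutoff:
            # A set makes repeated hits from one locus to one gene count once.
            genes_by_locus[locus_id].add(gene_id)

    per_locus_counts = [len(genes) for genes in genes_by_locus.values()]
    all_genes = set().union(*genes_by_locus.values())
    return {
        "total_hits": sum(per_locus_counts),
        "unique_hits": len(all_genes),
        "multi_hit_loci": sum(1 for n in per_locus_counts if n > 1),
        "max_hits": max(per_locus_counts, default=0),
    }
```

Under this reading, total hits sums the distinct genes hit by each locus, so a gene hit by two different loci contributes twice to total hits but only once to unique hits.
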
For Trans-ABySS: trans denotes the total number of predicted transcripts; max length denotes the length of the longest predicted transcript; N50 denotes the N50 value of the lengths of the predicted transcripts; >1-node trans denotes the number of predicted transcripts that are the concatenation of more than one node; max nodes denotes the maximum number of nodes in a predicted transcript; avg nodes denotes the average number of nodes in predicted transcripts with more than one node; total hits denotes the total number of predicted transcripts that have BLAST hits; unique hits denotes the number of unique hits to different genes; time (mins) denotes the computational time in minutes, with the values to the left and right of the "," giving the running times of ABySS and Trans-ABySS, respectively; and memory (GB) denotes the memory requirement in gigabytes, with the values to the left and right of the "," giving the memory requirements of ABySS and Trans-ABySS, respectively.
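
For readers unfamiliar with the N50 columns, the sketch below uses the common definition of N50: the length L such that sequences of length at least L account for at least half of the total length. The notes do not spell out the exact variant the authors used, so this definition is an assumption. The helper `longest_per_locus` is a hypothetical name; it illustrates the Oases case, where N50 is taken over the longest transcript of each predicted locus, while for Trans-ABySS the lengths of all predicted transcripts would be passed to `n50` directly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First length at which the cumulative sum reaches half the total.
        if running >= half_total:
            return length


def longest_per_locus(transcripts):
    """Keep only the longest transcript length of each locus.

    transcripts: iterable of (locus_id, length) pairs (hypothetical layout).
    """
    longest = {}
    for locus_id, length in transcripts:
        longest[locus_id] = max(length, longest.get(locus_id, 0))
    return list(longest.values())
```

For example, `n50(longest_per_locus(pairs))` would give an Oases-style value from a list of `(locus_id, length)` pairs, and `n50(lengths)` a Trans-ABySS-style value from a plain list of transcript lengths.
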