Table 1.
k_c | initial nodes | largest tangle | largest SCC | splicing graphs | max length | N50 | >1-node graphs | max nodes | avg nodes | SNPs | total hits | unique hits | >1-hit graphs | max hits | time (mins) | memory (GB) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
25_3 | 38884 | 17900 | 9937 | 15713 | 37380 | 2366 | 1361 | 3106 | 10 | 883 | 12731 | 10162 | 643 | 27 | 80,3 | 21,2 |
25_5 | 34822 | 16979 | 9255 | 15521 | 37380 | 2374 | 1351 | 266 | 7 | 517 | 12708 | 10160 | 643 | 27 | 80,3 | 21,2 |
25_10 | 34494 | 16712 | 9057 | 15486 | 37380 | 2373 | 1345 | 194 | 7 | 481 | 12699 | 10158 | 639 | 27 | 80,3 | 21,2 |
31_3 | 28342 | 5037 | 2080 | 13819 | 45158 | 2704 | 1719 | 1007 | 7 | 496 | 12523 | 11112 | 546 | 12 | 76,3 | 18,2 |
31_5 | 27307 | 4971 | 1898 | 13740 | 45158 | 2714 | 1717 | 167 | 6 | 381 | 12494 | 11110 | 552 | 13 | 76,3 | 18,2 |
31_10 | 27265 | 4947 | 1885 | 13829 | 45158 | 2704 | 1698 | 161 | 6 | 377 | 12536 | 11109 | 542 | 13 | 76,3 | 18,2 |
Initial nodes denotes the number of nodes in the initial assembly. Largest tangle denotes the number of nodes in the largest connected component. Largest SCC denotes the number of nodes in the largest strongly connected component. Splicing graphs denotes the number of splicing graphs. Max length denotes the length (in nucleotides) of the longest path over all splicing graphs. N50 denotes the N50 value of the lengths (in nucleotides) of the longest path in each graph. >1-node graphs denotes the number of graphs with more than one node. Max nodes denotes the maximum number of nodes in these non-linear graphs. Avg nodes denotes the average number of nodes in these non-linear graphs. SNPs denotes the number of SNPs recovered. Total hits denotes the total number of hits from a translated BLAST search of each node against Drosophila (isoforms are considered the same gene, only the top hit with E-value below 10⁻⁷ is included for each node in a splicing graph, and hits from nodes within the same splicing graph to the same gene are counted once). Unique hits denotes the number of distinct genes hit. >1-hit graphs denotes the number of splicing graphs that have BLAST hits to more than one gene. Max hits denotes the maximum number of different genes hit by a single splicing graph. Time (mins) denotes the computational time in minutes, with the values to the left and to the right of "," indicating the running time of Velvet and of our postprocessing algorithm, respectively. Memory (GB) denotes the memory requirement in gigabytes, with the values to the left and to the right of "," indicating the memory requirement of Velvet and of our postprocessing algorithm, respectively.
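The N50 and hit-count columns can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `n50` and `count_hits`, the `(graph_id, gene, evalue)` input layout, and the choice of the common N50 convention (the largest length L such that paths of length at least L cover at least half of the total) are assumptions made for illustration. The hit counting follows one reading of the legend: one top hit per node, filtered by E-value, then deduplicated by gene within each splicing graph.

```python
from collections import defaultdict


def n50(lengths):
    """N50 under the common convention (an assumption, not taken from the paper):
    sort lengths in decreasing order and return the length at which the
    running sum first reaches half of the total."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def count_hits(node_hits, evalue_cutoff=1e-7):
    """Summarise BLAST hits per splicing graph.

    node_hits: iterable of (graph_id, gene, evalue) tuples, one top hit per
    node (hypothetical layout). Isoforms are assumed to be already mapped
    to a single gene identifier.
    """
    genes_per_graph = defaultdict(set)
    for graph_id, gene, evalue in node_hits:
        if evalue < evalue_cutoff:
            # A set makes repeated hits to the same gene within a graph count once.
            genes_per_graph[graph_id].add(gene)

    gene_sets = list(genes_per_graph.values())
    return {
        "total_hits": sum(len(genes) for genes in gene_sets),
        "unique_hits": len(set().union(*gene_sets)),
        "multi_hit_graphs": sum(1 for genes in gene_sets if len(genes) > 1),
        "max_hits": max((len(genes) for genes in gene_sets), default=0),
    }


if __name__ == "__main__":
    print(n50([5, 4, 3, 2, 1]))
    toy_hits = [
        ("g1", "A", 1e-20),
        ("g1", "A", 1e-10),  # same gene, same graph: counted once
        ("g1", "B", 1e-9),
        ("g2", "A", 1e-30),  # same gene, different graph: a separate total hit
        ("g3", "C", 1e-3),   # fails the E-value cutoff
    ]
    print(count_hits(toy_hits))
```

Under this reading, total hits can exceed unique hits whenever the same gene is hit by more than one splicing graph, which is consistent with the gap between the two columns in the table.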