Table 3.
The increase in file size observed for each genome added to the graph demonstrates the compression that occurs when regions of shared aligned sequence are collapsed into single representative nodes.
| Number of genomes | 1 | 2 | 3 | 4 | 5 | 6 | 10 |
|---|---|---|---|---|---|---|---|
| File size | 4.5 Mb | 5.9 Mb | 7.6 Mb | 8.5 Mb | 11 Mb | 13 Mb | 38 Mb |
| Number of nodes | 0 | 3,690 | 8,106 | 9,320 | 13,264 | 15,355 | 43,290 |
| Number of edges | 0 | 4,886 | 10,868 | 12,485 | 17,823 | 22,296 | 73,652 |
The degree of compression depends on the similarity of the sequences: sequences that differ by only a few bases require only a few additional nodes (Additional file 3).
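The file sizes in Table 3 can be compared against a naive baseline in which every genome is stored separately. The sketch below is illustrative only: it assumes (this is not stated in the source) that each genome stored on its own would produce a file about the size of the single-genome file, and the names `size_ratio` and `ratios` are hypothetical helpers introduced here.

```python
# Values copied from Table 3 (file sizes in Mb).
genomes = [1, 2, 3, 4, 5, 6, 10]
file_size_mb = [4.5, 5.9, 7.6, 8.5, 11.0, 13.0, 38.0]


def size_ratio(n_genomes, graph_size, single_size):
    """Graph file size relative to n separate copies of the single-genome file.

    ASSUMPTION: every genome, stored on its own, would take roughly
    `single_size`; the source does not report per-genome file sizes.
    """
    return graph_size / (n_genomes * single_size)


single = file_size_mb[0]  # size of the graph holding one genome

# Ratio of the observed graph size to the naive uncollapsed size.
ratios = {
    n: round(size_ratio(n, size, single), 2)
    for n, size in zip(genomes, file_size_mb)
}

for n, ratio in ratios.items():
    print(f"{n:>2} genomes: {ratio:.2f} of the naive uncollapsed size")
```

Under this assumption, a ratio below 1 means the graph file is smaller than storing the genomes separately. How far below 1 it falls would depend on how similar the added sequences are to those already in the graph.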