2015 Mar 26;31(15):2482–2488. doi: 10.1093/bioinformatics/btv179

Table 2.

Runtime as a function of the number of parallel tasks (mappers/reducers) on the Intel Big Data cluster and Amazon EMR

Cluster                 No. worker nodes  No. parallel tasks  No. CPU cores  Runtime
Intel Big Data cluster  1                 3                   18             47 h 59 min
Intel Big Data cluster  4                 15                  90             9 h 54 min
Intel Big Data cluster  8                 31                  186            4 h 50 min
Intel Big Data cluster  15                59                  354            2 h 39 min
Amazon EMR              1                 4                   32             38 h 38 min
Amazon EMR              2                 8                   64             20 h 19 min
Amazon EMR              4                 16                  128            10 h 20 min
Amazon EMR              8                 32                  256            5 h 13 min
Amazon EMR              16                64                  512            2 h 44 min

The time for uploading data to S3 over the internet is not included in the runtimes for Amazon EMR.
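
One way to read the runtimes in Table 2 is to convert them into speedup and parallel efficiency. The sketch below is illustrative only: it takes the smallest configuration of each cluster as the baseline and defines efficiency as speedup divided by the increase in CPU cores. That definition is an assumption made here, not a metric stated in the source, and the names `TABLE_2`, `to_minutes`, and `scaling` are hypothetical.

```python
import re

# Rows copied from Table 2:
# (cluster, worker nodes, parallel tasks, CPU cores, runtime)
TABLE_2 = [
    ("Intel Big Data cluster", 1, 3, 18, "47 h 59 min"),
    ("Intel Big Data cluster", 4, 15, 90, "9 h 54 min"),
    ("Intel Big Data cluster", 8, 31, 186, "4 h 50 min"),
    ("Intel Big Data cluster", 15, 59, 354, "2 h 39 min"),
    ("Amazon EMR", 1, 4, 32, "38 h 38 min"),
    ("Amazon EMR", 2, 8, 64, "20 h 19 min"),
    ("Amazon EMR", 4, 16, 128, "10 h 20 min"),
    ("Amazon EMR", 8, 32, 256, "5 h 13 min"),
    ("Amazon EMR", 16, 64, 512, "2 h 44 min"),
]


def to_minutes(runtime: str) -> int:
    """Convert a runtime such as '9 h 54 min' to whole minutes."""
    match = re.fullmatch(r"(\d+) h (\d+) min", runtime)
    if match is None:
        raise ValueError(f"unrecognised runtime: {runtime!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


def scaling(rows):
    """Return (cluster, cores, speedup, efficiency) for each row.

    Assumption: the first row seen for a cluster is its baseline.
    Efficiency is speedup divided by the ratio of CPU cores to the baseline.
    """
    baselines = {}
    results = []
    for cluster, _nodes, _tasks, cores, runtime in rows:
        minutes = to_minutes(runtime)
        # setdefault stores the baseline only on the first row of a cluster
        base_cores, base_minutes = baselines.setdefault(cluster, (cores, minutes))
        speedup = base_minutes / minutes
        efficiency = speedup / (cores / base_cores)
        results.append((cluster, cores, speedup, efficiency))
    return results


if __name__ == "__main__":
    for cluster, cores, speedup, efficiency in scaling(TABLE_2):
        print(f"{cluster:<24}{cores:>5} cores  "
              f"speedup {speedup:6.2f}  efficiency {efficiency:5.2f}")
```

Under this definition, every configuration in the table stays above roughly 0.88 efficiency, which is consistent with the near-linear scaling the runtimes suggest. The Amazon EMR figures exclude the S3 upload time, as noted above.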