2024 Nov 19;17:273. doi: 10.1186/s12920-024-02037-9

Fig. 19.

Performance gain from parallel processing that combines vectorized instructions with multiple cores. The performance gain (performance normalized to the baseline) is calculated as the ratio of the throughput achieved with vectorized instructions on multiple cores to the throughput of single-core execution. The optimal number of cores for both the z-test and linear regression workloads is 16, the point at which AVX512 overhead is reached, whereas for the cluster workload it is 32, because that workload is less affected by AVX512 overhead
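
The normalized gain described in the caption can be sketched as a simple throughput ratio. The following is a minimal illustration only, not the authors' code: the function names `normalized_gain` and `best_core_count` are hypothetical, and the throughput values are made-up placeholders, not measurements from the paper.

```python
def normalized_gain(throughput: float, baseline_throughput: float) -> float:
    """Return throughput normalized to the single-core baseline."""
    if baseline_throughput <= 0:
        raise ValueError("baseline throughput must be positive")
    return throughput / baseline_throughput


def best_core_count(throughput_by_cores: dict, baseline_throughput: float):
    """Return the core count with the highest normalized gain, plus all gains."""
    gains = {
        cores: normalized_gain(t, baseline_throughput)
        for cores, t in throughput_by_cores.items()
    }
    best = max(gains, key=gains.get)
    return best, gains


# Placeholder numbers for illustration only (NOT taken from the paper).
baseline = 100.0  # hypothetical single-core throughput, items per second
vectorized_multicore = {8: 2800.0, 16: 4800.0, 32: 4500.0}

best, gains = best_core_count(vectorized_multicore, baseline)
print(best, gains[best])
```

With these placeholder values the gain peaks at 16 cores and drops at 32, which mirrors the shape the caption reports for the z-test and linear regression workloads.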