2018 Nov 9;7:e42166. doi: 10.7554/eLife.42166

Figure 2. Accelerated CPU performance. (A) Even with specific vector instructions disabled, RELION-3 runs faster than RELION-2 on the previous-generation Broadwell processors that are ubiquitous in cryo-EM clusters worldwide. Enabling vectorisation during compilation with the Intel compiler benefits the new streamlined code path, but not the legacy code. (B) On latest-generation Skylake CPUs, the difference is much larger even with only AVX2 vectorisation enabled, and with the new AVX512 instructions enabled the performance is roughly 4.5x higher than the legacy code path. (C) The accelerated CPU code executing on dual-socket x86 nodes provides cost-efficiency that is at least approaching that of professional-class GPU hardware (but not consumer GPUs).

Figure 2—figure supplement 1. FSCs comparing half-sets for each type of run (legacy-CPU, acc-CPU, and acc-GPU) show that all reach the same resolution and sampling accuracy.

Additional FSCs compare the acc-CPU and acc-GPU reconstructions against the legacy-CPU reconstruction as a baseline, showing numerical agreement beyond the reconstructed signal threshold. This validates that the quality of the results has not been compromised by the use of lower precision or compiler optimisation.
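
For readers unfamiliar with the comparison used here, the following is a minimal NumPy sketch of a Fourier shell correlation (FSC) between two cubic maps. It is not RELION's implementation: the function name `fourier_shell_correlation`, the integer-radius shell binning, and the cut-off at the Nyquist shell are all assumptions made for illustration.

```python
import numpy as np


def fourier_shell_correlation(vol_a, vol_b):
    """Return the FSC per integer-radius shell for two cubic 3D volumes.

    Hypothetical helper for illustration only; shells run from 0 (DC)
    up to and including the Nyquist radius n // 2.
    """
    vol_a = np.asarray(vol_a, dtype=float)
    vol_b = np.asarray(vol_b, dtype=float)
    if vol_a.ndim != 3 or vol_a.shape != vol_b.shape or len(set(vol_a.shape)) != 1:
        raise ValueError("expected two cubic 3D volumes of identical shape")

    n = vol_a.shape[0]
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)

    # Integer frequency index along each axis, then radius of every voxel.
    freqs = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2 + 1
    mask = shell < n_shells          # ignore corners beyond Nyquist
    idx = shell[mask]

    # Per-shell sums of the cross term and of each map's power.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[mask], minlength=n_shells)
    pow_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[mask], minlength=n_shells)
    pow_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[mask], minlength=n_shells)

    denom = np.sqrt(pow_a * pow_b)
    # Shells with no power are reported as 0 rather than dividing by zero.
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
```

Passing two half-set maps gives the usual half-map curve, while passing one reconstruction from each code path (for example acc-CPU and legacy-CPU) gives the kind of cross-comparison described in the caption. Values near 1 in a shell indicate that the two maps agree at that spatial frequency.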