2019 Aug 21;10(9):4711–4726. doi: 10.1364/BOE.10.004711

Table 2. Performance increase for each FullMonteCUDA optimization over FullMonteSW.

| Optimization | Incremental Speedup |
| --- | --- |
| Naive | 2x |
| CUDA vector datatypes and math operations | 2.5x |
| Materials constant cache | 1.6x |
| Thread local accumulation cache | 1.3x |
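
To see what the incremental figures imply in aggregate, the short sketch below multiplies them in the order listed. It assumes that each factor is measured relative to the preceding optimization stage, so that the speedups compound multiplicatively; the table itself does not state this explicitly. The names `incremental` and `cumulative_speedups` are illustrative only and do not come from FullMonte.

```python
# Incremental speedups copied from Table 2, in the order listed.
incremental = [
    ("Naive", 2.0),
    ("CUDA vector datatypes and math operations", 2.5),
    ("Materials constant cache", 1.6),
    ("Thread local accumulation cache", 1.3),
]


def cumulative_speedups(steps):
    """Return (name, running product) pairs.

    Assumption: each factor compounds on the previous stage.
    """
    total = 1.0
    results = []
    for name, factor in steps:
        total *= factor
        results.append((name, total))
    return results


if __name__ == "__main__":
    for name, total in cumulative_speedups(incremental):
        print(f"{name}: {total:.1f}x cumulative")
```

Under that assumption, the running totals are roughly 2x, 5x, 8x, and 10.4x, so the four stages together would correspond to about a tenfold speedup over FullMonteSW.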