Table 3.
Parameter | Energy |
---|---|
E m | 5 pJ |
E int32−add | 0.1 pJ |
E int32−mult | 3.1 pJ |
E float32−add | 0.9 pJ |
E float32−mult | 3.7 pJ |
Compare (float32) a | 0.9 pJ |
Division (float32) b | 26.3 pJ |
Other single-cycle floating-point operations c | 3.7 pJ |
E m is half the cache access energy given in the energy cost table for 45 nm at 0.9 V from Horowitz (45) for an 8 kB, 64-bit-wide cache access, because only 32-bit loads are considered in this study. The E float32−add, E float32−mult, E int32−add, and E int32−mult values are also taken from Horowitz (45).
a A compare operation can be performed as a subtraction, which is essentially an addition, so the same energy is assumed for the compare operation.

b A benchmark on the Ambiq Apollo 3 Blue with an ARM Cortex-M4F processor showed that a float division consumes 7.1 times the energy of a float multiplication. The benchmark ran three measurements, with a loop containing a float multiply, a float divide, and both instructions, to cancel out loop and other overhead.

c Other floating-point operations that take one clock cycle are assumed to consume as much energy as a multiply operation.
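To illustrate how the table values might be used, the following Python sketch sums per-operation energies over a set of operation counts and shows one way the three-loop benchmark in footnote b could separate instruction energy from loop overhead. It assumes that the 0.9 pJ, 26.3 pJ, and 3.7 pJ rows correspond to footnotes a, b, and c, and that each measured loop energy is the sum of a constant overhead and the energy of the instructions it contains. The dictionary keys and the function names `estimate_energy_pj` and `isolate_op_energies` are hypothetical and chosen only for this example.

```python
# Energy per operation in picojoules, copied from Table 3.
# Key names are hypothetical labels for this sketch.
ENERGY_PJ = {
    "mem_load32": 5.0,    # E m: one 32-bit load
    "int32_add": 0.1,
    "int32_mult": 3.1,
    "float32_add": 0.9,
    "float32_mult": 3.7,
}

# Derived rows, following footnotes a to c (assumed mapping).
ENERGY_PJ["float32_compare"] = ENERGY_PJ["float32_add"]               # footnote a
ENERGY_PJ["float32_div"] = round(7.1 * ENERGY_PJ["float32_mult"], 1)  # footnote b
ENERGY_PJ["float32_other"] = ENERGY_PJ["float32_mult"]                # footnote c


def estimate_energy_pj(op_counts):
    """Return the total energy in pJ for a mapping of operation name to count."""
    return sum(ENERGY_PJ[op] * count for op, count in op_counts.items())


def isolate_op_energies(e_mul_loop, e_div_loop, e_both_loop):
    """Separate per-instruction energy from loop overhead.

    Assumes an additive model per loop iteration:
        e_mul_loop  = overhead + e_mul
        e_div_loop  = overhead + e_div
        e_both_loop = overhead + e_mul + e_div
    """
    e_div = e_both_loop - e_mul_loop
    e_mul = e_both_loop - e_div_loop
    overhead = e_mul_loop - e_mul
    return e_mul, e_div, overhead
```

For example, `estimate_energy_pj({"mem_load32": 2, "float32_mult": 1, "float32_add": 1})` adds two loads, one multiply, and one add from the table. The ratio `e_div / e_mul` returned by `isolate_op_energies` is the kind of factor reported in footnote b, although the actual measurement procedure may differ from this additive model.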