Table 1.
Layer precision for each model version (FP = floating-point; B = binary; Q = quantized).
| Layer / component | FP model | B+FP model | B+Q model |
|---|---|---|---|
| Most conv layers | FP | Binary | Binary |
| First layer | FP | FP | 4-bit input + 4-bit weight |
| Skip connections | FP | FP | 4-bit/6-bit with scalable range per connection |
| BatchNorm | FP | FP | Binary shift + 4-bit bias |
| Activation function | ReLU | PReLU with FP parameters | PReLU with 2-bit and 4-bit parameters |
| RSign | – | FP threshold | 4-bit threshold |
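
To make the "4-bit/6-bit with scalable range per connection" entry concrete, the following is a minimal sketch of a symmetric uniform quantizer whose range is set separately for each skip connection. The table does not specify the quantizer, so the symmetric scheme, the function names `quantize_uniform` and `dequantize`, and the `max_abs` range parameter are all assumptions made for illustration.

```python
def quantize_uniform(x, bits, max_abs):
    """Map a real value to a signed integer code of the given bit width.

    Assumption (not from the table): symmetric uniform quantization with
    codes in [-qmax, qmax], where `max_abs` is the range chosen for one
    particular connection.
    """
    qmax = 2 ** (bits - 1) - 1           # 7 for 4 bits, 31 for 6 bits
    scale = max_abs / qmax               # real value represented by one step
    q = round(x / max_abs * qmax)        # nearest integer code
    q = max(-qmax, min(qmax, q))         # saturate values outside the range
    return q, scale


def dequantize(q, scale):
    """Recover the approximate real value from an integer code."""
    return q * scale


if __name__ == "__main__":
    # Two hypothetical skip connections with different bit widths and ranges.
    for bits, max_abs in [(4, 1.0), (6, 2.0)]:
        q, scale = quantize_uniform(0.8, bits, max_abs)
        print(bits, max_abs, q, dequantize(q, scale))
```

Choosing `max_abs` per connection lets a connection with a narrow value distribution use its few available codes over a smaller range, which is one plausible reading of "scalable range"; a learned or calibrated range would fit the same interface.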