Hessian Aware Quantization of Spiking Neural Networks
Hessian matrix
Neuromorphic engineering
DOI: 10.48550/arxiv.2104.14117
Publication Date: 2021-01-01
AUTHORS (2)
ABSTRACT
To achieve the low latency, high throughput, and energy efficiency benefits of Spiking Neural Networks (SNNs), reducing the memory and compute requirements when running on neuromorphic hardware is an important step. Neuromorphic architecture allows massively parallel computation with variable and local bit-precisions. However, how different bit-precisions should be allocated to different layers or connections of the network is not trivial. In this work, we demonstrate that a layer-wise Hessian trace analysis can measure the sensitivity of the loss to any perturbation of a layer's weights, and can be used to guide the allocation of a layer-specific bit-precision when quantizing an SNN. In addition, current gradient-based methods of SNN training use a complex neuron model with multiple state variables, which is not ideal for compute and memory efficiency. To address this challenge, we present a simplified neuron model that reduces the number of state variables by 4-fold while still being compatible with gradient-based training. We find that the impact of quantization on accuracy correlated well with the Hessian trace. The accuracy of the optimally quantized network dropped by only 0.2%, yet the network size was reduced by 58%. This allows fixed-point arithmetic and simpler digital circuits to be used, increasing the overall throughput.
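The paper does not include code here, but the core idea of layer-wise Hessian trace analysis can be sketched with Hutchinson's randomized trace estimator, Tr(H) ≈ E[vᵀHv] for Rademacher probe vectors v. The sketch below is illustrative only: the `hutchinson_trace` helper and the toy quadratic loss (whose Hessian is a known matrix `A` standing in for one layer's curvature) are assumptions, not the authors' implementation.

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=1000, rng=None):
    """Estimate Tr(H) via Hutchinson's method.

    `hvp` computes Hessian-vector products H @ v; only these products
    are needed, never the full Hessian, which is what makes the trace
    tractable for a neural network layer.
    """
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe
        total += v @ hvp(v)
    return total / n_samples

# Toy loss L(w) = 0.5 * w.T @ A @ w, whose Hessian is exactly A
# (hypothetical stand-in for one layer's loss curvature).
A = np.array([[4.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 0.25]])
trace_est = hutchinson_trace(lambda v: A @ v, dim=3, n_samples=2000, rng=0)

# A layer with a larger Hessian trace is more sensitive to weight
# perturbations and would receive a higher bit-precision when quantized.
print(trace_est)  # should be close to Tr(A) = 5.25
```

In practice the Hessian-vector product would come from double backpropagation through the SNN loss rather than an explicit matrix, and the per-layer trace estimates would be ranked to assign bit-widths.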