Adaptive Quantization for Deep Neural Network
DOI:
10.1609/aaai.v32i1.11623
Publication Date:
2022-06-24T21:08:34Z
AUTHORS (4)
ABSTRACT
In recent years, Deep Neural Networks (DNNs) have been rapidly developed for various applications, together with increasingly complex architectures. The performance gains of these DNNs generally come with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Model quantization can be used to reduce the computation and memory costs of DNNs, enabling deployment on mobile equipment. In this work, we propose an optimization framework for deep model quantization. First, we propose a measurement to estimate the effect of parameter quantization errors in individual layers on the overall model prediction accuracy. Then, we propose an optimization process based on this measurement for finding the optimal quantization bit-width for each layer. This is the first work that theoretically analyses the relationship between parameter quantization errors of individual layers and the overall model accuracy. Our new quantization algorithm outperforms previous quantization optimization methods, and achieves a 20-40% higher compression rate compared to equal bit-width quantization at the same model prediction accuracy.
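The idea of assigning each layer its own bit-width based on an estimate of its quantization error can be illustrated with a minimal sketch. This is not the authors' exact algorithm: it uses plain uniform quantization and a hypothetical greedy rule (smallest candidate bit-width whose per-layer mean squared error stays under a budget), with all function names and parameters assumed for illustration.

```python
# Illustrative sketch of per-layer bit-width selection (hypothetical,
# not the paper's method): uniformly quantize each layer's weights and
# greedily pick the smallest bit-width whose MSE fits a given budget.
import numpy as np

def quantize_uniform(w, bits):
    """Uniformly quantize array w to 2**bits levels over its value range."""
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    if hi == lo:  # constant layer: nothing to quantize
        return w.copy()
    scale = (hi - lo) / levels
    return np.round((w - lo) / scale) * scale + lo

def quantization_error(w, bits):
    """Mean squared error introduced by quantizing w at `bits` bits."""
    return float(np.mean((w - quantize_uniform(w, bits)) ** 2))

def choose_bitwidths(layers, budget, candidates=(2, 4, 6, 8)):
    """For each layer (an array of weights), pick the smallest candidate
    bit-width whose quantization MSE is within `budget`; fall back to
    the largest candidate if none qualifies."""
    chosen = []
    for w in layers:
        bits = candidates[-1]
        for b in candidates:
            if quantization_error(w, b) <= budget:
                bits = b
                break
        chosen.append(bits)
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [rng.normal(size=1000), rng.normal(scale=0.01, size=1000)]
    print(choose_bitwidths(layers, budget=1e-4))
```

Layers with a narrow weight range tolerate fewer bits for the same error budget, which is why a per-layer search can beat a single global bit-width at equal accuracy.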