Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
KEYWORDS
Speedup · Control reconfiguration · Gate array · High-Level Synthesis
DOI: 10.1609/aaai.v32i1.11653
Publication Date: 2022-06-24T21:08:34Z
AUTHORS (11)
ABSTRACT
Hardware accelerations of deep learning systems have been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural networks (DNNs). An algorithm-hardware co-optimization framework is developed, which is applicable to different DNN types, sizes, and application scenarios. The algorithm part adopts general block-circulant matrices to achieve a fine-grained tradeoff between accuracy and compression ratio. It applies to both fully-connected and convolutional layers and contains a mathematically rigorous proof of the effectiveness of the method. The proposed algorithm reduces the computational complexity per layer from O(n^2) to O(n log n) and the storage complexity from O(n^2) to O(n), for both training and inference. The hardware part consists of highly efficient Field Programmable Gate Array (FPGA)-based implementations using effective reconfiguration, batch processing, pipelining, resource re-using, and hierarchical control. Experimental results demonstrate that the proposed framework achieves at least 152X speedup and 71X energy efficiency gain compared with the IBM TrueNorth processor under the same test accuracy, and at least a 31X energy efficiency gain compared with a reference FPGA-based work.
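The O(n log n) per-layer complexity claimed in the abstract comes from the structure of a block-circulant weight matrix: each k×k sub-block is circulant, so its matrix-vector product is a circular convolution that can be computed via the FFT instead of a dense multiply. A minimal NumPy sketch of this idea follows; the function names and block layout are illustrative assumptions, not the paper's actual implementation, which targets FPGAs.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by vector x.

    Uses the identity C @ x = IFFT(FFT(c) * FFT(x)) (circular convolution),
    which costs O(k log k) instead of O(k^2) for a k x k block.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def block_circulant_matvec(blocks, x, k):
    """Multiply a block-circulant matrix by x.

    blocks[i][j] is the length-k defining vector (first column) of the
    circulant sub-block W_ij; x is split into length-k chunks, and
    y_i = sum_j W_ij @ x_j, with each sub-product done via FFT.
    """
    p, q = len(blocks), len(blocks[0])
    y = np.zeros(p * k)
    for i in range(p):
        for j in range(q):
            y[i * k:(i + 1) * k] += circulant_matvec(
                blocks[i][j], x[j * k:(j + 1) * k])
    return y
```

Storage drops to O(n) as well, since each k×k block is fully described by its length-k first column; the block size k is the knob that trades compression ratio against accuracy.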
CITATIONS (9)