A TensorFlow Extension Framework for Optimized Generation of Hardware CNN Inference Engines

KEYWORDS
Application-specific integrated circuit; Speedup; Hardware acceleration; Lookup table; Graphics processing unit; MNIST database
DOI: 10.3390/technologies8010006 | Publication Date: 2020-01-15
ABSTRACT
The workloads of Convolutional Neural Networks (CNNs) exhibit a streaming nature that makes them attractive for reconfigurable architectures such as Field-Programmable Gate Arrays (FPGAs), while their increased need for low power and speed has established Application-Specific Integrated Circuit (ASIC)-based accelerators as alternative efficient solutions. During the last five years, the development of Hardware Description Language (HDL)-based CNN accelerators, targeting either FPGA or ASIC, has attracted huge academic interest due to their high performance and room for optimizations. Towards this direction, we propose a library-based framework that extends TensorFlow, the well-established machine learning framework, and automatically generates high-throughput inference engines for FPGAs and ASICs. The framework allows software developers to exploit the benefits of FPGA/ASIC acceleration without requiring any expertise in HDL or low-level design. Moreover, it provides a set of optimization knobs concerning the model architecture and the engine generation, allowing the developer to tune the accelerator according to the requirements of the respective use case. Our framework is evaluated by optimizing LeNet on the MNIST dataset and implementing FPGA- and ASIC-based accelerators using the generated engine. The optimal FPGA-based accelerator on a Zynq-7000 delivers 93% less memory footprint and 54% less Look-Up Table (LUT) utilization, as well as up to 10× speedup in inference execution vs. different Graphics Processing Unit (GPU) and Central Processing Unit (CPU) implementations of the same model, in exchange for a negligible accuracy loss, i.e., 0.89%. For the same accuracy drop, a 45 nm standard-cell-based ASIC implementation operates at 520 MHz, occupies an area of 0.059 mm², and consumes ∼7.5 mW.
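
The abstract does not disclose the framework's actual interface; the Python sketch below only illustrates the workflow it describes, i.e., defining a CNN with standard TensorFlow/Keras and handing it to a generator with tunable knobs. The hdl_gen module, its generate_engine function, and the knob names (target, fixed_point_bits, unroll_factor) are hypothetical placeholders, not the paper's API; only the LeNet-style model definition uses the real tf.keras API.

import tensorflow as tf

# A standard LeNet-style CNN for 28x28 grayscale MNIST digits,
# of the kind the paper evaluates.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(6, 5, activation="tanh"),
    tf.keras.layers.AveragePooling2D(2),
    tf.keras.layers.Conv2D(16, 5, activation="tanh"),
    tf.keras.layers.AveragePooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation="tanh"),
    tf.keras.layers.Dense(84, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Hypothetical generator call (placeholder names): the knobs shown here,
# such as fixed-point width and FPGA/ASIC target selection, stand in for
# the optimization options the abstract describes.
# import hdl_gen
# engine = hdl_gen.generate_engine(model, target="fpga",
#                                  fixed_point_bits=8, unroll_factor=4)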