A 6.54-to-26.03 TOPS/W Computing-In-Memory RNN Processor using Input Similarity Optimization and Attention-based Context-breaking with Output Speculation

DOI: 10.23919/vlsicircuits52068.2021.9492492 Publication Date: 2021-07-28T20:33:42Z
ABSTRACT
This work presents a 65nm RNN processor with computing-in-memory (CIM) macros. The main contributions include: 1) A similarity analyzer (SimAyz) that fully leverages the temporal stability of input sequences, giving a 1.52× performance speedup; 2) An attention-based context-breaking (AttenBrk) method with output speculation that reduces off-chip data accesses by up to 30.3%; 3) A double-buffering scheme for CIM macros to hide write latency and a pipelined processing element (PE) array to increase system throughput. Measured results show that the chip achieves an energy efficiency of 6.54 to 26.03 TOPS/W, depending on the LSTM benchmark.
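The abstract does not describe how SimAyz works internally. As a rough illustration of the general idea of exploiting temporal stability in an input sequence, the following sketch updates a matrix-vector product incrementally and skips inputs that have not changed since the previous time step. The function name `delta_matvec`, the `threshold` parameter, and the skipping rule are all assumptions made for illustration; this is not the chip's actual mechanism.

```python
def delta_matvec(W, x_prev, y_prev, x_curr, threshold=0.0):
    """Hypothetical sketch: update y = W @ x from the previous time step.

    Inputs whose change is within `threshold` are treated as unchanged and
    their previous contribution to y is reused. Returns the new output, the
    effective input actually represented by that output, and the fraction
    of inputs skipped.
    """
    y_curr = list(y_prev)
    x_eff = list(x_prev)
    changed = 0
    for j, (old, new) in enumerate(zip(x_prev, x_curr)):
        delta = new - old
        if abs(delta) <= threshold:
            continue  # similar input: keep the old contribution
        changed += 1
        x_eff[j] = new
        # Only column j of W needs to be touched for this input.
        for i, row in enumerate(W):
            y_curr[i] += row[j] * delta
    skipped = 1.0 - changed / len(x_curr)
    return y_curr, x_eff, skipped


# Toy example: only one of four inputs changes between time steps.
W = [[1, 2, 3, 4],
     [0, 1, 0, 1]]
x_prev = [1, 1, 1, 1]
y_prev = [10, 2]          # equals W @ x_prev
x_curr = [1, 3, 1, 1]
y_curr, x_eff, skipped = delta_matvec(W, x_prev, y_prev, x_curr)
```

With `threshold=0.0` the result equals the full product `W @ x_curr`. A positive threshold makes the output approximate, which is why the sketch also returns `x_eff`, the input that the output corresponds to exactly.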