On-chip trainable hardware-based deep Q-networks approximating a backpropagation algorithm

DOI: 10.1007/s00521-021-05699-z Publication Date: 2021-02-10T07:45:05Z
ABSTRACT
Reinforcement learning (RL) using deep Q-networks (DQNs) has shown performance beyond the human level in a number of complex problems. In addition, many studies have focused on bio-inspired, hardware-based spiking neural networks (SNNs), given the capability of these technologies to realize both parallel operation and low power consumption. Here, we propose an on-chip training method for DQNs that is applicable to SNNs. Although the conventional backpropagation (BP) algorithm is approximated, an evaluation based on two simple games shows that the proposed system achieves performance similar to that of a software-based system. The proposed system can minimize memory usage and reduce power consumption and area occupation. In particular, for simple problems, the dependency on replay memory can be significantly reduced, and high performance can be achieved without it. Furthermore, we investigate the effects of the nonlinearity characteristics and the types of variation of non-ideal synaptic devices on the training outcomes. In this work, thin-film transistor (TFT)-type flash cells are used as the synaptic devices. A simulation is also conducted for a fully connected network with non-leaky integrate-and-fire (I&F) neurons. The results show strong immunity to device variations owing to the adopted training scheme.
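
The abstract describes training a fully connected Q-network online, without replay memory, using an approximation of backpropagation that suits coarse-grained synaptic hardware. The sketch below illustrates that general idea only; it is not the paper's algorithm. It trains a tiny Q-network with temporal-difference targets and sign-only, fixed-step weight updates (a common stand-in for device-limited weight changes) on a hypothetical toy chain environment. The environment, layer sizes, and the sign-based update rule are illustrative assumptions.

```python
# Illustrative sketch (not the paper's exact scheme): a small fully connected
# Q-network trained online, without replay memory, using a hardware-style
# approximation of backpropagation in which each weight moves by a fixed step
# in the direction of the gradient's sign.
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, N_HIDDEN = 4, 2, 16
GAMMA, EPSILON, STEP = 0.9, 0.1, 0.01   # discount, exploration rate, fixed update step

W1 = rng.normal(0, 0.5, (N_STATES, N_HIDDEN))
W2 = rng.normal(0, 0.5, (N_HIDDEN, N_ACTIONS))

def forward(s):
    """One-hot state -> ReLU hidden layer -> Q-values."""
    h = np.maximum(0.0, s @ W1)
    return h, h @ W2

def act(q):
    """Epsilon-greedy action selection."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q))

def update(s, a, r, s_next, done):
    """Backpropagate the TD error, then apply only the sign of each
    gradient component with a fixed step size (coarse synaptic update)."""
    global W1, W2
    h, q = forward(s)
    _, q_next = forward(s_next)
    target = r + (0.0 if done else GAMMA * np.max(q_next))
    delta = q[a] - target                      # TD error on the taken action

    grad_W2 = np.outer(h, np.eye(N_ACTIONS)[a]) * delta
    dh = delta * W2[:, a] * (h > 0)            # backprop through ReLU
    grad_W1 = np.outer(s, dh)

    W2 -= STEP * np.sign(grad_W2)              # sign-only, fixed-step updates
    W1 -= STEP * np.sign(grad_W1)

def step_env(state, action):
    """Toy chain: move left/right on a line, reward at the rightmost state."""
    nxt = min(N_STATES - 1, state + 1) if action == 1 else max(0, state - 1)
    return nxt, float(nxt == N_STATES - 1), nxt == N_STATES - 1

for episode in range(200):
    state = 0
    for _ in range(20):
        s = np.eye(N_STATES)[state]
        _, q = forward(s)
        a = act(q)
        nxt, r, done = step_env(state, a)
        update(s, a, r, np.eye(N_STATES)[nxt], done)
        state = nxt
        if done:
            break
```

Because each transition is used once and discarded, the sketch needs no replay buffer, which mirrors the reduced replay-memory dependency claimed in the abstract for simple problems; the sign-only update is one plausible way to model synaptic devices that support only fixed conductance increments.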