SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks

DOI: 10.48550/arXiv.2302.13939 Publication Date: 2023-02-27
ABSTRACT
As the size of large language models continues to scale, so do the computational resources required to run them. Spiking Neural Networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverages sparse and event-driven activations to reduce the computational overhead associated with model inference. While they have become competitive with non-spiking models on many computer vision tasks, SNNs have also proven to be more challenging to train. As a result, their performance lags behind modern deep learning, and we are yet to see the effectiveness of SNNs in language generation. In this paper, inspired by the Receptance Weighted Key Value (RWKV) language model, we successfully implement `SpikeGPT', a generative language model with binary, event-driven spiking activation units. We train two variants of the proposed model: 45M and 216M parameters. To the best of our knowledge, SpikeGPT is the largest backpropagation-trained SNN to date, rendering it suitable for both the generation and comprehension of natural language. We achieve this by modifying the transformer block to replace multi-head self-attention, reducing quadratic computational complexity O(N^2) to linear complexity O(N) with increasing sequence length. Input tokens are instead streamed in sequentially to our attention mechanism (as with typical SNNs). Our preliminary experiments show that SpikeGPT remains competitive with non-spiking models on the tested benchmarks, while maintaining 20x fewer operations when processed on neuromorphic hardware that can leverage sparse, event-driven activations.
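
To make the abstract's two mechanisms concrete, the sketch below pairs a binary spiking activation (Heaviside step in the forward pass with a surrogate gradient in the backward pass, the standard trick for training SNNs with backpropagation) with a toy RWKV-style recurrence that streams tokens one at a time: each step updates an O(1) running state, so a length-N sequence costs O(N) overall rather than the O(N^2) of pairwise self-attention. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the scalar decay w, the threshold of 1.0, and the fast-sigmoid surrogate are all choices made here for clarity, while SpikeGPT itself follows the full RWKV formulation with learned per-channel decays.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Binary spiking activation (hypothetical name; standard SNN technique).

    Forward: Heaviside step that emits 0/1 spikes once the membrane
    potential crosses a threshold. Backward: a smooth surrogate gradient
    so the network remains trainable with backpropagation.
    """

    @staticmethod
    def forward(ctx, membrane, threshold=1.0):
        ctx.save_for_backward(membrane)
        ctx.threshold = threshold
        return (membrane >= threshold).float()  # binary 0/1 spikes

    @staticmethod
    def backward(ctx, grad_out):
        (membrane,) = ctx.saved_tensors
        # Derivative of a "fast sigmoid" centred on the threshold
        # (one common surrogate; the exact shape is an assumption here).
        sg = 1.0 / (1.0 + 10.0 * (membrane - ctx.threshold).abs()) ** 2
        return grad_out * sg, None


def linear_attention_stream(k, v, w=0.9):
    """Toy RWKV-flavoured token mixing with O(N) total cost.

    Tokens are consumed sequentially; the running numerator/denominator
    are O(1) state per step, so no N-by-N attention matrix is formed.
    k, v: (N, D) key/value sequences; w: scalar decay in (0, 1).
    """
    num = torch.zeros(k.shape[1])   # running decayed sum of keyed values
    den = torch.zeros(k.shape[1])   # running decayed normaliser
    outputs = []
    for t in range(k.shape[0]):     # single pass over the sequence: O(N)
        kt = k[t].exp()
        num = w * num + kt * v[t]
        den = w * den + kt
        outputs.append(num / (den + 1e-8))
    return torch.stack(outputs)


if __name__ == "__main__":
    N, D = 8, 4
    k, v = torch.randn(N, D), torch.randn(N, D)
    mixed = linear_attention_stream(k, v)  # sequential, linear-time mixing
    spikes = SpikeFn.apply(mixed)          # binarised, event-driven output
    print(spikes)
```

The binarised output is what makes the 20x operation reduction plausible on neuromorphic hardware: zero entries trigger no events, so multiply-accumulates are only paid for active spikes.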