RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

DOI: 10.48550/arxiv.2404.07839 Publication Date: 2024-04-11
ABSTRACT
We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
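The fixed-size state mentioned in the abstract is the key efficiency property: unlike a transformer's KV cache, which grows with sequence length, a linear recurrence carries a constant-size hidden state from step to step. The following is a minimal toy sketch of that idea, not the paper's actual recurrence; the gating scheme and function names here are illustrative assumptions, not Griffin's real parameterization.

```python
import numpy as np

def linear_recurrence(x, a):
    """Toy linear recurrence: h_t = a * h_{t-1} + (1 - a) * x_t.

    The hidden state h has a fixed size (x.shape[-1]) no matter how
    many timesteps are processed, so per-step memory is constant --
    the property the abstract contrasts with attention's growing cache.
    (Illustrative only; Griffin's actual recurrence is gated and learned.)
    """
    h = np.zeros(x.shape[-1])
    states = []
    for x_t in x:                      # one constant-cost update per token
        h = a * h + (1.0 - a) * x_t
        states.append(h.copy())
    return np.stack(states)

seq = np.ones((5, 4))                  # 5 timesteps, state size 4
out = linear_recurrence(seq, a=0.9)
print(out.shape)                       # (5, 4): output per step, state stays size 4
```

Because the state never grows, generation cost per token is independent of how much context has already been consumed, which is what enables the efficient long-sequence inference the abstract claims.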