RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

DOI: 10.48550/arxiv.2404.07839 Publication Date: 2024-04-11
ABSTRACT
We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
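The fixed-size state mentioned in the abstract is the key efficiency property: unlike a transformer's KV cache, which grows with sequence length, a linear recurrence carries a constant-size hidden state from step to step. The following is a minimal toy sketch of that idea, not the paper's actual recurrence; the gating scheme and function names here are illustrative assumptions, not Griffin's real parameterization.

```python
import numpy as np

def linear_recurrence(x, a):
    """Toy linear recurrence: h_t = a * h_{t-1} + (1 - a) * x_t.

    The hidden state h has a fixed size (x.shape[-1]) no matter how
    many timesteps are processed, so per-step memory is constant --
    the property the abstract contrasts with attention's growing cache.
    (Illustrative only; Griffin's actual recurrence is gated and learned.)
    """
    h = np.zeros(x.shape[-1])
    states = []
    for x_t in x:                      # one constant-cost update per token
        h = a * h + (1.0 - a) * x_t
        states.append(h.copy())
    return np.stack(states)

seq = np.ones((5, 4))                  # 5 timesteps, state size 4
out = linear_recurrence(seq, a=0.9)
print(out.shape)                       # (5, 4): output per step, state stays size 4
```

Because the state never grows, generation cost per token is independent of how much context has already been consumed, which is what enables the efficient long-sequence inference the abstract claims.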