Ring Attention with Blockwise Transformers for Near-Infinite Context
DOI:
10.48550/arxiv.2310.01889
Publication Date:
2023-01-01
AUTHORS (3): Hao Liu, Matei Zaharia, Pieter Abbeel
ABSTRACT
Transformers have emerged as the architecture of choice for many state-of-the-art AI models, showcasing exceptional performance across a wide range of AI applications. However, the memory demands imposed by Transformers limit their ability to handle long sequences, thereby posing challenges in utilizing videos, actions, and other long-form sequences and modalities in complex environments. We present a novel approach, Ring Attention with Blockwise Transformers (Ring Attention), which leverages blockwise computation of self-attention and feedforward to distribute long sequences across multiple devices while fully overlapping the communication of key-value blocks with the computation of blockwise attention. Our approach enables training and inference of sequences that are up to device count times longer than those achievable with prior memory-efficient Transformers, without resorting to approximations or incurring additional communication and computation overheads. Extensive experiments on language modeling and reinforcement learning tasks demonstrate the effectiveness of our approach in allowing millions of tokens in context size and improving performance.
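
The core idea described in the abstract, blockwise attention whose key-value blocks rotate around a ring of devices while a running softmax is accumulated, can be sketched in a few lines of JAX. The following is a minimal single-host simulation of that ring schedule, not the authors' implementation: names such as ring_attention and blockwise_update are illustrative, the rotation uses jnp.roll in place of an actual cross-device jax.lax.ppermute, and the paper's overlap of communication with computation is omitted for clarity.

# Minimal single-host sketch of a ring-attention schedule (assumed names, not the paper's API).
# Each simulated "device" holds one query block; key/value blocks rotate around the
# ring, and a numerically stable blockwise softmax is accumulated along the way.
import jax
import jax.numpy as jnp


def blockwise_update(q, k, v, acc, row_max, row_sum):
    """Fold one key/value block into the running softmax statistics."""
    scores = q @ k.T / jnp.sqrt(q.shape[-1])              # (q_blk, kv_blk)
    new_max = jnp.maximum(row_max, scores.max(axis=-1))   # updated row-wise max
    correction = jnp.exp(row_max - new_max)               # rescale the old accumulator
    p = jnp.exp(scores - new_max[:, None])                # unnormalized probabilities
    acc = acc * correction[:, None] + p @ v
    row_sum = row_sum * correction + p.sum(axis=-1)
    return acc, new_max, row_sum


def ring_attention(q_blocks, k_blocks, v_blocks):
    """q/k/v_blocks: (num_devices, block_len, dim); returns the full attention output."""
    num_devices, block_len, dim = q_blocks.shape
    acc = jnp.zeros((num_devices, block_len, dim))
    row_max = jnp.full((num_devices, block_len), -jnp.inf)
    row_sum = jnp.zeros((num_devices, block_len))
    k, v = k_blocks, v_blocks
    for _ in range(num_devices):
        # Every "device" attends its query block to the kv block it currently holds.
        acc, row_max, row_sum = jax.vmap(blockwise_update)(
            q_blocks, k, v, acc, row_max, row_sum)
        # Rotate kv blocks one step around the ring (ppermute on real hardware,
        # where the send/receive would overlap with the blockwise compute above).
        k = jnp.roll(k, shift=1, axis=0)
        v = jnp.roll(v, shift=1, axis=0)
    return acc / row_sum[..., None]


if __name__ == "__main__":
    qkv = jax.random.normal(jax.random.PRNGKey(0), (3, 4, 8, 16))  # 4 blocks, len 8, dim 16
    out = ring_attention(*qkv)
    # Compare against vanilla attention over the concatenated sequence.
    q, k, v = (x.reshape(4 * 8, 16) for x in qkv)
    ref = jax.nn.softmax(q @ k.T / jnp.sqrt(16.0)) @ v
    print(jnp.allclose(out.reshape(4 * 8, 16), ref, atol=1e-4))

Because every query block eventually sees every key-value block, the result matches ordinary attention exactly; the memory saving comes from never materializing more than one kv block per device at a time.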