Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
DOI:
10.24963/ijcai.2024/917
Publication Date:
2024-07-26T14:28:11Z
AUTHORS (6)
ABSTRACT
Recently, large language models (LLMs) have shown remarkable capabilities including understanding context, engaging in logical reasoning, and generating responses. However, this is achieved at the expense of stringent computational and memory requirements, hindering their ability to effectively support long input sequences. This survey provides an inclusive review of the recent techniques and methods devised to extend the sequence length in LLMs, thereby enhancing their capacity for long-context understanding. In particular, we review and categorize a wide range of architectural modifications, such as modified positional encoding and altered attention mechanisms, which are designed to enhance the processing of longer sequences while avoiding a proportional increase in computational cost. The diverse methodologies investigated in this study can be leveraged across the different phases of LLMs, i.e., training, fine-tuning, and inference, enabling LLMs to efficiently process extended input sequences. The limitations of current methodologies are discussed in the last section, along with suggestions for future research directions, underscoring the importance of sequence length in the continued advancement of LLMs.
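To make the two technique families named in the abstract concrete, below is a minimal, illustrative sketch of one representative from each: linear position interpolation for rotary positional encodings (a modified positional encoding) and a sliding-window attention mask (an altered attention mechanism). The function names, the `scale` parameter, and the 2,048/8,192 context lengths are assumptions chosen for illustration, not code from the surveyed works.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary positional encoding angles with optional position interpolation.

    Setting scale < 1.0 compresses positions so that a sequence longer than
    the original training window is mapped back into the trained positional
    range (one common way to extend context without retraining from scratch).
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))  # (dim/2,)
    scaled_pos = positions * scale                            # (seq_len,)
    return np.outer(scaled_pos, inv_freq)                     # (seq_len, dim/2)

def apply_rope(x, angles):
    """Rotate consecutive channel pairs of x by the per-position angles."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def sliding_window_mask(seq_len, window):
    """Boolean mask: token i may attend only to tokens i-window+1 .. i,
    keeping per-token attention cost constant as sequence length grows."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

# Example: queries from a model trained on 2,048 positions, evaluated at 8,192
# positions by interpolating positions back into the trained range, with a
# 1,024-token sliding attention window (all sizes are illustrative).
seq_len, dim = 8192, 64
positions = np.arange(seq_len)
angles = rope_angles(positions, dim, scale=2048 / seq_len)
q = np.random.randn(seq_len, dim).astype(np.float32)
q_rot = apply_rope(q, angles)
mask = sliding_window_mask(seq_len, window=1024)
```

Both ideas trade exact full-context attention for tractability: interpolation reuses the positional range the model was trained on, while the windowed mask bounds the number of keys each query attends to, so memory and compute no longer grow quadratically with the full sequence length.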