LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
FOS: Computer and information sciences
Computation and Language (cs.CL)
Artificial Intelligence (cs.AI)
DOI:
10.48550/arXiv.2406.15319
Publication Date:
2024-06-21
AUTHORS (3)
ABSTRACT
In the traditional RAG framework, the basic retrieval units are normally short. Common retrievers like DPR work with 100-word Wikipedia paragraphs. Such a design forces the retriever to search over a large corpus to find the `needle' unit. In contrast, the readers only need to extract answers from the short retrieved units. Such an imbalanced design, with a `heavy' retriever and a `light' reader, can lead to sub-optimal performance. In order to alleviate the imbalance, we propose a new framework, LongRAG, consisting of a `long retriever' and a `long reader'. LongRAG processes the entire Wikipedia into 4K-token units, which is 30x longer than before. By increasing the unit size, we significantly reduce the total number of units from 22M to 700K. This lowers the burden on the retriever, which leads to a remarkable retrieval score: answer recall@1=71% on NQ (previously 52%) and answer recall@2=72% (previously 47%) on HotpotQA (full-wiki). Then we feed the top-k retrieved units ($\approx$ 30K tokens) to an existing long-context LLM to perform zero-shot answer extraction. Without requiring any training, LongRAG achieves an EM of 62.7% on NQ, which is the best known result. LongRAG also achieves 64.3% on HotpotQA (full-wiki), which is on par with the SoTA model. Our study offers insights into the future roadmap for combining RAG with long-context LLMs.
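The sketch below illustrates the pipeline the abstract describes, under stated assumptions: a toy word-overlap retriever stands in for the paper's actual retriever, the LLM call is a stub, and `build_long_units`, `retrieve`, `long_reader`, and `n_tokens` are hypothetical names, not the authors' code. Only the 4K-token unit size and the ~30K-token reader budget come from the abstract.

```python
# Illustrative sketch of the LongRAG pipeline (not the authors' code):
# pack documents into ~4K-token retrieval units, retrieve a handful of
# units, and hand ~30K tokens to a long-context LLM for zero-shot answering.
from typing import Callable, List

UNIT_TOKENS = 4_000     # long retrieval unit size reported in the abstract
READER_BUDGET = 30_000  # approximate token budget fed to the long reader

def n_tokens(text: str) -> int:
    # Crude whitespace token count; a real system would use a tokenizer.
    return len(text.split())

def build_long_units(documents: List[str]) -> List[str]:
    """Greedily pack whole documents into ~4K-token retrieval units."""
    units: List[str] = []
    current: List[str] = []
    size = 0
    for doc in documents:
        if current and size + n_tokens(doc) > UNIT_TOKENS:
            units.append("\n\n".join(current))
            current, size = [], 0
        current.append(doc)
        size += n_tokens(doc)
    if current:
        units.append("\n\n".join(current))
    return units

def retrieve(question: str, units: List[str], k: int) -> List[str]:
    """Toy 'long retriever': rank units by word overlap with the question."""
    q = set(question.lower().split())
    return sorted(units, key=lambda u: -len(q & set(u.lower().split())))[:k]

def long_reader(question: str, units: List[str],
                llm: Callable[[str], str]) -> str:
    """Concatenate retrieved units up to ~30K tokens, then query the LLM."""
    context: List[str] = []
    size = 0
    for u in units:
        if size + n_tokens(u) > READER_BUDGET:
            break
        context.append(u)
        size += n_tokens(u)
    joined = "\n\n".join(context)
    prompt = f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)

if __name__ == "__main__":
    corpus = ["Paris is the capital of France.",
              "The Nile flows through Egypt."]
    units = build_long_units(corpus)
    top = retrieve("What is the capital of France?", units, k=2)
    # Stub LLM: echoes the prompt tail; swap in any long-context model API.
    print(long_reader("What is the capital of France?", top,
                      llm=lambda p: p[-80:]))
```

The design choice the abstract motivates is visible here: packing whole documents into large units shrinks the corpus (22M paragraphs to 700K units in the paper), so the retriever's ranking job gets easier, while the long-context reader absorbs the extra extraction work.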