On Retrieval Augmentation and the Limitations of Language Model Training

FOS: Computer and information sciences; Computer Science - Computation and Language (cs.CL)
DOI: 10.48550/arxiv.2311.09615 Publication Date: 2023-01-01
ABSTRACT
Augmenting a language model (LM) with $k$-nearest neighbors ($k$NN) retrieval on its training data alone can decrease perplexity, though the underlying reasons for this remain elusive. In this work, we rule out one previously posited possibility -- the "softmax bottleneck." We then create a new dataset to evaluate LM generalization ability in the setting where the training data contains additional information that is not causally relevant. This task is challenging even for GPT-3.5 Turbo. We show that, for both GPT-2 and Mistral 7B, $k$NN augmentation consistently improves performance in this setting. Finally, to make retrieval more accessible, we propose using a multi-layer perceptron that maps datastore keys to values as a drop-in replacement for traditional retrieval. This reduces storage costs by over 25x.
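
To make the setup described in the abstract concrete, the sketch below illustrates standard $k$NN-LM interpolation and, schematically, the proposed MLP drop-in for the datastore. It is a minimal illustration under assumed choices (L2 distance, exponential distance weighting, a fixed interpolation weight lam, and all function names), not the authors' implementation.

    import numpy as np

    def knn_probs(query, keys, values, vocab_size, k=8, temperature=1.0):
        """Retrieve the k datastore entries nearest to `query` and turn their
        stored next-token ids (`values`) into a distribution over the vocabulary."""
        dists = np.linalg.norm(keys - query, axis=1)    # L2 distance to every key
        nearest = np.argsort(dists)[:k]                 # indices of the k closest keys
        weights = np.exp(-dists[nearest] / temperature) # closer keys get more weight
        weights /= weights.sum()
        probs = np.zeros(vocab_size)
        for idx, w in zip(nearest, weights):
            probs[values[idx]] += w                     # accumulate weight on stored tokens
        return probs

    def interpolate(p_lm, p_knn, lam=0.25):
        """kNN-LM next-token distribution: a fixed-weight mixture of the base LM
        distribution and the retrieval distribution."""
        return (1.0 - lam) * p_lm + lam * p_knn

    def mlp_probs(query, mlp):
        """Schematic drop-in replacement from the abstract: an MLP trained to map a
        datastore key directly to a distribution over values, so no datastore or
        nearest-neighbor search is needed at inference time (architecture assumed)."""
        return mlp(query)

In this sketch, replacing knn_probs with mlp_probs leaves the interpolation step unchanged, which is what makes the MLP a drop-in substitute while avoiding storage of the full key-value datastore.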