Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Science - Computation and Language
Computation and Language (cs.CL)
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.2404.11870
Publication Date:
2024-04-17
AUTHORS (5)
ABSTRACT
We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data. PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities. PANM facilitates pointer assignment, dereference, and arithmetic by explicitly using physical pointers to access memory content. Remarkably, it can learn to perform these operations through end-to-end training on sequence data, powering various sequential models. Our experiments demonstrate PANM's exceptional length extrapolating capabilities and improved performance in tasks that require symbol processing, such as algorithmic reasoning and Dyck language recognition. PANM helps Transformer achieve up to 100% generalization accuracy in compositional learning tasks and significantly better results in mathematical reasoning, question answering and machine translation tasks.
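The three pointer operations named in the abstract (assignment, arithmetic, dereference) can be illustrated with a small hand-written sketch. This is not the PANM model: PANM learns these operations end-to-end, whereas the toy below hard-codes them as discrete integer operations. The class `PointerMemory`, its methods, the wrap-around addressing, and the reversal task are all hypothetical choices made for illustration only.

```python
from typing import List, Sequence


class PointerMemory:
    """Toy memory whose slots are reached through integer addresses.

    Hypothetical illustration only; not the PANM architecture.
    """

    def __init__(self, items: Sequence[str]) -> None:
        self.slots: List[str] = list(items)

    def assign(self, address: int) -> int:
        """Pointer assignment: return a pointer set to a valid address."""
        if not 0 <= address < len(self.slots):
            raise IndexError(f"address {address} is out of range")
        return address

    def advance(self, pointer: int, offset: int = 1) -> int:
        """Pointer arithmetic: shift the pointer, wrapping at the bounds.

        Wrap-around is an assumption of this sketch.
        """
        return (pointer + offset) % len(self.slots)

    def dereference(self, pointer: int) -> str:
        """Dereference: read the content stored at the pointed-to address."""
        return self.slots[pointer]


def reverse_with_pointers(tokens: Sequence[str]) -> List[str]:
    """Reverse a sequence using only the three pointer operations."""
    if not tokens:
        return []
    memory = PointerMemory(tokens)
    pointer = memory.assign(len(tokens) - 1)  # start at the last address
    output: List[str] = []
    for _ in range(len(tokens)):
        output.append(memory.dereference(pointer))
        pointer = memory.advance(pointer, -1)  # step one address back
    return output
```

Because the procedure manipulates addresses and never inspects token content, the same steps apply unchanged to sequences of any length, which is the intuition behind length extrapolation in this toy setting.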