Transformers Can Do Arithmetic with the Right Embeddings

FOS: Computer and information sciences
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
DOI: 10.48550/arxiv.2405.17399
Publication Date: 2024-05-27
ABSTRACT
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix enables architectural modifications such as input injection and recurrent layers to improve performance even further. With positions resolved, we can study the logical extrapolation ability of transformers. Can they solve arithmetic problems that are larger and more complex than those in their training data? We find that by training on only 20-digit numbers with a single GPU for one day, we can reach state-of-the-art performance, achieving up to 99% accuracy on 100-digit addition problems. Finally, these gains in numeracy also unlock improvements on other multi-step reasoning tasks, including sorting and multiplication.
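
The central idea in the abstract, an extra embedding for each digit that encodes its offset from the start of the number it belongs to, can be illustrated with a short sketch. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name DigitPositionEmbedding, the digit-id set, the max_digits cap, and the toy vocabulary are all illustrative choices, and how the extra embedding interacts with the model's standard positional scheme is left out.

```python
# Minimal sketch (illustrative, not the paper's code): add a learned embedding
# to each digit token, indexed by its 0-based offset within the contiguous run
# of digits (i.e., the number) it sits in. Non-digit tokens get offset 0 here
# for simplicity.
import torch
import torch.nn as nn


class DigitPositionEmbedding(nn.Module):
    def __init__(self, d_model: int, max_digits: int = 128):
        super().__init__()
        self.embed = nn.Embedding(max_digits, d_model)

    @staticmethod
    def digit_offsets(token_ids: torch.Tensor, digit_ids: set) -> torch.Tensor:
        """Offset of each digit token from the start of its digit run; 0 for non-digits."""
        offsets = torch.zeros_like(token_ids)
        for b in range(token_ids.size(0)):
            run = 0
            for i in range(token_ids.size(1)):
                run = run + 1 if int(token_ids[b, i]) in digit_ids else 0
                offsets[b, i] = max(run - 1, 0)
        return offsets

    def forward(self, token_embeds: torch.Tensor, token_ids: torch.Tensor,
                digit_ids: set) -> torch.Tensor:
        offsets = self.digit_offsets(token_ids, digit_ids)
        offsets = offsets.clamp(max=self.embed.num_embeddings - 1)
        return token_embeds + self.embed(offsets)


if __name__ == "__main__":
    # Toy vocabulary (an assumption): ids 0-9 are digits '0'-'9', id 10 is '+'.
    vocab_size, d_model = 16, 32
    tok_embed = nn.Embedding(vocab_size, d_model)
    x = torch.tensor([[1, 2, 3, 10, 4, 5]])   # encodes "123+45"
    pos = DigitPositionEmbedding(d_model)
    h = pos(tok_embed(x), x, digit_ids=set(range(10)))
    print(h.shape)  # torch.Size([1, 6, 32]); digit offsets used: 0, 1, 2, _, 0, 1
```

Giving every digit an explicit within-number offset is what lets the model line up corresponding digits of the operands regardless of where the numbers appear in the sequence; the other modifications the abstract mentions (input injection, recurrent layers) are architectural changes layered on top of this positional fix.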