Understanding Transformer Reasoning Capabilities via Graph Algorithms
FOS: Computer and information sciences
Machine Learning (cs.LG)
Artificial Intelligence (cs.AI)
DOI:
10.48550/arxiv.2405.18512
Publication Date:
2024-05-28
AUTHORS (8)
ABSTRACT
Which transformer scaling regimes are able to perfectly solve different classes of algorithmic problems? While tremendous empirical advances have been attained by transformer-based neural networks, a theoretical understanding of their reasoning capabilities in realistic parameter regimes is lacking. We investigate this question in terms of the network's depth, width, and number of extra tokens for algorithm execution. Our novel representational hierarchy separates 9 algorithmic reasoning problems into classes solvable by transformers in different realistic parameter scaling regimes. We prove that logarithmic depth is necessary and sufficient for tasks like graph connectivity, while single-layer transformers with small embedding dimensions can solve contextual retrieval tasks. We also support our theoretical analysis with ample empirical evidence using the GraphQA benchmark. These results show that transformers excel at many graph reasoning tasks, even outperforming specialized graph neural networks.
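The logarithmic-depth claim for graph connectivity echoes the classic parallel algorithm in which the boolean adjacency matrix is repeatedly squared: after roughly log2(n) squarings, entry (i, j) indicates whether j is reachable from i, so each squaring round corresponds to one layer of parallel computation. The following sketch is illustrative only and is not code from the paper; the function names are hypothetical.

```python
import math

def reachability(adj):
    """Transitive closure of a directed graph via repeated boolean
    matrix squaring: ceil(log2(n)) rounds suffice, mirroring the
    log-depth parallel computation the abstract alludes to."""
    n = len(adj)
    # Include self-loops so that paths shorter than 2^k survive each round.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for _ in range(max(1, math.ceil(math.log2(n)))):
        # Boolean matrix "square": a path of length <= 2^(k+1) splits
        # into two halves of length <= 2^k meeting at some vertex k.
        reach = [[any(reach[i][m] and reach[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
    return reach

def connected(adj, s, t):
    """True iff t is reachable from s."""
    return reachability(adj)[s][t]
```

For example, on the path graph 0 -> 1 -> 2 with an isolated vertex 3, `connected(adj, 0, 2)` is `True` while `connected(adj, 0, 3)` is `False`.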