The CLRS-Text Algorithmic Reasoning Language Benchmark

Benchmark
DOI: 10.48550/arxiv.2406.04229 Publication Date: 2024-06-06
ABSTRACT
Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path towards building intelligent systems. Most recent studies dedicated to this goal focus on out-of-distribution performance over procedurally-generated synthetic benchmarks, each bespoke-built to evaluate specific skills only. This trend makes results hard to transfer across publications, slowing down progress. Three years ago, a similar issue was identified and rectified in the field of neural algorithmic reasoning, with the advent of the CLRS benchmark: a dataset generator comprising graph execution traces of classical algorithms from the Introduction to Algorithms textbook. Inspired by this, we propose CLRS-Text -- a textual version of these algorithmic traces. Out of the box, CLRS-Text is capable of procedurally generating trace data for thirty diverse, challenging algorithmic tasks across any desirable input distribution, while offering a standard pipeline by which additional tasks may be created in the benchmark. We fine-tune and evaluate various LMs as generalist executors on this benchmark, validating prior work and revealing a novel, interesting challenge for the LM community. Our code is available at https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text.
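As a quick illustration of the procedural generation the abstract describes, the following is a minimal sketch of sampling one algorithmic trace and rendering it as text. The sampler calls (clrs.build_sampler, sampler.next) are documented in the CLRS repository's README; the text-formatting helper and its signature (clrs_utils.format_clrs_example, use_hints) are assumptions based on the clrs_text directory linked above, so consult that module for the canonical entry points.

import clrs
from clrs._src.clrs_text import clrs_utils

# Build a procedural sampler for one of the thirty CLRS-30 tasks.
# build_sampler returns (sampler, spec); `length` controls the input
# size, so varying it probes out-of-distribution generalization.
sampler, _ = clrs.build_sampler(
    name='insertion_sort',
    num_samples=100,
    length=8,
)

# Draw one graph execution trace from the sampled pool.
feedback = sampler.next(batch_size=1)

# Render the trace as a (question, answer) text pair for LM fine-tuning.
# NOTE: helper name and arguments are assumed, not confirmed API.
question, answer = clrs_utils.format_clrs_example(
    'insertion_sort', feedback, use_hints=True)
print(question)
print(answer)

Because both the input distribution and the algorithm are parameters of the sampler, the same script can generate training data at one length and evaluation data at another, which is the benchmark's intended out-of-distribution protocol.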