Evaluating Spatial Understanding of Large Language Models

Spatial Ability
DOI: 10.48550/arxiv.2310.14540 Publication Date: 2023-01-01
ABSTRACT
Large language models (LLMs) show remarkable capabilities across a variety of tasks. Despite only seeing text in training, several recent studies suggest that LLM representations implicitly capture aspects of the underlying grounded concepts. Here, we explore LLM representations of a particularly salient kind of grounded knowledge -- spatial relationships. We design natural-language navigation tasks and evaluate the ability of LLMs, in particular GPT-3.5-turbo, GPT-4, and Llama2 series models, to represent and reason about spatial structures. These tasks reveal substantial variability in LLM performance across different spatial structures, including square, hexagonal, and triangular grids, rings, and trees. In extensive error analysis, we find that LLMs' mistakes reflect both spatial and non-spatial factors. These findings suggest that LLMs appear to capture certain aspects of spatial structure implicitly, but room for improvement remains.
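As a concrete illustration of the kind of natural-language navigation task described in the abstract, the sketch below generates a short walk on a square grid and phrases it as a prompt with a checkable final-position answer. This is a minimal hypothetical example: the grid size, move vocabulary, starting position, and prompt wording are assumptions for illustration, not the authors' released evaluation code.

```python
import random

# Hypothetical sketch of a square-grid navigation task. The grid size,
# move names, and prompt wording are illustrative assumptions, not the
# paper's exact protocol.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def make_navigation_task(grid_size=3, num_steps=5, seed=0):
    rng = random.Random(seed)
    x, y = grid_size // 2, grid_size // 2  # start at the grid center
    steps = []
    for _ in range(num_steps):
        # Keep only moves that stay inside the grid.
        legal = [m for m, (dx, dy) in MOVES.items()
                 if 0 <= x + dx < grid_size and 0 <= y + dy < grid_size]
        move = rng.choice(legal)
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        steps.append(move)
    prompt = (
        f"You are on a {grid_size}x{grid_size} grid, starting at the center. "
        f"You move {', then '.join(steps)}. "
        "Which cell are you on now, given as (column, row) with (0, 0) at the bottom left?"
    )
    return prompt, (x, y)  # gold answer to score the model's reply against

if __name__ == "__main__":
    prompt, answer = make_navigation_task()
    print(prompt)
    print("expected answer:", answer)
```

Analogous generators for hexagonal or triangular grids, rings, and trees would only change the move vocabulary and the neighborhood structure; the prompt-and-gold-answer pattern stays the same.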