NFDI4DS | UHH-SEMS - Publication Details

GROOT: Learning to Follow Instructions by Watching Gameplay Videos

Benchmark (surveying)

DOI: 10.48550/arxiv.2310.08235
Publication Date: 2023-01-01

AUTHORS (6)

Shaofei Cai

Bowei Zhang

Zihao Wang

Xiaojian Ma

Anji Liu

Yitao Liang

ABSTRACT

We study the problem of building a controller that can follow open-ended instructions in open-world environments. We propose to follow reference videos as instructions, which offer expressive goal specifications while eliminating the need for expensive text-gameplay annotations. A new learning framework is derived to allow learning such instruction-following controllers from gameplay videos while producing a video instruction encoder that induces a structured goal space. We implement our agent GROOT in a simple yet effective encoder-decoder architecture based on causal transformers. We evaluate GROOT against open-world counterparts and human players on a proposed Minecraft SkillForge benchmark. The Elo ratings clearly show that GROOT is closing the human-machine gap as well as exhibiting a 70% winning rate over the best generalist agent baseline. Qualitative analysis of the induced goal space further demonstrates some interesting emergent properties, including goal composition and complex gameplay behavior synthesis. The project page is available at https://craftjarvis-groot.github.io.

