Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Function Approximation
Markov perfect equilibrium
DOI:
10.48550/arxiv.2302.06606
Publication Date:
2023-01-01
AUTHORS (4)
ABSTRACT
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents. While recent works successfully address this challenge under the model of tabular Markov Games, their mechanisms critically rely on the number of states being finite and small, and do not extend to practical scenarios with enormous state spaces, where function approximation must be used to approximate value functions or policies. This paper presents the first line of MARL algorithms that provably resolve the curse of multiagency under function approximation. We design a new decentralized algorithm -- V-Learning with Policy Replay, which gives the first polynomial sample complexity results for learning approximate Coarse Correlated Equilibria (CCEs) of Markov Games under linear function approximation. Our algorithm always outputs Markov CCEs, and achieves an optimal rate of $\widetilde{\mathcal{O}}(\epsilon^{-2})$ for finding $\epsilon$-optimal solutions. Also, when restricted to the tabular case, our result improves over the current best rate of $\widetilde{\mathcal{O}}(\epsilon^{-3})$ for finding Markov CCEs. We further present an alternative algorithm -- Decentralized Optimistic Policy Mirror Descent, which finds policy-class-restricted CCEs using a polynomial number of samples. In exchange for this weaker notion of CCEs, the algorithm applies to a wider range of problems under generic function approximation, such as linear quadratic games and games with low "marginal" Eluder dimension.
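For context, the equilibrium notion referenced in the abstract can be written in standard notation (a sketch of the usual definition, not quoted from the paper): a joint, possibly correlated, policy $\pi$ is an $\epsilon$-approximate Coarse Correlated Equilibrium if no agent can gain more than $\epsilon$ in value by unilaterally deviating while the other agents continue to follow $\pi$, i.e.
$$
\max_{i \in [m]} \Big( \max_{\pi_i'} V_i^{\pi_i', \pi_{-i}}(s_1) \;-\; V_i^{\pi}(s_1) \Big) \;\le\; \epsilon,
$$
where $m$ is the number of agents, $V_i^{\pi}(s_1)$ denotes agent $i$'s value under the joint policy $\pi$ from the initial state $s_1$, and $\pi_i'$ ranges over agent $i$'s unilateral deviation policies. Under this notion, a rate of $\widetilde{\mathcal{O}}(\epsilon^{-2})$ means that the number of episodes required to output such a policy scales as $\epsilon^{-2}$, up to logarithmic and problem-dependent factors.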