Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Function Approximation
Markov perfect equilibrium
DOI:
10.48550/arxiv.2302.06606
Publication Date:
2023-01-01
AUTHORS (4)
ABSTRACT
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents. While recent works successfully address this challenge under the model of tabular Markov Games, their mechanisms critically rely on the number of states being finite and small, and do not extend to practical scenarios with enormous state spaces, where function approximation must be used to approximate value functions or policies. This paper presents the first line of MARL algorithms that provably resolve the curse of multiagency under function approximation. We design a new decentralized algorithm -- V-Learning with Policy Replay, which gives the first polynomial sample complexity results for learning approximate Coarse Correlated Equilibria (CCEs) of Markov Games under linear function approximation. Our algorithm always outputs Markov CCEs, and achieves an optimal rate of $\widetilde{\mathcal{O}}(\epsilon^{-2})$ for finding $\epsilon$-optimal solutions. Also, when restricted to the tabular case, our result improves over the current best rate of $\widetilde{\mathcal{O}}(\epsilon^{-3})$ for finding Markov CCEs. We further present an alternative algorithm -- Decentralized Optimistic Policy Mirror Descent, which finds policy-class-restricted CCEs using a polynomial number of samples. In exchange for this weaker notion of CCEs, the algorithm applies to a wider range of problems under generic function approximation, such as linear quadratic games and games with low "marginal" Eluder dimension.
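For context, the equilibrium notion referenced in the abstract can be written in standard notation (a sketch of the usual definition, not quoted from the paper): a joint, possibly correlated, policy $\pi$ is an $\epsilon$-approximate Coarse Correlated Equilibrium if no agent can gain more than $\epsilon$ in value by unilaterally deviating while the other agents continue to follow $\pi$, i.e.
$$
\max_{i \in [m]} \Big( \max_{\pi_i'} V_i^{\pi_i', \pi_{-i}}(s_1) \;-\; V_i^{\pi}(s_1) \Big) \;\le\; \epsilon,
$$
where $m$ is the number of agents, $V_i^{\pi}(s_1)$ denotes agent $i$'s value under the joint policy $\pi$ from the initial state $s_1$, and $\pi_i'$ ranges over agent $i$'s unilateral deviation policies. Under this notion, a rate of $\widetilde{\mathcal{O}}(\epsilon^{-2})$ means that the number of episodes required to output such a policy scales as $\epsilon^{-2}$, up to logarithmic and problem-dependent factors.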