Learning in Nonzero-Sum Stochastic Games with Potentials
FOS: Computer and information sciences
0202 electrical engineering, electronic engineering, information engineering
Computer Science - Multiagent Systems
02 engineering and technology
Multiagent Systems (cs.MA)
DOI:
10.48550/arxiv.2103.09284
Publication Date:
2021-01-01
AUTHORS (9)
ABSTRACT
Multi-agent reinforcement learning (MARL) has become effective in tackling discrete cooperative game scenarios. However, MARL has yet to penetrate settings beyond those modelled by team and zero-sum games, confining it to a small subset of multi-agent systems. In this paper, we introduce a new generation of MARL learners that can handle nonzero-sum payoff structures and continuous settings. In particular, we study the MARL problem in a class of games known as stochastic potential games (SPGs) with continuous state-action spaces. Unlike cooperative games, in which all agents share a common reward, SPGs are capable of modelling real-world scenarios where agents seek to fulfil their individual goals. We prove theoretically that our learning method, SPot-AC, enables independent agents to learn Nash equilibrium strategies in polynomial time. We demonstrate that our framework tackles previously unsolvable tasks such as Coordination Navigation and large selfish routing games, and that it outperforms state-of-the-art MARL baselines such as MADDPG and COMIX.