A Game-Theoretic Approach to Multi-Agent Trust Region Optimization

Keywords: Stochastic game, Trust region, Fictitious play
DOI: 10.48550/arxiv.2106.06828
Publication Date: 2021-01-01
ABSTRACT
Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration. Nonetheless, when applied in multi-agent settings, the guarantee of trust region methods no longer holds, because an agent's payoff is also affected by other agents' adaptive behaviors. To tackle this problem, we conduct a game-theoretical analysis in the policy space and propose a multi-agent trust region learning method (MATRL), which enables trust region optimization for multi-agent learning. Specifically, MATRL finds a stable improvement direction that is guided by the solution concept of Nash equilibrium at the meta-game level. We derive the monotonic improvement guarantee in multi-agent settings and empirically show the local convergence of MATRL to stable fixed points in the two-player rotational differential game. To test our method, we evaluate MATRL in both discrete and continuous multiplayer general-sum games, including the checker and switch grid worlds, multi-agent MuJoCo, and Atari games. Results suggest that MATRL significantly outperforms strong multi-agent reinforcement learning baselines.
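The meta-game idea in the abstract can be illustrated with a minimal sketch. Below, each of two agents chooses between keeping its old policy (action 0) and switching to a trust-region-improved candidate (action 1), and we enumerate the pure-strategy Nash equilibria of the resulting 2x2 meta-game. The payoff matrices, function name, and the restriction to pure strategies are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of the meta-game step: each agent picks between
# its old policy (0) and a trust-region-improved candidate (1), and a
# stable joint update direction is a Nash equilibrium of the meta-game.
import itertools


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all pure-strategy profiles (i, j) where neither agent
    can gain by unilaterally deviating."""
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i, j in itertools.product(range(n_rows), range(n_cols)):
        # Agent A best-responds over rows given column j; agent B over
        # columns given row i.
        best_row = max(payoff_a[r][j] for r in range(n_rows))
        best_col = max(payoff_b[i][c] for c in range(n_cols))
        if payoff_a[i][j] == best_row and payoff_b[i][j] == best_col:
            equilibria.append((i, j))
    return equilibria


# Toy meta-game payoffs (assumed estimates of expected returns): both
# agents jointly switching to their candidate policies is stable.
A = [[1.0, 0.0],
     [0.5, 2.0]]  # row agent: old vs. candidate policy
B = [[1.0, 0.5],
     [0.0, 2.0]]  # column agent: old vs. candidate policy
print(pure_nash_equilibria(A, B))  # → [(0, 0), (1, 1)]
```

In this toy instance the meta-game has two pure equilibria, and a method like the one described would prefer the payoff-dominant profile (1, 1), i.e., both agents adopting their trust-region updates.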