A Generalized Training Approach for Multiagent Learning
Machine Learning (cs.LG)
Artificial Intelligence (cs.AI)
Multiagent Systems (cs.MA)
DOI:
10.48550/arxiv.1909.12823
Publication Date:
2019-01-01
AUTHORS (15)
ABSTRACT
This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, $\alpha$-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and applies readily to general-sum, many-player settings. We establish convergence guarantees in several game classes, and identify links between Nash equilibria and $\alpha$-Rank. We demonstrate the competitive performance of $\alpha$-Rank-based PSRO against an exact Nash solver-based PSRO in 2-player Kuhn and Leduc Poker. We then go beyond the reach of prior PSRO applications by considering 3- to 5-player poker games, yielding instances where $\alpha$-Rank achieves faster convergence than approximate Nash solvers, thus establishing it as a favorable general games solver. We also carry out an initial empirical validation in MuJoCo soccer, illustrating the feasibility of the proposed approach in another complex domain.
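The PSRO loop described in the abstract alternates between a meta-solver, which weights the current population of policies, and an oracle, which computes a best response to that weighting. The following is a minimal sketch of that loop, not the paper's implementation. It makes several simplifying assumptions: the game is rock-paper-scissors treated as a symmetric single-population game, the oracle is an exact best response over pure strategies, and the meta-solver is uniform over all past best responses rather than $\alpha$-Rank or a Nash solver. With that uniform choice the loop behaves like fictitious play, one of the special cases the abstract mentions. All names (`RPS`, `uniform_meta_solver`, `best_response`, `psro`) are hypothetical.

```python
from typing import Callable, List

# Row player's payoffs for rock-paper-scissors (zero-sum, symmetric).
# Index 0 = rock, 1 = paper, 2 = scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def uniform_meta_solver(population: List[int]) -> List[float]:
    """Assumed meta-solver: equal weight on every population member."""
    n = len(population)
    return [1.0 / n] * n


def best_response(payoffs: List[List[int]],
                  population: List[int],
                  weights: List[float]) -> int:
    """Exact oracle: the pure strategy with the highest expected payoff
    against the weighted population (ties go to the lowest index)."""
    values = [
        sum(w * payoffs[action][opponent]
            for opponent, w in zip(population, weights))
        for action in range(len(payoffs))
    ]
    return max(range(len(values)), key=values.__getitem__)


def psro(payoffs: List[List[int]],
         meta_solver: Callable[[List[int]], List[float]],
         iterations: int) -> List[int]:
    """Grow a population by repeatedly adding a best response to the
    meta-solver's mixture over the current population."""
    population = [0]  # start from a single arbitrary policy (rock)
    for _ in range(iterations):
        weights = meta_solver(population)
        population.append(best_response(payoffs, population, weights))
    return population


if __name__ == "__main__":
    pop = psro(RPS, uniform_meta_solver, iterations=50)
    print("strategies discovered:", sorted(set(pop)))
```

Swapping `uniform_meta_solver` for a function that returns a different distribution over the population is the point where a Nash-based or $\alpha$-Rank-based variant would plug in; the oracle and the outer loop stay the same.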