Continuous Control for Searching and Planning with a Learned Model

Subjects: 0301 basic medicine; 03 medical and health sciences; FOS: Computer and information sciences; Computer Science - Machine Learning (cs.LG); Computer Science - Artificial Intelligence (cs.AI)
DOI: 10.48550/arxiv.2006.07430 Publication Date: 2020-01-01
ABSTRACT
Decision-making agents with planning capabilities have achieved huge success in challenging domains such as Chess, Shogi, and Go. In an effort to generalize the planning ability to more general tasks where the environment dynamics are not available to the agent, researchers proposed the MuZero algorithm, which can learn the dynamical model through interactions with the environment. In this paper, we provide a way, and the necessary theoretical results, to extend the MuZero algorithm to more generalized environments with continuous action spaces. Through numerical results on two relatively low-dimensional MuJoCo environments, we show that the proposed algorithm outperforms the soft actor-critic (SAC) algorithm, a state-of-the-art model-free deep reinforcement learning algorithm.