#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
KEYWORDS
Heuristics
Granularity
DOI:
10.48550/arxiv.1611.04717
Publication Date:
2016-01-01
AUTHORS (9)
ABSTRACT
Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which allows counting their occurrences with a hash table. These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals important aspects of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration.
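
The abstract's recipe (hash the state, count occurrences of the code in a table, add a count-based bonus to the reward) can be illustrated with a short sketch. The Python below is not the authors' code: it assumes a SimHash-style static hash, where states are projected with a fixed Gaussian matrix and the sign pattern serves as the hash code, and it uses a bonus of the form beta / sqrt(n). The class and parameter names (SimHashExplorationBonus, code_bits, bonus_coef) are hypothetical.

```python
import numpy as np
from collections import defaultdict

class SimHashExplorationBonus:
    """Minimal sketch of count-based exploration via static state hashing."""

    def __init__(self, state_dim, code_bits=32, bonus_coef=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed Gaussian projection; the sign pattern of A @ state is the hash code.
        self.A = rng.standard_normal((code_bits, state_dim))
        self.bonus_coef = bonus_coef        # beta in the bonus beta / sqrt(n)
        self.counts = defaultdict(int)      # hash table: code -> visit count

    def _hash(self, state):
        # Discretize the state into a binary code usable as a dictionary key.
        bits = (self.A @ np.asarray(state, dtype=float) > 0).astype(np.int8)
        return tuple(bits.tolist())

    def bonus(self, state):
        # Increment the visit count for this state's code and return beta / sqrt(n).
        code = self._hash(state)
        self.counts[code] += 1
        return self.bonus_coef / np.sqrt(self.counts[code])


# Example usage: augment the environment reward with the exploration bonus.
if __name__ == "__main__":
    counter = SimHashExplorationBonus(state_dim=4)
    state, env_reward = np.array([0.1, -0.2, 0.05, 0.3]), 1.0
    shaped_reward = env_reward + counter.bonus(state)
    print(shaped_reward)
```

In this sketch the rest of the RL algorithm is unchanged; the bonus is simply added to the environment reward before the policy update, so rarely visited hash codes yield larger bonuses and encourage exploration.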