Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents

DOI: 10.1609/aaai.v34i04.6036 Publication Date: 2020-06-29T20:56:40Z
ABSTRACT
Although Reinforcement Learning (RL) has been one of the most successful approaches for learning in sequential decision-making problems, the sample complexity of RL techniques still represents a major challenge for practical applications. To combat this challenge, whenever a competent policy (e.g., either a legacy system or a human demonstrator) is available, the agent could leverage samples from that policy (advice) to improve sample efficiency. However, advice is normally limited, hence it should ideally be directed to states where the agent is uncertain about the best action to execute. In this work, we propose Requesting Confidence-Moderated Policy advice (RCMP), an action-advising framework in which the agent asks for advice when its epistemic uncertainty is high for a certain state. RCMP takes into account that advice is limited and might be suboptimal. We also describe a technique to estimate the agent's uncertainty by performing minor modifications to standard value-function-based RL methods. Our empirical evaluations show that RCMP performs better than Importance Advising, not receiving advice, and receiving advice at random states in Gridworld and Atari Pong scenarios.
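
ILLUSTRATIVE SKETCH
The abstract mentions estimating epistemic uncertainty through minor modifications to standard value-function-based methods. One common way to realize this idea is to give a Q-network several output heads and treat disagreement (variance) across the heads as a proxy for epistemic uncertainty, requesting advice only when that variance is high. The sketch below is a minimal illustration of that general technique in PyTorch, not the paper's exact implementation; the network sizes, head count, and threshold are hypothetical placeholder values.

    import torch
    import torch.nn as nn

    class MultiHeadQNetwork(nn.Module):
        """Q-network with several output heads; the spread of the
        heads' estimates serves as a proxy for epistemic uncertainty."""
        def __init__(self, obs_dim, n_actions, n_heads=5, hidden=128):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            self.heads = nn.ModuleList(
                nn.Linear(hidden, n_actions) for _ in range(n_heads)
            )

        def forward(self, obs):
            z = self.body(obs)
            # Returns shape (n_heads, batch, n_actions).
            return torch.stack([head(z) for head in self.heads])

    def should_request_advice(q_net, obs, threshold=0.1):
        """Request advice when the heads disagree, i.e. when the
        variance of the Q-estimates for this state is high."""
        with torch.no_grad():
            q_values = q_net(obs.unsqueeze(0))  # (n_heads, 1, n_actions)
        # Variance across heads, averaged over actions -> scalar.
        uncertainty = q_values.var(dim=0).mean().item()
        return uncertainty > threshold

    # Hypothetical usage inside an interaction loop:
    net = MultiHeadQNetwork(obs_dim=4, n_actions=2)
    obs = torch.randn(4)
    if should_request_advice(net, obs):
        pass  # query the demonstrator's policy (the advice source) here

Because the heads share a body but are initialized independently, they agree on well-explored states and diverge on unfamiliar ones, which is what makes the variance a usable uncertainty signal for directing limited advice.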