Discounted Reinforcement Learning Is Not an Optimization Problem
FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
Computer Science - Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
0101 mathematics
01 natural sciences
DOI:
10.48550/arxiv.1910.02140
Publication Date:
2019-01-01
AUTHORS (5)
ABSTRACT
Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. It is not an optimization problem in its usual formulation, so when using function approximation there is no optimal policy. We substantiate these claims, then go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches, such as maximizing average reward, for reinforcement learning in continuing tasks.