Discounted Reinforcement Learning Is Not an Optimization Problem

FOS: Computer and information sciences; Artificial Intelligence (cs.AI); Computer Science - Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; 0101 mathematics; 01 natural sciences
DOI: 10.48550/arxiv.1910.02140 Publication Date: 2019-01-01
ABSTRACT
Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. It is not an optimization problem in its usual formulation, so when using function approximation there is no optimal policy. We substantiate these claims, then go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches, such as maximizing average reward, for reinforcement learning in continuing tasks.
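The connection between discounting and average reward mentioned in the abstract can be illustrated with a standard identity: for a fixed policy with stationary state distribution d, the d-weighted average of the discounted values equals the average reward divided by 1 - gamma. Ranking policies by that weighted objective is therefore the same as ranking them by average reward, whatever the discount factor is. The sketch below checks the identity numerically. It is only an illustration under stated assumptions: the transition matrix `P`, the reward vector `r`, and the helper names are hypothetical and are not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Return d with d @ P = d and sum(d) = 1, assuming an ergodic chain."""
    n = P.shape[0]
    # Stack the balance equations (P^T - I) d = 0 with the normalization sum(d) = 1.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.concatenate([np.zeros(n), [1.0]])
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d

def discounted_values(P, r, gamma):
    """Solve the Bellman equation v = r + gamma * P v for a fixed policy."""
    return np.linalg.solve(np.eye(P.shape[0]) - gamma * P, r)

# Hypothetical 3-state Markov chain induced by some fixed policy (assumption).
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])
r = np.array([1.0, 0.0, 2.0])  # expected reward in each state (assumption)

d = stationary_distribution(P)
avg_reward = d @ r  # average reward of the policy

for gamma in (0.5, 0.9, 0.99):
    v = discounted_values(P, r, gamma)
    # The two printed quantities should agree up to floating-point error.
    print(gamma, d @ v, avg_reward / (1 - gamma))
```

The identity follows because d is unchanged by P, so every term of the discounted sum contributes the same average reward, scaled by the corresponding power of gamma.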