Near-Optimal BRL using Optimistic Local Transitions

DOI: 10.48550/arxiv.1206.4613 Publication Date: 2012-01-01
ABSTRACT
Model-based Bayesian Reinforcement Learning (BRL) allows a sound formalization of the problem of acting optimally while facing an unknown environment, i.e., avoiding the exploration-exploitation dilemma. However, algorithms explicitly addressing BRL suffer from such a combinatorial explosion that a large body of work relies on heuristic algorithms. This paper introduces BOLT, a simple and (almost) deterministic heuristic algorithm for BRL which is optimistic about the transition function. We analyze BOLT's sample complexity and show that, under certain parameters, the algorithm is near-optimal in the Bayesian sense with high probability. Experimental results then highlight the key differences of this method compared with previous work.
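The abstract describes BOLT as being "optimistic about the transition function." One common way to realize transition-optimism in model-based BRL is to add a fixed number of artificial pseudo-counts toward the currently best-valued successor state before computing the mean-transition MDP, then solve that optimistic MDP by value iteration. The sketch below illustrates this idea only; the function name, the bonus parameter `eta`, and the exact placement of the pseudo-counts are illustrative assumptions, not the paper's precise specification.

```python
import numpy as np

def optimistic_value_iteration(counts, rewards, eta=1.0, gamma=0.95, iters=100):
    """Illustrative sketch of transition-optimism (BOLT-style, assumed form).

    counts[s, a, s'] -- Dirichlet posterior counts over successor states.
    rewards[s, a]    -- known mean rewards.
    eta              -- hypothetical optimism bonus: artificial transition
                       counts placed on the most valuable successor state.
    """
    S, A, _ = counts.shape
    V = np.zeros(S)
    Q = np.zeros((S, A))
    for _ in range(iters):
        # The bonus is most valuable on the successor with highest V,
        # so that placement is the optimistic choice for every (s, a).
        best_next = int(np.argmax(V))
        for s in range(S):
            for a in range(A):
                c = counts[s, a].astype(float).copy()
                c[best_next] += eta            # optimistic pseudo-counts
                p = c / c.sum()                # mean of the boosted Dirichlet
                Q[s, a] = rewards[s, a] + gamma * (p @ V)
        V = Q.max(axis=1)
    return V, Q
```

Because the pseudo-counts only shift probability mass toward the highest-valued successor, the resulting values dominate those of the plain posterior-mean MDP (`eta = 0`), which is the sense in which the model is optimistic.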