NFDI4DS | UHH-SEMS - Publication Details

Diffusion Approximations for Thompson Sampling

FOS: Computer and information sciences Computer Science - Machine Learning FOS: Mathematics Mathematics - Statistics Theory Statistics Theory (math.ST) 0101 mathematics 01 natural sciences 62B15, 60J70 Machine Learning (cs.LG)

DOI: 10.48550/arxiv.2105.09232 Publication Date: 2021-01-01

Abstract Supplemental Material References Cited by

AUTHORS (2)

Lin Fan

Peter W. Glynn

ABSTRACT

We study the behavior of Thompson sampling from perspective weak convergence. In regime where gaps between arm means scale as $1/\sqrt{n}$ with time horizon $n$, we show that dynamics evolve according to discrete versions SDE's and stochastic ODE's. As $n \to \infty$, converge weakly solutions corresponding Our convergence theory is developed first principles using Continuous Mapping Theorem, can be easily adapted analyze other sampling-based bandit algorithms. this regime, also limits many algorithms -- including designed for any exponential family rewards, involving bootstrap-based coincide those Gaussian sampling. Moreover, in these are generally robust model mis-specification.

SUPPLEMENTAL MATERIAL

Coming soon ....

REFERENCES ()

CITATIONS ()

EXTERNAL LINKS

OPENAIRE - Products OPENALEX - Publications

PlumX Metrics

Diffusion Approximations for Thompson Sampling

RECOMMENDATIONS

FAIR ASSESSMENT

Coming soon ....

JUPYTER LAB

Coming soon ....