Online Posterior Sampling with a Diffusion Prior
FOS: Computer and information sciences
Computer Science - Machine Learning
Statistics - Machine Learning
Machine Learning (stat.ML)
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.2410.03919
Publication Date:
2024-10-04
AUTHORS (5)
ABSTRACT
Posterior sampling in contextual bandits with a Gaussian prior can be implemented exactly or approximately using the Laplace approximation. The Gaussian prior is computationally efficient but it cannot describe complex distributions. In this work, we propose approximate posterior sampling algorithms for a diffusion model prior. The key idea is to sample from a chain of conditional posteriors, one for each stage of the reverse process, which are estimated in closed form. Our approximations are motivated by the Gaussian prior, and inherit its simplicity and efficiency. They are asymptotically consistent and perform well empirically on a variety of bandit problems.
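To illustrate the key idea in the abstract, the sketch below shows how a posterior sample could be drawn by running the reverse diffusion chain and replacing each reverse kernel with a closed-form conditional posterior, exactly as one would under a plain Gaussian prior. This is a minimal sketch under assumptions not stated in the abstract: a linear-Gaussian reward model and Gaussian reverse kernels; the names mu_fn, Sigma, and all toy placeholders are hypothetical and are not the paper's exact algorithm.

    import numpy as np

    def gaussian_posterior(prior_mean, prior_cov, X, y, noise_var):
        """Closed-form posterior of theta under a Gaussian prior and
        linear-Gaussian observations y = X @ theta + noise."""
        prior_prec = np.linalg.inv(prior_cov)
        post_prec = prior_prec + X.T @ X / noise_var
        post_cov = np.linalg.inv(post_prec)
        post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
        return post_mean, post_cov

    def sample_theta_diffusion_posterior(mu_fn, Sigma, X, y, noise_var, d, rng):
        """Sample theta by running the reverse diffusion chain; at each stage,
        the Gaussian reverse kernel (mean mu_fn[t](theta), covariance Sigma[t])
        acts as the prior and is updated in closed form with the observed data.
        mu_fn and Sigma are hypothetical stand-ins for a learned diffusion model."""
        T = len(Sigma)
        theta = rng.standard_normal(d)          # theta_T ~ N(0, I)
        for t in reversed(range(T)):
            prior_mean = mu_fn[t](theta)        # stage-t conditional "prior"
            m, S = gaussian_posterior(prior_mean, Sigma[t], X, y, noise_var)
            theta = rng.multivariate_normal(m, S)
        return theta

    # Toy usage (purely illustrative placeholders):
    rng = np.random.default_rng(0)
    d, T = 3, 5
    mu_fn = [lambda th: 0.9 * th for _ in range(T)]     # toy reverse-kernel means
    Sigma = [0.1 * np.eye(d) for _ in range(T)]         # toy reverse-kernel covariances
    X = rng.standard_normal((10, d))                     # observed contexts
    y = X @ np.ones(d) + 0.1 * rng.standard_normal(10)   # observed rewards
    theta_sample = sample_theta_diffusion_posterior(mu_fn, Sigma, X, y, 0.01, d, rng)

In a Thompson-sampling loop, each round would draw theta with this chain of conditional posteriors and act greedily with respect to the sampled parameter, mirroring posterior sampling with a Gaussian prior while allowing the diffusion prior to capture more complex parameter distributions.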