Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits
FOS: Computer and information sciences
0202 electrical engineering, electronic engineering, information engineering
Machine Learning (stat.ML)
02 engineering and technology
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.1709.04004
Publication Date:
2017-01-01
AUTHORS (3)
ABSTRACT
In Proceedings of the 35th International Conference on Machine Learning (ICML 2018), PMLR 80:5306-5314, Stockholmsmässan, Stockholm, Sweden.

In this paper, we propose and study opportunistic bandits - a new variant of bandits where the regret of pulling a suboptimal arm varies under different environmental conditions, such as network load or produce price. When the load/price is low, so is the cost/regret of pulling a suboptimal arm (e.g., trying a suboptimal network configuration). Therefore, intuitively, we could explore more when the load/price is low and exploit more when the load/price is high. Inspired by this intuition, we propose an Adaptive Upper-Confidence-Bound (AdaUCB) algorithm to adaptively balance the exploration-exploitation tradeoff for opportunistic bandits. We prove that AdaUCB achieves $O(\log T)$ regret with a smaller coefficient than the traditional UCB algorithm. Furthermore, AdaUCB achieves $O(1)$ regret with respect to $T$ if the exploration cost is zero when the load level is below a certain threshold. Finally, based on both synthetic data and real-world traces, experimental results show that AdaUCB significantly outperforms other bandit algorithms, such as UCB and TS (Thompson Sampling), under large load/price fluctuations.
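Below is a minimal sketch of the idea the abstract describes: a standard UCB index whose exploration bonus is scaled down as the observed load rises, so that suboptimal arms are mostly tried in low-load (low-regret) rounds. The exact AdaUCB index is defined in the paper; the (1 - normalized load) scaling, the class name AdaptiveUCB, and the load_min/load_max normalization bounds here are illustrative assumptions, not the paper's specification.

```python
import math
import random

class AdaptiveUCB:
    """Load-aware UCB sketch: the exploration bonus shrinks as the observed
    load rises, concentrating exploration in low-load rounds. The (1 - load)
    scaling is an illustrative assumption, not the paper's exact AdaUCB index."""

    def __init__(self, n_arms, load_min=0.0, load_max=1.0):
        self.n_arms = n_arms
        self.load_min = load_min          # assumed lower bound on the load signal
        self.load_max = load_max          # assumed upper bound on the load signal
        self.counts = [0] * n_arms        # number of pulls per arm
        self.means = [0.0] * n_arms       # empirical mean reward per arm
        self.t = 0                        # total rounds played

    def select_arm(self, load):
        self.t += 1
        # Pull each arm once so every empirical mean is defined.
        for j in range(self.n_arms):
            if self.counts[j] == 0:
                return j
        # Normalize the observed load to [0, 1], clamping out-of-range values.
        span = self.load_max - self.load_min
        norm = min(max((load - self.load_min) / span, 0.0), 1.0) if span > 0 else 1.0
        # Scale the usual UCB confidence width by (1 - norm): full exploration
        # at zero load, pure exploitation (greedy on empirical means) at peak load.
        ucb = [
            self.means[j]
            + math.sqrt((1.0 - norm) * 2.0 * math.log(self.t) / self.counts[j])
            for j in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=lambda j: ucb[j])

    def update(self, arm, reward):
        # Incremental update of the pulled arm's empirical mean.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

# Toy usage: 3 Bernoulli arms with hypothetical success rates and uniform loads.
if __name__ == "__main__":
    bandit = AdaptiveUCB(n_arms=3)
    rates = [0.3, 0.5, 0.7]
    for _ in range(1000):
        load = random.random()
        arm = bandit.select_arm(load)
        reward = 1.0 if random.random() < rates[arm] else 0.0
        bandit.update(arm, reward)
    print(bandit.counts, [round(m, 2) for m in bandit.means])
```

Note the design choice in this sketch: at peak load the bonus vanishes and the policy is purely greedy, which mirrors the abstract's intuition that exploration should be deferred to rounds where its cost is low.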