Contextual Combinatorial Bandits with Probabilistically Triggered Arms
DOI:
10.48550/arxiv.2303.17110
Publication Date:
2023-01-01
AUTHORS (7)
ABSTRACT
We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions that capture a wide range of applications, such as cascading and influence maximization bandits. Under the triggering probability modulated (TPM) condition, we devise the C$^2$-UCB-T algorithm and propose a novel analysis that achieves an $\tilde{O}(d\sqrt{KT})$ regret bound, removing a potentially exponentially large factor $O(1/p_{\min})$, where $d$ is the dimension of contexts, $p_{\min}$ is the minimum positive probability that any arm can be triggered, and batch-size $K$ is the maximum number of arms that can be triggered per round. Under the variance modulated (VM) or triggering probability and variance modulated (TPVM) conditions, we propose a new variance-adaptive algorithm VAC$^2$-UCB and derive a regret bound $\tilde{O}(d\sqrt{T})$, which is independent of the batch-size $K$. As a valuable by-product, our analysis technique and variance-adaptive algorithm can be applied to the CMAB-T and C$^2$MAB setting, improving existing results there as well. We also include experiments that demonstrate the improved performance of our algorithms compared with benchmark algorithms on synthetic and real-world datasets.
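To make the difference between the two regret bounds in the abstract concrete, the following is a minimal arithmetic sketch that compares their leading-order scales. It is not the paper's code: the function names and the values of `d`, `K`, and `T` are hypothetical, and constants and the logarithmic factors hidden by the $\tilde{O}$ notation are deliberately ignored, so only the ratio between the two scales is meaningful.

```python
import math


def scale_with_batch_size(d: int, K: int, T: int) -> float:
    """Leading-order scale d * sqrt(K * T) (illustrative; log factors and constants dropped)."""
    return d * math.sqrt(K * T)


def scale_without_batch_size(d: int, T: int) -> float:
    """Leading-order scale d * sqrt(T) (illustrative; log factors and constants dropped)."""
    return d * math.sqrt(T)


# Hypothetical problem sizes chosen only for illustration.
d, K, T = 10, 25, 10_000

with_k = scale_with_batch_size(d, K, T)
without_k = scale_without_batch_size(d, T)

# The K-dependent scale exceeds the K-independent one by a factor of sqrt(K).
ratio = with_k / without_k
print(f"d*sqrt(K*T) = {with_k:.1f}, d*sqrt(T) = {without_k:.1f}, ratio = {ratio:.2f}")
```

Under these assumptions the gap between the two scales grows as $\sqrt{K}$, which is why a bound independent of the batch-size matters most when many arms can be triggered per round.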