NFDI4DS | UHH-SEMS - Publication Details

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback

OPENALEX - Publications

Wenjia Ba, Tianyi Lin, Jiawei Zhang, Zhengyuan Zhou

Curious about how players can learn and adapt in unknown games without knowing the game’s dynamics? In “Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback,” Ba, Lin, Zhang, and Zhou present a novel bandit learning algorithm for no-regret learning in which each player observes only its own reward, determined by all players’ current joint action, and not its gradient. Focusing on smooth and strongly monotone games, they introduce the algorithm using self-concordant barrier functions. This...

10.1287/opre.2021.0445 article EN Operations Research 2025-01-03

Optimal No-Regret Learning in Strongly Monotone Games with Bandit Feedback

OPENALEX - Publications

Tianyi Lin, Zhengyuan Zhou, Wenjia Ba, Jiawei Zhang

We consider online no-regret learning in unknown games with bandit feedback, where each agent only observes its reward at each time, determined by all players' current joint action, rather than its gradient. We focus on the class of smooth and strongly monotone games and study optimal no-regret learning therein. Leveraging self-concordant barrier functions, we first construct an online bandit convex optimization algorithm and show that it achieves the single-agent optimal regret of $\tilde{\Theta}(\sqrt{T})$ under smooth and strongly-concave payoff functions. We then show that if...

10.2139/ssrn.3978421 article EN SSRN Electronic Journal 2021-01-01

Sales Policies for a Virtual Assistant

OPENALEX - Publications

Wenjia Ba, Haim Mendelson, Mingxi Zhu

We study the implications of selling through a voice-based virtual assistant (VA). The seller has a set of products available, and the VA decides which product to offer and at what price, seeking to maximize its revenue, consumer surplus, or total surplus. The consumer is impatient and rational, seeking to maximize her expected utility given the information available to her. The VA selects products based on the consumer's request and other information available to it, and then presents them sequentially. Once a product is presented and priced, the consumer evaluates it and decides whether to make a purchase. The consumer's valuation of each product comprises a pre-evaluation value, common...

10.2139/ssrn.3718080 article EN SSRN Electronic Journal 2020-01-01

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback

OPENALEX - Publications

Tianyi Lin, Zhengyuan Zhou, Wenjia Ba, Jiawei Zhang

We consider online no-regret learning in unknown games with bandit feedback, where each player can only observe its reward at each time, determined by all players' current joint action, rather than its gradient. We focus on the class of \textit{smooth and strongly monotone} games and study optimal no-regret learning therein. Leveraging self-concordant barrier functions, we first construct a new bandit learning algorithm and show that it achieves the single-agent optimal regret of $\tilde{\Theta}(n\sqrt{T})$ under smooth and strongly concave reward functions ($n \geq 1$ is the problem...

10.48550/arxiv.2112.02856 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Sales Policies for a Virtual Assistant

OPENALEX - Publications

Wenjia Ba, Haim Mendelson, Mingxi Zhu

We study the implications of selling through a voice-based virtual assistant (VA). The seller has a set of products available, and the VA decides which product to offer and at what price, seeking to maximize its revenue, consumer surplus, or total surplus. The consumer is impatient and rational, seeking to maximize her expected utility given the information available to her. The VA selects products based on the consumer's request and other information available to it, and then presents them sequentially. Once a product is presented and priced, the consumer evaluates it and decides whether to make a purchase. The consumer's valuation of each product comprises a pre-evaluation value, common...

10.48550/arxiv.2009.03719 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Advertising Media and Target Audience Optimization via High-dimensional Bandits

OPENALEX - Publications

Wenjia Ba, J. M. Harrison, Harikesh S. Nair

We present a data-driven algorithm that advertisers can use to automate their digital ad campaigns at online publishers. The algorithm enables the advertiser to search across available target audiences and ad media to find the best possible combination for its campaign via online experimentation. The problem of finding the best audience-ad combination is complicated by a number of distinctive challenges, including (a) a need for active exploration to resolve prior uncertainty and to speed the search for profitable combinations, (b) many combinations to choose from, giving rise...

10.48550/arxiv.2209.08403 preprint EN other-oa arXiv (Cornell University) 2022-01-01