Wenjia Ba

ORCID: 0000-0003-3427-415X
Research Areas
  • Auction Theory and Applications
  • Advanced Bandit Algorithms Research
  • Supply Chain and Inventory Management
  • Consumer Market Behavior and Pricing
  • Reinforcement Learning in Robotics
  • Optimization and Search Problems
  • Mobile Crowdsensing and Crowdsourcing
  • Machine Learning and Algorithms
  • Distributed Sensor Networks and Detection Algorithms
  • Digital Platforms and Economics

University of British Columbia
2025

Stanford University
2020-2021

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback

Curious about how players can learn and adapt in unknown games without knowing the game's dynamics? In "Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback," Ba, Lin, Zhang, and Zhou present a novel bandit learning algorithm for no-regret learning in which each player observes only its own reward, determined by all players' current joint action, and not the gradient. Focusing on smooth and strongly monotone games, they introduce an algorithm that uses self-concordant barrier functions. This...

10.1287/opre.2021.0445 article EN Operations Research 2025-01-03

We consider online no-regret learning in unknown games with bandit feedback, where each agent observes only its reward at each time, determined by all players' current joint action, rather than its gradient. We focus on the class of smooth and strongly monotone games and study optimal no-regret learning therein. Leveraging self-concordant barrier functions, we first construct an online bandit convex optimization algorithm and show that it achieves the single-agent optimal regret of $\tilde{\Theta}(\sqrt{T})$ under smooth and strongly concave payoff functions. We then show that if...

10.2139/ssrn.3978421 article EN SSRN Electronic Journal 2021-01-01

We study the implications of selling through a voice-based virtual assistant (VA). The seller has a set of products available, and the VA decides which product to offer and at what price, seeking to maximize its revenue, consumer surplus, or total surplus. The consumer is impatient and rational, seeking to maximize her expected utility given the information available to her. The VA selects products based on the consumer's request and other information available to it, and then presents them sequentially. Once a product is presented and priced, the consumer evaluates it and decides whether to make a purchase. The consumer's valuation of each product comprises a pre-evaluation value, which is common...

10.2139/ssrn.3718080 article EN SSRN Electronic Journal 2020-01-01

We consider online no-regret learning in unknown games with bandit feedback, where each player can only observe its reward at each time, determined by all players' current joint action, rather than its gradient. We focus on the class of \textit{smooth and strongly monotone} games and study optimal no-regret learning therein. Leveraging self-concordant barrier functions, we first construct a new bandit learning algorithm and show that it achieves the single-agent optimal regret of $\tilde{\Theta}(n\sqrt{T})$ under smooth and strongly concave reward functions ($n \geq 1$ is the problem...

10.48550/arxiv.2112.02856 preprint EN other-oa arXiv (Cornell University) 2021-01-01
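
The bandit feedback setting described in the abstract above, where a player sees only a reward value and never a gradient, can be illustrated with the classic one-point (sphere-sampling) gradient estimate. The sketch below is a generic illustration under stated assumptions, not the algorithm from the paper: it does not use self-concordant barrier functions, and the quadratic reward, its maximizer `c`, the smoothing radius `delta`, and all function names are hypothetical choices made for the example.

```python
import numpy as np


def sample_unit_sphere(rng, n, size):
    """Draw `size` points uniformly from the unit sphere in R^n."""
    g = rng.standard_normal((size, n))
    return g / np.linalg.norm(g, axis=1, keepdims=True)


def one_point_gradient_estimates(reward, x, delta, rng, size):
    """Return `size` independent one-point gradient estimates at x.

    Each estimate uses a single reward observation at the perturbed
    action x + delta * u, which is all that bandit feedback provides.
    """
    n = x.shape[0]
    u = sample_unit_sphere(rng, n, size)
    values = reward(x + delta * u)  # one scalar reward per perturbed action
    return (n / delta) * values[:, None] * u


# Hypothetical strongly concave reward with maximizer c (illustration only).
c = np.array([1.0, -0.5])


def reward(actions):
    return -np.sum((actions - c) ** 2, axis=1)


rng = np.random.default_rng(0)
x = np.zeros(2)
estimates = one_point_gradient_estimates(reward, x, delta=0.5, rng=rng, size=200_000)

# A single estimate is very noisy; the average over many draws approaches
# the gradient, which for this quadratic reward is -2 * (x - c).
mean_estimate = estimates.mean(axis=0)
true_gradient = -2.0 * (x - c)
print(mean_estimate, true_gradient)
```

For a quadratic reward the estimate is unbiased for the true gradient; for general smooth rewards it is unbiased only for the gradient of a smoothed version of the reward, so `delta` trades bias against variance.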

We study the implications of selling through a voice-based virtual assistant (VA). The seller has a set of products available, and the VA decides which product to offer and at what price, seeking to maximize its revenue, consumer surplus, or total surplus. The consumer is impatient and rational, seeking to maximize her expected utility given the information available to her. The VA selects products based on the consumer's request and other information available to it, and then presents them sequentially. Once a product is presented and priced, the consumer evaluates it and decides whether to make a purchase. The consumer's valuation of each product comprises a pre-evaluation value, which is common...

10.48550/arxiv.2009.03719 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We present a data-driven algorithm that advertisers can use to automate their digital ad-campaigns at online publishers. The algorithm enables the advertiser to search across available target audiences and ad-media to find the best possible combination for its campaign via online experimentation. The problem of finding the best audience-ad combination is complicated by a number of distinctive challenges, including (a) a need for active exploration to resolve prior uncertainty and to speed the search for profitable combinations, (b) many combinations to choose from, giving rise...

10.48550/arxiv.2209.08403 preprint EN other-oa arXiv (Cornell University) 2022-01-01