- Advanced Bandit Algorithms Research
- Auction Theory and Applications
- Optimization and Search Problems
- Machine Learning and Algorithms
- Reinforcement Learning in Robotics
- Game Theory and Applications
- Mobile Crowdsensing and Crowdsourcing
- Consumer Market Behavior and Pricing
- Data Stream Mining Techniques
- Smart Grid Energy Management
- Complexity and Algorithms in Graphs
- Experimental Behavioral Economics Studies
- Peer-to-Peer Network Technologies
- Opinion Dynamics and Social Influence
- Complex Network Analysis Techniques
- Game Theory and Voting Systems
- Cooperative Communication and Network Coding
- Mobile Ad Hoc Networks
- Advanced Graph Theory Research
- Caching and Content Delivery
- Adversarial Robustness in Machine Learning
- Economic theories and models
- Topological and Geometric Data Analysis
- Reservoir Engineering and Simulation Methods
- Data Management and Algorithms
Microsoft (United States)
2016-2025
Microsoft Research New York City (United States)
2014-2024
Microsoft Research (United Kingdom)
2013-2024
Association for Computing Machinery
2021
Silicon Valley Community Foundation
2011-2013
Brown University
2007
Cornell University
2003-2006
John Brown University
2006
This paper introduces a lightweight, scalable and accurate framework, called Meridian, for performing node selection based on network location. The framework consists of an overlay structured around multi-resolution rings, query routing with direct measurements, gossip protocols dissemination. We show how this can be used to address three commonly encountered problems, namely, closest discovery, central leader election, locating nodes that satisfy target latency constraints in large-scale...
In a multi-armed bandit problem, an online algorithm chooses from set of strategies in sequence $n$ trials so as to maximize the total payoff chosen strategies. While performance algorithms with small finite strategy is quite well understood, problems large sets are still topic very active investigation, motivated by practical applications such auctions and web advertisement. The goal research identify broad natural classes functions which enable design efficient solutions.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, Web search advertising. In many these application domains learner may be constrained by one or more supply (or budget) limits, addition customary limitation on time horizon. The literature lacks a general encompassing sorts problems. We introduce such model, called "bandits with...
We initiate the study of episodic reinforcement learning (RL) under adversarial corruptions in both rewards and transition probabilities underlying system, extending recent results for special case multiarmed bandits. provide a framework that modifies aggressive exploration enjoyed by existing approaches based on optimism face uncertainty complementing them with principles from action elimination. Importantly, our circumvents major challenges posed naively applying elimination RL setting, as...
Most online platforms learn from interactions with users and engage in exploration : making potentially suboptimal choices to acquire new information. We study the interplay between competition how such balance for learning users. consider a stylized duopoly which two firms face same multi-armed bandit problem. Users arrive one by choose firms, so that each firm makes progress on its problem only if it is chosen. whether incentivizes adoption of better algorithms. find stark disincentivizes...
We study the causal effects of financial incentives on quality crowdwork. focus performance-based payments (PBPs), bonus awarded to workers for producing high work. design and run randomized behavioral experiments popular crowdsourcing platform Amazon Mechanical Turk with goal understanding when, where, why PBPs help, identifying properties payment, payment structure, task itself that make them most effective. provide examples tasks which do improve quality. For such tasks, effectiveness is...
We consider the problem of designing revenue maximizing online posted-price mechanisms when seller has limited supply. A k identical items for sale and is facing n potential buyers ("agents") that are arriving sequentially. Each agent interested in buying one item. agent's value an item independent sample from some fixed (but unknown) distribution with support [0,1]. The offers a take-it-or-leave-it price to each (possibly different agents), aims maximize his expected revenue.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, Web search advertising. In many these application domains, learner may be constrained by one or more supply (or budget) limits, addition customary limitation on time horizon. The literature lacks a general encompassing sorts problems. We introduce such model, called bandits with...
Concurrent with recent theoretical interest in the problem of metric embedding, a growing body research networking community has studied distance matrix defined by node-to-node latencies Internet, resulting number approaches that approximately embed this into low-dimensional Euclidean space. Here we give algorithms provable performance guarantees for beacon-based triangulation and embedding. We show addition to multiplicative error distances, typically must include notion slack - certain...
Individual decision-makers consume information revealed by the previous decision makers, and produce that may help in future makers. This phenomenon is common a wide range of scenarios Internet economy, as well elsewhere, such medical decisions. Each maker when required to select an action, would individually prefer exploit, highest expected reward action conditional on her information. At same time, each makers explore, producing about rewards various actions. A social planner, means...
Over the past decade, crowdsourcing has emerged as a cheap and efficient method of obtaining solutions to simple tasks that are difficult for computers solve but possible humans. The popularity promise markets led both empirical theoretical research on design algorithms optimize various aspects these markets, such pricing assignment tasks. Much existing work focused problems fall into broad category online decision making; task requesters or platform itself make repeated decisions about...
Crowdsourcing markets have emerged as a popular platform for matching available workers with tasks to complete. The payment particular task is typically set by the task's requester, and may be adjusted based on quality of completed work, example, through use "bonus" payments. In this paper, we study requester's problem dynamically adjusting quality-contingent payments tasks. We consider multi-round version well-known principal-agent model, whereby in each round worker makes strategic choice...
We consider the problem of designing revenue-maximizing online posted-price mechanisms when seller has limited supply. A k identical items for sale and is facing n potential buyers (“agents”) that are arriving sequentially. Each agent interested in buying one item. agent’s value an item independent sample from some fixed (but unknown) distribution with support [0,1]. The offers a take-it-or-leave-it price to each (possibly different agents), aims maximize his expected revenue. focus on do...
In a multi-armed bandit problem, an online algorithm chooses from set of strategies in sequence trials to maximize the total payoff chosen strategies. While performance algorithms with small finite strategy is well understood, problems large sets are still topic active investigation, motivated by practical applications, such as auctions and web advertisement. The goal research identify broad natural classes functions that enable design efficient solutions. this work, we study general setting...
We consider a multi-round auction setting motivated by pay-per-click auctions for Internet advertising. In each round the auctioneer selects an advertiser and shows her ad, which is then either clicked or not. An derives value from clicks; of click private information. Initially, neither nor advertisers have any information about likelihood clicks on advertisements. The auctioneer's goal to design (dominant strategies) truthful mechanism that (approximately) maximizes social welfare.
It is widely believed that computing payments needed to induce truthful bidding somehow harder than simply the allocation. We show opposite true for single-parameter domains: creating a randomized mechanism essentially as easy single call monotone allocation function. Our main result general procedure take rule and transform it (via black-box reduction) into in expectation individually rational every realization. Moreover, implements same outcome original with probability arbitrarily close...