Sam Devlin

ORCID: 0000-0002-7769-3090
Research Areas
  • Reinforcement Learning in Robotics
  • Artificial Intelligence in Games
  • Digital Games and Media
  • Educational Games and Gamification
  • Data Stream Mining Techniques
  • Software Engineering Research
  • Adaptive Dynamic Programming Control
  • Evolutionary Algorithms and Applications
  • Gambling Behavior and Treatments
  • Multi-Agent Systems and Negotiation
  • Auction Theory and Applications
  • Topic Modeling
  • Robot Manipulation and Learning
  • Sports Analytics and Performance
  • Domain Adaptation and Few-Shot Learning
  • Game Theory and Applications
  • Explainable Artificial Intelligence (XAI)
  • Machine Learning and Data Classification
  • Human Pose and Action Recognition
  • Adversarial Robustness in Machine Learning
  • Model-Driven Software Engineering Techniques
  • Social Robot Interaction and HRI
  • Formal Methods in Verification
  • Advanced Bandit Algorithms Research
  • Evolutionary Game Theory and Cooperation

Microsoft Research (United Kingdom)
2018-2025

University of York
2012-2021

Microsoft (United States)
2019-2021

Ollscoil na Gaillimhe – University of Galway
2017

Galway-Mayo Institute of Technology
2017

City, University of London
2014

Potential-based reward shaping can significantly improve the time needed to learn an optimal policy and, in multi-agent systems, the performance of the final joint-policy. It has been proven to not alter the optimal policy of an agent learning alone or the Nash equilibria of multiple agents learning together. However, a limitation of existing proofs is the assumption that the potential of a state does not change dynamically during the learning. This assumption is often broken, especially if the reward-shaping function is generated automatically. In this paper we prove and demonstrate...

10.5555/2343576.2343638 article EN Adaptive Agents and Multi-Agents Systems 2012-06-04
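
The shaping idea in the abstract above can be illustrated with a small sketch. It assumes the standard potential-based form, in which the shaping term is the discounted potential of the next state minus the potential of the current state, and it lets the potential take a time index so that it may change during learning. The function names, the example potential `example_phi`, and the use of the step index as the time argument are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Callable, Sequence

# A potential maps (state, time) to a real number. Allowing a time
# argument lets the potential change while the agent is learning.
Potential = Callable[[int, int], float]


def shaping_term(phi: Potential, s: int, t: int,
                 s_next: int, t_next: int, gamma: float) -> float:
    """Potential-based shaping term: gamma * phi(s', t') - phi(s, t)."""
    return gamma * phi(s_next, t_next) - phi(s, t)


def discounted_shaping_sum(phi: Potential, states: Sequence[int],
                           gamma: float) -> float:
    """Discounted sum of shaping terms along one trajectory.

    Assumption for this sketch: the step index doubles as the time
    at which each state's potential is evaluated.
    """
    total = 0.0
    for t in range(len(states) - 1):
        total += gamma ** t * shaping_term(
            phi, states[t], t, states[t + 1], t + 1, gamma)
    return total


def example_phi(state: int, t: int) -> float:
    """Hypothetical potential: negative distance to a goal at state 5,
    scaled by a factor that grows with time."""
    return -abs(5 - state) * (1.0 + 0.1 * t)


if __name__ == "__main__":
    trajectory = [0, 1, 2, 3]
    print(discounted_shaping_sum(example_phi, trajectory, gamma=0.9))
```

In this sketch the discounted shaping terms telescope, so their sum along a trajectory depends only on the potentials of the first and last states at the times they were visited. The shaped reward an agent would learn from is the environment reward plus `shaping_term`.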

Esports are competitive videogames watched by audiences. Most esports generate detailed data for each match that are publicly available. Esports analytics research is focused on predicting match outcomes. Previous research has emphasized prematch prediction and used data from amateur games, which are more easily available than those from the professional level. However, the commercial value of win prediction exists at the professional level. Furthermore, predicting real-time data is unexplored, as is its potential for informing audiences. Here, we present the first comprehensive case study on live win prediction in a professional esport. We...

10.1109/tg.2019.2948469 article EN cc-by IEEE Transactions on Games 2019-11-19

Effectively incorporating external advice is an important problem in reinforcement learning, especially as it moves into the real world. Potential-based reward shaping is a way to provide the agent with a specific form of additional reward, with the guarantee of policy invariance. In this work we give a novel way to incorporate an arbitrary reward function with the same guarantee, by implicitly translating it in the form of dynamic advice potentials, which are maintained as an auxiliary value function learnt at the same time. We show that advice provided in this way captures the input reward function in expectation, and...

10.1609/aaai.v29i1.9628 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2015-02-21

Potential-based reward shaping has previously been proven to both be equivalent to Q-table initialisation and guarantee policy invariance in single-agent reinforcement learning. The method has since been used in multi-agent reinforcement learning without consideration of whether the theoretical equivalence and guarantees hold. This paper extends the existing proofs to similar results in multi-agent systems, providing the theoretical background to explain the success of previous empirical studies. Specifically, it is proven that the equivalence to Q-table initialisation remains and the Nash Equilibria of the underlying stochastic game...

10.5555/2030470.2030503 article EN Adaptive Agents and Multi-Agents Systems 2011-05-02

Difference rewards and potential-based reward shaping can both significantly improve the joint policy learnt by multiple reinforcement learning agents acting simultaneously in the same environment. Difference rewards capture an agent's contribution to the system's performance. Potential-based reward shaping has been proven to not alter the Nash equilibria of the system but requires domain-specific knowledge. This paper introduces two novel reward functions that combine these methods to leverage the benefits of both. Using the difference reward's Counterfactual as...

10.5555/2615731.2615761 article EN Adaptive Agents and Multi-Agents Systems 2014-05-05
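
As a rough illustration of the difference reward mentioned in the abstract above, the sketch below uses the commonly cited form: the system reward for the joint action minus the system reward for a counterfactual in which one agent's action is replaced by a default. The system reward `coverage`, the default action `None`, and all function names are hypothetical. The two combined reward functions the paper introduces are not reproduced here, since the abstract is truncated before describing them.

```python
from typing import Callable, Hashable, List, Optional

Action = Optional[Hashable]


def difference_reward(system_reward: Callable[[List[Action]], float],
                      joint_action: List[Action],
                      agent: int,
                      default_action: Action = None) -> float:
    """Difference reward for one agent: G(z) - G(z with agent's action
    replaced by a default). The default action is an assumption."""
    counterfactual = list(joint_action)
    counterfactual[agent] = default_action
    return system_reward(joint_action) - system_reward(counterfactual)


def coverage(joint_action: List[Action]) -> float:
    """Hypothetical system reward: number of distinct resources chosen.
    A None action means the agent chose nothing."""
    return float(len({a for a in joint_action if a is not None}))


if __name__ == "__main__":
    joint = ["a", "a", "b"]
    for i in range(len(joint)):
        print(i, difference_reward(coverage, joint, i))
```

With the joint action `["a", "a", "b"]`, the two agents that both chose `"a"` each receive a difference reward of 0, because removing either leaves coverage unchanged, while the agent that chose `"b"` receives 1.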

Diffusion models have emerged as powerful generative models in the text-to-image domain. This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments. Human behaviour is stochastic and multimodal, with structured correlations between action dimensions. Meanwhile, standard modelling choices in behaviour cloning are limited in their expressiveness and may introduce bias into the cloned policy. We begin by pointing out the limitations of these choices. We then propose that diffusion models are an excellent...

10.48550/arxiv.2301.10677 preprint EN other-oa arXiv (Cornell University) 2023-01-01

This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory, potential-based reward shaping does not alter the Nash Equilibria of a stochastic game, only the exploration of the shaped agent. We demonstrate empirically the performance of reward shaping in two problem domains within the context of RoboCup KeepAway by designing three reward shaping schemes, encouraging specific behaviour such as keeping a minimum distance from other players on the same team and taking on specific roles. The...

10.1142/s0219525911002998 article EN Advances in Complex Systems 2011-04-01

In the game industry, especially for free to play games, player retention and purchases are important issues. There have been several approaches investigated towards predicting them by players' behaviours during game sessions. However, most current methods are only available for specific games because the data representations utilised are usually game specific. This work intends to use the frequency of game events as data representations to predict both players' disengagement from the game and the decisions of their first purchases. This method is able to provide better generality because events exist...

10.1109/cig.2015.7317919 article EN 2015-08-01

The majority of multi-agent reinforcement learning (MARL) implementations aim to optimize systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective in nature. Research into multi-objective MARL is still in its infancy, and few studies to date have dealt with the issue of credit assignment. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL, however it has been shown to alter the intended goals of a domain if misused, leading to unintended...

10.1017/s0269888918000292 article EN The Knowledge Engineering Review 2018-01-01

Though deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples. As state-of-the-art reinforcement learning (RL) systems require an exponentially increasing number of samples, their development is restricted to a continually shrinking segment of the AI community. Likewise, many of these systems cannot be applied to real-world problems, where environment samples are expensive. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this...

10.48550/arxiv.1904.10079 preprint EN cc-by arXiv (Cornell University) 2019-01-01

Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.

10.48550/arxiv.1901.08129 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Esports - video games played competitively that are broadcast to large audiences - are a rapidly growing new form of mainstream entertainment. Esports borrow from traditional TV, but are a qualitatively different genre, due to the high flexibility of content capture and availability of detailed gameplay data. Indeed, in esports, there is access to both real-time and historical data about any action taken in the virtual world. This aspect motivates the research presented here, the question asked being: can the information buried deep in such data,...

10.1145/3210825.3210833 article EN 2018-06-25

Monte Carlo Tree Search (MCTS) has become a popular solution for controlling non-player characters. Its use has repeatedly been shown to be capable of creating strong game playing opponents. However, the emergent playstyle of agents using MCTS is not necessarily human-like, believable or enjoyable. AI Factory Spades, currently the top rated Spades game in the Google Play store, uses a variant of MCTS to control non-player characters. In collaboration with the developers, we collected gameplay data from 27,592 games and showed in a previous study that...

10.1609/aiide.v12i1.12858 article EN Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 2021-06-25

In 2016-2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games. This competition fills a gap in existing game artificial intelligence (AI) competitions that have typically focused on traditional card/board games or modern video games with graphical interfaces. By providing a platform for evaluating agents in text-based adventures, the competition provides a novel benchmark for game AI with unique challenges in natural language understanding and generation....

10.1109/tg.2019.2896017 article EN IEEE Transactions on Games 2019-01-29

The ability for policies to generalize to new environments is key to the broad application of RL agents. A promising approach to prevent an agent's policy from overfitting to a limited set of training environments is to apply regularization techniques originally developed for supervised learning. However, there are stark differences between supervised learning and RL. We discuss those differences and propose modifications to existing regularization techniques in order to better adapt them to RL. In particular, we focus on regularization techniques relying on the injection of noise into the learned function, a family that includes some...

10.48550/arxiv.1910.12911 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Distributed denial of service (DDoS) attacks constitute a rapidly evolving threat in the current Internet. Multiagent Router Throttling is a novel approach to defend against DDoS attacks where multiple reinforcement learning agents are installed on a set of routers and learn to rate-limit or throttle traffic towards a victim server. The focus of this paper is on online learning and scalability. We propose an approach that incorporates task decomposition, team rewards and a form of reward shaping called difference rewards. One of the characteristics of the proposed...

10.1080/09540091.2015.1031082 article EN Connection Science 2015-04-15

Throughout scientific history, overarching theoretical frameworks have allowed researchers to grow beyond personal intuitions and culturally biased theories. They allow to verify and replicate existing findings, and to link disconnected results. The notion of self-play, albeit often cited in multiagent Reinforcement Learning, has never been grounded in a formal model. We present a formalized framework, with clearly defined assumptions, which encapsulates the meaning of self-play as abstracted from various...

10.1109/cig.2019.8848006 article EN 2021 IEEE Conference on Games (CoG) 2019-08-01

Esports has emerged as a popular genre for players as well as spectators, supporting a global entertainment industry. Esports analytics has evolved to address the requirement for data-driven feedback, and is focused on cyber-athlete evaluation, strategy and prediction. Towards the latter, previous work has used match data from a variety of player ranks from hobbyist to professional players. However, professional players have been shown to behave differently than lower ranked players. Given the comparatively limited supply of professional data, a key question is thus whether mixed-rank match datasets...

10.48550/arxiv.1711.06498 preprint EN cc-by arXiv (Cornell University) 2017-01-01

Monte Carlo tree search (MCTS) has become a popular solution for game artificial intelligence (AI), capable of creating strong game playing opponents. However, the emergent playstyle of agents using MCTS is not necessarily human-like, believable or enjoyable. AI Factory Spades, currently the top rated Spades game in the Google Play store, uses a variant of MCTS to control AI allies and opponents. In collaboration with the developers, we showed in a previous study that the playstyle of human players significantly differed from the AI players. This paper presents a method...

10.1109/tg.2018.2835764 article EN IEEE Transactions on Games 2018-05-11