- Reinforcement Learning in Robotics
- Sports Analytics and Performance
- Artificial Intelligence in Games
- Game Theory and Applications
- Anomaly Detection Techniques and Applications
- Advanced Bandit Algorithms Research
- Bayesian Modeling and Causal Inference
- Formal Methods in Verification
- Auction Theory and Applications
- Optimization and Search Problems
- Experimental Behavioral Economics Studies
- Multi-Agent Systems and Negotiation
- Robotic Path Planning Algorithms
- Fault Detection and Control Systems
- Data Stream Mining Techniques
- Human Pose and Action Recognition
- Evolutionary Game Theory and Cooperation
- Target Tracking and Data Fusion in Sensor Networks
- Autonomous Vehicle Technology and Safety
- Gaussian Processes and Bayesian Inference
- Modular Robots and Swarm Intelligence
- Multimodal Machine Learning Applications
- Sports Performance and Training
- Big Data and Business Intelligence
- Logic, Reasoning, and Knowledge
Google (United States)
2019-2024
DeepMind (United Kingdom)
2020-2022
Massachusetts Institute of Technology
2014-2020
Decision Systems (United States)
2015-2019
We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are...
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially- and fully-observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both...
Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for the problems they can apply to. This learning-to-teach problem has inherent...
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing the new scientific challenges involved in the analysis of both players' and coordinated teams' behaviors. Research...
Learning to combine motor control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: Although movements are ultimately executed via instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and that often involve complex interactions with the environment and other agents. Recent research has...
We introduce α-Rank, a principled evolutionary dynamics methodology for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov Conley chains (MCCs). The approach leverages continuous-time and discrete-time evolutionary dynamical systems applied to empirical games, and scales tractably in the number of agents, the type of interactions (beyond dyadic), and the type of games (symmetric and asymmetric). Current models are fundamentally limited in one or more...
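The evolutionary-ranking idea can be illustrated with a heavily simplified, single-population sketch (an assumption for illustration; the paper's method uses multiple populations and the MCC solution concept): build a Markov chain over strategies whose invasion probabilities are a logistic function of empirical payoff differences, then read rankings off the chain's long-run distribution.

```python
import numpy as np

def evo_rank_sketch(M, alpha=10.0, noise=0.05):
    """Hedged single-population ranking sketch. M[i, j] is the
    empirical payoff of strategy i against strategy j."""
    n = M.shape[0]
    T = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                # Chance that a population at strategy i is taken over
                # by strategy j: logistic in j's payoff advantage, with
                # selection intensity alpha plus a small mutation rate.
                fix = 1.0 / (1.0 + np.exp(-alpha * (M[j, i] - M[i, j])))
                T[i, j] = (noise + (1.0 - noise) * fix) / (n - 1)
        T[i, i] = 1.0 - T[i].sum()      # valid row-stochastic matrix
    pi = np.full(n, 1.0 / n)
    for _ in range(5000):               # power iteration -> stationary dist.
        pi = pi @ T
    return pi                           # ranking mass per strategy

# Rock-paper-scissors is fully cyclic, so all strategies rank equally;
# a strategy that beats every other collects almost all the mass.
rps = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
pi_rps = evo_rank_sketch(rps)
dom = np.array([[0., 1., 1.], [-1., 0., 0.], [-1., 0., 0.]])
pi_dom = evo_rank_sketch(dom)
```

The stationary distribution plays the role of a ranking score: mass concentrates on strategies that resist invasion under the chain's dynamics.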
The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making...
This work focuses on solving general multi-robot planning problems in continuous spaces with partial observability, given a high-level domain description. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for coordination problems. However, representing and solving Dec-POMDPs is often intractable for large problems. This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of high-level representations that are natural for multi-robot problems and that facilitate scalable solutions to large discrete and continuous problems. The Dec-POSMDP...
This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have been focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings,...
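One of the named special cases is easy to make concrete: a minimal fictitious-play sketch (an illustration, not the paper's algorithm) in a two-player zero-sum matrix game, where each player repeatedly best-responds to the opponent's empirical mixture and the time-averaged strategies approach a Nash equilibrium.

```python
import numpy as np

def fictitious_play(A, iters=20000):
    """Fictitious play in a zero-sum matrix game. A is the row
    player's payoff matrix; the column player receives -A."""
    n, m = A.shape
    row_counts, col_counts = np.zeros(n), np.zeros(m)
    row_counts[0] = 1.0     # arbitrary initial pure strategies
    col_counts[0] = 1.0
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        col_mix = col_counts / col_counts.sum()
        row_mix = row_counts / row_counts.sum()
        row_counts[np.argmax(A @ col_mix)] += 1
        col_counts[np.argmin(row_mix @ A)] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Rock-paper-scissors: the unique Nash equilibrium is uniform, and the
# empirical averages drift toward it even though play itself cycles.
rps = np.array([[0., -1., 1.],
                [1., 0., -1.],
                [-1., 1., 0.]])
row_avg, col_avg = fictitious_play(rps)
```

PSRO generalizes this template by replacing the exact best response with a reinforcement-learning "oracle" and the empirical mixture with a meta-strategy over a growing policy population.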
Multiagent reinforcement learning (MARL) algorithms have been demonstrated on complex tasks that require the coordination of a team of multiple agents to complete. Existing works have focused on sharing information between agents via centralized critics to stabilize learning or through communication to improve performance, but do not generally consider how information can be shared to address the curse of dimensionality in MARL. We posit that a multiagent problem can be decomposed into a multi-task problem where each agent explores a subset of the state space instead of exploring...
Planning, control, perception, and learning are current research challenges in multirobot systems. The transition dynamics of the robots may be unknown or stochastic, making it difficult to select the best action each robot must take at a given time. The observation model, a function of the robots' sensor systems, may be noisy or partial, meaning that deterministic knowledge of the team's state is often impossible to attain. Moreover, actions can have an associated success rate and/or probabilistic completion time. Robots designed...
This paper investigates the geometrical properties of real world games (e.g. Tic-Tac-Toe, Go, StarCraft II). We hypothesise that their geometrical structure resembles a spinning top, with the upright axis representing transitive strength, and the radial axis representing the non-transitive dimension, which corresponds to the number of cycles that exist at a particular transitive strength. We prove the existence of this geometry for a wide class of real world games, exposing their temporal nature. Additionally, we show that this unique structure also has consequences for learning - it clarifies why populations...
Safety certification of data-driven control techniques remains a major open problem. This work investigates backward reachability as a framework for providing collision avoidance guarantees for systems controlled by neural network (NN) policies. Because NNs are typically not invertible, existing methods conservatively assume a domain over which to relax the NN, which causes loose over-approximations of the set of states that could lead the system into an obstacle (i.e., backprojection (BP) sets). To address this issue, we...
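The backprojection idea is easiest to see in one dimension (a toy sketch under simple assumptions, not the paper's NN-relaxation algorithm): if relaxing the policy only tells us the control lies in an interval, the BP over-approximation of an obstacle must contain every state from which some admissible control reaches it.

```python
def backproject_interval(obstacle, u_bounds):
    """Toy 1-D backprojection for dynamics x' = x + u, where the
    (hypothetical) relaxed policy only guarantees u in [u_lo, u_hi].
    Returns every x for which SOME admissible u lands in the obstacle,
    i.e. x_lo - u_hi <= x <= x_hi - u_lo."""
    (x_lo, x_hi), (u_lo, u_hi) = obstacle, u_bounds
    return (x_lo - u_hi, x_hi - u_lo)

# Obstacle [4, 5] and relaxed control range [-1, 1]: the one-step BP
# over-approximation is [3, 6] -- the wider the relaxation, the looser
# (more conservative) the set, which is the issue the paper targets.
bp = backproject_interval((4.0, 5.0), (-1.0, 1.0))
```

Shrinking the assumed control interval (i.e., a tighter NN relaxation) directly shrinks the BP set, which is why the choice of relaxation domain matters.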
This paper presents a data-driven approach for multi-robot coordination in partially-observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time....
This paper introduces a probabilistic algorithm for multi-robot decision-making under uncertainty, which can be posed as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Dec-POMDPs are inherently synchronous decision-making frameworks that require significant computational resources to be solved, making them infeasible for many real-world robotics applications. The Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) was recently introduced as an extension of the Dec-POMDP that uses high-level macro-actions to allow for large-scale,...
In multiagent worlds, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment. These interactions, combined with the potential stochasticity of the agents' dynamic behaviors, make such systems complex and interesting to study from a dynamical perspective. Significant research has been conducted on learning models for forward-direction estimation of agent behaviors, for example, pedestrian predictions used for collision-avoidance in self-driving cars. In many settings, only...
Many robotic missions require online estimation of unknown state transition models associated with uncertainty that stems from the mission dynamics. The learning problem is usually distributed among agents in multiagent scenarios, either due to the absence of a centralized processing unit or because of the large size of the joint learning problem. This paper addresses the likely scenario in which agents estimate different models from their measured data, but can share information by communicating their model parameters. Previous approaches consider...
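As one common instance of sharing model parameters without a centralized unit (an illustrative assumption, not necessarily the approach taken here), agents can run distributed consensus averaging: each repeatedly nudges its local parameter estimate toward its neighbors', and all estimates converge to the network-wide average.

```python
import numpy as np

def consensus_average(estimates, neighbors, rounds=100, step=0.3):
    """Distributed consensus averaging over a fixed communication
    graph. estimates[i] is agent i's local parameter; neighbors[i]
    lists the agents it can exchange parameters with."""
    theta = np.array(estimates, dtype=float)
    for _ in range(rounds):
        new = theta.copy()
        for i, nbrs in enumerate(neighbors):
            # Move toward neighboring agents' parameters; no agent
            # ever sees the raw data, only communicated parameters.
            new[i] += step * sum(theta[j] - theta[i] for j in nbrs)
        theta = new
    return theta

# Ring of 4 agents, each starting from a different local estimate;
# all converge to the global average (2.5) by local exchanges only.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
theta = consensus_average([1.0, 2.0, 3.0, 4.0], ring)
```

The step size must keep the update matrix stable (here each agent has two neighbors, so `step=0.3` keeps every mixing weight positive); larger networks trade communication rounds for accuracy.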
Policy gradient and actor-critic algorithms form the basis of many commonly used training techniques in deep reinforcement learning. Using these algorithms in multiagent environments poses problems such as nonstationarity and instability. In this paper, we first demonstrate that standard softmax-based policy gradient can be prone to poor performance in the presence of even the most benign nonstationarity. By contrast, it is known that replicator dynamics, a well-studied model from evolutionary game theory, eliminates dominated strategies...