Shayegan Omidshafiei

ORCID: 0000-0001-7758-1454
Research Areas
  • Reinforcement Learning in Robotics
  • Sports Analytics and Performance
  • Artificial Intelligence in Games
  • Game Theory and Applications
  • Anomaly Detection Techniques and Applications
  • Advanced Bandit Algorithms Research
  • Bayesian Modeling and Causal Inference
  • Formal Methods in Verification
  • Auction Theory and Applications
  • Optimization and Search Problems
  • Experimental Behavioral Economics Studies
  • Multi-Agent Systems and Negotiation
  • Robotic Path Planning Algorithms
  • Fault Detection and Control Systems
  • Data Stream Mining Techniques
  • Human Pose and Action Recognition
  • Evolutionary Game Theory and Cooperation
  • Target Tracking and Data Fusion in Sensor Networks
  • Autonomous Vehicle Technology and Safety
  • Gaussian Processes and Bayesian Inference
  • Modular Robots and Swarm Intelligence
  • Multimodal Machine Learning Applications
  • Sports Performance and Training
  • Big Data and Business Intelligence
  • Logic, Reasoning, and Knowledge

Google (United States)
2019-2024

DeepMind (United Kingdom)
2020-2022

Massachusetts Institute of Technology
2014-2020

Decision Systems (United States)
2015-2019

We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are...

10.1126/science.add4679 article EN Science 2022-12-01

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially- and fully-observable) grid worlds and social dilemmas. It also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both...

10.48550/arxiv.1908.09453 preprint EN other-oa arXiv (Cornell University) 2019-01-01
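As a quick illustration of the library's scope, the sketch below loads one of the bundled games through OpenSpiel's Python API and plays a single random episode; the choice of Kuhn poker and the random-rollout loop are illustrative, not part of the paper.

```python
# Minimal OpenSpiel usage sketch (assumes `pip install open_spiel`).
import numpy as np
import pyspiel

game = pyspiel.load_game("kuhn_poker")
state = game.new_initial_state()

rng = np.random.default_rng(0)
while not state.is_terminal():
    if state.is_chance_node():
        # Chance nodes expose explicit outcome probabilities.
        outcomes, probs = zip(*state.chance_outcomes())
        state.apply_action(int(rng.choice(outcomes, p=probs)))
    else:
        state.apply_action(int(rng.choice(state.legal_actions())))

print(state.returns())  # one return per player; zero-sum in Kuhn poker
```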

Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application to general multiagent problems, or require domain expertise for the problems they can apply to. This inherent...

10.1609/aaai.v33i01.33016128 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both players' and coordinated teams' behaviors. The research...

10.1613/jair.1.12505 article EN cc-by Journal of Artificial Intelligence Research 2021-05-06

Learning to combine control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: Although movements are ultimately executed as instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and that often involve complex interactions with the environment and other agents. Recent research has...

10.1126/scirobotics.abo0235 article EN Science Robotics 2022-08-31

Abstract We introduce α-Rank, a principled evolutionary dynamics methodology for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs). The approach leverages continuous-time and discrete-time evolutionary dynamical systems applied to empirical games, and scales tractably in the number of agents, the type of interactions (beyond dyadic), and the type of games (symmetric and asymmetric). Current models are fundamentally limited in one or more...

10.1038/s41598-019-45619-9 article EN cc-by Scientific Reports 2019-07-09
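To make the construction above concrete, here is a minimal single-population α-Rank sketch for a symmetric two-player game: a Markov chain over pure strategies is built from mutant fixation probabilities, and its stationary distribution gives the ranking. The payoff matrix, selection intensity alpha, and population size m below are illustrative assumptions.

```python
import numpy as np

def alpha_rank(payoffs, alpha=1.0, m=25):
    """Stationary distribution over pure strategies of a symmetric game."""
    n = payoffs.shape[0]
    eta = 1.0 / (n - 1)  # uniform mutation rate to each alternative strategy
    transition = np.zeros((n, n))
    for s in range(n):           # resident (monomorphic) strategy
        for sig in range(n):     # mutant strategy
            if sig == s:
                continue
            delta = payoffs[sig, s] - payoffs[s, sig]  # mutant payoff advantage
            if np.isclose(delta, 0.0):
                rho = 1.0 / m    # neutral mutant: fixation probability 1/m
            else:
                x = alpha * delta
                rho = np.expm1(-x) / np.expm1(-m * x)  # logistic-selection fixation prob.
            transition[s, sig] = eta * rho
        transition[s, s] = 1.0 - transition[s].sum()
    # Stationary distribution: left eigenvector with eigenvalue 1.
    vals, vecs = np.linalg.eig(transition.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

# Rock-paper-scissors: the cycle makes all three strategies rank equally.
rps = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
print(alpha_rank(rps))  # ~[1/3, 1/3, 1/3]
```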

The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making...

10.1109/icra.2015.7140035 article EN 2015-05-01
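A schematic sketch of the asynchronous macro-action execution described above: each robot runs an option-like macro-action and only consults a high-level policy at its own decision (termination) points. The interface names here are hypothetical, intended only to illustrate the structure, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class MacroAction:
    name: str
    low_level_policy: Callable[[Any], Any]  # local observation -> primitive action
    is_terminated: Callable[[Any], bool]    # termination condition on local observation

def step_robot(robot_state, macro_action, high_level_policy, observation):
    """Advance one robot by one primitive step; decisions are asynchronous."""
    if macro_action is None or macro_action.is_terminated(observation):
        # Decision point: consult the high-level (Dec-POSMDP-style) policy.
        macro_action = high_level_policy(robot_state, observation)
    primitive_action = macro_action.low_level_policy(observation)
    return macro_action, primitive_action
```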

This work focuses on solving general multi-robot planning problems in continuous spaces with partial observability given a high-level domain description. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for coordination problems. However, representing and solving Dec-POMDPs is often intractable for large problems. This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of high-level representations that are natural for multi-robot problems and facilitate scalable solutions to large discrete and continuous problems. The Dec-POSMDP...

10.1177/0278364917692864 article EN The International Journal of Robotics Research 2017-02-01

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have focused on two-player zero-sum games, wherein Nash equilibria are tractably computable. In moving from such games to more general settings,...

10.48550/arxiv.1909.12823 preprint EN other-oa arXiv (Cornell University) 2019-01-01
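The population-based loop can be sketched compactly for a two-player zero-sum matrix game with exact best-response oracles and a uniform meta-solver; the game, iteration count, and the uniform meta-strategy (in place of a Nash meta-solver) are illustrative simplifications.

```python
import numpy as np

def psro(payoff_matrix, iterations=10):
    """payoff_matrix[i, j]: row player's payoff for pure strategies (i, j)."""
    row_pop, col_pop = [0], [0]  # each population starts with one pure strategy
    for _ in range(iterations):
        # Meta-solver (here: uniform over each population; a full PSRO would
        # instead solve the restricted meta-game, e.g. for a Nash equilibrium).
        row_meta = np.full(len(row_pop), 1.0 / len(row_pop))
        col_meta = np.full(len(col_pop), 1.0 / len(col_pop))
        # Oracle step: exact best responses to the opponent's meta-strategy.
        row_br = int(np.argmax(payoff_matrix[:, col_pop] @ col_meta))
        col_br = int(np.argmin(row_meta @ payoff_matrix[row_pop, :]))  # zero-sum
        if row_br not in row_pop:
            row_pop.append(row_br)
        if col_br not in col_pop:
            col_pop.append(col_br)
    return row_pop, col_pop

# Rock-paper-scissors: the populations grow to cover all three strategies.
rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
print(psro(rps))
```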

Multiagent reinforcement learning (MARL) algorithms have been demonstrated on complex tasks that require the coordination of a team of multiple agents to complete. Existing works have focused on sharing information between agents via centralized critics to stabilize learning or through communication to improve performance, but do not generally consider how information can be shared to address the curse of dimensionality in MARL. We posit that a multiagent problem can be decomposed into a multi-task problem where each agent explores a subset of the state space instead of exploring...

10.1109/iros40897.2019.8967849 article EN 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019-11-01

Planning, control, perception, and learning are current research challenges in multirobot systems. The transition dynamics of the robots may be unknown or stochastic, making it difficult to select the best action each robot must take at a given time. The observation model, a function of the robots' sensor systems, may be noisy or partial, meaning that deterministic knowledge of the team's state is often impossible to attain. Moreover, actions can have an associated success rate and/or probabilistic completion time. Robots designed...

10.1109/mcs.2016.2602090 article EN IEEE Control Systems 2016-11-10

This paper investigates the geometrical properties of real world games (e.g. Tic-Tac-Toe, Go, StarCraft II). We hypothesise that their structure resembles a spinning top, with the upright axis representing transitive strength, and the radial axis corresponding to the number of cycles that exist at a particular transitive strength (the non-transitive dimension). We prove the existence of this geometry for a wide class of games, exposing their temporal nature. Additionally, we show that this unique structure also has consequences for learning - it clarifies why populations...

10.48550/arxiv.2004.09468 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Safety certification of data-driven control techniques remains a major open problem. This work investigates backward reachability as a framework for providing collision avoidance guarantees for systems controlled by neural network (NN) policies. Because NNs are typically not invertible, existing methods conservatively assume a domain over which to relax the NN, which causes loose over-approximations of the set of states that could lead the system into the obstacle (i.e., backprojection (BP) sets). To address this issue, we...

10.1109/lcsys.2023.3260731 article EN IEEE Control Systems Letters 2023-01-01
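As a toy illustration of the backprojection idea described above, consider a scalar system x' = x + dt·u in which the NN controller's output has been relaxed to an interval [u_lo, u_hi] over an assumed domain; the resulting interval over-approximates the states from which the actual NN policy could reach the obstacle. All quantities below are illustrative.

```python
def backprojection_1d(obstacle, u_bounds, dt=0.1):
    """One-step backprojection over-approximation for x' = x + dt * u."""
    o_lo, o_hi = obstacle
    u_lo, u_hi = u_bounds
    # x + dt*u lands in [o_lo, o_hi] for some admissible u
    # <=> x in [o_lo - dt*u_hi, o_hi - dt*u_lo].
    return (o_lo - dt * u_hi, o_hi - dt * u_lo)

# Obstacle occupies [1.0, 1.5]; relaxed controller output lies in [-1, 1].
print(backprojection_1d((1.0, 1.5), (-1.0, 1.0)))  # (0.9, 1.6)
```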

This paper presents a data-driven approach for multi-robot coordination in partially-observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or that a full simulator is available during planning time....

10.1109/iros.2017.8206001 article EN 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017-09-01

This paper introduces a probabilistic algorithm for multi-robot decision-making under uncertainty, which can be posed as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Dec-POMDPs are inherently synchronous frameworks that require significant computational resources to be solved, making them infeasible for many real-world robotics applications. The Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) was recently introduced as an extension of the Dec-POMDP that uses high-level macro-actions to allow large-scale,...

10.1109/icra.2016.7487751 article EN 2016-05-01

Abstract In multiagent worlds, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment. These interactions, combined with the potential stochasticity of the agents' dynamic behaviors, make such systems complex and interesting to study from a dynamical perspective. Significant research has been conducted on learning models for forward-direction estimation of agent behaviors, for example, pedestrian predictions used for collision-avoidance in self-driving cars. In many settings, only...

10.1038/s41598-022-12547-0 article EN cc-by Scientific Reports 2022-05-23

Many robotic missions require online estimation of unknown state transition models associated with the uncertainty that stems from mission dynamics. The learning problem is usually distributed among agents in multiagent scenarios, either due to the absence of a centralized processing unit or because of the large size of the joint problem. This paper addresses the likely scenario in which agents estimate different models from their own measured data, but can share information by communicating model parameters. Previous approaches consider...

10.1109/iros.2015.7354107 article EN 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2015-09-01
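A minimal sketch of the parameter-sharing idea above: each agent holds a local estimate of its transition-model parameters and repeatedly nudges it toward its neighbors' estimates over a communication graph (decentralized consensus averaging). The graph, update rule, and step size are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def consensus_average(local_params, neighbors, rounds=10, step=0.5):
    """local_params: agent -> parameter vector; neighbors: agent -> list of agents."""
    params = {a: p.astype(float).copy() for a, p in local_params.items()}
    for _ in range(rounds):
        updated = {}
        for agent, theta in params.items():
            nbr_mean = np.mean([params[n] for n in neighbors[agent]], axis=0)
            updated[agent] = theta + step * (nbr_mean - theta)  # move toward neighbors
        params = updated
    return params

# Three agents on a line graph, each with a noisy local estimate.
local = {0: np.array([1.0]), 1: np.array([2.0]), 2: np.array([4.0])}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
print(consensus_average(local, nbrs))  # estimates converge toward one another
```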

Policy gradient and actor-critic algorithms form the basis of many commonly used training techniques in deep reinforcement learning. Using these algorithms in multiagent environments poses problems such as nonstationarity and instability. In this paper, we first demonstrate that standard softmax-based policy gradient can be prone to poor performance in the presence of even the most benign nonstationarity. By contrast, it is known that the replicator dynamics, a well-studied model from evolutionary game theory, eliminates dominated...

10.48550/arxiv.1906.00190 preprint EN other-oa arXiv (Cornell University) 2019-01-01
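For intuition on the contrast drawn above, the sketch below applies a discrete-time (multiplicative-weights-style) replicator update to a single player's mixed strategy in a small matrix game, where the dominated strategy's probability mass decays; the game, opponent strategy, and step size are illustrative.

```python
import numpy as np

def replicator_step(policy, payoff_matrix, opponent, lr=0.1):
    """One multiplicative-weights-style replicator update of a mixed strategy."""
    payoffs = payoff_matrix @ opponent                   # expected payoff per pure strategy
    avg = policy @ payoffs                               # current expected payoff
    new_policy = policy * np.exp(lr * (payoffs - avg))   # advantage-weighted update
    return new_policy / new_policy.sum()

# Prisoner's-dilemma-style game: the dominated strategy's mass decays.
payoffs = np.array([[3.0, 0.0], [5.0, 1.0]])  # rows: cooperate, defect
policy = np.array([0.5, 0.5])
opponent = np.array([0.5, 0.5])
for _ in range(50):
    policy = replicator_step(policy, payoffs, opponent)
print(policy)  # mass concentrates on the dominating strategy (defect)
```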