Tom Schaul

ORCID: 0000-0002-2961-8782
Research Areas
  • Reinforcement Learning in Robotics
  • Evolutionary Algorithms and Applications
  • Artificial Intelligence in Games
  • Metaheuristic Optimization Algorithms Research
  • Neural Networks and Applications
  • Advanced Multi-Objective Optimization Algorithms
  • Blind Source Separation Techniques
  • Digital Games and Media
  • Machine Learning and Algorithms
  • Advanced Bandit Algorithms Research
  • Explainable Artificial Intelligence (XAI)
  • Stochastic Gradient Optimization Techniques
  • Neural dynamics and brain function
  • Sports Analytics and Performance
  • Adaptive Dynamic Programming Control
  • Adversarial Robustness in Machine Learning
  • Domain Adaptation and Few-Shot Learning
  • Computability, Logic, AI Algorithms
  • Gaussian Processes and Bayesian Inference
  • Age of Information Optimization
  • Model Reduction and Neural Networks
  • AI-based Problem Solving and Planning
  • Neural Networks and Reservoir Computing
  • Gene Regulatory Network Analysis
  • Modular Robots and Swarm Intelligence

DeepMind (United Kingdom)
2013-2023

Google (United Kingdom)
2014-2023

Google (United States)
2015-2020

New York University
2012-2015

Courant Institute of Mathematical Sciences
2012-2014

Dalle Molle Institute for Artificial Intelligence Research
2008-2011

Università della Svizzera italiana
2010-2011

University of Applied Sciences and Arts of Southern Switzerland
2011

In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning. Our dueling network represents two separate estimators: one for the state value function and one for the state-dependent action advantage function. The main benefit of this factoring is to generalize learning across actions without...

10.48550/arxiv.1511.06581 preprint EN other-oa arXiv (Cornell University) 2015-01-01
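
As a rough illustration of the factoring described above, here is a minimal sketch of a dueling head in PyTorch; the layer sizes are arbitrary assumptions, not the paper's exact Atari architecture.

```python
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    def __init__(self, feature_dim: int, num_actions: int):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)                 # state value V(s)
        self.advantage = nn.Linear(feature_dim, num_actions)   # advantages A(s, a)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v = self.value(features)        # shape (batch, 1)
        a = self.advantage(features)    # shape (batch, num_actions)
        # Subtracting the mean advantage keeps the V/A decomposition
        # identifiable when the two streams are recombined into Q-values.
        return v + a - a.mean(dim=1, keepdim=True)

q_values = DuelingHead(feature_dim=512, num_actions=18)(torch.randn(32, 512))
```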

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined. This paper examines six extensions to the DQN algorithm and empirically studies their combination. Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance. We also provide results from a detailed ablation study that shows...

10.1609/aaai.v32i1.11796 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-29
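
To make the experimental design concrete, the sketch below enumerates leave-one-out ablations over the six extensions the paper combines (double Q-learning, prioritized replay, dueling networks, multi-step returns, distributional RL, and noisy nets); the enumeration itself is illustrative, not the paper's code.

```python
# The six DQN extensions studied in the paper; the ablation evaluates the
# full agent plus each leave-one-out variant.
EXTENSIONS = [
    "double_q",        # Double Q-learning
    "prioritized",     # prioritized experience replay
    "dueling",         # dueling network architecture
    "multi_step",      # multi-step (n-step) returns
    "distributional",  # distributional RL over returns
    "noisy_nets",      # noisy linear layers for exploration
]

def ablation_configs():
    yield frozenset(EXTENSIONS)                  # the full combined agent
    for dropped in EXTENSIONS:
        yield frozenset(EXTENSIONS) - {dropped}  # drop one component at a time
```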

The move from hand-designed features to learned in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. this paper we show how the design an algorithm can be cast as a problem, allowing learn exploit structure problems interest automatic way. Our algorithms, implemented LSTMs, outperform generic, competitors on tasks for which they trained, and also generalize well new with similar structure. We demonstrate number tasks, including...

10.48550/arxiv.1606.04474 preprint EN other-oa arXiv (Cornell University) 2016-01-01
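
A minimal sketch of the idea, assuming a PyTorch setup: a coordinatewise LSTM consumes gradients and emits parameter updates, here unrolled on a toy quadratic. Sizes and the objective are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    """Coordinatewise learned optimizer: one shared LSTM applied per parameter."""
    def __init__(self, hidden: int = 20):
        super().__init__()
        self.hidden = hidden
        self.cell = nn.LSTMCell(1, hidden)   # input: that coordinate's gradient
        self.out = nn.Linear(hidden, 1)      # output: that coordinate's update

    def forward(self, grad, state):
        h, c = self.cell(grad, state)
        return self.out(h), (h, c)

def inner_loop(theta, opt, steps=20):
    """Unroll the learned optimizer on f(theta) = ||theta||^2 (a toy objective)."""
    n = theta.numel()
    state = (torch.zeros(n, opt.hidden), torch.zeros(n, opt.hidden))
    for _ in range(steps):
        loss = (theta ** 2).sum()
        (grad,) = torch.autograd.grad(loss, theta, create_graph=True)
        update, state = opt(grad.reshape(-1, 1), state)
        theta = theta + update.reshape_as(theta)   # theta_{t+1} = theta_t + g_t
    return theta

theta_final = inner_loop(torch.randn(5, requires_grad=True), LSTMOptimizer())
```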

Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved...

10.48550/arxiv.1511.05952 preprint EN other-oa arXiv (Cornell University) 2015-01-01
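
The sampling scheme lends itself to a compact sketch. The version below uses a linear scan rather than the sum-tree a real implementation would need, and the alpha/beta values are common defaults, not prescriptions.

```python
import numpy as np

class PrioritizedReplay:
    """Proportional prioritization with importance-sampling correction (sketch)."""
    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity, self.alpha, self.beta, self.eps = capacity, alpha, beta, eps
        self.data, self.prios = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:   # evict the oldest when full
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        self.prios.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        probs = np.asarray(self.prios)
        probs = probs / probs.sum()           # P(i) proportional to p_i^alpha
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        w = (len(self.data) * probs[idx]) ** (-self.beta)
        w = w / w.max()                       # normalized importance weights
        return [self.data[i] for i in idx], idx, w

    def update_priorities(self, idx, td_errors):
        for i, err in zip(idx, td_errors):    # refresh priorities after learning
            self.prios[i] = (abs(err) + self.eps) ** self.alpha
```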

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards. We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so...

10.48550/arxiv.1611.05397 preprint EN other-oa arXiv (Cornell University) 2016-01-01
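
Structurally, the agent's training signal is a weighted sum of the base RL objective and the auxiliary pseudo-reward objectives computed over a shared representation; the sketch below shows just that composition, with placeholder names and weights rather than the paper's values.

```python
def unreal_objective(base_rl_loss, aux_losses, weights):
    """Main RL loss plus weighted auxiliary-task losses over a shared network.

    aux_losses / weights: dicts keyed by task name; the paper's auxiliary
    tasks include pixel control and reward prediction.
    """
    total = base_rl_loss
    for name, loss in aux_losses.items():
        total = total + weights.get(name, 1.0) * loss
    return total
```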

Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable for a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this paper we study a setting where the agent may access data from previous control of the system....

10.1609/aaai.v32i1.11757 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-29
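
This paper's algorithm (Deep Q-learning from Demonstrations) combines TD losses with a large-margin supervised term on demonstration data that pushes the demonstrated action's Q-value above all alternatives. Below is a sketch of that margin term alone; the margin value is illustrative.

```python
import torch

def large_margin_loss(q_values, expert_actions, margin=0.8):
    """Supervised term max_a [Q(s,a) + l(a_E,a)] - Q(s,a_E), with
    l(a_E, a) = margin for a != a_E and 0 for the expert action."""
    penalty = torch.full_like(q_values, margin)
    penalty.scatter_(1, expert_actions.unsqueeze(1), 0.0)   # no penalty at a_E
    best = (q_values + penalty).max(dim=1).values
    expert_q = q_values.gather(1, expert_actions.unsqueeze(1)).squeeze(1)
    return (best - expert_q).mean()
```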

This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game. This domain poses a new grand challenge for reinforcement learning, representing a more difficult class of problems than considered in most prior work. It is a multi-agent problem with multiple players interacting; there is imperfect information due to a partially observed map; it has a large action space involving the selection and control of hundreds of units; and it has a large state space that must be observed solely from raw input...

10.48550/arxiv.1708.04782 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We introduce FeUdal Networks (FuNs): a novel architecture for hierarchical reinforcement learning. Our approach is inspired by the feudal reinforcement learning proposal of Dayan and Hinton, and gains power and efficacy by decoupling end-to-end learning across multiple levels -- allowing it to utilise different resolutions of time. Our framework employs a Manager module and a Worker module. The Manager operates at a lower temporal resolution and sets abstract goals which are conveyed to and enacted by the Worker. The Worker generates primitive actions at every tick of the environment....

10.48550/arxiv.1703.01161 preprint EN other-oa arXiv (Cornell University) 2017-01-01
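
A schematic sketch of the timing structure described above: the Manager emits a goal every c environment ticks while the Worker acts at every tick. All callables here are placeholders, not the paper's networks, and the goal horizon c is an assumed value.

```python
def feudal_rollout(env, manager, worker, horizon, c=10):
    """Run one episode with a two-level Manager/Worker loop (schematic)."""
    obs = env.reset()
    goal = None
    for t in range(horizon):
        if t % c == 0:              # Manager acts at a lower temporal resolution
            goal = manager(obs)     # abstract goal for the next c ticks
        action = worker(obs, goal)  # Worker emits a primitive action every tick
        obs, reward, done, info = env.step(action)
        if done:
            break
    return obs
```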

This paper presents natural evolution strategies (NES), a novel algorithm for performing real-valued 'black box' function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method. Natural evolution strategies search the fitness landscape using a multivariate normal distribution with a self-adapting mutation matrix to generate correlated mutations in promising regions. NES shares this property with covariance matrix adaption (CMA), an evolution strategy (ES)...

10.1109/cec.2008.4631255 article EN 2008-06-01
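
The search-gradient idea reduces to a few lines in the simplest case. This toy version adapts only the distribution mean with a fixed isotropic sigma, rather than the full self-adapting mutation matrix of the paper; all hyperparameters are illustrative.

```python
import numpy as np

def nes_mean_only(f, mu, sigma=0.5, pop=50, lr=0.1, iters=200, seed=0):
    """Ascend a Monte Carlo estimate of the gradient of expected fitness."""
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        eps = rng.standard_normal((pop, mu.size))      # x_i = mu + sigma * eps_i
        fit = np.array([f(mu + sigma * e) for e in eps])
        fit = (fit - fit.mean()) / (fit.std() + 1e-8)  # crude fitness shaping
        mu = mu + lr * (eps.T @ fit) / (pop * sigma)   # log-likelihood-trick gradient
    return mu

# Toy usage: maximize the negative sphere function, optimum at the origin.
best = nes_mean_only(lambda x: -np.sum(x ** 2), mu=np.ones(5))
```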

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any one time. The method relies on local gradient variations across samples. In our approach, learning rates can increase as well as decrease, making it suitable for non-stationary problems. Using a number of convex and non-convex learning tasks, we show that the resulting algorithm matches the performance of SGD or other adaptive approaches with their best settings...

10.48550/arxiv.1206.1106 preprint EN other-oa arXiv (Cornell University) 2012-01-01
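
A heavily simplified sketch of the mechanism: per-parameter moving averages of the gradient, the squared gradient, and a curvature estimate yield a rate gbar^2 / (h * v) that can both grow and shrink. The adaptive time constants and the curvature estimator of the actual method are omitted here; the fixed tau is an assumption.

```python
import numpy as np

class AdaptiveRates:
    """Per-parameter rates eta_i = E[g_i]^2 / (h_i * E[g_i^2]) (simplified)."""
    def __init__(self, dim, tau=10.0):
        self.gbar = np.zeros(dim)   # running mean of gradients
        self.vbar = np.ones(dim)    # running mean of squared gradients
        self.hbar = np.ones(dim)    # running mean of curvature estimates
        self.tau = tau              # memory time constant (adaptive in the paper)

    def step(self, grad, curvature):
        a = 1.0 / self.tau
        self.gbar = (1 - a) * self.gbar + a * grad
        self.vbar = (1 - a) * self.vbar + a * grad ** 2
        self.hbar = (1 - a) * self.hbar + a * np.abs(curvature)
        return self.gbar ** 2 / (self.hbar * self.vbar + 1e-12)  # per-parameter rate
```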

We propose a powerful new tool for conducting research on computational intelligence and games. 'PyVGDL' is a simple, high-level description language for 2D video games, and the accompanying software library permits parsing and instantly playing those games. The streamlined design of the language is based on defining locations and dynamics for simple building blocks, and the interaction effects that occur when such objects collide, all of which are provided in a rich ontology. It can be used to quickly design games without needing to deal with control structures, and the concise language is also...

10.1109/cig.2013.6633610 article EN 2013-08-01

This paper presents the framework, rules, games, controllers, and results of the first General Video Game Playing Competition, held at the IEEE Conference on Computational Intelligence and Games in 2014. The competition proposes the challenge of creating controllers for general video game play, where a single agent must be able to play many different games, some of them unknown to the participants at the time of submitting their entries. This test can be seen as an approximation of general artificial intelligence, as the amount of game-dependent heuristics needs...

10.1109/tciaig.2015.2402393 article EN IEEE Transactions on Computational Intelligence and AI in Games 2015-02-10

The General Video Game AI framework and competition pose the problem of creating artificial intelligence that can play a wide, and in principle unlimited, range of games. Concretely, it tackles the problem of devising an algorithm that is able to play any game it is given, even if the game is not known a priori. This area of study can be seen as an approximation of General Artificial Intelligence, with very little room for game-dependent heuristics. This short paper summarizes the motivation, infrastructure, results and future plans of General Video Game AI, stressing the findings and first conclusions...

10.1609/aaai.v30i1.9869 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2016-03-05

The family of natural evolution strategies (NES) offers a principled approach to real-valued evolutionary optimization by following the natural gradient of expected fitness. Like the well-known CMA-ES, the most competitive algorithm in the field, NES comes with important invariance properties. In this paper, we introduce a number of elegant and efficient improvements of the basic NES algorithm. First, we propose to parameterize the positive definite covariance matrix using the exponential map, which allows the covariance matrix to be updated in a vector space. This new...

10.1145/1830483.1830557 article EN 2010-07-07
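
A hedged sketch of the exponential-map idea: the covariance factor A is updated multiplicatively through a matrix exponential of a symmetric sample statistic, so positive definiteness holds by construction. The learning rates and the utility shaping below are simplified stand-ins for the paper's choices.

```python
import numpy as np
from scipy.linalg import expm

def xnes_step(f, mu, A, pop=20, eta_mu=1.0, eta_A=0.1, rng=None):
    """One xNES-style step on a maximization problem f (simplified)."""
    rng = rng or np.random.default_rng()
    d = mu.size
    s = rng.standard_normal((pop, d))         # standardized samples
    x = mu + s @ A.T                          # candidates x_i = mu + A s_i
    u = np.array([f(xi) for xi in x])
    u = (u - u.mean()) / (u.std() + 1e-8)     # crude utility shaping
    g_mu = s.T @ u / pop                      # natural gradient for the mean
    g_M = sum(ui * (np.outer(si, si) - np.eye(d)) for ui, si in zip(u, s)) / pop
    mu = mu + eta_mu * A @ g_mu
    A = A @ expm(0.5 * eta_A * g_M)           # multiplicative covariance update
    return mu, A
```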