- Reinforcement Learning in Robotics
- Robot Manipulation and Learning
- Advanced Control Systems Optimization
- Neural Networks and Applications
- Domain Adaptation and Few-Shot Learning
- Adversarial Robustness in Machine Learning
- Evolutionary Algorithms and Applications
- Adaptive Dynamic Programming Control
- Robotic Path Planning Algorithms
- Advanced Neural Network Applications
- Data Stream Mining Techniques
- Scheduling and Optimization Algorithms
- Advanced Multi-Objective Optimization Algorithms
- Advanced Bandit Algorithms Research
- Machine Learning and Algorithms
- Optimization and Search Problems
- Advanced Image and Video Retrieval Techniques
- Human Pose and Action Recognition
- Fuzzy Logic and Control Systems
- Smart Grid Energy Management
- Model Reduction and Neural Networks
- Explainable Artificial Intelligence (XAI)
- Robotic Locomotion and Control
- Neural Dynamics and Brain Function
- Control Systems and Identification
Google (United Kingdom)
2015-2024
DeepMind (United Kingdom)
2014-2024
Google (United States)
2015-2021
Corvallis Environmental Center
2020
University of Freiburg
2009-2016
Laboratoire d'Informatique de Paris-Nord
2014-2015
Osnabrück University
2004-2009
TU Dortmund University
2003-2004
University of Padua
2004
Karlsruhe Institute of Technology
1994-2003
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
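As a rough illustration of the Q-learning update that the abstract above refers to, the sketch below computes the standard one-step target, r + gamma * max Q(s', a'), in plain Python. It is not the paper's training code: the function names, the discount factor `gamma`, and the example numbers are assumptions made for illustration, and in the described model the Q-values would come from a convolutional network rather than a hand-written list.

```python
def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    `next_q_values` holds the estimated Q-values of every action in the
    next state; `done` marks a terminal transition (no bootstrapping).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def td_error(q_value, target):
    """Temporal-difference error that the value estimate is trained to shrink."""
    return target - q_value


# Hypothetical transition: reward 1.0, three actions available afterwards.
target = q_learning_target(1.0, [0.5, 2.0, 1.0], done=False, gamma=0.9)
error = td_error(q_value=2.0, target=target)
```

Squaring `error` gives the per-sample loss typically minimised when the value function is a neural network.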
A learning algorithm for multilayer feedforward networks, RPROP (resilient propagation), is proposed. To overcome the inherent disadvantages of pure gradient-descent, RPROP performs a local adaptation of the weight-updates according to the behavior of the error function. Contrary to other adaptive techniques, the effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but only dependent on the temporal behavior of its sign. This leads to an efficient and transparent adaptation process. The capabilities of RPROP are shown in comparison...
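The sign-based adaptation described in the abstract above can be sketched in a few lines of Python. This is a simplified variant without weight-backtracking, not necessarily the exact procedure of the paper: each weight keeps its own step size, which grows while the gradient sign stays the same and shrinks when it flips, and only the sign of the gradient enters the update. The function names, the constants (`eta_plus`, `eta_minus`, `step_min`, `step_max`), and the quadratic test problem are assumptions chosen for illustration.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """Apply one sign-based update in place to weights, prev_grads, and steps."""
    for i, g in enumerate(grads):
        change = g * prev_grads[i]
        if change > 0:
            # Same sign as last time: accelerate.
            steps[i] = min(steps[i] * eta_plus, step_max)
        elif change < 0:
            # Sign flipped, so a minimum was jumped over: slow down and
            # skip this update by treating the gradient as zero.
            steps[i] = max(steps[i] * eta_minus, step_min)
            g = 0.0
        # Only the sign of the gradient is used, never its magnitude.
        weights[i] -= sign(g) * steps[i]
        prev_grads[i] = g


# Toy problem (assumed): minimise sum((w - target)^2).
target = [3.0, -2.0]
w = [0.0, 0.0]
prev = [0.0, 0.0]
steps = [0.1, 0.1]
for _ in range(200):
    grads = [2.0 * (wi - ti) for wi, ti in zip(w, target)]
    rprop_step(w, grads, prev, steps)
```

Because the gradient magnitude is discarded, rescaling the error function by a constant factor leaves the sequence of weight updates unchanged.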
Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: Alternating convolution and max-pooling layers followed by a small number of fully connected layers. We re-evaluate the state of the art for object recognition from small images with convolutional networks, questioning the necessity of different components in the pipeline. We find that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks. Following this finding -- and building on other recent work for finding simple...
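One reason a strided convolution can stand in for max-pooling, as the abstract above reports, is that both reduce spatial resolution in the same way. The sketch below only checks the output-size arithmetic with the usual formula floor((n + 2p - k) / s) + 1; the 32-pixel input, the kernel sizes, and the padding are assumed example values, not settings taken from the paper.

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed example: a 32-pixel-wide feature map.
pooled = output_size(32, kernel=2, stride=2)              # 2x2 max-pooling, stride 2
strided = output_size(32, kernel=3, stride=2, padding=1)  # 3x3 convolution, stride 2
same_resolution = pooled == strided
```

Both layers halve the resolution; the difference is that the strided convolution learns its downsampling weights while max-pooling applies a fixed operation.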
Deep convolutional networks have proven to be very successful in learning task specific features that allow for unprecedented performance on various computer vision tasks. Training of such networks follows mostly the supervised learning paradigm, where sufficiently many input-output pairs are required for training. Acquisition of large training sets is one of the key challenges when approaching a new task. In this paper, we aim for generic feature learning and present an approach for training a convolutional network using only unlabeled data. To this end, we train...
The reinforcement learning paradigm allows, in principle, for complex behaviours to be learned directly from simple reward signals. In practice, however, it is common to carefully hand-design the reward function to encourage a particular solution, or to derive it from demonstration data. In this paper we explore how a rich environment can help to promote the learning of complex behavior. Specifically, we train agents in diverse environmental contexts, and find that this encourages the emergence of robust behaviours that perform well across a suite of tasks. We demonstrate this principle...
The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents. The tasks are written in Python and powered by the MuJoCo physics engine, making them easy to use and modify. We include benchmarks for several learning algorithms. The Control Suite is publicly available at https://www.github.com/deepmind/dm_control . A video summary of all tasks is available at http://youtu.be/rAai4QzcYbs .
We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual interactions are used to fill a replay buffer, and the sampling ratio between demonstrations and transitions is automatically tuned via a prioritized replay mechanism. Typically, carefully engineered shaping rewards are required to enable agents to efficiently explore on high dimensional control problems such as...
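The idea of filling one replay buffer with both demonstrations and agent transitions, and letting prioritized sampling set the ratio between them, can be sketched as follows. This is a simplified illustration and not the paper's priority formula: the class name, the use of the absolute TD error alone, and the constants `alpha`, `eps`, and `eps_demo` (a bonus that raises the sampling probability of demonstrations) are all assumptions.

```python
import random


class PrioritizedDemoBuffer:
    """Replay buffer holding demonstration and agent transitions together."""

    def __init__(self, alpha=0.6, eps=1e-3, eps_demo=1.0):
        self.alpha = alpha        # how strongly priorities skew sampling
        self.eps = eps            # keeps every transition sampleable
        self.eps_demo = eps_demo  # extra priority for demonstrations
        self.items = []           # (transition, is_demo) pairs
        self.weights = []         # priority ** alpha for each item

    def add(self, transition, is_demo, td_error=1.0):
        bonus = self.eps_demo if is_demo else 0.0
        priority = abs(td_error) + self.eps + bonus
        self.items.append((transition, is_demo))
        self.weights.append(priority ** self.alpha)

    def probabilities(self):
        """Sampling probability of each stored transition."""
        total = sum(self.weights)
        return [w / total for w in self.weights]

    def sample(self, k, rng=random):
        """Draw k transitions with probability proportional to their weight."""
        indices = rng.choices(range(len(self.items)), weights=self.weights, k=k)
        return [self.items[i] for i in indices]


buffer = PrioritizedDemoBuffer()
buffer.add(("s0", "a0", 0.0, "s1"), is_demo=True, td_error=0.5)
buffer.add(("s1", "a1", 0.0, "s2"), is_demo=False, td_error=0.5)
probs = buffer.probabilities()
batch = buffer.sample(4)
```

With equal TD errors the demonstration is sampled more often; as training proceeds and priorities are refreshed, the mix between the two sources shifts without a hand-set ratio.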
Robust object recognition is a crucial ingredient of many, if not all, real-world robotics applications. This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition. Our architecture is composed of two separate CNN processing streams - one for each modality - which are consecutively combined with a late fusion network. We focus on learning with imperfect sensor data, a typical problem in real-world robotics tasks. For accurate learning, we introduce a multi-stage training...
Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a previously undescribed architecture for tokamak magnetic controller design that autonomously learns...
This paper discusses the effectiveness of deep auto-encoder neural networks in visual reinforcement learning (RL) tasks. We propose a framework for combining the training of deep auto-encoders (for learning compact feature spaces) with recently-proposed batch-mode RL algorithms (for learning policies). An emphasis is put on the data-efficiency of this combination and on studying the properties of the feature spaces automatically constructed by the deep auto-encoders. These feature spaces are empirically shown to adequately resemble existing similarities and spatial relations...
We introduce Embed to Control (E2C), a method for model learning and control of non-linear dynamical systems from raw pixel images. E2C consists of a deep generative model, belonging to the family of variational autoencoders, that learns to generate image trajectories from a latent space in which the dynamics is constrained to be locally linear. Our model is derived directly from an optimal control formulation in latent space, supports long-term prediction of image sequences and exhibits strong performance on a variety of complex control problems.
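The phrase "locally linear" in the abstract above can be made concrete with a short formula. The notation here is an assumption for illustration: z_t is the latent state encoded from the image at time t, u_t is the control input, and A, B, and o are matrices and an offset that may themselves depend on the current latent state.

```latex
\hat{z}_{t+1} = A(z_t)\, z_t + B(z_t)\, u_t + o(z_t)
```

Because the transition is linear in z_t and u_t around each point, standard optimal control methods for linear systems can be applied in the latent space even though the dynamics of the raw pixels are non-linear.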