Martin Riedmiller

ORCID: 0000-0002-8465-5690
Research Areas
  • Reinforcement Learning in Robotics
  • Robot Manipulation and Learning
  • Advanced Control Systems Optimization
  • Neural Networks and Applications
  • Domain Adaptation and Few-Shot Learning
  • Adversarial Robustness in Machine Learning
  • Evolutionary Algorithms and Applications
  • Adaptive Dynamic Programming Control
  • Robotic Path Planning Algorithms
  • Advanced Neural Network Applications
  • Data Stream Mining Techniques
  • Scheduling and Optimization Algorithms
  • Advanced Multi-Objective Optimization Algorithms
  • Advanced Bandit Algorithms Research
  • Machine Learning and Algorithms
  • Optimization and Search Problems
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Fuzzy Logic and Control Systems
  • Smart Grid Energy Management
  • Model Reduction and Neural Networks
  • Explainable Artificial Intelligence (XAI)
  • Robotic Locomotion and Control
  • Neural dynamics and brain function
  • Control Systems and Identification

Google (United Kingdom)
2015-2024

DeepMind (United Kingdom)
2014-2024

Google (United States)
2015-2021

Corvallis Environmental Center
2020

University of Freiburg
2009-2016

Laboratoire d'Informatique de Paris-Nord
2014-2015

Osnabrück University
2004-2009

TU Dortmund University
2003-2004

University of Padua
2004

Karlsruhe Institute of Technology
1994-2003

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

10.48550/arxiv.1312.5602 preprint EN other-oa arXiv (Cornell University) 2013-01-01
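The core of the method above is the one-step Q-learning target, y = r + γ max_a' Q(s', a'), computed from the network's action values at the next state. A minimal numpy sketch of that target computation (function name and toy batch are illustrative, not from the paper's code):

```python
import numpy as np

def q_learning_targets(rewards, next_q_values, terminals, gamma=0.99):
    """One-step Q-learning targets y = r + gamma * max_a' Q(s', a'),
    with the bootstrap term dropped at terminal transitions."""
    max_next = next_q_values.max(axis=1)            # greedy value at next state
    return rewards + gamma * (1.0 - terminals) * max_next

# Toy batch: 2 transitions, 3 actions; the second transition ends the episode.
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0, 1.0],
                   [0.0, 0.0, 0.0]])
terminals = np.array([0.0, 1.0])
targets = q_learning_targets(rewards, next_q, terminals)
```

The network is then regressed toward these targets, so the value estimate bootstraps on its own predictions at the next frame.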

A learning algorithm for multilayer feedforward networks, RPROP (resilient propagation), is proposed. To overcome the inherent disadvantages of pure gradient descent, RPROP performs a local adaptation of the weight updates according to the behavior of the error function. Contrary to other adaptive techniques, the effect of the adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but depends only on the temporal behavior of its sign. This leads to an efficient and transparent adaptation process. The capabilities of RPROP are shown in comparison...

10.1109/icnn.1993.298623 article EN IEEE International Conference on Neural Networks 2002-12-30
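The sign-only idea above can be sketched in a few lines: each weight carries its own step size, grown when the gradient sign repeats and shrunk when it flips, and the update direction comes from the sign alone. This is a simplified variant (on a sign change it shrinks the step and skips the update, in the style of iRPROP-), with illustrative default constants:

```python
import numpy as np

def rprop_minimize(grad_fn, w, steps=100, eta_plus=1.2, eta_minus=0.5,
                   delta0=0.1, delta_min=1e-6, delta_max=50.0):
    """Minimal RPROP sketch: per-weight step sizes adapted from the
    gradient's sign history; the gradient magnitude is never used."""
    delta = np.full_like(w, delta0)
    prev_grad = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        same_sign = g * prev_grad
        delta = np.where(same_sign > 0, np.minimum(delta * eta_plus, delta_max), delta)
        delta = np.where(same_sign < 0, np.maximum(delta * eta_minus, delta_min), delta)
        g = np.where(same_sign < 0, 0.0, g)   # skip the update after a sign flip
        w = w - np.sign(g) * delta
        prev_grad = g
    return w

# Quadratic bowl with minimum at (3, -1); gradient is 2 * (w - minimum).
w_opt = rprop_minimize(lambda w: 2 * (w - np.array([3.0, -1.0])), np.zeros(2))
```

Because only the sign enters the update, the method is insensitive to the vanishing gradient magnitudes that slow plain gradient descent in deep layers.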

Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: alternating convolution and max-pooling layers followed by a small number of fully connected layers. We re-evaluate the state of the art for object recognition from small images with convolutional networks, questioning the necessity of different components in the pipeline. We find that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks. Following this finding -- and building on other recent work for finding simple...

10.48550/arxiv.1412.6806 preprint EN other-oa arXiv (Cornell University) 2014-01-01

Deep convolutional networks have proven to be very successful in learning task-specific features that allow for unprecedented performance on various computer vision tasks. Training of such networks follows mostly the supervised learning paradigm, where sufficiently many input-output pairs are required for training. Acquisition of large training sets is one of the key challenges when approaching a new task. In this paper, we aim for generic feature learning and present an approach for training a convolutional network using only unlabeled data. To this end, we train...

10.1109/tpami.2015.2496141 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2015-10-29

The reinforcement learning paradigm allows, in principle, for complex behaviours to be learned directly from simple reward signals. In practice, however, it is common to carefully hand-design the reward function to encourage a particular solution, or to derive it from demonstration data. In this paper we explore how a rich environment can help promote the learning of complex behavior. Specifically, we train agents in diverse environmental contexts, and find that this encourages the emergence of robust behaviours that perform well across a suite of tasks. We demonstrate this principle...

10.48550/arxiv.1707.02286 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents. The tasks are written in Python and powered by the MuJoCo physics engine, making them easy to use and modify. We include benchmarks for several learning algorithms. The Control Suite is publicly available at https://www.github.com/deepmind/dm_control . A video summary of all tasks is available at http://youtu.be/rAai4QzcYbs

10.48550/arxiv.1801.00690 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual interactions are used to fill a replay buffer, and the sampling ratio between demonstrations and transitions is automatically tuned via a prioritized replay mechanism. Typically, carefully engineered shaping rewards are required to enable the agents to efficiently explore high-dimensional control problems such as...

10.48550/arxiv.1707.08817 preprint EN other-oa arXiv (Cornell University) 2017-01-01
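The mixed-buffer idea above can be illustrated with a toy prioritized replay buffer: demonstration and agent transitions live in one buffer, each with a TD-error-based priority, so how often demonstrations are sampled follows from their usefulness rather than a fixed mixing hyperparameter. All names and constants here are illustrative, not from the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedReplay:
    """Toy sketch: sampling probability is proportional to |TD error| + eps,
    so high-error (informative) transitions, often demonstrations early in
    training, are replayed more frequently."""
    def __init__(self, eps=1e-3):
        self.items, self.priorities, self.eps = [], [], eps

    def add(self, transition, td_error):
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size):
        p = np.array(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.items), size=batch_size, p=p)
        return [self.items[i] for i in idx]

buf = PrioritizedReplay()
buf.add(("demo", 0), td_error=2.0)    # informative demonstration: high priority
buf.add(("agent", 1), td_error=0.1)   # mostly-learned agent transition
batch = buf.sample(8)                  # demo transitions dominate the batch
```

As the agent's value estimates improve on the demonstrated states, their TD errors shrink and the effective demo/agent ratio decays on its own.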

Robust object recognition is a crucial ingredient of many, if not all, real-world robotics applications. This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition. Our architecture is composed of two separate CNN processing streams - one for each modality - which are consecutively combined with a late fusion network. We focus on learning with imperfect sensor data, a typical problem in real-world robotics tasks. For accurate learning, we introduce a multi-stage training...

10.1109/iros.2015.7353446 article EN 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2015-09-01

Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a previously undescribed architecture for tokamak magnetic controller design that autonomously learns...

10.1038/s41586-021-04301-9 article EN cc-by Nature 2022-02-16

This paper discusses the effectiveness of deep auto-encoder neural networks in visual reinforcement learning (RL) tasks. We propose a framework for combining the training of deep auto-encoders (for learning compact feature spaces) with recently-proposed batch-mode RL algorithms (for learning policies). An emphasis is put on the data-efficiency of this combination and on studying the properties of the feature spaces automatically constructed by the deep auto-encoders. These feature spaces are empirically shown to adequately resemble existing similarities and spatial relations...

10.1109/ijcnn.2010.5596468 article EN 2010 International Joint Conference on Neural Networks (IJCNN) 2010-07-01

We introduce Embed to Control (E2C), a method for model learning and control of non-linear dynamical systems from raw pixel images. E2C consists of a deep generative model, belonging to the family of variational autoencoders, that learns to generate image trajectories from a latent space in which the dynamics is constrained to be locally linear. Our model is derived directly from an optimal control formulation in latent space, supports long-term prediction of image sequences and exhibits strong performance on a variety of complex control problems.

10.48550/arxiv.1506.07365 preprint EN other-oa arXiv (Cornell University) 2015-01-01
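The locally linear constraint above means each latent transition has the form z_{t+1} = A_t z_t + B_t u_t + o_t, which is what makes iLQR-style optimal control tractable in the learned latent space. A minimal numpy sketch of one such transition (in E2C the matrices A_t, B_t and offset o_t are predicted by the network per time step; here they are fixed toy values):

```python
import numpy as np

def locally_linear_step(z, u, A, B, o):
    """One latent-space transition z_{t+1} = A z_t + B u_t + o,
    the locally linear form E2C constrains its dynamics to."""
    return A @ z + B @ u + o

z = np.array([1.0, 0.0])                    # latent state
u = np.array([0.5])                         # control input
A = np.array([[1.0, 0.1], [0.0, 1.0]])      # state transition matrix
B = np.array([[0.0], [1.0]])                # control matrix
o = np.zeros(2)                             # offset
z_next = locally_linear_step(z, u, A, B, o)  # -> [1.0, 0.5]
```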