- Reinforcement Learning in Robotics
- Adversarial Robustness in Machine Learning
- Robot Manipulation and Learning
- Human Pose and Action Recognition
- Domain Adaptation and Few-Shot Learning
- Robotic Locomotion and Control
- Robotic Path Planning Algorithms
- Autonomous Vehicle Technology and Safety
- Evolutionary Algorithms and Applications
- Robotics and Sensor-Based Localization
- Explainable Artificial Intelligence (XAI)
- Machine Learning and Algorithms
- Viral Infectious Diseases and Gene Expression in Insects
- Multimodal Machine Learning Applications
- Smart Grid Energy Management
- Machine Learning and Data Classification
- Model Reduction and Neural Networks
- Muscle Activation and Electromyography Studies
- Neural Networks and Reservoir Computing
- Generative Adversarial Networks and Image Synthesis
- Neural Networks and Applications
- Anomaly Detection Techniques and Applications
- AI-based Problem Solving and Planning
- Soil Mechanics and Vehicle Dynamics
- Smart Grid Security and Resilience
Google (United Kingdom)
2024
DeepMind (United Kingdom)
2021-2024
Leibniz University Hannover
2015-2024
University College London
2023
Google (United States)
2018-2021
Corvallis Environmental Center
2020
University of Oxford
2016-2019
Science Oxford
2016-2017
Oxford Research Group
2016
Massachusetts Institute of Technology
2013
This paper presents a general framework for exploiting the representational capacity of neural networks to approximate complex, nonlinear reward functions in the context of solving the inverse reinforcement learning (IRL) problem. We show in this context that the Maximum Entropy paradigm for IRL lends itself naturally to the efficient training of deep architectures. At test time, the approach leads to a computational complexity independent of the number of demonstrations, which makes it especially well-suited for applications in life-long learning scenarios. Our...
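To make the idea above concrete, the sketch below shows the core Maximum Entropy deep IRL update on a small tabular MDP with a known transition model: a reward network scores states, soft value iteration plus a forward pass yields expected state visitation frequencies, and the network is updated towards the gradient (expert SVF - expected SVF). All names (`reward_net`, `expected_svf`, the placeholder transitions and expert statistics) and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of Maximum Entropy deep IRL on a tabular MDP (assumed setup).
import torch
import torch.nn as nn

n_states, n_actions, horizon = 25, 4, 20
phi = torch.eye(n_states)                       # one-hot state features (assumed)
P = torch.rand(n_states, n_actions, n_states)   # placeholder transition model
P = P / P.sum(-1, keepdim=True)

reward_net = nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_net.parameters(), lr=1e-3)

def expected_svf(r, p0):
    """Soft value iteration + forward pass -> expected state visitation frequencies."""
    v = torch.zeros(n_states)
    for _ in range(horizon):                     # backward pass (soft Bellman backup)
        q = r.unsqueeze(1) + P @ v               # [S, A]
        v = torch.logsumexp(q, dim=1)
    policy = torch.softmax(q, dim=1)             # stochastic MaxEnt policy
    mu, svf = p0.clone(), p0.clone()
    for _ in range(horizon - 1):                 # forward pass (occupancy propagation)
        mu = torch.einsum('s,sa,sat->t', mu, policy, P)
        svf = svf + mu
    return svf

# Placeholder expert statistics; in practice these come from the demonstrations.
expert_svf = torch.rand(n_states); expert_svf /= expert_svf.sum()
p0 = torch.full((n_states,), 1.0 / n_states)

for step in range(200):
    r = reward_net(phi).squeeze(-1)              # per-state reward
    svf = expected_svf(r.detach(), p0)           # no gradient through the planner
    # MaxEnt IRL gradient w.r.t. the reward is (expert SVF - expected SVF);
    # expressing it as a surrogate loss lets autograd push it into the network.
    loss = -((expert_svf - svf) * r).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the demonstrations only enter through the fixed expert visitation counts, the per-update cost at test time does not grow with the number of demonstrations, which is the property highlighted in the abstract.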
Many relevant tasks require an agent to reach a certain state, or to manipulate objects into a desired configuration. For example, we might want a robot to align and assemble a gear onto an axle or to insert and turn a key in a lock. These goal-oriented tasks present a considerable challenge for reinforcement learning, since their natural reward function is sparse and prohibitive amounts of exploration are required to reach the goal and receive some learning signal. Past approaches tackle these problems by exploiting expert demonstrations...
We investigated whether deep reinforcement learning (deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies. We used deep RL to train a humanoid robot to play a simplified one-versus-one soccer game. The resulting agent exhibits robust and dynamic movement skills, such as rapid fall recovery, walking, turning, and kicking, and it transitions between them in a smooth and efficient manner. It also learned to anticipate ball movements and block...
We present an approach for learning spatial traversability maps for driving in complex, urban environments based on an extensive dataset demonstrating the behaviour of human experts. The direct end-to-end mapping from raw input data to cost bypasses the effort of manually designing parts of the pipeline, exploits a large number of training samples, and can be framed to additionally refine handcrafted cost maps produced from manual, hand-engineered features. To achieve this, we introduce a maximum-entropy-based, non-linear inverse reinforcement...
In this work, we present an approach to learn cost maps for driving in complex urban environments from a large number of demonstrations of human driving behaviour. The learned cost maps are constructed directly from raw sensor measurements, bypassing the effort of manually designing cost maps as well as features. When deploying the learned cost maps, the trajectories generated not only replicate human-like driving behaviour but are also demonstrably robust against systematic errors in the putative robot configuration. To achieve this, we deploy a Maximum Entropy based, non-linear...
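The two abstracts above describe an end-to-end mapping from raw sensor data to a per-cell cost map. A minimal sketch of that kind of mapping is given below, assuming the sensor input has been rasterised into a 2D grid; the architecture, channel layout, and the name `CostMapNet` are illustrative assumptions rather than the papers' network.

```python
# Minimal sketch: fully convolutional net mapping a sensor grid to a cost map.
import torch
import torch.nn as nn

class CostMapNet(nn.Module):
    """Sensor grid [B, C, H, W] -> cost map [B, 1, H, W]."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),              # one cost value per grid cell
        )

    def forward(self, sensor_grid: torch.Tensor) -> torch.Tensor:
        return self.body(sensor_grid)

# Example: a batch of 3-channel grids (e.g. per-cell height statistics, assumed).
net = CostMapNet(in_channels=3)
cost = net(torch.randn(4, 3, 128, 128))       # -> [4, 1, 128, 128]
```

A planner can then optimise trajectories against the predicted cost map, which is how the learned costs replace the hand-designed ones mentioned above.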
Continuous appearance shifts such as changes in weather and lighting conditions can impact the performance of deployed machine learning models. While unsupervised domain adaptation aims to address this challenge, current approaches do not utilise the continuity of the occurring shifts. In particular, many robotics applications exhibit these conditions and thus facilitate the potential to incrementally adapt a learnt model over minor shifts which integrate to massive differences over time. Our work presents an adversarial approach for...
Learning to combine control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: although movements are ultimately executed as instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and often involve complex interactions with the environment and other agents. Recent research has...
This paper describes novel experimental methods aimed at understanding the fundamental phenomena governing the motion of lightweight vehicles on dry, granular soils. A single-wheel test rig is used to empirically investigate wheel behaviour under controlled slip and loading conditions in sandy, dry soil. Tests can be designed to replicate typical field scenarios for robots, while key operational parameters such as drawbar force, torque, and sinkage are measured. This...
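For readers unfamiliar with the controlled-slip experiments mentioned above, the small helper below illustrates two quantities such a test rig typically reports: the slip ratio of a driven wheel and the drawbar-pull coefficient. The formulas are the standard terramechanics definitions; the function names, sign convention, and the example numbers are assumptions for illustration only.

```python
# Illustrative helpers for single-wheel slip experiments (assumed conventions).
def slip_ratio(omega: float, radius: float, velocity: float) -> float:
    """s = (omega*r - v) / (omega*r) for a driven wheel with omega*r > 0."""
    return (omega * radius - velocity) / (omega * radius)

def drawbar_pull_coefficient(drawbar_force: float, vertical_load: float) -> float:
    """Net traction normalised by the vertical load carried by the wheel."""
    return drawbar_force / vertical_load

# Example: 2.0 rad/s on a 0.15 m wheel translating at 0.24 m/s -> 20% slip.
print(slip_ratio(2.0, 0.15, 0.24))   # 0.2
```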
Appearance changes due to weather and seasonal conditions represent a strong impediment to the robust implementation of machine learning systems in outdoor robotics. While supervised learning optimises a model for the training domain, it will deliver degraded performance in application domains that underlie distributional shifts caused by these changes. Traditionally, this problem has been addressed via the collection of labelled data in multiple domains or by imposing priors on the type of shift between both domains. We frame the problem in the context of...
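The adversarial domain adaptation setting referred to in this and the earlier appearance-change abstract can be summarised in a short sketch: a discriminator learns to distinguish source-domain features from target-domain features, while the feature encoder is trained to make them indistinguishable. The modules, optimisers, and the function `adaptation_step` are assumptions illustrating the general scheme, not the papers' specific architecture.

```python
# Hedged sketch of adversarial feature alignment between two appearance domains.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                        nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
discriminator = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

opt_enc = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def adaptation_step(source_imgs, target_imgs):
    # 1) Discriminator: source features -> label 1, target features -> label 0.
    f_s, f_t = encoder(source_imgs).detach(), encoder(target_imgs).detach()
    d_loss = bce(discriminator(f_s), torch.ones(len(f_s), 1)) + \
             bce(discriminator(f_t), torch.zeros(len(f_t), 1))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # 2) Encoder: make target features look like source to the discriminator.
    g_loss = bce(discriminator(encoder(target_imgs)),
                 torch.ones(len(target_imgs), 1))
    opt_enc.zero_grad(); g_loss.backward(); opt_enc.step()
    return d_loss.item(), g_loss.item()

# Example call with dummy 64x64 RGB batches from each domain.
adaptation_step(torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64))
```

No target-domain labels are needed: only the domain identity of each batch enters the losses, which is what makes the scheme unsupervised on the application domain.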
Many real-world control problems involve both discrete decision variables - such as the choice of control modes, gear switching or digital outputs - as well as continuous decision variables - such as velocity setpoints, control gains or analogue outputs. However, when defining the corresponding optimal control or reinforcement learning problem, it is commonly approximated with fully discrete or fully continuous action spaces. These simplifications aim at tailoring the problem to a particular algorithm or solver which may only support one type of action space. Alternatively, expert heuristics are used...
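One way to avoid the discretise-or-relax approximation described above is a policy that outputs a discrete distribution and a continuous distribution jointly, so both action types are optimised together. The sketch below shows such a hybrid policy head; the class name, layer sizes, and the factorised log-probability are assumptions illustrating the idea, not the paper's algorithm.

```python
# Minimal sketch of a hybrid discrete/continuous policy head.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class HybridPolicy(nn.Module):
    def __init__(self, obs_dim: int, n_modes: int, n_cont: int):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.mode_logits = nn.Linear(128, n_modes)    # discrete head (e.g. gear choice)
        self.cont_mean = nn.Linear(128, n_cont)       # continuous head (e.g. setpoints)
        self.cont_log_std = nn.Parameter(torch.zeros(n_cont))

    def forward(self, obs):
        h = self.trunk(obs)
        return (Categorical(logits=self.mode_logits(h)),
                Normal(self.cont_mean(h), self.cont_log_std.exp()))

policy = HybridPolicy(obs_dim=10, n_modes=3, n_cont=2)
mode_dist, cont_dist = policy(torch.randn(1, 10))
action = (mode_dist.sample(), cont_dist.sample())
# The joint log-probability factorises across the two heads.
log_prob = mode_dist.log_prob(action[0]) + cont_dist.log_prob(action[1]).sum(-1)
```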
We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our approach does not require extensive reward engineering to produce sensible and natural looking behavior at the time of reuse. This makes it easy to create well-regularized,...
Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks. By reusing the corresponding sub-policies within and between tasks, they provide training data for each policy from different high-level tasks and compose them to perform novel ones. Existing approaches to modular LfD focus either on learning a single task or depend on domain knowledge and temporal segmentation. In contrast, we propose a weakly supervised, domain-agnostic approach based...
Training robots for operation in the real world is a complex, time consuming and potentially expensive task. Despite the significant success of reinforcement learning in games and simulations, research in robot applications has not been able to match similar progress. While sample complexity can be reduced by training policies in simulation, such policies can perform sub-optimally on the platform given imperfect calibration of model dynamics. We present an approach -- supplemental to fine tuning -- to further benefit from parallel...
Reinforcement learning (RL) for continuous control typically employs distributions whose support covers the entire action space. In this work, we investigate the colloquially known phenomenon that trained agents often prefer actions at the boundaries of that space. We draw theoretical connections to the emergence of bang-bang behavior in optimal control, and provide extensive empirical evaluation across a variety of recent RL algorithms. We replace the normal Gaussian by a Bernoulli distribution that solely considers the extremes along...
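The substitution mentioned above is simple enough to show directly: each action dimension is modelled by a Bernoulli choice between its two extremes instead of a full-support Gaussian. The class name, network sizes, and action bounds below are illustrative assumptions.

```python
# Hedged sketch of a bang-bang (Bernoulli-over-extremes) policy head.
import torch
import torch.nn as nn
from torch.distributions import Bernoulli

class BangBangPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, act_low=-1.0, act_high=1.0):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))
        self.low, self.high = act_low, act_high

    def forward(self, obs):
        return Bernoulli(logits=self.net(obs))    # per-dimension choice of extreme

    def act(self, obs):
        choice = self.forward(obs).sample()       # 0 or 1 per action dimension
        return self.low + (self.high - self.low) * choice

policy = BangBangPolicy(obs_dim=8, act_dim=3)
action = policy.act(torch.randn(1, 8))            # each entry is -1.0 or +1.0
```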
The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements. We introduce Regularized Hierarchical Policy Optimization (RHPO) to improve data-efficiency for domains with multiple dominant tasks and ultimately reduce required platform time. To this end, we employ compositional inductive biases on multiple levels and corresponding mechanisms for sharing off-policy transition data across low-level controllers as well...
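The hierarchical, compositional structure referred to above can be pictured as a task-conditioned high-level gate selecting among shared low-level Gaussian components. The sketch below illustrates that structure only; the class name, layer sizes, and sampling scheme are assumptions and do not reproduce RHPO itself.

```python
# Minimal sketch of a task-gated mixture policy over shared low-level components.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class MixturePolicy(nn.Module):
    def __init__(self, obs_dim, task_dim, act_dim, n_components=4):
        super().__init__()
        self.gate = nn.Linear(obs_dim + task_dim, n_components)   # high-level gate
        self.means = nn.Linear(obs_dim, n_components * act_dim)   # shared low level
        self.log_std = nn.Parameter(torch.zeros(n_components, act_dim))
        self.n, self.act_dim = n_components, act_dim

    def forward(self, obs, task):
        gate = Categorical(logits=self.gate(torch.cat([obs, task], -1)))
        means = self.means(obs).view(-1, self.n, self.act_dim)
        components = Normal(means, self.log_std.exp())
        return gate, components

policy = MixturePolicy(obs_dim=10, task_dim=3, act_dim=4)
gate, components = policy(torch.randn(2, 10), torch.randn(2, 3))
k = gate.sample()                                  # pick a low-level controller
action = components.sample()[torch.arange(2), k]   # sample from its Gaussian
```

Because the low-level components are conditioned only on the observation, the same controllers (and their off-policy transition data) can be shared across tasks, which is the data-efficiency mechanism the abstract points to.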
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, in terms of relations that extend far beyond the body itself, involving coordination with other agents. Recent research in artificial intelligence has shown the promise of learning-based approaches to the respective problems of complex movement,...