- Reinforcement Learning in Robotics
- Human Pose and Action Recognition
- Machine Learning and Algorithms
- Gaussian Processes and Bayesian Inference
- Robot Manipulation and Learning
- Robotic Locomotion and Control
- Generative Adversarial Networks and Image Synthesis
- Data Stream Mining Techniques
- AI-based Problem Solving and Planning
- Muscle Activation and Electromyography Studies
- Markov Chains and Monte Carlo Methods
- Machine Learning and Data Classification
- Neural Networks and Applications
- Human Motion and Animation
- Intelligent Tutoring Systems and Adaptive Learning
- Model Reduction and Neural Networks
- Bayesian Methods and Mixture Models
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Evolutionary Algorithms and Applications
- Advanced Multi-Objective Optimization Algorithms
- Action Observation and Synchronization
- Stochastic Gradient Optimization Techniques
- Robotic Path Planning Algorithms
- Advanced Vision and Imaging
- DeepMind (United Kingdom), 2019-2024
- Google (United Kingdom), 2024
- University College London, 2023
- Google (United States), 2020-2021
- University of Oxford, 2017-2019
- University of Cambridge, 2014
We investigated whether deep reinforcement learning (deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies. We used deep RL to train the robot to play a simplified one-versus-one soccer game. The resulting agent exhibits robust dynamic skills, such as rapid fall recovery, walking, turning, and kicking, and it transitions between them in a smooth and efficient manner. It also learned to anticipate ball movements and block...
We address the longstanding challenge of producing flexible, realistic humanoid character controllers that can perform diverse whole-body tasks involving object interactions. This is central to a variety of fields, from graphics and animation to robotics and motor neuroscience. Our physics-based environment uses realistic actuation and first-person perception - including touch sensors and egocentric vision - with a view to active-sensing behaviors (e.g. gaze direction), transferability to real robots, and comparisons to biology...
We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional, physically simulated humanoids. To do this, we propose an architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this system entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic system can perform one-shot imitation of whole-body humanoid behaviors,...
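The inverse-model-with-bottleneck structure described above can be sketched in a few lines. Everything here is an illustrative assumption, not the paper's architecture: the class name, the single linear/tanh layers, and the dimensions are ours; the point is only that a transition (s, s') is squeezed through a low-dimensional skill code z, from which an action is decoded.

```python
import numpy as np

rng = np.random.default_rng(0)

class LatentInverseModel:
    """Toy inverse model with a latent-variable bottleneck.

    The encoder compresses a transition (s, s') into a low-dimensional
    skill code z; the decoder maps (s, z) back to an action. Shapes and
    the linear/tanh maps are illustrative assumptions only.
    """

    def __init__(self, state_dim, action_dim, latent_dim):
        self.enc = rng.normal(size=(2 * state_dim, latent_dim)) * 0.1
        self.dec = rng.normal(size=(state_dim + latent_dim, action_dim)) * 0.1

    def encode(self, s, s_next):
        # Bottleneck: the whole transition is squeezed into latent_dim numbers.
        return np.tanh(np.concatenate([s, s_next]) @ self.enc)

    def decode(self, s, z):
        # "Inverse model": recover the action that drives s toward s'.
        return np.concatenate([s, z]) @ self.dec

model = LatentInverseModel(state_dim=10, action_dim=4, latent_dim=3)
s, s_next = rng.normal(size=10), rng.normal(size=10)
z = model.encode(s, s_next)      # 3-dim skill code
action = model.decode(s, z)      # 4-dim action
```

One-shot imitation then amounts to encoding a demonstrated transition once and reusing the resulting z as a conditioning signal for the decoder.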
Learning to combine control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: although movements are ultimately executed as instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and that often involve complex interactions with the environment and other agents. Recent research has...
Large language models (LLMs) have demonstrated exciting progress in acquiring diverse new capabilities through in-context learning, ranging from logical reasoning to code-writing. Robotics researchers have also explored using LLMs to advance the capabilities of robotic control. However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated them as semantic planners or relied on human-engineered control primitives to interface...
Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes proposals to learn the learning algorithm itself, an idea also known as meta-learning. One formal interpretation of this idea is as a partially observable multi-task RL problem in which the task information is hidden from the agent. Such unknown-task problems can be reduced to Markov decision processes (MDPs)...
We present a system for applying sim2real approaches to "in the wild" scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, we learn the scene's contact geometry and a function for novel view synthesis using a Neural Radiance Field (NeRF). We augment the NeRF rendering by overlaying other dynamic objects (e.g. the robot's own body, a ball). A simulation is then created using the rendering engine in a physics simulator which computes dynamics from...
Variational inference relies on flexible approximate posterior distributions. Normalizing flows provide a general recipe to construct flexible variational posteriors. We introduce Sylvester normalizing flows, which can be seen as a generalization of planar flows. Sylvester normalizing flows remove the well-known single-unit bottleneck from planar flows, making a single transformation much more flexible. We compare the performance of Sylvester normalizing flows against planar flows and inverse autoregressive flows and demonstrate that they compare favorably on several datasets.
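For context, a planar flow applies a single rank-one update driven by one hidden unit, which is exactly the bottleneck Sylvester flows remove. A minimal NumPy sketch of the planar transform and its log-Jacobian (variable names and test values are ours):

```python
import numpy as np

def planar_flow(z, u, w, b):
    """Planar flow f(z) = z + u * tanh(w.z + b) and log|det df/dz|.

    The single tanh unit is the "single-unit bottleneck": the update to z
    has rank one. Sylvester flows replace the vectors u and w with
    matrices (via a QR parameterization), so one transformation can use
    many hidden units at once.
    """
    a = np.tanh(w @ z + b)                  # scalar hidden activation
    f = z + u * a                           # rank-one update of z
    psi = (1.0 - a ** 2) * w                # tanh'(w.z + b) * w
    log_det = np.log(np.abs(1.0 + u @ psi)) # |det| via matrix determinant lemma
    return f, log_det

z = np.array([1.0, 0.0])
f, log_det = planar_flow(z, u=np.array([0.5, 0.0]),
                         w=np.array([1.0, 0.0]), b=0.0)
```

The cheap log-determinant comes from the matrix determinant lemma applied to the rank-one Jacobian; Sylvester flows keep a determinant of the same cost for the higher-rank case, which is the source of their name.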
The problem of posterior inference is central to Bayesian statistics, and a wealth of Markov Chain Monte Carlo (MCMC) methods have been proposed to obtain asymptotically correct samples from the posterior. As datasets in applications grow larger and larger, scalability has emerged as a central concern for MCMC methods. Stochastic Gradient Langevin Dynamics (SGLD) and related stochastic gradient MCMC methods offer scalability by using stochastic gradients in each step of the simulated dynamics. While these methods are unbiased if the stepsizes are reduced in an appropriate fashion,...
We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our approach does not require extensive reward engineering to produce sensible and natural looking behavior at the time of reuse. This makes it easy to create well-regularized,...
Recent advancements in large multimodal models have led to the emergence of remarkable generalist capabilities in digital domains, yet their translation to physical agents such as robots remains a significant challenge. This report introduces a new family of AI models purposefully designed for robotics and built upon the foundation of Gemini 2.0. We present Gemini Robotics, an advanced Vision-Language-Action (VLA) model capable of directly controlling robots. Gemini Robotics executes smooth and reactive movements to tackle a wide range...
Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL-regularized expected reward objective, which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we learn it from data. But crucially, we restrict the amount of information the default policy receives, forcing it to learn reusable behaviors that help the agent learn faster. We formalize this strategy and discuss...
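In our notation, the per-step objective behind this setup is r(s, a) - alpha * KL(pi(.|s) || pi0(.|s)): the agent earns the task reward but pays for deviating from the default policy pi0. A sketch for discrete action distributions (function name, alpha, and the example policies are illustrative, not the paper's):

```python
import numpy as np

def kl_regularized_return(reward, pi, pi0, alpha):
    """Per-step KL-regularized objective r - alpha * KL(pi || pi0)
    for discrete action distributions (notation is ours)."""
    kl = np.sum(pi * (np.log(pi) - np.log(pi0)))
    return reward - alpha * kl

pi0 = np.array([0.25, 0.25, 0.25, 0.25])     # broad default policy
pi_same = pi0.copy()                          # agent matches the default
pi_peaked = np.array([0.97, 0.01, 0.01, 0.01])  # agent deviates sharply

no_penalty = kl_regularized_return(1.0, pi_same, pi0, alpha=0.5)
penalized = kl_regularized_return(1.0, pi_peaked, pi0, alpha=0.5)
# Matching the default costs nothing; deviating sharply is penalized.
```

Learning pi0 from data while restricting its inputs (e.g. hiding task-specific observations) is what forces it to capture only the behaviors shared across tasks.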
We propose a simple imitation learning procedure for learning locomotion controllers that can walk over very challenging terrains. We use trajectory optimization (TO) to produce a large dataset of trajectories over procedurally generated terrains and then use Reinforcement Learning (RL) to imitate these trajectories. We demonstrate with a realistic model of the ANYmal robot that the learned controllers transfer to unseen terrains and provide an effective initialization for fine-tuning on terrains that require exteroception and precise foot placements. Our setup combines TO and RL in a fashion...