- Reinforcement Learning in Robotics
- Robot Manipulation and Learning
- Social Robot Interaction and HRI
- Adversarial Robustness in Machine Learning
- Autonomous Vehicle Technology and Safety
- Explainable Artificial Intelligence (XAI)
- Robotic Path Planning Algorithms
- Human-Automation Interaction and Safety
- Ethics and Social Impacts of AI
- Machine Learning and Algorithms
- AI-based Problem Solving and Planning
- Anomaly Detection Techniques and Applications
- Topic Modeling
- Natural Language Processing Techniques
- Human Pose and Action Recognition
- EEG and Brain-Computer Interfaces
- Advanced Bandit Algorithms Research
- Decision-Making and Behavioral Economics
- Machine Learning and Data Classification
- Gaze Tracking and Assistive Technology
- Domain Adaptation and Few-Shot Learning
- Data Stream Mining Techniques
- Tactile and Sensory Interactions
- Multimodal Machine Learning Applications
- Bayesian Modeling and Causal Inference
University of California, Berkeley
2016-2025
Berkeley College
2018-2024
Carnegie Mellon University
2011-2022
South China University of Technology
2018
Stanford University
2018
Bangladesh University of Engineering and Technology
2016
Fraunhofer Institute for Industrial Mathematics
2009
In this paper, we present CHOMP (covariant Hamiltonian optimization for motion planning), a method for trajectory optimization invariant to reparametrization. CHOMP uses functional gradient techniques to iteratively improve the quality of an initial trajectory, optimizing a functional that trades off between a smoothness and an obstacle avoidance component. CHOMP can be used to locally optimize feasible trajectories, as well as to solve motion planning queries, converging to low-cost trajectories even when initialized with infeasible ones. It uses Hamiltonian Monte Carlo...
A key requirement for seamless human-robot collaboration is for the robot to make its intentions clear to its human collaborator. A collaborative robot's motion must be legible, or intent-expressive. Legibility is often described in the literature as an effect of predictable, unsurprising, or expected motion. Our central insight is that predictability and legibility are fundamentally different and often contradictory properties of motion. We develop a formalism to mathematically define and distinguish predictability and legibility of motion. We formalize the two based on inferences between...
Traditionally, autonomous cars make predictions about other drivers' future trajectories, and plan to stay out of their way. This tends to result in defensive and opaque behaviors. Our key insight is that an autonomous car's actions will actually affect what other drivers will do in response, whether the car is aware of it or not. Our thesis is that we can leverage these responses to plan more efficient and communicative behaviors. We model the interaction between an autonomous car and a human driver as a dynamical system, in which the robot's actions have immediate consequences on the state of the car, but also...
In shared control teleoperation, the robot assists the user in accomplishing the desired task, making teleoperation easier and more seamless. Rather than simply executing the user’s input, which is hindered by the inadequacies of the interface, the robot attempts to predict the user’s intent, and assists in accomplishing it. In this work, we are interested in the scientific underpinnings of assistance: we propose an intuitive formalism that captures assistance as policy blending, illustrate how some of the existing techniques for shared control instantiate it, and provide a principled analysis of its...
For an autonomous system to be helpful to humans and to pose no unwarranted risks, it needs to align its values with those of the humans in its environment in such a way that its actions contribute to the maximization of value for the humans. We propose a formal definition of the value alignment problem as cooperative inverse reinforcement learning (CIRL). A CIRL problem is a cooperative, partial-information game with two agents, human and robot; both are rewarded according to the human's reward function, but the robot does not initially know what this is. In contrast...
Our goal is to efficiently learn reward functions encoding a human's preferences for how a dynamical system should act. There are two challenges with this. First, in many problems it is difficult for people to provide demonstrations of the desired system trajectory (like a high-DOF robot arm motion or an aggressive driving maneuver), or to even assign how much numerical reward an action or trajectory should get. We build on work in label ranking and propose to learn from preferences (or comparisons) instead: the person provides the system a relative preference between two trajectories. Second,...
A handover is a complex collaboration, where actors coordinate in time and space to transfer control of an object. This coordination comprises two processes: the physical process of moving to get close enough to transfer the object, and the cognitive process of exchanging information to guide the transfer. Despite this complexity, we humans are capable of performing handovers seamlessly in a wide variety of situations, even when unexpected. This suggests a common procedure that guides all handover interactions. Our goal is to codify that procedure.
Most motion in robotics is purely functional, planned to achieve the goal and avoid collisions. Such motion is great in isolation, but collaboration affords a human who is watching the motion and making inferences about it, trying to coordinate with the robot to achieve the task. This paper analyzes the benefit of planning motion that explicitly enables the collaborator's inferences on the success of physical collaboration, as measured by both objective and subjective metrics. Results suggest that legible motion, planned to clearly express the robot's intent, leads to more fluent collaborations than...
The actions of an autonomous vehicle on the road affect and are affected by those of other drivers, whether overtaking, negotiating a merge, or avoiding an accident. This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology. Unfortunately, dynamic games are too computationally demanding to meet the real-time constraints of autonomous driving in its continuous state...
Much of estimation of human internal state (goal, intentions, activities, preferences, etc.) is passive: an algorithm observes human actions and updates its estimate of human state. In this work, we embrace the fact that robot actions affect what humans do, and leverage it to improve state estimation. We enable robots to do active information gathering, by planning actions that probe the user in order to clarify their internal state. For instance, an autonomous car will plan to nudge into a human driver's lane to test their driving style. Results in simulation and in a user study suggest that active information gathering...
Preparation requires technical research and development, as well as adaptive, proactive governance
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and related methods; (2) overview techniques to understand, improve, and complement RLHF in practice; and (3) propose auditing and disclosure...
In assistive teleoperation, the robot helps the user accomplish the desired task, making teleoperation easier and more seamless. Rather than simply executing the user's input, which is hindered by the inadequacies of the interface, the robot attempts to predict the user's intent, and assists in accomplishing it. In this work, we are interested in the scientific underpinnings of assistance: we formalize assistance under the general framework of policy blending, show how previous work methods instantiate this formalism, and provide a principled analysis of its main...
Consequential decision-making typically incentivizes individuals to behave strategically, tailoring their behavior to the specifics of the decision rule. A long line of work has therefore sought to counteract strategic behavior by designing more conservative decision boundaries in an effort to increase robustness to the effects of strategic covariate shift.
In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal. The goal is often unknown ex-ante, so prior work enables agents to infer the goal from user input and assist with the task. Such methods tend to assume some combination of knowledge of the dynamics of the environment, the user's policy given their goal, and the set of possible goals the user might target, which limits their application to real-world scenarios. We propose a deep reinforcement learning framework for model-free shared autonomy that lifts these assumptions. We use...
We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself. Our results speak to a tension between the desire to keep a proprietary model secret and the ability to offer model explanations.
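With progress in enabling autonomous cars to drive safely on the road, it is time to start asking how they should be driving. A common answer is that they should be adopting their users' driving style. This makes the assumption that users want their autonomous cars to drive like they drive - aggressive drivers want aggressive cars, defensive drivers want defensive cars. In this paper, we put that assumption to the test. We find that users tend to prefer a significantly more defensive driving style than their own. Interestingly, they prefer the style they think is their own, even though their actual driving style tends to be more aggressive. We also find that preferences do depend on the specific driving scenario, opening the door for new ways of learning driving style preference.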
We present the hardware design, software architecture, and core algorithms of Herb 2.0, a bimanual mobile manipulator developed at the Personal Robotics Lab at Carnegie Mellon University, Pittsburgh, PA. We have developed Herb 2.0 to perform useful tasks for and with people in human environments. We exploit two key paradigms in human environments: that they have structure that a robot can learn, adapt and exploit, and that they demand general-purpose capability in robotic systems. In this paper, we reveal some of the structure present in everyday environments that we have been able to harness for manipulation...
Many problems in robotics involve multiple decision making agents. To operate efficiently in such settings, a robot must reason about the impact of its decisions on the behavior of other agents. Differential games offer an expressive theoretical framework for formulating these types of multi-agent problems. Unfortunately, most numerical solution techniques scale poorly with state dimension and are rarely used in real-time applications. For this reason, it is common to predict the future decisions of other agents and solve the resulting...
Legible motion, that is, motion that communicates its intent to a human observer, is crucial for enabling seamless human-robot collaboration. In this paper, we propose a functional gradient optimization technique for autonomously generating legible motion. Our algorithm optimizes a legibility metric inspired by the psychology of action interpretation in humans, resulting in motion trajectories that purposefully deviate from what an observer would expect in order to better convey intent. A trust region constraint on the optimization ensures that the motion does not become too...
In order to safely operate around humans, robots can employ predictive models of human motion. Unfortunately, these models cannot capture the full complexity of human behavior and necessarily introduce simplifying assumptions. As a result, predictions may degrade whenever the observed human behavior departs from the assumed structure, which can have negative implications for safety. In this paper, we observe that how rational human actions appear under a particular model can be viewed as an indicator of that model's ability to describe the human's current motion. By...