Brent Yi

ORCID: 0009-0009-8408-0717
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Robot Manipulation and Learning
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Neural Network Applications
  • Time Series Analysis and Forecasting
  • Multimodal Machine Learning Applications
  • Soft Robotics and Applications
  • Anomaly Detection Techniques and Applications
  • Prosthetics and Rehabilitation Robotics
  • Advanced Image and Video Retrieval Techniques
  • Virtual Reality Applications and Impacts
  • 3D Shape Modeling and Analysis
  • Computer Graphics and Visualization Techniques
  • Tactile and Sensory Interactions
  • Robotics and Sensor-Based Localization
  • Human Pose and Action Recognition
  • Teleoperation and Haptic Systems
  • Visual and Cognitive Learning Processes
  • Evolutionary Algorithms and Applications
  • Bayesian Modeling and Causal Inference
  • Muscle activation and electromyography studies
  • Open Source Software Innovations
  • Functional Brain Connectivity Studies
  • Model Reduction and Neural Networks
  • Hand Gesture Recognition Systems

University of California, Berkeley
2019-2023

Berkeley College
2023

Stanford University
2020-2021

Intel (United States)
2020

Neural Radiance Fields (NeRF) are a rapidly growing area of research with wide-ranging applications in computer vision, graphics, robotics, and more. In order to streamline the development and deployment of NeRF research, we propose a modular PyTorch framework, Nerfstudio. Our framework includes plug-and-play components for implementing NeRF-based methods, which make it easy for researchers and practitioners to incorporate NeRF into their projects. Additionally, the modular design enables support for extensive real-time...

10.1145/3588432.3591516 preprint EN cc-by 2023-07-19

Robots must cost less and be force-controlled to enable widespread, safe deployment in unconstrained human environments. We propose Quasi-Direct Drive actuation as a capable paradigm for robotic force-controlled manipulation in human environments at low cost. Our prototype, Blue, is a human-scale 7 Degree of Freedom arm with a 2kg payload. Blue can cost less than $5000. We show that Blue has dynamic properties that meet or exceed the needs of human operators: the robot has a nominal position-control bandwidth of 7.5Hz and repeatability within 4mm. We demonstrate a Virtual Reality...

10.1109/icra.2019.8794236 article EN 2019 International Conference on Robotics and Automation (ICRA) 2019-05-01

Learning policies in simulation and transferring them to the real world has become a promising approach in dexterous manipulation. However, bridging the sim-to-real gap for each new task requires substantial human effort, such as careful reward engineering, hyperparameter tuning, and system identification. In this work, we present a system that leverages low-level skills to address these challenges for more complex tasks. Specifically, we introduce a hierarchical policy for in-hand object reorientation based on previously...

10.48550/arxiv.2501.05439 preprint EN arXiv (Cornell University) 2025-01-09

Leveraging multimodal information with recursive Bayesian filters improves performance and robustness of state estimation, as recursive filters can combine different modalities according to their uncertainties. Prior work has studied how to optimally fuse different sensor modalities with analytical state estimation algorithms. However, deriving the dynamics and measurement models along with their noise profile can be difficult or lead to intractable models. Differentiable filters provide a way to learn these models end-to-end while retaining the algorithmic structure of recursive filters. This...

10.1109/iros45743.2020.9341579 article EN 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020-10-24

In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download,...

10.48550/arxiv.2402.03046 preprint EN arXiv (Cornell University) 2024-02-05

gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It features a front-end with Python bindings compatible with the PyTorch library and a back-end with highly optimized CUDA kernels. gsplat offers numerous features that enhance the optimization of Gaussian Splatting models, which include optimization improvements for speed, memory, and convergence times. Experimental results demonstrate that gsplat achieves up to 10% less training time and 4x less memory than the original implementation. Utilized in several research projects, gsplat is actively maintained on GitHub....

10.48550/arxiv.2409.06765 preprint EN arXiv (Cornell University) 2024-09-10

A recent line of work has shown that end-to-end optimization of Bayesian filters can be used to learn state estimators for systems whose underlying models are difficult to hand-design or tune, while retaining the core advantages of probabilistic state estimation. As an alternative approach for state estimation in these settings, we present an end-to-end approach for learning state estimators modeled as factor graph-based smoothers. By unrolling the optimizer we use for maximum a posteriori inference in these probabilistic graphical models, our method is able to learn probabilistic system models in the full context of an overall...

10.1109/iros51168.2021.9636300 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021-09-27

Robots deployed in human-centric environments may need to manipulate a diverse range of articulated objects, such as doors, dishwashers, and cabinets. Articulated objects often come with unexpected articulation mechanisms that are inconsistent with categorical priors: for example, a drawer might rotate about a hinge joint instead of sliding open. We propose a category-independent framework for predicting the articulation models of unknown objects from sequences of RGB-D images. The prediction is performed by a two-step process: first,...

10.1109/iros47612.2022.9982029 article EN 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022-10-23

We introduce RotateIt, a system that enables fingertip-based object rotation along multiple axes by leveraging multimodal sensory inputs. Our system is trained in simulation, where it has access to ground-truth object shapes and physical properties. Then we distill it to operate on realistic yet noisy simulated visuotactile and proprioceptive sensory inputs. These multimodal inputs are fused via a visuotactile transformer, enabling online inference of object shapes and physical properties during deployment. We show significant performance improvements over prior methods and the importance...

10.48550/arxiv.2309.09979 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Aiming to replicate human-like dexterity, perceptual experiences, and motion patterns, we explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data. Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing. To tackle the first challenge, we develop HATO, a low-cost hands-arms teleoperation system that leverages off-the-shelf electronics, complemented with a software...

10.48550/arxiv.2404.16823 preprint EN arXiv (Cornell University) 2024-04-25

Recent trends in robotic manipulation have highlighted the need for force-controlled grippers that are not only robust to repeated contacts with the environment, but also low cost. This paper presents Blue Gripper, a simple parallel-jaw gripper focused on cost and reliability. The proposed hand weighs 660 grams, has a throw of 120 mm, and can apply up to 150 N of force. It is designed to be passively backdrivable, enabling accurate feedforward force control, reducing complexity. In addition to detailing the gripper's...

10.1109/coase.2019.8843134 article EN 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE) 2019-08-01

Factored feature volumes offer a simple way to build more compact, efficient, and interpretable neural fields, but also introduce biases that are not necessarily beneficial for real-world data. In this work, we (1) characterize the undesirable biases that these architectures have for axis-aligned signals -- they can lead to radiance field reconstruction differences of as high as 2 PSNR -- and (2) explore how learning a set of canonicalizing transformations can improve representations by removing these biases. We prove in...

10.1109/iccv51070.2023.00316 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Humans can learn to manipulate new objects by simply watching others; providing robots with the ability to learn from such demonstrations would enable a natural interface specifying new behaviors. This work develops Robot See Robot Do (RSRD), a method for imitating articulated object manipulation from a single monocular RGB human demonstration given a single static multi-view object scan. We first propose 4D Differentiable Part Models (4D-DPM), a method for recovering 3D part motion from a monocular video with differentiable rendering. This analysis-by-synthesis approach...

10.48550/arxiv.2409.18121 preprint EN arXiv (Cornell University) 2024-09-26

We present EgoAllo, a system for human motion estimation from a head-mounted device. Using only egocentric SLAM poses and images, EgoAllo guides sampling from a conditional diffusion model to estimate 3D body pose, height, and hand parameters that capture the wearer's actions in the allocentric coordinate frame of the scene. To achieve this, our key insight is in representation: we propose spatial and temporal invariance criteria for improving model performance, from which we derive a head motion conditioning parameterization that improves estimation by up to 18%....

10.48550/arxiv.2410.03665 preprint EN arXiv (Cornell University) 2024-10-04

We present "Humans and Structure from Motion" (HSfM), a method for jointly reconstructing multiple human meshes, scene point clouds, and camera parameters in a metric world coordinate system from a sparse set of uncalibrated multi-view images featuring people. Our approach combines data-driven scene reconstruction with the traditional Structure-from-Motion (SfM) framework to achieve more accurate scene reconstruction and camera estimation, while simultaneously recovering human meshes. In contrast to existing scene reconstruction and SfM methods that lack metric scale information,...

10.48550/arxiv.2412.17806 preprint EN arXiv (Cornell University) 2024-12-23

Neural Radiance Fields (NeRF) are a rapidly growing area of research with wide-ranging applications in computer vision, graphics, robotics, and more. In order to streamline the development and deployment of NeRF research, we propose a modular PyTorch framework, Nerfstudio. Our framework includes plug-and-play components for implementing NeRF-based methods, which make it easy for researchers and practitioners to incorporate NeRF into their projects. Additionally, the modular design enables support for extensive real-time...

10.48550/arxiv.2302.04264 preprint EN other-oa arXiv (Cornell University) 2023-01-01

This work proposes a minimal computational model for learning structured memories of multiple object classes in an incremental setting. Our approach is based on establishing a closed-loop transcription between the classes and a corresponding set of subspaces, known as a linear discriminative representation, in a low-dimensional feature space. Our method is simpler than existing approaches for incremental learning, and more efficient in terms of model size, storage, and computation: it requires only a single, fixed-capacity autoencoding network with a feature space...

10.48550/arxiv.2202.05411 preprint EN other-oa arXiv (Cornell University) 2022-01-01

This paper proposes an unsupervised method for learning a unified representation that serves both discriminative and generative purposes. While most existing unsupervised learning approaches focus on a representation for only one of these two goals, we show that a unified representation can enjoy the mutual benefits of having both. Such a representation is attainable by generalizing the recently proposed closed-loop transcription framework, known as CTRL, to the unsupervised setting. This entails solving a constrained maximin game over a rate reduction objective that expands features of all samples while...

10.48550/arxiv.2210.16782 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Robots must cost less and be force-controlled to enable widespread, safe deployment in unconstrained human environments. We propose Quasi-Direct Drive actuation as a capable paradigm for robotic force-controlled manipulation in human environments at low cost. Our prototype, Blue, is a human-scale 7 Degree of Freedom arm with a 2kg payload. Blue can cost less than $5000. We show that Blue has dynamic properties that meet or exceed the needs of human operators: the robot has a nominal position-control bandwidth of 7.5Hz and repeatability within 4mm. We demonstrate a Virtual Reality...

10.48550/arxiv.1904.03815 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Leveraging multimodal information with recursive Bayesian filters improves performance and robustness of state estimation, as recursive filters can combine different modalities according to their uncertainties. Prior work has studied how to optimally fuse different sensor modalities with analytical state estimation algorithms. However, deriving the dynamics and measurement models along with their noise profile can be difficult or lead to intractable models. Differentiable filters provide a way to learn these models end-to-end while retaining the algorithmic structure of recursive filters. This...

10.48550/arxiv.2010.13021 preprint EN other-oa arXiv (Cornell University) 2020-01-01

A recent line of work has shown that end-to-end optimization of Bayesian filters can be used to learn state estimators for systems whose underlying models are difficult to hand-design or tune, while retaining the core advantages of probabilistic state estimation. As an alternative approach for state estimation in these settings, we present an end-to-end approach for learning state estimators modeled as factor graph-based smoothers. By unrolling the optimizer we use for maximum a posteriori inference in these probabilistic graphical models, we can learn probabilistic system models in the full context of an overall state estimator, while also taking...

10.48550/arxiv.2105.08257 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Factored feature volumes offer a simple way to build more compact, efficient, and interpretable neural fields, but also introduce biases that are not necessarily beneficial for real-world data. In this work, we (1) characterize the undesirable biases that these architectures have for axis-aligned signals -- they can lead to radiance field reconstruction differences of as high as 2 PSNR -- and (2) explore how learning a set of canonicalizing transformations can improve representations by removing these biases. We prove in...

10.48550/arxiv.2308.15461 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Robots deployed in human-centric environments may need to manipulate a diverse range of articulated objects, such as doors, dishwashers, and cabinets. Articulated objects often come with unexpected articulation mechanisms that are inconsistent with categorical priors: for example, a drawer might rotate about a hinge joint instead of sliding open. We propose a category-independent framework for predicting the articulation models of unknown objects from sequences of RGB-D images. The prediction is performed by a two-step process: first,...

10.48550/arxiv.2205.03721 preprint EN cc-by arXiv (Cornell University) 2022-01-01