- Robot Manipulation and Learning
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Computer Graphics and Visualization Techniques
- Robotics and Sensor-Based Localization
- Reinforcement Learning in Robotics
- Multimodal Machine Learning Applications
- Modular Robots and Swarm Intelligence
- Soft Robotics and Applications
- Optical Measurement and Interference Techniques
- Robotic Mechanisms and Dynamics
- Robotic Path Planning Algorithms
- Cell Image Analysis Techniques
- Generative Adversarial Networks and Image Synthesis
- Remote Sensing and LiDAR Applications
- Human Pose and Action Recognition
- Tactile and Sensory Interactions
- Industrial Vision Systems and Defect Detection
- Advanced Optical Sensing Technologies
- Adversarial Robustness in Machine Learning
- Advanced Neural Network Applications
- Advanced Sensor and Energy Harvesting Materials
- 3D Surveying and Cultural Heritage
University of California, San Diego
2020-2024
UC San Diego Health System
2020
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct radiance fields for view synthesis. Unlike prior works that consider per-scene optimization on densely captured images, we propose a generic deep network that reconstructs radiance fields from only three nearby input views via fast inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for radiance field reconstruction. We train our network on real objects in the DTU...
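The "physically based volume rendering" the abstract refers to is, in NeRF-style methods, a numerical quadrature that composites per-sample densities and colors along each camera ray. A minimal sketch of that standard quadrature (generic, not MVSNeRF's actual network or code):

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Composite per-sample densities/colors along one ray (NeRF-style quadrature).

    sigmas: (N,) volume densities at samples along the ray
    colors: (N, 3) RGB at each sample
    deltas: (N,) distances between adjacent samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)       # opacity of each ray segment
    trans = np.cumprod(1.0 - alphas + 1e-10)      # transmittance after each sample
    trans = np.concatenate([[1.0], trans[:-1]])   # shift: T_i = prod_{j<i}(1 - alpha_j)
    weights = trans * alphas                      # contribution of each sample
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

With a single high-density sample on the ray, the rendered color converges to that sample's color, since its weight approaches 1 while the weights of samples behind it vanish.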
Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training learning algorithms. Our work, SAPIEN, is a physics-rich simulated environment that hosts...
Recent work [28], [5] has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry—represented as a continuous 3D volume—from appearance—represented as a 2D texture map. We achieve this by introducing a 3D-to-2D mapping (or...
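To make the "3D-to-2D mapping" concrete: the idea is a function that sends a 3D surface point to a 2D texture coordinate, so appearance can be edited in the texture image. A hand-crafted spherical parameterization serves as a toy stand-in for the learned mapping the abstract describes (this is an illustration, not the paper's network):

```python
import numpy as np

def sphere_uv(points):
    """Map 3D points to 2D (u, v) texture coordinates via spherical
    parameterization -- a fixed stand-in for a learned 3D-to-2D mapping."""
    p = points / np.linalg.norm(points, axis=-1, keepdims=True)
    u = (np.arctan2(p[..., 1], p[..., 0]) / (2 * np.pi)) % 1.0  # azimuth -> [0, 1)
    v = np.arccos(np.clip(p[..., 2], -1.0, 1.0)) / np.pi        # polar  -> [0, 1]
    return np.stack([u, v], axis=-1)
```

A learned mapping plays the same role but adapts the parameterization to the scene's geometry instead of assuming a sphere-like surface.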
In this article, we focus on the simulation of active stereovision depth sensors, which are popular in both academic and industry communities. Inspired by the underlying sensing mechanism, we designed a fully physics-grounded pipeline that includes material acquisition, ray-tracing-based infrared (IR) image rendering, IR noise simulation, and depth estimation. The pipeline is able to generate depth maps with material-dependent error patterns similar to those of a real sensor in real time. We conduct experiments to show that perception algorithms and reinforcement...
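The depth-estimation stage of any stereo pipeline rests on the triangulation relation z = f·b/d (focal length times baseline over disparity). A minimal sketch of that conversion, with optional Gaussian disparity noise to mimic matcher error (illustrative only — the paper's pipeline models material-dependent IR noise, which this sketch does not):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, noise_std_px=0.0, rng=None):
    """Convert stereo disparity (pixels) to depth (meters): z = f * b / d.

    Perturbing disparity with Gaussian noise reproduces the characteristic
    depth-dependent error growth (roughly ~z^2) of real stereo depth sensors.
    """
    d = np.asarray(disparity, dtype=float)
    if noise_std_px > 0:
        rng = rng or np.random.default_rng(0)
        d = d + rng.normal(0.0, noise_std_px, size=d.shape)
    # Invalid (non-positive) disparities map to depth 0.
    return np.where(d > 0, focal_px * baseline_m / np.maximum(d, 1e-6), 0.0)
```

Because depth is inversely proportional to disparity, a fixed one-pixel matching error corresponds to a much larger depth error far from the camera than near it.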
In this paper, we propose a cloud-based benchmark for robotic grasping and manipulation, called the OCRTOC benchmark. The benchmark focuses on the object rearrangement problem, specifically table organization tasks. We provide a set of identical real robot setups to facilitate remote experiments on standardized scenarios of varying difficulties. In our workflow, users upload their solutions to our server, where the code is executed and scored automatically. After each execution, our team resets the experimental setup manually. We also provide a simulation...
Generalizable manipulation skills, which can be composed to tackle long-horizon and complex daily chores, are one of the cornerstones of Embodied AI. However, existing benchmarks, mostly composed of a suite of simulatable environments, are insufficient to push cutting-edge research works because they lack object-level topological and geometric variations, are not based on fully dynamic simulation, or are short of native support for multiple types of tasks. To this end, we present ManiSkill2, the next generation of the SAPIEN ManiSkill...
Object manipulation from 3D visual inputs poses many challenges for building generalizable perception and policy models. However, assets in existing benchmarks mostly lack the diversity of shapes that align with real-world intra-class complexity in topology and geometry. Here we propose the SAPIEN Manipulation Skill Benchmark (ManiSkill) to benchmark manipulation skills over diverse objects in a full-physics simulator. Objects in ManiSkill include large topological and geometric variations. Tasks are carefully chosen to cover distinct...
Visuotactile sensors can provide rich contact information and thus have great potential in contact-rich manipulation tasks with reinforcement learning (RL) policies. The Sim2Real technique tackles the challenge of RL's reliance on a large amount of interaction data. However, most Sim2Real methods for visuotactile sensors rely on rigid-body physics simulation, which fails to simulate real elastic deformation precisely. Moreover, these methods do not exploit the characteristics of tactile signals when designing the network architecture. In this...
Contrary to the vast literature in modeling, perceiving, and understanding agent-object (e.g., human-object, hand-object, robot-object) interaction in computer vision and robotics, very few past works have studied the task of object-object interaction, which also plays an important role in robotic manipulation and planning tasks. There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, pushing an object using a tool, etc. In this paper, we propose a unified affordance...
Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on affordance learning or other pre-trained models to guide manipulation policies, which face challenges with novel instances in real-world scenarios. In this letter, we propose a part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of policy training. To...
Simulation has enabled unprecedented compute-scalable approaches to robot learning. However, many existing simulation frameworks typically support a narrow range of scenes/tasks and lack features critical for scaling generalizable robotics and sim2real. We introduce the open-source ManiSkill3, the fastest state-visual GPU-parallelized simulator with contact-rich physics targeting manipulation. ManiSkill3 supports GPU parallelization of many aspects including simulation+rendering, heterogeneous simulation,...
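The core pattern behind GPU-parallelized simulators is stepping thousands of environments in lockstep with batched array operations instead of a Python loop per environment. A toy numpy version of that pattern (this illustrates the idea only; it is not the ManiSkill3 API, and the point-mass dynamics are invented for the sketch):

```python
import numpy as np

class BatchedPointMassEnv:
    """Toy batched environment: N 2D point masses stepped in lockstep with
    vectorized ops -- the pattern GPU-parallel simulators scale up."""

    def __init__(self, num_envs, dt=0.05):
        self.num_envs, self.dt = num_envs, dt
        self.pos = np.zeros((num_envs, 2))
        self.vel = np.zeros((num_envs, 2))

    def reset(self, mask=None):
        mask = np.ones(self.num_envs, bool) if mask is None else mask
        self.pos[mask] = 0.0
        self.vel[mask] = 0.0
        return self.pos.copy()

    def step(self, actions):
        # One semi-implicit Euler step for every environment at once.
        self.vel += self.dt * actions
        self.pos += self.dt * self.vel
        dist = np.linalg.norm(self.pos - 1.0, axis=1)  # goal at (1, 1)
        reward = -dist
        done = dist < 0.1
        self.reset(mask=done)  # auto-reset only the finished environments
        return self.pos.copy(), reward, done
```

The per-environment auto-reset via a boolean mask is the detail that lets all environments keep running without synchronizing on the slowest episode; GPU simulators apply the same idea with device-resident tensors.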
The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM). However, achieving similar success in 3D remains a challenge due to issues such as non-unified data formats, lightweight models, and the scarcity of labeled data with diverse masks. To this end, we propose a promptable segmentation model (Point-SAM) focusing on point clouds. Our approach utilizes a transformer-based method, extending SAM to the 3D domain. We leverage part-level and object-level...
We present a method for generating high-quality watertight manifold meshes from multi-view input images. Existing volumetric rendering methods are robust in optimization but tend to generate noisy meshes with poor topology. Differentiable rasterization-based methods can produce high-quality meshes but are sensitive to initialization. Our method combines the benefits of both worlds; we take the geometry initialization obtained from neural fields, and further optimize it as well as a compact texture representation with differentiable rasterizers. Through extensive...
Neural radiance fields with stochasticity have garnered significant interest by enabling the sampling of plausible radiance fields and quantifying uncertainty for downstream tasks. Existing works rely on the independence assumption of points in the radiance field or pixels in input views to obtain tractable forms of the probability density function. However, this assumption inadvertently impacts performance when dealing with intricate geometry and texture. In this work, we propose an independence-assumption-free probabilistic neural radiance field based on Flow-GAN. By combining...
Building robots that can automate labor-intensive tasks has long been a core motivation behind advancements in the computer vision and robotics community. Recent interest in leveraging 3D algorithms, particularly neural fields, has led to advances in robot perception and physical understanding in manipulation scenarios. However, the real world's complexity poses significant challenges. To tackle these challenges, we present Robo360, a dataset that features robotic manipulation with dense view coverage, which enables high-quality...