Fanbo Xiang

ORCID: 0009-0005-5335-873X
Research Areas
  • Robot Manipulation and Learning
  • Advanced Vision and Imaging
  • 3D Shape Modeling and Analysis
  • Computer Graphics and Visualization Techniques
  • Robotics and Sensor-Based Localization
  • Reinforcement Learning in Robotics
  • Multimodal Machine Learning Applications
  • Modular Robots and Swarm Intelligence
  • Soft Robotics and Applications
  • Optical Measurement and Interference Techniques
  • Robotic Mechanisms and Dynamics
  • Robotic Path Planning Algorithms
  • Cell Image Analysis Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Remote Sensing and LiDAR Applications
  • Human Pose and Action Recognition
  • Tactile and Sensory Interactions
  • Industrial Vision Systems and Defect Detection
  • Advanced Optical Sensing Technologies
  • Adversarial Robustness in Machine Learning
  • Advanced Neural Network Applications
  • Advanced Sensor and Energy Harvesting Materials
  • 3D Surveying and Cultural Heritage

Affiliations
  • University of California, San Diego (2020-2024)
  • UC San Diego Health System (2020)

Publications

We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct radiance fields for view synthesis. Unlike prior works that consider per-scene optimization on densely captured images, we propose a generic deep network that reconstructs radiance fields from only three nearby input views via fast network inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for radiance field reconstruction. We train our network on real objects in the DTU...

10.1109/iccv48922.2021.01386 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
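As background for the plane-swept cost volume mentioned above: per-view 2D features are warped onto fronto-parallel depth hypotheses of a reference view and aggregated across views, commonly with a variance cost. A standard multi-view-stereo formulation (generic notation, not taken from the paper itself):

```latex
C(\mathbf{x}, z) \;=\; \operatorname{Var}_{i \in \{1,\dots,N\}}\!\big( F_i(\pi_i(\mathbf{x}, z)) \big)
```

where $\pi_i(\mathbf{x}, z)$ projects the 3D point at depth $z$ along the reference ray through pixel $\mathbf{x}$ into view $i$, and $F_i$ is the 2D feature map of input view $i$.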

Building home assistant robots has long been a goal for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements for simulation with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training learning algorithms. Our work, SAPIEN, is a physics-rich simulated environment that hosts...

10.1109/cvpr42600.2020.01111 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01
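As an illustration of how such a physics-rich environment is typically driven, here is a minimal sketch, assuming the SAPIEN 2.x Python API (the URDF path is a placeholder; an articulated asset such as one from PartNet-Mobility is assumed):

```python
# Minimal SAPIEN sketch (assumes the SAPIEN 2.x Python API; paths are placeholders).
import sapien.core as sapien

engine = sapien.Engine()                  # physics engine
renderer = sapien.SapienRenderer()        # rendering backend
engine.set_renderer(renderer)

scene = engine.create_scene()
scene.set_timestep(1 / 240)               # physics step size in seconds
scene.add_ground(altitude=0)

# Load an articulated object (e.g., a cabinet) as a kinematic chain from URDF.
loader = scene.create_urdf_loader()
cabinet = loader.load("path/to/mobility.urdf")  # placeholder path

for _ in range(240):                      # simulate one second of physics
    scene.step()
```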

Recent work [28], [5] has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry—represented as a continuous 3D volume—from appearance—represented as a 2D texture map. We achieve this by introducing a 3D-to-2D texture mapping (or...

10.1109/cvpr46437.2021.00704 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
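In symbols, the disentanglement described above factors the color of a 3D point $\mathbf{x}$ through a learned surface parameterization $\varphi: \mathbb{R}^3 \to [0,1]^2$ and a 2D texture $T$, i.e. $c(\mathbf{x}) = T(\varphi(\mathbf{x}))$, composited with the standard discrete volume-rendering estimate (generic notation, shown as an illustration rather than the paper's exact formulation):

```latex
C(\mathbf{r}) = \sum_{i} T_i \left(1 - e^{-\sigma_i \delta_i}\right) c_i,
\qquad T_i = \exp\Big(-\sum_{j<i} \sigma_j \delta_j\Big)
```

where $\sigma_i$, $c_i$, and $\delta_i$ are the density, color, and step size at the $i$-th sample along ray $\mathbf{r}$.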

In this article, we focus on the simulation of active stereovision depth sensors, which are popular in both academic and industry communities. Inspired by the underlying mechanism of the sensors, we designed a fully physics-grounded pipeline that includes material acquisition, ray-tracing-based infrared (IR) image rendering, IR noise simulation, and depth estimation. The pipeline is able to generate depth maps with material-dependent error patterns similar to those of a real depth sensor in real time. We conduct experiments to show that perception algorithms and reinforcement...

10.1109/tro.2023.3235591 article EN IEEE Transactions on Robotics 2023-01-27
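For context, active stereo sensors triangulate depth from the disparity between two IR cameras. The familiar pinhole-stereo relation below (standard, not specific to this article) shows why small disparity errors from material-dependent IR reflectance translate into depth error that grows quadratically with distance:

```latex
z = \frac{f\,b}{d}, \qquad
\left|\frac{\partial z}{\partial d}\right| = \frac{f\,b}{d^2} = \frac{z^2}{f\,b}
```

with focal length $f$ (in pixels), baseline $b$, and disparity $d$.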

In this paper, we propose a cloud-based benchmark for robotic grasping and manipulation, called the OCRTOC benchmark. The benchmark focuses on the object rearrangement problem, specifically table organization tasks. We provide a set of identical real robot setups to facilitate remote experiments on standardized scenarios in varying difficulties. In our workflow, users upload their solutions to our server, where the code is executed and scored automatically. After each execution, our team resets the experimental setup manually. We also provide simulation...

10.1109/lra.2021.3129136 article EN IEEE Robotics and Automation Letters 2021-11-18

Generalizable manipulation skills, which can be composed to tackle long-horizon and complex daily chores, are one of the cornerstones of Embodied AI. However, existing benchmarks, mostly composed of a suite of simulatable environments, are insufficient to push cutting-edge research works because they lack object-level topological and geometric variations, are not based on fully dynamic simulation, or are short of native support for multiple types of manipulation tasks. To this end, we present ManiSkill2, the next generation of the SAPIEN ManiSkill...

10.48550/arxiv.2302.04659 preprint EN cc-by arXiv (Cornell University) 2023-01-01
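Benchmarks of this kind typically expose their tasks through Gym/Gymnasium-style environments. A minimal sketch, assuming the `mani_skill2` package and its registered environment IDs (the specific ID and kwargs below are typical examples and may differ by version):

```python
# Minimal ManiSkill2-style loop (env ID and kwargs are illustrative examples).
import gymnasium as gym
import mani_skill2.envs  # noqa: F401  (importing registers the environments)

env = gym.make("PickCube-v0", obs_mode="rgbd", control_mode="pd_ee_delta_pose")
obs, info = env.reset(seed=0)
for _ in range(100):
    action = env.action_space.sample()   # random-policy placeholder
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```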

Object manipulation from 3D visual inputs poses many challenges on building generalizable perception and policy models. However, assets in existing benchmarks mostly lack the diversity of shapes that align with real-world intra-class complexity in topology and geometry. Here we propose the SAPIEN Manipulation Skill Benchmark (ManiSkill) to benchmark manipulation skills over diverse objects in a full-physics simulator. Assets in ManiSkill include large topological and geometric variations. Tasks are carefully chosen to cover distinct...

10.48550/arxiv.2107.14483 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Visuotactile sensors can provide rich contact information, having great potential in contact-rich manipulation tasks with reinforcement learning (RL) policies. The Sim2Real technique tackles the challenge of RL's reliance on a large amount of interaction data. However, most simulation methods for visuotactile sensors rely on rigid-body physics simulation, which fails to simulate real elastic deformation precisely. Moreover, these methods do not exploit the characteristics of tactile signals in designing the network architecture. In this...

10.1109/tro.2024.3352969 article EN cc-by-nc-nd IEEE Transactions on Robotics 2024-01-01

Contrary to the vast literature in modeling, perceiving, and understanding agent-object (e.g., human-object, hand-object, robot-object) interaction in computer vision and robotics, very few past works have studied the task of object-object interaction, which also plays an important role in robotic manipulation and planning tasks. There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, pushing an object using a tool, etc. In this paper, we propose a unified affordance...

10.48550/arxiv.2106.15087 preprint EN other-oa arXiv (Cornell University) 2021-01-01

We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct radiance fields for view synthesis. Unlike prior works that consider per-scene optimization on densely captured images, we propose a generic deep network that reconstructs radiance fields from only three nearby input views via fast network inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for radiance field reconstruction. We train our network on real objects in the DTU...

10.48550/arxiv.2103.15595 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on affordance learning or other pre-trained models to guide manipulation policies, which face challenges with novel instances in real-world scenarios. In this letter, we propose a part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of policy training. To...

10.1109/lra.2023.3313063 article EN IEEE Robotics and Automation Letters 2023-09-07

Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements for simulation with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training learning algorithms. Our work, SAPIEN, is a physics-rich simulated environment that hosts...

10.48550/arxiv.2003.08515 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Simulation has enabled unprecedented compute-scalable approaches to robot learning. However, many existing simulation frameworks typically support a narrow range of scenes/tasks and lack features critical for scaling generalizable robotics and sim2real. We introduce and open-source ManiSkill3, the fastest state-visual GPU-parallelized simulator with contact-rich physics targeting generalizable manipulation. ManiSkill3 supports GPU parallelization of many aspects, including simulation+rendering, heterogeneous simulation,...

10.48550/arxiv.2410.00425 preprint EN arXiv (Cornell University) 2024-10-01
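The GPU parallelization described above is usually surfaced as a batched environment: a single `gym.make` call simulates many scenes at once and returns batched tensors. A minimal sketch, assuming the `mani_skill` package (env ID and kwargs are illustrative and may differ by version):

```python
# GPU-parallel ManiSkill3-style sketch; num_envs batches many scenes on the GPU,
# so observations and rewards come back batched (env ID/kwargs are illustrative).
import gymnasium as gym
import mani_skill.envs  # noqa: F401  (importing registers the environments)

env = gym.make("PickCube-v1", num_envs=1024, obs_mode="state")
obs, info = env.reset(seed=0)
for _ in range(100):
    action = env.action_space.sample()   # batched random actions
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```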

In this paper, we propose a cloud-based benchmark for robotic grasping and manipulation, called the OCRTOC benchmark. The benchmark focuses on the object rearrangement problem, specifically table organization tasks. We provide a set of identical real robot setups to facilitate remote experiments on standardized scenarios in varying difficulties. In our workflow, users upload their solutions to our server, where the code is executed and scored automatically. After each execution, our team resets the experimental setup manually. We also provide simulation...

10.48550/arxiv.2104.11446 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on affordance learning or other pre-trained models to guide manipulation policies, which face challenges with novel instances in real-world scenarios. In this paper, we propose a part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of policy training. To...

10.48550/arxiv.2404.17302 preprint EN arXiv (Cornell University) 2024-04-26

The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM). However, achieving similar success in 3D remains a challenge due to issues such as non-unified data formats, lightweight models, and the scarcity of labeled data with diverse masks. To this end, we propose a promptable model (Point-SAM) focusing on point clouds. Our approach utilizes a transformer-based method, extending SAM to the 3D domain. We leverage part-level and object-level...

10.48550/arxiv.2406.17741 preprint EN arXiv (Cornell University) 2024-06-25
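To make the "promptable" idea concrete, the sketch below shows the general shape of such an interface: encode the point cloud once, then decode one mask per prompt point. All names here are hypothetical illustrations, not the released Point-SAM API:

```python
# Hypothetical promptable point-cloud segmentation interface, in the spirit of
# Point-SAM; class and method names are illustrative, not the released API.
import torch

class PromptablePointSegmenter(torch.nn.Module):
    """Encodes a point cloud once, then decodes a mask per point prompt."""
    def __init__(self, encoder: torch.nn.Module, decoder: torch.nn.Module):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, points: torch.Tensor, prompts: torch.Tensor) -> torch.Tensor:
        # points: (N, 3) xyz coordinates; prompts: (K, 3) clicked seed points.
        tokens = self.encoder(points)            # (N, D) per-point features
        logits = self.decoder(tokens, prompts)   # (K, N) one mask per prompt
        return logits.sigmoid() > 0.5            # boolean masks
```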

We present a method for generating high-quality watertight manifold meshes from multi-view input images. Existing volumetric rendering methods are robust in optimization but tend to generate noisy meshes with poor topology. Differentiable rasterization-based methods can produce high-quality meshes but are sensitive to initialization. Our method combines the benefits of both worlds; we take the geometry initialization obtained from neural volumetric fields, and further optimize it as well as a compact texture representation with differentiable rasterizers. Through extensive...

10.48550/arxiv.2305.17134 preprint EN other-oa arXiv (Cornell University) 2023-01-01
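Schematically, the second stage described above amounts to optimizing mesh vertices and a compact texture against the input images through a differentiable rasterizer. A generic objective of this kind (illustrative notation, not the paper's exact loss):

```latex
\min_{V,\,\theta} \; \sum_{c} \big\lVert \mathcal{R}(V, \theta;\, c) - I_c \big\rVert_1
```

where $V$ are the mesh vertices initialized from the neural field, $\theta$ the texture parameters, $\mathcal{R}$ the differentiable rasterizer, and $I_c$ the input image from camera $c$.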

In this paper, we focus on the simulation of active stereovision depth sensors, which are popular in both academic and industry communities. Inspired by the underlying mechanism of the sensors, we designed a fully physics-grounded pipeline that includes material acquisition, ray-tracing-based infrared (IR) image rendering, IR noise simulation, and depth estimation. The pipeline is able to generate depth maps with material-dependent error patterns similar to those of a real depth sensor in real time. We conduct experiments to show that perception algorithms and reinforcement...

10.48550/arxiv.2201.11924 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Neural radiance fields with stochasticity have garnered significant interest by enabling the sampling of plausible radiance fields and quantifying uncertainty for downstream tasks. Existing works rely on the independence assumption of points in the radiance field or of pixels in input views to obtain tractable forms of the probability density function. However, this assumption inadvertently impacts performance when dealing with intricate geometry and texture. In this work, we propose an independence-assumption-free probabilistic neural radiance field based on Flow-GAN. By combining...

10.48550/arxiv.2309.16364 preprint EN cc-by arXiv (Cornell University) 2023-01-01
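For background, the "flow" half refers to normalizing flows, which avoid per-point independence assumptions by modeling a joint density exactly via the change-of-variables formula (a standard result, independent of this paper's specific architecture):

```latex
\log p_X(\mathbf{x}) = \log p_Z\big(f(\mathbf{x})\big)
  + \log \left|\det \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right|
```

where $f$ is an invertible network mapping data $\mathbf{x}$ to a simple base variable $\mathbf{z} = f(\mathbf{x})$.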

Building robots that can automate labor-intensive tasks has long been the core motivation behind advancements in the computer vision and robotics community. Recent interest in leveraging 3D algorithms, particularly neural fields, has led to advances in robot perception and physical understanding in manipulation scenarios. However, the real world's complexity poses significant challenges. To tackle these challenges, we present Robo360, a dataset that features robotic manipulation with dense view coverage, which enables high-quality...

10.48550/arxiv.2312.06686 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Recent work has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry--represented as a continuous 3D volume--from appearance--represented as a 2D texture map. We achieve this by introducing a 3D-to-2D texture mapping (or surface...

10.48550/arxiv.2103.00762 preprint EN other-oa arXiv (Cornell University) 2021-01-01