- Robot Manipulation and Learning
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Computer Graphics and Visualization Techniques
- Robotics and Sensor-Based Localization
- Reinforcement Learning in Robotics
- Multimodal Machine Learning Applications
- Modular Robots and Swarm Intelligence
- Soft Robotics and Applications
- Optical Measurement and Interference Techniques
- Robotic Mechanisms and Dynamics
- Robotic Path Planning Algorithms
- Cell Image Analysis Techniques
- Generative Adversarial Networks and Image Synthesis
- Remote Sensing and LiDAR Applications
- Human Pose and Action Recognition
- Tactile and Sensory Interactions
- Industrial Vision Systems and Defect Detection
- Advanced Optical Sensing Technologies
- Adversarial Robustness in Machine Learning
- Advanced Neural Network Applications
- Advanced Sensor and Energy Harvesting Materials
- 3D Surveying and Cultural Heritage
University of California, San Diego
2020-2024
UC San Diego Health System
2020
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct radiance fields for view synthesis. Unlike prior works that consider per-scene optimization on densely captured images, we propose a generic deep network that reconstructs radiance fields from only three nearby input views via fast inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for radiance field reconstruction. We train our network on real objects in the DTU...
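The "physically based volume rendering" the abstract refers to is, in NeRF-style methods, a numerical quadrature that composites per-sample densities and colors along each camera ray. A minimal sketch of that standard quadrature (generic, not MVSNeRF's actual network or code):

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Composite per-sample densities/colors along one ray (NeRF-style quadrature).

    sigmas: (N,) volume densities at samples along the ray
    colors: (N, 3) RGB at each sample
    deltas: (N,) distances between adjacent samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)       # opacity of each ray segment
    trans = np.cumprod(1.0 - alphas + 1e-10)      # transmittance after each sample
    trans = np.concatenate([[1.0], trans[:-1]])   # shift: T_i = prod_{j<i}(1 - alpha_j)
    weights = trans * alphas                      # contribution of each sample
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

With a single high-density sample on the ray, the rendered color converges to that sample's color, since its weight approaches 1 while the weights of samples behind it vanish.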
Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training learning algorithms. Our work, SAPIEN, is a physics-rich simulated environment that hosts...
Recent work [28], [5] has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry—represented as a continuous 3D volume—from appearance—represented as a 2D texture map. We achieve this by introducing a 3D-to-2D mapping (or...
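To make the "3D-to-2D mapping" concrete: the idea is a function that sends a 3D surface point to a 2D texture coordinate, so appearance can be edited in the texture image. A hand-crafted spherical parameterization serves as a toy stand-in for the learned mapping the abstract describes (this is an illustration, not the paper's network):

```python
import numpy as np

def sphere_uv(points):
    """Map 3D points to 2D (u, v) texture coordinates via spherical
    parameterization -- a fixed stand-in for a learned 3D-to-2D mapping."""
    p = points / np.linalg.norm(points, axis=-1, keepdims=True)
    u = (np.arctan2(p[..., 1], p[..., 0]) / (2 * np.pi)) % 1.0  # azimuth -> [0, 1)
    v = np.arccos(np.clip(p[..., 2], -1.0, 1.0)) / np.pi        # polar  -> [0, 1]
    return np.stack([u, v], axis=-1)
```

A learned mapping plays the same role but adapts the parameterization to the scene's geometry instead of assuming a sphere-like surface.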
In this article, we focus on the simulation of active stereovision depth sensors, which are popular in both academic and industry communities. Inspired by the underlying sensing mechanism, we designed a fully physics-grounded pipeline that includes material acquisition, ray-tracing-based infrared (IR) image rendering, IR noise simulation, and depth estimation. The pipeline is able to generate depth maps with material-dependent error patterns similar to those of a real sensor in real time. We conduct experiments to show that perception algorithms and reinforcement...
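The depth-estimation stage of any stereo pipeline rests on the triangulation relation z = f·b/d (focal length times baseline over disparity). A minimal sketch of that conversion, with optional Gaussian disparity noise to mimic matcher error (illustrative only — the paper's pipeline models material-dependent IR noise, which this sketch does not):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, noise_std_px=0.0, rng=None):
    """Convert stereo disparity (pixels) to depth (meters): z = f * b / d.

    Perturbing disparity with Gaussian noise reproduces the characteristic
    depth-dependent error growth (roughly ~z^2) of real stereo depth sensors.
    """
    d = np.asarray(disparity, dtype=float)
    if noise_std_px > 0:
        rng = rng or np.random.default_rng(0)
        d = d + rng.normal(0.0, noise_std_px, size=d.shape)
    # Invalid (non-positive) disparities map to depth 0.
    return np.where(d > 0, focal_px * baseline_m / np.maximum(d, 1e-6), 0.0)
```

Because depth is inversely proportional to disparity, a fixed one-pixel matching error corresponds to a much larger depth error far from the camera than near it.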
In this paper, we propose a cloud-based benchmark for robotic grasping and manipulation, called the OCRTOC benchmark. The benchmark focuses on the object rearrangement problem, specifically table organization tasks. We provide a set of identical real robot setups to facilitate remote experiments on standardized scenarios of varying difficulties. In our workflow, users upload their solutions to our server, where the code is executed and scored automatically. After each execution, our team resets the experimental setup manually. We also provide a simulation...
Generalizable manipulation skills, which can be composed to tackle long-horizon and complex daily chores, are one of the cornerstones of Embodied AI. However, existing benchmarks, mostly composed of a suite of simulatable environments, are insufficient to push cutting-edge research works because they lack object-level topological and geometric variations, are not based on fully dynamic simulation, or are short of native support for multiple types of tasks. To this end, we present ManiSkill2, the next generation of the SAPIEN ManiSkill...
Object manipulation from 3D visual inputs poses many challenges for building generalizable perception and policy models. However, assets in existing benchmarks mostly lack the diversity of shapes that align with real-world intra-class complexity in topology and geometry. Here we propose the SAPIEN Manipulation Skill Benchmark (ManiSkill) to benchmark manipulation skills over diverse objects in a full-physics simulator. Objects in ManiSkill include large topological and geometric variations. Tasks are carefully chosen to cover distinct...
Visuotactile sensors can provide rich contact information and thus have great potential in contact-rich manipulation tasks with reinforcement learning (RL) policies. The Sim2Real technique tackles the challenge of RL's reliance on a large amount of interaction data. However, most Sim2Real methods for visuotactile sensors rely on rigid-body physics simulation, which fails to simulate real elastic deformation precisely. Moreover, these methods do not exploit the characteristics of tactile signals when designing the network architecture. In this...
Contrary to the vast literature in modeling, perceiving, and understanding agent-object (e.g., human-object, hand-object, robot-object) interaction in computer vision and robotics, very few past works have studied the task of object-object interaction, which also plays an important role in robotic manipulation and planning tasks. There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, pushing an object using a tool, etc. In this paper, we propose a unified affordance...
Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on affordance learning or other pre-trained models to guide manipulation policies, which face challenges with novel instances in real-world scenarios. In this letter, we propose a part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of policy training. To...
Simulation has enabled unprecedented compute-scalable approaches to robot learning. However, many existing simulation frameworks typically support a narrow range of scenes/tasks and lack features critical for scaling generalizable robotics and sim2real. We introduce the open-source ManiSkill3, the fastest state-visual GPU-parallelized simulator with contact-rich physics targeting manipulation. ManiSkill3 supports GPU parallelization of many aspects including simulation+rendering, heterogeneous simulation,...
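The core pattern behind GPU-parallelized simulators is stepping thousands of environments in lockstep with batched array operations instead of a Python loop per environment. A toy numpy version of that pattern (this illustrates the idea only; it is not the ManiSkill3 API, and the point-mass dynamics are invented for the sketch):

```python
import numpy as np

class BatchedPointMassEnv:
    """Toy batched environment: N 2D point masses stepped in lockstep with
    vectorized ops -- the pattern GPU-parallel simulators scale up."""

    def __init__(self, num_envs, dt=0.05):
        self.num_envs, self.dt = num_envs, dt
        self.pos = np.zeros((num_envs, 2))
        self.vel = np.zeros((num_envs, 2))

    def reset(self, mask=None):
        mask = np.ones(self.num_envs, bool) if mask is None else mask
        self.pos[mask] = 0.0
        self.vel[mask] = 0.0
        return self.pos.copy()

    def step(self, actions):
        # One semi-implicit Euler step for every environment at once.
        self.vel += self.dt * actions
        self.pos += self.dt * self.vel
        dist = np.linalg.norm(self.pos - 1.0, axis=1)  # goal at (1, 1)
        reward = -dist
        done = dist < 0.1
        self.reset(mask=done)  # auto-reset only the finished environments
        return self.pos.copy(), reward, done
```

The per-environment auto-reset via a boolean mask is the detail that lets all environments keep running without synchronizing on the slowest episode; GPU simulators apply the same idea with device-resident tensors.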
The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM). However, achieving similar success in 3D remains a challenge due to issues such as non-unified data formats, lightweight models, and the scarcity of labeled data with diverse masks. To this end, we propose a promptable segmentation model (Point-SAM) focusing on point clouds. Our approach utilizes a transformer-based method, extending SAM to the 3D domain. We leverage part-level and object-level...
We present a method for generating high-quality watertight manifold meshes from multi-view input images. Existing volumetric rendering methods are robust in optimization but tend to generate noisy meshes with poor topology. Differentiable rasterization-based methods can produce high-quality meshes but are sensitive to initialization. Our method combines the benefits of both worlds; we take the geometry initialization obtained from neural fields, and further optimize it as well as a compact texture representation with differentiable rasterizers. Through extensive...
Neural radiance fields with stochasticity have garnered significant interest by enabling the sampling of plausible radiance fields and quantifying uncertainty for downstream tasks. Existing works rely on the independence assumption of points in the radiance field or pixels in input views to obtain tractable forms of the probability density function. However, this assumption inadvertently impacts performance when dealing with intricate geometry and texture. In this work, we propose an independence-assumption-free probabilistic neural radiance field based on Flow-GAN. By combining...
Building robots that can automate labor-intensive tasks has long been a core motivation behind advancements in the computer vision and robotics community. Recent interest in leveraging 3D algorithms, particularly neural fields, has led to advances in robot perception and physical understanding in manipulation scenarios. However, the real world's complexity poses significant challenges. To tackle these challenges, we present Robo360, a dataset that features robotic manipulation with dense view coverage, which enables high-quality...