OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation

DOI: 10.48550/arxiv.2309.00616 Publication Date: 2023-01-01
ABSTRACT
Current 3D open-vocabulary scene understanding methods mostly utilize well-aligned 2D images as the bridge to learn 3D features with language. However, applying these approaches becomes challenging in scenarios where 2D images are absent. In this work, we introduce a new pipeline, namely, OpenIns3D, which requires no 2D image inputs, for 3D open-vocabulary scene understanding at the instance level. The OpenIns3D framework employs a "Mask-Snap-Lookup" scheme. The "Mask" module learns class-agnostic mask proposals from 3D point clouds. The "Snap" module generates synthetic scene-level images at multiple scales and leverages 2D vision-language models to extract interesting objects. The "Lookup" module searches through the outcomes of "Snap" with the help of Mask2Pixel maps, which contain the precise correspondence between 3D masks and synthetic images, to assign category names to the proposed masks. This 2D-input-free and flexible approach achieves state-of-the-art results on a wide range of indoor and outdoor datasets by a large margin. Moreover, it allows effortless switching of 2D detectors without re-training. When integrated with powerful open-world 2D models such as ODISE and GroundingDINO, excellent results were observed on open-vocabulary instance segmentation. With LLM-powered 2D models like LISA, it demonstrates a remarkable capacity to process highly complex text queries that require intricate reasoning and world knowledge. Project page: https://zheninghuang.github.io/OpenIns3D/
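The "Lookup" step described above can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, data layout, and overlap threshold below are assumptions for illustration only. The idea: each 3D mask proposal comes with a Mask2Pixel map (the pixels it projects to in a synthetic snapshot), and each 2D detection from the vision-language model provides a labeled pixel mask; a proposal is assigned the label of the detection that best covers its projected pixels.

```python
import numpy as np

def lookup_categories(mask2pixel, detections, min_overlap=0.25):
    """Hypothetical sketch of the Lookup step.

    mask2pixel: dict mapping mask_id -> boolean (H, W) array of the
                pixels that the 3D mask projects to in a snapshot.
    detections: list of (label, boolean (H, W) array) pairs produced
                by a 2D open-vocabulary detector on the same snapshot.
    Returns a dict mask_id -> label (None if nothing overlaps enough).
    """
    labels = {}
    for mask_id, proj in mask2pixel.items():
        n_proj = proj.sum()
        best_label, best_score = None, min_overlap
        for label, det in detections:
            if n_proj == 0:
                break  # mask is not visible in this snapshot
            # fraction of the projected mask covered by this detection
            score = np.logical_and(proj, det).sum() / n_proj
            if score > best_score:
                best_label, best_score = label, score
        labels[mask_id] = best_label
    return labels
```

In the full pipeline, scores would be aggregated across snapshots at multiple scales before the final label is chosen; the single-image version above only shows the matching logic.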