Segment Everything Everywhere All at Once
DOI:
10.48550/arxiv.2304.06718
Publication Date:
2023-01-01
AUTHORS (7)
ABSTRACT
In this work, we present SEEM, a promptable and interactive model for segmenting everything everywhere all at once in an image, as shown in Fig. 1. We propose a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks, aiming at a universal segmentation interface that behaves like large language models (LLMs). More specifically, SEEM is designed with four desiderata: i) Versatility. We introduce a new visual prompt to unify different spatial queries including points, boxes, scribbles, and masks, which can further generalize to a different referring image; ii) Compositionality. We learn a joint visual-semantic space between text and visual prompts, which facilitates the dynamic composition of the two prompt types required for various segmentation tasks; iii) Interactivity. We incorporate learnable memory prompts into the decoder to retain segmentation history through mask-guided cross-attention from image features; iv) Semantic-awareness. We use a text encoder to encode text queries and mask labels into the same semantic space for open-vocabulary segmentation. We conduct a comprehensive empirical study to validate the effectiveness of SEEM across diverse segmentation tasks. Notably, our single model achieves competitive performance across interactive segmentation, generic segmentation, referring segmentation, and video object segmentation on 9 datasets with a minimum of 1/100 supervision. Furthermore, SEEM showcases a remarkable capacity to generalize to novel prompts or their combinations, rendering it a readily universal segmentation interface.
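The versatility desideratum above hinges on one idea: heterogeneous spatial prompts (points, boxes, scribbles, masks) can all be rasterized onto the image grid and pooled into query vectors that live in a single space, so the decoder handles them uniformly. The sketch below illustrates that idea only; all names, shapes, and the mean-pooling step are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Toy stand-ins for the model's feature map; values are illustrative.
GRID = 16   # hypothetical feature-map resolution
DIM = 8     # hypothetical embedding dimension

rng = np.random.default_rng(0)
FEATURES = rng.standard_normal((GRID, GRID, DIM))  # stand-in image features

def visual_prompt(mask: np.ndarray) -> np.ndarray:
    """Pool image features under a binary spatial mask into one query vector."""
    assert mask.shape == (GRID, GRID)
    if mask.sum() == 0:
        return np.zeros(DIM)
    return FEATURES[mask.astype(bool)].mean(axis=0)

def point_to_mask(y: int, x: int) -> np.ndarray:
    """Rasterize a single click onto the grid."""
    m = np.zeros((GRID, GRID))
    m[y, x] = 1
    return m

def box_to_mask(y0: int, x0: int, y1: int, x1: int) -> np.ndarray:
    """Rasterize a bounding box onto the grid."""
    m = np.zeros((GRID, GRID))
    m[y0:y1, x0:x1] = 1
    return m

# A click and a box both become queries of the same shape, so a decoder
# can attend to either (or both) without prompt-type-specific branches.
q_point = visual_prompt(point_to_mask(4, 4))
q_box = visual_prompt(box_to_mask(2, 2, 7, 7))
print(q_point.shape, q_box.shape)  # both (DIM,)
```

Because every spatial prompt reduces to the same query format, composing it with a text query (encoded into the same joint space, per desideratum ii) is a matter of concatenating queries rather than adding new model heads.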