CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

Similarity (geometry)

DOI: 10.48550/arxiv.2303.11797

Publication Date: 2023-01-01

AUTHORS (8)

Seokju Cho

Heeseong Shin

Sunghwan Hong

Seung-Jun An

Seung-Jun Lee

Anurag Arnab

Paul Hongsuck Seo

Seungryong Kim

ABSTRACT

Open-vocabulary semantic segmentation presents the challenge of labeling each pixel within an image based on a wide range of text descriptions. In this work, we introduce a novel cost-based approach to adapt vision-language foundation models, notably CLIP, for the intricate task of semantic segmentation. Through aggregating the cosine similarity score, i.e., the cost volume between image and text embeddings, our method potently adapts CLIP for segmenting seen and unseen classes by fine-tuning its encoders, addressing the challenges faced by existing methods in handling unseen classes. Building upon this, we explore methods to effectively aggregate the cost volume, considering its multi-modal nature of being established between image and text embeddings. Furthermore, we examine various methods for efficiently fine-tuning CLIP.
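
As a rough illustration of the cost volume mentioned in the abstract, the sketch below computes the cosine similarity between every per-pixel image embedding and every class text embedding, then assigns each pixel the best-matching class. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names (`l2_normalize`, `cosine_cost_volume`, `argmax_labels`) are hypothetical, the embeddings are hand-written toy values rather than CLIP outputs, and the aggregation and fine-tuning steps described in the abstract are not shown.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against zero vectors)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def cosine_cost_volume(image_emb, text_emb):
    """Return cost[i][j][n]: cosine similarity of pixel (i, j) and class n.

    image_emb: H x W x D nested lists (dense per-pixel image embeddings)
    text_emb:  N x D nested lists (one embedding per class description)
    """
    text_unit = [l2_normalize(t) for t in text_emb]
    cost = []
    for row in image_emb:
        cost_row = []
        for pixel in row:
            p = l2_normalize(pixel)
            # Dot product of unit vectors equals cosine similarity.
            cost_row.append([sum(a * b for a, b in zip(p, t)) for t in text_unit])
        cost.append(cost_row)
    return cost


def argmax_labels(cost):
    """Pick the highest-similarity class index for each pixel."""
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in row]
        for row in cost
    ]


# Toy example (assumed values): a 2 x 2 "image" with D = 2 and N = 2 classes.
image_emb = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.1, 0.9], [0.0, 2.0]],
]
text_emb = [[2.0, 0.0], [0.0, 1.0]]

cost = cosine_cost_volume(image_emb, text_emb)
labels = argmax_labels(cost)
```

Taking the argmax directly over the raw cost volume, as done here, is only a baseline; per the abstract, the method aggregates the cost volume before making predictions.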
