LLM meets Vision-Language Models for Zero-Shot One-Class Classification

DOI: 10.48550/arxiv.2404.00675 Publication Date: 2024-03-31
ABSTRACT
We consider the problem of zero-shot one-class visual classification. In this setting, only the label of the target class is available, and the goal is to discriminate between positive and negative query samples without requiring any validation example from the target task. We propose a two-step solution that first queries large language models for visually confusing objects and then relies on vision-language pre-trained models (e.g., CLIP) to perform classification. By adapting large-scale vision benchmarks, we demonstrate the ability of the proposed method to outperform adapted off-the-shelf alternatives in this setting. Namely, we propose a realistic benchmark where negative query samples are drawn from the same original dataset as positive ones, including a granularity-controlled version of iNaturalist, where negative samples are at a fixed distance in the taxonomy tree from the positive ones. Our work shows that it is possible to discriminate between a single category and other semantically related ones using only its label.
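The abstract describes a two-step pipeline: (1) prompt a large language model for objects that are visually confusing with the target class, then (2) use CLIP to score a query image against the target label and those confusing labels. The following is a minimal Python sketch of step (2), not the authors' implementation: the model name, the hardcoded confuser list (standing in for the LLM query of step (1)), the prompt template, and the is_positive helper with its 0.5 threshold are all illustrative assumptions.

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Assumed model choice; any CLIP checkpoint would work here.
    MODEL_NAME = "openai/clip-vit-base-patch32"
    model = CLIPModel.from_pretrained(MODEL_NAME)
    processor = CLIPProcessor.from_pretrained(MODEL_NAME)

    target = "golden retriever"
    # In the paper's pipeline these negatives come from an LLM prompt along
    # the lines of "List objects visually confusing with a golden retriever";
    # they are hardcoded here for illustration.
    confusers = ["labrador retriever", "irish setter", "cocker spaniel"]
    prompts = [f"a photo of a {c}" for c in [target] + confusers]

    def is_positive(image: Image.Image, threshold: float = 0.5) -> bool:
        """Classify an image as target class vs. LLM-suggested confusers."""
        inputs = processor(text=prompts, images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            # logits_per_image has shape (1, num_prompts).
            logits = model(**inputs).logits_per_image
        probs = logits.softmax(dim=-1)[0]
        # probs[0] is the softmax mass assigned to the target prompt.
        return probs[0].item() >= threshold

Scoring the target prompt against LLM-suggested confusers, rather than against a single generic negative prompt, is the core idea: the softmax forces CLIP to separate the target class from its most plausible look-alikes.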