Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

DOI: 10.48550/arxiv.2404.11317 Publication Date: 2024-04-17
ABSTRACT
The Composed Image Retrieval (CIR) task aims to retrieve target images using a composed query consisting of a reference image and a modifying text. Advanced methods often utilize contrastive learning as the optimization objective, which benefits from adequate positive and negative examples. However, constructing triplets for CIR incurs high manual annotation costs, resulting in limited positive examples. Furthermore, existing methods commonly use in-batch negative sampling, which reduces the number of negatives available to the model. To address the lack of positives, we propose a data generation method that leverages a multi-modal large language model to construct triplets for CIR. To introduce more negatives during fine-tuning, we design a two-stage fine-tuning framework for CIR, whose second stage introduces plenty of static representations of negatives to optimize the representation space rapidly. The above two improvements can be effectively stacked, and both are designed to be plug-and-play, easily applied to existing CIR models without changing their original architectures. Extensive experiments and ablation analysis demonstrate that our method effectively scales positives and negatives and achieves state-of-the-art results on both the FashionIQ and CIRR datasets. In addition, our method also performs well in zero-shot composed image retrieval, providing a new solution for the low-resource scenario.
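To make the second-stage idea concrete, below is a minimal NumPy sketch of an InfoNCE-style contrastive loss in which the usual in-batch negatives are augmented with a pool of precomputed ("static") negative embeddings. This is an illustrative reconstruction based only on the abstract, not the authors' released code; the function name, shapes, and temperature value are assumptions.

```python
import numpy as np

def info_nce_with_static_negatives(query, target, static_neg, temperature=0.07):
    """Contrastive loss over in-batch negatives plus extra static negatives.

    query:      (B, D) composed-query embeddings (reference image + text)
    target:     (B, D) target-image embeddings; row i is the positive for query i
    static_neg: (M, D) cached, frozen embeddings acting as additional negatives
    """
    def l2norm(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    q, t, n = l2norm(query), l2norm(target), l2norm(static_neg)

    # Similarity logits: (B, B) in-batch block, then (B, M) static-negative block.
    logits = np.concatenate([q @ t.T, q @ n.T], axis=1) / temperature

    # Log-softmax with max-subtraction for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for query i sits at column i of the in-batch block.
    idx = np.arange(q.shape[0])
    return -log_probs[idx, idx].mean()
```

Because the static negatives are precomputed once and reused, enlarging M adds only a cheap matrix product per step rather than extra encoder forward passes, which is what allows the negative pool to scale far beyond the batch size.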