NFDI4DS | UHH-SEMS - Publication Details

Leveraging Image-Text Similarity and Caption Modification for the DataComp Challenge: Filtering Track and BYOD Track


DOI: 10.48550/arxiv.2310.14581

Publication Date: 2023-01-01


AUTHORS (6)

Shuhei Yokoo

Peifei Zhu

Yuchi Ishikawa

M. Tanaka

Masayoshi Kondo

Hirokatsu Kataoka

ABSTRACT

Large web crawl datasets have already played an important role in learning multimodal features with high generalization capabilities. However, there are still very limited studies investigating the details or improvements of data design. Recently, the DataComp challenge has been designed to find the best training data for fixed models. This paper presents our solution to both the filtering track and the BYOD track of the challenge. Our solution adopts the large models CLIP and BLIP-2 to filter and modify data, and utilizes external data along with a bag of tricks to improve data quality. Experiments show that our solution significantly outperforms the baselines (filtering track: 6.6% improvement, BYOD track: 48.5% improvement).
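The abstract does not spell out the filtering procedure, so the following is only a minimal sketch of what image-text similarity filtering can look like in general. It assumes that image and caption embeddings have already been computed by a CLIP-style model and are available as plain lists of floats. The function names, the toy data, and the `keep_fraction` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def filter_by_similarity(pairs, keep_fraction=0.3):
    """Keep the ids of the most similar image-text pairs.

    pairs: list of (sample_id, image_embedding, text_embedding).
    keep_fraction: share of samples to keep (hypothetical value,
    chosen for illustration only).
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not pairs:
        return []
    scored = [(cosine_similarity(img, txt), sid) for sid, img, txt in pairs]
    # Highest similarity first.
    scored.sort(key=lambda item: item[0], reverse=True)
    n_keep = max(1, math.ceil(len(scored) * keep_fraction))
    return [sid for _, sid in scored[:n_keep]]


# Toy embeddings (assumed, 2-dimensional for readability).
toy_pairs = [
    ("a", [1.0, 0.0], [1.0, 0.0]),   # identical direction
    ("b", [1.0, 0.0], [0.0, 1.0]),   # orthogonal
    ("c", [1.0, 1.0], [1.0, 0.0]),   # 45 degrees apart
    ("d", [1.0, 0.0], [-1.0, 0.0]),  # opposite direction
]
kept = filter_by_similarity(toy_pairs, keep_fraction=0.5)
print(kept)
```

In a real pipeline the embeddings would come from a CLIP image encoder and text encoder, and the cutoff would be tuned on the downstream evaluation rather than fixed in advance.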
