Improved Fine-Tuning of Large Multimodal Models for Hateful Meme Detection

FOS: Computer and information sciences; Machine Learning (cs.LG); Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

DOI: 10.48550/arxiv.2502.13061
Publication Date: 2025-02-18

AUTHORS (5)

Jingbiao Mei

Jinghong Chen

Guangyu Yang

Weizhe Lin

Bill Byrne

ABSTRACT

Hateful memes have become a significant concern on the Internet, necessitating robust automated detection systems. While large multimodal models (LMMs) have shown strong generalization across various tasks, they exhibit poor generalization to hateful meme detection due to the dynamic nature of memes tied to emerging social trends and breaking news. Recent work further highlights the limitations of conventional supervised fine-tuning for LMMs in this context. To address these challenges, we propose Large Multimodal Model Retrieval-Guided Contrastive Learning (LMM-RGCL), a novel two-stage fine-tuning framework designed to improve both in-domain accuracy and cross-domain generalization. Experimental results on six widely used meme classification datasets demonstrate that LMM-RGCL achieves state-of-the-art performance, outperforming agent-based systems such as VPD-PALI-X-55B. Furthermore, our method effectively generalizes to out-of-domain memes under low-resource settings, surpassing models like GPT-4o.
