ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

DOI: 10.1609/aaai.v39i8.32908 Publication Date: 2025-04-11T11:28:35Z
ABSTRACT
Controversial content largely inundates the Internet, infringing various cultural norms and child-protection standards. Traditional Image Content Moderation (ICM) models fall short in producing precise moderation decisions for diverse standards, while recent multimodal large language models (MLLMs), when adopted for general rule-based ICM, often produce classification and explanation results that are inconsistent with human moderators. Aiming at flexible, explainable, and accurate ICM, we design a novel dataset generation pipeline that decomposes concise human-defined rules and leverages well-designed multi-stage prompts to enrich explicit image annotations. The resulting ICM-Instruct dataset includes detailed moderation Q-A pairs. Built upon it, we create our ICM-Assistant model within the framework of rule-based ICM, making it readily applicable in real practice. ICM-Assistant demonstrates exceptional performance and flexibility: it significantly outperforms existing approaches on diverse sources, improving both moderation classification (36.8% on average) and explanation quality (26.6% on average) consistently over existing MLLMs. Caution: this paper may contain examples of offensive or harmful images.
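The rule-decomposition idea in the abstract, turning a concise human-defined moderation rule into fine-grained sub-questions and pairing them with image annotations to form Q-A training data, can be sketched as below. All names, the aspect list, and the answer template are illustrative assumptions, not the authors' actual pipeline or prompts.

```python
# Hypothetical sketch of rule-based Q-A pair generation for instruction tuning.
# In the real pipeline, an MLLM with multi-stage prompts would expand the rule
# and ground each answer in a rich image annotation; here we use simple
# string templates to show the data shape only.

def decompose_rule(rule: str, aspects: list[str]) -> list[str]:
    """Expand one concise moderation rule into per-aspect sub-questions."""
    return [
        f"Does the image violate the rule '{rule}' with respect to {aspect}?"
        for aspect in aspects
    ]

def build_qa_pairs(rule: str, aspects: list[str], annotation: str) -> list[dict]:
    """Pair each sub-question with an explanation grounded in the annotation."""
    return [
        {
            "question": question,
            # A generation model would produce the full explanation here.
            "answer": f"Judging from the annotation '{annotation}', ...",
        }
        for question in decompose_rule(rule, aspects)
    ]

pairs = build_qa_pairs(
    rule="no depiction of violence",
    aspects=["weapons", "injuries"],
    annotation="a person holding a kitchen knife while cooking",
)
print(len(pairs))  # one Q-A pair per decomposed aspect
```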
SUPPLEMENTAL MATERIAL
Coming soon ....