ViHOS: Hate Speech Spans Detection for Vietnamese
Offensive
Vietnamese
DOI:
10.18653/v1/2023.eacl-main.47
Publication Date:
2023-09-09T20:54:31Z
AUTHORS (5)
ABSTRACT
The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms. This could make it difficult for human moderators to review tagged comments filtered by classification systems. To help address this issue, we present the ViHOS (Vietnamese Hate and Offensive Spans) dataset, the first human-annotated corpus containing 26k spans on 11k comments. We also provide definitions of hateful and offensive spans in Vietnamese comments as well as detailed annotation guidelines. Besides, we conduct experiments with various state-of-the-art models. Specifically, XLM-R_Large achieved the best F1-scores in Single span detection and All spans detection, while PhoBERT_Large obtained the highest in Multiple spans detection. Finally, our error analysis demonstrates the difficulties in detecting specific types of spans in our data for future research. Our dataset is released on GitHub.
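The abstract reports F1-scores for span detection but this record does not say how they are computed. As a rough illustration only, the sketch below scores predicted spans against gold spans by character-offset overlap, a convention common in span detection tasks. The function names and the (start, end) span format with an exclusive end are assumptions made for this example, not the authors' evaluation code.

```python
def spans_to_offsets(spans):
    """Expand (start, end) pairs, end exclusive, into a set of character offsets."""
    return {i for start, end in spans for i in range(start, end)}


def char_span_f1(predicted, gold):
    """F1 between two collections of character offsets (hypothetical helper)."""
    pred, ref = set(predicted), set(gold)
    if not pred and not ref:
        return 1.0  # nothing to find and nothing predicted: treat as perfect
    if not pred or not ref:
        return 0.0  # one side is empty, so there can be no overlap
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# A comment with two gold spans, of which the model finds only the first.
gold = spans_to_offsets([(0, 4), (10, 14)])
predicted = spans_to_offsets([(0, 4)])
score = char_span_f1(predicted, gold)
```

Here precision is 1.0 and recall is 0.5, so the score is 2/3. Averaging such per-comment scores over a test set is one plausible way to obtain a corpus-level figure.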
CITATIONS (9)