NFDI4DS | UHH-SEMS - Publication Details

ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model

Vietnamese

DOI: 10.48550/arxiv.2405.14141
Publication Date: 2024-05-22

AUTHORS (1)

Luan Thanh Nguyen

ABSTRACT

Recent advancements in hate speech detection (HSD) in Vietnamese have made significant progress, primarily attributed to the emergence of transformer-based pre-trained language models, particularly those built on the BERT architecture. However, the necessity for specialized fine-tuned models has resulted in complexity and fragmentation when developing a multitasking HSD system. Moreover, most current methodologies focus on fine-tuning general models trained on formal textual datasets like Wikipedia, which may not accurately capture human behavior on online platforms. In this research, we introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. By harnessing the power of a text-to-text architecture, ViHateT5 can tackle multiple tasks using a unified model and achieve state-of-the-art performance across all standard HSD benchmarks in Vietnamese. Our experiments also underscore the significance of label distribution in pre-training data for model efficacy. We provide our experimental materials for research purposes, including the VOZ-HSD dataset, the pre-trained checkpoint, the HSD-multitask model, and related source code, publicly on GitHub.
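
The unified text-to-text idea in the abstract can be illustrated with a minimal sketch: every HSD task is cast as "prefixed input text in, label text out", so one sequence-to-sequence model can serve all tasks. The task names, prefix strings, and labels below are hypothetical placeholders chosen for illustration; they are not taken from the ViHateT5 paper or its repository.

```python
from dataclasses import dataclass

# Hypothetical task prefixes (assumption, for illustration only).
TASK_PREFIXES = {
    "classification": "hate-speech-detection",
    "toxicity": "toxic-speech-detection",
    "spans": "hate-spans-detection",
}


@dataclass(frozen=True)
class Seq2SeqExample:
    """One training pair for a text-to-text model."""
    source: str  # task prefix + input text
    target: str  # expected output, always plain text


def to_text_to_text(task: str, text: str, label: str) -> Seq2SeqExample:
    """Cast a (task, text, label) triple into a single source/target pair."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task!r}")
    source = f"{TASK_PREFIXES[task]}: {text.strip()}"
    return Seq2SeqExample(source=source, target=label.strip())


# Examples from different tasks share one format, so they can be mixed
# in a single training set for a single model.
mixed_batch = [
    to_text_to_text("classification", "example comment", "CLEAN"),
    to_text_to_text("toxicity", "example comment", "NONE"),
]
```

Because the task is encoded in the input string rather than in a task-specific output head, supporting a further task in this sketch only requires a new prefix entry.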
