Roman Urdu Hate Speech Detection Using Transformer-Based Model for Cyber Security Applications

Transfer of learning
DOI: 10.3390/s23083909 Publication Date: 2023-04-12T06:08:11Z
ABSTRACT
Social media applications, such as Twitter and Facebook, allow users to communicate and share their thoughts, status updates, opinions, photographs, and videos around the globe. Unfortunately, some people utilize these platforms to disseminate hate speech and abusive language. The growth of hate speech may result in hate crimes, cyber violence, and substantial harm to cyberspace, physical security, and social safety. As a result, hate speech detection is a critical issue for both cyberspace and society, necessitating the development of a robust application capable of detecting and combating it in real time. Hate speech is a context-dependent problem that requires context-aware mechanisms for its resolution. In this study, we employed a transformer-based model for Roman Urdu hate speech classification due to its ability to capture the context of text. In addition, we developed the first Roman Urdu pre-trained BERT model, which we named BERT-RU. For this purpose, we exploited the capabilities of BERT by training it from scratch on the largest Roman Urdu dataset, consisting of 173,714 text messages. Traditional deep learning models were used as baseline models, including LSTM, BiLSTM, BiLSTM + Attention Layer, and CNN. We also investigated the concept of transfer learning by using pre-trained BERT embeddings in conjunction with these deep learning models. The performance of each model was evaluated in terms of accuracy, precision, recall, and F-measure, and generalization was assessed on a cross-domain dataset. The experimental results revealed that the transformer-based model, when directly applied to the task of Roman Urdu hate speech classification, outperformed traditional machine learning and deep learning models in terms of accuracy, precision, recall, and F-measure, with scores of 96.70%, 97.25%, 96.74%, and 97.89%, respectively. The transformer-based model also exhibited superior generalization on the cross-domain dataset.
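To make the described pipeline concrete, the following is a minimal sketch (not the authors' code) of fine-tuning a BERT-style encoder for binary Roman Urdu hate speech classification with the Hugging Face Transformers library and computing the accuracy, precision, recall, and F-measure reported above. The checkpoint name "bert-ru" and the CSV file name are hypothetical placeholders; the paper's BERT-RU model was pre-trained from scratch on 173,714 Roman Urdu messages.

import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support, accuracy_score
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class HateSpeechDataset(Dataset):
    """Tokenized Roman Urdu messages with binary hate/neutral labels."""
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.enc = tokenizer(list(texts), truncation=True, padding=True,
                             max_length=max_len)
        self.labels = list(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

def metrics(eval_pred):
    # Accuracy, precision, recall, and F-measure, as used in the paper's evaluation.
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    p, r, f1, _ = precision_recall_fscore_support(labels, preds, average="binary")
    return {"accuracy": accuracy_score(labels, preds),
            "precision": p, "recall": r, "f1": f1}

# Hypothetical dataset file with "text" and "label" columns (1 = hate, 0 = neutral).
df = pd.read_csv("roman_urdu_hate_speech.csv")
train_df, val_df = train_test_split(df, test_size=0.2, stratify=df["label"])

tokenizer = AutoTokenizer.from_pretrained("bert-ru")  # placeholder for a BERT-RU checkpoint
model = AutoModelForSequenceClassification.from_pretrained("bert-ru", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-ru-hate",
                           num_train_epochs=3,
                           per_device_train_batch_size=16,
                           evaluation_strategy="epoch"),
    train_dataset=HateSpeechDataset(train_df["text"], train_df["label"], tokenizer),
    eval_dataset=HateSpeechDataset(val_df["text"], val_df["label"], tokenizer),
    compute_metrics=metrics,
)
trainer.train()
print(trainer.evaluate())

The same skeleton covers the transfer learning baselines described in the abstract: the pre-trained encoder's embeddings can be frozen and fed into an LSTM, BiLSTM, BiLSTM + Attention, or CNN head instead of the standard classification layer.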