BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model
Keywords: Offensive language identification, SemEval
DOI: 10.18653/v1/s19-2099
Publication Date: 2019-07-21
AUTHORS (5)
ABSTRACT
In this study, we address the problem of identifying and categorizing offensive language in social media. Our group, BNU-HKBU UIC NLP Team 2, uses supervised classification on multiple versions of the data generated by different pre-processing methods. We then apply the state-of-the-art model Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2018), to capture linguistic, syntactic, and semantic features. Long-range dependencies between the parts of a sentence can be captured by BERT's bidirectional encoder representations. The results show 85.12% accuracy and 80.57% F1 score in Subtask A (offensive language identification), 87.92% and 50% in Subtask B (categorization of offense types), and 69.95% and 50.47% in Subtask C (offense target identification). Analysis shows that distinguishing targeted from untargeted offenses is not a simple task. More work needs to be done on the class imbalance in Subtasks B and C. Some directions for future work are also discussed.
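The abstract mentions that multiple versions of the dataset were generated by different pre-processing methods before fine-tuning BERT. The paper's exact steps are not given here, so the following is only a minimal sketch of typical tweet pre-processing for this task; the function name `preprocess_tweet` and the specific normalization rules (mention masking, URL masking, hashtag stripping) are illustrative assumptions, not taken from the paper.

```python
import re

def preprocess_tweet(text: str) -> str:
    """Illustrative tweet cleaning for offensive-language classification.

    These steps are assumptions for demonstration; the paper only states
    that several dataset versions came from different pre-processing.
    """
    text = text.lower()                          # normalize case
    text = re.sub(r"@\w+", "@USER", text)        # mask user mentions
    text = re.sub(r"https?://\S+", "URL", text)  # mask links
    text = re.sub(r"#(\w+)", r"\1", text)        # drop hashtag marker, keep word
    text = re.sub(r"\s+", " ", text).strip()     # collapse whitespace
    return text

print(preprocess_tweet("@JohnDoe check this https://t.co/abc #Fail"))
```

Each variant of such a pipeline (e.g., with or without hashtag stripping) would yield a different pre-processed dataset version for fine-tuning.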