ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD

FOS: Computer and information sciences; Computation and Language (cs.CL); Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
DOI: 10.26615/978-954-452-072-4_005
Publication Date: 2021-11-06
ABSTRACT
Using pre-trained transformer models such as BERT has proven effective in many NLP tasks. This paper presents our work on fine-tuning BERT models for Arabic Word Sense Disambiguation (WSD). We treated the WSD task as a sentence-pair binary classification task. First, we constructed a dataset of labeled Arabic context-gloss pairs (~167k pairs) extracted from the Arabic Ontology and the large lexicographic database available at Birzeit University. Each pair was labeled as True or False, and the target word in each context was identified and annotated. Second, we used this dataset to fine-tune three pre-trained Arabic BERT models. Third, we experimented with different supervised signals to emphasize target words in context. Our experiments achieved promising results (84% accuracy) despite the large set of senses used.
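To make the setup concrete, below is a minimal sketch (not the authors' released code) of WSD as sentence-pair binary classification: a context, with its target word marked, is paired with a candidate gloss, and a BERT classifier predicts True/False. The checkpoint name, the [TGT] marker token, and the Arabic example pair are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of context-gloss pair classification with an Arabic BERT.
# Assumptions (not from the paper): the checkpoint name, the [TGT] marker
# used as a supervised signal, and the example context/gloss pair.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

MODEL_NAME = "aubmindlab/bert-base-arabertv02"  # assumed Arabic BERT checkpoint

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
# Add a marker token to emphasize the target word in the context.
tokenizer.add_special_tokens({"additional_special_tokens": ["[TGT]"]})

model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.resize_token_embeddings(len(tokenizer))  # account for the new marker token

# One labeled context-gloss pair; the ambiguous target word is wrapped in markers.
context = "ذهب إلى [TGT] عين [TGT] الماء في الصحراء"  # "he went to the water spring in the desert"
gloss = "ينبوع يخرج منه الماء من الأرض"               # candidate gloss: "a spring from which water flows"
label = torch.tensor([1])                              # 1 = True (gloss matches the intended sense)

# Encode as a sentence pair: [CLS] context [SEP] gloss [SEP]
inputs = tokenizer(context, gloss, return_tensors="pt",
                   truncation=True, max_length=128)

outputs = model(**inputs, labels=label)
print(outputs.loss, outputs.logits.softmax(dim=-1))
```

At inference time, one would score the context against each candidate gloss of the target word and select the gloss with the highest True probability.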