Improving neural machine translation with POS-tag features for low-resource language pairs

Keywords: Neural machine translation; Linguistic features; Part-of-speech; Universal part-of-speech; Transformer architecture
Article type: Research Article
DOI: 10.1016/j.heliyon.2022.e10375
Publication Date: 2022-08-22
ABSTRACT
The integration of linguistic features has been widely utilized in statistical machine translation (SMT) systems, resulting in improved translation quality. However, for low-resource languages such as Thai and Myanmar, the integration of linguistic features into neural machine translation (NMT) systems has yet to be implemented. In this study, we propose transformer-based NMT models (transformer, multi-source transformer, and shared-multi-source transformer models) that use linguistic features for two-way translation between Thai and Myanmar, Myanmar and English, and Thai and English. Part-of-speech (POS) tags or universal part-of-speech (UPOS) tags are added to each word on either the source or target side, or on both sides, and experiments with the proposed models are conducted. The multi-source and shared-multi-source transformer models take two inputs (i.e., string data and string data with POS tags) and produce string data with POS tags. A transformer model that utilizes only word vectors was used as the first baseline for comparison with the proposed models. As a second baseline, an Edit-Based Transformer with Repositioning (EDITOR) model was also used to compare against our proposed models in addition to the baseline transformer model. The findings of the experiments show that adding POS-tag features enhances the translation performance of a low-resource language pair. Moreover, the best results were yielded with more significant Bilingual Evaluation Understudy (BLEU) scores and character n-gram F-scores (chrF) than the EDITOR model.
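To make the general idea concrete, the sketch below shows one common way to attach POS/UPOS tags to input tokens: embed words and tags separately and concatenate the vectors before a standard Transformer encoder. This is a minimal illustration in PyTorch, not the authors' implementation; the class names (FactoredEmbedding, PosAwareEncoder), dimensions, and vocabulary sizes are assumptions, and positional encoding is omitted for brevity.

```python
# Hypothetical sketch of POS-tag feature integration (not the paper's code):
# words and POS tags are embedded separately and concatenated per position,
# then fed to a vanilla Transformer encoder.
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    """Embeds a (word, POS-tag) pair per token and merges the two vectors."""
    def __init__(self, vocab_size, tag_vocab_size, d_model=512, d_tag=64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model - d_tag)
        self.tag_emb = nn.Embedding(tag_vocab_size, d_tag)

    def forward(self, word_ids, tag_ids):
        # Concatenation keeps the total width equal to d_model.
        return torch.cat([self.word_emb(word_ids), self.tag_emb(tag_ids)], dim=-1)

class PosAwareEncoder(nn.Module):
    """Standard Transformer encoder consuming factored (word + tag) inputs."""
    def __init__(self, vocab_size, tag_vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = FactoredEmbedding(vocab_size, tag_vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, word_ids, tag_ids):
        # Positional encoding is omitted here to keep the sketch short.
        return self.encoder(self.embed(word_ids, tag_ids))

# Toy usage: a batch of 2 sentences, 5 tokens each, with made-up vocab sizes.
enc = PosAwareEncoder(vocab_size=8000, tag_vocab_size=20)
words = torch.randint(0, 8000, (2, 5))
tags = torch.randint(0, 20, (2, 5))
print(enc(words, tags).shape)  # torch.Size([2, 5, 512])
```

The multi-source variants described in the abstract would instead run two such encoders (one over plain strings, one over POS-tagged strings) and let the decoder attend to both; the concatenation approach above is only the simplest single-encoder case.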