- Topic Modeling
- Natural Language Processing Techniques
- Handwritten Text Recognition Techniques
- Speech Recognition and Synthesis
- Text Readability and Simplification
Thammasat University
2021-2024
Several methodologies have recently been proposed to enhance the performance of low-resource Neural Machine Translation (NMT). However, these techniques have yet to be explored thoroughly for the Thai and Myanmar languages. Therefore, we first applied data augmentation techniques such as SwitchOut and Ciphertext-Based Data Augmentation (CipherDAug) to improve NMT performance. Second, we enhanced NMT by fine-tuning a pre-trained Multilingual Denoising BART model (mBART), where BART denotes Bidirectional Auto-Regressive Transformer. We implemented three...
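The mBART fine-tuning mentioned above can be sketched in a few lines. This is a minimal sketch assuming the Hugging Face transformers library and the public mBART-50 checkpoint; the Thai-English sentence pair, language codes, and bare `loss.backward()` call are illustrative placeholders, not the paper's actual training setup.

```python
# Minimal sketch: one fine-tuning step of mBART-50 for Thai -> English.
# The checkpoint name and language codes follow the Hugging Face hub;
# the sentence pair below is a toy placeholder.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

name = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(name, src_lang="th_TH", tgt_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained(name)

batch = tokenizer(["สวัสดีครับ"], text_target=["Hello."],
                  return_tensors="pt", padding=True)
loss = model(**batch).loss  # cross-entropy against the English reference
loss.backward()             # gradients for one step of an ordinary training loop
```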
In this paper, we report the experimental results of Machine Translation models developed by a NECTEC team for the translation tasks of WAT-2021. Our models are based on neural methods for both directions of the English-Myanmar and Myanmar-English language pairs. Most existing Neural Machine Translation (NMT) models mainly focus on the conversion of sequential data and do not directly use syntactic information. However, we conduct multi-source machine translation using multilingual corpora such as a string corpus, a tree corpus, or a POS-tagged corpus. This is an approach to...
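As a toy illustration of how a POS-tagged corpus can feed a string-based encoder in a multi-source setup like the one above, one common trick interleaves each token with its tag; the tag inventory and the interleaving format here are assumptions, not the paper's actual scheme.

```python
def interleave_pos(tokens, tags):
    """Append each token's POS tag as an extra symbol so that a plain
    sequence-to-sequence encoder can see the syntactic information."""
    return " ".join(f"{tok} <{tag}>" for tok, tag in zip(tokens, tags))

print(interleave_pos(["I", "eat", "rice"], ["PRON", "VERB", "NOUN"]))
# -> "I <PRON> eat <VERB> rice <NOUN>"
```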
To improve the data resources for the low-resource English-Myanmar-Thai language pairs, we built the first parallel medical corpus, named the En-My-Th corpus, which is composed of a total of 14,592 sentences. In our paper, we run experiments on the English-Myanmar pair of the new En-My-Th corpus and, in addition, on the English-Thai and Thai-Myanmar pairs from the existing ASEAN-MT corpus. The SwitchOut augmentation algorithm and a baseline attention-based sequence-to-sequence model are trained on the aforementioned pairs in both directions. Experimental results show that...
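A minimal sketch of the SwitchOut idea (Wang et al., 2018) used above: sample how many positions to corrupt from a tempered distribution, then replace those positions with words drawn uniformly from the vocabulary. The temperature value, the toy sentence, and the tiny vocabulary are illustrative choices, not the paper's configuration.

```python
import math
import random

def switchout(tokens, vocab, tau=1.0):
    """Simplified SwitchOut: sample a corruption count k from a tempered
    distribution p(k) ~ exp(-k / tau), then overwrite k random positions
    with words drawn uniformly from the vocabulary."""
    n = len(tokens)
    weights = [math.exp(-k / tau) for k in range(n + 1)]
    k = random.choices(range(n + 1), weights=weights)[0]
    out = list(tokens)
    for i in random.sample(range(n), k):
        out[i] = random.choice(vocab)
    return out

# Toy usage: augment one tokenized sentence.
print(switchout("the patient has a fever".split(),
                vocab=["doctor", "cough", "mild", "acute"]))
```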
Transformers are the current state-of-the-art type of neural network model for dealing with sequences. Evidently, the most prominent application of these models is in text processing tasks, notably machine translation. Recently, transformer-based models such as the Edit-Based Transformer with Repositioning (EDITOR) and the Levenshtein Transformer (LevT) have become popular. To the best of our knowledge, there are no experiments comparing the two using under-resourced languages. In this paper, we compared the performance and decoding time of the EDITOR and LevT models. We conducted...
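For readers unfamiliar with edit-based decoding, the sketch below shows the iterative refinement loop that LevT-style models use (delete, insert placeholders, fill tokens). The `model.delete`, `model.insert_placeholders`, and `model.fill_tokens` methods are hypothetical wrappers around the model's three classifier heads, not a real library API.

```python
def levt_decode(model, src, max_iter=10):
    """Iterative edit-based decoding: start from an empty hypothesis and
    repeatedly delete tokens, insert placeholders, and fill them in,
    stopping when an iteration makes no further edits."""
    hyp = []
    for _ in range(max_iter):
        prev = hyp
        hyp = model.delete(src, hyp)               # deletion head
        hyp = model.insert_placeholders(src, hyp)  # insertion head (predicts [PLH] counts)
        hyp = model.fill_tokens(src, hyp)          # token prediction head
        if hyp == prev:  # converged: the edit policy proposed no changes
            break
    return hyp
```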