Boosting Unsupervised Machine Translation with Pseudo-Parallel Data
Keywords: boosting, parallel corpora, BLEU, baseline
DOI: 10.48550/arxiv.2310.14262
Publication Date: 2023-01-01
ABSTRACT
Even with the latest developments in deep learning and large-scale language modeling, the task of machine translation (MT) for low-resource languages remains a challenge. Neural MT systems can be trained in an unsupervised way without any translation resources, but their quality lags behind, especially in truly low-resource conditions. We propose a training strategy that relies on pseudo-parallel sentence pairs mined from monolingual corpora in addition to synthetic back-translated corpora. We experiment with different training schedules and reach an improvement of up to 14.5 BLEU points (English to Ukrainian) over a baseline trained on back-translated data only.
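The abstract mentions mining pseudo-parallel sentence pairs from monolingual corpora but does not spell out the mining procedure. A common approach for this kind of mining (an assumption here, not something this page confirms as the authors' method) is margin-based scoring over cross-lingual sentence embeddings, e.g. LASER-style encoders. The sketch below illustrates that idea; the function names `margin_scores` and `mine_pairs` are hypothetical, and random vectors stand in for embeddings produced by a real encoder.

```python
import numpy as np

def margin_scores(src_emb, tgt_emb, k=4):
    """Ratio-margin criterion: cosine similarity of each pair divided by
    the average similarity of its two sentences to their k nearest
    neighbours. Pairs that stand out from their neighbourhood score
    above 1 and become pseudo-parallel candidates."""
    # Normalise rows so the dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T  # (n_src, n_tgt) cosine similarities

    # Mean similarity to the k nearest neighbours, in both directions.
    knn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1)  # per source sentence
    knn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0)  # per target sentence

    return sim / (0.5 * (knn_src[:, None] + knn_tgt[None, :]))

def mine_pairs(src_emb, tgt_emb, threshold=1.05, k=4):
    """Keep, for each source sentence, its best-scoring target sentence,
    provided the margin score clears the threshold."""
    scores = margin_scores(src_emb, tgt_emb, k)
    best = scores.argmax(axis=1)
    return [(i, j, scores[i, j]) for i, j in enumerate(best)
            if scores[i, j] >= threshold]

# Toy demo: random vectors as placeholders for real sentence embeddings.
rng = np.random.default_rng(0)
en = rng.normal(size=(100, 32))   # embeddings of English sentences
uk = rng.normal(size=(120, 32))   # embeddings of Ukrainian sentences
print(mine_pairs(en, uk)[:5])
```

In a real pipeline, the mined pairs would then be mixed with synthetic back-translated data during training; the abstract notes that the schedule of that mixing was a subject of experimentation, but the page gives no further detail.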