Pretrained Language Models for Document-Level Neural Machine Translation

FOS: Computer and information sciences; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; Computation and Language (cs.CL)
DOI: 10.48550/arxiv.1911.03110
Publication Date: 2019-01-01
ABSTRACT
Previous work on document-level NMT usually focuses on limited contexts because of degraded performance on larger contexts. In this paper, we investigate using large contexts with three main contributions: (1) Different from previous work which pretrained models on large-scale sentence-level parallel corpora, we use pretrained language models, specifically BERT, which are trained on monolingual documents; (2) We propose context manipulation methods to control the influence of large contexts, which lead to comparable results between systems using small and large contexts; (3) We introduce a multi-task training for regularization to avoid models overfitting our training corpora, which further improves our systems together with a deeper encoder. Experiments are conducted on the widely used IWSLT data sets with three language pairs, i.e., Chinese--English, French--English and Spanish--English. Results show that our systems are significantly better than previously reported systems.
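To make the abstract's second contribution more concrete, the sketch below shows one plausible way to inject document context from a pretrained language model (e.g., BERT) into an NMT encoder while controlling its influence through a learned gate. This is a minimal illustration under our own assumptions, not the paper's architecture; the class name GatedContextFusion, the gating formulation, and all tensor shapes are hypothetical.

```python
# Hedged sketch: gated fusion of sentence-encoder states with states from a
# pretrained context encoder (e.g., BERT over surrounding document sentences).
# The gate lets the model scale the context's influence per position.
import torch
import torch.nn as nn


class GatedContextFusion(nn.Module):
    """Fuse source-sentence states with document-context states via a gate."""

    def __init__(self, d_model: int):
        super().__init__()
        # Cross-attention: source-sentence positions attend to context states.
        self.ctx_attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        # Gate deciding, per position, how much context to let through.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, sent_states: torch.Tensor, ctx_states: torch.Tensor) -> torch.Tensor:
        # sent_states: (batch, src_len, d_model) from the NMT sentence encoder
        # ctx_states:  (batch, ctx_len, d_model) from the pretrained LM
        ctx_summary, _ = self.ctx_attn(sent_states, ctx_states, ctx_states)
        g = torch.sigmoid(self.gate(torch.cat([sent_states, ctx_summary], dim=-1)))
        # Gated residual: as g approaches 0, the context-agnostic sentence
        # model is recovered, which is one way to keep large contexts from
        # degrading performance.
        return sent_states + g * ctx_summary


if __name__ == "__main__":
    fusion = GatedContextFusion(d_model=512)
    sent = torch.randn(2, 20, 512)   # encoded source sentences
    ctx = torch.randn(2, 128, 512)   # encoded surrounding document context
    print(fusion(sent, ctx).shape)   # torch.Size([2, 20, 512])
```

The gated residual is a common device in context-aware NMT for exactly the trade-off the abstract describes: it lets systems with large contexts fall back to sentence-level behavior when the extra context is unhelpful.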