Frequency-Aware Contrastive Learning for Neural Machine Translation

Keywords: Robustness, NIST
DOI: 10.1609/aaai.v36i10.21426
Publication Date: 2022-07-04
ABSTRACT
Low-frequency word prediction remains a challenge in modern neural machine translation (NMT) systems. Recent adaptive training methods promote the output of infrequent words by emphasizing their weights in the overall training objectives. Despite the improved recall of low-frequency words, their prediction precision is unexpectedly hindered by the adaptive objectives. Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective. Specifically, we propose a frequency-aware token-level contrastive learning method, in which the hidden state of each decoding step is pushed away from the counterparts of other target words, in a soft contrastive way based on the corresponding word frequencies. We conduct experiments on the widely used NIST Chinese-English and WMT14 English-German translation tasks. Empirical results show that our proposed method can not only significantly improve translation quality but also enhance lexical diversity and optimize the word representation space. Further investigation reveals that, compared with related adaptive training strategies, the superiority of our method on low-frequency word prediction lies in the robustness of token-level recall across different frequencies without sacrificing precision.
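The core mechanism described above is a token-level contrastive term that repels each decoder hidden state from the hidden states of other target tokens, with the repulsion softly weighted by token frequency. The following is a minimal PyTorch sketch of one plausible instantiation; the function name, the inverse-frequency weighting, and the log1p-based repulsion penalty are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn.functional as F

def frequency_aware_contrastive_loss(hidden, token_ids, token_freqs,
                                     temperature=0.1):
    """A sketch of a frequency-aware token-level contrastive penalty.

    hidden:      (n, d) decoder hidden states, one per target-side step
    token_ids:   (n,)   target token id emitted at each step
    token_freqs: (n,)   corpus frequency of each emitted token
    """
    # Work in cosine-similarity space, as is common for contrastive losses.
    h = F.normalize(hidden, dim=-1)
    sim = (h @ h.t()) / temperature            # (n, n) pairwise similarities

    # Soft frequency weighting (assumption): repulsion from rarer tokens'
    # states is weighted more heavily, counteracting their compact embedding
    # space noted in the abstract.
    inv_freq = 1.0 / token_freqs.float().clamp(min=1.0)
    weights = inv_freq / inv_freq.sum()        # normalize to a distribution
    neg_weights = weights.unsqueeze(0).expand_as(sim)

    # Steps that emit the same token are not treated as negatives of
    # each other (this also masks the diagonal).
    same_token = token_ids.unsqueeze(0) == token_ids.unsqueeze(1)

    # Weighted push-away term: penalize similarity to other tokens' states.
    exp_sim = sim.exp().masked_fill(same_token, 0.0)
    return torch.log1p((neg_weights * exp_sim).sum(dim=-1)).mean()
```

In practice such a term would typically be added to the standard cross-entropy objective with a tunable interpolation coefficient, so the frequency-aware repulsion reshapes the representation space without overriding the translation objective.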