Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation
DOI: 10.48550/arxiv.2406.08092
Publication Date: 2024-06-12
ABSTRACT
Understanding representation transfer in multilingual neural machine translation can reveal the representational issue causing the zero-shot translation deficiency. In this work, we introduce the identity pair, a sentence translated into itself, to address the lack of a base measure in multilingual investigations, since the identity pair represents the optimal state of transfer among languages. In our analysis, we demonstrate that the encoder transfers the source language to the representational subspace of the target language instead of a language-agnostic state. Thus, the zero-shot deficiency arises because representations are entangled with other languages and are not transferred effectively to the target language. Based on these findings, we propose two methods: 1) a low-rank language-specific embedding at the encoder, and 2) language-specific contrastive learning of representations at the decoder. Experimental results on the Europarl-15, TED-19, and OPUS-100 datasets show that our methods substantially enhance zero-shot translation performance by improving the language transfer capacity, thereby providing practical evidence to support our conclusions.