F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation

FOS: Computer and information sciences — Computation and Language (cs.CL)
DOI: 10.48550/arXiv.2404.04846 Publication Date: 2024-04-07
ABSTRACT
In the evolving landscape of Neural Machine Translation (NMT), the pretrain-then-finetune paradigm has yielded impressive results. However, the persistent challenge of Catastrophic Forgetting (CF) remains a hurdle. While previous work has introduced Continual Learning (CL) methods to address CF, these approaches grapple with the delicate balance between avoiding forgetting and maintaining system extensibility. To address this, we propose a CL method named $\textbf{F-MALLOC}$ ($\textbf{F}$eed-forward $\textbf{M}$emory $\textbf{ALLOC}$ation). F-MALLOC is inspired by recent insights highlighting that feed-forward layers emulate neural memories and encapsulate crucial translation knowledge. It decomposes feed-forward layers into discrete memory cells and allocates them to different tasks. By learning to allocate and safeguard these memories, our method effectively alleviates CF while ensuring robust extendability. Besides, we propose a comprehensive assessment protocol for multi-stage CL in NMT systems. Experiments conducted following this new protocol showcase the superior performance of F-MALLOC, evidenced by higher BLEU scores and almost zero forgetting.
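The core idea sketched in the abstract — treating the hidden units of a feed-forward layer as discrete memory cells, allocating cells to tasks, and protecting cells owned by earlier tasks from later updates — can be illustrated with a minimal toy example. This is a hypothetical sketch, not the paper's actual algorithm: the allocation rule (first free cells), the gradient-masking step, and all names (`owner`, `allocate`, `masked_grad_step`) are illustrative assumptions.

```python
import numpy as np

# Toy feed-forward layer whose HIDDEN units act as allocatable "memory
# cells". Cells claimed by earlier tasks are frozen: gradient steps for
# a task only touch the cells that task owns.
rng = np.random.default_rng(0)
HIDDEN = 8

W1 = rng.standard_normal((4, HIDDEN))  # up-projection (columns = cells)
W2 = rng.standard_normal((HIDDEN, 4))  # down-projection (rows = cells)
owner = np.full(HIDDEN, -1)            # -1 marks a free cell

def allocate(task_id, n_cells):
    """Claim up to n_cells free cells for a task (illustrative rule)."""
    free = np.flatnonzero(owner == -1)[:n_cells]
    owner[free] = task_id
    return free

def masked_grad_step(task_id, gW1, gW2, lr=0.1):
    """Apply a gradient step only on cells owned by this task."""
    trainable = owner == task_id
    W1[:, trainable] -= lr * gW1[:, trainable]
    W2[trainable, :] -= lr * gW2[trainable, :]

allocate(0, 4)  # task 0 claims four cells
allocate(1, 4)  # task 1 claims the remaining four

snapshot = W1.copy()
masked_grad_step(1, np.ones_like(W1), np.ones_like(W2))  # train task 1

# Task 0's cells are untouched: no forgetting on protected memories,
# while task 1's cells did move.
assert np.allclose(W1[:, owner == 0], snapshot[:, owner == 0])
assert not np.allclose(W1[:, owner == 1], snapshot[:, owner == 1])
```

Extendability in this picture corresponds to keeping free (or newly added) cells available for future tasks; the paper's learned allocation would replace the naive first-free rule used here.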