The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning

DOI: 10.18653/v1/2023.acl-long.529 Publication Date: 2023-08-05T00:57:42Z
ABSTRACT
Multilingual semantic parsing aims to leverage the knowledge from high-resource languages to improve low-resource semantic parsing, yet commonly suffers from the data imbalance problem. Prior works propose to utilize translations by either humans or machines to alleviate such issues. However, human translations are expensive, while machine translations are cheap but prone to error and bias. In this work, we propose an active learning approach that exploits the strengths of both human and machine translations by iteratively adding small batches of human translations into the machine-translated training set. Besides, we propose novel aggregated acquisition criteria that help our active learning method select utterances to be manually translated. Our experiments demonstrate that an ideal utterance selection can significantly reduce the error and bias in the translated data, resulting in higher parser accuracies than parsers merely trained on the machine-translated data.
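The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `active_translation_loop`, its parameters, and the `score` callback are hypothetical, and the acquisition criterion is left as a pluggable function because the abstract does not specify how the aggregated criteria are computed. A real system would also retrain the parser between rounds.

```python
from typing import Callable, Dict, Set, Tuple


def active_translation_loop(
    machine_translated: Dict[str, str],
    human_translate: Callable[[str], str],
    score: Callable[[str, str], float],
    rounds: int,
    batch_size: int,
) -> Tuple[Dict[str, str], Set[str]]:
    """Replace machine translations with human ones in small batches.

    machine_translated: utterance id -> machine-translated utterance.
    human_translate:    hypothetical oracle returning a human translation.
    score:              hypothetical acquisition function; a higher score
                        means the utterance is more worth translating.
    """
    train = dict(machine_translated)  # start from the machine-translated set
    human_ids: Set[str] = set()       # utterances already human-translated

    for _ in range(rounds):
        candidates = [uid for uid in train if uid not in human_ids]
        if not candidates:
            break
        # Assumption: a real system would retrain the parser here so that
        # the scores reflect the current model.
        ranked = sorted(
            candidates, key=lambda uid: score(uid, train[uid]), reverse=True
        )
        for uid in ranked[:batch_size]:
            train[uid] = human_translate(uid)
            human_ids.add(uid)

    return train, human_ids
```

In this sketch the annotation budget is `rounds * batch_size` human translations; every other utterance keeps its machine translation.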