XNLIeu: a dataset for cross-lingual NLI in Basque
FOS: Computer and information sciences
Computer Science - Computation and Language
Artificial Intelligence (cs.AI)
Computer Science - Artificial Intelligence
Computation and Language (cs.CL)
DOI:
10.48550/arxiv.2404.06996
Publication Date:
2024-04-10
AUTHORS (6)
ABSTRACT
XNLI is a popular Natural Language Inference (NLI) benchmark widely used to evaluate cross-lingual Natural Language Understanding (NLU) capabilities across languages. In this paper, we expand XNLI to include Basque, a low-resource language that can greatly benefit from transfer-learning approaches. The new dataset, dubbed XNLIeu, has been developed by first machine-translating the English XNLI corpus into Basque, followed by a manual post-edition step. We have conducted a series of experiments using mono- and multilingual LLMs to assess a) the effect of professional post-edition on the MT system; b) the best cross-lingual strategy for NLI in Basque; and c) whether the choice of the best cross-lingual strategy is influenced by the fact that the dataset is built by translation. The results show that post-edition is necessary and that the translate-train cross-lingual strategy obtains better results overall, although the gain is lower when tested on a dataset that has been built natively from scratch. Our code and datasets are publicly available under open licenses.