Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance
Identification
DOI:
10.48550/arxiv.2502.04883
Publication Date:
2025-02-07
AUTHORS (6)
ABSTRACT
Automatic Speech Recognition (ASR) performance for low-resource languages is still far behind that of higher-resource languages such as English, due to a lack of sufficient labeled data. State-of-the-art methods deploy self-supervised transfer learning, where a model pre-trained on large amounts of data is fine-tuned using little labeled data in a target low-resource language. In this paper, we present and examine a method for fine-tuning an SSL-based model in order to improve the performance for Frisian and its regional dialects (Clay Frisian, Wood Frisian, and South Frisian). We show that Frisian ASR performance can be improved by using multilingual (Frisian, Dutch, English and German) fine-tuning data and an auxiliary language identification task. In addition, our findings show that performance on dialectal speech suffers substantially and, importantly, that this effect is moderated by the elicitation approach used to collect the dialectal data. Our findings also suggest that relying solely on standard language data for ASR evaluation may underestimate real-world performance, particularly in languages with substantial dialectal variation.