Measuring Taiwanese Mandarin Language Understanding

DOI: 10.48550/arxiv.2403.20180
Publication Date: 2024-03-29
ABSTRACT
The evaluation of large language models (LLMs) has drawn substantial attention in the field recently. This work focuses on evaluating LLMs in a Chinese context, specifically for Traditional Chinese, which has been largely underrepresented in existing benchmarks. We present TMLU, a holistic evaluation suite tailored for assessing the advanced knowledge and reasoning capabilities of LLMs under the context of Taiwanese Mandarin. TMLU consists of an array of 37 subjects across social science, STEM, humanities, Taiwan-specific content, and others, ranging from middle-school to professional levels. In addition, we curate chain-of-thought-like few-shot explanations for each subject to facilitate the evaluation of complex reasoning skills. To establish a comprehensive baseline, we conduct extensive experiments and analysis on 24 advanced LLMs. The results suggest that Chinese open-weight models demonstrate inferior performance compared with multilingual proprietary ones, and that open-weight models tailored for Taiwanese Mandarin lag behind their Simplified-Chinese counterparts. The findings indicate great headroom for improvement, and emphasize the goal of TMLU to foster the development of localized Taiwanese-Mandarin LLMs. We release the benchmark and evaluation scripts to the community to promote future research.
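The few-shot chain-of-thought evaluation described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's released code: the exemplar text, the dataset fields (`question`, `choices`, `answer`), and the `model` callable are all assumptions for illustration.

```python
from typing import Callable

# One assumed CoT-style few-shot exemplar for a subject (format hypothetical).
FEW_SHOT = (
    "題目: 2 + 2 = ?\n"
    "選項: (A) 3 (B) 4 (C) 5 (D) 6\n"
    "讓我們一步一步思考: 2 加 2 等於 4, 故選 (B)。\n"
    "答案: B\n\n"
)

def build_prompt(question: str, choices: str) -> str:
    """Prepend the few-shot CoT exemplar and ask for step-by-step reasoning."""
    return FEW_SHOT + f"題目: {question}\n選項: {choices}\n讓我們一步一步思考:"

def extract_choice(completion: str) -> str:
    """Take the last A-D letter in the completion as the predicted answer."""
    letters = [c for c in completion if c in "ABCD"]
    return letters[-1] if letters else ""

def evaluate(model: Callable[[str], str], items: list[dict]) -> float:
    """Accuracy of a prompt->completion model over multiple-choice items."""
    correct = 0
    for item in items:
        completion = model(build_prompt(item["question"], item["choices"]))
        correct += extract_choice(completion) == item["answer"]
    return correct / len(items)
```

Per-subject accuracies computed this way can then be averaged across the 37 subjects to produce an overall score, as is common for MMLU-style benchmarks.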