Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages
DOI:
10.1609/aaai.v39i27.35038
Publication Date:
2025-04-11T14:38:02Z
AUTHORS (7)
ABSTRACT
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance results in LLMs performing significantly better on high-resource languages like English, German, and French, while their capabilities in low-resource languages remain inadequate. Currently, there is a lack of quantitative methods to evaluate LLM performance in these low-resource languages. To address this gap, we propose the Language Ranker, an intrinsic metric designed to benchmark and rank languages based on LLM performance using internal representations. By comparing an LLM's internal representation of various languages against a baseline derived from English, we can assess the model's multilingual capabilities in a robust and language-agnostic manner. Our analysis reveals that high-resource languages exhibit higher similarity scores with English, demonstrating superior performance, while low-resource languages show lower similarity scores, underscoring the effectiveness of our metric in assessing language-specific capabilities. Besides, our experiments show a strong correlation between an LLM's performance in different languages and the proportion of those languages in its pre-training corpus. These insights underscore the efficacy of the Language Ranker as a tool for evaluating LLM performance across languages, particularly those with limited resources.
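The core idea described above, ranking languages by how closely their internal representations match an English baseline, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cosine-similarity choice, the mean-pooled toy vectors, and the language codes are assumptions made for demonstration.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two representation vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_languages(representations: dict, baseline: str = "en") -> list:
    """Rank languages by similarity of their representation vectors
    to the baseline language's vector (highest similarity first)."""
    base = representations[baseline]
    scores = {
        lang: cosine_similarity(vec, base)
        for lang, vec in representations.items()
        if lang != baseline
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy vectors standing in for pooled hidden states of parallel text;
# "de" is perturbed less from English than "sw", mimicking a
# high-resource vs. low-resource gap.
rng = np.random.default_rng(0)
en = rng.normal(size=64)
reps = {
    "en": en,
    "de": en + 0.1 * rng.normal(size=64),
    "sw": en + 1.0 * rng.normal(size=64),
}
ranking = rank_languages(reps)
```

In the paper's setting, the vectors would instead come from an LLM's hidden states over parallel corpora; the sketch only shows the comparison-and-ranking step.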