DUMB: A Dutch Model Benchmark
DOI:
10.18653/v1/2023.emnlp-main.447
Publication Date:
2023-12-10T21:58:19Z
AUTHORS (3)
ABSTRACT
We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks. The total of nine tasks includes four tasks that were previously not available in Dutch. Instead of relying on a mean score across tasks, we propose Relative Error Reduction (RER), which compares the DUMB performance of language models to a strong baseline that can be referred to in the future, even when assessing different sets of models. Through a comparison of 14 pre-trained language models (mono- and multi-lingual, of varying sizes), we assess the internal consistency of the benchmark tasks, as well as the factors that likely enable high performance. Our results indicate that current Dutch monolingual models under-perform and suggest training larger Dutch models with other architectures and pre-training objectives. At present, the highest performance is achieved by DeBERTaV3 (large), XLM-R (large) and mDeBERTaV3 (base). In addition to highlighting best strategies for training larger models, DUMB will foster further research on Dutch. A public leaderboard is available at https://dumbench.nl.
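The abstract's core idea is that Relative Error Reduction measures how much of a strong baseline's error a model eliminates, which stays comparable as new models are added against the same fixed baseline. A minimal sketch of that arithmetic, assuming accuracy-style scores in [0, 1] (the paper's exact per-task definitions may differ):

```python
def relative_error_reduction(model_score: float, baseline_score: float) -> float:
    """Fraction of the baseline's error that the model eliminates.

    Sketch based on the abstract's description of RER, assuming
    accuracy-style scores in [0, 1]; not the paper's exact formula.
    """
    baseline_error = 1.0 - baseline_score
    model_error = 1.0 - model_score
    return (baseline_error - model_error) / baseline_error


def mean_rer(model_scores, baseline_scores):
    """Average RER across tasks against a fixed per-task baseline."""
    pairs = list(zip(model_scores, baseline_scores))
    return sum(relative_error_reduction(m, b) for m, b in pairs) / len(pairs)


# Example: a model scoring 0.95 where the baseline scores 0.90
# removes half of the baseline's error (RER = 0.5).
print(relative_error_reduction(0.95, 0.90))
```

Unlike a mean raw score, a positive mean RER directly reads as "better than the baseline", and the baseline anchor makes results from different model sets comparable.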
SUPPLEMENTAL MATERIAL
Coming soon ....