
GLGE: A New General Language Generation Evaluation Benchmark

FOS: Computer and information sciences; Computer Science - Computation and Language; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; Computation and Language (cs.CL); 01 natural sciences; 0105 earth and related environmental sciences

DOI: 10.48550/arxiv.2011.11928

Publication Date: 2021-01-01


AUTHORS (18)

Hang Zhang

Pengcheng Wang

Winnie Wu

Jie Fu

Yeyun Gong

Ruofei Zhang

Jiancheng Lv

Linjun Shou

Ming Zhou

Ming Gong

Weizhen Qi

Nan Duan

Jian Jiao

Dayiheng Liu

Daxin Jiang

Jiusheng Chen

Yu Yan

Weizhu Chen

ABSTRACT

Findings of the Association for Computational Linguistics: ACL 2021

Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress in pretraining and transfer learning in Natural Language Processing (NLP). These benchmarks mostly focus on a range of Natural Language Understanding (NLU) tasks, without considering Natural Language Generation (NLG) models. In this paper, we present the General Language Generation Evaluation (GLGE), a new multi-task benchmark for evaluating the generalization capabilities of NLG models across eight language generation tasks. For each task, we further design three subtasks in terms of task difficulty (GLGE-Easy, GLGE-Medium, and GLGE-Hard), which yields 24 subtasks for comprehensively comparing model performance. To encourage research on pretraining and transfer learning for NLG models, we make GLGE publicly available and build a leaderboard with strong baselines including MASS, BART, and ProphetNet. The source code and dataset are publicly available at https://github.com/microsoft/glge.


