Scaling Instruction-Finetuned Language Models

FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Computation and Language; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; Computation and Language (cs.CL); Machine Learning (cs.LG)

DOI: 10.48550/arxiv.2210.11416
Publication Date: 2022-01-01

AUTHORS (35)

Chung, Hyung Won

Hou, Le

Longpre, Shayne

Zoph, Barret

Tay, Yi

Fedus, William

Li, Yunxuan

Wang, Xuezhi

Dehghani, Mostafa

Brahma, Siddhartha

Webson, Albert

Gu, Shixiang Shane

Dai, Zhuyun

Suzgun, Mirac

Chen, Xinyun

Chowdhery, Aakanksha

Castro-Ros, Alex

Pellat, Marie

Robinson, Kevin

Valter, Dasha

Narang, Sharan

Mishra, Gaurav

Yu, Adams

Zhao, Vincent

Huang, Yanping

Dai, Andrew

Yu, Hongkun

Petrov, Slav

Chi, Ed H.

Dean, Jeff

Devlin, Jacob

Roberts, Adam

Zhou, Denny

Le, Quoc V.

Wei, Jason

ABSTRACT

Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on average). Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

Public checkpoints: https://huggingface.co/docs/transformers/model_doc/flan-t5
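As a rough illustration of what "datasets phrased as instructions" and the zero-shot, few-shot, and CoT prompting setups might look like in practice, the sketch below renders one task example in each style. The `Example` class, the `build_prompt` helper, and the "Q:/A:" template are hypothetical and chosen only for illustration; they are not the templates used in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Example:
    """One task instance; `rationale` is an optional chain-of-thought (hypothetical schema)."""
    question: str
    answer: str
    rationale: Optional[str] = None


def render(ex: Example, with_cot: bool, include_target: bool) -> str:
    """Render one example as 'Q: ... A: ...' text (an assumed template)."""
    text = f"Q: {ex.question}\nA:"
    if include_target:
        if with_cot and ex.rationale:
            # CoT style: show the reasoning before the final answer.
            text += f" {ex.rationale} So the answer is {ex.answer}."
        else:
            text += f" {ex.answer}"
    return text


def build_prompt(
    instruction: str,
    query: Example,
    exemplars: Sequence[Example] = (),
    with_cot: bool = False,
) -> str:
    """Zero-shot when `exemplars` is empty, few-shot otherwise."""
    parts = [instruction]
    parts += [render(e, with_cot, include_target=True) for e in exemplars]
    # The query is left open after "A:" for the model to complete.
    parts.append(render(query, with_cot, include_target=False))
    return "\n\n".join(parts)


if __name__ == "__main__":
    instruction = "Answer the following question."
    query = Example("What is 2 + 3?", "5")
    shot = Example("What is 1 + 1?", "2", rationale="1 plus 1 equals 2.")

    print(build_prompt(instruction, query))                             # zero-shot
    print(build_prompt(instruction, query, [shot]))                     # few-shot
    print(build_prompt(instruction, query, [shot], with_cot=True))      # few-shot with CoT
```

In a finetuning setting, a prompt built this way would be paired with the target text (the answer, or the rationale plus answer) as one training example; how the paper mixes these formats is not specified in this record.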
