When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

DOI: 10.48550/arxiv.2402.17193 Publication Date: 2024-02-26
ABSTRACT
While large language models (LLMs) often adopt finetuning to unlock their capabilities for downstream applications, our understanding of the inductive biases (especially the scaling properties) of different finetuning methods is still limited. To fill this gap, we conduct systematic experiments studying whether and how different scaling factors, including LLM model size, pretraining data size, new finetuning parameter size and finetuning data size, affect the finetuning performance. We consider two types of finetuning -- full-model tuning (FMT) and parameter-efficient tuning (PET, including prompt tuning and LoRA) -- and explore their scaling behaviors in the data-limited regime where the LLM model size substantially outweighs the finetuning data size. Based on two sets of pretrained bilingual LLMs from 1B to 16B and experiments on bilingual machine translation and multilingual summarization benchmarks, we find that 1) LLM finetuning follows a power-based multiplicative joint scaling law between finetuning data size and each other scaling factor; 2) LLM finetuning benefits more from LLM model scaling than pretraining data scaling, and PET parameter scaling is generally ineffective; and 3) the optimal finetuning method is highly task- and finetuning data-dependent. We hope our findings could shed light on understanding, selecting and developing LLM finetuning methods.
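Finding 1 refers to a joint scaling law that is multiplicative in its factors. As a hedged sketch only (the notation and exact functional form below are illustrative assumptions for exposition, not quoted from the paper), a power-based multiplicative joint law between the finetuning data size and one other scaling factor can be written as

\[
\hat{\mathcal{L}}(X, D_f) \;=\; A \cdot \frac{1}{X^{\alpha}} \cdot \frac{1}{D_f^{\beta}} \;+\; E,
\]

where $D_f$ is the finetuning data size, $X$ stands for one other scaling factor (LLM model size, pretraining data size, or PET parameter size), and $A$, $E$, $\alpha$, $\beta$ are fitted constants. "Multiplicative" means the two power-law terms combine by multiplication rather than addition, so the return from scaling $X$ depends on how much finetuning data $D_f$ is available, and vice versa.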