When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

DOI: 10.48550/arxiv.2402.17193 Publication Date: 2024-02-26
ABSTRACT
While large language models (LLMs) often adopt finetuning to unlock their capabilities for downstream applications, our understanding of the inductive biases (especially the scaling properties) of different finetuning methods is still limited. To fill this gap, we conduct systematic experiments studying whether and how different scaling factors, including LLM model size, pretraining data size, new finetuning parameter size and finetuning data size, affect the finetuning performance. We consider two types of finetuning -- full-model tuning (FMT) and parameter-efficient tuning (PET, including prompt tuning and LoRA) -- and explore their scaling behaviors in the data-limited regime where the LLM model size substantially outweighs the finetuning data size. Based on two sets of pretrained bilingual LLMs from 1B to 16B and experiments on bilingual machine translation and multilingual summarization benchmarks, we find that 1) LLM finetuning follows a power-based multiplicative joint scaling law between finetuning data size and each other scaling factor; 2) LLM finetuning benefits more from LLM model scaling than pretraining data scaling, and PET parameter scaling is generally ineffective; and 3) the optimal finetuning method is highly task- and finetuning data-dependent. We hope our findings could shed light on understanding, selecting and developing LLM finetuning methods.
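Finding 1 refers to a joint scaling law that is multiplicative in its factors. As a hedged sketch only (the notation and exact functional form below are illustrative assumptions for exposition, not quoted from the paper), a power-based multiplicative joint law between the finetuning data size and one other scaling factor can be written as

\[
\hat{\mathcal{L}}(X, D_f) \;=\; A \cdot \frac{1}{X^{\alpha}} \cdot \frac{1}{D_f^{\beta}} \;+\; E,
\]

where $D_f$ is the finetuning data size, $X$ stands for one other scaling factor (LLM model size, pretraining data size, or PET parameter size), and $A$, $E$, $\alpha$, $\beta$ are fitted constants. "Multiplicative" means the two power-law terms combine by multiplication rather than addition, so the return from scaling $X$ depends on how much finetuning data $D_f$ is available, and vice versa.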