Federated Fine-Tuning of Large Language Models: Kahneman-Tversky vs. Direct Preference Optimization

FOS: Computer and information sciences; Computation and Language (cs.CL); Machine Learning (cs.LG)
DOI: 10.48550/arxiv.2502.14187
Publication Date: 2025-02-19
ABSTRACT
We evaluate Kahneman-Tversky Optimization (KTO) as a fine-tuning method for large language models (LLMs) in federated learning (FL) settings, comparing it against Direct Preference Optimization (DPO). Using Alpaca-7B as the base model, we fine-tune on a realistic dataset under both methods and evaluate performance using the MT-Bench-1, Vicuna, and AdvBench benchmarks. Additionally, we introduce a redistributed dataset setup, where only KTO is applicable due to its ability to handle single-response feedback, unlike DPO's reliance on paired responses. Our results demonstrate that KTO, in both its original (KTOO) and redistributed (KTOR) configurations, consistently outperforms DPO across all benchmarks. In the redistributed setup, KTO further validates its flexibility and resilience by maintaining superior performance in scenarios where DPO cannot be applied. These findings establish KTO as a robust and scalable fine-tuning method for FL, motivating its adoption in privacy-preserving, decentralized, and heterogeneous environments.
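The contrast that makes KTO usable in the redistributed setup can be seen in the standard, centralized objectives from the original DPO and KTO papers; the following is a sketch of those published formulations, not of this paper's federated variants. Here x is a prompt, $\pi_\theta$ is the policy being fine-tuned, $\pi_{\mathrm{ref}}$ is the frozen reference model, $\sigma$ is the logistic function, and $\beta$, $\lambda_D$, $\lambda_U$ are hyperparameters. DPO requires a preferred response $y_w$ and a dispreferred response $y_l$ for the same prompt:

$$
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]
$$

KTO instead scores each single response $y$, labeled only as desirable or undesirable, against a KL reference point:

$$
r_\theta(x, y) = \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)}, \qquad
z_0 = \mathbb{E}_{x' \sim \mathcal{D}} \left[ \mathrm{KL}\!\left( \pi_\theta(y' \mid x') \,\|\, \pi_{\mathrm{ref}}(y' \mid x') \right) \right],
$$

$$
v(x, y) =
\begin{cases}
\lambda_D \, \sigma\!\left( \beta \left( r_\theta(x, y) - z_0 \right) \right) & \text{if } y \text{ is desirable,} \\
\lambda_U \, \sigma\!\left( \beta \left( z_0 - r_\theta(x, y) \right) \right) & \text{if } y \text{ is undesirable,}
\end{cases}
\qquad
\mathcal{L}_{\mathrm{KTO}}(\theta) = \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \lambda_y - v(x, y) \right].
$$

Because each term of the KTO loss depends on a single response and a binary label, a federated client can contribute thumbs-up/thumbs-down feedback without holding a matched pair of responses, which is the single-response property the abstract relies on for the redistributed (KTOR) setting.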