AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
DOI:
10.48550/arxiv.2405.19265
Publication Date:
2024-05-29
AUTHORS (11)
ABSTRACT
Open-source Large Language Models (LLMs) and their specialized variants, particularly Code LLMs, have recently delivered impressive performance. However, previous Code LLMs are typically fine-tuned on single-source data with limited quality and diversity, which may insufficiently elicit the potential of pre-trained Code LLMs. In this paper, we present AlchemistCoder, a series of Code LLMs with enhanced code generation and generalization capabilities, fine-tuned on multi-source data. To achieve this, we pioneer to unveil the inherent conflicts among the various styles and qualities in multi-source code corpora and introduce data-specific prompts with hindsight relabeling, termed AlchemistPrompts, to harmonize different data sources and instruction-response pairs. Additionally, we propose incorporating the data construction process into the fine-tuning data as code comprehension tasks, including instruction evolution, data filtering, and code review. Extensive experiments demonstrate that AlchemistCoder holds a clear lead among all models of the same size (6.7B/7B) and rivals or even surpasses larger models (15B/33B/70B), showcasing the efficacy of our method in refining instruction-following capabilities and advancing the boundaries of code intelligence.
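
The central idea behind AlchemistPrompts is hindsight relabeling: once a response is known, a short data-specific prompt is generated and prepended to the instruction so that pairs drawn from heterogeneous sources become stylistically consistent fine-tuning data. The following minimal sketch illustrates how such harmonization could be wired up; the Pair dataclass, the relabel callable, and the harmonize helper are illustrative names assumed here, not the authors' released code.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Pair:
    source: str        # e.g. "evol-instruct" or "oss-snippets" (hypothetical source tags)
    instruction: str
    response: str

def alchemist_prompt(pair: Pair, relabel: Callable[[str, str], str]) -> str:
    # Hindsight step: the relabeler sees the finished response and writes a
    # data-specific prompt describing the style/quality gap between the
    # instruction and that response (in the paper, an LLM plays this role).
    return relabel(pair.instruction, pair.response)

def harmonize(pairs: list[Pair], relabel: Callable[[str, str], str]) -> list[dict]:
    # Prepend the AlchemistPrompt to each instruction so that pairs from
    # different sources read as consistent instruction-response examples.
    harmonized = []
    for p in pairs:
        hint = alchemist_prompt(p, relabel)
        harmonized.append({
            "instruction": f"{hint}\n\n{p.instruction}",
            "response": p.response,
        })
    return harmonized

In use, relabel would be backed by a capable LLM prompted with both the instruction and the response; the resulting records can then be mixed with the paper's code comprehension tasks (instruction evolution, data filtering, code review) before fine-tuning.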