DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
FOS: Computer and information sciences
Computer Science - Machine Learning
Machine Learning (cs.LG)
DOI: 10.48550/arxiv.2402.17453
Publication Date: 2024-02-27
AUTHORS (6)
ABSTRACT
In this work, we investigate the potential of large language model (LLM) based agents to automate data science tasks, with the goal of comprehending task requirements, then building and training the best-fit machine learning models. Despite their widespread success, existing LLM agents are hindered by generating unreasonable experiment plans within this scenario. To this end, we present DS-Agent, a novel automatic framework that harnesses LLM agents and case-based reasoning (CBR). In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on expert knowledge from Kaggle and facilitate consistent performance improvement through a feedback mechanism. Moreover, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm to adapt past successful solutions from the development stage for direct code generation, significantly reducing the demand on the foundational capabilities of LLMs. Empirically, DS-Agent with GPT-4 achieves an unprecedented 100% success rate in the development stage, while attaining a 36% improvement in average one-pass rate across alternative LLMs in the deployment stage. In both stages, DS-Agent achieves the best rank in performance, costing \$1.60 and \$0.13 per run with GPT-4, respectively.
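The abstract names the CBR cycle but does not spell out its mechanics. As a rough illustration only, a generic retrieve, reuse, evaluate, retain loop might be sketched as below. This is not DS-Agent's implementation: every name here (`Case`, `jaccard`, `retrieve`, `cbr_loop`, `toy_adapt`, `toy_evaluate`) is hypothetical, token-overlap similarity is an assumed stand-in for whatever retriever the authors use, and the `adapt` and `evaluate` callables are placeholders for an LLM call and for actually training and scoring a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Case:
    description: str  # what the past task was about
    solution: str     # the plan or code that worked for it


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (assumed stand-in for a real retriever)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(task: str, bank: List[Case]) -> Case:
    """Retrieve: pick the stored case most similar to the new task."""
    return max(bank, key=lambda c: jaccard(task, c.description))


def cbr_loop(
    task: str,
    bank: List[Case],
    adapt: Callable[[str, Case, Optional[float]], str],
    evaluate: Callable[[str], float],
    steps: int = 3,
) -> Tuple[str, float]:
    """Run a few retrieve/reuse/evaluate rounds, then retain the best result."""
    best_solution, best_score = "", float("-inf")
    feedback: Optional[float] = None  # score of the previous attempt, if any
    for _ in range(steps):
        case = retrieve(task, bank)              # Retrieve
        candidate = adapt(task, case, feedback)  # Reuse (an LLM call in practice)
        score = evaluate(candidate)              # Evaluate (train and validate in practice)
        feedback = score                         # feed the result into the next round
        if score > best_score:
            best_solution, best_score = candidate, score
    bank.append(Case(task, best_solution))       # Retain for future tasks
    return best_solution, best_score


def toy_adapt(task: str, case: Case, feedback: Optional[float]) -> str:
    """Hypothetical adapter: refine the retrieved solution once feedback exists."""
    if feedback is None:
        return case.solution
    return case.solution + " with tuned hyperparameters"


def toy_evaluate(solution: str) -> float:
    """Hypothetical scorer: pretend that tuning helps."""
    return 0.8 if "tuned" in solution else 0.6


if __name__ == "__main__":
    bank = [
        Case("tabular classification of customer churn", "gradient boosting"),
        Case("image classification of digits", "convolutional network"),
    ]
    print(cbr_loop("tabular regression of house prices", bank, toy_adapt, toy_evaluate, steps=2))
```

Under these assumptions, the deployment stage described above would correspond roughly to a single retrieve-and-adapt pass with no evaluation loop, which is one way to read "simplified CBR paradigm".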