NFDI4DS | UHH-SEMS - Publication Details

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

FOS: Computer and information sciences; Computer Science - Machine Learning; Machine Learning (cs.LG)

DOI: 10.48550/arxiv.2402.17453

Publication Date: 2024-02-27

AUTHORS (6)

Siyuan Guo

Cheng Deng

Ying Wen

Hechang Chen

Yi Chang

Jun Wang

ABSTRACT

In this work, we investigate the potential of large language model (LLM) based agents to automate data science tasks, with the goal of comprehending task requirements, then building and training the best-fit machine learning models. Despite their widespread success, existing LLM agents are hindered by generating unreasonable experiment plans within this scenario. To this end, we present DS-Agent, a novel automatic framework that harnesses LLM agents and case-based reasoning (CBR). In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on expert knowledge from Kaggle and facilitate consistent performance improvement through a feedback mechanism. Moreover, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm that adapts past successful solutions from the development stage for direct code generation, significantly reducing the demand on the foundational capabilities of LLMs. Empirically, DS-Agent with GPT-4 achieves an unprecedented 100% success rate in the development stage, while attaining a 36% improvement in average one-pass rate across alternative LLMs in the deployment stage. In both stages, DS-Agent achieves the best rank in performance, costing $1.60 and $0.13 per run with GPT-4, respectively.
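The retrieve, reuse, evaluate, retain cycle that case-based reasoning refers to can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Case` class, the word-overlap similarity, and the `propose` and `evaluate` callables are hypothetical stand-ins (in a real system `propose` would call an LLM and `evaluate` would run the generated training code and return a validation metric).

```python
from dataclasses import dataclass


@dataclass
class Case:
    task: str           # natural-language task description
    plan: str           # solution sketch associated with that task
    score: float = 0.0  # feedback obtained when the plan was executed


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for a learned retriever."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve(bank, task):
    """Pick the most similar case, breaking ties by higher feedback score."""
    return max(bank, key=lambda c: (jaccard(c.task, task), c.score))


def cbr_loop(bank, task, propose, evaluate, steps=3):
    """Run a toy CBR iteration: retrieve, reuse, evaluate, retain."""
    best = None
    for _ in range(steps):
        case = retrieve(bank, task)          # Retrieve a relevant past case
        plan = propose(task, case)           # Reuse: adapt it to the new task
        score = evaluate(plan)               # Evaluate: collect feedback
        candidate = Case(task, plan, score)
        if best is None or score > best.score:
            best = candidate
        bank.append(candidate)               # Retain for later iterations
    return best
```

Because each evaluated plan is appended to the case bank, later iterations can retrieve and build on earlier attempts, which is one simple way to model improvement through feedback. A deployment-style variant under the same assumptions would call `retrieve` and `propose` once, with no evaluation loop.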
