Explainable Automated Debugging via Large Language Model-driven Scientific Debugging
Software Engineering (cs.SE)
FOS: Computer and information sciences
DOI:
10.48550/arxiv.2304.02195
Publication Date:
2024-12-18
AUTHORS (4)
ABSTRACT
Automated debugging techniques have the potential to reduce developer effort in debugging. However, while developers want rationales for automated debugging results, existing techniques are ill-suited to provide them, as their deduction process differs significantly from that of human developers. Inspired by the way developers interact with code when debugging, we propose Automated Scientific Debugging (AutoSD), a technique that prompts large language models to automatically generate hypotheses, uses debuggers to interact with buggy code, and thus automatically reaches conclusions prior to patch generation. In doing so, we aim to produce explanations of how a specific patch was generated, with the hope that these explanations will lead to enhanced developer decision-making. Our empirical analysis on three program repair benchmarks shows that AutoSD performs competitively with other program repair baselines, and that it can indicate when it is confident in its results. Furthermore, we conducted a human study with 20 participants to evaluate AutoSD-generated explanations. Participants with access to explanations judged patch correctness more accurately in five of the six real-world bugs studied. Moreover, 70% of participants reported that they wanted explanations when using repair tools, and 55% reported satisfaction with the Scientific Debugging presentation.
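The abstract describes a hypothesize-experiment-conclude loop. The minimal sketch below illustrates that shape only; it is not the authors' implementation. The `Step` structure, the `model` callable (a stand-in for an LLM call), and the use of `eval` as a stand-in for a debugger session are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One round of Scientific Debugging (hypothetical structure)."""
    hypothesis: str   # what the model believes is wrong
    experiment: str   # an expression to evaluate against the buggy code
    prediction: str   # the output expected if the hypothesis holds


def autosd_loop(env: Dict[str, object],
                model: Callable[[str], Step],
                max_iters: int = 3) -> List[str]:
    """Iterate hypothesis -> experiment -> conclusion before patching.

    `eval` plays the role of a debugger here; the real technique drives
    an actual debugger against the buggy program.
    """
    history: List[str] = []
    for _ in range(max_iters):
        step = model("\n".join(history))
        observed = str(eval(step.experiment, {}, dict(env)))
        supported = observed == step.prediction
        history.append(
            f"hypothesis: {step.hypothesis}; observed: {observed}; "
            f"conclusion: {'supported' if supported else 'rejected'}"
        )
        if supported:
            break  # confident enough to proceed to patch generation
    return history


# Toy bug: abs implementation that never negates its input.
def buggy_abs(x):
    return x

# Canned "LLM" response so the loop runs without a model.
canned = Step(
    hypothesis="buggy_abs returns negative inputs unchanged",
    experiment="buggy_abs(-3)",
    prediction="-3",
)
trace = autosd_loop({"buggy_abs": buggy_abs}, lambda _ctx: canned)
```

Here `trace` ends with a "supported" conclusion, the point at which AutoSD would move on to generating and explaining a patch.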