NFDI4DS | UHH-SEMS - Publication Details

Can OpenAI's Codex fix bugs?

OPENALEX - Publications

Julian Aron Prenner, Hlib Babii, Romain Robbes

OpenAI's Codex, a GPT-3 like model trained on a large code corpus, has made headlines in and outside of academia. Given a short user-provided description, it is capable of synthesizing code snippets that are syntactically and semantically valid in most cases. In this work, we want to investigate whether Codex is able to localize and fix bugs, two important tasks in automated program repair. Our initial evaluation uses the multi-language QuixBugs benchmark (40 bugs in both Python and Java). We find that, despite not being trained for APR,...

10.1145/3524459.3527351 article EN 2022-05-19

Making the most of small Software Engineering datasets with modern machine learning

Julian Aron Prenner, Romain Robbes

This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software Engineering, there exist many small (< 5,000 samples) and medium-sized (<100,000 samples) datasets. While deep learning has set the state of the art in many tasks, it is only recently that it has proven effective on small-sized datasets, primarily thanks to pre-training, a semi-supervised technique that leverages abundant unlabelled data alongside...

10.1109/tse.2021.3135465 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2021-12-16

Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs

Julian Aron Prenner, Romain Robbes

OpenAI's Codex, a GPT-3 like model trained on a large code corpus, has made headlines in and outside of academia. Given a short user-provided description, it is capable of synthesizing code snippets that are syntactically and semantically valid in most cases. In this work, we want to investigate whether Codex is able to localize and fix bugs, a task of central interest in the field of automated program repair. Our initial evaluation uses the multi-language QuixBugs benchmark (40 bugs in both Python and Java). We find that, despite not being...

10.48550/arxiv.2111.03922 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Out of Context: How important is Local Context in Neural Program Repair?

Julian Aron Prenner, Romain Robbes

Deep learning source code models have been applied very successfully to the problem of automated program repair. One of the standing issues is the small input window of current models, which often cannot fully fit the context code required for a bug fix (e.g., method or class declarations of a project). Instead, input is often restricted to the local context, that is, the lines below and above the bug location. In this work we study the importance of this local context on repair success: how much local context is needed? Is context before or after the bug location more important? How is local context tied to the bug type? To answer these questions we train...

10.1145/3597503.3639086 article EN 2024-04-12

Codex Hacks HackerRank: Memorization Issues and a Framework for Code Synthesis Evaluation

Anjan Karmakar, Julian Aron Prenner, Marco D’Ambros, Romain Robbes

The Codex model has demonstrated extraordinary competence in synthesizing code from natural language problem descriptions. However, in order to reveal unknown failure modes and hidden biases, such large-scale models must be systematically subjected to multiple and diverse evaluation studies. In this work, we evaluate the code synthesis capabilities of the Codex model based on a set of 115 Python problem statements from a popular competitive programming portal: HackerRank. Our evaluation shows that Codex is indeed proficient in Python, solving 96% of the problems...

10.48550/arxiv.2212.02684 preprint EN cc-by arXiv (Cornell University) 2022-01-01

RunBugRun -- An Executable Dataset for Automated Program Repair

Julian Aron Prenner, Romain Robbes

Recently, we can notice a transition to data-driven techniques in Automated Program Repair (APR), in particular towards deep neural networks. This entails training on hundreds of thousands or even millions of non-executable code fragments. We would like to bring more attention to an aspect of code often neglected in Neural Program Repair (NPR), namely its execution. Code execution has several significant advantages. It allows for test-based evaluation of candidate fixes and can provide valuable information to aid repair. In this work...

10.48550/arxiv.2304.01102 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Mining Software Repositories with a Collaborative Heuristic Repository

Hlib Babii, Julian Aron Prenner, Laurin Stricker, Anjan Karmakar, Andrea Janes, and 1 more

Many software engineering studies or tasks rely on categorizing software engineering artifacts. In practice, this is done either by defining simple but often imprecise heuristics, or by manual labelling of the artifacts. Unfortunately, errors in these categorizations impact the tasks that rely on them. To improve the precision of these categorizations, we propose to gather heuristics in a collaborative heuristic repository, to which researchers can contribute a large amount of diverse heuristics for a variety of SE artifacts. These are then leveraged by state-of-the-art weak supervision techniques...

10.1109/icse-nier52604.2021.00030 article EN 2021-05-01

Making the most of small Software Engineering datasets with modern machine learning

Julian Aron Prenner, Romain Robbes

This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software Engineering, there exist many small (< 1 000 samples) and medium-sized (< 100 000 samples) datasets. While deep learning has set the state of the art in many tasks, it is only recently that it has proven effective on small-sized datasets, primarily thanks to pre-training, a semi-supervised technique that leverages abundant unlabelled data alongside scarce...

10.48550/arxiv.2106.15209 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Out of Context: How important is Local Context in Neural Program Repair?

Julian Aron Prenner, Romain Robbes

Deep learning source code models have been applied very successfully to the problem of automated program repair. One of the standing issues is the small input window of current models, which often cannot fully fit the context code required for a bug fix (e.g., method or class declarations of a project). Instead, input is often restricted to the local context, that is, the lines below and above the bug location. In this work we study the importance of this local context on repair success: how much local context is needed? Is context before or after the bug location more important? How is local context tied to the bug type? To answer these questions we train...

10.48550/arxiv.2312.04986 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Mining Software Repositories with a Collaborative Heuristic Repository

Hlib Babii, Julian Aron Prenner, Laurin Stricker, Anjan Karmakar, Andrea Janes, and 1 more

Many software engineering studies or tasks rely on categorizing software engineering artifacts. In practice, this is done either by defining simple but often imprecise heuristics, or by manual labelling of the artifacts. Unfortunately, errors in these categorizations impact the tasks that rely on them. To improve the precision of these categorizations, we propose to gather heuristics in a collaborative heuristic repository, to which researchers can contribute a large amount of diverse heuristics for a variety of SE artifacts. These are then leveraged by state-of-the-art weak supervision techniques...

10.48550/arxiv.2103.01722 preprint EN other-oa arXiv (Cornell University) 2021-01-01