Julian Aron Prenner

ORCID: 0000-0003-4673-271X
Research Areas
  • Software Engineering Research
  • Software Testing and Debugging Techniques
  • Software Reliability and Analysis Research
  • Software Engineering Techniques and Practices
  • Topic Modeling
  • Software System Performance and Reliability
  • Machine Learning and Data Classification
  • Advanced Malware Detection Techniques
  • Machine Learning and Algorithms
  • Natural Language Processing Techniques
  • Parallel Computing and Optimization Techniques

Free University of Bozen-Bolzano
2021-2024

OpenAI's Codex, a GPT-3 like model trained on a large code corpus, has made headlines in and outside of academia. Given a short user-provided description, it is capable of synthesizing code snippets that are syntactically and semantically valid in most cases. In this work, we want to investigate whether Codex is able to localize and fix bugs, two important tasks in automated program repair. Our initial evaluation uses the multi-language QuixBugs benchmark (40 bugs in both Python and Java). We find that, despite not being trained for APR,...

10.1145/3524459.3527351 article EN 2022-05-19

This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software Engineering, there exist many small (< 5,000 samples) and medium-sized (< 100,000 samples) datasets. While deep learning has set the state of the art in many machine learning tasks, it is only recently that it has proven effective on small-sized datasets, primarily thanks to pre-training, a semi-supervised learning technique that leverages abundant unlabelled data alongside...

10.1109/tse.2021.3135465 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2021-12-16

OpenAI's Codex, a GPT-3 like model trained on a large code corpus, has made headlines in and outside of academia. Given a short user-provided description, it is capable of synthesizing code snippets that are syntactically and semantically valid in most cases. In this work, we want to investigate whether Codex is able to localize and fix bugs, a task of central interest in the field of automated program repair. Our initial evaluation uses the multi-language QuixBugs benchmark (40 bugs in both Python and Java). We find that, despite not being...

10.48550/arxiv.2111.03922 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Deep learning source code models have been applied very successfully to the problem of automated program repair. One of the standing issues is the small input window of current models, which often cannot fully fit the context code required for a bug fix (e.g., method or class declarations of a project). Instead, input is often restricted to the local context, that is, the lines below and above the bug location. In this work we study the importance of this local context on repair success: how much local context is needed?; is context before or after the bug location more important?; how is local context tied to the bug type? To answer these questions we train...

10.1145/3597503.3639086 article EN 2024-04-12

The Codex model has demonstrated extraordinary competence in synthesizing code from natural language problem descriptions. However, in order to reveal unknown failure modes and hidden biases, such large-scale models must be systematically subjected to multiple and diverse evaluation studies. In this work, we evaluate the code synthesis capabilities of the Codex model based on a set of 115 Python problem statements from a popular competitive programming portal: HackerRank. Our evaluation shows that Codex is indeed proficient in Python, solving 96% of the problems...

10.48550/arxiv.2212.02684 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Recently, we can notice a transition to data-driven techniques in Automated Program Repair (APR), in particular towards deep neural networks. This entails training on hundreds of thousands or even millions of non-executable code fragments. We would like to bring more attention to an aspect of code often neglected in Neural Program Repair (NPR), namely its execution. Code execution has several significant advantages. It allows for test-based evaluation of candidate fixes and can provide valuable information to aid repair. In this work...

10.48550/arxiv.2304.01102 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Many software engineering studies or tasks rely on categorizing software engineering artifacts. In practice, this is done either by defining simple but often imprecise heuristics, or by manual labelling of the artifacts. Unfortunately, errors in these categorizations impact the tasks that rely on them. To improve the precision of these categorizations, we propose to gather heuristics in a collaborative heuristic repository, to which researchers can contribute a large amount of diverse heuristics for a variety of tasks on a variety of SE artifacts. These heuristics are then leveraged by state-of-the-art weak supervision techniques...

10.1109/icse-nier52604.2021.00030 article EN 2021-05-01

This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software Engineering, there exist many small (< 1 000 samples) and medium-sized (< 100 000 samples) datasets. While deep learning has set the state of the art in many machine learning tasks, it is only recently that it has proven effective on small-sized datasets, primarily thanks to pre-training, a semi-supervised learning technique that leverages abundant unlabelled data alongside scarce...

10.48550/arxiv.2106.15209 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Deep learning source code models have been applied very successfully to the problem of automated program repair. One of the standing issues is the small input window of current models, which often cannot fully fit the context code required for a bug fix (e.g., method or class declarations of a project). Instead, input is often restricted to the local context, that is, the lines below and above the bug location. In this work we study the importance of this local context on repair success: how much local context is needed?; is context before or after the bug location more important?; how is local context tied to the bug type? To answer these questions we train...

10.48550/arxiv.2312.04986 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Many software engineering studies or tasks rely on categorizing software engineering artifacts. In practice, this is done either by defining simple but often imprecise heuristics, or by manual labelling of the artifacts. Unfortunately, errors in these categorizations impact the tasks that rely on them. To improve the precision of these categorizations, we propose to gather heuristics in a collaborative heuristic repository, to which researchers can contribute a large amount of diverse heuristics for a variety of tasks on a variety of SE artifacts. These heuristics are then leveraged by state-of-the-art weak supervision techniques...

10.48550/arxiv.2103.01722 preprint EN other-oa arXiv (Cornell University) 2021-01-01