NFDI4DS | UHH-SEMS - Publication Details

Feature location in source code: a taxonomy and survey

OPENALEX - Publications

Bogdan Dit Meghan Revelle Malcom Gethers Denys Poshyvanyk

Feature location is the activity of identifying an initial location in the source code that implements functionality in a software system. Many feature location techniques have been introduced that automate some or all of this process, and a comprehensive overview of this large body of work would be beneficial to researchers and practitioners. This paper presents a systematic literature survey of feature location techniques. Eighty-nine articles from 25 venues have been reviewed and classified within the taxonomy in order to organize and structure existing work in the field of feature location. The paper also...

10.1002/smr.567 article EN Journal of Software Evolution and Process 2011-11-28

Deep learning code fragments for code clone detection

Martin White Michele Tufano Christopher Vendome Denys Poshyvanyk

Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at...

10.1145/2970276.2970326 article EN 2016-08-25

Feature Location Using Probabilistic Ranking of Methods Based on Execution Scenarios and Information Retrieval

Denys Poshyvanyk Yann‐Gaël Guéhéneuc Andrian Marcus Giuliano Antoniol Václav Rajlich

This paper recasts the problem of feature location in source code as a decision-making problem in the presence of uncertainty. The solution to the problem is formulated as a combination of the opinions of different experts. The experts in this work are two existing techniques for feature location: a scenario-based probabilistic ranking of events and an information-retrieval-based technique that uses latent semantic indexing. The combination of these experts is empirically evaluated through several case studies, which use the source code of the Mozilla Web browser and the Eclipse integrated development environment....

10.1109/tse.2007.1016 article EN IEEE Transactions on Software Engineering 2007-05-24

SEQUENCER: Sequence-to-Sequence Learning for End-to-End Program Repair

Zimin Chen Steve Kommrusch Michele Tufano Louis-Noël Pouchet Denys Poshyvanyk and 1 more

This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning. We devise, implement, and evaluate a system, called SequenceR, for fixing bugs based on sequence-to-sequence learning on source code. This approach uses the copy mechanism to overcome the unlimited vocabulary problem that occurs with big code. Our system is data-driven; we train it on 35,578 samples, carefully curated from commits to open-source repositories. We evaluate it on 4,711 independent real bug fixes, as well as on the Defects4J benchmark used in program repair research. SequenceR is able...

10.1109/tse.2019.2940179 article EN IEEE Transactions on Software Engineering 2019-09-11

Portfolio

Collin McMillan Mark Grechanik Denys Poshyvanyk Qing Xie Chen Fu

Different studies show that programmers are more interested in finding definitions of functions and their uses than variables, statements, or arbitrary code fragments [30, 29, 31]. Therefore, programmers require support in finding relevant functions and determining how those functions are used. Unfortunately, existing code search engines do not provide enough of this support to developers, thus reducing the effectiveness of code reuse.

10.1145/1985793.1985809 article EN 2011-05-21

Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems

Andrian Marcus Denys Poshyvanyk Rudolf Ferenć

High cohesion is a desirable property of software as it positively impacts understanding, reuse, and maintenance. Currently proposed measures for cohesion in Object-Oriented (OO) software reflect particular interpretations of cohesion and capture different aspects of it. Existing approaches are largely based on using the structural information from the source code, such as attribute references, in methods to measure cohesion. This paper proposes a new measure for the cohesion of classes in OO software systems based on the analysis of the unstructured information embedded in the source code, such as comments and identifiers. The measure,...

10.1109/tse.2007.70768 article EN IEEE Transactions on Software Engineering 2008-03-01

An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Michele Tufano Cody Watson Gabriele Bavota Massimiliano Di Penta Martin White and 1 more

Millions of open source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub in order to extract meaningful examples of such bug-fixes. Next, we abstract...

10.1145/3340544 article EN ACM Transactions on Software Engineering and Methodology 2019-09-02

API change and fault proneness: a threat to the success of Android apps

Mario Linares‐Vásquez Gabriele Bavota Carlos Bernal-Cárdenas Massimiliano Di Penta Rocco Oliveto and 1 more

During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criterion: users easily get...

10.1145/2491411.2491428 article EN 2013-08-18

Mining Version Histories for Detecting Code Smells

Fabio Palomba Gabriele Bavota Massimiliano Di Penta Rocco Oliveto Denys Poshyvanyk and 1 more

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension, and possibly increase change- and fault-proneness. While most of the detection techniques just rely on structural information, many code smells are intrinsically characterized by how code elements change over time. In this paper, we propose Historical Information for Smell deTection (HIST), an approach exploiting change history information to detect instances of five different code smells, namely Divergent Change, Shotgun Surgery,...

10.1109/tse.2014.2372760 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2014-11-20

Detecting bad smells in source code using change history information

Fabio Palomba Gabriele Bavota Massimiliano Di Penta Rocco Oliveto Andrea De Lucia and 1 more

Code smells represent symptoms of poor implementation choices. Previous studies found that these smells make source code more difficult to maintain, possibly also increasing its fault-proneness. There are several approaches that identify smells based on code analysis techniques. However, we observe that many code smells are intrinsically characterized by how code elements change over time. Thus, relying solely on structural information may not be sufficient to detect all the smells accurately. We propose an approach to detect five different code smells, namely...

10.1109/ase.2013.6693086 article EN IEEE/ACM International Conference on Automated Software Engineering (ASE) 2013-11-01

Mining energy-greedy API usage patterns in Android apps: an empirical study

Mario Linares‐Vásquez Gabriele Bavota Carlos Bernal-Cárdenas Rocco Oliveto Massimiliano Di Penta and 1 more

Energy consumption of mobile applications is nowadays a hot topic, given the widespread use of mobile devices. The high demand for features and improved user experience, given the available powerful hardware, tend to increase the apps' energy consumption. However, excessive energy consumption in mobile apps could also be a consequence of energy greedy hardware, bad programming practices, or particular API usage patterns. We present the largest to date quantitative and qualitative empirical investigation into the categories of API calls and usage patterns that, in the context of the Android development...

10.1145/2597073.2597085 article EN 2014-05-20

When and Why Your Code Starts to Smell Bad (and Whether the Smells Go Away)

Michele Tufano Fabio Palomba Gabriele Bavota Rocco Oliveto Massimiliano Di Penta and 2 more

Technical debt is a metaphor introduced by Cunningham to indicate "not quite right code which we postpone making it right". One noticeable symptom of technical debt is represented by code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, what is their survivability, and how they...

10.1109/tse.2017.2653105 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2017-01-17

Toward deep learning software repositories

Martin White Christopher Vendome Mario Linares‐Vásquez Denys Poshyvanyk

Deep learning subsumes algorithms that automatically learn compositional representations. The ability of these models to generalize well has ushered in tremendous advances in many fields such as natural language processing (NLP). Recent research in the software engineering (SE) community has demonstrated the usefulness of applying NLP techniques to software corpora. Hence, we motivate deep learning for software language modeling, highlighting fundamental differences between state-of-the-practice software language models and connectionist models. Our deep learning models are applicable to source...

10.5555/2820518.2820559 article EN Mining Software Repositories 2015-05-16

When and Why Your Code Starts to Smell Bad

Michele Tufano Fabio Palomba Gabriele Bavota Rocco Oliveto Massimiliano Di Penta and 2 more

In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these factors is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200...

10.1109/icse.2015.59 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01

On Learning Meaningful Code Changes Via Neural Machine Translation

Michele Tufano Jevgenija Pantiuchina Cody Watson Gabriele Bavota Denys Poshyvanyk

Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments and detecting vulnerabilities among others. One of the long lasting dreams of applying DL to source code is the possibility to automate non-trivial coding activities. While some steps in this direction have been taken (e.g., learning how to fix bugs), there is still a glaring lack of empirical evidence on the types of code changes that can be...

10.1109/icse.2019.00021 article EN 2019-05-01

Machine Learning-Based Prototyping of Graphical User Interfaces for Mobile Apps

Kevin Moran Carlos Bernal-Cárdenas Michael Curcio Richard Bonett Denys Poshyvanyk

It is common practice for developers of user-facing software to transform a mock-up of a graphical user interface (GUI) into code. This process takes place both at an application's inception and in an evolutionary context as GUI changes keep pace with evolving features. Unfortunately, this practice is challenging and time-consuming. In this paper, we present an approach that automates this process by enabling accurate prototyping of GUIs via three tasks: detection, classification, and assembly. First, logical components of a GUI are detected from...

10.1109/tse.2018.2844788 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2018-06-07

Toward Deep Learning Software Repositories

Martin White Christopher Vendome Mario Linares‐Vásquez Denys Poshyvanyk

Deep learning subsumes algorithms that automatically learn compositional representations. The ability of these models to generalize well has ushered in tremendous advances in many fields such as natural language processing (NLP). Recent research in the software engineering (SE) community has demonstrated the usefulness of applying NLP techniques to software corpora. Hence, we motivate deep learning for software language modeling, highlighting fundamental differences between state-of-the-practice software language models and connectionist models. Our deep learning models are applicable to source...

10.1109/msr.2015.38 article EN 2015-05-01

Automatically Discovering, Reporting and Reproducing Android Application Crashes

Kevin Moran Mario Linares‐Vásquez Carlos Bernal-Cárdenas Christopher Vendome Denys Poshyvanyk

Mobile developers face unique challenges when detecting and reporting crashes in apps due to their prevailing GUI event-driven nature and additional sources of inputs (e.g., sensor readings). To support these tasks, we introduce a novel, automated approach called CRASHSCOPE. This tool explores a given Android app using systematic input generation, according to several strategies informed by static and dynamic analyses, with the intrinsic goal of triggering crashes. When a crash is detected, CRASHSCOPE...

10.1109/icst.2016.34 preprint EN 2016-04-01

Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks

Antonio Mastropaolo Simone Scalabrino Nathan Cooper David N. Palacio Denys Poshyvanyk and 2 more

Deep learning (DL) techniques are gaining more and more attention in the software engineering community. They have been used to support several code-related tasks, such as automatic bug fixing and code comments generation. Recent studies in the Natural Language Processing (NLP) field have shown that the Text-To-Text Transfer Transformer (T5) architecture can achieve state-of-the-art performance for a variety of NLP tasks. The basic idea behind T5 is to first pre-train a model on a large and generic dataset using...

10.1109/icse43902.2021.00041 article EN 2021-05-01

Using pre-trained models to boost code review automation

Rosalia Tufano Simone Masiero Antonio Mastropaolo Luca Pascarella Denys Poshyvanyk and 1 more

Code review is a practice widely adopted in open source and industrial projects. Given the non-negligible cost of such a process, researchers started investigating the possibility of automating specific code review tasks. We recently proposed Deep Learning (DL) models targeting the automation of two tasks: the first model takes as input a code submitted for review and implements in it changes likely to be recommended by a reviewer; the second takes as input the submitted code and a reviewer comment posted in natural language and automatically implements the change required by the reviewer. While the preliminary...

10.1145/3510003.3510621 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research

Cody Watson Nathan Cooper David Nader Palacio Kevin Moran Denys Poshyvanyk

An increasingly popular set of techniques adopted by software engineering (SE) researchers to automate software development tasks are those rooted in the concept of Deep Learning (DL). The popularity of such techniques largely stems from their automated feature engineering capabilities, which aid in modeling software artifacts. However, due to the rapid pace at which DL techniques have been adopted, it is difficult to distill the current successes, failures, and opportunities of the current research landscape. In an effort to bring clarity to this cross-cutting area of work, from its modern...

10.1145/3485275 article EN ACM Transactions on Software Engineering and Methodology 2022-03-04

Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code

Denys Poshyvanyk Andrian Marcus

The paper addresses the problem of concept location in source code by presenting an approach which combines formal concept analysis (FCA) and latent semantic indexing (LSI). In the proposed approach, LSI is used to map the concepts expressed in queries written by the programmer to relevant parts of the source code, presented as a ranked list of search results. Given the ranked list of source code elements, our approach selects the most relevant attributes from these documents and organizes the results in a concept lattice, generated via FCA. The approach is evaluated in a case study on concept location in the source code of Eclipse, an industrial size integrated...

10.1109/icpc.2007.13 article EN 2007-06-01

Feature location via information retrieval based filtering of a single scenario execution trace

Dapeng Liu Andrian Marcus Denys Poshyvanyk Václav Rajlich

The paper presents a semi-automated technique for feature location in source code. The technique is based on combining information from two different sources: an execution trace, on one hand, and the comments and identifiers from the source code, on the other hand.

10.1145/1321631.1321667 article EN 2007-11-05

Using information retrieval based coupling measures for impact analysis

Denys Poshyvanyk Andrian Marcus Rudolf Ferenć Tibor Gyimóthy

10.1007/s10664-008-9088-2 article EN Empirical Software Engineering 2008-09-19

On the Equivalence of Information Retrieval Methods for Automated Traceability Link Recovery

Rocco Oliveto Malcom Gethers Denys Poshyvanyk Andrea De Lucia

We present an empirical study to statistically analyze the equivalence of several traceability recovery methods based on Information Retrieval (IR) techniques. The analysis is based on Principal Component Analysis and on the analysis of the overlap of the set of candidate links provided by each method. The studied techniques are the Jensen-Shannon (JS) method, Vector Space Model (VSM), Latent Semantic Indexing (LSI), and Latent Dirichlet Allocation (LDA). The results show that while JS, VSM, and LSI are almost equivalent, LDA is able to capture a dimension unique to the set of techniques which...

10.1109/icpc.2010.20 article EN 2010-06-01