NFDI4DS | UHH-SEMS - Publication Details

Michele Tufano

ORCID: 0000-0003-2225-2420

About

Contact & Profiles

A5020154435

Research Areas

Software Engineering Research
Software Reliability and Analysis Research
Software Testing and Debugging Techniques
Advanced Malware Detection Techniques
Software System Performance and Reliability
Topic Modeling
Natural Language Processing Techniques
Software Engineering Techniques and Practices
Web Data Mining and Analysis
Open Source Software Innovations
Advanced Software Engineering Methodologies
Scientific Computing and Data Management
Machine Learning and Data Classification
Usability and User Interface Design
Autophagy in Disease and Therapy
Web Application Security Vulnerabilities
Lysosomal Storage Disorders Research
Multimedia Communication and Technology
Text and Document Classification Technologies
Neurological disorders and treatments
Interactive and Immersive Displays
Anomaly Detection Techniques and Applications
Cardiac Ischemia and Reperfusion
Parkinson's Disease Mechanisms and Treatments
Mitochondrial Function and Pathology

University of Naples Federico II
2024-2025

Microsoft (United States)
2015-2023

William & Mary
2015-2022

Microsoft Research (United Kingdom)
2020-2022

Williams (United States)
2015-2022

Microsoft (Germany)
2021

Telethon Institute Of Genetics And Medicine
2020-2021

Telethon Foundation
2020

University of Illinois Chicago
2020

Deep learning code fragments for code clone detection

OPENALEX - Publications

Martin White, Michele Tufano, Christopher Vendome, Denys Poshyvanyk

Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns at...

10.1145/2970276.2970326 article EN 2016-08-25

SEQUENCER: Sequence-to-Sequence Learning for End-to-End Program Repair

Zimin Chen, Steve Kommrusch, Michele Tufano, Louis-Noël Pouchet, Denys Poshyvanyk, and 1 more

This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning. We devise, implement, and evaluate a system, called SequenceR, for fixing bugs based on sequence-to-sequence learning on source code. This approach uses the copy mechanism to overcome the unlimited vocabulary problem that occurs with big code. Our system is data-driven; we train it on 35,578 samples, carefully curated from commits to open-source repositories. We evaluate it on 4,711 independent real bug fixes, as well as on the Defects4J benchmark used in program repair research. SequenceR is able...

10.1109/tse.2019.2940179 article EN IEEE Transactions on Software Engineering 2019-09-11

An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and 1 more

Millions of open source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub in order to extract meaningful examples of such bug-fixes. Next, we abstract...

10.1145/3340544 article EN ACM Transactions on Software Engineering and Methodology 2019-09-02

When and Why Your Code Starts to Smell Bad (and Whether the Smells Go Away)

Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, and 2 more

Technical debt is a metaphor introduced by Cunningham to indicate "not quite right code which we postpone making it right". One noticeable symptom of technical debt is represented by code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, what is their survivability, and how they...

10.1109/tse.2017.2653105 article EN publisher-specific-oa IEEE Transactions on Software Engineering 2017-01-17

When and Why Your Code Starts to Smell Bad

Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, and 2 more

In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200...

10.1109/icse.2015.59 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01

On Learning Meaningful Code Changes Via Neural Machine Translation

Michele Tufano, Jevgenija Pantiuchina, Cody Watson, Gabriele Bavota, Denys Poshyvanyk

Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments, and detecting vulnerabilities, among others. One of the long lasting dreams of applying DL to source code is the possibility to automate non-trivial coding activities. While some steps in this direction have been taken (e.g., learning how to fix bugs), there is still a glaring lack of empirical evidence on the types of code changes that can be...

10.1109/icse.2019.00021 article EN 2019-05-01

InferFix: End-to-End Program Repair with LLMs

Matthew Jin, Syed Shahriar, Michele Tufano, Xin Shi, Shuai Lu, and 2 more

The software development life cycle is profoundly influenced by bugs; their introduction, identification, and eventual resolution account for a significant portion of software development cost. This has motivated software engineering researchers and practitioners to propose different approaches for automating the identification and repair of software defects.

10.1145/3611643.3613892 article EN 2023-11-30

When and why your code starts to smell bad

Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, and 2 more

In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200...

10.5555/2818754.2818805 article EN International Conference on Software Engineering 2015-05-16

An empirical investigation into the nature of test smells

Michele Tufano, Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and 2 more

Test smells have been defined as poorly designed tests and, as reported by recent empirical studies, their presence may negatively affect comprehension and maintenance of test suites. Despite this, there are no available automated tools to support identification and repair of test smells. In this paper, we firstly investigate developers' perception of test smells in a study with 19 participants. The results show that developers generally do not recognize (potentially harmful) test smells, highlighting that automated tools for identifying such...

10.1145/2970276.2970340 article EN 2016-08-25

Deep learning similarities from different representations of source code

Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and 1 more

Assessing the similarity between code components plays a pivotal role in a number of Software Engineering (SE) tasks, such as clone detection, impact analysis, refactoring, etc. Code similarity is generally measured by relying on manually defined or hand-crafted features, e.g., by analyzing the overlap among identifiers or comparing the Abstract Syntax Trees of two code components. These features represent a best guess at what SE researchers can utilize to exploit and reliably assess code similarity for a given task. Recent work has shown, when...

10.1145/3196398.3196431 article EN 2018-05-28

An empirical investigation into learning bug-fixing patches in the wild via neural machine translation

Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and 1 more

Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. We mine millions of bug-fixes from the change histories of GitHub repositories to extract meaningful examples of such bug-fixes. Then, we abstract the buggy and...

10.1145/3238147.3240732 article EN 2018-08-20

On learning meaningful assert statements for unit test cases

Cody Watson, Michele Tufano, Kevin Moran, Gabriele Bavota, Denys Poshyvanyk

Software testing is an essential part of the software lifecycle and requires a substantial amount of time and effort. It has been estimated that software developers spend close to 50% of their time on testing the code they write. For these reasons, a long standing goal within the research community is to (partially) automate software testing. While several techniques and tools have been proposed to automatically generate test methods, recent work has criticized the quality and usefulness of the assert statements they generate. Therefore, we employ a Neural Machine Translation (NMT)...

10.1145/3377811.3380429 preprint EN 2020-06-27

Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities

Martin White, Michele Tufano, Matías Martínez, Martin Monperrus, Denys Poshyvanyk

In the field of automated program repair, the redundancy assumption claims large programs contain the seeds of their own repair. However, most redundancy-based program repair techniques do not reason about the repair ingredients: the code that is reused to craft a patch. We aim to reason about the repair ingredients by using code similarities to prioritize and transform statements in a codebase for patch generation. Our approach, DeepRepair, relies on deep learning to reason about code similarities. Code fragments at well-defined levels of granularity in a codebase can be sorted according...

10.1109/saner.2019.8668043 preprint EN 2019-02-01

Towards Automating Code Review Activities

Rosalia Tufano, Luca Pascarella, Michele Tufano, Denys Poshyvanyk, Gabriele Bavota

Code reviews are popular in both industrial and open source projects. The benefits of code reviews are widely recognized and include better code quality and lower likelihood of introducing bugs. However, since code review is a manual activity, it comes at the cost of spending developers' time on reviewing their teammates' code. Our goal is to make the first step towards partially automating the code review process, thus possibly reducing the manual costs associated with it. We focus...

10.1109/icse43902.2021.00027 article EN 2021-05-01

Generating accurate assert statements for unit test cases using pretrained transformers

Michele Tufano, Dawn Drain, A. Svyatkovskiy, Neel Sundaresan

Unit testing represents the foundational basis of the software testing pyramid, beneath integration and end-to-end testing. Automated software testing researchers have proposed a variety of techniques to assist developers in this time-consuming task.

10.1145/3524481.3527220 preprint EN 2022-05-17

Oxidative Metabolism in Brain Ischemia and Preconditioning: Two Sides of the Same Coin

Elena D’Apolito, Maria Josè Sisalli, Michele Tufano, Lucio Annunziato, Antonella Scorziello

Brain ischemia is one of the major causes of chronic disability and death worldwide. It is related to insufficient blood supply to cerebral tissue, which induces irreversible or reversible intracellular effects depending on the time and intensity of the ischemic event. Indeed, neuronal function may be restored in some conditions, such as transient ischemic attack (TIA), which may be responsible for protecting against a subsequent lethal ischemic insult. It is well known that the brain requires high levels of oxygen and glucose to ensure cellular metabolism and energy...

10.3390/antiox13050547 article EN cc-by Antioxidants 2024-04-29

Enabling mutation testing for Android apps

Mario Linares‐Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, and 3 more

Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires "traditional" operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262...

10.1145/3106237.3106275 preprint EN 2017-08-02

There and back again: Can you compile that snapshot?

Michele Tufano, Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and 2 more

A broken snapshot represents a snapshot from a project's change history that cannot be compiled. Broken snapshots can have significant implications for researchers, as they could hinder any analysis of the past project history that requires code to be compiled. Noticeably, while some broken snapshots may be observable in change history repositories (e.g., no longer available dependencies), some of them may not necessarily happen during the actual development. In this paper, we systematically study the compilability of 219 395 snapshots belonging to 100 Java projects from the Apache Software Foundation, all...

10.1002/smr.1838 article EN Journal of Software Evolution and Process 2016-12-20

Landfill: An Open Dataset of Code Smells with Public Evaluation

Fabio Palomba, Dario Di Nucci, Michele Tufano, Gabriele Bavota, Rocco Oliveto, and 2 more

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension and possibly increase change- and fault-proneness of source code. Several techniques have been proposed in the literature for detecting code smells. These techniques are generally evaluated by comparing their accuracy on a set of detected candidate code smells against a manually-produced oracle. Unfortunately, such comprehensive sets of annotated code smells are not available in the literature, with only few exceptions. In this paper we contribute (i) a dataset of 243 instances...

10.1109/msr.2015.69 article EN 2015-05-01

Learning How to Mutate Source Code from Bug-Fixes

Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and 1 more

Mutation testing has been widely accepted as an approach to guide test case generation or to assess the effectiveness of test suites. Empirical studies have shown that mutants are representative of real faults; yet they also indicated a clear need for better, possibly customized, mutation operators and strategies. While methods to devise domain-specific or general-purpose mutation operators from real faults exist, they are effort- and error-prone, and do not help a tester to decide whether and how to mutate a given source code element. We propose a novel...

10.1109/icsme.2019.00046 article EN 2019-09-01

GraphCodeBERT: Pre-training Code Representations with Data Flow

Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, and 13 more

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code snippet as a sequence of tokens, while ignoring the inherent structure of code, which provides crucial code semantics and would enhance the code understanding process. We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code. Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in...

10.48550/arxiv.2009.08366 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The E3-ligase Siah2 activates mitochondrial quality control in neurons to maintain energy metabolism during ischemic brain tolerance

Maria Josè Sisalli, Elena D’Apolito, Ornella Cuomo, Giovanna Lombardi, Michele Tufano, and 2 more

Mitochondrial quality control is crucial for the homeostasis of the mitochondrial network. The balance between mitophagy and biogenesis is needed to reduce cerebral ischemia-induced cell death. Ischemic preconditioning (IPC) represents an adaptation mechanism of the CNS that increases tolerance to lethal cerebral ischemia. It has been demonstrated that hypoxia-induced Seven in absentia Homolog 2 (Siah2) E3-ligase activation influences mitochondrial dynamics, promoting the degradation of mitochondrial proteins. Therefore, in the present study, we...

10.1038/s41419-025-07339-z article EN cc-by Cell Death and Disease 2025-01-28
