NFDI4DS | UHH-SEMS - Publication Details

Expectations, outcomes, and challenges of modern code review

OPENALEX - Publications

Alberto Bacchelli Christian Bird

Code review is a common software engineering practice employed both in open source and industrial contexts. Review today is less formal and more “lightweight” than the code inspections performed and studied in the 70s and 80s. We empirically explore the motivations, challenges, and outcomes of tool-based code reviews. We observed, interviewed, and surveyed developers and managers and manually classified hundreds of review comments across diverse teams at Microsoft. Our study reveals that while finding defects remains the main motivation for review,...

10.1109/icse.2013.6606617 article EN 2013 35th International Conference on Software Engineering (ICSE) 2013-05-01

Expectations, outcomes, and challenges of modern code review

Alberto Bacchelli Christian Bird

Code review is a common software engineering practice employed both in open source and industrial contexts. Review today is less formal and more “lightweight” than the code inspections performed and studied in the 70s and 80s. We empirically explore the motivations, challenges, and outcomes of tool-based code reviews. We observed, interviewed, and surveyed developers and managers and manually classified hundreds of review comments across diverse teams at Microsoft. Our study reveals that while finding defects remains the main motivation for review,...

10.5555/2486788.2486882 article EN International Conference on Software Engineering 2013-05-18

Work practices and challenges in pull-based development

Georgios Gousios Margaret‐Anne Storey Alberto Bacchelli

The pull-based development model is an emerging way of contributing to distributed software projects that is gaining enormous popularity within the open source software (OSS) world. Previous work has examined this model by focusing on projects and their owners; we complement it by examining the work practices of project contributors and the challenges they face.

10.1145/2884781.2884826 article EN Proceedings of the 38th International Conference on Software Engineering 2016-05-13

PyDriller: Python framework for mining software repositories

Davide Spadini Maurício Aniche Alberto Bacchelli

Software repositories contain historical and valuable information about the overall development of software systems. Mining software repositories (MSR) is nowadays considered one of the most interesting growing fields within software engineering. MSR focuses on extracting and analyzing data available in software repositories to uncover interesting, useful, and actionable information about the system. Even though MSR plays an important role in software engineering research, few tools have been created and made public to support developers in extracting information from a Git repository. In this paper, we present PyDriller, a...

10.1145/3236024.3264598 article EN 2018-10-26

Modern code reviews in open-source projects: which problems do they fix?

Moritz Beller Alberto Bacchelli Andy Zaidman Elmar Juergens

Code review is the manual assessment of source code by humans, mainly intended to identify defects and quality problems. Modern Code Review (MCR), a lightweight variant of the code inspections investigated since the 1970s, prevails today both in industry and open-source software (OSS) systems. The objective of this paper is to increase our understanding of the practical benefits that the MCR process produces on reviewed source code. To that end, we empirically explore the problems fixed through MCR in OSS systems. We manually classified over 1,400 changes taking...

10.1145/2597073.2597082 article EN 2014-05-20

Modern code review

Caitlin Sadowski Emma Söderberg Luke Church Michal Sipko Alberto Bacchelli

Employing lightweight, tool-based code review of code changes (aka modern code review) has become the norm for a wide variety of open-source and industrial systems. In this paper, we make an exploratory investigation of modern code review at Google. Google introduced code review early on and evolved it over the years; our study sheds light on why Google introduced this practice and analyzes its current status, after the process has been refined through decades of code changes and millions of code reviews. By means of 12 interviews, a survey with 44 respondents, and the analysis of review logs for 9 million reviewed changes, we investigate...

10.1145/3183519.3183525 article EN 2018-05-27

On the "naturalness" of buggy code

Baishakhi Ray Vincent J. Hellendoorn Saheel Godhane Zhaopeng Tu Alberto Bacchelli and 1 more

Real software, the kind working programmers produce by the kLOC to solve real-world problems, tends to be "natural", like speech or natural language; it tends to be highly repetitive and predictable. Researchers have captured this naturalness of software through statistical models and used them to good effect in suggestion engines, porting tools, coding standards checkers, and idiom miners. This suggests that code that appears improbable, or surprising, to a good statistical language model is "unnatural" in some sense, and thus possibly suspicious. In...

10.1145/2884781.2884848 article EN Proceedings of the 38th International Conference on Software Engineering 2016-05-13

UI Dark Patterns and Where to Find Them

Linda Di Geronimo Larissa Braz Enrico Fregnan Fabio Palomba Alberto Bacchelli

A Dark Pattern (DP) is an interface maliciously crafted to deceive users into performing actions they did not mean to do. In this work, we analyze Dark Patterns in 240 popular mobile apps and conduct an online experiment with 589 users on how they perceive Dark Patterns in such apps. The results of the analysis show that 95% of the analyzed apps contain one or more forms of Dark Patterns and, on average, popular applications include at least seven different types of deceiving interfaces. The online experiment shows that most users do not recognize Dark Patterns, but can perform better in recognizing malicious...

10.1145/3313831.3376600 article EN 2020-04-21

Linking e-mails and source code artifacts

Alberto Bacchelli Michele Lanza Romain Robbes

E-mails concerning the development issues of a system constitute an important source of information about high-level design decisions, low-level implementation concerns, and the social structure of developers.

10.1145/1806799.1806855 article EN 2010-05-01

Communication in open source software development mailing lists

Anja Guzzi Alberto Bacchelli Michele Lanza Martin Pinzger Arie van Deursen

Open source software (OSS) development teams use electronic means, such as emails, instant messaging, or forums, to conduct open and public discussions. Researchers investigated mailing lists considering them as a hub for project communication. Prior work focused on specific aspects of emails, for example the handling of patches, traceability concerns, or social networks. This led to insights pertaining to the investigated aspects, but not to a comprehensive view of what developers communicate about. Our objective is to increase the understanding of development mailing lists communication. We...

10.1109/msr.2013.6624039 article EN 2013-05-01

On the Impact of Design Flaws on Software Defects

Marco D’Ambros Alberto Bacchelli Michele Lanza

The presence of design flaws in a software system has a negative impact on the quality of the software, as they indicate violations of design practices and principles, which make a software system harder to understand, maintain, and evolve. Software defects are tangible effects of poor software quality. In this paper we study the relationship between software defects and a number of design flaws. We found that, while some design flaws are more frequent, none of them can be considered more harmful with respect to software defects. We also analyzed the correlation between the introduction of new flaws and the generation of defects.

10.1109/qsic.2010.58 article EN 2010-07-01

Improving Low Quality Stack Overflow Post Detection

Luca Ponzanelli Andrea Mocci Alberto Bacchelli Michele Lanza David A. Fullerton

Stack Overflow is a popular questions and answers (Q&A) website among software developers. It counts more than two million users who actively contribute by asking and answering thousands of questions daily. Identifying and reviewing low quality posts preserves the quality of the site's contents and is crucial to maintaining a good user experience. In Stack Overflow, the identification of poor quality posts is performed manually by selected users. The system also uses an automated identification system based on textual features. Low quality posts automatically enter a review queue maintained by experienced users. We...

10.1109/icsme.2014.90 article EN 2014-09-01

On the Relation of Test Smells to Software Code Quality

Davide Spadini Fabio Palomba Andy Zaidman Magiel Bruntink Alberto Bacchelli

Test smells are sub-optimal design choices in the implementation of test code. As reported by recent studies, their presence might not only negatively affect the comprehension of test suites but can also lead to test cases being less effective in finding bugs in production code. Although significant steps have been made toward understanding test smells, there is still a notable absence of studies assessing their association with software quality. In this paper, we investigate the relationship between the presence of test smells and the change- and defect-proneness of test code, as well as the defect-proneness of the tested...

10.1109/icsme.2018.00010 article EN 2018-09-01

Fine-grained just-in-time defect prediction

Luca Pascarella Fabio Palomba Alberto Bacchelli

10.1016/j.jss.2018.12.001 article EN Journal of Systems and Software 2018-12-03

Information Needs in Contemporary Code Review

Luca Pascarella Davide Spadini Fabio Palomba Magiel Bruntink Alberto Bacchelli

Contemporary code review is a widespread practice used by software engineers to maintain high software quality and share project knowledge. However, conducting proper code review takes time and developers often have limited time for review. In this paper, we aim at investigating the information that reviewers need to conduct a proper code review, to better understand this process and how research and tool support can make developers become more effective and efficient reviewers. Previous work has provided evidence that a successful code review process is one in which reviewers and authors actively participate...

10.1145/3274404 article EN Proceedings of the ACM on Human-Computer Interaction 2018-11-01

Untangling fine-grained code changes

Martín Dias Alberto Bacchelli Georgios Gousios Damien Cassou Stéphane Ducasse

After working for some time, developers commit their code changes to a version control system. When doing so, they often bundle unrelated changes (e.g., bug fix and refactoring) in a single commit, thus creating a so-called tangled commit. Sharing tangled commits is problematic because it makes review, reversion, and integration of these commits harder and historical analyses of the project less reliable. Researchers have worked at untangling existing commits, i.e., finding which part of a commit relates to which task. In this paper, we contribute...

10.1109/saner.2015.7081844 preprint EN 2015-03-01

Seahawk: Stack Overflow in the IDE

Luca Ponzanelli Alberto Bacchelli Michele Lanza

Services, such as Stack Overflow, offer a web platform to programmers for discussing technical issues, in the form of Questions and Answers (Q&A). Since Q&A services store the discussions, the generated “crowd knowledge” can be accessed and consumed by a large audience for a long time. Nevertheless, Q&A services are detached from the development environments used by programmers: Developers have to tap into this crowd knowledge through web browsers and cannot smoothly integrate it into their workflow. This situation hinders part of the benefits of Q&A services....

10.5555/2486788.2486988 article EN International Conference on Software Engineering 2013-05-18

Seahawk: Stack Overflow in the IDE

Luca Ponzanelli Alberto Bacchelli Michele Lanza

Services, such as Stack Overflow, offer a web platform to programmers for discussing technical issues, in the form of Questions and Answers (Q&A). Since Q&A services store the discussions, the generated "crowd knowledge" can be accessed and consumed by a large audience for a long time. Nevertheless, Q&A services are detached from the development environments used by programmers: Developers have to tap into this crowd knowledge through web browsers and cannot smoothly integrate it into their workflow. This situation hinders part of the benefits of Q&A services....

10.1109/icse.2013.6606701 article EN 2013 35th International Conference on Software Engineering (ICSE) 2013-05-01

Classifying Code Comments in Java Open-Source Software Systems

Luca Pascarella Alberto Bacchelli

Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments; subsequently, we investigate how often each category occurs by manually classifying more than 2,000 code comments from the aforementioned projects....

10.1109/msr.2017.63 article EN 2017-05-01

Leveraging Crowd Knowledge for Software Comprehension and Development

Luca Ponzanelli Alberto Bacchelli Michele Lanza

Question and Answer (Q&A) services, such as Stack Overflow, rely on a community of programmers who post questions, provide and rate answers, to create what is termed "crowd knowledge". As a consequence, these services archive voluminous and potentially useful information to help developers solve programming-specific issues. Programmers tap into this crowd knowledge through web browsers. This requires them to step out of their integrated development environments (IDE), formulate a query, inspect the returned...

10.1109/csmr.2013.16 article EN 2013-03-01

Understanding and Classifying the Quality of Technical Forum Questions

Luca Ponzanelli Andrea Mocci Alberto Bacchelli Michele Lanza

Technical questions and answers (Q&A) services have become a valuable resource for developers. A prominent example of a technical Q&A website is StackOverflow (SO), which relies on a growing community of more than two million users who actively contribute by asking questions and providing answers. To maintain the value of this resource, poor quality questions, among the more than 6,000 asked daily, have to be filtered out. Currently, poor quality questions are manually identified and reviewed by selected users in SO, which costs considerable time and effort. Automating the process would...

10.1109/qsic.2014.27 article EN 2014-10-01

When Code Completion Fails: A Case Study on Real-World Completions

Vincent J. Hellendoorn Sebastian Proksch Harald Gall Alberto Bacchelli

Code completion is commonly used by software developers and is integrated into all major IDEs. Good completion tools can not only save time and effort but may also help avoid incorrect API usage. Many proposed completion tools have shown promising results on synthetic benchmarks, but these benchmarks make no claims about the realism of the completions they test. This lack of grounding in real-world data could hinder our scientific understanding of developer needs and of the efficacy of completion models. This paper presents a case study on 15,000 code completions that were applied...

10.1109/icse.2019.00101 article EN 2019-05-01

Does single blind peer review hinder newcomers?

Marco Seeber Alberto Bacchelli

Several fields of research are characterized by the coexistence of two different peer review modes to select quality contributions for scientific venues, namely double blind (DBR) and single blind (SBR) peer review. In the first, the identities of both authors and reviewers are not known to each other, whereas in the latter the authors' identities are visible to the reviewers since the start of the review process. The need to adopt either one of these modes has been the object of scholarly debate, which has mostly focused on issues of fairness. Past work reported that SBR is potentially associated with biases...

10.1007/s11192-017-2264-7 article EN cc-by Scientometrics 2017-03-03

Content classification of development emails

Alberto Bacchelli Tommaso Dal Sasso Marco D’Ambros Michele Lanza

Emails related to the development of a software system contain information about design choices and issues encountered during the development process. Exploiting the knowledge embedded in emails with automatic tools is challenging, due to the unstructured, noisy, and mixed language nature of this communication medium. Natural language text is often not well-formed and is interleaved with languages with other syntaxes, such as code or stack traces. We present an approach to classify email content at line level. Our technique classifies email lines in five categories...

10.1109/icse.2012.6227177 article EN 2012 34th International Conference on Software Engineering (ICSE) 2012-06-01

Content classification of development emails

Alberto Bacchelli Tommaso Dal Sasso Marco D’Ambros Michele Lanza

Emails related to the development of a software system contain information about design choices and issues encountered during the development process. Exploiting the knowledge embedded in emails with automatic tools is challenging, due to the unstructured, noisy, and mixed language nature of this communication medium. Natural language text is often not well-formed and is interleaved with languages with other syntaxes, such as code or stack traces. We present an approach to classify email content at line level. Our technique classifies email lines in five categories...

10.5555/2337223.2337268 article EN International Conference on Software Engineering 2012-06-02