NFDI4DS | UHH-SEMS - Publication Details

Is ChatGPT the Ultimate Programming Assistant -- How far is it?

OPENALEX - Publications

Haoye Tian Weiqi Lu Tsz On Li Xunzhu Tang Shing-Chi Cheung and 2 more

Recently, the ChatGPT LLM has received great attention: it can be used as a bot for discussing source code, prompting it to suggest changes, provide descriptions or even generate code. Typical demonstrations generally focus on existing benchmarks, which may have been used in model training (i.e., data leakage). To assess the feasibility of using an LLM as a useful assistant for programmers, we must assess its realistic capabilities on unseen problems as well as on various tasks. In this paper, we present an empirical study of ChatGPT's...

10.48550/arxiv.2304.11938 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Evaluating representation learning of code changes for predicting patch correctness in program repair

Haoye Tian Kui Liu Abdoul Kader Kaboré Anil Koyuncu Li Li and 2 more

A large body of the literature on automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explores research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations in order to learn deep features that may encode the properties of patch correctness. Our empirical work mainly...

10.1145/3324884.3416532 article EN 2020-12-21

Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting

Tsz-On Li Wenxi Zong Yibo Wang Haoye Tian Ying Wang and 2 more

Automated detection of software failures is an important but challenging software engineering task. It involves finding, in a vast search space, the failure-inducing test cases that contain an input triggering the software fault and an oracle asserting the incorrect execution. We are motivated to study how far this outstanding challenge can be solved by recent advances in large language models (LLMs) such as ChatGPT. However, our study reveals that ChatGPT has a relatively low success rate (28.8%) in finding correct failure-inducing test cases for buggy programs. A possible...

10.1109/ase56229.2023.00089 article EN 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2023-09-11

You Don’t Have to Say Where to Edit! jLED – Joint Learning to Localize and Edit Source Code

Weiguo Pian Yinghua Li Haoye Tian Tiezhu Sun Yewei Song and 4 more

Learning to edit code automatically is becoming more and more feasible. Thanks to recent advances in Neural Machine Translation (NMT), various case studies are being investigated where patches are automatically produced and assessed either automatically (using test suites) or by developers themselves. An appealing setting remains when the developer must provide a natural language input of the requirement for the code change. A proof of concept in the literature showed that it is indeed feasible to translate these natural language requirements into code changes. An advancement, MODIT [8],...

10.1145/3712187 article EN ACM Transactions on Software Engineering and Methodology 2025-01-13

Predicting Patch Correctness Based on the Similarity of Failing Test Cases

Haoye Tian Yinghua Li Weiguo Pian Abdoul Kader Kaboré Kui Liu and 3 more

How do we know a generated patch is correct? This is a key challenging question that automated program repair (APR) systems struggle to address given the incompleteness of available test suites. Our intuition is that we can triage correct patches by checking whether each generated patch implements code changes (i.e., behavior) that are relevant to the bug it addresses. Such a bug is commonly specified by a failing test case. Towards predicting patch correctness in APR, we propose a novel yet simple hypothesis on how the link between the patch behavior and failing test specifications can be...

10.1145/3511096 article EN ACM Transactions on Software Engineering and Methodology 2022-04-20

App review driven collaborative bug finding

Xunzhu Tang Haoye Tian Pingfan Kong Saad Ezzini Kui Liu and 3 more

Software development teams generally welcome any effort to expose bugs in their code base. In this work, we build on the hypothesis that mobile apps from the same category (e.g., two web browser apps) may be affected by similar bugs in their evolution process. It is therefore possible to transfer the experience of one historical app to quickly find bugs in its new counterparts. This has been referred to as collaborative bug finding in the literature. Our novelty is that we guide the bug finding process by considering that existing bugs have been hinted at within app reviews....

10.1007/s10664-024-10489-x article EN cc-by Empirical Software Engineering 2024-07-26

Did the Roll-Out of Community Notes Reduce Engagement With Misinformation on X/Twitter?

Yuwei Chuai Haoye Tian Nicolas Pröllochs Gabriele Lenzini

Developing interventions that successfully reduce engagement with misinformation on social media is challenging. One intervention that has recently gained great attention is X/Twitter's Community Notes (previously known as "Birdwatch"). Community Notes is a crowdsourced fact-checking approach that allows users to write textual notes to inform others about potentially misleading posts on X/Twitter. Yet, empirical evidence regarding its effectiveness in reducing engagement with misinformation is missing. In this paper, we perform a large-scale empirical study to analyze whether...

10.1145/3686967 article EN cc-by Proceedings of the ACM on Human-Computer Interaction 2024-11-07

The Best of Both Worlds: Combining Learned Embeddings with Engineered Features for Accurate Prediction of Correct Patches

Haoye Tian Kui Liu Yinghua Li Abdoul Kader Kaboré Anil Koyuncu and 5 more

A large body of the literature on automated program repair develops approaches where patches are automatically generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explores research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations in order to learn deep features that may encode the properties of patch correctness. Our...

10.1145/3576039 article EN ACM Transactions on Software Engineering and Methodology 2022-12-15

Collaborative Agents for Software Engineering

Daniel Tang Zhenghan Chen Kisub Kim Yewei Song Haoye Tian and 3 more

Code review is a heavily collaborative process, which aims at ensuring the overall quality and reliability of software. While it provides massive benefits, the implementation of code review in an organization faces several challenges that make its automation appealing. Automated code review tools have been around for a while and are now improving thanks to the adoption of novel AI models, which can learn about standard practices and systematically check that the reviewed code adheres to them. Unfortunately, existing methods fall short: they often...

10.48550/arxiv.2402.02172 preprint EN arXiv (Cornell University) 2024-02-03

A Music Recommendation System Based on logistic regression and eXtreme Gradient Boosting

Haoye Tian Haini Cai Junhao Wen Shun Li Yingqiao Li

With the rapid growth of music industry data, it is difficult for people to find their favorite songs in the music library. Therefore, people urgently need an efficient music recommendation system to help them retrieve music. Traditional collaborative filtering algorithms are applied in the field of music recommendation. However, collaborative filtering does not handle data sparsity problems very well when new items are introduced. To solve this problem, some people use the logistic regression method as a classifier to predict the user's music preferences and recommend songs. Logistic...

10.1109/ijcnn.2019.8852094 article EN 2019 International Joint Conference on Neural Networks (IJCNN) 2019-07-01

Where were the repair ingredients for Defects4j bugs?

Deheng Yang Kui Liu Dongsun Kim Anil Koyuncu Kisub Kim and 5 more

10.1007/s10664-021-10003-7 article EN Empirical Software Engineering 2021-09-10

The Roll-Out of Community Notes Did Not Reduce Engagement With Misinformation on Twitter

Yuwei Chuai Haoye Tian Nicolas Pröllochs Gabriele Lenzini

Developing interventions that successfully reduce engagement with misinformation on social media is challenging. One intervention that has recently gained great attention is X/Twitter's Community Notes (previously known as "Birdwatch"). Community Notes is a crowdsourced fact-checking approach that allows users to write textual notes to inform others about potentially misleading posts on X/Twitter. Yet, empirical evidence regarding its effectiveness in reducing engagement with misinformation is missing. In this paper, we perform a large-scale empirical study to analyze whether...

10.48550/arxiv.2307.07960 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Is this Change the Answer to that Problem?

Haoye Tian Xunzhu Tang Andrew Habib Shangwen Wang Kui Liu and 3 more

In this work, we propose a novel perspective on the problem of patch correctness assessment: a correct patch implements changes that "answer" a problem posed by buggy behaviour. Concretely, we turn patch correctness assessment into a Question Answering problem. To tackle this problem, our intuition is that natural language processing can provide the necessary representations and models for assessing the semantic correlation between a bug (question) and a patch (answer). Specifically, we consider as inputs the bug reports as well as the natural language description of the generated patches. Our approach,...

10.1145/3551349.3556914 preprint EN 2022-10-10

Learning to Represent Patches

Xunzhu Tang Haoye Tian Zhenghan Chen Weiguo Pian Saad Ezzini and 4 more

We propose Patcherizer, a novel patch representation methodology that combines context and structure intention features to capture the semantic changes in Abstract Syntax Trees (ASTs) and the surrounding context of code changes. Utilizing graph convolutional neural networks and transformers, Patcherizer effectively captures the underlying intentions of patches, outperforming state-of-the-art representations with significant improvements in BLEU, ROUGE-L, and METEOR metrics for generating patch descriptions.

10.1145/3639478.3643521 article EN 2024-04-14

Large-scale, Independent and Comprehensive study of the power of LLMs for test case generation

Wendkûuni C. Ouédraogo Kader Kaboré Haoye Tian Yewei Song Anil Koyuncu and 3 more

Unit testing, crucial for identifying bugs in code modules like classes and methods, is often neglected by developers due to time constraints. Automated test generation techniques have emerged to address this, but they lack readability and require developer intervention. Large Language Models (LLMs), like GPT and Mistral, show promise in software engineering, including in test generation. However, their effectiveness remains unclear. This study conducts the first comprehensive investigation of LLMs, evaluating four LLMs...

10.48550/arxiv.2407.00225 preprint EN arXiv (Cornell University) 2024-06-28

AI-driven Mobile Apps: an Explorative Study

Yinghua Li Xueqi Dang Haoye Tian Tiezhu Sun Zhijie Wang and 3 more

Recent years have witnessed an astonishing explosion in the evolution of mobile applications powered by AI technologies. The rapid growth of AI frameworks enables the transition of AI technologies to mobile devices, significantly prompting the adoption of AI apps (i.e., apps that integrate AI into their functions) among smartphone devices. In this paper, we conduct the most extensive empirical study on 56,682 published AI apps from three perspectives: dataset characteristics, development issues, and user feedback and privacy. To this end, we build...

10.48550/arxiv.2212.01635 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Hyperbolic Code Retrieval: A Novel Approach for Efficient Code Search Using Hyperbolic Space Embeddings

Xunzhu Tang Zhenghan Chen Saad Ezzini Haoye Tian Yewei Song and 2 more

Within the realm of advanced code retrieval, existing methods have primarily relied on intricate matching and attention-based mechanisms. However, these methods often lead to computational and memory inefficiencies, posing a significant challenge to their real-world applicability. To tackle this challenge, we propose a novel approach, Hyperbolic Code QA Matching (HyCoQA). This approach leverages the unique properties of hyperbolic space to express connections between code fragments and their corresponding queries, thereby obviating the necessity...

10.48550/arxiv.2308.15234 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair

Haoye Tian Kui Liu Abdoul Kader Kaboré Anil Koyuncu Li Li and 2 more

A large body of the literature on automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explores research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations to learn deep features that may encode the properties of patch correctness. Our work mainly investigates...

10.48550/arxiv.2008.02944 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Interactive Design of 3D Dynamic Gesture Based on SVM-LSTM Model

Tao Wang Xiaolong Cai Liping Wang Haoye Tian

Visual hand gesture interaction is one of the main ways of human-computer interaction, and provides users with more interactive degrees of freedom and a more realistic experience. The authors present a hybrid model based on SVM-LSTM, and design a three-dimensional dynamic gesture interaction system. The system uses Leap Motion to capture gesture information, combining SVM's powerful static gesture classification ability with LSTM's ability to process variable-length time series, enabling real-time recognition of user gestures. The method can automatically define the start...

10.4018/ijmhci.2018070104 article EN International Journal of Mobile Human Computer Interaction 2018-06-11

ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts

Lyuye Zhang Kaixuan Li Kairan Sun Daoyuan Wu Ye Liu and 2 more

Smart contracts are susceptible to various security issues, among which access control (AC) vulnerabilities are particularly critical. While existing research has proposed multiple detection tools, the automatic and appropriate repair of AC vulnerabilities in smart contracts remains a challenge. Unlike vulnerability types commonly supported by existing repair tools, such as reentrancy, which are usually fixed by template-based approaches, the main obstacle for AC lies in identifying the appropriate roles or permissions amid a long list of non-AC-related source code to generate proper patch...

10.48550/arxiv.2403.06838 preprint EN arXiv (Cornell University) 2024-03-11

A Cross-Project Defect Prediction Approach Based on Code Semantics and Cross-Version Structural Information

Yifan Zou Huiqiang Wang Hongwu Lv Shuai Zhao Haoye Tian

Context: Cross-project defect prediction (CPDP), due to its potential for adoption by industry in realistic scenarios, has gained significant attention from the research community. Currently, existing CPDP studies use static statistical features designed by experts, which might not capture the semantic and structural aspects of software, resulting in low prediction accuracy. Meanwhile, they tend to overlook the valuable iterative information brought about by version updates in mature software projects. Objective: This...

10.1142/s0218194024500165 article EN International Journal of Software Engineering and Knowledge Engineering 2024-04-05

Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs

Boyang Yang Haoye Tian Jiadong Ren Hongyu Zhang Jacques Klein and 3 more

Large language models (LLMs) have demonstrated remarkable capabilities on a broad spectrum of downstream tasks. Within the realm of software engineering, specialized tasks on code, such as program repair, present unique challenges, necessitating fine-tuning to unlock state-of-the-art performance. Fine-tuning approaches proposed in the literature for LLMs on program repair tasks, however, generally overlook the need to reason about the logic behind code changes, beyond syntactic patterns in the data. High-performing experiments...

10.48550/arxiv.2404.12636 preprint EN arXiv (Cornell University) 2024-04-19

Security Vulnerability Detection with Multitask Self-Instructed Fine-Tuning of Large Language Models

Aidan Z. H. Yang Haoye Tian He Ye Ruben Martins Claire Le Goues

Software security vulnerabilities allow attackers to perform malicious activities to disrupt software operations. Recent Transformer-based language models have significantly advanced vulnerability detection, surpassing the capabilities of static-analysis-based deep learning models. However, language models trained solely on code tokens capture neither the explanation of the vulnerability type nor the data flow structure information of code, both of which are crucial for vulnerability detection. We propose a novel technique that integrates multitask...

10.48550/arxiv.2406.05892 preprint EN arXiv (Cornell University) 2024-06-09

CREF: An LLM-based Conversational Software Repair Framework for Programming Tutors

Boyang Yang Haoye Tian Weiguo Pian Haoran Yu Haitao Wang and 3 more

Program repair techniques offer cost-saving benefits for debugging within software development and programming education scenarios. With the proven effectiveness of Large Language Models (LLMs) in code-related tasks, researchers have explored their potential for program repair. However, it is crucial to recognize that existing repair benchmarks may have been influenced by LLM training data, potentially causing data leakage. To evaluate LLMs' realistic repair capabilities, (1) we introduce an extensive, non-crawled...

10.48550/arxiv.2406.13972 preprint EN arXiv (Cornell University) 2024-06-19

An Empirical Study of AI Techniques in Mobile Applications

Yinghua Li Xueqi Dang Haoye Tian Tiezhu Sun Zhi-Jie Wang and 3 more

10.2139/ssrn.4876287 preprint EN 2024-01-01