NFDI4DS | UHH-SEMS - Publication Details

Chenguang Zhu

ORCID: 0000-0002-7343-8279

About

Contact & Profiles

A5054819831

Research Areas

Software Engineering Research
Software Testing and Debugging Techniques
Software Reliability and Analysis Research
Topic Modeling
Software System Performance and Reliability
Advanced Malware Detection Techniques
Scientific Computing and Data Management
Advanced Image and Video Retrieval Techniques
Software Engineering Techniques and Practices
Blockchain Technology Applications and Security
Advanced Software Engineering Methodologies
Natural Language Processing Techniques
Virus-based gene therapy research
Web Data Mining and Analysis
Multimodal Machine Learning Applications
Aquaculture disease management and microbiota
Simulation Techniques and Applications
Bacteriophages and microbial interactions
Microbial infections and disease research
Distributed and Parallel Computing Systems
Artificial Intelligence in Law
Security and Verification in Computing
Algorithms and Data Compression
Machine Learning and Data Classification
Animal Virus Infections Studies

The University of Texas at Austin
2018-2025

Nanjing Agricultural University
2024

Fujitsu (United States)
2022

Microsoft (United States)
2021

University of Toronto
2016-2017

Small Models are Valuable Plug-ins for Large Language Models

OPENALEX - Publications

Canwen Xu Xu Yi‐chong Shuohang Wang Yang Liu Chenguang Zhu and 1 more

10.18653/v1/2024.findings-acl.18 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

A Comprehensive Study of Governance Issues in Decentralized Finance Applications

Wei Ma Chenguang Zhu Ye Liu Xiaofei Xie Yi Li

Decentralized Finance (DeFi) is a prominent application of smart contracts, representing a novel financial paradigm in contrast to centralized finance. While DeFi applications are rapidly emerging on mainstream blockchain platforms, their quality varies greatly, presenting numerous challenges, particularly in terms of governance mechanisms. In this paper, we present a comprehensive study of governance issues in DeFi applications. Initially, we collected 3,165 academic papers and industry reports. After thorough screening,...

10.1145/3717062 article EN other-oa ACM Transactions on Software Engineering and Methodology 2025-02-13

Semantic Slicing of Software Version Histories

Yi Li Chenguang Zhu Julia Rubin Marsha Chechik

Software developers often need to transfer functionality, e.g., a set of commits implementing a new feature or a bug fix, from one branch of a configuration management system to another. That can be a challenging task as the existing configuration management tools lack support for matching high-level, semantic functionality with low-level version histories. The developer thus has to either manually identify the exact set of semantically-related commits implementing the functionality of interest or sequentially port a segment of the change history, “inheriting” additional, unwanted functionality....

10.1109/tse.2017.2664824 article EN IEEE Transactions on Software Engineering 2017-02-07

A Framework for Checking Regression Test Selection Tools

Chenguang Zhu Owolabi Legunsen August Shi Milos Gligoric

Regression test selection (RTS) reduces regression testing costs by re-running only tests that can change behavior due to code changes. Researchers and large software organizations recently developed and adopted several RTS tools to deal with the rapidly growing costs of regression testing. As RTS tools gain adoption, it becomes critical to check that they are correct and efficient. Unfortunately, checking RTS tools currently relies solely on limited tests that RTS tool developers manually write. We present RTSCheck, the first framework for checking RTS tools. RTSCheck feeds...

10.1109/icse.2019.00056 article EN 2019-05-01

Client-Specific Upgrade Compatibility Checking via Knowledge-Guided Discovery

Chenguang Zhu Mengshi Zhang Xiuheng Wu Xiufeng Xu Yi Li

Modern software systems are complex, and they heavily rely on external libraries developed by different teams and organizations. Such systems suffer from higher instability due to incompatibility issues caused by library upgrades. In this article, we address the problem by investigating the impact of a library upgrade on the behaviors of its clients. We developed CompCheck, an automated upgrade compatibility checking framework that generates incompatibility-revealing tests based on previous examples. CompCheck first establishes an offline knowledge base by mining...

10.1145/3582569 article EN ACM Transactions on Software Engineering and Methodology 2023-02-01

Towards refactoring-aware regression test selection

Kaiyuan Wang Chenguang Zhu Ahmet Çelik Jongwook Kim Don Batory and 1 more

Regression testing checks that recent project changes do not break previously working functionality. Although important, regression testing is costly when changes are frequent. Regression test selection (RTS) optimizes regression testing by running only tests whose results might be affected by a change. Traditionally, RTS collects dependencies (e.g., on files) for each test and skips the tests, at the new revision, whose dependencies did not change. Existing RTS techniques do not differentiate behavior-preserving transformations (i.e., refactorings) from other code changes. As a result, tests are run...

10.1145/3180155.3180254 article EN Proceedings of the 40th International Conference on Software Engineering 2018-05-27

A Florfenicol-Resistant Plasmid Shuttling Between Actinobacillus pleuropneumoniae and Glaesserella parasuis

Chenguang Zhu Jinshuang Cai Jiahui An Baoge Zhang Yufeng Li

Porcine contagious pleuropneumonia, caused by

10.1089/mdr.2023.0127 article EN Microbial Drug Resistance 2024-02-16

Precise semantic history slicing through dynamic delta refinement

Yi Li Chenguang Zhu Julia Rubin Marsha Chechik

Semantic history slicing solves the problem of extracting changes related to a particular high-level functionality from software version histories. State-of-the-art techniques combine static program analysis and dynamic execution tracing to infer an over-approximated set of changes that can preserve the functional behaviors captured by a test suite. However, due to the conservative nature of such techniques, the sliced histories may contain irrelevant changes. In this paper, we propose a divide-and-conquer-style partitioning...

10.1145/2970276.2970336 article EN 2016-08-25

Repairing order-dependent flaky tests via test generation

Cheng‐Peng Li Chenguang Zhu Wenxi Wang August Shi

Flaky tests are tests that pass or fail nondeterministically on the same version of code. These tests can mislead developers concerning the quality of their code changes during regression testing. A common kind of flaky tests are order-dependent tests, whose pass/fail outcomes depend on the test order in which they are run. Such tests have different outcomes because other tests running before them pollute shared state. Prior work has proposed repairing order-dependent tests by searching for existing tests, known as "cleaners", that reset the shared state, allowing the order-dependent tests to pass when run after a polluted shared state. The...

10.1145/3510003.3510173 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

Yulong Chen Yang Liu Jianhao Yan Xuefeng Bai Zhong Ming and 4 more

The impressive performance of Large Language Models (LLMs) has consistently surpassed numerous human-designed benchmarks, presenting new challenges in assessing the shortcomings of LLMs. Designing tasks and finding LLMs' limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. To this end, we propose a Self-Challenge evaluation framework with human-in-the-loop. Starting from seed instances that GPT-4 fails to answer,...

10.48550/arxiv.2408.08978 preprint EN arXiv (Cornell University) 2024-08-16

A Dataset for Dynamic Discovery of Semantic Changes in Version Controlled Software Histories

Chenguang Zhu Yi Li Julia Rubin Marsha Chechik

Over the last few years, researchers proposed several semantic history slicing approaches that identify the set of semantically-related commits implementing a particular software functionality. However, there is no comprehensive benchmark for evaluating these approaches, making it difficult to assess their capabilities. This paper presents a dataset of 81 semantic change data collected from 8 real-world projects. The dataset is created for benchmarking semantic history slicing techniques. We provide details on the data collection process and the storage format....

10.1109/msr.2017.49 article EN 2017-05-01

FHistorian

Yi Li Chenguang Zhu Julia Rubin Marsha Chechik

Feature location techniques aim to locate software artifacts that implement a specific program functionality, a.k.a. a feature. In this paper, we build upon the previous work of semantic history slicing to locate features in software version histories. We leverage the information embedded in version histories for identifying changes implementing features and for discovering relationships between features.

10.1145/3106195.3106216 article EN 2017-09-01

Restoring the Executability of Jupyter Notebooks by Automatic Upgrade of Deprecated APIs

Chenguang Zhu Ripon K. Saha Mukul R. Prasad Sarfraz Khurshid

Data scientists typically practice exploratory programming using computational notebooks, to comprehend new data and extract insights. To do this they iteratively refine their code, actively trying to re-use and re-purpose solutions created by other data scientists, in real time. However, recent studies have shown that a vast majority of publicly available notebooks cannot be executed out of the box. One of the prominent reasons is the deprecation of data science APIs used in such notebooks, due to the rapid evolution of data science libraries. In this work we...

10.1109/ase51524.2021.9678889 article EN 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2021-11-01

A Vision on Intentions in Software Engineering

Jacob Krüger Yi Li Chenguang Zhu Marsha Chechik Thorsten Berger and 1 more

Intentions are fundamental in software engineering, but they are typically only implicitly considered through different abstractions, such as requirements, use cases, features, or issues. Specifically, software engineers develop and evolve (i.e., change) a software system based on such abstractions of a stakeholder's intention (something a stakeholder wants the system to be able to do). Unfortunately, existing abstractions are (inherently) limited when it comes to representing intentions and are mostly used for documenting only. So, whether a change fulfills its...

10.1145/3611643.3613087 article EN cc-by 2023-11-30

Precfix

Xindong Zhang Chenguang Zhu Yi Li Jianmei Guo Lihua Liu and 1 more

Patch recommendation is the process of identifying errors in software systems and suggesting suitable fixes for them. Patch recommendation can significantly improve developer productivity by reducing both the debugging and repairing time. Existing techniques usually rely on complete test suites and detailed debugging reports, which are often absent in practical industrial settings. In this paper, we propose Precfix, a pragmatic approach targeting large-scale industrial codebase and making recommendations based on previously observed debugging activities. Precfix...

10.1145/3377813.3381356 article EN 2020-06-27

DIFFBASE: a differential factbase for effective software evolution management

Xiuheng Wu Chenguang Zhu Yi Li

Numerous tools and techniques have been developed to extract and analyze information from software development artifacts. Yet, there is a lack of an effective method to process, store, and exchange information among different analyses. In this paper, we propose the differential factbase, a uniform exchangeable representation supporting efficient querying and manipulation, based on the existing concept of program facts. We consider program changes as first-class objects, which establish links between intra-version facts of single program snapshots...

10.1145/3468264.3468605 article EN 2021-08-18

Precise semantic history slicing through dynamic delta refinement

Yi Li Chenguang Zhu Milos Gligoric Julia Rubin Marsha Chechik

10.1007/s10515-019-00260-8 article EN Automated Software Engineering 2019-06-15

Compsuite: A Dataset of Java Library Upgrade Incompatibility Issues

Xiufeng Xu Chenguang Zhu Yi Li

Modern software systems heavily rely on external libraries developed by third-parties to ensure efficient development. However, frequent library upgrades can lead to compatibility issues between the libraries and their client systems. In this paper, we introduce Compsuite, a dataset that includes 123 real-world Java client-library pairs where upgrading the library causes an incompatibility issue in the corresponding client. Each incompatibility issue in Compsuite is associated with a test case authored by the developers, which can be used to reproduce the issue....

10.1109/ase56229.2023.00127 article EN 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2023-09-11

Identifying Solidity Smart Contract API Documentation Errors

Chenguang Zhu Ye Liu Xiuheng Wu Yi Li

Smart contracts are gaining popularity as a means to support transparent, traceable, and self-executing decentralized applications, which enable the exchange of value in a trustless environment. Developers of smart contracts rely on various libraries, such as OpenZeppelin for Solidity contracts, to improve application quality and reduce development costs. The API documentations of these libraries are important sources of information for developers who are unfamiliar with the APIs. Yet, maintaining high-quality documentations is non-trivial, and errors may...

10.1145/3551349.3556963 article EN 2022-10-10

GenSlice: Generalized Semantic History Slicing

Chenguang Zhu Yi Li Julia Rubin Marsha Chechik

Semantic history slicing addresses the problem of identifying changes related to a particular high-level functionality from the software change histories. Existing solutions are either imprecise, resulting in larger-than-necessary history slices, or inefficient, taking a long time to execute. In this paper, we develop a generalized history slicing framework, named GenSlice, which overcomes the aforementioned limitations. GenSlice abstracts existing history slicing techniques and history management operations (such as splitting commits into fine-grained...

10.1109/icsme46990.2020.00018 article EN 2020-09-01

SapientML

Ripon K. Saha Akira Ura Sonal Mahajan Chenguang Zhu Linyi Li and 4 more

Automatic machine learning, or AutoML, holds the promise of truly democratizing the use of machine learning (ML), by substantially automating the work of data scientists. However, the huge combinatorial search space of candidate pipelines means that current AutoML techniques generate sub-optimal pipelines, or none at all, especially on large, complex datasets. In this work we propose an AutoML technique, SapientML, that can learn from a corpus of existing datasets and their human-written pipelines, and efficiently generate a high-quality pipeline for a predictive task...

10.1145/3510003.3510226 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Rescue and identification of recombinant Porcine Circovirus Type 3

Baoge Zhang Jinshuang Cai Chenguang Zhu Ping Deng Qicai Ji and 2 more

PCV3 is prevalent and causes many forms of swine diseases worldwide. To date, isolation of PCV3 has been unsuccessful. Therefore, obtaining PCV3 and studying its biological traits are urgently needed. In the present study, recombinant PCV3 (rPCV3) was successfully generated, and its biological characterization was performed. The PCV3 genome sequence was optimized, cloned and inserted into the pBluescript SK vector. PK-15 cells transfected with the plasmid were serially passaged and characterized. The obtained rPCV3 was purified through...

10.21203/rs.3.rs-3930077/v1 preprint EN cc-by Research Square (Research Square) 2024-02-16
