NFDI4DS | UHH-SEMS - Publication Details

Xiaoguang Mao

ORCID: 0000-0003-4204-7424

About

Contact & Profiles

A5080183182

Research Areas

Software Testing and Debugging Techniques
Software Engineering Research
Software Reliability and Analysis Research
Software System Performance and Reliability
Advanced Malware Detection Techniques
Topic Modeling
Security and Verification in Computing
Parallel Computing and Optimization Techniques
Advanced Software Engineering Methodologies
Natural Language Processing Techniques
Software Engineering Techniques and Practices
Model-Driven Software Engineering Techniques
Formal Methods in Verification
Service-Oriented Architecture and Web Services
Web Application Security Vulnerabilities
Anomaly Detection Techniques and Applications
Distributed systems and fault tolerance
Semantic Web and Ontologies
Numerical Methods and Algorithms
Advanced Text Analysis Techniques
Adversarial Robustness in Machine Learning
Web Data Mining and Analysis
Heavy metals in environment
VLSI and Analog Circuit Testing
Synthetic Organic Chemistry Methods

National University of Defense Technology
2016-2025

Changsha University
2018

Beihang University
2010

The strength of random search on automated program repair

OPENALEX - Publications

Yuhua Qi Xiaoguang Mao Yan Lei Ziying Dai Chengsong Wang

Automated program repair has recently received considerable attention, and many techniques in this research area have been proposed. Among them, two genetic-programming-based techniques, GenProg and Par, have shown promising results. In particular, GenProg has been used as the baseline technique to check the repair effectiveness of new techniques in much of the literature. Although GenProg and Par have shown their strong ability in fixing real-life bugs in nontrivial programs, to what extent GenProg and Par can benefit from genetic programming, used by them to guide the patch search process, is still unknown.

10.1145/2568225.2568254 article EN Proceedings of the 36th International Conference on Software Engineering 2014-05-20

Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning

Mingyang Geng Shangwen Wang Dezun Dong Haotian Wang Ge Li and 3 more

Code comment generation aims at generating natural language descriptions for a code snippet to facilitate developers' program comprehension activities. Despite being studied for a long time, a bottleneck for existing approaches is that, given a code snippet, they can only generate one comment, while developers usually need to know information from diverse perspectives, such as what the functionality of this code snippet is and how to use it. To tackle this limitation, this study empirically investigates the feasibility of utilizing large language models (LLMs)...

10.1145/3597503.3608134 article EN 2024-02-06

Slice-based statistical fault localization

Xiaoguang Mao Yan Lei Ziying Dai Yuhua Qi Chengsong Wang

10.1016/j.jss.2013.08.031 article EN Journal of Systems and Software 2013-09-05

On the efficiency of test suite based program repair

Kui Liu Shangwen Wang Anil Koyuncu Kisub Kim Tegawendé F. Bissyandé and 5 more

Test-based automated program repair has been a prolific field of research in software engineering in the last decade. Many approaches have indeed been proposed, which leverage test suites as a weak, but affordable, approximation to program specifications. Although the literature regularly sets new records on the number of benchmark bugs that can be fixed, several studies increasingly raise concerns about the limitations and biases of state-of-the-art approaches. For example, the correctness of generated patches has been questioned in a number of studies,...

10.1145/3377811.3380338 preprint EN 2020-06-27

CNN-FL: An Effective Approach for Localizing Faults using Convolutional Neural Networks

Zhuo Zhang Yan Lei Xiaoguang Mao Panpan Li

Fault localization aims at identifying suspicious statements potentially responsible for failures. The recent rapid progress in deep learning shows the promising potential of many neural network architectures in making sense of data, and more importantly, this potential offers a new perspective that may benefit fault localization. Thus, this paper proposes CNN-FL: an approach for localizing faults based on convolutional neural networks to explore this potential. Specifically, CNN-FL constructs a convolutional neural network customized for fault localization, then trains the network with...

10.1109/saner.2019.8668002 article EN 2019-02-01

Peculiar: Smart Contract Vulnerability Detection Based on Crucial Data Flow Graph and Pre-training Techniques

Hongjun Wu Zhuo Zhang Shangwen Wang Yan Lei Bo Lin and 3 more

Smart contracts with natural economic attributes have been widely and rapidly developed in various fields. However, the bugs and vulnerabilities in smart contracts have brought huge economic losses, which has strengthened people's attention to the security issues of smart contracts. The immutability of smart contracts makes people more willing to conduct security checks before deploying smart contracts. Nonetheless, existing smart contract vulnerability detection techniques are far from sufficient: static analysis approaches rely heavily on manually crafted heuristics, which are difficult...

10.1109/issre52982.2021.00047 article EN 2021-10-01

Context-Aware Code Change Embedding for Better Patch Correctness Assessment

Bo Lin Shangwen Wang Ming Wen Xiaoguang Mao

Despite the capability of successfully fixing more and more real-world bugs, existing Automated Program Repair (APR) techniques are still challenged by the long-standing overfitting problem (i.e., a generated patch that passes all tests is actually incorrect). Plenty of approaches have been proposed for automated patch correctness assessment (APCA). Nonetheless, dynamic ones (i.e., those that need to execute tests) are time-consuming, while static ones (i.e., those built on top of static code features) are less precise. Therefore, embedding techniques have been proposed recently,...

10.1145/3505247 article EN ACM Transactions on Software Engineering and Methodology 2022-05-18

A universal data augmentation approach for fault localization

Huan Xie Yan Lei Meng Yan Yue Yu Xin Xia and 1 more

Data is the fuel to models, and this still applies in fault localization (FL). Many existing elaborate FL techniques take the code coverage matrix and the failure vector as inputs, expecting that the techniques can find the correlation between program entities and failures. However, the input data is high-dimensional and extremely imbalanced, since real-world programs are large in size and the number of failing test cases is much smaller than that of passing test cases, which poses severe threats to the effectiveness of FL techniques.

10.1145/3510003.3510136 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21
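
To make the two inputs named in the abstract above concrete, the following is a minimal sketch of how a coverage matrix and a failure vector can be turned into per-statement suspiciousness scores. It uses the standard Ochiai formula from spectrum-based fault localization purely as an illustration; it is not the data augmentation approach of the paper, and the function name `ochiai_scores` and the toy data are hypothetical.

```python
import math

def ochiai_scores(coverage, failures):
    """Return one Ochiai suspiciousness score per statement.

    coverage[t][s] is 1 if test t executes statement s, else 0.
    failures[t] is 1 if test t fails, else 0.
    """
    total_failed = sum(failures)
    scores = []
    for s in range(len(coverage[0])):
        # ef / ep: failing / passing tests that execute statement s
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failures[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failures[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

# Toy data (assumed for illustration): 4 tests, 3 statements, 2 failing tests.
coverage = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
failures = [0, 1, 0, 1]
scores = ochiai_scores(coverage, failures)
```

In this toy example the third statement is executed only by failing tests, so it receives the highest score. With far more statements than tests and only a handful of failing rows, such statistics become unreliable, which is the imbalance the abstract describes.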

How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model

Shezheng Song Xiaopeng Li Shasha Li Shan Zhao Jie Yu and 4 more

10.1109/tkde.2025.3527978 article EN IEEE Transactions on Knowledge and Data Engineering 2025-01-01

Efficient Automated Program Repair through Fault-Recorded Testing Prioritization

Yuhua Qi Xiaoguang Mao Yan Lei

Most techniques for automated program repair use test cases to validate the effectiveness of the produced patches. The validation process can be time-consuming, especially when the object programs ship with either lots of test cases or some long-running test cases. To alleviate the cost of testing, we first introduce regression test prioritization insight into the area of automated program repair, and present a novel prioritization technique called FRTP with the goal of reducing the number of test case executions in the repair process. Unlike most existing prioritization techniques, which frequently require additional cost for gathering...

10.1109/icsm.2013.29 article EN 2013-09-01
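
The idea of reducing test executions during patch validation can be sketched as follows. This is an assumption-laden illustration based only on the abstract above, not the FRTP algorithm itself: it simply records how many candidate patches each test has rejected so far and runs those tests first. The names `validate_patches`, `run_test`, and the toy data are hypothetical.

```python
def validate_patches(patches, tests, run_test):
    """Validate candidate patches, running tests that rejected earlier patches first.

    run_test(patch, test) returns True if the patch passes the test
    (a hypothetical callback supplied by the caller).
    Returns (patches that pass every test, total number of test executions).
    """
    kills = {t: 0 for t in tests}  # patches rejected by each test so far
    executions = 0
    plausible = []
    for patch in patches:
        # Stable sort: tests with more recorded rejections run earlier.
        ordered = sorted(tests, key=lambda t: -kills[t])
        for test in ordered:
            executions += 1
            if not run_test(patch, test):
                kills[test] += 1
                break  # patch is invalid; skip the remaining tests
        else:
            plausible.append(patch)
    return plausible, executions

# Toy setup (assumed): three invalid patches all fail test "t3".
tests = ["t1", "t2", "t3"]
failing = {"p1": {"t3"}, "p2": {"t3"}, "p3": {"t3"}, "p4": set()}
plausible, executions = validate_patches(
    list(failing), tests, lambda p, t: t not in failing[p]
)
```

Here only the first patch pays the full cost of reaching "t3"; later invalid patches are rejected after one execution, for 8 executions in total versus 12 with a fixed test order.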

Using automated program repair for evaluating the effectiveness of fault localization techniques

Yuhua Qi Xiaoguang Mao Yan Lei Chengsong Wang

Many techniques for automated fault localization (AFL) have been introduced to assist developers in debugging. Prior studies evaluate the localization technique from the viewpoint of developers: measuring how many benefits developers can obtain from the localization technique used when debugging. However, these evaluation approaches are not always suitable, because it is difficult to quantify the benefits precisely due to the complex debugging behaviors of developers. In addition, recent user studies have shown that developers working with AFL do not correct defects more efficiently than ones working with only traditional...

10.1145/2483760.2483785 article EN 2013-07-15

Counterexample-Preserving Reduction for Symbolic Model Checking

Wanwei Liu Rui Wang Xianjin Fu Ji Wang Wei Dong and 1 more

The cost of LTL model checking is highly sensitive to the length of the formula under verification. We observe that, under some specific conditions, the input LTL formula can be reduced to an easier-to-handle one before model checking. In such a reduction, these two formulae need not be logically equivalent, but they share the same counterexample set w.r.t. the model. In the case that the model is symbolically represented, the condition enabling such a reduction can be detected with lightweight effort (e.g., with SAT-solving). In this paper, we tentatively name such technique...

10.1155/2014/702165 article EN cc-by Journal of Applied Mathematics 2014-01-01

Automated patch correctness assessment

Shangwen Wang Ming Wen Bo Lin Hongjun Wu Yihao Qin and 3 more

Test-based automated program repair (APR) has attracted huge attention from both industry and academia. Despite the significant progress made in recent studies, the overfitting problem (i.e., the generated patch is plausible but overfitting) is still a major and long-standing challenge. Therefore, plenty of techniques have been proposed to assess the correctness of patches either in the patch generation phase or in the evaluation of APR techniques. However, the effectiveness of existing techniques has not been systematically compared and little is known about their...

10.1145/3324884.3416590 article EN 2020-12-21

A field study to estimate heavy metal concentrations in a soil-rice system: Application of graph neural networks

Panpan Li Huijuan Hao Zhuo Zhang Xiaoguang Mao Jianjun Xu and 3 more

10.1016/j.scitotenv.2022.155099 article EN The Science of The Total Environment 2022-04-07

CCT5: A Code-Change-Oriented Pre-trained Model

Bo Lin Shangwen Wang Zhongxin Liu Yepang Liu Xin Xia and 1 more

Software is constantly changing, requiring developers to perform several derived tasks in a timely manner, such as writing a description for the intention of the code change, or identifying defect-prone code changes. Considering that the cost of dealing with these tasks can account for a large proportion (typically around 70 percent) of the total development expenditure, automating such processes will significantly lighten the burdens of developers. To achieve such a target, existing approaches mainly rely on training deep learning models from...

10.1145/3611643.3616339 article EN 2023-11-30

Deep Learning-Based Fault Localization with Contextual Information

Zhuo Zhang Yan Lei Qingping Tan Xiaoguang Mao Ping Zeng and 1 more

Fault localization is essential for solving the issue of software faults. Aiming at improving fault localization, this paper proposes a deep learning-based fault localization approach with contextual information. Specifically, our approach uses a deep neural network to construct a suspiciousness evaluation model to evaluate the suspiciousness of a statement being faulty, and then leverages dynamic backward slicing to extract contextual information. The empirical results show that our approach significantly outperforms the state-of-the-art technique Dstar.

10.1587/transinf.2017edl8143 article EN IEICE Transactions on Information and Systems 2017-01-01

A study of effectiveness of deep learning in locating real faults

Zhuo Zhang Yan Lei Xiaoguang Mao Meng Yan Ling Xu and 1 more

10.1016/j.infsof.2020.106486 article EN Information and Software Technology 2020-11-15

Reentrancy Vulnerability Detection and Localization: A Deep Learning Based Two-phase Approach

Zhuo Zhang Yan Lei Meng Yan Yue Yu Jiachi Chen and 2 more

Smart contracts have been widely and rapidly used to automate financial and business transactions together with blockchains, helping people make agreements while minimizing trust. With millions of smart contracts deployed on blockchain, various bugs and vulnerabilities in smart contracts have emerged. Following the rapid development of deep learning, many recent studies have used deep learning for vulnerability detection to conduct security checks before deploying smart contracts. These approaches show effective results on detecting whether a smart contract is...

10.1145/3551349.3560428 article EN 2022-10-10

Efficient automated repair of high floating-point errors in numerical libraries

Xin Yi Liqian Chen Xiaoguang Mao Tao Ji

Floating-point computation is by nature inexact, and numerical libraries that intensively involve floating-point computations may encounter high floating-point errors. Due to the wide use of numerical libraries, it is highly desired to reduce high floating-point errors in them. Using higher precision will degrade performance and may also introduce extra errors for certain precision-specific operations in numerical libraries. Mathematical rewriting, which mostly focuses on rearranging floating-point expressions or taking Taylor expansions, may not fit for reducing high floating-point errors evoked by ill-conditioned problems that are...

10.1145/3290369 article EN Proceedings of the ACM on Programming Languages 2019-01-02

Improving deep‐learning‐based fault localization with resampling

Zhuo Zhang Yan Lei Xiaoguang Mao Meng Yan Ling Xu and 1 more

Many fault localization approaches recently utilize deep learning to learn an effective localization model, showing a fresh perspective with promising results. However, localization models are generally learned from class-imbalanced datasets; that is, the number of failing test cases is much smaller than the number of passing test cases. This imbalance may affect the accuracy of the learned localization models. Thus, in this paper, we explore using data resampling to reduce the negative effect of the class imbalance problem and improve the accuracy of the learned models of deep‐learning‐based fault localization....

10.1002/smr.2312 article EN Journal of Software: Evolution and Process 2020-08-26
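
As a rough illustration of what resampling means for this kind of training data, the sketch below balances a test suite by replicating the rows of failing tests until they match the number of passing tests. The truncated abstract does not say which resampling strategy the paper uses, so plain oversampling by replication is an assumption here, and the function name `oversample_failing` and the toy data are hypothetical.

```python
from itertools import cycle

def oversample_failing(coverage, failures):
    """Replicate failing-test rows until failing and passing counts match.

    coverage[t] is the coverage row of test t; failures[t] is 1 if test t fails.
    Returns new (coverage, failures) lists; the inputs are left unchanged.
    """
    failing = [row for row, f in zip(coverage, failures) if f]
    n_pass = len(failures) - len(failing)
    new_cov, new_fail = list(coverage), list(failures)
    if not failing:
        return new_cov, new_fail  # nothing to replicate
    source = cycle(failing)  # reuse failing rows in round-robin order
    for _ in range(max(0, n_pass - len(failing))):
        new_cov.append(list(next(source)))
        new_fail.append(1)
    return new_cov, new_fail

# Toy data (assumed): 4 passing tests and 1 failing test.
coverage = [[1, 0], [1, 1], [0, 1], [1, 0], [1, 1]]
failures = [0, 0, 0, 0, 1]
bal_cov, bal_fail = oversample_failing(coverage, failures)
```

After resampling, the toy dataset holds four passing and four failing rows, so a model trained on it no longer sees failing tests as a rare class.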

Lightweight global and local contexts guided method name recommendation with prior knowledge

Shangwen Wang Ming Wen Bo Lin Xiaoguang Mao

The quality of method names is critical for the readability and maintainability of source code. However, it is often challenging to construct concise method names. To alleviate this problem, a number of approaches have been proposed to automatically recommend high-quality names for methods. Despite being effective, existing approaches meet their bottlenecks mainly in two aspects: (1) the leveraged information is restricted to the target method itself; and (2) they lack distinctions towards the contributions of tokens extracted from different program contexts. Through...

10.1145/3468264.3468567 article EN 2021-08-18

Predictive Comment Updating With Heuristics and AST-Path-Based Neural Learning: A Two-Phase Approach

Bo Lin Shangwen Wang Zhongxin Liu Xin Xia Xiaoguang Mao

Just-in-time comment update is a promising way to reduce the burden of developers during software maintenance and evolution. Existing approaches can be divided into two categories: the heuristic-based approach and the deep-learning-based approach. The heuristic-based approach is restricted to a specific type of comment updates (i.e., code-indicative updates), but performs well on such type. The effectiveness of the deep-learning-based approach is limited, but it can handle diverse types of updates. Considering the complementary advantages of the two existing approaches, an intuitive idea is to combine them for better...

10.1109/tse.2022.3185458 article EN cc-by IEEE Transactions on Software Engineering 2022-06-27

One Size Does Not Fit All: Multi-granularity Patch Generation for Better Automated Program Repair

Bo Lin Shangwen Wang Ming Wen Liqian Chen Xiaoguang Mao

10.1145/3650212.3680381 article EN 2024-09-11

Exploring the Security Threats of Knowledge Base Poisoning in Retrieval-Augmented Code Generation

Bo Lin Shangwen Wang Liqian Chen Xiaoguang Mao

The integration of Large Language Models (LLMs) into software development has revolutionized the field, particularly through the use of Retrieval-Augmented Code Generation (RACG) systems that enhance code generation with information from external knowledge bases. However, the security implications of RACG systems, particularly the risks posed by vulnerable code examples in the knowledge base, remain largely unexplored. This risk is particularly concerning given that public code repositories, which often serve as the sources for knowledge base collection in RACG systems, are usually accessible...

10.48550/arxiv.2502.03233 preprint EN arXiv (Cornell University) 2025-02-05

GTE: learning code AST representation efficiently and effectively

Yihao Qin Shangwen Wang Bo Lin Kang Yang Xiaoguang Mao

10.1007/s11432-024-4262-5 article EN Science China Information Sciences 2025-02-08
