NFDI4DS | UHH-SEMS - Publication Details

Shin Hwei Tan

ORCID: 0000-0001-8633-3372

About

Contact & Profiles

A5051957977

Research Areas

Software Engineering Research
Software Testing and Debugging Techniques
Software Reliability and Analysis Research
Advanced Malware Detection Techniques
Software System Performance and Reliability
Open Source Software Innovations
Topic Modeling
Software Engineering Techniques and Practices
Parallel Computing and Optimization Techniques
Security and Verification in Computing
Scientific Computing and Data Management
Advanced Computational Techniques and Applications
Advanced Materials Characterization Techniques
Spreadsheets and End-User Computing
Boron and Carbon Nanomaterials Research
Particle Detector Development and Performance
Hydrogen Storage and Materials
Advanced Data Storage Technologies
Privacy-Preserving Technologies in Data
Mechanical Failure Analysis and Simulation
Viral Infectious Diseases and Gene Expression in Insects
Particle accelerators and beam dynamics
Speech and dialogue systems
Information and Cyber Security
Superconductivity in MgB2 and Alloys

Concordia University
2023-2024

Hong Kong University of Science and Technology
2023-2024

University of Hong Kong
2023-2024

Government of Canada
2023-2024

Southern University of Science and Technology
2018-2023

University of Waterloo
2023

National University of Singapore
2013-2018

University of Illinois System
2012

University of Illinois Urbana-Champaign
2008-2011

Combining Graph-Based Learning With Automated Data Collection for Code Vulnerability Detection

OPENALEX - Publications

Huanting Wang Guixin Ye Zhanyong Tang Shin Hwei Tan Songfang Huang and 4 more

This paper presents FUNDED (Flow-sensitive vUl-Nerability coDE Detection), a novel learning framework for building vulnerability detection models. FUNDED leverages advances in graph neural networks (GNNs) to develop a graph-based method to capture and reason about a program's control, data, and call dependencies. Unlike prior work that treats the program as a sequential sequence or an untyped graph, FUNDED learns and operates on a graph representation of source code, in which individual statements are connected to other statements through...

10.1109/tifs.2020.3044773 article EN IEEE Transactions on Information Forensics and Security 2020-12-14

Automated Repair of Programs from Large Language Models

OPENALEX - Publications

Zhiyu Fan Xiang Gao Мартин Мирчев Abhik Roychoudhury Shin Hwei Tan

Large language models, such as Codex, have shown the capability to produce code for many programming tasks. However, the success rate of existing models is low, especially for complex programming tasks. One of the reasons is that language models lack awareness of program semantics, resulting in incorrect programs, or even programs which do not compile. In this paper, we systematically study whether automated program repair (APR) techniques can fix the incorrect solutions produced by language models in LeetCode contests. The goal is to study whether APR techniques can enhance the reliability of code produced by large language models. Our study revealed that: (1)...

10.1109/icse48619.2023.00128 article EN 2023-05-01

@tComment: Testing Javadoc Comments to Detect Comment-Code Inconsistencies

OPENALEX - Publications

Shin Hwei Tan Darko Marinov Lin Tan Gary T. Leavens

Code comments are important artifacts in software. Javadoc comments are widely used in Java for API specifications. API developers write Javadoc comments, and API users read these comments to understand the API, e.g., reading a Javadoc comment for a method instead of reading the method body. An inconsistency between the Javadoc comment and the body of a method indicates either a fault in the body or, effectively, a fault in the comment that can mislead the method callers to introduce faults in their code. We present a novel approach, called @TCOMMENT, for testing Javadoc comments, specifically method properties about null values and related exceptions. Our approach consists of two components....

10.1109/icst.2012.106 article EN 2012-04-01

Anti-patterns in search-based program repair

OPENALEX - Publications

Shin Hwei Tan Hiroaki Yoshida Mukul R. Prasad Abhik Roychoudhury

Search-based program repair automatically searches for a program fix within a given repair space. This may be accomplished by retrofitting a generic search algorithm for program repair, as evidenced by the GenProg tool, or by building a customized search algorithm for program repair, as in SPR. Unfortunately, automated program repair approaches may produce patches that are rejected by programmers, because of which past works have suggested using human-written patches to produce templates to guide program repair. In this work, we take the position that we will not provide templates to guide the repair search, because that may unduly restrict the repair space and attempt to overfit the repairs into one of the provided...

10.1145/2950290.2950295 article EN 2016-11-01

Codeflaws: a programming competition benchmark for evaluating automated program repair tools

OPENALEX - Publications

Shin Hwei Tan Jooyong Yi Yulis Sergey Mechtaev Abhik Roychoudhury

Several automated program repair techniques have been proposed to reduce the time and effort spent in bug-fixing. While these repair tools are designed to be generic such that they could address many software faults, different repair tools may fix certain types of faults more effectively than other tools. Therefore, it is important to compare objectively the effectiveness of different repair tools on various fault types. However, existing benchmarks on automated program repairs do not allow thorough investigation of the relationship between fault types and the effectiveness of repair tools. We present Codeflaws, a set of 3902...

10.1109/icse-c.2017.76 article EN 2017-05-01

A feasibility study of using automated program repair for introductory programming assignments

OPENALEX - Publications

Jooyong Yi Umair Z. Ahmed Amey Karkare Shin Hwei Tan Abhik Roychoudhury

Despite the fact that an intelligent tutoring system for programming (ITSP) education has long attracted interest, its widespread use has been hindered by the difficulty of generating personalized feedback automatically. Meanwhile, automated program repair (APR) is an emerging new technology that automatically fixes software bugs, and it has been shown that APR can fix the bugs of large real-world software. In this paper, we study the feasibility of marrying an ITSP and APR. We perform our feasibility study with four state-of-the-art APR tools (GenProg, AE, Angelix,...

10.1145/3106237.3106262 article EN 2017-08-02

relifix: automated repair of software regressions

OPENALEX - Publications

Shin Hwei Tan Abhik Roychoudhury

Regression occurs when code changes introduce failures in previously passing test cases. As software evolves, regressions may be introduced. Fixing regression errors manually is time-consuming and error-prone. We propose an approach of automated repair of software regressions, called relifix, that considers the regression repair problem as a problem of reconciling problematic changes. Specifically, we derive a set of code transformations obtained from our manual inspection of 73 real software regressions; this set of code transformations uses syntactical information from changed...

10.5555/2818754.2818813 article EN International Conference on Software Engineering 2015-05-16

Repairing crashes in Android apps

OPENALEX - Publications

Shin Hwei Tan Zhen Dong Xiang Gao Abhik Roychoudhury

Android apps are omnipresent, and frequently suffer from crashes, leading to poor user experience and economic loss. Past work focused on automated test generation to detect crashes in Android apps. However, automated repair of crashes has not been studied. In this paper, we propose the first approach to automatically repair Android apps, specifically a technique for fixing crashes in Android apps. Unlike most test-based repair approaches, we do not need a test-suite; instead a single failing test is meticulously analyzed for crash locations and the reasons behind these crashes. Our approach hinges on a careful...

10.1145/3180155.3180243 article EN Proceedings of the 40th International Conference on Software Engineering 2018-05-27

Automated conformance testing for JavaScript engines via deep compiler fuzzing

OPENALEX - Publications

Guixin Ye Zhanyong Tang Shin Hwei Tan Songfang Huang Dingyi Fang and 4 more

JavaScript (JS) is a popular, platform-independent programming language. To ensure the interoperability of JS programs across different platforms, the implementation of a JS engine should conform to the ECMAScript standard. However, doing so is challenging as there are many subtle definitions of API behaviors, and the definitions keep evolving.

10.1145/3453483.3454054 preprint EN 2021-06-18

relifix: Automated Repair of Software Regressions

OPENALEX - Publications

Shin Hwei Tan Abhik Roychoudhury

Regression occurs when code changes introduce failures in previously passing test cases. As software evolves, regressions may be introduced. Fixing regression errors manually is time-consuming and error-prone. We propose an approach of automated repair of software regressions, called relifix, that considers the regression repair problem as a problem of reconciling problematic changes. Specifically, we derive a set of code transformations obtained from our manual inspection of 73 real software regressions; this set of code transformations uses syntactical information from changed...

10.1109/icse.2015.65 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01

Testing Refactoring Engine via Historical Bug Report driven LLM

OPENALEX - Publications

Haibo Wang Zejiang Xu Shin Hwei Tan

Refactoring is the process of restructuring existing code without changing its external behavior while improving its internal structure. Refactoring engines are integral components of modern Integrated Development Environments (IDEs) and can automate or semi-automate this process to enhance code readability, reduce complexity, and improve the maintainability of software products. Similar to traditional software systems such as compilers, refactoring engines may also contain bugs that lead to unexpected behaviors. In this paper, we propose a novel approach...

10.48550/arxiv.2501.09879 preprint EN arXiv (Cornell University) 2025-01-16

Test-Equivalence Analysis for Automatic Patch Generation

OPENALEX - Publications

Sergey Mechtaev Xiang Gao Shin Hwei Tan Abhik Roychoudhury

Automated program repair is a problem of finding a transformation (called a patch) of a given incorrect program that eliminates the observable failures. It has important applications such as providing debugging aids, automatically grading student assignments, and patching security vulnerabilities. A common challenge faced by existing repair techniques is scalability to large patch spaces, since there are many candidate patches that these techniques explicitly or implicitly consider. The correctness criteria for program repair are often given as a suite of tests....

10.1145/3241980 article EN ACM Transactions on Software Engineering and Methodology 2018-10-22

Could I Have a Stack Trace to Examine the Dependency Conflict Issue?

OPENALEX - Publications

Ying Wang Ming Wen Rongxin Wu Zhenwei Liu Shin Hwei Tan and 3 more

Intensive use of libraries in Java projects brings potential risk of dependency conflicts, which occur when a project directly or indirectly depends on multiple versions of the same library or class. When this happens, the JVM loads one version and shadows the others. Runtime exceptions can occur when methods in the shadowed versions are referenced. Although project management tools such as Maven are able to give warnings of potential dependency conflicts when a project is built, developers often ask for crashing stack traces before examining these warnings. It motivates us to develop...

10.1109/icse.2019.00068 article EN 2019-05-01

A correlation study between automated program repair and test-suite metrics

OPENALEX - Publications

Jooyong Yi Shin Hwei Tan Sergey Mechtaev Marcel Böhme Abhik Roychoudhury

10.1007/s10664-017-9552-y article EN Empirical Software Engineering 2017-09-29

Investigating and Detecting Silent Bugs in PyTorch Programs

OPENALEX - Publications

Shuo Hong Hailong Sun Xiang Gao Shin Hwei Tan

10.1109/saner60148.2024.00035 article EN 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024-03-12

Android testing via synthetic symbolic execution

OPENALEX - Publications

Xiang Gao Shin Hwei Tan Zhen Dong Abhik Roychoudhury

Symbolic execution of Android applications is challenging as it involves either building a customized VM for Android or modeling the Android libraries. Since the Android Runtime evolves from one version to another, building a high-fidelity symbolic execution engine involves modeling the effect of the libraries and their evolved versions. Without simulating the behavior of Android libraries, path divergence may occur due to constraint loss when symbolic values flow into the Android framework and these values later affect the subsequent path taken. Previous works such as JPF-Android have relied on modeling the execution environment. In this work, we...

10.1145/3238147.3238225 article EN 2018-08-20

Automated Patch Transplantation

OPENALEX - Publications

Ridwan Shariffdeen Shin Hwei Tan Mingyuan Gao Abhik Roychoudhury

Automated program repair is an emerging area that attempts to patch software errors and vulnerabilities. In this article, we formulate and study a problem related to automated repair, namely automated patch transplantation. A patch for an error in a donor program is automatically adapted and inserted into a “similar” target program. We observe that despite standard procedures for vulnerability disclosures and publishing of patches, many un-patched occurrences remain in the wild. One of the main reasons is the fact that various implementations of the same functionality may exist and,...

10.1145/3412376 article EN ACM Transactions on Software Engineering and Methodology 2020-12-31

Recursive State Machine Guided Graph Folding for Context-Free Language Reachability

OPENALEX - Publications

Yuxiang Lei Yulei Sui Shin Hwei Tan Qirun Zhang

Context-free language reachability (CFL-reachability) is a fundamental framework for program analysis. A large variety of static analyses can be formulated as CFL-reachability problems, which determine whether specific source-sink pairs in an edge-labeled graph are connected by a reachable path, i.e., a path whose edge labels form a string accepted by the given CFL. Computing CFL-reachability is expensive. The fastest algorithm exhibits a slightly subcubic time complexity with respect to the input size. Improving the scalability...

10.1145/3591233 article EN Proceedings of the ACM on Programming Languages 2023-06-06

Collaborative bug finding for Android apps

OPENALEX - Publications

Shin Hwei Tan Ziqiang Li

Many automated test generation techniques have been proposed for finding crashes in Android apps. Despite recent advancement in these approaches, a study shows that Android app developers prefer reading test cases written in natural language. Meanwhile, there exist redundancies in bug reports (written in natural language) across different apps that have not been previously reused. We propose collaborative bug finding, a novel approach that uses bugs in other similar apps to discover bugs in the app under test. We design three settings with varying degrees of interactions...

10.1145/3377811.3380349 article EN 2020-06-27

ReAssert

OPENALEX - Publications

Brett Daniel Danny Dig Tihomir Gvero Vilas Jagannath Johnston Jiaa and 4 more

Successful software systems continuously change their requirements and thus code. When this happens, some existing tests get broken because they no longer reflect the intended behavior, and thus they need to be updated. Repairing broken tests can be time-consuming and difficult.

10.1145/1985793.1985978 article EN 2011-05-21

Automating CUDA Synchronization via Program Transformation

OPENALEX - Publications

Mingyuan Wu Lingming Zhang Cong Liu Shin Hwei Tan Yuqun Zhang

While CUDA has been the most popular parallel computing platform and programming model for general purpose GPU computing, CUDA synchronization undergoes significant challenges for programmers due to its intricate parallel computing mechanism and coding practices. In this paper, we propose AuCS, the first framework to automate synchronization for CUDA kernel functions. AuCS transforms the original LLVM-level CUDA program control flow graph in a semantic-preserving manner for exploring the possible barrier function locations. Accordingly, AuCS develops mechanisms to correctly...

10.1109/ase.2019.00075 article EN 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2019-11-01

Automated patch backporting in Linux (experience paper)

OPENALEX - Publications

Ridwan Shariffdeen Xiang Gao Gregory J. Duck Shin Hwei Tan Julia Lawall and 1 more

Whenever a bug or vulnerability is detected in the Linux kernel, the kernel developers will endeavour to fix it by introducing a patch into the mainline version of the Linux kernel source tree. However, many users run older "stable" versions of Linux, meaning that the patch should also be "backported" to one or more of these older kernel versions. This process is error-prone and there is usually a long delay in publishing the backported patch. Based on an empirical study, we show that around 8% of all commits submitted to Linux mainline are backported to older versions, but often more than one month elapses before...

10.1145/3460319.3464821 preprint EN 2021-07-08

Event-aware precise dynamic slicing for automatic debugging of Android applications

OPENALEX - Publications

Hsu Myat Win Shin Hwei Tan Yulei Sui

10.1016/j.jss.2023.111606 article EN Journal of Systems and Software 2023-01-07

Combining Structured Static Code Information and Dynamic Symbolic Traces for Software Vulnerability Prediction

OPENALEX - Publications

Huanting Wang Zhanyong Tang Shin Hwei Tan Jie Wang Yuzhe Liu and 3 more

Deep learning (DL) has emerged as a viable means for identifying software bugs and vulnerabilities. The success of DL relies on having a suitable representation of the problem domain. However, existing DL-based solutions for learning program representations have limitations: they either cannot capture the deep, precise program semantics or suffer from poor scalability. We present Concoction, the first DL system to learn program representations by combining static source code information and dynamic program execution traces. Concoction employs...

10.1145/3597503.3639212 article EN cc-by 2024-04-12

Automatic Programming: Large Language Models and Beyond

OPENALEX - Publications

Michael R. Lyu Baishakhi Ray Abhik Roychoudhury Shin Hwei Tan Patanamon Thongtanunam

Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs). At the same time, automatically generated code faces challenges during deployment due to concerns around quality and trust. In this article, we study automated coding in a general sense and study the concerns around code quality, security, and related issues of programmer responsibility. These are key issues for organizations while deciding on the usage of automatically generated code. We discuss how advances in software engineering such as...

10.48550/arxiv.2405.02213 preprint EN arXiv (Cornell University) 2024-05-03
