NFDI4DS | UHH-SEMS - Publication Details

Yuetang Deng

ORCID: 0009-0003-7060-4109

About

Contact & Profiles

A5060932650

Research Areas

Software Testing and Debugging Techniques
Software System Performance and Reliability
Advanced Malware Detection Techniques
Software Engineering Research
Software Reliability and Analysis Research
Natural Language Processing Techniques
Web Data Mining and Analysis
Topic Modeling
Service-Oriented Architecture and Web Services
Distributed systems and fault tolerance
Machine Learning and Data Classification
Network Security and Intrusion Detection
Image and Video Quality Assessment
Advanced Database Systems and Queries
Speech and dialogue systems
IoT and Edge/Fog Computing
Advanced Software Engineering Methodologies
VLSI and Analog Circuit Testing
Video Coding and Compression Technologies
Adversarial Robustness in Machine Learning
Caching and Content Delivery
Cloud Computing and Resource Management
Interactive and Immersive Displays
Teaching and Learning Programming
Personal Information Management and User Behavior

Tencent (China)
2016-2025

University of Illinois Urbana-Champaign
2019

The University of Texas at Dallas
2019

SUNY Polytechnic Institute
2003-2004

New York University
2000

An empirical study of Android test generation tools in industrial cases

OPENALEX - Publications

Wenyu Wang Dengfeng Li Wei Yang Yurui Cao Zhenwen Zhang and 2 more

User Interface (UI) testing is a popular approach to ensure the quality of mobile apps. Numerous test generation tools have been developed to support UI testing on mobile apps, especially for Android apps. Previous work evaluates and compares different test generation tools using only relatively simple open-source apps, while real-world industrial apps tend to have more complex functionalities and implementations. There is no direct comparison among test generation tools with regard to effectiveness and ease-of-use on these industrial apps. To address such limitation, we study existing state-of-the-art or...

10.1145/3238147.3240465 article EN 2018-08-20

Automated test input generation for Android: are we really there yet in an industrial case?

OPENALEX - Publications

Xia Zeng Deng‐Feng Li Wujie Zheng Fan Xia Yuetang Deng and 3 more

Given the ever increasing number of research tools to automatically generate inputs to test Android applications (or simply apps), researchers recently asked the question "Are we there yet?" (in terms of the practicality of the tools). By conducting an empirical study of the various tools, the researchers found that Monkey (the most widely used tool of this category in industrial practices) outperformed all of the research tools that they studied. In this paper, we present two significant extensions of that study. First, we conduct the first industrial case study of applying Monkey against WeChat, a popular...

10.1145/2950290.2983958 article EN 2016-11-01

An AGENDA for testing relational database applications

OPENALEX - Publications

David Chays Yuetang Deng Phyllis G. Frankl Saikat Dan Filippos I. Vokolos and 1 more

Database systems play an important role in nearly every modern organization, yet relatively little research effort has focused on how to test them. This paper discusses issues arising in testing database systems, presents an approach to testing database applications, and describes AGENDA, a set of tools to facilitate the use of this approach. In testing such applications, the state of the database before and after the user's operation plays an important role, along with the user's input and the system output. A framework for testing database applications is introduced. A complete tool set, based on this framework, has been...

10.1002/stvr.286 article EN Software Testing Verification and Reliability 2004-01-20

Automated test input generation for android: towards getting there in an industrial case

OPENALEX - Publications

Haibing Zheng Dengfeng Li Beihai Liang Xia Zeng Wujie Zheng and 4 more

Monkey, a random testing tool from Google, has been popularly used in industrial practices for automatic test input generation for Android due to its applicability to a variety of application settings, e.g., ease of use and compatibility with different Android platforms. Recently, Monkey has been under the spotlight of the research community: recent studies found out that none of the studied tools from academia were actually better than Monkey when applied on a set of open source Android apps. Our recent efforts performed the first case study of applying Monkey on WeChat, a popular...

10.1109/icse-seip.2017.32 article EN 2017-05-01

Emerging App Issue Identification from User Feedback: Experience on WeChat

OPENALEX - Publications

Cuiyun Gao Wujie Zheng Yuetang Deng David Lo Jichuan Zeng and 2 more

It is vital for popular mobile apps with large numbers of users to release updates with rich features while keeping a stable user experience. Timely and accurately locating emerging app issues can greatly help developers to maintain and update apps. User feedback (i.e., user reviews) is a crucial channel between app developers and users, delivering a stream of information about bugs and features that concern users. Methods to identify emerging issues based on user feedback have been proposed in the literature; however, their applicability in industry has not been explored. We apply the recent...

10.1109/icse-seip.2019.00040 article EN 2019-05-01

Testing database transactions with AGENDA

OPENALEX - Publications

Yuetang Deng Phyllis G. Frankl David Chays

AGENDA is a tool set for testing relational database applications. An earlier prototype was targeted to applications consisting of a single query and included components for populating the database with data suitable for testing the application, generating inputs to the query, and checking relatively simple aspects of the results of executing the query. This paper describes substantial extensions to AGENDA, allowing it to test transactions consisting of multiple queries and with complex intended behavior. The paper introduces a technique for checking properties of the state transition performed by...

10.1145/1062455.1062486 article EN 2005-01-01

Record and replay for Android: are we there yet in industrial cases?

OPENALEX - Publications

Wing Lam Zhengkai Wu Dengfeng Li Wenyu Wang Haibing Zheng and 4 more

Mobile applications, or apps for short, are gaining popularity. The input sources (e.g., touchscreen, sensors, transmitters) of the smart devices that host these apps enable the apps to offer a rich experience to the users, but these input sources pose testing complications to the developers (e.g., writing tests to accurately utilize multiple input sources together and be able to replay such tests at a later time). To alleviate these complications, researchers and practitioners in recent years have developed a variety of record-and-replay tools to support the testing expressiveness of smart devices. These tools allow...

10.1145/3106237.3117769 article EN 2017-08-02

GUIDER: GUI structure and vision co-guided test script repair for Android apps

OPENALEX - Publications

Tongtong Xu Minxue Pan Yu Pei Guiyin Li Xia Zeng and 3 more

GUI testing is an essential part of regression testing for Android apps. For regression GUI testing to remain effective, it is important that obsolete GUI test scripts get repaired after the app has evolved. In this paper, we propose a novel approach named GUIDER to automated repair of GUI test scripts for Android apps. The key novelty of the approach lies in the utilization of both structural and visual information of widgets on app GUIs to better understand what widgets of the base version app become in the updated version. A supporting tool has been implemented for the approach. Experiments conducted on popular messaging and social media...

10.1145/3460319.3464830 article EN 2021-07-08

Practitioners' Expectations on Code Completion

OPENALEX - Publications

Chaozheng Wang Junhao Hu Cuiyun Gao Jin Yu Tao Xie and 3 more

Code completion has become a common practice for programmers during their daily programming activities. It aims at automatically predicting the next tokens or lines that the programmers tend to use. A good code completion tool can substantially save keystrokes and improve the programming efficiency for programmers. Recently, various techniques for code completion have been proposed for usage in practice. However, it is still unclear what practitioners' expectations on code completion are and whether existing research has met their demands. To fill the gap, we perform an empirical study by first...

10.48550/arxiv.2301.03846 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Beyond Pass or Fail: A Multi-dimensional Benchmark for Mobile UI Navigation

OPENALEX - Publications

Dezhi Ran Ming C. Wu Hao Yu Yuetong Li Jun Ren and 13 more

Navigating mobile User Interface (UI) applications using large language and vision models based on high-level goal instructions is emerging as an important research field with significant practical implications, such as digital assistants and automated UI testing. To evaluate the effectiveness of existing models in mobile UI navigation, benchmarks are required and widely used in the literature. Although multiple benchmarks have been recently established for evaluating functional correctness, being judged as pass or fail, they fail to...

10.48550/arxiv.2501.02863 preprint EN arXiv (Cornell University) 2025-01-06

Improving Test Efficacy for Large-Scale Android Applications by Exploiting GUI and Functional Equivalence

OPENALEX - Publications

Yifei Lu Minxue Pan Haochuan Lu Yuetang Deng Tian Zhang and 2 more

Large-scale Android apps that provide complex functions are gradually becoming the mainstream in Android app markets. They tend to display many GUI widgets on a single GUI page, which, unfortunately, can cause more redundant test actions (actions with similar functions) for automatic testing approaches. The effectiveness of existing testing approaches is still limited, suggesting the necessity of reducing the test effort on redundant actions. In this paper, we first identify three types of GUI structures that can cause redundant actions and then propose a novel approach,...

10.1145/3729225 article EN ACM Transactions on Software Engineering and Methodology 2025-04-11

LogReducer: Identify and Reduce Log Hotspots in Kernel on the Fly

OPENALEX - Publications

Guangba Yu Pengfei Chen Pairui Li Tianjun Weng Haibing Zheng and 2 more

Modern systems generate a massive amount of logs to detect and diagnose system faults, which incurs expensive storage costs and runtime overhead. After investigating real-world production logs, we observe that most of the logging overhead is due to a small number of log templates, referred to as log hotspots. Therefore, we conduct a systematic study of log hotspots in an industrial system, WeChat, which motivates us to identify log hotspots and reduce them on the fly. In this paper, we propose LogReducer, a non-intrusive and language-independent log reduction...

10.1109/icse48619.2023.00151 article EN 2023-05-01

Testing web database applications

OPENALEX - Publications

Yuetang Deng Phyllis G. Frankl Jiong Wang

Commercial, scientific, and social activities are increasingly becoming dependent on Web database applications. New testing techniques that handle the unique features of these systems are needed. To this end, we have extended AGENDA, a tool set for testing relational database applications, to test web database applications. Application source code is analyzed to extract relevant information about the application's URLs and their parameters. This information is used to construct and simplify a graph in which nodes represent URLs and edges represent links between URLs. A set of paths through the graph is selected and test cases...

10.1145/1022494.1022528 article EN ACM SIGSOFT Software Engineering Notes 2004-09-01

Testing Untestable Neural Machine Translation: An Industrial Case

OPENALEX - Publications

Wujie Zheng Wenyu Wang Dian Liu Changrong Zhang Qinsong Zeng and 4 more

Neural Machine Translation (NMT) has shown great advantages and is becoming increasingly popular. However, in practice, NMT often produces unexpected translation failures in its translations. While reference-based black-box system testing has been a common practice for NMT quality assurance during development, an increasingly critical industrial practice, named in-vivo testing, exposes unseen types or instances of translation failures when real users are using a deployed NMT system. To fill the gap of lacking test oracles for in-vivo testing of NMT systems, we propose a new...

10.1109/icse-companion.2019.00131 preprint EN 2019-05-01

How Practitioners Expect Code Completion?

OPENALEX - Publications

Chaozheng Wang Junhao Hu Cuiyun Gao Jin Yu Tao Xie and 3 more

Code completion has become a common practice for programmers during their daily programming activities. It automatically predicts the next tokens or statements that the programmers may use. Code completion aims to substantially save keystrokes and improve the programming efficiency for programmers. Although there exists substantial research on code completion, it is still unclear what practitioner expectations are on code completion and whether these expectations are met by the existing research. To address these questions, we perform a study by first interviewing 15 professionals and then surveying...

10.1145/3611643.3616280 article EN 2023-11-30

Characterizing and detecting bugs in WeChat mini-programs

OPENALEX - Publications

Tao Wang Qingxin Xu Xiaoning Chang Wensheng Dou Jiaxin Zhu and 6 more

Built on the WeChat social platform, WeChat Mini-Programs are widely used by more than 400 million users every day. Consequently, the reliability of Mini-Programs is particularly crucial. However, WeChat Mini-Programs suffer from various bugs related to the execution environment, lifecycle management, asynchronous mechanism, etc. These bugs have seriously affected users' experience and caused serious impacts.

10.1145/3510003.3510114 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Testing database transaction concurrency

OPENALEX - Publications

Yuetang Deng Phyllis G. Frankl Zhongqiang Chen

Database application programs are often designed to be executed concurrently by many users. By grouping related database queries into transactions, a DBMS (database management system) can guarantee that each transaction satisfies the well-known ACID properties: atomicity, consistency, isolation, and durability. However, if a database application is decomposed into transactions in an incorrect manner, the application may fail when executed concurrently due to potential offline concurrency problems. This paper presents a dataflow analysis technique for...

10.1109/ase.2003.1240306 article EN 2004-01-23

Demonstration of AGENDA tool set for testing relational database applications

OPENALEX - Publications

David Chays Yuetang Deng

Database systems play an important role in nearly every modern organization, yet relatively little research effort has focused on how to test them. AGENDA, A (test) GENerator for Database Applications, is a prototype tool set for testing DB application programs. In testing such applications, the states of the database before and after execution play an important role, along with the user's input and the system output. AGENDA components populate the database, generate inputs, and check aspects of correctness of the output and the new database state.

10.5555/776816.776959 article EN International Conference on Software Engineering 2003-05-03

STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings

OPENALEX - Publications

Pairui Li Chuan Chen Wujie Zheng Yuetang Deng Fanghua Ye and 1 more

Lexical-based metrics such as BLEU, NIST, and WER have been widely used in machine translation (MT) evaluation. However, these metrics badly represent semantic relationships and impose strict identity matching, leading to moderate correlation with human judgments. In this paper, we propose a novel MT automatic evaluation metric, Semantic Travel Distance (STD), based on word embeddings. STD incorporates both semantic and lexical features (word embeddings and n-gram and word order) into one metric. It measures the semantic distance between...

10.1109/taslp.2019.2922845 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-06-14

Exploring Multi-Lingual Bias of Large Code Models in Code Generation

OPENALEX - Publications

Chaozheng Wang Zongjie Li Cuiyun Gao Wenxuan Wang Ting Peng and 4 more

Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models (LCMs) have been recently proposed to generate source code. LCMs can generate highly feasible solutions for programming problems described in natural language. Despite the effectiveness, we observe a noticeable multilingual bias in the generation performance of LCMs. Specifically, LCMs demonstrate proficiency in generating solutions when provided with...

10.48550/arxiv.2404.19368 preprint EN arXiv (Cornell University) 2024-04-30

Comparison of delivered reliability of branch, data flow and operational testing

OPENALEX - Publications

Phyllis G. Frankl Yuetang Deng

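Many analytical and empirical studies of software testing effectiveness have used the probability that a test set exposes at least one fault as the measure of effectiveness. That measure is useful for evaluating testing techniques when the goal of testing is to gain confidence that the program is free from faults. However, if the goal of testing is to improve the program's reliability (by discovering and removing those faults that are most likely to cause failures in the field) then the measure of test effectiveness must distinguish between those faults that are likely to cause failures and those that are unlikely to do so. Delivered reliability was previously introduced as a means of comparing testing techniques in that setting. This paper...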

10.1145/347324.348926 article EN 2000-08-01

Detecting Failures of Neural Machine Translation in the Absence of Reference Translations

OPENALEX - Publications

Wenyu Wang Wujie Zheng Dian Liu Changrong Zhang Qinsong Zeng and 4 more

Despite getting widely adopted recently, a Neural Machine Translation (NMT) system is often found to produce translation failures in the outputs. Developers have been relying on in-house system testing for quality assurance of NMT. This testing methodology requires human-constructed reference translations as the ground truth (test oracle) for example natural language inputs. The testing methodology has shown benefits of quickly enhancing an NMT system in early development stages. However, in industrial settings, it is desirable to detect translation failures without reliance...

10.1109/dsn-industry.2019.00007 article EN 2019-06-01

iFeedback: Exploiting User Feedback for Real-Time Issue Detection in Large-Scale Online Service Systems

OPENALEX - Publications

Wujie Zheng Haochuan Lu Yangfan Zhou Jianming Liang Haibing Zheng and 1 more

Large-scale online systems are complex, fast-evolving, and hardly bug-free despite the testing efforts. Backend system monitoring cannot detect many types of issues, such as UI related bugs, bugs with small impact on backend system indicators, or errors from third-party co-operating systems, etc. However, users are good informers of issues: They will provide their feedback for any types of issues. This experience paper discusses our design of iFeedback, a tool to perform real-time issue detection based on user feedback texts....

10.1109/ase.2019.00041 article EN 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2019-11-01

Industry practice of Javascript dynamic analysis on WeChat mini-programs

OPENALEX - Publications

Yi Liu Jinhui Xie Jianbo Yang Shiyu Guo Yuetang Deng and 3 more

JavaScript is one of the most popular programming languages. WeChat Mini-Program is a large ecosystem of JavaScript applications that runs on the WeChat platform. Millions of Mini-Programs are accessed by WeChat users every week. Consequently, the performance and robustness of Mini-Programs are particularly important. Unfortunately, many Mini-Programs suffer from various defects and performance problems. Dynamic analysis is a useful technique to pinpoint application defects. However, due to the dynamic features of the JavaScript language and the complexity of the runtime environment, dynamic analysis techniques were rarely used to improve the quality...

10.1145/3324884.3421842 article EN 2020-12-21

Towards Efficient Record and Replay: A Case Study in WeChat

OPENALEX - Publications

Sidong Feng Haochuan Lu Ting Xiong Yuetang Deng Chunyang Chen

WeChat, a widely-used messenger app boasting over 1 billion monthly active users, requires effective app quality assurance for its complex features. Record-and-replay tools are crucial in achieving this goal. Despite the extensive development of these tools, the impact of waiting time between replay events has been largely overlooked. On one hand, a long waiting time for executing replay events on fully-rendered GUIs slows down the process. On the other hand, a short waiting time can lead to executing events on partially-rendered GUIs, negatively affecting replay effectiveness. An optimal...

10.1145/3611643.3613880 article EN 2023-11-30
